Freddie

Freddie is a Python program that simulates a data analyst chatbot. It processes user input and returns output according to the rules and directions in the script. The chatbot can identify your computer's operating system (OS), set the input and output paths, read input data into memory, provide descriptive statistics for key variables in the analysis, and run a linear regression model of your choice. Check a sample output here.
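The workflow described above can be sketched in a few lines. This is a minimal illustration and not the project's actual code: it uses only the Python standard library rather than the third-party packages listed in the credits, and every function name, the `data/hospitals.csv` path, and the `beds` and `cost` columns are hypothetical.

```python
import csv
import io
import math
import os
import platform

# Hypothetical sample data; the column names "beds" and "cost" are assumptions.
SAMPLE_CSV = "beds,cost\n10,25\n20,45\n,30\n30,65\n40,85\n"


def detect_os():
    """Return the OS name reported by Python, e.g. 'Windows', 'Linux' or 'Darwin'."""
    return platform.system()


def build_path(folder, filename):
    """Join a folder and a file name using the separator of the current OS."""
    return os.path.normpath(os.path.join(folder, filename))


def read_rows(handle):
    """Read CSV rows as dicts and drop any row that has an empty field."""
    return [row for row in csv.DictReader(handle) if all(row.values())]


def describe(values):
    """Return count, mean, sample standard deviation, minimum and maximum."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return {"n": n, "mean": mean, "stdev": math.sqrt(variance),
            "min": min(values), "max": max(values)}


def fit_line(x, y):
    """Ordinary least squares with one predictor: y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


print("Operating system:", detect_os())
print("Input path:", build_path("data", "hospitals.csv"))  # hypothetical location

rows = read_rows(io.StringIO(SAMPLE_CSV))  # in-memory stand-in for a real file
beds = [float(r["beds"]) for r in rows]
cost = [float(r["cost"]) for r in rows]
print("beds:", describe(beds))

slope, intercept = fit_line(beds, cost)
print(f"cost = {intercept:.2f} + {slope:.2f} * beds")
```

The sample row with a missing `beds` value is dropped before the statistics and the regression are computed, which mirrors the cleaning step in spirit only.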

################################# NY HOSPITALS #################################

Files: cleaning.py, hospitalsNY.py, input.py, main.py, path.py, regression.py, statistics.py, test_2.py, test.py

Author: Naixin Zhang Email: nzhang228@wisc.edu

############################### OUTSIDE HELP CREDITS ###########################

Online sources:

1.Importing csv from a subdirectory in Python https://stackoverflow.com/questions/10235752/importing-csv-from-a-subdirectory-in-python

2.How to open my files in data_folder with pandas using relative path? https://stackoverflow.com/questions/35384358/how-to-open-my-files-in-data-folder-with-pandas-using-relative-path

3.how to join path: os.path.join() and os.path.normpath() both add double backwards slash on Windows https://stackoverflow.com/questions/17438027/os-path-join-and-os-path-normpath-both-add-double-backwards-slash-on-windows

4.get the drive letter https://docs.python.org/2/library/os.path.html

5.get the users system https://docs.python.org/2/library/platform.html

6.how to drop NA values using pandas https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html

7.remove missing values https://towardsdatascience.com/data-cleaning-with-python-and-pandas-detecting-missing-values-3e9c6ebcf78b

8.Finding outliers in dataset using python https://medium.com/datadriveninvestor/finding-outliers-in-dataset-using-python-efc3fce6ce32

9.using pretty table for drawing http://zetcode.com/python/prettytable/

10.pandas.DataFrame.align https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.align.html

11.pandas.concat https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html

12.compare two data series using this Matplotlib code: https://pythonspot.com/matplotlib-bar-chart/

13.using tabulate https://pypi.org/project/tabulate/

14.drawing a scatter plot https://pythonspot.com/matplotlib-scatterplot/

15.for data splitting into training and testing dataset sklearn.model_selection.train_test_split https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

16.Linear Regression Example https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html

