CS301_Proj3

Description

The City of Madison has many different agencies providing a variety of services. In this project, you’ll analyze real spending data from 2015 to 2018 for five of the largest agencies: police, fire, streets, library, and parks. You’ll get practice calling functions from a project module, which we’ll provide, and practice writing your own functions.

Start by downloading project.py, test.py, and madison.csv. Double-check that your browser did not rename these files (run ls in the terminal from your p3 project directory). Do all your work in a new main.ipynb notebook, which you’ll create and hand in when you’re done (please do not write your functions in a separate .py file). Test as usual by running python test.py (or similar, depending on your laptop setup). Before handing in, put the project, submitter, and partner info in a comment in the first cell, in the same format you used for previous projects (please continue doing so for all projects this semester).

We won’t explain how to use the project module here (the code is in the project.py file). This week’s lab is designed to teach you how it works, so if you missed the lab, be sure to do it from home before starting the project.

This project consists of writing code to answer 20 questions. If you’re answering a particular question in a cell in your notebook, you need to put a comment in the cell so we know what you’re answering. For example, if you’re answering question 13, the first line of your cell should contain #q13.

Dataset

The data looks like this:

agency_id  agency   2015                2016                2017                2018
5          police   68.06346877         71.32575615000002   73.24794765999998   77.87553504
6          fire     49.73757877         51.96834048         53.14405332         55.215007260000014
9          library  16.96543425         18.12552139         19.13634773         19.845065799999997
12         parks    18.371421039999998  19.159243279999995  19.316837019999994  19.7607100000000
15         streets  25.368879940000006  28.2286218          26.655754419999994  27.798933740000003

The dataset is in the madison.csv file. We’ll learn about CSV files later in the semester. For now, you should know this about them:

  • it’s easy to create them by exporting from Excel
  • it’s easy to use them in Python programs
  • we’ll give you a project.py module to help you extract data from CSV files until we teach you to do it directly yourself
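To illustrate the second point, here is a minimal sketch of reading this kind of data with Python's standard csv module. It is not how project.py works (that code is not shown here). The comma-separated layout, the inline SAMPLE text with rounded values, and the load_spending helper are all assumptions made for illustration.

```python
import csv
import io

# Assumed layout: comma-separated, with the header row shown in the table above.
# Values are rounded for readability; the real madison.csv has more digits.
SAMPLE = """agency_id,agency,2015,2016,2017,2018
5,police,68.06,71.33,73.25,77.88
9,library,16.97,18.13,19.14,19.85
"""

def load_spending(text):
    """Return {agency name: {year: spending in millions of dollars}}."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        # keep only the year columns, i.e. the ones whose header is all digits
        table[row["agency"]] = {
            int(year): float(amount)
            for year, amount in row.items()
            if year.isdigit()
        }
    return table

spending = load_spending(SAMPLE)
print(spending["library"][2018])
```

For the project itself, keep using project.get_id and project.get_spending as taught in the lab.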

All the numbers in the dataset are in millions of dollars. Answer questions in millions of dollars unless we specify otherwise.

# project: p3
# submitter: naixinzhang
# partner: none
import project
project.init("madison.csv")
streets_id = project.get_id("streets")
police_id = project.get_id("police")
fire_id = project.get_id("fire")
library_id = project.get_id("library")
parks_id = project.get_id("parks")
#q1 What is the agency ID of the parks agency?
parks_id
12
#q2 How much did the agency with ID 6 spend in 2018?
project.get_spending(6, 2018)
55.215007260000014
#q3 How much did "streets" spend in 2017?
project.get_spending(streets_id, 2017)
26.655754419999994
# Function 1: year_max(year)
def year_max(year):
    # grab the spending by each agency in the given year
    police_spending = project.get_spending(project.get_id("police"), year)
    fire_spending = project.get_spending(project.get_id("fire"), year)
    library_spending = project.get_spending(project.get_id("library"), year)
    parks_spending = project.get_spending(project.get_id("parks"), year)
    streets_spending = project.get_spending(project.get_id("streets"), year)

    # use builtin max function to get the largest of the five values
    return max(police_spending, fire_spending, library_spending, parks_spending, streets_spending)
#q4 What was the most spent by a single agency in 2015?
year_max(2015)
68.06346877
#q5 What was the most spent by a single agency in 2018?
year_max(2018)
77.87553504
# Function 2: agency_min(agency)
def agency_min(agency):
    agency_id = project.get_id(agency)
    y15 = project.get_spending(agency_id, 2015)
    y16 = project.get_spending(agency_id, 2016)
    # grab the other years
    y17 = project.get_spending(agency_id, 2017)
    y18 = project.get_spending(agency_id, 2018)

    # use the min function (similar to the max function)
    # to get the minimum across the four years, and return
    # that value
    return min(y15, y16, y17, y18)
#q6 What was the least the police ever spent in a year?
agency_min(agency = 'police')
68.06346877
#q7 What was the least that library ever spent in a year?
agency_min(agency = 'library')
16.96543425
#q8 What was the least that parks ever spent in a year?
agency_min(agency = 'parks')
18.371421039999998
# Function 3: agency_avg(agency)
def agency_avg(agency):
    agency_id = project.get_id(agency)
    y15 = project.get_spending(agency_id, 2015)
    y16 = project.get_spending(agency_id, 2016)
    y17 = project.get_spending(agency_id, 2017)
    y18 = project.get_spending(agency_id, 2018)

    # average the four yearly values
    values = [y15, y16, y17, y18]
    return sum(values) / len(values)
#q9 How much is spent per year on streets, on average?
agency_avg(agency='streets')
27.013047475
#q10 How much is spent per year on fire, on average?
agency_avg(agency='fire')
52.5162449575
#q11 How much did the police spend above their average in 2018, as a percent of that average?
police_2018 = project.get_spending(police_id, 2018)
police_avg = agency_avg(agency='police')
(police_2018 - police_avg) / police_avg * 100
7.224961934351909
# Function 4: change_per_year(agency, start_year=2015, end_year=2018)
def change_per_year(agency, start_year=2015, end_year=2018):
    agency_id = project.get_id(agency)
    spending_startyear = project.get_spending(agency_id, start_year)
    spending_endyear = project.get_spending(agency_id, end_year)
    # average yearly change: total change divided by the number of years elapsed
    return (spending_endyear - spending_startyear) / (end_year - start_year)
#q12 how much has spending increased per year (on average) for police from 2015 to 2018?
change_per_year(agency='police')
3.2706887566666674
#q13 how much has spending increased per year (on average) for police from 2017 to 2018?
change_per_year(agency = 'police', start_year = 2017)
4.627587380000023
#q14 how much has spending increased per year (on average) for streets from 2016 to 2018?
change_per_year(agency = 'streets', start_year = 2016)
-0.2148440299999983
# Function 5: extrapolate(agency, year1, year2, year3)
def extrapolate(agency, year1, year2, year3):
    # average yearly change between year1 and year2
    change = change_per_year(agency, start_year=year1, end_year=year2)
    agency_id = project.get_id(agency)
    spending_year2 = project.get_spending(agency_id, year2)
    # continue that trend from year2 out to year3
    return spending_year2 + (year3 - year2) * change
#q15 how much will library spend in 2019, extrapolating from the 2015-to-2018 trend?
extrapolate(agency='library', year1=2015, year2=2018, year3=2019)
20.80494298333333
#q16 how much will library spend in 2100, extrapolating from the 2015-to-2018 trend?
extrapolate(agency='library', year1=2015, year2=2018, year3=2100)
98.55499483333321
#q17 how much will library spend in 2100, extrapolating from the 2017-to-2018 trend?
extrapolate(agency='library', year1=2017, year2=2018, year3=2100)
77.95994753999969
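The two 2100 estimates differ because each one extends a different straight line. As a rough, self-contained check that does not use the project module, the sketch below hard-codes the library row from the dataset table and applies the same formula: spending in year2 plus (year3 - year2) times the average yearly change between year1 and year2. The name linear_extrapolate is hypothetical and used only here.

```python
# Library spending in millions of dollars, copied from the dataset table.
library = {2015: 16.96543425, 2016: 18.12552139,
           2017: 19.13634773, 2018: 19.845065799999997}

def linear_extrapolate(spending, year1, year2, year3):
    """Extend the straight line through year1 and year2 out to year3."""
    slope = (spending[year2] - spending[year1]) / (year2 - year1)
    return spending[year2] + (year3 - year2) * slope

long_trend = linear_extrapolate(library, 2015, 2018, 2100)   # 2015-to-2018 slope
short_trend = linear_extrapolate(library, 2017, 2018, 2100)  # 2017-to-2018 slope
print(round(long_trend, 2), round(short_trend, 2))  # prints 98.55 77.96
```

The 2017-to-2018 slope is smaller than the 2015-to-2018 slope, and that gap is multiplied by 82 years, so a small difference in the chosen baseline produces a difference of roughly 20 million dollars by 2100.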
# Function 6: extrapolate_error(agency, year1, year2, year3)
def extrapolate_error(agency, year1, year2, year3):
    agency_id = project.get_id(agency)
    predicted = extrapolate(agency, year1, year2, year3)
    actual = project.get_spending(agency_id, year3)
    # positive means the extrapolation overshot the actual spending
    return predicted - actual
#q18 what is the error if we extrapolate to 2018 from the 2015-to-2017 data for police?
extrapolate_error(agency='police', year1=2015, year2=2017, year3=2018)
-2.0353479350000327
#q19 what is the error if we extrapolate to 2018 from the 2015-to-2016 data for streets?
extrapolate_error(agency='streets', year1=2015, year2=2016, year3=2018)
6.149171779999982
# Helper for q20: population standard deviation of an agency's spending over four years
def std_cal(agency, year1, year2, year3, year4):
    agency_id = project.get_id(agency)
    s1 = project.get_spending(agency_id, year1)
    s2 = project.get_spending(agency_id, year2)
    s3 = project.get_spending(agency_id, year3)
    s4 = project.get_spending(agency_id, year4)
    mean = (s1 + s2 + s3 + s4) / 4
    # population variance: the average squared deviation from the mean
    var = ((s1 - mean)**2 + (s2 - mean)**2 + (s3 - mean)**2 + (s4 - mean)**2) / 4
    return var ** 0.5
#q20
std_cal(agency = 'library',year1 = 2015,year2=2016, year3=2017,year4=2018)
1.0848913984858986
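As a sanity check on the hand-written formula, the standard library's statistics module can compute the same quantity. The sketch below hard-codes the library row from the dataset table instead of calling the project module, and assumes that the population standard deviation (dividing by 4) is the one intended, since that is what std_cal computes.

```python
import statistics

# Library spending for 2015-2018 in millions of dollars, from the dataset table.
library = [16.96543425, 18.12552139, 19.13634773, 19.845065799999997]

# pstdev divides by n (population), matching std_cal above.
population_std = statistics.pstdev(library)

# stdev divides by n - 1 (sample), so it comes out somewhat larger.
sample_std = statistics.stdev(library)

print(round(population_std, 4))  # prints 1.0849
```

If a checker expects the sample standard deviation instead, the two results will not match, so it is worth confirming which definition a question asks for.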
