query_by_zip module

This module, query_by_zip, contains the primary functions of ziptool. You don't need any other functions to get data by ZIP code, but they are documented anyway for special use cases.

ziptool.query_by_zip.data_by_zip(zips: List[str], acs_data, variables=None, year='2019')

Extracts data from the ACS pertaining to the given ZIP codes. Can return either the full raw data or summary statistics.

Parameters
  • zips – a list of ZIP codes, represented as strings, e.g. ['02906', '72901', …]

  • acs_data – a string representing the path of the data file OR a DataFrame containing the ACS data

  • variables (optional) –

    To return the raw data, pass None. To extract summary statistics, pass a dictionary of the form:

    {
        variable_of_interest_1: { #the variable name in IPUMS
            "null": null_val, #the value (float or int) of null data
            "type": type #"household" or "individual"
        },
        variable_of_interest_2: {
            "null": null_val,
            "type": type
        }
    }
    

  • year (optional) – a string representing the year of shapefiles to use for matching PUMAs to ZIPs. Default is 2019.
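
As an illustration of the variables format described above, the following sketch builds a dictionary for two variables and checks that it has the documented shape. The variable names (HHINCOME, EDUC) and their null codes are assumptions chosen for illustration and should be checked against the IPUMS codebook for your extract. The check_variables helper is hypothetical and not part of ziptool. The final call is commented out because it needs a local ACS data file, and the path shown is a placeholder.

```python
# Hypothetical variables of interest. The names and null codes below are
# assumptions; confirm them against the IPUMS codebook for your extract.
variables = {
    "HHINCOME": {"null": 9999999, "type": "household"},
    "EDUC": {"null": 0, "type": "individual"},
}


def check_variables(variables):
    """Check that each entry has exactly the documented "null" and "type" keys.

    This helper is illustrative only; it is not provided by ziptool.
    """
    for name, spec in variables.items():
        if set(spec) != {"null", "type"}:
            raise ValueError(f"{name}: expected keys 'null' and 'type'")
        if not isinstance(spec["null"], (int, float)):
            raise ValueError(f"{name}: 'null' must be an int or a float")
        if spec["type"] not in ("household", "individual"):
            raise ValueError(f"{name}: 'type' must be 'household' or 'individual'")
    return True


check_variables(variables)

# With a local ACS extract available (placeholder path), the call would
# look like this:
# from ziptool.query_by_zip import data_by_zip
# summary = data_by_zip(["02906", "72901"], "path/to/acs_extract.csv", variables)
```

Passing variables=None instead would skip the summary step and return the raw per-PUMA data.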

Returns

When variables of interest are passed, a pd.DataFrame containing the summary statistics for each ZIP code.

When variables of interest are NOT passed, a dictionary of the form:

{
    zip_1: [
        [
            puma_1_df,
            puma_1_ratio
        ],
        [
            puma_2_df,
            puma_2_ratio
        ],
        ...,
    ],
    zip_2: ...
}
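
The nested structure above can be walked with two loops: one over ZIP codes and one over the [dataframe, ratio] pairs for each ZIP. The sketch below is a minimal illustration. The puma_summary helper is hypothetical, and the sample data uses plain lists of records as stand-ins for the per-PUMA dataframes so that it runs without an ACS file. The ratio values are made up; how a ratio is computed is not specified here.

```python
def puma_summary(raw):
    """Return {zip_code: [(row_count, ratio), ...]} for a raw result.

    `raw` is assumed to follow the documented shape:
    {zip_code: [[puma_df, puma_ratio], ...]}. len() gives the row count
    for a pandas DataFrame and for the list stand-ins used below.
    """
    summary = {}
    for zip_code, pumas in raw.items():
        summary[zip_code] = [(len(puma_df), ratio) for puma_df, ratio in pumas]
    return summary


# Stand-in data: lists of records replace the real per-PUMA dataframes,
# and the ratios are invented for illustration.
raw = {
    "02906": [
        [[{"HHINCOME": 50000}, {"HHINCOME": 72000}], 0.75],
        [[{"HHINCOME": 61000}], 0.25],
    ],
}

print(puma_summary(raw))  # {'02906': [(2, 0.75), (1, 0.25)]}
```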