How many people live near a landmark / point-of-interest?

It can be useful to know how many people live near a landmark / point-of-interest (POI). For example, a location is often considered “walkable” if you can walk to it in 10 minutes or less. Understanding how many people live near a POI is one way of estimating how many people are within walking distance of a POI, if they were to walk from their home to the POI.

In this post, I start with a point-of-interest, “Times Square, NYC”, and using the Census API I find out how many people live within the census tract that contains this POI (a tract is one of the smallest sub-divisions for which the Census provides population estimates).

If we wanted to go a bit further down this path of estimating “population within walking distance” we could take this approach and expand it to include all census tracts within a certain distance of our POI census tract. I used this approach one time to understand the potential “reach” of outdoor neighborhood advertising.

from census import Census
from us import states
import censusgeocode as cg
import folium
import geocoder
import pprint
import wget
import zipfile
import shapefile as shp
import pandas as pd
from shapely.geometry import Polygon
import numpy as np

c = Census("MY_API_KEY")

Geocoding the point-of-interest

First we’ll geocode our point-of-interest (POI) using geocoder. This will give us the latitude and longitude coordinates, among other things.

poi = geocoder.osm('Times Square').json

# Print keys
pprint.pprint(poi.keys())

print("\n" + poi['address'])
dict_keys(['accuracy', 'address', 'bbox', 'city', 'confidence', 'country', 'country_code', 'county', 'icon', 'importance', 'lat', 'lng', 'ok', 'osm_id', 'osm_type', 'place_id', 'place_rank', 'postal', 'quality', 'raw', 'region', 'state', 'status', 'street', 'suburb', 'type'])

Times Square, West 45th Street, Times Square, Theater District, Manhattan Community Board 5, Manhattan, New York County, New York, 10036, United States of America

Next, we’ll use censusgeocode to get the geo IDs used by the census. As input, we’ll use the latitude and longitude from the previous geocoding.

poi_cg = cg.coordinates(y = poi['lat'], x = poi['lng'])

# Print top-level keys
pprint.pprint(poi_cg.keys())

# Print Census Tract keys
pprint.pprint(poi_cg['Census Tracts'][0].keys())
dict_keys(['2010 Census Blocks', 'States', 'Counties', 'Census Tracts'])
dict_keys(['GEOID', 'CENTLAT', 'AREAWATER', 'STATE', 'BASENAME', 'OID', 'LSADC', 'FUNCSTAT', 'INTPTLAT', 'NAME', 'OBJECTID', 'TRACT', 'CENTLON', 'AREALAND', 'INTPTLON', 'MTFCC', 'COUNTY', 'CENT', 'INTPT'])

Mapping the tract

First we’ll find the shapefiles for NY state tracts using the us library.

Getting shapefiles

shpurls = states.NY.shapefile_urls()
for region, url in shpurls.items():
    print(region, url)
tract https://www2.census.gov/geo/tiger/TIGER2010/TRACT/2010/tl_2010_36_tract10.zip
cd https://www2.census.gov/geo/tiger/TIGER2010/CD/111/tl_2010_36_cd111.zip
county https://www2.census.gov/geo/tiger/TIGER2010/COUNTY/2010/tl_2010_36_county10.zip
state https://www2.census.gov/geo/tiger/TIGER2010/STATE/2010/tl_2010_36_state10.zip
zcta https://www2.census.gov/geo/tiger/TIGER2010/ZCTA5/2010/tl_2010_36_zcta510.zip
block https://www2.census.gov/geo/tiger/TIGER2010/TABBLOCK/2010/tl_2010_36_tabblock10.zip
blockgroup https://www2.census.gov/geo/tiger/TIGER2010/BG/2010/tl_2010_36_bg10.zip

Now we’ll download and unzip the shapefiles for the tract zip.

wget.download('https://www2.census.gov/geo/tiger/TIGER2010/TRACT/2010/tl_2010_36_tract10.zip')
with zipfile.ZipFile('tl_2010_36_tract10.zip', 'r') as zip_ref:
    zip_ref.extractall()
-1 / unknown

Converting shapefile to LNG/LAT coords

Next, we’ll read the shape files into a dataframe, giving us a “coords” column, using the shapefile library. Note that these will be in LNG, LAT (not LAT, LNG).

def read_shapefile(sf):
    """
    Read a shapefile into a Pandas dataframe with a 'coords' 
    column holding the geometry information. This uses the pyshp
    package
    """
    fields = [x[0] for x in sf.fields][1:]
    records = sf.records()
    shps = [s.points for s in sf.shapes()]    
    df = pd.DataFrame(columns=fields, data=records)
    df = df.assign(coords=shps)
    return df

shp_path = 'tl_2010_36_tract10.shp'
sf = shp.Reader(shp_path)
df = read_shapefile(sf)

This is the tract we’re interested in:

poi_tract = df[df['GEOID10'] == poi_cg['Census Tracts'][0]['GEOID']].reset_index()
poi_tract['coords']
0    [(-73.985085, 40.758589), (-73.98225599999999,...
Name: coords, dtype: object

Map the POI marker and tract polygon

Finally, we’re ready to map the POI and the tract polygon in which it is located. We’ll use shapely to get the polygon and folium to do the mapping.

# Convert to Polygon
poi_poly = Polygon(poi_tract['coords'][0])

# Initialize map with zoom and custom tileset, centered on POI
m = folium.Map(location=[poi['lat'], poi['lng']],
               zoom_start=16,
               tiles='cartodbpositron')

# Add a map pin
folium.Marker([poi['lat'], poi['lng']]).add_to(m)

# Add the polygon
folium.GeoJson(poi_poly).add_to(m)

m

Population for census tract

Now, the last step is finding the population for this census tract.

The variable we are interested in is B01003_001E, an estimate of the total population. A full list of the variable codes and definitions can be found at the ACS developer page (see the “2018 ACS Detailed Tables Variables”).

poi_pop = c.acs5.state_county_tract(('NAME', 'B01003_001E'),
                                     poi_cg['Census Tracts'][0]['STATE'], 
                                     poi_cg['Census Tracts'][0]['COUNTY'], 
                                     poi_cg['Census Tracts'][0]['TRACT'])

poi_pop_str = str(round(poi_pop[0]['B01003_001E']))

poi_pop_str
'1027'
folium.Marker(
    [poi['lat'], poi['lng']],
    icon=folium.DivIcon(
        icon_size=(150, 36),
        icon_anchor=(0, 0),
        html='<div style="font-size: 10pt;\
                          background:white;\
                          padding:5px;\
                          border:1px solid red">\
                          The population in the census tract\
                          containing Times Square is ' + poi_pop_str + '\
              </div>',
        )
    ).add_to(m)

m