This readme file was generated on 2023-06-06 by Nicholas Wolf

GENERAL INFORMATION

Egress Behavior from Select NYC COVID-19 Exposed Health Facilities March-May 2020

Principal Investigator Information
Name: Tom Kirchner
ORCID: 0000-0001-5764-4980
Institution: New York University
Email: tom.kirchner@nyu.edu

Co-investigator Information
Name: Debra F. Laefer
ORCID: 0000-0001-5134-5322
Institution: New York University
Email: dfl256@nyu.edu

* Date of data collection: 2020-03 / 2020-05
* Geographic location of data collection: New York, NY, USA
* Funding Sources: This project was funded by a National Science Foundation grant, “RAPID: DETER: Developing Epidemiology mechanisms in Three-dimensions to Enhance Response,” Award 2027293, 05/20-4/21. The project also received funding from the Data Science and Software Services (DS3) program, which was funded through the NYU Moore Sloan Data Science Environment.

*** SHARING/ACCESS INFORMATION

* Licenses/restrictions placed on the data: The data are freely available under a CC-BY 4.0 license.
* Links/relationships to supporting or related data sets: https://geo.nyu.edu/catalog/nyu-2451-60075
* Recommended citation for this dataset: Laefer, Debra F., Tom Kirchner, and Huang (Frank) Jiang. “Egress Behavior from Select NYC COVID-19 Exposed Health Facilities March-May 2020”. New York University, February 1, 2022. https://doi.org/10.58153/390e6-v2v66.

*** DATA & FILE OVERVIEW

NYU_egress_covid19_raw_v3.csv: 5030 rows of data, each row one (1) egress trajectory record
NYU_egress_covid19_codebook_v3.csv: Column-variable matched data dictionary with labels and descriptions of NYU_egress_covid19_raw.csv
NYU_egress_covid19_GIS_v3.zip: Compressed shape file with trajectory distances and facility-level average radius of location

.
├── NYU2020_line_final_maxdist.dbf
├── NYU2020_line_final_maxdist.prj
├── NYU2020_line_final_maxdist.shp
├── NYU2020_line_final_maxdist.shx
├── NYU2020_polygon_final_maxdist.dbf
├── NYU2020_polygon_final_maxdist.prj
├── NYU2020_polygon_final_maxdist.shp
└── NYU2020_polygon_final_maxdist.shx

* Versions
In keeping with the rapid response needs of this data collection during the pandemic, an initial dataset was released in 2020 (version 1) with a codebook, raw observational data, cleaned and amalgamated final data, and a sample GIS file. A subsequent version 2, released in February 2022, included full GIS files along with the tabular spatial data. Version 3, released in spring 2023, adds calculated max distance as explicit variables to the version 2 files.

*** METHODOLOGICAL INFORMATION

* Description of methods used for collection/generation of data: This rapid response surveillance project was funded by the National Science Foundation (NSF) to collect “perishable” data on egress behaviors and neighborhood conditions surrounding healthcare centers (HCCs) in New York City (NYC) during the initial NYC COVID-19 PAUSE ordinance from March 22nd to May 19th, 2020. Anonymized data on NYC HCC egress behaviors were collected by observational field workers using a phone-based mapping application. Each egress trip record includes the day of week, time of day, and destination category type, along with an array of behavioral outcome categories, ambient weather conditions, and socio-economic factors. Egress trajectories with precise estimates of distance traveled and the spatial dispersion or “spread” around each HCC were added via post-processing. The data collection and cleaning process resulted in 5,030 individual egress records from 18 facilities.

The study was funded for 9 weeks, with student observers collecting data 10-20 hours per week until May 19, 2020. Ultimately, 18 facilities across 4 of New York City’s 5 boroughs (Queens, Brooklyn, Manhattan, and the Bronx) were selected.
Procedurally, field observers positioned themselves across the street from their assigned HCC egress location and then traced each subject’s egress route, noting locations of interactions with the built environment or other individuals. Final destinations were categorized by location type (e.g., coffee shop, pharmacy, deli, food truck), including whether subjects returned to the medical facility or entered a nearby one (e.g., temporary tent, adjacent clinic, or campus building). For each HCC, egress recordings extended from the same pre-specified point until one of three outcomes occurred: (1) the subject entered a vehicle, subway station, building, or other final destination and was no longer visible; (2) tracking exceeded 20 minutes (the average observation period lasted 5 minutes); or (3) the subject walked more than 1.3 km from the HCC.

* Methods for processing the data: Post-processing of the behavioral dataset also involved extraction and coding of metadata, namely the notes associated with each behavioral record and included in the attribute table associated with each shape file. Descriptive notes were extracted manually from the KML/KMZ files and entered into a spreadsheet. As a quality control measure, the notes were extracted by one researcher and checked entry by entry by a second researcher. A secondary coding was also conducted on some key data fields, standardizing them and allowing for a more generalized accounting of the data.
For example, taxi, Uber, and Lyft were combined in the secondary coding as “vehicle for hire.”

Calculation of egress trajectory distances and geographic dispersion around each HCC: Spatial dispersion around each HCC facility was defined as the spatial magnitude of the geographic area encompassing all egress trajectory records from each HCC. It was approximated with a minor adaptation of the well-established radius of gyration (R_g) statistic, which is essentially the standard deviation of a set of locations around their center of mass, typically reported in meters. See Zheng Y, Capra L, Wolfson O, Yang H. Urban computing: concepts, methodologies, and applications. In proceedings of the ACM TIST; New York, NY: ACM Press; 2014.

To facilitate research on the neighborhood areas around each HCC, we calculated a collective radius of egress (R_e) statistic that centers on each HCC exit point (rather than the center of mass used to calculate R_g). This R_e metric provides a standardized estimate of the spatial dispersion associated with the egress records collected from each HCC facility.

R_e = \sqrt{ \frac{1}{n} \sum_{k=1}^{n} \left( d_k - \mathrm{HCC}_{\mathrm{exit}} \right)^2 }

where n is the total number of egress records collected from each HCC, d_k is the final destination observed for the k-th egress record, and HCC_exit is the location that takes the place of the center of mass, namely the longitude and latitude of each HCC exit point. The great circle distance in meters between the final destination observed for each egress record and this exit point, written (d_k - HCC_exit), was calculated using Vincenty's formulae. See Vincenty T. Direct and inverse solutions of geodesics on the ellipsoid with application of nested equations. Surv Rev. 1975;23:88-93.

Neighborhood estimates associated with each HCC: An array of socio-economic indicator variables used by the Centers for Disease Control and Prevention (CDC) to calculate the Social Vulnerability Index (SVI) were included in the archived dataset, each linked to the geo-rectified location of the HCC entry/exit points used for field observations.
The majority of variables included were taken from the American Community Survey (ACS; 2014-2018), with estimates and margins of error provided in numbers and percentages for the total population, along with housing units and other standard household indicators for the ZIP code around each HCC. To enrich the dataset and ease re-use in future analysis, the archival dataset also includes metadata and indicators of ambient weather conditions outside each HCC on the days that data collection occurred.

* Instrument- or software-specific information needed to interpret the data: The shape files provided were generated with, and can be opened with, the open-source software QGIS or a similar GIS software platform such as ArcGIS.
* People involved with sample collection, processing, analysis and/or submission: Local resident fieldworkers (i.e., New York University students) were recruited to work as observers immediately prior to New York’s implementation of the PAUSE order at 8PM on March 22, 2020.

*** DATA-SPECIFIC INFORMATION

See NYU_egress_covid19_codebook.csv for all variable-level information
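
As an illustration of the radius of egress (R_e) statistic defined under METHODOLOGICAL INFORMATION, the following Python sketch computes the root mean square distance between one HCC exit point and a set of final destinations. It is a minimal sketch under stated assumptions, not the project's processing code: the function names and coordinates are hypothetical, and the haversine formula on a spherical Earth stands in for Vincenty's ellipsoidal formulae, so results may differ slightly from the archived values.

```python
import math

# Assumption: spherical Earth with the mean radius in meters. The README's
# method uses Vincenty's formulae on an ellipsoid instead.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def radius_of_egress(exit_point, destinations):
    """R_e: root mean square distance (m) from the exit point to each destination."""
    if not destinations:
        raise ValueError("at least one egress record is required")
    exit_lat, exit_lon = exit_point
    squared = [haversine_m(exit_lat, exit_lon, lat, lon) ** 2
               for lat, lon in destinations]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical (latitude, longitude) values for illustration only;
# they are not taken from the dataset.
hcc_exit = (40.7421, -73.9740)
final_destinations = [
    (40.7430, -73.9752),
    (40.7409, -73.9731),
    (40.7445, -73.9760),
]
print(f"R_e = {radius_of_egress(hcc_exit, final_destinations):.1f} m")
```

Because every distance is measured from the exit point rather than from the centroid of the destinations, a facility whose egress trips all head in one direction still receives a large R_e, which matches the intent of centering the statistic on the HCC exit.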