About Data

Data Sources

The L.A. crime dataset, which covers 2020 to the present, is sourced from the United States government’s open data website and comprises roughly 800,000 records across 28 variables. The data is transcribed from original crime reports that are typed on paper, so it may contain some inaccuracies. For example, some location fields with missing data are recorded as (0°, 0°), and address fields are given only to the nearest hundred block to maintain privacy. Because LAPD OpenData updates the dataset weekly and the full dataset is substantial in size, I focused specifically on the 2022 data. Using the Ohio Supercomputer Center to speed up processing, I filtered the 2022 records, replaced missing values with ‘NA,’ and reorganized and renamed variables as necessary.
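The cleaning steps described above can be illustrated with a minimal sketch using only Python's standard library. This is not the exact code I ran on the cluster. The column names (`DATE OCC`, `LAT`, `LON`, `Weapon Desc`), the date format, the new variable names, and the choice to convert the (0°, 0°) placeholder to 'NA' are all assumptions made for illustration and should be adjusted to match the actual file.

```python
import csv
import io
from datetime import datetime

# Assumed column names and date format; adjust to the actual file.
DATE_COLUMN = "DATE OCC"
DATE_FORMAT = "%m/%d/%Y"
# Hypothetical renames, chosen only for this example.
RENAMES = {"DATE OCC": "date_occurred", "LAT": "latitude", "LON": "longitude"}


def is_zero(value):
    """Return True if value parses as the number 0."""
    try:
        return float(value) == 0.0
    except (TypeError, ValueError):
        return False


def clean_rows(rows, year=2022):
    """Keep rows from `year`, mark missing values as 'NA', rename columns."""
    cleaned = []
    for row in rows:
        raw_date = (row.get(DATE_COLUMN) or "").strip()
        try:
            # Parse only the date part, e.g. "01/05/2022".
            occurred = datetime.strptime(raw_date[:10], DATE_FORMAT)
        except ValueError:
            continue  # skip rows whose date cannot be parsed
        if occurred.year != year:
            continue
        out = {}
        for key, value in row.items():
            if key is None:
                continue  # ignore extra fields beyond the header
            value = (value or "").strip()
            out[RENAMES.get(key, key)] = value if value else "NA"
        # Assumption: treat the (0, 0) placeholder as a missing location.
        if is_zero(out.get("latitude")) and is_zero(out.get("longitude")):
            out["latitude"] = out["longitude"] = "NA"
        cleaned.append(out)
    return cleaned


# Small made-up sample standing in for the real CSV export.
sample = io.StringIO(
    "DATE OCC,LAT,LON,Weapon Desc\n"
    "01/05/2022 12:00:00 AM,34.05,-118.25,\n"
    "03/10/2021 12:00:00 AM,34.10,-118.30,KNIFE\n"
    "07/20/2022 12:00:00 AM,0,0,HAND GUN\n"
)
cleaned = clean_rows(csv.DictReader(sample))
```

With the sample above, the 2021 row is dropped, the empty weapon field becomes 'NA', and the row with a (0, 0) location has both coordinates set to 'NA'. Reading the real file would replace `sample` with an opened CSV file handle.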

Further Explanation