Mathematical Models of Crime: Prediction, Discrimination, Interpretability, Under-reporting and Equilibrium

I had the honor of giving the closing plenary lecture at the first Latin American Conference in Applied Mathematics, January 30th – February 3rd, 2023, in Rio de Janeiro, Brazil. The talk was based on the results of a three-year publicly funded project that constructed, developed, and deployed analytical models to understand criminality in the city of Bogotá.

The project was executed in collaboration with the city's Secretariat of Security, the National University of Colombia (led by Professor Francisco Gomez), and the private firm Quantil (led by me). Funding acknowledgments are given at the end. *

In this post I give a short summary of the talk, covering what is, in my opinion, some of the most relevant work we did:

Motivation:

  • Between January 2012 and September 2015, all homicides and 25% of all crimes reported in Bogotá occurred in 2% of the street segments.
  • During the same period, these segments received only 10% of the attention of police resources (Blattman et al. 2017).
  • More than 60 cities use mathematical models of crime, including Los Angeles, CA; Atlanta, GA; Chicago, IL; New York, NY; Alhambra, CA; San Francisco, CA; Modesto, CA; Santa Cruz, CA; and others.
  • The first obvious task is prediction: can we predict crime?
  • Besides fostering efficiency, prediction models are useful for many other applications (surveillance camera prioritization, optimal allocation of police stations, etc.) and have positive externalities within public institutions: they promote a data culture and scientific discipline.

Modelling Under-reported Spatio-temporal Events

Equilibrium

Prediction models ignore the strategic reaction of criminals. We use unique experimental data and a structural model of crime location (i.e., a discrete choice model) to identify the causal impact of police patrolling on crime. There is a heated debate about the effects of police patrolling on crime; see, for example, this comprehensive study for the US: David Weisburd and Malay K. Majmundar. 2018. Proactive Policing: Effects on Crime and Communities. In line with our previous topic, some studies have explored the implications of errors in police measurement. For example, Chalfin, A., and J. McCrary. 2018. Are U.S. Cities Underpoliced? Theory and Evidence, find that once measurement errors in official statistics are corrected, the elasticity of violent crime with respect to police (the number of police officers) is between -0.289 and -0.361, and the corresponding elasticity for property crime is between -0.152 and -0.195. Our study also estimates elasticities, but within a structural model. Moreover, we focus on the time police spend at different places in the city rather than on the aggregate number of police officers.
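As a point of reference, these elasticities have the usual interpretation (the notation below is mine, not the papers'):

\[
\varepsilon \;=\; \frac{\partial \ln(\text{crime})}{\partial \ln(\text{police})},
\]

so an elasticity of, say, -0.3 means that a 10% increase in police presence is associated with a roughly 3% decrease in crime.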

Our estimation strategy capitalizes on the experimental data used in Blattman, C., Green, D., Ortega, D., and S. Tobon. 2021. Place-based interventions at scale: the direct and spillover effects of policing and city services on crime. In this study the authors randomly assigned 756 streets to an 8-month treatment of doubled police patrols (and 206 streets to greater municipal services) and measured the direct effects of these interventions on crime. They also measured spillovers (indirect effects) on streets within a radius of 250 meters. Their main result suggests that total reductions in crime of more than 2% can be ruled out.

In contrast to Blattman et al., we use their experimental data to estimate a structural model of crime. That is, we assume citizens are potential criminals who derive utility from committing a crime, depending on the location within the city (one of approximately 500 places) and the police presence at that place. Each location is characterized by many urban features (presence of schools, churches, parks, public transportation, etc.). There is also an outside option of not committing a crime, which is the majority choice of agents in this model. Given that police presence and criminality are jointly determined, police presence is an endogenous variable in this model, and the dependence of utility on police presence cannot be consistently estimated without further considerations. This is where the randomized experiment of Blattman et al. comes in: using their randomized assignment of police time, we can isolate an exogenous component of police presence and use it to consistently estimate the dependence of crime on police presence.
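A minimal sketch of this kind of discrete choice model, with assumptions and notation of my own choosing (a standard conditional logit, not necessarily our exact specification): potential criminal \(i\) derives utility

\[
u_{ij} = \beta' x_j + \alpha p_j + \epsilon_{ij}, \qquad u_{i0} = \epsilon_{i0},
\]

from offending at location \(j\), where \(x_j\) collects the urban characteristics of the location, \(p_j\) is the police time spent there, \(u_{i0}\) is the outside option of not offending, and \(\epsilon_{ij}\) is an idiosyncratic taste shock. With i.i.d. extreme-value shocks this yields the familiar choice probabilities

\[
\Pr(j) = \frac{\exp(\beta' x_j + \alpha p_j)}{1 + \sum_{k} \exp(\beta' x_k + \alpha p_k)},
\]

and the endogeneity problem is that \(p_j\) is correlated with unobserved location attractiveness; the experimental assignment of police time provides the exogenous variation needed to estimate \(\alpha\) consistently.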

In agreement with Blattman et al., we find strong direct effects: a mean elasticity of crime with respect to police presence (measured as time spent at a location) of -0.262 for violent crime, -0.382 for property crime, and -0.38 for total crime. For example, a 10% increase in police time at a location reduces violent crime there by roughly 2.6% on average. Our estimates of spillover effects (i.e., cross-location elasticities) are low and close to zero. However, since in our model these spillover effects measure the mean displacement across all locations (not just those within 250 meters), this does not mean that the aggregate spillover effects are small. In fact, preliminary results show that, at least for certain kinds of interventions, they can offset the direct effects; in this sense, our model is consistent with the results of Blattman et al.
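Under the logit structure sketched above, own- and cross-elasticities take a simple form (again, my notation, not necessarily our exact specification):

\[
\varepsilon_{jj} = \alpha\, p_j \bigl(1 - \Pr(j)\bigr), \qquad
\varepsilon_{jk} = -\alpha\, p_k \Pr(k) \quad (k \neq j).
\]

With roughly 500 locations, each individual choice probability \(\Pr(k)\) is small, so every pairwise cross-elasticity is close to zero even when the displacement summed over all locations is substantial.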

An advantage of structural models is that we can construct counterfactual scenarios. Our most important result is, therefore, the estimation of the causal impact of different police patrolling strategies (allocations of time across all locations of the city). In our model we can compute the allocation of time, subject to the constraint that aggregate time remains unchanged, that results in the minimum amount of crime. In a nutshell, we show that if police were allocated according to our model, there would be an aggregate reduction of 7% in violent crime, 8.5% in property crime, and 5.2% in total crime. This reduction in crime, especially violent crime, is very important from a social point of view, all the more so because it comes from a more efficient allocation of police time rather than from additional spending.
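To make the counterfactual exercise concrete, here is a minimal sketch of the reallocation problem using made-up numbers and the simplified logit crime model sketched above; it is illustrative only, not the project's actual code, parameters, or data:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative sketch: choose police time p_j per location to minimize total
# predicted crime from a toy conditional logit model, holding aggregate
# patrol time fixed. All parameter values below are made up.
rng = np.random.default_rng(0)

J = 50                      # locations (the real exercise uses ~500)
alpha = -0.8                # utility weight on police time (deterrence)
v0 = rng.normal(size=J)     # location attractiveness, stands in for beta'x_j
p0 = rng.uniform(0.5, 1.5, size=J)   # observed police time per location
total_time = p0.sum()       # aggregate patrol time held constant
N = 10_000                  # number of potential offenders

def expected_crime(p):
    """Total expected crimes: logit shares over locations, outside option at 0."""
    v = v0 + alpha * p
    expv = np.exp(v)
    shares = expv / (1.0 + expv.sum())
    return N * shares.sum()

# Constraint: total patrol time unchanged; time at each location nonnegative.
cons = {"type": "eq", "fun": lambda p: p.sum() - total_time}
res = minimize(expected_crime, p0, method="SLSQP",
               bounds=[(0.0, None)] * J, constraints=cons)

print(f"crime at observed allocation:  {expected_crime(p0):.1f}")
print(f"crime at optimized allocation: {expected_crime(res.x):.1f}")
```

In the actual exercise the objective comes from the estimated structural model, but the logic is the same: predicted crime falls purely by moving existing patrol time across locations, without increasing aggregate police time.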

The complete presentation:
