Presentation: Applied Supervised Learning: Predicting Recidivism

Location:

Duration: 2:55pm - 3:45pm

Day of week:

Abstract

People who repeatedly cycle through the criminal justice system, often for low-level crimes, cost the country millions of dollars in time and effort, yet little is known about them. Who are these repeat offenders, and what drives their recidivism? Can we predict who they are, so that we can intervene before they enter the cycle of recidivism?

The steps to build a predictive model may vary by project, but the fundamental workflow and principles remain the same. This talk will walk through the process from beginning to end: data preparation, variable definition, visualization, model fitting, and robustness testing. As a case study, we will discuss a project completed by a group of pro bono data scientists at DataKind, a charity. In partnership with the Laura and John Arnold Foundation, the DataKind team built a predictive model to identify the criminals most likely to reoffend, so that risk factors can be better addressed and appropriate interventions and services offered.
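To make the workflow concrete, here is a minimal Python sketch of three of the stages named above: data preparation, model fitting, and robustness testing. It is not the DataKind model. The data is synthetic, and the feature names (`priors`, `age`), the generating rule, the choice of logistic regression, and the use of 5-fold cross-validation are all assumptions made purely for illustration. Visualization is omitted to keep the example dependency-free.

```python
import math
import random


def make_synthetic_records(n, seed=0):
    """Return made-up (priors, age, reoffended) tuples. Not real data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        priors = rng.randint(0, 10)
        age = rng.randint(18, 70)
        # Hypothetical generating rule, chosen only so the demo has signal.
        logit = 0.5 * priors - 0.05 * (age - 18) - 1.0
        p = 1.0 / (1.0 + math.exp(-logit))
        records.append((priors, age, 1 if rng.random() < p else 0))
    return records


def fit_scaler(rows):
    """Data preparation: learn per-feature mean and standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def apply_scaler(rows, scaler):
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Model fitting: batch gradient descent on the logistic log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, X, y):
    w, b = model
    preds = [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5
             else 0 for xi in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def cross_validate(records, k=5):
    """Robustness testing: k-fold cross-validation.

    The scaler is fitted on the training folds only, so no information
    from the held-out fold leaks into preprocessing.
    """
    scores = []
    for fold in range(k):
        train = [r for i, r in enumerate(records) if i % k != fold]
        test = [r for i, r in enumerate(records) if i % k == fold]
        X_train, y_train = [list(r[:2]) for r in train], [r[2] for r in train]
        X_test, y_test = [list(r[:2]) for r in test], [r[2] for r in test]
        scaler = fit_scaler(X_train)
        model = fit_logistic(apply_scaler(X_train, scaler), y_train)
        scores.append(accuracy(model, apply_scaler(X_test, scaler), y_test))
    return scores


records = make_synthetic_records(500)
scores = cross_validate(records)
print("fold accuracies:", [round(s, 3) for s in scores])
print("mean accuracy:", round(sum(scores) / len(scores), 3))
```

Fitting the scaler inside each fold, rather than once on the full data set, is one example of an implicit assumption that is easy to get wrong: preprocessing on all records before splitting can make held-out scores look better than they should.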

Through examples of techniques used and challenges faced during this project, the presentation will point out potential pitfalls and implicit assumptions in various models, with code snippets in R and Python.

This session is suitable for those with a limited background in statistics and data science.

Monday, 7 March

Tuesday, 8 March

Wednesday, 9 March

Conference for Professional Software Developers