The Myth of the Impartial Machine

In our example, let’s say the city experienced a total of 22 crimes in the past year, with 12 of those occurring in precinct A and 10 in precinct B. The predictive algorithm uses this historical data to determine where to send John. When the model chooses to send John to precinct A, more instances of crime are logged for precinct A, while crimes that occur in precinct B go unobserved and never enter the data.

Feedback loops are especially problematic when subgroups in the training data exhibit large statistical differences (e.g., one precinct has a much higher crime rate than the others). A model trained on such data will quickly “run away” and make predictions that fall only into the majority group, thereby generating ever more lopsided data that are fed back into the model.
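The runaway dynamic described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the article: both precincts have the same true crime rate (one crime per day each), the model always sends the single officer to the precinct with the most recorded crime, and a crime is recorded only where the officer is present. The function and variable names are hypothetical.

```python
def simulate_patrols(recorded, true_daily_crimes, days):
    """Each day, patrol the precinct with the most recorded crime.

    Only crime in the patrolled precinct is observed and logged;
    crime elsewhere still happens but never reaches the data.
    """
    recorded = dict(recorded)  # work on a copy, leave the input untouched
    visits = []
    for _ in range(days):
        # The "model": pick the precinct with the highest recorded count.
        target = max(recorded, key=recorded.get)
        # Log only the crime that the officer is present to observe.
        recorded[target] += true_daily_crimes[target]
        visits.append(target)
    return recorded, visits


# Historical counts from the example: 12 crimes in A, 10 in B.
initial = {"A": 12, "B": 10}
# Assumption: both precincts have the same underlying crime rate.
true_daily = {"A": 1, "B": 1}

final, visits = simulate_patrols(initial, true_daily, days=30)
share_a = final["A"] / sum(final.values())
print(final, round(share_a, 2))  # {'A': 42, 'B': 10} 0.81
```

Under these assumptions, precinct A's share of recorded crime grows from 12/22 (about 55%) to 42/52 (about 81%) in 30 days, even though the two precincts are equally dangerous. The officer never visits precinct B, so the data never gets a chance to correct the initial imbalance.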

Source: parametric.press