Finding the Right Fit: How Plaid Reconciles Pending and Posted Transactions

To reduce the trees’ correlation, our model also randomly sampled features in addition to randomly sampling training data, resulting in a random forest. Our training sets were imbalanced: a large majority of the data was “not matching”. As a result, our random forest model erred on the side of predicting lower probabilities of matching, which produced a high false negative rate. Our new boosting model lowered our false negative rate by 96% compared to the random forest model, ultimately providing higher-quality transaction data to our clients and consumers.
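To make the false negative rate concrete, here is a minimal Python sketch of how it can be computed and how a relative reduction between two models is measured. Everything in it is an illustrative assumption rather than our production code or data: the function names, the 0.5 decision threshold, and the toy scores are all made up. It only shows why a model that leans toward low match probabilities on imbalanced data misses true matches.

```python
def predict(scores, threshold=0.5):
    """Turn match probabilities into 0/1 labels (threshold is an assumption)."""
    return [1 if s >= threshold else 0 for s in scores]


def false_negative_rate(y_true, y_pred):
    """FN / (FN + TP): the share of true matches the model failed to flag."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fn + tp == 0:
        raise ValueError("no positive examples to evaluate")
    return fn / (fn + tp)


def relative_reduction(old_rate, new_rate):
    """Fractional drop from old_rate to new_rate (0.96 would mean 96% lower)."""
    return (old_rate - new_rate) / old_rate


# Toy, imbalanced data: 4 "matching" pairs among 20 candidates (made up).
y_true = [1, 1, 1, 1] + [0] * 16

# Hypothetical model A leans toward low probabilities, even for true matches.
cautious_scores = [0.62, 0.41, 0.38, 0.30] + [0.05] * 16
# Hypothetical model B separates the two classes more sharply.
sharper_scores = [0.91, 0.84, 0.77, 0.45] + [0.05] * 16

fnr_cautious = false_negative_rate(y_true, predict(cautious_scores))
fnr_sharper = false_negative_rate(y_true, predict(sharper_scores))

print(f"cautious model FNR: {fnr_cautious:.2f}")
print(f"sharper model FNR:  {fnr_sharper:.2f}")
print(f"relative reduction: {relative_reduction(fnr_cautious, fnr_sharper):.0%}")
```

Note that the metric ignores the 16 “not matching” examples entirely. That is why a model can look accurate overall on imbalanced data and still miss most of the true matches.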

Source: blog.plaid.com