PPP Loan Fraud: What the Data Shows
We analyzed 968,522 PPP loans using Isolation Forest ML. $32 billion in anomalous patterns detected: round dollar amounts, fake employees, shared addresses. Here is what the data reveals.
The Paycheck Protection Program distributed $793 billion in forgivable loans between 2020 and 2021. The program was designed for speed over verification, and the data shows what happened as a result.
We ran Isolation Forest anomaly detection on 968,522 PPP loans from SBA FOIA data. The model flagged $32 billion in anomalous patterns.
What the Anomaly Detection Found
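Our PPP loan analysis applied machine learning to the 968,522 loan-level records in the SBA FOIA data. The Isolation Forest algorithm identifies statistical outliers by measuring how easily a data point can be separated from the rest of the distribution: it partitions the data with random splits, and records that end up isolated after only a few splits receive higher anomaly scores.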
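To make the idea of "easy to separate" concrete, here is a minimal pure-Python sketch of an isolation forest. It is an illustration only, not the pipeline behind the figures in this article: the two features (loan amount and employee count), the synthetic loans, and every function name are assumptions. The article does not say which implementation was used; in practice a library version such as scikit-learn's IsolationForest is the usual choice.

```python
import math
import random

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Split rows on a random feature at a random threshold until isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    low = min(r[feature] for r in rows)
    high = max(r[feature] for r in rows)
    if low == high:
        return {"size": len(rows)}
    threshold = rng.uniform(low, high)
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r in rows if r[feature] < threshold], depth + 1, max_depth, rng),
        "right": build_tree([r for r in rows if r[feature] >= threshold], depth + 1, max_depth, rng),
    }

def path_length(row, node, depth=0):
    """Number of splits needed to reach a leaf, adjusted for unsplit leaf size."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    branch = "left" if row[node["feature"]] < node["threshold"] else "right"
    return path_length(row, node[branch], depth + 1)

def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Score each row in (0, 1); values closer to 1 mean easier to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = max(1, math.ceil(math.log2(sample_size)))
    trees = [build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2 ** (-(sum(path_length(row, t) for t in trees) / n_trees) / norm)
            for row in rows]

# Synthetic, made-up loans as (amount, employees): 200 ordinary loans
# plus one loan that is extreme on both features.
demo_rng = random.Random(42)
loans = []
for _ in range(200):
    employees = demo_rng.randint(20, 40)
    loans.append((employees * 15_000 + demo_rng.uniform(-5_000, 5_000), employees))
loans.append((5_000_000, 1))

scores = anomaly_scores(loans)
most_anomalous = max(range(len(loans)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 2))
```

The extreme loan is usually cut off from the rest within one or two random splits, so its average path is short and its score is high. Ordinary loans sit in a dense cluster and need many more splits to isolate.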
Three patterns dominated the flagged loans:
Round dollar amounts. Loan amounts that were exact multiples of $1,000 appeared at a rate significantly higher than chance would predict. Legitimate payroll calculations based on actual employee counts and salaries produce uneven numbers, so round amounts suggest the application was not based on real payroll data.
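Employee count anomalies. Some borrowers claimed employee counts too small to support the loan amount. PPP loans were generally sized at 2.5 times average monthly payroll, with cash compensation capped at $100,000 per employee per year, which works out to roughly $20,833 per employee before benefits. A loan of $150,000 for a business claiming 1 employee is therefore a statistical outlier. These cases appeared in clusters, suggesting organized fraud rather than isolated mistakes.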
Shared addresses and registered agents. Multiple PPP applications were routed through the same physical address or the same registered agent. In several cases, dozens of "businesses" at a single location each received six-figure loans. Our analysis identified these geographic clusters as high-risk.
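Each of the three patterns can also be expressed as a simple rule-based check. The sketch below is a hedged illustration, not the feature engineering behind the model: the record fields (amount, employees, address), the thresholds, and the function names are all hypothetical. The per-employee ceiling assumes the standard first-draw formula of 2.5 times monthly payroll with salaries capped at $100,000 a year, and it ignores benefits.

```python
from collections import Counter

# Assumption: 2.5 months of payroll at a capped $100,000 annual salary,
# which is about $20,833 per employee (benefits are ignored here).
MAX_LOAN_PER_EMPLOYEE = 100_000 / 12 * 2.5

def is_round_amount(amount, multiple=1_000):
    """True when the loan amount is an exact multiple of `multiple` dollars."""
    return amount > 0 and amount % multiple == 0

def exceeds_payroll_ceiling(amount, employees, tolerance=1.0):
    """True when the loan is larger than the claimed headcount could support."""
    if employees <= 0:
        return True
    return amount > employees * MAX_LOAN_PER_EMPLOYEE * tolerance

def normalize_address(address):
    """Crude normalization: uppercase, drop commas and periods, collapse spaces."""
    return " ".join(address.upper().replace(",", " ").replace(".", " ").split())

def crowded_addresses(loans, min_loans=3):
    """Map each normalized address to its loan count when it meets the threshold."""
    counts = Counter(normalize_address(loan["address"]) for loan in loans)
    return {addr: n for addr, n in counts.items() if n >= min_loans}

# Made-up records for illustration only.
sample_loans = [
    {"amount": 150_000.00, "employees": 1, "address": "12 Main St., Suite 4"},
    {"amount": 148_312.50, "employees": 9, "address": "77 Harbor Rd"},
    {"amount": 200_000.00, "employees": 2, "address": "12 main st suite 4"},
    {"amount": 175_000.00, "employees": 1, "address": "12 Main St,  Suite 4"},
]

for loan in sample_loans:
    flags = []
    if is_round_amount(loan["amount"]):
        flags.append("round amount")
    if exceeds_payroll_ceiling(loan["amount"], loan["employees"]):
        flags.append("over payroll ceiling")
    print(loan["amount"], flags)

print(crowded_addresses(sample_loans))
```

Rules like these are easy to explain but brittle on their own. A model such as Isolation Forest scores the combination of features together rather than each rule in isolation.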
The Timeline Matters
Speed was the explicit design goal of PPP. The SBA approved loans in days, sometimes hours. Lenders had limited liability for verifying borrower information. This created a window where fraudulent applications faced almost no screening.
Our COVID fraud timeline maps the entire disbursement chronology across pandemic programs. Across those programs, $4.67 trillion was distributed with minimal checks, and the enforcement response lagged the disbursement by months or years.
The timeline shows a pattern: money went out fast, anomalies appeared immediately in the data, and enforcement actions did not begin until well after the disbursement window closed.
How Much Was Actually Fraud?
The SBA OIG estimates that at least $200 billion in PPP and EIDL loans were potentially fraudulent. The DOJ has charged over 3,000 defendants in PPP fraud cases as of 2025.
Our $32 billion anomaly figure is not an estimate of fraud. It is the total dollar value of loans that the Isolation Forest model flagged as statistically unusual based on their combination of features. Some of these may be legitimate businesses with unusual characteristics. But the concentration of round amounts, mismatched employee counts, and address clustering suggests that a substantial portion represents intentional misrepresentation.
Cross-Program Stacking
PPP fraud did not happen in isolation. Our analysis shows that 113,836 entities received both PPP and EIDL loans. Healthcare providers could access up to four pandemic relief programs simultaneously. This cross-program stacking is documented in our COVID stacking analysis, where $4.67 trillion across 21.8 million awards shows systemic gaps in cross-referencing between programs.
The SBA, Treasury, and HHS each ran their own programs with separate application systems. No central database checked whether an applicant was claiming overlapping benefits across programs.
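A cross-program check of the kind that was missing can be sketched in a few lines. This is an assumption-laden illustration: the article does not say how entities were matched across programs, so the matching key used here (a normalized business name plus 5-digit ZIP code), the record fields, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"LLC", "INC", "CORP", "CO", "LTD"}

def entity_key(name, zip_code):
    """Build a rough matching key from a business name and ZIP code."""
    cleaned = "".join(ch for ch in name.upper() if ch.isalnum() or ch == " ")
    words = [w for w in cleaned.split() if w not in LEGAL_SUFFIXES]
    return (" ".join(words), zip_code[:5])

def find_stacking(awards_by_program, min_programs=2):
    """Return entities that appear in at least `min_programs` programs."""
    programs_by_entity = defaultdict(set)
    for program, awards in awards_by_program.items():
        for award in awards:
            key = entity_key(award["name"], award["zip"])
            programs_by_entity[key].add(program)
    return {key: sorted(programs)
            for key, programs in programs_by_entity.items()
            if len(programs) >= min_programs}

# Made-up awards for illustration only.
awards = {
    "PPP": [
        {"name": "Acme Dental, LLC", "zip": "30301-1234"},
        {"name": "Harbor Cafe Inc.", "zip": "98101"},
    ],
    "EIDL": [
        {"name": "ACME DENTAL LLC", "zip": "30301"},
        {"name": "Northside Auto", "zip": "60614"},
    ],
}

stacked = find_stacking(awards)
print(stacked)
```

Name-and-ZIP matching produces both false matches and misses, so a real cross-reference would rely on a stronger identifier where one is available. The point of the sketch is only that the join itself is simple once the records sit in one place.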
What Public Data Reveals
Every number in this article comes from SBA FOIA releases, DOJ press releases, and OIG reports. The PPP loan-level data is public. The anomaly detection methods are published. Anyone can replicate the analysis.
The lesson from PPP is not that fraud is inevitable in emergency programs. It is that the signals of fraud were visible in the data from day one. Statistical methods applied at the point of disbursement could have flagged the most obvious anomalies in real time.
About the Author
Founder & Principal Consultant
Josh helps SMBs implement AI and analytics that drive measurable outcomes. With experience building data products and scaling analytics infrastructure, he focuses on practical, cost-effective solutions that deliver ROI within months, not years.