Fair Code: Algorithmic Bias Detection


03 Projects

The experiments


PROJECT — 01

COMPAS

A real algorithm used in US courtrooms. ProPublica's public dataset, 70,000+ records. The bias is not a glitch. It's baked in.

FAIRNESS GAP BEFORE

86.77%

AFTER MITIGATION

15.69%

82% reduction (71.08 percentage points)

unfair.py — BIASED MODEL

--- BIASED MODEL RESULTS ---

Black Defendant High-Risk Rate: 87.16%

White Defendant High-Risk Rate: 0.40%

Fairness Gap: 86.77%

fair.py — MITIGATED MODEL

--- MITIGATED (UNBIASED) RESULTS ---

Black Defendant High-Risk Rate: 84.71%

White Defendant High-Risk Rate: 69.02%

New Fairness Gap: 15.69%
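The page does not show how unfair.py and fair.py compute the gap. The sketch below assumes it is the demographic parity difference, meaning the positive-prediction rate of one group minus that of the other. The function names and the toy predictions are hypothetical. On the reported figures, 86.77% to 15.69% is a drop of 71.08 percentage points, or about 82% of the original gap.

```python
def positive_rate(predictions, groups, group):
    """Share of `group` members that received a positive (1) prediction."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no rows for group {group!r}")
    return sum(selected) / len(selected)


def fairness_gap(predictions, groups, group_a, group_b):
    """Demographic parity gap in percentage points (rate of A minus rate of B)."""
    return 100 * (positive_rate(predictions, groups, group_a)
                  - positive_rate(predictions, groups, group_b))


def relative_reduction(gap_before, gap_after):
    """Percentage by which the absolute gap shrank after mitigation."""
    return 100 * (abs(gap_before) - abs(gap_after)) / abs(gap_before)


# Toy, made-up predictions (1 = flagged high risk); NOT the COMPAS data.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["Black"] * 4 + ["White"] * 4
toy_gap = fairness_gap(preds, groups, "Black", "White")
print(f"Toy gap: {toy_gap:.2f} points")

# Gaps as reported for the biased and mitigated models.
reduction = relative_reduction(86.77, 15.69)
print(f"Relative reduction: {reduction:.1f}%")
```

Keeping the percentage-point drop and the relative reduction as separate numbers avoids confusing the two when comparing projects.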

Key Insight

Removing race alone isn't enough. Custody Status is a proxy variable. It carries the racial signal through the model even when the race column is dropped. Both features had to go.
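One way to screen for a proxy like this is to measure how strongly a categorical feature is associated with the protected attribute. The sketch below uses Cramér's V, which is 0 for independent columns and 1 for perfectly associated ones. It is an illustrative check and not necessarily the one this project uses. The column values ("Jail", "Released", "Mon", "Tue") and the counts are made up.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V between two categorical sequences (0 = independent, 1 = perfect)."""
    n = len(xs)
    x_counts, y_counts = Counter(xs), Counter(ys)
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n  # count expected if the columns were independent
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy columns; NOT taken from the COMPAS dataset.
race = ["Black"] * 10 + ["White"] * 10
custody = ["Jail"] * 8 + ["Released"] * 2 + ["Jail"] * 2 + ["Released"] * 8
weekday = (["Mon"] * 5 + ["Tue"] * 5) * 2

v_custody = cramers_v(race, custody)
v_weekday = cramers_v(race, weekday)
print(f"custody vs race: V = {v_custody:.2f}")
print(f"weekday vs race: V = {v_weekday:.2f}")
```

A feature with a high V against the protected column is a candidate proxy. Whether to drop it is still a judgment call, because it may also carry legitimate signal.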

Black · BIASED: 87.16%

White · BIASED: 0.40%

Black · FAIR: 84.71%

White · FAIR: 69.02%

PROJECT — 02

AI HIRING

Women were hired 20.9% less often than equally qualified men (17.10% vs. 21.62%). The algorithm wasn't told to discriminate. It learned to.

FAIRNESS GAP BEFORE

4.51%

AFTER MITIGATION

0.12%

97.3% reduction

unfair.py — BIASED MODEL

--- BIASED MODEL OUTPUT ---

Male Candidate Hire Rate: 21.62%

Female Candidate Hire Rate: 17.10%

Original Fairness Gap: 4.51%

fair.py — MITIGATED MODEL

--- MITIGATED MODEL OUTPUT ---

Male Candidate Hire Rate: 11.48%

Female Candidate Hire Rate: 11.35%

New Fairness Gap: 0.12%

Key Insight

Dropping gender and age, retaining only Experience Years and Technical Test Score, collapsed the fairness gap from 4.51% to 0.12%. Merit features alone produce near-perfect demographic parity.
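The same hire rates can also be read through the four-fifths (80%) rule listed among the explainers. Under that rule, a group whose selection rate is below 80% of the highest group's rate is treated as showing adverse impact. The sketch below applies it to the rates reported above. The function names are illustrative and not taken from the project's code.

```python
def disparate_impact_ratio(rate_a, rate_b):
    """Lower selection rate divided by the higher one (1.0 = identical rates)."""
    low, high = sorted((rate_a, rate_b))
    return low / high


def passes_four_fifths(ratio, threshold=0.8):
    """True when the lower rate is at least 80% of the higher rate."""
    return ratio >= threshold


# Hire rates (in %) reported for the biased and mitigated models.
ratio_before = disparate_impact_ratio(17.10, 21.62)
ratio_after = disparate_impact_ratio(11.35, 11.48)

for label, ratio in (("biased", ratio_before), ("mitigated", ratio_after)):
    verdict = "passes" if passes_four_fifths(ratio) else "fails"
    print(f"{label}: ratio = {ratio:.3f} ({verdict} the 80% rule)")
```

On these figures the biased model sits just under the threshold (about 0.791) and the mitigated model is close to parity (about 0.989).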

Male · BIASED: 21.62%

Female · BIASED: 17.10%

Male · FAIR: 11.48%

Female · FAIR: 11.35%

PROJECT — 03

GERMAN CREDIT

A lending model rates young applicants as bad credit risks at a rate 7.16 percentage points higher than older applicants with identical financial profiles. It learned age from job tenure.

FAIRNESS GAP BEFORE

7.16%

AFTER MITIGATION

1.89%

73.6% reduction

unfair.py — BIASED MODEL

--- BIASED MODEL RESULTS ---

Older Applicants (30+) Good Credit: 83.97%

Young Applicants (<30) Good Credit: 76.81%

Fairness Gap: 7.16%

fair.py — MITIGATED MODEL

--- MITIGATED (UNBIASED) RESULTS ---

Older Applicants (30+) Good Credit: 80.15%

Young Applicants (<30) Good Credit: 78.26%

New Fairness Gap: 1.89%

Key Insight

Employment tenure looks like a legitimate financial signal — and it is. But it's also a near-perfect proxy for age. A 24-year-old cannot have 10 years of employment history. The model penalising short tenure was partially penalising youth. Dropping both age and employment forced it to evaluate what a borrower has — savings, credit history, loan purpose — rather than how long they've been alive.
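For a numeric feature such as tenure, a simple proxy screen asks how well the feature alone separates the two age groups. The sketch below computes a rank-based AUC: the probability that a randomly chosen older applicant has a longer tenure than a randomly chosen younger one, with ties counted as half. Values near 0.5 mean the feature says little about age; values near 0 or 1 mark a strong proxy. The tenure values are invented for illustration and do not come from the German Credit data.

```python
def proxy_auc(values, in_group):
    """P(value of a random group member > value of a random non-member); ties count half."""
    members = [v for v, m in zip(values, in_group) if m]
    others = [v for v, m in zip(values, in_group) if not m]
    if not members or not others:
        raise ValueError("both groups need at least one row")
    wins = sum((a > b) + 0.5 * (a == b) for a in members for b in others)
    return wins / (len(members) * len(others))


# Hypothetical employment tenure in years: five applicants under 30, then five aged 30+.
tenure_years = [1, 2, 0, 3, 1, 8, 12, 5, 10, 2]
is_older = [False] * 5 + [True] * 5

auc = proxy_auc(tenure_years, is_older)
print(f"AUC of tenure for predicting age group: {auc:.2f}")
```

The pairwise loop is quadratic in the number of rows, which is fine for a small audit sample but would need a sort-based version on a full dataset.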

Older · BIASED: 83.97%

Young · BIASED: 76.81%

Older · FAIR: 80.15%

Young · FAIR: 78.26%

PROJECT — 04

INSURANCE DENIAL

An insurance AI flags older patients for high-cost claims at a rate 7.93 percentage points higher than younger patients. BMI, smoking status, and diabetic status encode race without naming it.

AGE FAIRNESS GAP BEFORE

7.93%

AFTER MITIGATION

3.18%

60% reduction · Gender gap: 72% reduction

unfair.py — BIASED MODEL

--- BIASED MODEL RESULTS ---

Older (35+) High-Cost Flag Rate: 44.59%

Young (<35) High-Cost Flag Rate: 36.67%

Age Fairness Gap: 7.93%

Gender Fairness Gap: 5.44%

fair.py — MITIGATED MODEL

--- MITIGATED (UNBIASED) RESULTS ---

Older (35+) High-Cost Flag Rate: 50.68%

Young (<35) High-Cost Flag Rate: 47.50%

New Age Fairness Gap: 3.18%

New Gender Fairness Gap: 1.54%

Key Insight

BMI, smoker status, and diabetic diagnosis rates all differ significantly by race and class — so a model trained on them learns to discriminate by race without the word ever appearing. These are the CustodyStatus of health insurance: clinical-sounding features that carry protected-class signal because of structural inequalities baked into American healthcare. Dropping them alongside age and gender reduced the age gap by 60% and the gender gap by 72%.

Older · BIASED: 44.59%

Young · BIASED: 36.67%

Older · FAIR: 50.68%

Young · FAIR: 47.50%

PROJECT — 05

BENEFITS DENIAL

An automated means-test flags male applicants as ineligible at a rate 18 percentage points higher than female applicants, not because of what they earn, but because of who they're married to.

SEX FAIRNESS GAP BEFORE

18.00%

AFTER MITIGATION

8.52%

53% reduction · Race gap: 46% · Origin gap: 88%

unfair.py — BIASED MODEL

Model Accuracy: 85.66%

Male applicants : 25.71% flagged ineligible

Female applicants : 7.71% flagged ineligible

Fairness Gap (Sex): 18.00%

US-born : 20.20% flagged ineligible

Foreign-born : 15.81% flagged ineligible

Fairness Gap (Origin): 4.40%

Under 55 : 19.36% flagged ineligible

55+ (elderly) : 22.08% flagged ineligible

Fairness Gap (Age): −2.72%

White/Asian-PI : 21.22% flagged ineligible

Other minorities : 8.47% flagged ineligible

Fairness Gap (Race): 12.75%

fair.py — MITIGATED MODEL

Model Accuracy: 83.05%

Male applicants : 14.84% flagged ineligible

Female applicants : 6.32% flagged ineligible

New Fairness Gap (Sex): 8.52%

US-born : 12.08% flagged ineligible

Foreign-born : 11.55% flagged ineligible

New Fairness Gap (Origin): 0.52%

Under 55 : 11.61% flagged ineligible

55+ (elderly) : 14.41% flagged ineligible

New Fairness Gap (Age): −2.79%

White/Asian-PI : 12.81% flagged ineligible

Other minorities : 5.91% flagged ineligible

New Fairness Gap (Race): 6.90%

Key Insight

Automated benefits systems don't need to name sex or race to discriminate by them. relationship (Husband/Wife), marital.status, hours.per.week, and occupation are the CustodyStatus of welfare AI — features that sound purely economic but carry protected-class signal because of how work, caregiving, and labour markets are structurally organised. Dropping all four alongside the direct protected attributes reduced the sex gap by 53%, the race gap by 46%, and the national-origin gap by 88%.
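Because this project reports gaps for several protected attributes at once, the audit has to slice the same predictions more than one way. The sketch below shows one possible shape for that audit: for each attribute it computes the flag rate of a reference group minus the flag rate of a comparison group, keeping the sign (as in the negative age gap above). The record layout, the `flagged` key, and the toy rows are assumptions for illustration.

```python
def flag_rate(records, attribute, value, flag_key="flagged"):
    """Percentage of records with attribute == value that were flagged."""
    rows = [r for r in records if r[attribute] == value]
    if not rows:
        raise ValueError(f"no rows where {attribute} == {value!r}")
    return 100 * sum(r[flag_key] for r in rows) / len(rows)


def audit(records, comparisons, flag_key="flagged"):
    """Signed gap per attribute: rate(reference group) - rate(comparison group)."""
    return {
        attribute: flag_rate(records, attribute, ref, flag_key)
        - flag_rate(records, attribute, other, flag_key)
        for attribute, (ref, other) in comparisons.items()
    }


# Toy, made-up model output; NOT the Adult Census data.
records = [
    {"sex": "Male", "origin": "US-born", "flagged": 1},
    {"sex": "Male", "origin": "US-born", "flagged": 1},
    {"sex": "Male", "origin": "Foreign-born", "flagged": 0},
    {"sex": "Male", "origin": "Foreign-born", "flagged": 1},
    {"sex": "Female", "origin": "US-born", "flagged": 0},
    {"sex": "Female", "origin": "US-born", "flagged": 0},
    {"sex": "Female", "origin": "Foreign-born", "flagged": 1},
    {"sex": "Female", "origin": "Foreign-born", "flagged": 0},
]

report = audit(records, {"sex": ("Male", "Female"),
                         "origin": ("US-born", "Foreign-born")})
for attribute, gap in report.items():
    print(f"Fairness Gap ({attribute}): {gap:+.2f} points")
```

In the toy rows the sex gap is large while the origin gap is zero, which is why each attribute needs its own line in the report.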

Male · BIASED: 25.71%

Female · BIASED: 7.71%

Male · FAIR: 14.84%

Female · FAIR: 6.32%

PROJECT — 06

HEALTHCARE READMISSION

A hospital readmission model flags patients for high clinical risk using payer code and discharge destination — variables that measure insurance access, not medical severity.

AGE FAIRNESS GAP BEFORE

0.28%

AFTER MITIGATION

0.09%

Age: 68% reduction · Race: 25% reduction

unfair.py — BIASED MODEL

Model Accuracy: 88.79%

Male patients : 0.22% flagged high-risk

Female patients : 0.24% flagged high-risk

Fairness Gap (Gender): 0.02%

Caucasian/Asian : 0.25% flagged high-risk

Other minorities : 0.17% flagged high-risk

Fairness Gap (Race): 0.08%

Under 70 : 0.36% flagged high-risk

70+ (elderly) : 0.08% flagged high-risk

Fairness Gap (Age): 0.28%

fair.py — MITIGATED MODEL

Model Accuracy: 88.74%

Male patients : 0.11% flagged high-risk

Female patients : 0.06% flagged high-risk

New Fairness Gap (Gender): 0.04% ↑

Caucasian/Asian : 0.10% flagged high-risk

Other minorities : 0.04% flagged high-risk

New Fairness Gap (Race): 0.06%

Under 70 : 0.13% flagged high-risk

70+ (elderly) : 0.03% flagged high-risk

New Fairness Gap (Age): 0.09%

Key Insight

Healthcare readmission models don't need race or gender to discriminate by them. payer_code, discharge_disposition_id, medical_specialty, and number_inpatient are the CustodyStatus of clinical AI: features that look like neutral operational data but encode structural inequalities in insurance, geography, and access to preventive care. Removing them cut the age gap by 68% and the race gap by 25%. The gender gap widened slightly, from 0.02% to 0.04% (0.02 percentage points), so proxy removal does not guarantee that every gap improves at once. The causal direction matters: lower access to skilled nursing facilities (SNFs) creates readmission risk. The patient does not bring the risk to the gap; the gap creates the risk.
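Since mitigation narrowed two gaps here and widened a third, it helps to compare every gap before and after instead of reporting only the headline one. The sketch below does that with the gaps reported above. The function name and the status labels are illustrative.

```python
def gap_changes(before, after):
    """Classify each gap as narrowed, widened, or unchanged, with the change in points."""
    changes = {}
    for name, old in before.items():
        old_abs, new_abs = abs(old), abs(after[name])
        if new_abs < old_abs:
            status = "narrowed"
        elif new_abs > old_abs:
            status = "widened"
        else:
            status = "unchanged"
        changes[name] = (status, round(new_abs - old_abs, 2))
    return changes


# Fairness gaps (percentage points) reported for the biased and mitigated models.
before = {"gender": 0.02, "race": 0.08, "age": 0.28}
after = {"gender": 0.04, "race": 0.06, "age": 0.09}

changes = gap_changes(before, after)
for name, (status, delta) in changes.items():
    print(f"{name}: {status} ({delta:+.2f} points)")
```

A check like this could run after every retraining so that a regression on one attribute is not hidden by an improvement on another.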

Under 70 · BIASED: 0.36%

70+ · BIASED: 0.08%

Under 70 · FAIR: 0.13%

70+ · FAIR: 0.03%

06 What's Next

The roadmap

More datasets, more domains, more bias exposed and fixed. Follow the project on Instagram for updates.

COMPAS Criminal Justice

ProPublica · 70k+ records · 82% bias reduction

AI Recruitment Bias

Kaggle dataset · gender + age · 97.3% bias reduction

German Credit Lending Bias

UCI Statlog · age + tenure proxy · 73.6% bias reduction

Insurance Denial — Healthcare Bias

Kaggle · age + gender + BMI/smoker/diabetic proxies · 60% + 72% reduction

Benefits Denial — Welfare Eligibility Bias

UCI Adult Census · sex + race + origin + age + 4 proxies · 53% / 46% / 88% reduction

Explainer: Proxy Variables

Concept deep-dive · detection code · real-world proof

Explainer: Sampling Bias

Gender Shades study · audit code · mitigation strategies

Explainer: SHAP Values

Model explainability · bias auditing · Shapley attribution

Explainer: Equalized Odds

TPR + FPR parity · COMPAS proof · calibration conflict · fairlearn detection code

Explainer: Disparate Impact (80% Rule)

Four-Fifths Rule · EEOC standard · hiring audit proof · Griggs v. Duke Power

Explainer: Why Fairness Metrics Conflict

Chouldechova impossibility · COMPAS dual-reading · full audit code · base rate analysis

Explainer: Calibration

Differential calibration · COMPAS score bands · MACE detection code · Chouldechova trade-off

Explainer: Demographic Parity

EEOC 80% rule · hiring bias proof · multi-group audit code · impossibility trade-offs

Explainer: Feedback Loop Bias

Predictive policing case · retraining amplification · drift detection code · mitigation strategies

Explainer: Disparate Treatment

Title VII · feature set audit · two-stage treatment → impact check · proxy detection · McDonnell Douglas

Explainer: Label Bias

Biased ground truth · conditional label audit · noise-robust training · propensity matching · Jacobs & Wallach

Explainer: Individual Fairness

Lipschitz condition · matched-pair audit · consistency score · similarity metric · Dwork et al. 2012

Explainer: Counterfactual Fairness

Structural causal model · causal DAG · proxy vs resolving variables · COMPAS policing chain · Kusner et al. 2017

Explainer: What Happens Inside a Neural Network

Forward pass · weights · loss function · backpropagation · SHAP inspection · hiring bias proof

Explainer: Why AI Hallucinates

OOD confidence · sparse feature regions · Mata v. Avianca · insurance denial proof · RAG limitations

Explainer: Reinforcement Learning

Reward function design · COMPAS as RL-adjacent system · proxy exploitation · credit assignment failure

Explainer: Proxy Entanglement

Correlated proxy clusters · Healthcare Readmission proof · cluster removal · causal root analysis

Explainer: ML Bias

Training data bias · label bias · proxy variables · feedback loops · disparate impact vs treatment · COMPAS proof · detection code

Explainer: Data Leakage

Target leakage · train-test contamination · COMPAS CustodyStatus proof · temporal splitting · SMOTE order · detect_target_leakage() · check_preprocessing_leakage()

Facial Recognition Gaps

MIT Gender Shades methodology · coming soon

HMDA Mortgage Lending

Federal lending data · racial bias in loan approvals

Fairness Dashboard

Interactive web app for live bias analysis

The bias is real.
So is the fix.

Code that holds algorithms accountable

The pipeline

Load Dataset

Train Biased Model

Measure Fairness Gap

Remove Attributes + Proxies

Retrain & Re-measure
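The five steps above can be sketched end to end on synthetic data. To stay dependency-free, this sketch swaps the Random Forest classifier named on the page for a majority-vote lookup table, and it evaluates on the training rows instead of a held-out split. Both are simplifications. The `group`, `proxy`, `score`, and `label` fields and all counts are made up; they are arranged so that qualified members of group B were historically denied.

```python
from collections import Counter, defaultdict


def make_rows():
    """Step 1: synthetic 'dataset' with biased historical labels."""
    spec = [  # (group, proxy, score, label, count)
        ("A", 1, 1, 1, 40), ("A", 1, 0, 0, 40),
        ("A", 0, 1, 1, 10), ("A", 0, 0, 0, 10),
        ("B", 0, 1, 0, 40), ("B", 0, 0, 0, 40),  # qualified B rows denied
        ("B", 1, 1, 1, 10), ("B", 1, 0, 0, 10),
    ]
    rows = []
    for group, proxy, score, label, count in spec:
        rows += [{"group": group, "proxy": proxy, "score": score, "label": label}] * count
    return rows


def train(rows, features):
    """Steps 2 and 5: stand-in model that predicts the majority label per feature combination."""
    votes = defaultdict(Counter)
    for r in rows:
        votes[tuple(r[f] for f in features)][r["label"]] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    return lambda r: table[tuple(r[f] for f in features)]


def approval_gap(rows, model):
    """Step 3: approval rate of group A minus group B, in percentage points."""
    rate = {}
    for g in ("A", "B"):
        members = [r for r in rows if r["group"] == g]
        rate[g] = 100 * sum(model(r) for r in members) / len(members)
    return rate["A"] - rate["B"]


rows = make_rows()
feature_sets = [
    ["group", "proxy", "score"],  # biased model
    ["proxy", "score"],           # protected attribute removed, proxy kept
    ["score"],                    # step 4: attribute and proxy both removed
]
gaps = [approval_gap(rows, train(rows, fs)) for fs in feature_sets]
for fs, gap in zip(feature_sets, gaps):
    print(f"{fs}: gap = {gap:.1f} points")
```

On this toy data the gap goes from 40.0 to 30.0 points when only the protected attribute is dropped, and to 0.0 once the proxy goes too, which mirrors the pattern described for COMPAS.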
