Methodology

How Predict's case valuations are made.

The methodology is in the open: training data, model architecture, held-out test methodology, confidence-band derivation, and the recalibration policy. If a prediction misses outside the band, we recalibrate and disclose.

In plain terms: Predict gives a plaintiff PI attorney a defensible settlement number for a motor vehicle accident or premises liability case, tuned to your jurisdiction and shown with a 90% confidence band, backed by the comparable verdicts behind it. This page is how that number is made. The sections below get technical on purpose, because you should be able to defend the number, not just trust it.

What ships with each prediction inside Predict
  • The four facts that moved the number, so you can see why the model landed where it did.
  • A 90% confidence band, the range the case should land in 9 times out of 10 on the same facts.
  • 5 to 10 anonymized comparable verdicts you can attach as an appendix to a demand letter.

The sections below show how each of those is built.

Training data

The Predict model is trained on 312,000 jury verdicts and reported settlements from 2018 through 2025, drawn from 50 US jurisdictions. The dataset is plaintiff-only — every record is sourced from publicly available verdict and settlement filings, plaintiff-side reported settlements, and PACER docket extracts. No claims-side carrier data is used in training, and none will be.

Dataset composition
312,000 raw records · before dedup
320K segments after stratification
  • Motor vehicle accidents: 198K. In-scope · production predictions.
  • Premises liability: 72K. In-scope · production predictions.
  • Out-of-scope: 50K. Med-mal, mass tort, comp · transfer-learning signal only.
After deduplication and the 4-year rolling-window cut-off, the active training set is 248K records. The 50K out-of-scope records inform the model but are never used to produce a prediction in those case types.

The 312K count is the raw record count before deduplication. After deduplication and the 4-year rolling-window training cut-off, the active training set is roughly 248K records. The dataset composition lists 198K motor vehicle accident records, 72K premises liability records, and 50K records of out-of-scope case types (medical malpractice, mass tort, workers' compensation). The out-of-scope records are carried for transfer-learning signal but not used in production predictions.

Per-record attributes: case type, jurisdiction (state and county), liability classification, injury severity tier, medical specials, property damage, plaintiff-counsel firm, judge composition, defendant insurance posture, and outcome amount with verdict-versus-settlement classification.

Model architecture

The production model is a gradient-boosted regression ensemble — 800-tree XGBoost trained with monotonicity constraints on the severity and medical-specials features. Hyperparameters were tuned via 5-fold cross-validation on the pre-2024 training portion, with the 2024–2025 records held out as the test set.
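As a rough illustration of the configuration described above, the sketch below builds an XGBoost-style parameter dictionary with 800 trees and monotone constraints on the severity and medical-specials features. The feature names, their order, and the increasing direction of the constraints are assumptions, not Predict's published schema. `n_estimators` and `monotone_constraints` are real XGBoost parameter names; every other production hyperparameter is unknown and left out.

```python
# Hypothetical feature order; the real schema is not published on this page.
FEATURES = [
    "jurisdiction_fold",
    "injury_severity_tier",
    "medical_specials",
    "liability_clarity",
]

# Features assumed to be constrained so the prediction never decreases
# as the feature value increases.
MONOTONE_UP = {"injury_severity_tier", "medical_specials"}


def build_params(features, monotone_up, n_trees=800):
    """Return keyword arguments in the shape xgboost.XGBRegressor accepts."""
    # +1 = non-decreasing in this feature, 0 = unconstrained.
    constraints = [1 if name in monotone_up else 0 for name in features]
    return {
        "n_estimators": n_trees,
        # XGBoost accepts the constraint vector as a string such as "(0,1,1,0)".
        "monotone_constraints": "(" + ",".join(str(c) for c in constraints) + ")",
    }


params = build_params(FEATURES, MONOTONE_UP)
print(params)
```

The constraint vector is positional, so it has to be rebuilt whenever the feature order changes; deriving it from feature names, as here, avoids silent misalignment.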

Feature importance · top 8 features by mean SHAP contribution
  • Jurisdiction: 0.42
  • Injury severity: 0.31
  • Liability clarity: 0.22
  • Medical specials: 0.16
  • Case type: 0.13
  • Defendant policy: 0.10
  • Plaintiff counsel: 0.07
  • Property damage: 0.04
Per-prediction SHAP attributions surface in the workspace as "the four facts that moved the number." The aggregate ranking above is what a typical MVA prediction looks like; PL cases re-weight notice status and premise type higher.

Gradient-boosted regression was selected over neural network alternatives for two reasons. First, the feature space is tabular and modestly sized — the strong inductive bias of trees on tabular data is well-established. Second, interpretability matters in legal contexts more than in most ML applications: an attorney must be able to defend the number, which means understanding the feature contributions. The model produces per-prediction SHAP attributions inside the full Predict subscription, surfaced as "the four facts that moved the number" alongside each prediction.
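One plausible way to turn per-prediction attributions into "the four facts that moved the number" is to rank features by the absolute size of their contribution and keep the top four. The sketch below assumes that ranking rule; the attribution values and feature names are hypothetical, and the page does not state how Predict actually picks the four.

```python
def top_facts(attributions, k=4):
    """Return the k (feature, contribution) pairs with the largest absolute contribution."""
    ranked = sorted(attributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical attributions for one prediction, in dollars relative to a baseline.
example = {
    "jurisdiction": 21_000.0,
    "injury_severity": 15_500.0,
    "liability_clarity": -9_000.0,
    "medical_specials": 6_200.0,
    "case_type": 1_200.0,
    "property_damage": -400.0,
}

facts = top_facts(example)
for name, contribution in facts:
    print(f"{name}: {contribution:+,.0f}")
```

Ranking on absolute value keeps facts that pushed the number down (here, liability clarity) alongside the ones that pushed it up.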

Accuracy and held-out testing

The accuracy figure cited across the site, 90–92%, is based on the Median Absolute Percentage Error (MdAPE) on the held-out test set. MdAPE is an error metric, so the figures are stated as accuracy, that is, 100% minus MdAPE, by case type:

Held-out MdAPE · by case type
  • MVA: 92%
  • PL: 91%
Strictly temporal train / test split
  • Train: 2018-01-01 → 2023-12-31
  • Test: 2024-01-01 → 2025-12-31
Figures show held-out prediction accuracy by case type, measured against the realized settlement on the strictly temporal test set. Both case types clear the 90% mark.

MdAPE is the metric most appropriate for skewed long-tail damages distributions — mean-based error metrics over-weight the catastrophic tail and under-represent typical case performance. The methodology mirrors the held-out test design used in the underwriting-model literature.
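To make the contrast between median-based and mean-based error concrete, here is a minimal sketch that computes both on a small made-up sample containing one catastrophic-tail case. All dollar amounts are hypothetical and are not drawn from the Predict test set.

```python
from statistics import mean, median


def abs_pct_errors(predicted, actual):
    """Absolute percentage error of each prediction against its realized outcome."""
    return [abs(p - a) / a for p, a in zip(predicted, actual)]


def mdape(predicted, actual):
    """Median Absolute Percentage Error."""
    return median(abs_pct_errors(predicted, actual))


def mape(predicted, actual):
    """Mean Absolute Percentage Error, shown only for comparison."""
    return mean(abs_pct_errors(predicted, actual))


# Hypothetical outcomes: four typical cases and one catastrophic-tail case.
actual = [100_000, 150_000, 80_000, 200_000, 2_000_000]
predicted = [92_000, 162_000, 84_000, 190_000, 600_000]

print(f"MdAPE: {mdape(predicted, actual):.1%}")
print(f"MAPE:  {mape(predicted, actual):.1%}")
```

The single large miss on the tail case pulls the mean far above the median, which stays at the error of a typical case.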

The held-out split is strictly temporal — training data ends 2023-12-31, the test set is 2024-01-01 through 2025-12-31. No leak between train and test.
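The split described above can be sketched as a pure date filter. The record layout and the `outcome_date` field name are assumptions for illustration; only the boundary dates come from this page.

```python
from datetime import date

TRAIN_START = date(2018, 1, 1)
TRAIN_END = date(2023, 12, 31)
TEST_START = date(2024, 1, 1)
TEST_END = date(2025, 12, 31)


def temporal_split(records):
    """Split records into train and test purely by outcome date."""
    train = [r for r in records if TRAIN_START <= r["outcome_date"] <= TRAIN_END]
    test = [r for r in records if TEST_START <= r["outcome_date"] <= TEST_END]
    return train, test


def has_leak(train, test):
    """True if any training record is dated on or after the earliest test record."""
    if not train or not test:
        return False
    latest_train = max(r["outcome_date"] for r in train)
    earliest_test = min(r["outcome_date"] for r in test)
    return latest_train >= earliest_test


# Hypothetical records; only the dates matter for the split.
records = [
    {"id": "a", "outcome_date": date(2019, 6, 3)},
    {"id": "b", "outcome_date": date(2023, 12, 31)},
    {"id": "c", "outcome_date": date(2024, 1, 1)},
    {"id": "d", "outcome_date": date(2025, 6, 15)},
]

train, test = temporal_split(records)
print(len(train), len(test), has_leak(train, test))
```

A date-only check like `has_leak` does not catch the same case appearing on both sides under different records; that would need a separate identity check.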

What this means at your desk: when an offer sits well below the bottom of the case's 90% band, that is the model telling you the number deserves a second look before you accept it. The headline is the number; the band is where the case should land 9 times out of 10 on the same facts.

Confidence-band derivation

Every Predict prediction ships with a 90% confidence interval. The band is computed from quantile regression — a parallel set of 800-tree XGBoost models trained against the 5th and 95th percentiles of the conditional outcome distribution, rather than the median.
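Quantile regression works by minimizing the pinball (quantile) loss instead of squared error. The sketch below is a toy illustration, not the production training loop: it searches a hypothetical sample of outcomes for the single constant that minimizes the pinball loss at the 5th and 95th percentiles. The sample values are made up.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball loss for one observation at quantile level q (0 < q < 1)."""
    diff = y_true - y_pred
    # Under-predictions are weighted by q, over-predictions by (1 - q).
    return q * diff if diff >= 0 else (q - 1) * diff


def mean_pinball(outcomes, candidate, q):
    """Average pinball loss of one constant prediction over all outcomes."""
    return sum(pinball_loss(y, candidate, q) for y in outcomes) / len(outcomes)


# Hypothetical settlement outcomes in $K: 1, 2, ..., 99.
outcomes = list(range(1, 100))

# The constant that minimizes the loss approximates the sample quantile.
best_low = min(outcomes, key=lambda c: mean_pinball(outcomes, c, 0.05))
best_high = min(outcomes, key=lambda c: mean_pinball(outcomes, c, 0.95))

print(best_low, best_high)
```

In a quantile model the constant is replaced by a function of the case features, so the lower and upper bounds move with the inputs rather than being fixed for the whole dataset.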

The band widens with two conditions: (1) injury severity — catastrophic cases carry materially wider conditional distributions than minor cases; (2) data density — predictions in jurisdictions with fewer than 100 comparable verdicts in the training set widen by a density correction factor.

Operational interpretation: 9 out of 10 cases with the same input profile settle inside the published range. The remaining 10% are the long tail — high-stakes outliers, judges with unusual records, plaintiff counsel with extraordinary individual histories.
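The "9 out of 10" reading can be checked directly as empirical coverage: the share of realized outcomes that fall inside their published band. A minimal sketch, with a hypothetical record layout and made-up numbers:

```python
def empirical_coverage(cases):
    """Fraction of cases whose realized outcome falls inside the published band."""
    inside = sum(1 for c in cases if c["low"] <= c["outcome"] <= c["high"])
    return inside / len(cases)


# Hypothetical cases sharing one input profile and therefore one band.
realized = [95_000, 110_000, 120_000, 125_000, 131_000,
            140_000, 152_000, 160_000, 175_000, 240_000]
cases = [{"low": 90_000, "high": 180_000, "outcome": amount} for amount in realized]

coverage = empirical_coverage(cases)
print(f"Empirical coverage: {coverage:.0%}")
```

Coverage well below 90% on held-out cases suggests the band is too narrow; coverage well above it suggests the band is wider than it needs to be.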


Jurisdiction folds

Jurisdiction is the single largest factor in settlement value. The model treats jurisdiction with stratified folds — 50 state-level folds, with additional county-level folds for the 14 counties with ≥150 verdicts in the training set (Harris TX, Cook IL, Los Angeles CA, Bronx NY, Miami-Dade FL, Alameda CA, Fulton GA, Maricopa AZ, Philadelphia PA, Dallas TX, Bexar TX, Cuyahoga OH, Kings NY, Queens NY).

County-level folds · the 14 counties with ≥ 150 verdicts
  • TX: Harris, Dallas, Bexar
  • NY: Bronx, Kings, Queens
  • CA: Los Angeles, Alameda
  • IL: Cook
  • FL: Miami-Dade
  • GA: Fulton
  • AZ: Maricopa
  • PA: Philadelphia
  • OH: Cuyahoga
50 state folds · all 50 US jurisdictions modeled separately
Lower-density jurisdictions: wider confidence band
Counties with thin verdict density still get a prediction; the band widens by a density correction factor instead. No silent failure mode for under-represented jurisdictions.

The fold architecture matters because state-level PI economics differ by an order of magnitude. A jurisdiction-agnostic model would over-predict no-fault states and under-predict high-recovery tort jurisdictions. The stratified fold approach is what allows the 90–92% MdAPE-based accuracy figure to hold across the 50-state range rather than collapsing to an aggregate average.
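The fold lookup implied above can be sketched as a simple rule: use the county-level fold when the county is one of the 14 listed, otherwise fall back to the state fold. The function name and the fold-key format are hypothetical; the county list is the one given on this page.

```python
# The 14 counties with a dedicated fold, as (state, county) pairs.
COUNTY_FOLDS = {
    ("TX", "Harris"), ("IL", "Cook"), ("CA", "Los Angeles"), ("NY", "Bronx"),
    ("FL", "Miami-Dade"), ("CA", "Alameda"), ("GA", "Fulton"), ("AZ", "Maricopa"),
    ("PA", "Philadelphia"), ("TX", "Dallas"), ("TX", "Bexar"), ("OH", "Cuyahoga"),
    ("NY", "Kings"), ("NY", "Queens"),
}


def resolve_fold(state, county):
    """Return the county fold key if one exists, otherwise the state fold key."""
    if (state, county) in COUNTY_FOLDS:
        return f"{state}/{county}"
    return state


print(resolve_fold("TX", "Harris"))  # county-level fold
print(resolve_fold("TX", "Travis"))  # falls back to the state fold
```

Keying on the (state, county) pair rather than the county name alone avoids collisions between same-named counties in different states.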

Recalibration policy

If a prediction misses outside the published 90% confidence band, the model is recalibrated for that jurisdiction and the recalibration is disclosed. This is a brand commitment, not a marketing line.
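A minimal sketch of how out-of-band misses might be tallied per jurisdiction so that each affected jurisdiction can be queued for recalibration. The record layout, field names, and figures are hypothetical; the page does not describe the internal trigger mechanism.

```python
from collections import Counter


def out_of_band_by_jurisdiction(scored):
    """Count realized outcomes that landed outside the published band, per jurisdiction."""
    misses = Counter()
    for case in scored:
        if not (case["low"] <= case["outcome"] <= case["high"]):
            misses[case["jurisdiction"]] += 1
    return dict(misses)


# Hypothetical scored predictions.
scored = [
    {"jurisdiction": "TX/Harris", "low": 120_000, "high": 190_000, "outcome": 152_000},
    {"jurisdiction": "TX/Harris", "low": 110_000, "high": 170_000, "outcome": 205_000},
    {"jurisdiction": "OH", "low": 40_000, "high": 75_000, "outcome": 31_000},
    {"jurisdiction": "IL/Cook", "low": 90_000, "high": 160_000, "outcome": 99_000},
]

recalibration_queue = out_of_band_by_jurisdiction(scored)
print(recalibration_queue)
```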

Quarterly model refresh · public version history
  • v3.0 · 2025-10-01 · Shipped
  • v3.1 · 2026-01-01 · Shipped
  • v3.2 · 2026-04-01 · Current
  • v3.3 · 2026-07-01 · Scheduled
Continuous on the feedback side: new outcomes are scored against prior predictions weekly. The quarterly full refresh ships with a per-jurisdiction MdAPE delta changelog. Out-of-band misses trigger jurisdiction-specific recalibration and a public disclosure.

Recalibration cadence: continuous on the verdict-feedback side (new outcomes are scored against prior predictions weekly), with a full model refresh on a quarterly cadence. Each quarterly refresh ships with a changelog documenting the per-jurisdiction MdAPE delta versus the prior model.
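The per-jurisdiction changelog entry is, at its simplest, the current model's MdAPE minus the prior model's MdAPE for each jurisdiction present in both. The sketch below assumes that definition; the jurisdictions and MdAPE values shown are hypothetical, not published figures.

```python
def mdape_delta(prior, current):
    """Per-jurisdiction change in MdAPE (current minus prior); negative means lower error."""
    shared = sorted(prior.keys() & current.keys())
    return {j: round(current[j] - prior[j], 4) for j in shared}


# Hypothetical MdAPE values (as fractions) for two model versions.
prior_version = {"TX": 0.091, "NY": 0.102}
current_version = {"TX": 0.084, "NY": 0.105, "FL": 0.090}

changelog = mdape_delta(prior_version, current_version)
for jurisdiction, delta in changelog.items():
    print(f"{jurisdiction}: {delta:+.3f}")
```

Jurisdictions that appear in only one version (FL above) have no delta and would need to be listed separately as newly added or dropped.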

The model versioning is public. The current production model is v3.2, refreshed 2026-04-01. The next scheduled refresh is 2026-07-01.

Sourcing and citations

Every prediction inside the full Predict subscription cites 5–10 comparable verdicts drawn from the training set. The citation includes the jurisdiction, case-type classification, severity tier, medical specials, and outcome amount — anonymized to the case-caption level but specific enough to support the prediction defensively.
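The page does not say how comparables are chosen. As one hedged possibility, the sketch below filters on an exact match of jurisdiction, case type, and severity tier, then ranks by closeness of medical specials and caps the cohort at 10. The matching rule, field names, and query case are assumptions; the three verdict records are the sample citations shown on this page.

```python
def select_comparables(pool, query, max_n=10):
    """Return up to max_n verdicts matching the query profile, closest medical specials first."""
    profile = ("state", "county", "case_type", "severity")
    matches = [v for v in pool if all(v[key] == query[key] for key in profile)]
    matches.sort(key=lambda v: abs(v["medical_specials"] - query["medical_specials"]))
    return matches[:max_n]


# Sample citations from this page.
pool = [
    {"id": "HC-24-0832", "state": "TX", "county": "Harris", "case_type": "MVA",
     "severity": "moderate", "medical_specials": 19_000, "outcome": 152_000},
    {"id": "HC-24-1102", "state": "TX", "county": "Harris", "case_type": "MVA",
     "severity": "moderate", "medical_specials": 14_000, "outcome": 138_500},
    {"id": "HC-23-2218", "state": "TX", "county": "Harris", "case_type": "MVA",
     "severity": "moderate", "medical_specials": 22_000, "outcome": 165_000},
]

# Hypothetical case being valued.
query = {"state": "TX", "county": "Harris", "case_type": "MVA",
         "severity": "moderate", "medical_specials": 20_000}

comparables = select_comparables(pool, query)
print([v["id"] for v in comparables])
```

The sketch enforces only the upper bound of the 5–10 range; what happens when fewer than five matches exist is not specified on this page and is left open here.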

Sample comparable-verdict citations · what gets attached
  • HC-24-0832 · TX · HARRIS · 2024 · $152,000
    MVA · rear-end · moderate severity
    Cervical strain · $19K medical specials · clear liability · commercial defendant policy
  • HC-24-1102 · TX · HARRIS · 2024 · $138,500
    MVA · rear-end · moderate severity
    Cervical · $14K medical specials · clear liability · standard commercial policy
  • HC-23-2218 · TX · HARRIS · 2023 · $165,000
    MVA · rear-end · moderate severity
    Cervical strain · $22K medical specials · clear liability · commercial fleet defendant
Each citation carries enough specificity to support the prediction under cross, anonymized to the case-caption level. The full cohort (typically 5–10 verdicts) ships with every prediction inside the trial and as an appendix on the demand-letter export.

The annual State of PI Case Valuation report sources every chart and every claim against the underlying training data. Every accuracy claim names the held-out test set. Every jurisdiction stat names the year and the N.

What the model does not do

Scope · what Predict will and won't produce
  • Motor vehicle accidents: rear-end, intersection, commercial vehicle, wrongful death. Calibrated.
  • Premises liability: slip-and-fall, retail, commercial, residential. Calibrated.
  • Medical malpractice: out of scope at launch. No prediction returned.
  • Mass tort & class actions: out of scope. Aggregation economics differ structurally.
  • Trial-verdict prediction: output is expected settlement value, not jury verdict.
  • Legal opinions: statistical estimate, not legal advice. Attorney owns the judgment.

The model refuses to produce a prediction outside its calibrated case types. We'd rather say "not yet" than ship a number we can't defend. Emerging case types get added to training with each quarterly refresh, with appropriately wider bands until density supports tightening.
  • The model is not a legal opinion. Predictions are statistical estimates of likely settlement value, not legal advice on case viability, recovery strategy, or settlement positioning. The attorney owns the legal judgment.
  • The model is not calibrated outside MVA and PL. Medical malpractice, mass tort, workers compensation, family law, and immigration cases are out of scope at launch. Predictions on out-of-scope case types are not produced.
  • The model does not predict trial outcomes. The output is the expected settlement value, not the expected jury verdict. The two diverge materially in many jurisdictions — and where the divergence is large, the band widens to reflect it.
  • The model does not know about cases that haven't been filed. Predictions are calibrated against the historical training set; emerging case types (autonomous-vehicle cases, new statutory tort regimes) are added to the training set with each quarterly refresh, with appropriately wider bands until the data density supports tightening.
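The refusal behavior for uncalibrated case types can be sketched as a guard in front of the model call. The function name, the case-type codes, and the stand-in model below are hypothetical; the only thing taken from this page is that MVA and PL are in scope and everything else returns no prediction.

```python
IN_SCOPE_CASE_TYPES = {"MVA", "PL"}


def predict_or_refuse(case, model):
    """Return the model's estimate for in-scope case types, or None for anything else."""
    if case["case_type"] not in IN_SCOPE_CASE_TYPES:
        return None  # no number is produced for an uncalibrated case type
    return model(case)


def stand_in_model(case):
    """Placeholder for the real model: a fixed multiple of medical specials."""
    return 3 * case["medical_specials"]


in_scope = predict_or_refuse({"case_type": "MVA", "medical_specials": 20_000}, stand_in_model)
refused = predict_or_refuse({"case_type": "MED_MAL", "medical_specials": 20_000}, stand_in_model)
print(in_scope, refused)
```

Returning `None` rather than a wide-band guess keeps the caller from mistaking an uncalibrated output for a defensible number.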