Machine-learning reserving and individual-claim modelling: the next reserving frontier

Triangle methods aren’t broken. They’re just leaving information on the table. The interesting reserving research of the last two years is not about replacing chain-ladder — it is about getting at the claim-level signal that aggregate cells throw away, without losing the audit trail that aggregate methods quietly give you for free.

By Meyr Kruger, FASSA, Actuary

Chain-ladder, Bornhuetter-Ferguson and Cape Cod remain valuable precisely because they are transparent and familiar to auditors, boards and regulators. The frontier is to extend reserving from aggregate patterns to richer claim-level information — using policy, claim, claimant, operational and development data that disappears when claims are squashed into accident-period and development-period cells.

Why the reserving problem has changed

Aggregate methods hide the drivers that matter most to reserve risk: changes in claim mix, inflation, litigation behaviour, repair networks, claim-handling practices, settlement speed, bodily injury severity, catastrophe exposure and large-loss development. Modern insurers have far more granular data than the original triangle methods were designed to use. The question is how to use it responsibly. Individual-claim reserving estimates outstanding liability at claim level and aggregates upward. Machine learning offers tools for non-linear relationships, high-cardinality interactions and development patterns that are awkward to capture in a triangle.

The bridge from chain-ladder to claim-level

Wüthrich’s 2018 work established that machine-learning techniques can be applied at individual-claim level. The ASTIN working party on Individual Claim Development made the same case from the data side — aggregation is information loss, and ML deserves serious investigation in reserving.

The more recent Richman and Wüthrich bridge paper matters because it restructures the chain-ladder prediction procedure into direct projection-to-ultimate factors, creating a natural conceptual pathway from the familiar method to claim-level ML. Their follow-up, “One-Shot Individual Claims Reserving”, continues the push toward something the industry can actually adopt. Avanzi and co-authors have opened a third front by treating micro-level reserving as a sequential decision process: a reinforcement-learning approach that updates outstanding-liability estimates as claims develop, which is closer to how reserving actually works than a single-shot prediction.
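To make the projection-to-ultimate idea concrete, the sketch below shows only the classical identity it rests on: a chain-ladder ultimate is the latest cumulative amount multiplied by the product of the remaining age-to-age factors. The triangle values and function names are invented for illustration, and this is the aggregate arithmetic only, not the estimation procedure used in the papers cited above.

```python
from math import prod

# Hypothetical cumulative paid triangle: rows are accident periods,
# columns are development periods; more recent rows are shorter.
TRIANGLE = [
    [100.0, 150.0, 165.0],
    [110.0, 165.0],
    [120.0],
]


def age_to_age_factors(triangle):
    """Volume-weighted factors f_j = sum(C[i][j+1]) / sum(C[i][j])."""
    n_dev = len(triangle[0])
    factors = []
    for j in range(n_dev - 1):
        # Only rows that have been observed at development period j + 1.
        rows = [r for r in triangle if len(r) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    return factors


def projection_to_ultimate(factors):
    """PtU factor at development period j: product of all later factors."""
    return [prod(factors[j:]) for j in range(len(factors) + 1)]


def ultimates(triangle):
    """Latest cumulative amount times the PtU factor for its development age."""
    ptu = projection_to_ultimate(age_to_age_factors(triangle))
    return [row[-1] * ptu[len(row) - 1] for row in triangle]
```

Here the age-to-age factors are 1.5 and 1.1, so a claim cohort observed only at the first development period is projected with a single factor of about 1.65. Working with one factor per development age, rather than a chain of them, is what makes it natural to ask whether that factor should depend on claim-level features.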

The four wins, named

Claim-level reserving earns its place in four specific ways:

Mix versus inflation versus handling. A triangle that is “developing badly” rarely tells you which of those is the cause. Claim-level models can separate them.
Better case estimates and IBNR. Notification delay, jurisdiction, peril, repair channel, claimant age, medical codes, litigation flags and text-derived features are all available at claim level, and all are discarded when claims are aggregated into a triangle.
Reserve uncertainty. ML does not solve uncertainty on its own, but it combines naturally with bootstrap, simulation, Bayesian methods and scenario testing.
Operational insight. A good reserving model surfaces claim cohorts that drive adverse development, late settlements or strengthening. That information is useful to the claims function long before it shows up in a finance disclosure.
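As one illustration of the reserve-uncertainty point in the list above, the sketch below applies a simple residual-resampling bootstrap to claim-level predictions. Every number and name is hypothetical. It assumes a holdout of settled claims is available for computing actual-to-predicted ratios, and it ignores parameter uncertainty and dependence between claims, so it should be read as a starting point rather than a full uncertainty model.

```python
import random

# Hypothetical holdout of settled claims: actual outstanding / predicted outstanding.
HOLDOUT_RATIOS = [0.6, 0.9, 1.0, 1.1, 1.8]

# Hypothetical model predictions of outstanding liability for open claims.
OPEN_CLAIM_PREDICTIONS = [1200.0, 300.0, 4500.0, 800.0]


def simulate_totals(predictions, ratios, n_sims=2000, seed=42):
    """Simulate portfolio totals by resampling an empirical ratio per claim."""
    rng = random.Random(seed)  # fixed seed so the run can be reproduced
    totals = []
    for _ in range(n_sims):
        totals.append(sum(p * rng.choice(ratios) for p in predictions))
    return sorted(totals)


def percentile(sorted_values, q):
    """Simple rank-based lookup on an already sorted list (0 <= q <= 1)."""
    return sorted_values[int(q * (len(sorted_values) - 1))]
```

A call such as `percentile(simulate_totals(OPEN_CLAIM_PREDICTIONS, HOLDOUT_RATIOS), 0.95)` then gives a rough upper percentile of the simulated total to set beside the point estimate.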

And the five ways it goes wrong

The risks are equally specific:

False precision. A complex model produces highly detailed numbers that look authoritative but rest on a weak data foundation.
Poor explainability. Reserving sits in financial statements; auditors and boards need to understand why the number moved.
Data leakage. Extracts that include post-valuation updates quietly destroy back-tests.
Operational drift. Handling practices, inflation and litigation patterns change, and the model that worked last year stops working without anyone noticing.
Notebook risk. A model that affects financial reporting cannot live on one analyst’s laptop.

None of these is fatal. All of them are addressed by the same engineering discipline that should already exist around the actuarial estate.
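The data-leakage risk is the easiest of these to show in code. The sketch below rebuilds each claim's state using only records entered on or before the valuation date. The record layout and field names are assumptions; the key design choice is filtering on the date a record was entered into the system rather than the date of the underlying event.

```python
from datetime import date

# Hypothetical claim transaction log. "recorded_on" is the date the record
# was entered into the system, which is what a valuation cut must respect.
TRANSACTIONS = [
    {"claim_id": "C1", "recorded_on": date(2023, 11, 5), "paid": 1000.0, "case_reserve": 4000.0},
    {"claim_id": "C1", "recorded_on": date(2024, 2, 10), "paid": 2500.0, "case_reserve": 0.0},
    {"claim_id": "C2", "recorded_on": date(2024, 1, 15), "paid": 0.0, "case_reserve": 8000.0},
]


def snapshot_as_of(transactions, valuation_date):
    """Per-claim state using only records entered on or before valuation_date."""
    state = {}
    for t in sorted(transactions, key=lambda rec: rec["recorded_on"]):
        if t["recorded_on"] > valuation_date:
            continue  # post-valuation information would leak into a back-test
        s = state.setdefault(t["claim_id"], {"paid_to_date": 0.0, "case_reserve": 0.0})
        s["paid_to_date"] += t["paid"]
        s["case_reserve"] = t["case_reserve"]  # latest reserve known at the cut
    return state
```

With a valuation date of 31 December 2023, claim C1 appears with 1,000 paid and a 4,000 case reserve, and claim C2 does not appear at all because it had not yet been recorded. An extract taken from the current state of the claims system would show neither of those facts.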

The implementation order that works

A sensible modernisation programme starts with diagnostics, not replacement. Build a clean reserving data mart that joins policy, claim, payment, reserve, exposure and operational data on valuation dates. Reconcile it back to finance and to the existing triangles. Run ML models in parallel with classical methods — not instead of them — and ask whether they explain residual development, large-loss behaviour or claim-mix change. Only after enough evidence builds up should a team consider an ML output as a selected method or formal input to booked reserves.
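The reconciliation step described above can be sketched as follows: aggregate claim-level payments into incremental triangle cells and list every cell that differs from the existing triangle. The data, the annual grain and the tolerance are all hypothetical; a real data mart would carry many more dimensions.

```python
from collections import defaultdict

# Hypothetical claim-level payments: (accident_year, payment_year, amount).
PAYMENTS = [
    (2022, 2022, 60.0),
    (2022, 2022, 40.0),
    (2022, 2023, 50.0),
    (2023, 2023, 110.0),
]

# Hypothetical incremental triangle from the existing reserving process,
# keyed by (accident_year, development_year).
BOOKED = {(2022, 0): 100.0, (2022, 1): 50.0, (2023, 0): 112.0}


def build_incremental_triangle(payments):
    """Sum claim-level payments into (accident_year, development_year) cells."""
    cells = defaultdict(float)
    for accident_year, payment_year, amount in payments:
        cells[(accident_year, payment_year - accident_year)] += amount
    return dict(cells)


def reconcile(built, booked, tolerance=0.005):
    """Return cells that differ by more than the tolerance.

    Cells present in only one source are compared against zero, so a
    missing cell shows up as a break rather than being silently skipped.
    """
    breaks = {}
    for cell in sorted(set(built) | set(booked)):
        diff = built.get(cell, 0.0) - booked.get(cell, 0.0)
        if abs(diff) > tolerance:
            breaks[cell] = diff
    return breaks
```

In this toy data the only break is the 2023 accident year at development year 0, where the claim-level build is 2.0 lower than the booked triangle. Breaks like that need an explanation before any claim-level model output is compared with a classical method.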

Model governance is the same governance the firm should already have on its core methods: version control, documented data cuts, feature definitions, validation reports, holdout testing, back-testing across valuation dates, sensitivity analysis, independent review, and an explicit policy on where actuarial judgement overrides the model output. Where AI is involved in any of this, we apply the controls described on our How we use AI page.
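Back-testing across valuation dates can be as simple as an actual-versus-expected comparison run identically for every method. The sketch below uses invented figures and method labels, and it assumes each expected figure was produced from a data cut that respects its valuation date.

```python
# Hypothetical back-test records:
# (valuation_date, method, expected_paid_next_12m, actual_paid_next_12m)
BACKTEST = [
    ("2021-12-31", "chain_ladder", 100.0, 110.0),
    ("2022-12-31", "chain_ladder", 120.0, 126.0),
    ("2021-12-31", "claim_level_ml", 104.0, 110.0),
    ("2022-12-31", "claim_level_ml", 128.0, 126.0),
]


def summarise(records):
    """Mean relative error (bias) and mean absolute relative error per method."""
    errors = {}
    for _, method, expected, actual in records:
        errors.setdefault(method, []).append((actual - expected) / expected)
    return {
        method: {
            "mean_error": sum(errs) / len(errs),
            "mean_abs_error": sum(abs(e) for e in errs) / len(errs),
        }
        for method, errs in errors.items()
    }
```

Reporting bias and absolute error separately matters: a method can look accurate on average while swinging widely between valuation dates, and two valuation dates are far too few to conclude anything in practice.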

The questions the reviewer should be asking

Before adopting any of this, force the modelling team to answer seven questions:

Does the model add insight beyond existing methods?
Can we explain reserve movements to auditors, boards and management?
Are valuation-date data cuts controlled?
Do we understand which features drive the result?
Is the model robust across underwriting cycles and operational changes?
Can we reproduce every model run?
Are judgemental selections documented separately from model outputs?

Anything less and the technology will outrun the actuarial control.
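The reproducibility question has a fairly mechanical answer. One possible approach, sketched below with illustrative field names, is to store a small manifest with every run that fingerprints the data extract and records the code version and parameters. It assumes the extract rows can be serialised to JSON.

```python
import hashlib
import json


def run_manifest(extract_rows, valuation_date, code_version, params):
    """Fingerprint one model run; all field names are illustrative."""
    # Canonical JSON: sorted keys and fixed separators, so the same rows
    # always hash to the same value regardless of dict key order.
    canonical = json.dumps(extract_rows, sort_keys=True, separators=(",", ":"))
    return {
        "valuation_date": valuation_date,
        "code_version": code_version,
        "params": params,
        "data_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }
```

If two runs with the same manifest give different reserves, something outside the recorded inputs is driving the result. Note that row order still affects the hash, so the extract should be sorted deterministically before fingerprinting.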

If you are scoping a move from aggregate to claim-level reserving without losing the governance you have today, our Finance Modernisation practice does this end-to-end.
