The “data warehouse vs data lake” debate is over. The lakehouse won, and most insurers and banks are now choosing between Snowflake, Databricks and Microsoft Fabric rather than between two architectural patterns. This article is a practical orientation for an actuarial or finance leader scoping their next data investment.

What changed

Three years ago a serious insurer’s data architecture meant a Teradata or SQL Server warehouse for finance, an S3 or Hadoop lake for everything else, and a chain of brittle pipelines moving data between them. The two were genuinely different: the warehouse was structured, schema-on-write, expensive per terabyte and performant for known queries; the lake was raw, schema-on-read, cheap, slow to query and hard to govern.

By 2026, that distinction is mostly historical. The lakehouse pattern — a single store that holds both raw and curated data, with table-level governance and ACID transactions — is delivered out of the box by Snowflake, Databricks (Delta Lake), and Microsoft Fabric (OneLake). The architectural argument has collapsed into a vendor choice.

What still matters in the choice

What you already run. If your organisation has Office 365 and Power BI live, Microsoft Fabric is the path of least resistance. If you have Spark workloads, ML pipelines and engineers who know Python, Databricks is the natural fit. If you are a SQL shop whose finance, BI and analyst teams do not write Spark, Snowflake gives you the lowest friction.

How you handle credit and risk modelling. Banks running IFRS 9 ECL pipelines tend to land on Databricks because Spark spreads the computation across large exposure datasets. Insurers running IFRS 17 or SAM-style cashflow projections tend to favour Snowflake because their queries are aggregation-heavy but do not need the same degree of distributed compute.

Where governance and lineage live. All three platforms now do table-level lineage tolerably. The question is whether your team will use it. We have seen estates where the platform was capable but nobody had the discipline to mark “this is the source of truth for the capital ratio”. The platform did not fix the problem; it exposed the problem.

What we see go wrong

Lift-and-shift without redesign. Migrating an old warehouse table-for-table into the lakehouse, with the old joins and the old transformations and the old quirks. The new platform inherits all the old pain. You spend twelve months migrating and end up with a more expensive version of what you had.

Treating the lake as a dumping ground. Connect the policy admin system, dump everything into the bronze layer, “we will sort it out later”. Two years later there are 40,000 untagged tables and nobody can find the canonical exposure dataset.
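One way to catch this early is a routine audit of the catalog. The sketch below assumes you can export the catalog as a list of entries with a name, an owner and a set of tags; the TableEntry shape, the "source_of_truth" tag and the table names are all hypothetical and are not any platform's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class TableEntry:
    """One row of a hypothetical catalog export (field names are assumptions)."""
    name: str
    owner: Optional[str] = None
    tags: Set[str] = field(default_factory=set)


def untagged_tables(catalog: List[TableEntry]) -> List[str]:
    """Names of tables that lack an owner or carry no tags at all."""
    return sorted(t.name for t in catalog if not t.owner or not t.tags)


def canonical_tables(catalog: List[TableEntry], domain: str) -> List[str]:
    """Tables tagged with the domain and marked as the source of truth."""
    wanted = {"source_of_truth", domain}
    return sorted(t.name for t in catalog if wanted <= t.tags)


# Illustrative catalog; every name here is made up.
catalog = [
    TableEntry("bronze.policy_admin_dump_2024"),
    TableEntry("gold.exposure", owner="actuarial",
               tags={"exposure", "source_of_truth"}),
    TableEntry("silver.exposure_tmp_v3", owner="risk"),
]

print(untagged_tables(catalog))
print(canonical_tables(catalog, "exposure"))
```

Run on a schedule, a report like this turns "we will sort it out later" into a list that someone has to own.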

Buying the platform without buying the discipline. The lakehouse is software; the discipline is people. Versioned data products, named owners, contract tests at the boundary, lineage that is actually used — these are practices, not features. The platform helps. The platform does not do them for you.
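To make "contract tests at the boundary" concrete, here is a minimal sketch in plain Python. It assumes rows arrive as dictionaries and that the contract is simply a set of required columns and types; EXPOSURE_CONTRACT, the column names and both function names are illustrative assumptions, not a feature of any of the three platforms.

```python
# Hypothetical contract for an exposure data product: column name -> expected type.
EXPOSURE_CONTRACT = {
    "policy_id": str,
    "exposure_amount": float,
    "currency": str,
}


def contract_violations(rows, contract):
    """Return readable violations; an empty list means the contract holds."""
    violations = []
    for i, row in enumerate(rows):
        # Columns the contract requires but the row does not carry.
        for col in sorted(contract.keys() - row.keys()):
            violations.append(f"row {i}: missing column '{col}'")
        # Columns that are present but have the wrong type.
        for col, expected in contract.items():
            if col in row and not isinstance(row[col], expected):
                violations.append(
                    f"row {i}: '{col}' is {type(row[col]).__name__}, "
                    f"expected {expected.__name__}"
                )
    return violations


def enforce_contract(rows, contract=EXPOSURE_CONTRACT):
    """Raise AssertionError on any violation, so a CI job calling this fails."""
    violations = contract_violations(rows, contract)
    assert not violations, "\n".join(violations)


good_rows = [{"policy_id": "P-1", "exposure_amount": 1200.0, "currency": "ZAR"}]
# Simulates an upstream change: amount became a string, currency was dropped.
bad_rows = [{"policy_id": "P-2", "exposure_amount": "1200"}]

enforce_contract(good_rows)  # passes silently
```

Calling enforce_contract(bad_rows) raises, which is the point: the break surfaces in the build rather than in a finance report. A real contract would usually also cover nullability, value ranges and freshness.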

What good looks like

A working lakehouse for an insurer or bank, two years in, looks like this. There are three or four named data products that finance, actuarial and risk teams agree are the source of truth — exposure, policy, claims, treaty for an insurer; account, customer, transaction, exposure for a bank. Each has an owner. Each has a contract that fails the build if upstream changes break it. Lineage is wired through to the BI layer so the executive dashboard’s headline number traces back to the policy admin system in a click.
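The "traces back in a click" idea is a walk over a lineage graph. The sketch below assumes lineage has been exported as a simple mapping from each asset to the assets it is built from; the LINEAGE dictionary and every asset name in it are hypothetical, and real platforms expose lineage through their own interfaces.

```python
# Hypothetical lineage export: each asset maps to the assets it is built from.
LINEAGE = {
    "dashboard.capital_ratio": ["gold.exposure", "gold.claims"],
    "gold.exposure": ["silver.policy"],
    "gold.claims": ["silver.claims"],
    "silver.policy": ["source.policy_admin"],
    "silver.claims": ["source.claims_system"],
}


def upstream_sources(asset, lineage):
    """Walk upstream from an asset and return the roots it depends on.

    A root is any asset with no recorded parents, i.e. a source system.
    """
    roots, seen, stack = set(), set(), [asset]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guards against revisiting shared or cyclic ancestors
        seen.add(node)
        parents = lineage.get(node, [])
        if not parents:
            roots.add(node)
        stack.extend(parents)
    return sorted(roots)


print(upstream_sources("dashboard.capital_ratio", LINEAGE))
```

If a headline number resolves to a source nobody expected, or to no source at all, that is a lineage gap worth fixing before the audit finds it.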

That is what to aim for. The platform you pick to build it on matters less than the discipline you bring to it. The teams that get this right end up with reporting that reconciles, AI projects that are not blocked on data quality, and an audit conversation that is much shorter than it used to be.

If you are scoping this for your own estate, our Data Engineering practice does exactly this work — domain-aware, lakehouse-native, lineage built in.