The complete guide to waterfall enrichment

No single provider wins on both quality and coverage. Waterfall enrichment chains them so you get broad coverage at high accuracy. See the benchmarks.

April 13, 2026 · 8 min read

The instinct when a record comes back empty is to go shopping for a better data provider. It is the wrong instinct. Run the same list of contacts through five email finders and you get five different winners: one returns near-perfect data but only on half your accounts, one reaches almost everyone but ships values you cannot trust, and the rest sit in between. No provider tops both quality and coverage, because the two trade off against each other by design. Waterfall enrichment is the move that ends the choice: you chain providers in a set order, take the first confident answer, and walk away with the coverage of all of them at the accuracy of your best one. This is how it works, why it beats any single vendor, and how to order one yourself.

What is waterfall enrichment?

Waterfall enrichment is a fallback chain, not a single lookup. You line several data providers up in a fixed order, send a record to the first one, and check whether it returns a confident result. If it does, you stop and keep that value. If it does not, the record falls through to the second provider, then the third, until something lands or the chain runs out. The name comes from the shape: records spill down the stack and each one exits at the first level that can answer it.

One rule governs everything: you stop at the first confident result. A waterfall is not five providers enriching the same field while you pick a winner afterward. It is sequential. The second provider only sees the records the first one missed, the third only sees what the first two missed, and the work shrinks at every level. Query in order, stop early. That single property is what makes the cost model and the coverage math break in your favor.
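
The stop-at-first-confident-result rule can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the `Result` type, the `run_waterfall` function, the three stand-in providers, and the 0.9 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    value: str
    confidence: float  # assumed 0.0-1.0 score; real providers report this differently


# Hypothetical provider signature: take a record, return a Result or None.
Provider = Callable[[dict], Optional[Result]]


def run_waterfall(record: dict, chain: list[tuple[str, Provider]], threshold: float = 0.9):
    """Query providers in order and stop at the first confident result."""
    for name, lookup in chain:
        result = lookup(record)
        if result is not None and result.confidence >= threshold:
            return name, result  # first confident answer wins; later providers never run
    return None, None  # chain ran out without a confident answer


# Stand-in providers with hard-coded answers (illustrative only).
def provider_a(record: dict) -> Optional[Result]:
    return None  # no data for this contact


def provider_b(record: dict) -> Optional[Result]:
    return Result("j.doe@example.com", 0.72)  # an answer, but below the bar


def provider_c(record: dict) -> Optional[Result]:
    return Result("jane.doe@example.com", 0.97)


chain = [("provider_a", provider_a), ("provider_b", provider_b), ("provider_c", provider_c)]
winner, result = run_waterfall({"name": "Jane Doe", "domain": "example.com"}, chain)
print(winner, result.value)  # provider_c jane.doe@example.com
```

The record falls past the empty first provider and the low-confidence second one, then exits at the third. Had the first provider answered confidently, the other two would never have been called.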

This works on any field with more than one supplier behind it: work emails, mobile phone numbers, firmographics like revenue and headcount, professional profile URLs. The data type changes. The chain logic does not.

Why no single provider wins, and what that costs you

Quality and coverage trade off by design, so every provider is best at one and worst at the other. A vendor that verifies each record against a primary source returns data you can dial on, but it only holds what it has verified, so coverage drops. A vendor that aggregates from everywhere reaches almost any account you send, and a chunk of those values are stale or wrong. A homepage advertising high accuracy and millions of contacts measures those two claims on two different samples.

The numbers below come from Clay's 2025 benchmark of work email providers, run against ground-truth verified data. Read the shape before the names.

Figure: Quality vs coverage across five work email providers (Icypeas, Enrow, Hunter, Findymail, Wiza), then stacked into a waterfall. Accuracy is on the vertical axis (90% to 100%), coverage is on the horizontal axis (0% to 100%), and dot size shows cost per lookup. The dots fall along a tradeoff line: no single provider wins on both axes, so chaining them cheapest-first lifts usable coverage toward the widest provider while accuracy stays near the best one. Source: Clay 2025 work email benchmark by region.

Hunter returns the most trustworthy emails in the set and reaches barely half your contacts. Findymail reaches nine in ten and gives up two points of quality to do it. Pick one and you pick your failure: a clean list that is half empty, or a full list you cannot trust. The waterfall refuses the choice. Usable coverage climbs to the widest provider's level because every record gets a shot at every source, and accuracy stays near the top because each record exits at the most accurate level that could answer it.

The cost model: you only pay the chain until one lands

A waterfall does not cost the sum of its providers. It costs the path each record walks. A record stops at the first confident result, and you are billed only for the lookup that returns it, not for the cheaper sources above that ran and came back empty. A contact found by the first provider never touches the second, so you never pay the second's credit on it. Only the hard records, the ones the cheap broad sources missed, ever reach the expensive specialist at the bottom.

So order is a budget decision, not just an accuracy one. Put a cheap, decent-coverage provider first and it clears the bulk of the list at the lowest price per record. Each level below inherits a smaller pile, so your most expensive provider runs on the fewest records. In Clay's work email chain, the two cheapest providers cost the same per lookup yet clear most of the list, and the dollar-a-lookup option at the bottom only fires on the handful nobody else could find. Flip the order, premium first, and you pay top dollar to enrich records a free inference could have handled.
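
To see why order moves the bill, here is a small sketch under the pay-on-hit billing described above. The provider names, the prices in cents, and the five-record list are hypothetical; each record is modeled as the set of providers that could answer it.

```python
def chain_cost(records, chain, price_cents):
    """Return (records_found, total_cents) for one provider ordering.

    Assumes pay-on-hit billing: a record is charged only the price of the
    first provider in the chain that can answer it.
    """
    found = 0
    total = 0
    for can_answer in records:
        for provider in chain:
            if provider in can_answer:
                found += 1
                total += price_cents[provider]
                break  # stop at the first hit; misses above it were free
    return found, total


# Hypothetical prices (in cents) and a hypothetical five-record list.
price_cents = {"cheap": 20, "mid": 40, "premium": 100}
records = [
    {"cheap", "mid", "premium"},  # easy record: anyone can find it
    {"cheap", "premium"},
    {"mid", "premium"},
    {"premium"},                  # hard record: only the specialist has it
    set(),                        # nobody has it
]

print(chain_cost(records, ["cheap", "mid", "premium"], price_cents))  # (4, 180)
print(chain_cost(records, ["premium", "mid", "cheap"], price_cents))  # (4, 400)
```

Both orderings find the same four records. Cheapest-first spends 180 cents because only the one hard record reaches the premium provider, while premium-first spends 400 cents because the premium provider answers everything it can before the cheaper sources get a chance.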

Clay's work email waterfall makes this concrete: it cascades across more than 100 email providers in sequence and charges credits only for the one that returns the match. An Infer Email step runs first for free, constructing a likely address from a name and domain before any paid provider is touched. For the full field-by-field build, follow the step-by-step guide to finding work email addresses; this guide stays on the chain logic underneath it.

Figure: The waterfall runs top-down until a provider lands, and bills only for the one that does. In the example run, the free inference and the first two providers come back empty and cost nothing. Provider C lands the verified match for one credit, and the run stops there. Clay bills only for the lookup that returns data.

How to order a waterfall

Order is the only real decision in a waterfall: cheapest confident answer first, broadest source last. You are not ranking providers by overall quality. You are sequencing them so the lowest-cost source that returns trustworthy data gets first crack, and each fallback exists only to catch what the level above it could not.

Reorder the same five providers over a batch of 100 records and watch the bill move while the final coverage holds.

Figure: Reorder the chain over 100 records: coverage holds, the bill moves (cheapest-first order shown)

1. Icypeas · $0.20/lookup · ran on 100 of 100
2. Enrow · $0.20/lookup · ran on 48 of 100
3. Hunter · $0.40/lookup · ran on 14 of 100
4. Findymail · $0.50/lookup · ran on 7 of 100
5. Wiza · $1.00/lookup · ran on 1 of 100

Total cost: $39 per 100 records
Final usable coverage: 94%

Final coverage is determined by which providers are in the chain, not by their order, because every unresolved record still falls through to every level. In this example it holds near 94 percent in any arrangement. Only the bill moves: cheapest-first runs the dollar-a-lookup specialist on the few records nobody cheaper could match, while premium-first runs it on all 100 up front. The order of a waterfall is a budget decision, not a coverage one.

Three things set the order. Price per lookup comes first, because the early levels run on every record and the late levels run on almost none, so the cheap providers belong on top where volume is highest. Quality at the confidence threshold comes second: a provider earns an early slot only if its data clears your bar, since a cheap source that ships garbage just passes bad records downstream. Coverage comes last and goes at the bottom, where a high-coverage specialist mops up the long tail nothing cheaper could find.
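
The three criteria can be turned into a simple sorting rule. The sketch below is one possible reading, not a prescribed method: the `ProviderStats` fields, the `order_chain` helper, the tie-break on coverage, and every number in `candidates` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderStats:
    name: str
    price_cents: int  # price per lookup
    quality: float    # share of returned values that are correct, measured on your sample
    coverage: float   # share of your sample the provider returns a value for


def order_chain(providers, quality_bar):
    """Drop providers below the quality bar, then sort cheapest first.

    Ties on price are broken by putting the narrower source first, so the
    broadest source sits lowest (an assumption of this sketch).
    """
    eligible = [p for p in providers if p.quality >= quality_bar]
    return sorted(eligible, key=lambda p: (p.price_cents, p.coverage))


# Made-up numbers for illustration; measure your own before ordering a real chain.
candidates = [
    ProviderStats("specialist", 100, 0.96, 0.90),
    ProviderStats("cheap_a", 20, 0.95, 0.60),
    ProviderStats("cheap_b", 20, 0.97, 0.50),
    ProviderStats("junk", 10, 0.70, 0.95),  # cheapest, but fails the quality bar
]

print([p.name for p in order_chain(candidates, quality_bar=0.90)])
# ['cheap_b', 'cheap_a', 'specialist']
```

The cheapest source is excluded outright because it misses the quality bar, which matches the point above: a low price only earns an early slot when the data is trustworthy.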

One setting governs all of it: the confidence threshold that decides what counts as found. Set it strict and more records fall through to the next level, raising both cost and final quality while coverage dips a little. Set it loose and the chain stops early on shakier data. For cold outreach, set it strict: a wrong email is worse than a missing one.
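
A toy run shows the three-way trade. Everything here is hypothetical: the confidence scores, the three threshold values, and the `run_at_threshold` helper. Lookups run is used as a rough stand-in for cost, since records that fall further reach the pricier levels of a cheapest-first chain.

```python
def run_at_threshold(records, threshold):
    """Return (found, lookups_run, lowest_accepted_confidence) for one threshold.

    Each record is a list of confidence scores, one per provider in chain
    order, with None meaning the provider returned nothing.
    """
    found = 0
    lookups_run = 0
    accepted = []
    for scores in records:
        for score in scores:
            lookups_run += 1
            if score is not None and score >= threshold:
                found += 1
                accepted.append(score)
                break  # confident enough: stop here
    lowest = min(accepted) if accepted else None
    return found, lookups_run, lowest


# Hypothetical scores for four records across a three-provider chain.
records = [
    [0.93, 0.90, 0.99],
    [0.82, 0.95, 0.99],
    [0.85, 0.80, 0.93],
    [None, 0.86, 0.88],
]

for label, threshold in [("loose", 0.80), ("balanced", 0.90), ("strict", 0.95)]:
    print(label, run_at_threshold(records, threshold))
# loose (4, 5, 0.82)
# balanced (3, 9, 0.93)
# strict (2, 11, 0.95)
```

As the threshold tightens, records found drop from 4 to 2, lookups run rise from 5 to 11, and the weakest accepted value improves from 0.82 to 0.95.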

The threshold is worth feeling, because one setting moves coverage, quality, and cost at the same time.

Figure: The confidence threshold trades off coverage, quality, and cost. The dial has three settings: Loose, Balanced, and Strict. At Balanced, the setting suited to most pipelines, the example shows 94% usable coverage, 96% accepted quality, and 207 credits per 100 records. A stricter cutoff rejects borderline matches, so records fall further down the chain: quality climbs, coverage dips a little, and cost rises. That is the right trade for cold outreach, where a wrong address is worse than a missing one.

Order is the whole game, which is why the deeper question of which providers belong in your chain at all lives in the guide to choosing a data enrichment provider, where you run them against your own list before committing.

Where waterfall enrichment applies

A waterfall is a pattern, not a feature, so it runs anywhere more than one provider sells the same field. Work emails are the most common chain because the verification step sits right beside it and providers are plentiful. Mobile phone numbers are the hardest case, where quality tops out lower and coverage rarely clears 90 percent on any one source, which is exactly where stacking pays off most. Firmographics like revenue and headcount run the same way, with one wrinkle: the most accurate firmographic providers cover well under half your accounts, so the chain does real work.

A waterfall is the practical face of data enrichment itself. Enrichment turns a thin record into a complete one. A waterfall is how you do that reliably when no single supplier holds everything. Treat enrichment as always-on infrastructure: run a waterfall on every field that matters, refresh it on a schedule, and stop thinking about providers as a one-time purchase.

Clay has become our primary source of enrichment. We built an automated flow that identifies signals, enriches data, and pushes leads to our sales team only when it's most relevant. Instead of flooding HubSpot with data, we only push leads once they've shown strong signals. This helps our sales team stay focused and reduces noise.

How to build your first waterfall

Start with one field and one list. Pull a sample of 500 to 1,000 records that look like the accounts you actually sell to, messy ones included. Fill work email first: it is the easiest to test and the most useful to get right. Add providers in cheapest-confident-first order, set the confidence threshold strict, and run the sample.

Watch the diminishing-returns curve. Coverage climbs fast on the first two levels, then flattens, while spend keeps ticking up. The bottom of your chain is where you confirm a provider earns its slot rather than billing you for records the cheaper levels already caught. The single provider running inbound enrichment at OpenAI left gaps the team lived with. Moving to a chained motion changed the number, not the effort.

40% → 80%

OpenAI roughly doubled its inbound enrichment coverage after moving from a single provider to Clay's waterfall.
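
One way to read the diminishing-returns curve from a sample run is to count how many records each level newly resolves. The sketch below assumes you have logged which provider answered each record; the `coverage_by_level` helper, the provider names, and the 10-record sample are hypothetical.

```python
from collections import Counter


def coverage_by_level(exit_levels, chain):
    """Return a list of (provider, new_hits, cumulative_hits) in chain order.

    exit_levels holds, for each record, the name of the provider that
    answered it, or None if the chain ran out.
    """
    hits = Counter(level for level in exit_levels if level is not None)
    rows = []
    cumulative = 0
    for provider in chain:
        cumulative += hits[provider]
        rows.append((provider, hits[provider], cumulative))
    return rows


# Hypothetical results from a 10-record sample run.
chain = ["first", "second", "third", "fourth"]
exit_levels = ["first"] * 5 + ["second"] * 3 + ["third"] + [None]

for provider, new_hits, cumulative in coverage_by_level(exit_levels, chain):
    print(f"{provider}: +{new_hits} records, {cumulative}/{len(exit_levels)} covered")
# first: +5 records, 5/10 covered
# second: +3 records, 8/10 covered
# third: +1 records, 9/10 covered
# fourth: +0 records, 9/10 covered
```

A level that adds nothing on a representative sample, like the fourth one here, is a candidate to drop or swap for a source with different coverage.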

Once the email chain works, the pattern copies. Clone it onto phone numbers, then firmographics, and set each to refresh on a schedule. Records stay complete as they enter your system instead of getting cleaned up before a campaign. That is the whole motion: one field, ordered cheapest-first, then repeated across every field that decides whether a rep knows who they are calling.

Build a waterfall that fills every record

Chain 100+ providers in Clay, pay only for the lookup that lands, and watch coverage climb past any single vendor.

Frequently asked questions

What is waterfall enrichment?

Waterfall enrichment queries several data providers in a fixed order and keeps the first confident result for each record. A record falls through to the next provider only when the one before it returns nothing usable, so the chain stops the moment a match is found. You reach the coverage of every provider in the stack while keeping accuracy near your single best source.

How does waterfall enrichment differ from using one provider?

A single provider gives you one fixed tradeoff: high quality on a fraction of your list, or broad coverage with values you cannot trust. A waterfall removes the tradeoff by chaining providers, so every record gets a shot at every source and exits at the most accurate level that can answer it. In Clay's 2025 work email benchmark, no single provider cleared both 95 percent quality and 90 percent coverage. A cheapest-first chain reaches both at once.

Does a waterfall mean paying for every provider on every record?

No. Each record stops at the first confident result, and you are billed only for the lookup that returns the match, not for the misses before it. A contact found by the first provider never touches the second, so you never pay the second's credit on it, and your most expensive provider fires only on the handful of records nobody cheaper could find. Total cost is the path each record walks, not the sum of the stack.

How should I order the providers in a waterfall?

Put the cheapest provider that returns trustworthy data first, since the early levels run on every record and the late levels run on almost none. Each fallback should add coverage the levels above it lack, and your highest-coverage specialist goes last to catch the long tail. Set the confidence threshold strict for cold outreach: a wrong value is worse than a missing one.

What data can you enrich with a waterfall?

Any field with more than one supplier behind it: work emails, mobile phone numbers, professional profile URLs, and firmographics like revenue and headcount. Phone and firmographic data benefit most, where quality and coverage are lowest and no single source comes close to filling a list. The chain logic stays identical across fields. Only the providers in the stack change.