
Clay GTM guide

How to Build a Targeted Prospect List from Scratch

Most prospect lists do not fail at sourcing. They fail at definition. The work that decides whether a list converts happens before you source anything. Here is how to build a targeted list from scratch, from the ICP through to a scored, rep-ready set of accounts.

May 19, 2026 · 10 min read

Most prospect lists do not fail at sourcing. They fail at definition. By the time a list is converting badly, the instinct is to blame the data vendor, the email tool, or the rep, but the damage was done before a single company was pulled: the list was built to match a vague idea of "companies like our customers" instead of a precise definition of who actually buys.

The work that decides whether a prospect list converts happens before you source anything, when you write down the firmographics and the signals that actually predict a purchase. Get that right and sourcing, enrichment, and scoring all inherit the precision. Get it wrong and you are building a faster version of the wrong list. This is how to build a targeted list from scratch, from the ICP definition through to a scored, rep-ready set of accounts.

Step 1: Define your ICP from the signals that predict a buy

A prospect list is a hypothesis about who will buy, and most teams write that hypothesis backwards. They start with demographics ("100 to 500 employees, SaaS, North America") and never ask why those numbers matter. The sharper starting point is the pain: who feels the problem you solve daily, who has budget to fix it, who has authority to buy, and who is actively looking right now. Demographics are the proxy for that pain, not a substitute for it.

The most valuable part of an ICP is the part that is not firmographic at all. Two companies can be identical on size, industry, and geography while one is a perfect fit and the other will never buy, and the difference is almost always a signal: they just raised, they are hiring for a role your product supports, they already run a tool yours plugs into. A precise ICP names those signals explicitly, because they are what separate a list that looks right from one that converts.

Adding precision feels like it should cost you reach. In practice it does the opposite of what people fear: the list gets smaller, but the share of it worth reaching goes up.

[Interactive figure: Add ICP criteria and watch scope shrink as fit rises. Shown state: whole addressable market, 140,000 companies in scope, 10% predicted fit. Illustrative counts; actual numbers depend on your market and definition.]

Each criterion you add shrinks the universe but raises the odds that every remaining account actually buys. Precision is not the enemy of reach; it is the point of it.

The fastest way to find your real signals is to study your closed-won accounts: pull the deals that closed fast, stayed, and expanded, and ask what they had in common the quarter before they bought. The patterns that repeat are your ICP. Write the definition down as a list of filters you can source against later, split into firmographic (size, industry, geography), technographic (the tools they run), and signal-based (funding, hiring, expansion). Treat it as a hypothesis you will refine, not a fixed answer.
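As a rough illustration of what "a list of filters you can source against" could look like once written down, here is a minimal Python sketch. The `ICP` class, its field names, and the example values are all hypothetical. They are not a Clay schema, just one way to keep the firmographic, technographic, and signal-based criteria separate.

```python
from dataclasses import dataclass, field


@dataclass
class ICP:
    """A written-down ICP hypothesis, split into three filter groups."""
    # Firmographic: size, industry, geography
    min_employees: int
    max_employees: int
    industries: set[str]
    countries: set[str]
    # Technographic: tools the account already runs
    required_tools: set[str] = field(default_factory=set)
    # Signal-based: triggers such as funding, hiring, expansion
    signals: set[str] = field(default_factory=set)


def firmographic_fit(icp: ICP, company: dict) -> bool:
    """True when the company passes every firmographic filter."""
    return (
        icp.min_employees <= company["employees"] <= icp.max_employees
        and company["industry"] in icp.industries
        and company["country"] in icp.countries
    )


def technographic_fit(icp: ICP, company: dict) -> bool:
    """True when the company runs every tool the ICP requires."""
    return icp.required_tools <= set(company.get("tools", []))


def matched_signals(icp: ICP, company: dict) -> set[str]:
    """ICP signals that this company currently shows."""
    return icp.signals & set(company.get("signals", []))


# Illustrative values only; replace with patterns from your closed-won deals.
icp = ICP(
    min_employees=100,
    max_employees=500,
    industries={"SaaS"},
    countries={"US", "CA"},
    required_tools={"Salesforce"},
    signals={"recent_funding", "hiring_sales"},
)

company = {
    "name": "Example Co",
    "employees": 220,
    "industry": "SaaS",
    "country": "US",
    "tools": ["Salesforce", "Slack"],
    "signals": ["hiring_sales"],
}

print(firmographic_fit(icp, company), technographic_fit(icp, company))
print(matched_signals(icp, company))
```

Keeping the three groups as separate checks makes it easier to revise one part of the hypothesis, for example the signals, without touching the rest.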

Step 2: Source the company universe that matches your ICP

With a definition in hand, sourcing is a filtering problem, not a collection problem. The goal is to start from the universe of companies in your market and cut it down with the firmographic criteria you just wrote, so what comes back is the slice you can actually sell to rather than a database dump.

In Clay, this starts with Find Companies, a queryable dataset built from over 100 providers and cleaned into one usable source. Create a workbook, add a Find Companies table, and apply your firmographic filters: industry and category, headcount range, geography down to country or city, founding date, keywords in the company description. Preview the result before importing so you can gut-check the list against your ICP, then import. Set the search to re-run on a schedule so new companies that cross into your definition flow in automatically instead of the list going stale the day you built it.

Your firmographic filters are the wide cut. The signals from Step 1 are where the precision lives, and you can source against several at once.

[Interactive figure: Stack signal sources to find the high-intent core of your market. Base set: firmographic fit (industry, size, geo), 6,200 in scope. Switching on a signal source re-scores the base set. Counts are illustrative; the overlap is the multi-signal core to prioritize.]

Sourcing is not one search; it is a firmographic base re-scored by stacking independent signal sources, and the accounts sitting in the overlap of several signals are the core of the list.

Tech stack is one of the strongest of these signals, and worth treating as its own discipline; the full method lives in the companion guide on building a prospect list by tech stack. For segments traditional databases miss, like local businesses, family-owned shops, or niche verticals, Clay's SMB discovery (OpenMart) and third-party scrapers like Apify reach lists that ZoomInfo and Apollo never cover. The sourcing method changes by segment; the principle does not. Start from the market, cut by fit, sort by signal.
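The "cut by fit, sort by signal" idea can be sketched in a few lines of Python. This is an illustration, not how Clay implements it: the function name, the `domain` key, and the signal source names are assumptions, and each signal source is modeled simply as a set of company domains it flagged.

```python
def rank_by_signal_overlap(
    base: list[dict], signal_sources: dict[str, set[str]]
) -> list[dict]:
    """Annotate each firmographic-fit company with the signals it shows,
    then sort so the accounts in the overlap of several signals come first."""
    ranked = []
    for company in base:
        hits = sorted(
            name
            for name, domains in signal_sources.items()
            if company["domain"] in domains
        )
        ranked.append({**company, "signals": hits, "signal_count": len(hits)})
    # sort() is stable, so ties keep their original order
    ranked.sort(key=lambda c: c["signal_count"], reverse=True)
    return ranked


# Hypothetical base set that already passed the firmographic filters
base = [
    {"domain": "a.example"},
    {"domain": "b.example"},
    {"domain": "c.example"},
]

# Hypothetical signal sources, each a set of domains it flagged
signal_sources = {
    "funding": {"a.example", "b.example"},
    "hiring": {"b.example"},
    "tech_stack": {"b.example", "c.example"},
}

ranked = rank_by_signal_overlap(base, signal_sources)
for row in ranked:
    print(row["domain"], row["signal_count"], row["signals"])
```

Here `b.example` lands at the top because it sits in the overlap of all three sources, which is the multi-signal core the section describes.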

Step 3: Find the right decision-makers at each account

A company list is not a prospect list. It is a list of front doors with nobody named behind them, and a list of companies you cannot email is a research project, not a pipeline. Turning accounts into prospects means finding the specific people inside each one whose problem your product solves and who can act on it.

In Clay, Find People layers directly on top of the company list you just built. Point it at your companies table, then filter by job title, seniority, department, and location to pull the personas that matter for your sale. The clearer your persona definition, the tighter the result, so name the roles before you search: a technical evaluator, an economic buyer, a champion who feels the pain daily. One signal often means three different conversations, and the same account should surface more than one name when more than one stakeholder owns part of the decision.

The mistake here is finding one contact per account and calling it done. The number of people you pull per account should scale with how the buying decision actually gets made.

[Interactive figure: Drag deal complexity to see how many contacts an account really needs. The slider runs from "Transactional, one owner" to "Committee, multi-stakeholder". Shown state: a Transactional deal needs 1 contact per target account, the owner / decision-maker, who holds budget and authority alone. Angle: direct value and price.]

How many contacts you pull per account is set by how the decision gets made. Single-threading a committee deal is how a good account quietly dies.
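One way to make that rule checkable is to write down which personas each deal type calls for and flag accounts that are still single-threaded. The sketch below is an assumption-laden illustration: the role names come from this guide, but the mapping and the `missing_personas` helper are hypothetical and should be adapted to how your own deals get decided.

```python
# Hypothetical persona map; the role names come from the guide, the
# mapping itself is an assumption to adapt to your own sales motion.
PERSONAS_BY_COMPLEXITY = {
    "transactional": ["owner / decision-maker"],
    "committee": ["economic buyer", "technical evaluator", "champion"],
}


def missing_personas(complexity: str, found_roles: list[str]) -> list[str]:
    """Personas the deal type calls for that have no contact yet."""
    needed = PERSONAS_BY_COMPLEXITY[complexity]
    return [role for role in needed if role not in found_roles]


# A committee deal where only the champion has been found so far
gaps = missing_personas("committee", ["champion"])
print(gaps)
```

An empty result means the account is covered for its deal type; anything else is the list of roles still to find before outreach starts.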

Step 4: Enrich every record into something a rep can act on

A name and a title are not enough to sell on. Enrichment is the step that turns a thin row into a record a rep can open and act on without doing their own research first, by filling the contact and company gaps the source data left behind. Without at least a verified work email or a social profile, a contact is functionally worthless; with the right context layered on, it is a conversation.

Clay does this with waterfall enrichment. A waterfall queries data providers in order, starting with the cheapest, and moves to the next one only when the current provider comes up empty, so you get the coverage of every vendor at the cost of the first one that succeeds. Run a company waterfall for firmographics, funding, headcount, and tech stack, then a people waterfall to recover work emails and direct phone numbers from professional profiles and public web data across more than 150 sources. Stacking providers this way pushes coverage from the 40 to 60 percent a single vendor returns up toward 80 to 95 percent.
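The waterfall logic described above can be sketched generically in Python. This is not Clay's implementation: the provider names, costs, and lookup functions are stand-ins, and the sketch assumes you are charged only for the lookup that returns a result.

```python
from typing import Callable, Optional

# (name, cost per successful lookup, lookup function); all hypothetical
Provider = tuple[str, float, Callable[[dict], Optional[str]]]


def waterfall(record: dict, providers: list[Provider]) -> dict:
    """Query providers from cheapest to most expensive and stop at the
    first one that returns a value."""
    for name, cost, lookup in sorted(providers, key=lambda p: p[1]):
        value = lookup(record)
        if value:
            return {"value": value, "provider": name, "cost": cost}
    # Every provider came up empty: the gap remains
    return {"value": None, "provider": None, "cost": 0.0}


calls: list[str] = []  # records which stub providers were actually queried


def cheap(record: dict) -> Optional[str]:
    calls.append("cheap")
    return None  # simulates a provider with no data for this contact


def mid(record: dict) -> Optional[str]:
    calls.append("mid")
    return "ada@example.com"


def pricey(record: dict) -> Optional[str]:
    calls.append("pricey")
    return "ada@example.com"


providers = [("pricey", 0.10, pricey), ("cheap", 0.01, cheap), ("mid", 0.03, mid)]
result = waterfall({"name": "Ada", "domain": "example.com"}, providers)
print(result, calls)
```

In this run the most expensive provider is never queried, because the second-cheapest one already returned an email.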

Mistral AI sourced 25,000+ accounts in just two weeks to form a starter TAM in Clay, a process that normally takes two months.

When every provider runs and you still have a gap, that is the last-mile data problem: facts too specific for any database to carry, like whether a company just shipped an enterprise tier or whether a founder spoke at a recent panel. Clay's AI research agent, Claygent, browses the public web to fill exactly those gaps. Point it at a company and ask it the qualifying question no vendor can answer.

Claygent: ICP-fit and last-mile research
Research {{company_name}} ({{company_domain}}) and answer for this ICP:
{{one-line ICP, e.g. "B2B SaaS, 50-1,000 employees, runs an outbound
sales motion, expanding go-to-market"}}.

Check the company website, careers page, pricing page, and recent
public news. Return:
- fit: "Strong" / "Partial" / "No"
- reason: one line citing the specific evidence you used
- buying_signal: any active trigger you found (funding, hiring for
  relevant roles, a new product tier, a tooling change), or "none"

If you cannot confirm fit from public sources, return "Unconfirmed"
rather than guessing.

This is the difference between a list of rows and a list a rep trusts. Enrich for the fields your scoring model and your outreach will actually use, not every field a provider offers, so you are not paying to enrich a market.
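If the prompt's answer feeds a scoring model, it helps to normalize it before use. The sketch below assumes the agent's response has already been parsed into a Python dict with the three keys the prompt asks for. That shape is an assumption for illustration, not Claygent's documented output format, and `normalize_verdict` is a hypothetical helper.

```python
# Verdicts the prompt allows, including the "Unconfirmed" fallback
ALLOWED_FIT = {"Strong", "Partial", "No", "Unconfirmed"}


def normalize_verdict(raw: dict) -> dict:
    """Coerce a parsed response into predictable fields for scoring."""
    fit = str(raw.get("fit", "")).strip()
    if fit not in ALLOWED_FIT:
        # Treat anything unexpected as unconfirmed rather than guessing
        fit = "Unconfirmed"
    signal = str(raw.get("buying_signal", "")).strip()
    return {
        "fit": fit,
        "reason": str(raw.get("reason", "")).strip(),
        # Map the literal "none" (or an empty value) to None
        "buying_signal": None if signal.lower() in {"", "none"} else signal,
    }


verdict = normalize_verdict(
    {"fit": "Strong ", "reason": "Pricing page lists an enterprise tier", "buying_signal": "none"}
)
print(verdict)
```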

Step 5: Verify the list before you load it

Verification is the cheapest insurance in the whole build, and the step teams skip when they are in a hurry. B2B contact data decays fast: people change jobs, companies rebrand, and email addresses go dead, so a list that was clean a quarter ago is partly fiction today. Loading an unverified list into a sending tool is how a sender reputation gets wrecked in a week, because every hard bounce tells the mailbox providers you do not know who you are emailing.

Run every email through validation before it touches a sequencer, and let the bad addresses fall out before they cost you anything. In Clay, this is a verification step in the same table: validate the email, flag catch-all and risky domains, and route only the deliverable addresses forward. The ten minutes this takes saves the weeks of domain recovery a bounce spike causes. Verification is not a one-time gate either; because the data decays, schedule a re-verification pass on the live list so dead contacts drop out on their own rather than sitting in your sequences quietly failing.
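The routing rule in that paragraph can be expressed as a small function. The sketch assumes each contact already carries an `email_status` field set by whatever validation step you run; the field name and the status values (`valid`, `catch_all`, `risky`) are hypothetical labels, not a specific vendor's vocabulary.

```python
def route_by_verification(contacts: list[dict]) -> dict[str, list[dict]]:
    """Split contacts so only deliverable addresses move forward."""
    routed: dict[str, list[dict]] = {"send": [], "flagged": [], "dropped": []}
    for contact in contacts:
        status = contact.get("email_status")
        if status == "valid":
            routed["send"].append(contact)
        elif status in {"catch_all", "risky"}:
            # Not safe to sequence automatically; hold for review
            routed["flagged"].append(contact)
        else:
            # Invalid, missing, or unrecognized status never reaches the sequencer
            routed["dropped"].append(contact)
    return routed


contacts = [
    {"email": "a@example.com", "email_status": "valid"},
    {"email": "b@example.com", "email_status": "catch_all"},
    {"email": "c@example.com", "email_status": "invalid"},
    {"email": "d@example.com"},  # never verified
]

routed = route_by_verification(contacts)
print({bucket: len(rows) for bucket, rows in routed.items()})
```

Running the same function on a schedule against the live list is one simple way to implement the re-verification pass: contacts whose status has changed fall into a different bucket on the next run.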

Step 6: Score and prioritize so reps work the best accounts first

A finished list still has a ranking problem. Every account on it cleared your ICP filters, which means they are all "qualified," but they are not all equally worth a rep's next hour. Scoring is what turns a flat list into a queue, so reps spend their time top-down on the accounts most likely to close rather than working alphabetically through a directory.

The model that works is the simplest one: priority is fit multiplied by intent. Fit is how closely an account matches your ICP. Intent is whether there is an active buying signal right now. The two axes are not interchangeable, and treating them as one number is how teams mis-route their best accounts.

[Interactive figure: Set fit and intent to see how an account should be routed. Inputs: ICP fit and buying intent. Output: a routing decision. Shown state: "Deprioritize: off the list".]

Account priority is fit times intent, not a single score: a perfect-fit account with no signal is a nurture, and a high-intent account that does not fit is a distraction.
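A short Python sketch shows why the two axes are worth keeping separate even when priority is computed as their product. The 0-to-1 scales, the 0.5 threshold, and the bucket labels are assumptions chosen for illustration.

```python
def priority(fit: float, intent: float) -> float:
    """Priority as fit multiplied by intent, both on a 0-to-1 scale."""
    return fit * intent


def route(fit: float, intent: float, threshold: float = 0.5) -> str:
    """Route on the two axes separately rather than on the product."""
    high_fit = fit >= threshold
    high_intent = intent >= threshold
    if high_fit and high_intent:
        return "prioritize"
    if high_fit:
        return "nurture"        # good fit, no active signal yet
    if high_intent:
        return "distraction"    # active signal, but not your buyer
    return "deprioritize"


# Two accounts with the same product but opposite profiles
account_a = (0.9, 0.4)  # (fit, intent)
account_b = (0.4, 0.9)

print(priority(*account_a) == priority(*account_b))
print(route(*account_a), route(*account_b))
```

Both accounts get the same priority number, yet one belongs in a nurture track and the other off the working list. Sorting by the product makes sense within a bucket; the bucket itself comes from looking at fit and intent separately.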

In Clay, the score is built from the same enriched fields you already sourced and verified on, combined with live signals, using a formula or an AI column, so it updates as accounts change rather than freezing the day you built the list. Map the score and its reasoning into your CRM as the field reps sort by, and the build pays off: the rep opens a record that already knows who the account is, why it ranked, and what to say.

"Every lead is pre-qualified, scored on unique signals, and routed automatically through Clay. We're now generating pipeline from segments we weren't even touching before."

Common failure modes

A targeted list fails in a handful of predictable ways, and every one of them traces back to a step that got skipped or rushed. None of them are data problems; they are process problems, which means they are all preventable before you hit send.

[Figure: Five ways a targeted list fails, and the step that prevents each]

The throughline across all five: quality is decided upstream. Define precisely, source by fit, multi-thread the accounts that warrant it, verify before you load, and score before you route.


Frequently asked questions

What is a prospect list?

A prospect list is a structured set of companies and the specific people inside them who match your ideal customer profile and are worth a sales rep's time. A good one is not a raw export of every contact in a database; it is a filtered, enriched, and ranked set built from a precise definition of who buys. The difference between a prospect list and a lead list is mostly intent: a prospect list holds accounts you have deliberately targeted as a fit before anyone has raised a hand, while a lead list holds people who have already shown interest.

Should you buy or build a prospect list?

Build it. A bought list optimizes for size, the one attribute that does not predict revenue, so most of it is wrong on something that matters: the industry does not fit, the contact is not a decision-maker, there is no active need, or the email already bounces. Building a list from your own ICP definition costs more upfront effort and far less wasted outreach, because every account on it was chosen for a reason. With Clay you build from over 100 company-data providers and 150-plus enrichment sources, so building is now faster than vetting a purchased list.

How many contacts should a prospect list have?

Fewer than you think, and the right number depends on fit, not a target count. A list of a few hundred accounts that match your ICP and show a buying signal produces more qualified conversations than ten thousand cold names, at a fraction of the wasted effort. Per account, the contact count should scale with deal complexity: one name for a transactional sale, three or four stakeholders for a committee decision. Optimize for the share of the list that converts, not its length.

How often should you update a prospect list?

Continuously, not quarterly. B2B contact data decays roughly 25 to 30 percent a year as people change jobs and companies rebrand, so a list built once is partly dead within months. The better model is to stop treating the list as a one-time pull: set your sourcing search to re-run on a schedule so new ICP matches flow in, re-verify emails so dead contacts drop out, and let your scoring update as signals change. In Clay this runs automatically, so the list stays current without a manual rebuild.
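To get a feel for what "partly dead within months" means, the annual decay figure can be converted into a share of contacts still valid after a given number of months. This back-of-the-envelope sketch assumes decay compounds at a constant rate through the year, which is a simplification; real decay depends on your segment.

```python
def share_still_valid(annual_decay: float, months: float) -> float:
    """Share of contacts still valid after `months`, assuming the annual
    decay rate compounds at a constant pace (a simplifying assumption)."""
    return (1 - annual_decay) ** (months / 12)


for annual_decay in (0.25, 0.30):
    for months in (3, 6, 12):
        valid = share_still_valid(annual_decay, months)
        print(f"{annual_decay:.0%} annual decay, {months:>2} months: {valid:.1%} still valid")
```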

How do you use AI to build a prospect list faster?

AI does the two jobs that used to be manual research: filling last-mile data gaps and qualifying fit at scale. Clay's research agent, Claygent, browses the public web to answer the questions no database carries, like whether a company just launched an enterprise tier, then returns a structured fit verdict and any buying signal it found. Feeding those AI-derived fields into a scoring column lets you rank thousands of accounts on criteria a traditional filter cannot capture, so the list is not just bigger, it is better qualified before a rep ever sees it.