Most lead scores are point systems built on guesses that nobody ever checks. Someone in a planning meeting decides a demo request is worth +10, a webinar signup +5, an enterprise title +15, and those numbers go live without a single comparison against a deal that actually closed. A score worth routing on is fit times intent, weighted by what the data says predicts a purchase, and validated against real outcomes. Fit without intent ranks a perfect-profile account that will never buy. Intent without fit ranks a curious student downloading your whitepaper. The weights are not opinions; they are the variables that separated the accounts you won from the ones you lost.
This guide covers what lead scoring is, the two axes it runs on, how the major model types trade off against each other, what makes a score trustworthy, how a score drives routing, and the failure modes that quietly break most models.
What is lead scoring?
Lead scoring is the practice of assigning every lead a number that estimates how likely it is to become a customer, so the highest-value leads surface first instead of getting buried in arrival order. The number replaces the rep's gut read with a ranking the whole team can act on the same way.
A score is only as good as the inputs feeding it. A model that reads job title and company size produces a tidy number that has nothing to do with whether the account is in a buying cycle. A model that reads behavior with no notion of fit ranks a freelancer who clicked three emails above a target account that visited your pricing page once. The number looks authoritative either way, which is the trap: a lead score earns trust through what it predicts, not through how precise it looks. The rest of this guide is about closing the gap between a score that looks right and a score that ranks the right leads first.
Fit and intent: the two axes every score runs on
Every lead score is built from two independent questions. Fit asks whether this account looks like the customers you already won; intent asks whether it is showing signs of buying right now. A score that collapses both into one number loses the distinction that decides what a rep should do next.
Fit and intent are independent axes, and the right next action depends on which quadrant a lead lands in, not on a single blended number.
The quadrant a lead occupies decides the action, and a single blended score erases it. Scoring fit and intent separately, then multiplying them, keeps the two readings visible: a high-fit, high-intent lead scores high on both and rises to the top, while a strong showing on one axis cannot paper over a zero on the other. Fit is the question your ideal customer profile answers, covered in the companion complete guide to ICP; intent is the question intent data answers.
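As a minimal sketch of why multiplying behaves differently from adding, assume both readings are normalized to a 0 to 1 range. The `Lead` class, the 0.5 threshold, and the quadrant labels are illustrative assumptions, not part of any Clay feature.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    name: str
    fit: float     # 0.0-1.0, resemblance to past wins (assumed scale)
    intent: float  # 0.0-1.0, strength of current buying signals (assumed scale)


def combined_score(lead: Lead) -> float:
    """Multiply the axes so a zero on either one zeroes the result."""
    return lead.fit * lead.intent


def quadrant(lead: Lead, threshold: float = 0.5) -> str:
    """Report both readings so the blended number never hides them."""
    fit_label = "high-fit" if lead.fit >= threshold else "low-fit"
    intent_label = "high-intent" if lead.intent >= threshold else "low-intent"
    return f"{fit_label}/{intent_label}"


# Hypothetical leads mirroring the examples in the text.
leads = [
    Lead("target account, pricing visits", fit=0.9, intent=0.8),
    Lead("perfect profile, no activity", fit=1.0, intent=0.0),
    Lead("student downloading whitepaper", fit=0.0, intent=0.9),
]

ranked = sorted(leads, key=combined_score, reverse=True)
for lead in ranked:
    print(f"{combined_score(lead):.2f}  {quadrant(lead)}  {lead.name}")
```

An additive blend would hand the second and third leads 1.0 and 0.9 points, enough to pass for mid-tier leads. The product gives both a zero while the quadrant label still tells the rep which axis is missing.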
Manual, rule-based, and AI scoring: how the model types compare
Three model types dominate lead scoring, and they trade off in predictable ways across accuracy, effort to maintain, and how well they hold up as your inputs change.
The three lead scoring model types
| Model type | How it scores | Strength | Where it breaks |
|---|---|---|---|
| Manual point system | A human assigns fixed points per attribute (demo +10, enterprise title +15) | Fast to set up, easy to explain | Weights are guessed; nobody updates them as the business changes |
| Rule-based / deterministic | If-then logic on enriched fields (if headcount > 500 and uses Salesforce, score 9) | Transparent, predictable, cheap to run | Brittle on edge cases; every new condition is hand-coded |
| AI / predictive | A model or AI formula reads messy inputs and outputs a score with reasoning | Handles edge cases and unstructured data; adapts | Needs validation and guardrails; a black box is hard to trust blind |
The right answer is rarely one type. The strongest models run deterministic rules for the 80% of cases that follow clear logic, then layer an AI formula on the 20% of edge cases that rules can't cleanly handle, using a conditional so the AI only fires when the rule falls through. That keeps the model cheap and predictable where it can be and flexible where it has to be. The hands-on build, including the exact rule and AI formula setup, lives in the companion guide on how to build a lead scoring model.
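The rules-first, AI-on-fall-through pattern described above can be sketched in a few lines. This is an illustration under stated assumptions: the field names, the small-company rule, and `stub_ai` (a placeholder standing in for a real model call) are all hypothetical; only the headcount-and-Salesforce rule comes from the table.

```python
from typing import Callable, Optional, Tuple


def rule_score(company: dict) -> Optional[int]:
    """Deterministic rules. Returns None when no rule applies."""
    headcount = company.get("headcount")
    crm = company.get("crm")
    if headcount is None or crm is None:
        return None  # missing fields: let the fallback decide
    if headcount > 500 and crm == "Salesforce":
        return 9     # rule from the comparison table
    if headcount < 10:
        return 1     # hypothetical second rule for illustration
    return None      # rule falls through


def hybrid_score(company: dict, ai_fallback: Callable[[dict], int]) -> Tuple[int, str]:
    """Call the AI fallback only when the rules return nothing."""
    score = rule_score(company)
    if score is not None:
        return score, "rule"
    return ai_fallback(company), "ai"


ai_calls = []


def stub_ai(company: dict) -> int:
    # Placeholder: a real implementation would call a model here.
    ai_calls.append(company["name"])
    return 5


companies = [
    {"name": "Acme", "headcount": 1200, "crm": "Salesforce"},
    {"name": "Tiny", "headcount": 4, "crm": "HubSpot"},
    {"name": "Mid", "headcount": 300, "crm": "Salesforce"},
    {"name": "Unknown", "headcount": None, "crm": None},
]
results = {c["name"]: hybrid_score(c, stub_ai) for c in companies}
print(results)
print("AI was called for:", ai_calls)
```

Returning the source alongside the score ("rule" or "ai") makes it easy to audit how often the fallback fires, which is also a rough proxy for what the AI layer costs to run.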
How to score fit and intent with AI in Clay
A score becomes accurate when the inputs feeding it are accurate, and most fit inputs are messier than a point system assumes. Industry is the clearest example: standard category labels lump cosmetics and apparel both under "Retail," and different data providers use "IT," "Software," and "Internet technology" for the same sector. A rule that scores on raw industry strings inherits all of that noise. An AI formula can map each company to your own custom industry list before the score ever reads it.
The same approach works for the fit inputs that feed a score: Claygent inside Clay classifies each prospect into a defined industry list instead of trusting the provider's label. Here is a seniority-scoring formula in the same pattern, scoring decision-making power from the job title rather than keyword-matching it.
Based on this person's job title, assign a seniority score from 1 to 10. Consider: C-level executives (9-10), VPs and Directors (7-8), Managers (5-6), Individual contributors (3-4), Interns/Entry-level (1-2). Return only the number.
The industry-mapping prompt follows the same pattern:
You will be provided with an industry value. Map this value to the list of industries in <List of Industries></List of Industries> tags and select the single best match. Only return values from the list; do not generate new ones. If no reasonable match exists, return "Other". Return only the matching value, with no other words or explanation.
Start with a native function for the bulk of cases, test it on a batch of 10 to 20 rows, and add an AI formula as a fallback only for the cases the function gets wrong. That balance of deterministic speed and AI flexibility is what keeps the model both cheap and accurate as your inputs drift.
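Outside of Clay, the "native function first, test on a small batch, then find what needs a fallback" step might look like the sketch below. The custom industry list, the alias table, and the function names are assumptions made for illustration; the provider labels are the ones mentioned earlier in this section.

```python
from typing import List, Optional

# Hypothetical custom industry list; yours will differ.
INDUSTRY_LIST = ["Software", "Cosmetics", "Apparel", "Other"]

# Provider labels that map cleanly onto one custom industry.
ALIASES = {
    "it": "Software",
    "software": "Software",
    "internet technology": "Software",
}


def normalize_industry(raw: str) -> Optional[str]:
    """Deterministic lookup. Returns None when the label is not cleanly mappable."""
    return ALIASES.get(raw.strip().lower())


def find_unmapped(rows: List[str]) -> List[str]:
    """Rows the lookup cannot resolve: the only candidates for an AI fallback."""
    return [row for row in rows if normalize_industry(row) is None]


# A small test batch of raw provider labels.
sample = ["IT", "Software", " Internet technology ", "Retail", "software"]
misses = find_unmapped(sample)
print(f"{len(sample) - len(misses)} of {len(sample)} mapped by lookup")
print("Needs fallback:", misses)
```

"Retail" falls through on purpose: the label alone cannot say whether the company sells cosmetics or apparel, which is the kind of case where a prompt with more company context earns its cost.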
What makes a lead score trustworthy
A score becomes trustworthy the moment it is checked against who actually closed, and almost no score is. The validation step is simple and almost always skipped: take six months of closed deals, run them back through the model as they looked at the moment they were leads, and check whether the deals that closed actually scored higher than the deals that died. If your "hot" leads converted at the same rate as your "warm" ones, the score is decoration.
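The replay check can be sketched as a conversion-rate comparison across score bands. The band cutoffs (70 and 40 on a 0 to 100 scale) and the twelve historical deals are invented for illustration; in practice the input would be your own closed-won and closed-lost records with the score each lead would have received at the time.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def band(score: float) -> str:
    """Hypothetical cutoffs on a 0-100 score."""
    if score >= 70:
        return "hot"
    if score >= 40:
        return "warm"
    return "cold"


def conversion_by_band(deals: Iterable[Tuple[float, bool]]) -> Dict[str, float]:
    """deals: (score at lead time, closed_won). Returns the win rate per band."""
    won = defaultdict(int)
    total = defaultdict(int)
    for score, closed_won in deals:
        b = band(score)
        total[b] += 1
        won[b] += int(closed_won)
    return {b: won[b] / total[b] for b in total}


# Invented sample history, four deals per band.
history = [
    (85, True), (90, True), (75, False), (72, True),
    (55, True), (50, False), (45, False), (60, False),
    (20, False), (10, False), (30, False), (15, False),
]

rates = conversion_by_band(history)
score_separates = rates["hot"] > rates["warm"] > rates["cold"]
print(rates, "score separates outcomes:", score_separates)
```

If the rates come out flat across bands, the model is not separating buyers from non-buyers, whatever the individual scores look like.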
That check is also how you set the weights in the first place. A weight is not a vote on how important an attribute feels; it is a measurement of how much that attribute separated buyers from non-buyers in your own history. If accounts using a competing tool closed at three times the base rate and accounts in a target industry closed at the base rate, the competitor signal earns a heavier weight and the industry signal earns almost none, regardless of which one felt more important in the planning meeting. Trust in a model comes from one property: the leads it ranks highest convert at a measurably higher rate than the leads it ranks lowest.
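One simple way to turn that measurement into a number is lift: the close rate among deals that had an attribute, divided by the overall base rate. The sketch below uses twelve invented deals constructed to mirror the example in the paragraph (a competitor signal at three times the base rate, a target-industry signal at the base rate); the field names are assumptions.

```python
from typing import List, Optional


def lift(deals: List[dict], attribute: str) -> Optional[float]:
    """Close rate with the attribute divided by the overall close rate."""
    base_rate = sum(d["won"] for d in deals) / len(deals)
    with_attr = [d for d in deals if d[attribute]]
    if not with_attr or base_rate == 0:
        return None  # not enough evidence to measure
    attr_rate = sum(d["won"] for d in with_attr) / len(with_attr)
    return attr_rate / base_rate


def deal(won: bool, uses_competitor: bool, target_industry: bool) -> dict:
    return {"won": won, "uses_competitor": uses_competitor,
            "target_industry": target_industry}


# Invented history: 3 wins out of 12 deals (base rate 0.25).
deals = [
    deal(True, True, True),
    deal(True, True, False),
    deal(True, True, False),
    deal(False, True, True),
    deal(False, False, True),
    deal(False, False, True),
] + [deal(False, False, False) for _ in range(6)]

competitor_lift = lift(deals, "uses_competitor")
industry_lift = lift(deals, "target_industry")
print("competitor lift:", competitor_lift)
print("industry lift:", industry_lift)
```

A lift near 1.0 means the attribute did not separate buyers from non-buyers and earns little or no weight. With real data the sample behind each attribute is often small, so treat a lift computed from a handful of deals as a hint rather than a measurement.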
“Clay transformed how we source, enrich and act on data. Having the ability to define what really matters in an ICP and deliver high-quality lists in minutes has driven both stronger revenue outcomes and significantly lower acquisition costs for our teams.”
When scoring runs on the signals that predict a purchase rather than the attributes that describe a company, the downstream numbers move. Teams that score every lead on fit and intent and then route on that score see the share of leads their reps treat as genuinely qualified climb, because the ranking finally matches reality.
Lift in sales-qualified leads ElevenLabs saw after scoring every lead on fit and intent in Clay.
How a lead score drives routing and prioritization
A score is only useful if it changes what happens to the lead. The number's whole job is to decide order of attention: which leads reach a live rep in minutes, which go to a nurture track, and which sit in marketing until a signal moves them. Without that downstream wiring, a score is a column nobody reads.
The routing logic reads the score and acts on it automatically. High-fit, high-intent leads get pushed to a rep instantly, because for inbound the odds of qualifying a lead drop by 80% after the first five minutes. Mid-tier leads enter a nurture sequence. Low scores hold in marketing. In Clay, the score feeds a round-robin distribution action that reads a live rep table (rep name, status, territory, weight) and assigns the lead to an available rep, with higher-weighted reps drawing more leads. The score decides priority; the routing decides destination. The full inbound routing build is covered in the companion guide on automating inbound lead qualification.
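The split between priority and destination can be sketched as two small functions: one that maps a score to a tier, and one that picks a rep. This is one common way to implement a weighted round robin, not a description of how Clay's action works internally; the tier cutoffs, the `Rep` fields, and the tie-breaking rule are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rep:
    name: str
    status: str     # "available" or "away"
    territory: str
    weight: int     # relative share of leads
    assigned: int = 0


def tier(score: float) -> str:
    """Hypothetical cutoffs on a 0-1 fit-times-intent score."""
    if score >= 0.6:
        return "rep"
    if score >= 0.3:
        return "nurture"
    return "marketing"


def pick_rep(reps: List[Rep], territory: str) -> Optional[Rep]:
    """Choose the available rep who is furthest below their weighted share."""
    pool = [r for r in reps
            if r.status == "available" and r.territory == territory and r.weight > 0]
    if not pool:
        return None
    chosen = min(pool, key=lambda r: r.assigned / r.weight)
    chosen.assigned += 1
    return chosen


def route(score: float, territory: str, reps: List[Rep]) -> str:
    destination = tier(score)
    if destination != "rep":
        return destination
    rep = pick_rep(reps, territory)
    return rep.name if rep else "unassigned"


reps = [
    Rep("Ana", "available", "NA", weight=2),
    Rep("Ben", "available", "NA", weight=1),
    Rep("Cy", "away", "NA", weight=3),
]

hot_assignments = [route(0.8, "NA", reps) for _ in range(6)]
print(hot_assignments)
print(route(0.4, "NA", reps), route(0.1, "NA", reps))
```

Over six hot leads, the rep with weight 2 draws twice as many as the rep with weight 1, and the rep marked away draws none regardless of weight.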
The failure modes that break most lead scoring models
Most broken scoring models fail in one of three ways, and each is a shortcut taken at setup that compounds silently for months.
All three are invisible until you measure. A model on guessed weights, never validated, scoring on vanity activity still produces confident-looking numbers every day, and reps work them in good faith until someone finally compares the ranking against revenue and finds no relationship.
The three ways lead scoring breaks are all setup shortcuts: guessed weights, no validation, and scoring on activity that does not predict revenue.
The cheapest insurance against all three is the same validation pass: run closed deals back through the model and let the data tell you which inputs earned their weight.
Where to start with lead scoring
Start with outcomes, not attributes. Pull your last six months of closed-won and closed-lost, and look for the attributes and signals that actually separated the two; those become your weighted inputs, and everything that did not separate them gets dropped. Then split the model into fit and intent, score each separately, and multiply, so a strong reading on one axis can never hide a zero on the other.
Build the deterministic rules first for the cases that follow clear logic, add AI formulas only for the edge cases rules can't handle cleanly, and wire the score into routing so it actually changes which leads reach a rep first. Then close the loop: every quarter, replay closed deals through the model and re-weight on what the data now shows. The build itself, step by step, is in the companion guide on how to build a lead scoring model; the fit definition that anchors it is in the complete guide to ICP.