How to Use Claygent for AI Prospect Research

Answer the custom questions no database has a field for. Use Claygent to run structured, cited AI prospect research across a whole list.

April 30, 2026 · 9 min read

Company (any URL) → Claygent (research) → Insight (structured)

A database can only answer the questions everyone else can also ask. Industry, headcount, funding stage, technologies installed: any provider sells those, which means every competitor targeting your market already has them. The research that actually wins is the question no vendor has a field for. Does this company run an in-house security team, or outsource it? Did they just open a second location? Did the founder say something on a recent podcast you can quote back to them? Claygent answers the custom questions no database has a field for, at the scale of a whole list, and the skill that separates good research from garbage is writing prompts that return structured, cited output instead of confident guesses. This is how to build it.

What AI prospect research actually changes

Most "AI prospecting" is a faster way to pull the same fields. The shift that matters is different: you can now ask a question in plain language and get a structured answer for every row, drawn from the live web rather than a stale export.

Claygent is Clay's AI web research agent. It reads pages the way a person does, then returns a structured answer you define. Point it at a column of companies, give it a prompt, and it runs that prompt against every row, visiting the actual sites and returning the field you asked for. The difference between a static provider and an agent is not speed. It is what kind of question you are allowed to ask.

Same question, two systems: a fixed record vs. a researched answer

Static data provider
Industry: Software
Employee count: 240
HQ location: Austin, TX
Funding stage: Series B

A database returns the same fixed fields no matter what you ask. An agent answers the specific question, with evidence, for every row.

Clay's own team frames this as the last-mile data problem: the data no provider can give you because it is too niche to sell. Whether a local dentist offers Invisalign. Whether a founder spoke at a panel last month. What a company's pricing page actually says. That data lives on the open web, not in a database, and Claygent is built to go get it.

Step 1: Decide what to ask before you open Clay

The hardest part of AI prospect research is not the tool. It is knowing which question separates a fit account from a waste of time, and where the answer lives publicly.

Start in a chat model, not in Clay. Brainstorm your ICP and, for each qualifying trait, ask one practical question: where would a stranger find proof of this on the public web? "Runs an in-house security team" shows up in careers pages and engineering blogs. "Sells into regulated industries" shows up in case studies and compliance pages. "Recently expanded" shows up in newsrooms and press releases. If you cannot name a public place the answer lives, Claygent cannot find it either, and you need a different question.

This is also where you separate the work an agent does well from the work it does not. Account research and scoring are judgment calls: the agent reads messy pages and returns a verdict. A field lookup like a headcount or a funding round is better handled by a regular enrichment, which is cheaper and more deterministic. Use the agent for the questions that require reading and reasoning, not for facts a provider already sells.
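The split between agent questions and enrichment fields can be written down before you open Clay. The sketch below is only an illustration of that planning step, not anything Clay provides: the `ENRICHMENT_FIELDS` set, the `Question` class, the `plan_question` helper, and the trait names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical set of fields a regular enrichment provider already sells.
ENRICHMENT_FIELDS = {"headcount", "funding_stage", "industry", "hq_location"}


@dataclass
class Question:
    trait: str                                                # the qualifying trait
    public_sources: list[str] = field(default_factory=list)  # where proof lives publicly


def plan_question(q: Question) -> str:
    """Decide how a qualifying trait should be researched."""
    if q.trait in ENRICHMENT_FIELDS:
        return "enrichment"   # cheaper and more deterministic
    if not q.public_sources:
        return "rethink"      # no public place to look, so the agent cannot find it either
    return "agent"            # needs reading and reasoning


questions = [
    Question("headcount"),
    Question("in_house_security_team", ["careers page", "engineering blog"]),
    Question("internal_budget_approved"),  # no public source named
]
plan = {q.trait: plan_question(q) for q in questions}
```

A trait that lands in "rethink" is the signal to go back to the chat model and find a different question whose proof does live on the public web.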

Step 2: Build the prompt as a structured contract

A good Claygent prompt is not a question. It is a contract: here is the context, here is exactly where to look, here is the shape of the answer I will accept, and here is what to do when you cannot find it.

Clay's team describes a strong research prompt as roughly one to two pages with a lot of context, because the model behaves like the world's smartest twelve-year-old: brilliant, literal, and lost without explicit instructions. Three parts do most of the work, and the order matters.

Turn a vague ask into a structured prompt by switching on three blocks

Starting ask: "Find out if this company is a good fit for us."

Reliability: low

A reliable prompt is a contract with three parts: context, named sources, and a locked output schema with a "Not found" floor. Each part removes a way the model can hallucinate.

The output lock is the part most people skip, and it is the part that makes the difference between research you can trust and a column of confident-sounding fiction. Three habits make Claygent output trustworthy.

Force a structured schema, require a citation, and give the model permission to say "Not found." Tell it the exact fields to return and nothing else. Make it attach the URL it read the answer from, so you can spot-check. And give it an explicit floor: when the page does not contain the answer, return "Not found" rather than inventing one. A model with no permission to fail will fabricate to fill the field. A model told "return Not found rather than guess" will leave the cell honestly empty, which is the only way the column stays usable at scale.

Here is a copy-pasteable account-research prompt that bakes in all three habits.

Claygent: in-house security team check

You are researching {{company}} ({{domain}}) for a sales team.

Context on this company: {{all_columns}}

Search the public web to determine whether this company runs an
in-house security or AppSec team, versus outsourcing security.

Look first at: the company's careers page, its engineering or
security blog, and its own about/security pages.

Return EXACTLY these fields and nothing else:
- in_house_security: "yes", "no", or "Not found"
- evidence: one sentence quoting or paraphrasing what the page says
- source_url: the exact page the evidence came from

Rules:
- Base every field only on what a real page states.
- If you cannot confirm from a page, return "Not found" for
  in_house_security and leave evidence and source_url empty.
- Do not guess, infer, or use prior knowledge about the company.
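If you export the column, the same contract can be checked mechanically. The sketch below assumes each answer arrives as a dictionary with the three fields named in the prompt; that shape, and the `validate_result` helper itself, are assumptions for illustration rather than a description of how Clay returns data.

```python
ALLOWED_ANSWERS = {"yes", "no", "Not found"}
REQUIRED_FIELDS = ("in_house_security", "evidence", "source_url")


def validate_result(result: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the row passes."""
    problems = []
    missing = [f for f in REQUIRED_FIELDS if f not in result]
    extra = sorted(set(result) - set(REQUIRED_FIELDS))
    if missing:
        problems.append(f"missing fields: {missing}")
    if extra:
        problems.append(f"unexpected fields: {extra}")
    if missing:
        return problems  # cannot check values that are not there

    answer = result["in_house_security"]
    evidence = (result["evidence"] or "").strip()
    source = (result["source_url"] or "").strip()

    if answer not in ALLOWED_ANSWERS:
        problems.append(f"answer not in schema: {answer!r}")
    elif answer == "Not found":
        # The floor: an honest gap carries no evidence and no source.
        if evidence or source:
            problems.append("'Not found' rows must leave evidence and source_url empty")
    else:
        if not evidence:
            problems.append("answer given without evidence")
        if not source.startswith(("http://", "https://")):
            problems.append("answer given without a source URL")
    return problems
```

A row that returns "yes" with an empty `source_url` fails this check, which is exactly the kind of confident-looking cell the output lock is meant to catch.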

Step 3: Feed Claygent the whole row as context

A prompt that only knows the company name is working blind. The agent does better research when it can see everything the row already knows.

In Clay, you reference other columns directly inside the prompt, so the funding stage, the headcount, the industry, and any earlier enrichment all become context the agent reads before it starts searching. Clay's team uses a "table in the table" move for this: look the row up in its own table, matching on domain equals domain, which feeds the entire row back in as a single variable with no manual column-by-column referencing. The agent walks in already knowing who this company is, which keeps it from wasting steps rediscovering facts you already have and tightens what it searches for.

Run enrichments before the agent, not after. Headcount, growth, recent job openings filtered to the keywords that matter to you: get those into the row first, then let Claygent read them as context. The agent's job is the last-mile question, and last-mile questions get sharper answers when the agent already has the firmographic floor underneath it.
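To make the idea concrete, here is roughly what the agent ends up reading once the row is folded into the prompt. Clay does this substitution for you; the `render_prompt` helper, the column names, and the sample row below are hypothetical and only show the shape of the result.

```python
def render_prompt(template: str, row: dict) -> str:
    """Fill {{placeholders}} from a row and pass the whole row in as {{all_columns}}."""
    # Flatten every non-empty column into "name: value" lines for context.
    all_columns = "\n".join(
        f"{name}: {value}" for name, value in row.items() if value not in (None, "")
    )
    prompt = template.replace("{{all_columns}}", all_columns)
    # Then fill the individual column references.
    for name, value in row.items():
        prompt = prompt.replace("{{" + name + "}}", str(value))
    return prompt


template = (
    "You are researching {{company}} ({{domain}}) for a sales team.\n"
    "Context on this company:\n{{all_columns}}"
)
row = {
    "company": "Northwind Robotics",   # sample name, not a real account
    "domain": "northwind.example",
    "headcount": 240,
    "funding_stage": "Series B",
    "open_security_roles": "",         # empty enrichment, left out of the context
}
prompt = render_prompt(template, row)
```

Empty columns are dropped here so the agent is not handed blank context lines; whether that matters in practice depends on how the row is passed in your own table.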

Step 4: Pick a model and run a test batch for free

The same prompt costs and performs differently depending on the model behind it, and the gap is wide enough that picking blind wastes money.

A heavier reasoning model returns sharper judgment on messy pages but costs more per row. A lighter model is cheap enough to run across a large list but needs an even tighter prompt to stay accurate. Test before you commit. Claygent Builder lets you run test rows for free, so you iterate on the prompt and compare models without burning a single credit, then deploy only when the output looks right.

Turn off auto-run before you add the column. Otherwise the agent fires on every row the moment it exists, and you pay for a run you have not validated yet. Add the column with auto-update off, run two or three rows, read the actual answers and the cited sources, fix the prompt, then run the list.
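The same discipline applies if you plan the run outside Clay first. This sketch picks a small, repeatable test batch and projects what the full list would cost; `pick_test_rows` and `projected_credits` are hypothetical helpers, and the per-row credit figure is a number you would supply yourself, not a Clay price.

```python
import random


def pick_test_rows(rows: list[dict], n: int = 3, seed: int = 0) -> list[dict]:
    """Pick a small, repeatable sample so prompt versions are compared on the same rows."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def projected_credits(row_count: int, credits_per_row: float) -> float:
    """Estimate the bill for a full run; credits_per_row is an input, not a known rate."""
    return row_count * credits_per_row


rows = [
    {"company": "Northwind Robotics"},
    {"company": "Cedar Health Systems"},
    {"company": "Atlas Freight Co"},
    {"company": "Brightline Tutoring"},
    {"company": "Vela Payments"},
]
test_batch = pick_test_rows(rows, n=3)
```

Fixing the seed means the second version of the prompt is judged on the same rows as the first, so a change in the answers reflects the prompt and not the sample.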

Run at list scale, a good prompt fills most rows with cited answers and leaves the rest honestly blank, so the column never hides a guess inside a confident-looking cell.

Teams using Clay this way describe compressing market-research projects that would otherwise take months into a few days. That speed only holds because the output was structured enough to refine. A V1 answer you can iterate on is a structured answer with sources attached, not a paragraph of prose you have to re-read row by row.

Step 5: Validate the output before you trust the column

A research column you have not spot-checked is a liability, because the failure mode of a confident model is a wrong answer that looks exactly like a right one.

The reason the structured-output and citation habits from Step 2 matter is that they make validation fast. Sort by the answer field, open five or six rows across the range, and click through to the cited source on each. You are checking one thing: does the page actually say what the agent reported? If the citations hold on a sample, the column is trustworthy. If the agent is returning answers with weak or missing sources, the prompt needs a harder output lock, not a different model.

This is also where the "Not found" floor pays off. Rows that came back "Not found" are honest gaps you can route to a human or a second enrichment. Rows that came back with a fabricated answer and no source are landmines that only surface after a rep quotes them on a call. The structured contract is what lets you tell the two apart at a glance.
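The spot-check can be sketched in a few lines if you work from an exported column. The code below assumes each row carries the `in_house_security` and `source_url` fields from the Step 2 prompt; the helper names are hypothetical.

```python
from collections import defaultdict


def spot_check_sample(rows: list[dict], per_answer: int = 2) -> list[dict]:
    """Take the first few rows for each distinct answer so every value gets checked."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["in_house_security"]].append(row)
    sample = []
    for answer in sorted(buckets):
        sample.extend(buckets[answer][:per_answer])
    return sample


def unsourced_answers(rows: list[dict]) -> list[dict]:
    """Rows that claim an answer but cite nothing: the landmines."""
    return [
        r for r in rows
        if r["in_house_security"] != "Not found" and not r.get("source_url")
    ]


def honest_gaps(rows: list[dict]) -> list[dict]:
    """Rows that returned "Not found", ready to route to a human or a second enrichment."""
    return [r for r in rows if r["in_house_security"] == "Not found"]
```

The sample still has to be read by a person: the code only chooses which rows to open, and the check itself is whether the cited page says what the agent reported.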

Anthropic's enrichment coverage after running custom GTM research and a work-email chain through Clay.

That kind of lift comes from chaining research and enrichment in one place: get the firmographic floor in, run the custom question on top, and verify the work email before any of it reaches a rep. Coverage goes up because no single source has to be complete on its own.

Step 6: Score and route on the answer

A research column is only worth the time if it changes what a rep does next. The point of running the question across the whole list is to act on it differently per row.

Once Claygent has returned a structured field, you can score on it the same way you score on any firmographic. Pipe the answer into a scoring agent or a formula, weight it against the rest of the row, and produce a high/medium/low fit so reps work the strongest accounts first. The custom research is what makes the score yours: anyone can score on headcount, but a score that knows whether a company runs an in-house security team is an edge no competitor pulling the same database has.

Force the final verdict to be the last thing the agent returns, after its reasoning and evidence. A model that scores first and rationalizes after tends to anchor on the score; a model that reasons, argues the other side, then commits to a score produces a verdict you can actually defend.
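For the formula route, scoring on the researched field might look like the sketch below. Every weight, threshold, and column name here is illustrative; the right values depend on your own ICP, and `fit_score` and `route` are hypothetical helpers rather than Clay features.

```python
def fit_score(row: dict) -> tuple[int, str]:
    """Weight the researched field against firmographics; all weights are illustrative."""
    score = 0
    if row.get("in_house_security") == "yes":
        score += 50   # the custom research signal carries the most weight
    if 100 <= row.get("headcount", 0) <= 1000:
        score += 30
    if row.get("funding_stage") in {"Series B", "Series C"}:
        score += 20

    if score >= 70:
        tier = "high"
    elif score >= 40:
        tier = "medium"
    else:
        tier = "low"
    return score, tier


def route(row: dict) -> str:
    """Send honest gaps to review instead of scoring them as a 'no'."""
    if row.get("in_house_security") == "Not found":
        return "needs_review"
    return fit_score(row)[1]
```

Keeping "Not found" out of the score is a design choice: a gap is not evidence against fit, so it is routed for a second look rather than silently scored low.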

Common failure modes

Most Claygent research that disappoints fails for one of a few predictable reasons, and all of them trace back to the prompt, not the agent.

The first is asking a database question. If a provider already sells the field, do not pay an agent to read pages for it; use enrichment and save the agent for the question no one sells. The second is the open-ended ask with no output schema, which returns a paragraph of prose that is impossible to score or validate at scale. The third is the missing "Not found" floor: a prompt that demands an answer for every row guarantees fabrication on the rows where no answer exists. The fourth is running the full list before testing, which turns a prompt bug into a full-list bill. The fifth is skipping validation entirely, then discovering the hallucination only after a rep has already used it.

Fix all five at the prompt layer. Ask a question the web can answer, lock the output to named fields, require a citation, permit "Not found," test a small batch for free, and spot-check the sources before you trust the column.
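Part of that checklist can be run against the prompt text itself before any rows are spent. The sketch below is a deliberately crude keyword check, assuming phrasing close to the Step 2 prompt; `lint_prompt` and the phrases it looks for are hypothetical, and a prompt can pass it and still be poorly written.

```python
# Phrases that suggest each habit is present; purely heuristic.
CHECKS = {
    "output schema": ("return exactly", "return only"),
    "citation": ("source_url", "source url", "cite"),
    "not-found floor": ("not found",),
}


def lint_prompt(prompt: str) -> list[str]:
    """Return the habits a prompt appears to be missing, in a fixed order."""
    lowered = prompt.lower()
    return [
        habit
        for habit, phrases in CHECKS.items()
        if not any(phrase in lowered for phrase in phrases)
    ]


vague = "Find out if this company is a good fit for us."
structured = (
    "Return EXACTLY these fields and nothing else: in_house_security, "
    'evidence, source_url. If you cannot confirm from a page, return "Not found".'
)
missing_from_vague = lint_prompt(vague)
missing_from_structured = lint_prompt(structured)
```

The vague starting ask from Step 2 trips all three checks, while a prompt that names its fields, asks for a source, and permits "Not found" trips none.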

Run your first AI research prompt across a real list

Build a Claygent prompt that returns structured, cited answers to the question no database has a field for.

Frequently asked questions

What is AI prospect research?

AI prospect research is using an AI agent to read the live web and return answers about your prospects that no database stores: custom qualification questions, last-mile facts, and account-specific context. Instead of pulling the same firmographic fields every competitor can buy, you ask a question in plain language and get a structured answer for every account on your list, drawn from the companies' actual sites.

Can AI actually do prospect research, or just summarize what's already there?

A good AI research agent does original lookup, not just summarization. Claygent visits real pages, reads them, and returns the specific field you asked for, so it can answer questions a static database has no field for, like whether a company runs an in-house security team or recently opened a second location. The quality depends entirely on the prompt: a structured prompt that names where to look returns researched answers, while a vague one returns guesses.

How accurate is AI for prospect research, and how do you avoid hallucinations?

Accuracy comes from the prompt, not the model. Three habits keep output trustworthy: force a structured output schema so every field is defined, require the agent to cite the source URL it read the answer from, and give it explicit permission to return "Not found" rather than invent an answer. With those in place you can sort the column, spot-check the cited sources on a sample of rows, and trust the rest.

How do you automate prospect research with AI at scale?

Put your accounts in a Clay table, add a Claygent column, and write one prompt that runs against every row. Reference the other columns so the agent reads the full context of each account before it searches, run a small test batch for free to validate the prompt, then run the whole list. The single prompt does the research a team of SDRs would do manually, with the same structured field returned for every account.

What's the difference between using Claygent and a regular data enrichment?

A data enrichment looks up facts a provider already sells: headcount, funding, technologies, location. It is cheaper and more deterministic, so use it for anything a database stores. Claygent is for the judgment-based, last-mile questions no provider has a field for, where the agent has to read pages and reason about what it finds. Run enrichments first to build the firmographic floor, then point the agent at the custom question that gives you an edge.
