
ICP Scoring Criteria for B2B Sales: The 2026 Operating Framework

A concrete ICP scoring rubric for B2B sales — five weighted dimensions, the data source behind each, a -5 to +5 scale, and one worked example that takes 500 accounts down to the 50 your SDRs work first. Build it in a spreadsheet by lunchtime.

June 13, 2026


You have a list of 500 accounts. You have a five-person SDR team. By Friday, they need to be working the right 50 first. This article exists to make that decision defensible.

The temptation, when a sales leader sits down to build an ICP scoring rubric, is to list every imaginable attribute and assign it a point value. That produces a spreadsheet that scores every account at nearly the same number, because every attribute matters a little and nothing matters a lot. The operating framework below does the opposite: five dimensions, weighted unevenly, anchored to data sources, and stress-tested against a worked example.

What ICP scoring actually is — and what it isn't

An Ideal Customer Profile scoring rubric is a numerical model — usually producing a 0–100 fit score — that ranks accounts by how closely they resemble your highest-value, lowest-churn, most-expansion-prone customers. It is not the same as lead scoring, which adds behavioral and intent signals on top of fit; ICP scoring is the firmographic and structural substrate. HubSpot's documentation makes the distinction cleanly: fit scoring measures demographic and firmographic alignment — job title, company size, industry — while engagement scoring tracks behavioral signals.

The reason this distinction matters is operational. ICP fit is mostly static — a company's industry, employee count, and tech stack don't change weekly. Intent and engagement signals move daily. If you mix them in a single score, you can't tell whether an account jumped from 60 to 85 because they're suddenly a better fit (rare) or because they downloaded three white papers (common). Keeping the scores separate, then layering them, is the practice that wins.
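One way to keep the two scores separate and still use both is to store them as distinct fields and combine them only at sort time. The sketch below is a minimal illustration in Python. The field names, the fit threshold of 70, and the sample accounts are all assumptions made for the example, not part of any vendor's model.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    fit_score: float         # 0-100, mostly static (firmographic and structural)
    engagement_score: float  # 0-100, moves daily (behavioral and intent)


FIT_THRESHOLD = 70  # hypothetical cut-off for "good fit"


def prioritize(accounts):
    """Put good-fit accounts first; engagement only orders accounts within a tier."""
    return sorted(
        accounts,
        key=lambda a: (a.fit_score >= FIT_THRESHOLD, a.engagement_score, a.fit_score),
        reverse=True,
    )


accounts = [
    Account("Grazer Inc", fit_score=40, engagement_score=95),
    Account("Quiet Fit Co", fit_score=88, engagement_score=20),
    Account("Hot Fit Ltd", fit_score=85, engagement_score=80),
]
ranked = [a.name for a in prioritize(accounts)]
print(ranked)
```

Because the fit tier is the first sort key, a heavily engaged but poorly fitting account cannot leapfrog a good-fit account, and a jump in either score stays traceable to its own column.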

The upside is well documented: scoring your ICP focuses marketing and sales effort on the accounts most likely to convert, and the resulting effects compound across prioritization, personalization, sales-cycle compression, GTM alignment, and resource allocation.

The five-dimension framework

The framework below collapses the wide field of published ICP-scoring work — 6sense's account model, HubSpot's fit/engagement split, Clearbit's data-activation approach, GTM Partners' −5 to +5 scale, and the rubrics published across the RevOps community — into five dimensions with defensible weights.

Dimension 1 — Firmographic fit (30%)

What it is: the structural shape of the account. Industry, sub-vertical, employee count, revenue band, geography, funding stage.

Why it gets the largest single weight: firmographic mismatch is the strongest predictor of churn in nearly every published cohort study. A company that doesn't fit your firmographic profile can sometimes be sold, but rarely renews and almost never expands. The 30% weight reflects that this is the criterion that most predicts long-term value, not just the initial close.

Data source: ZoomInfo, Clearbit (now part of HubSpot), Apollo, Cognism. For private companies missing revenue, use employee count as the proxy.

Sub-criterion	Internal weight
Industry / sub-vertical match	40%
Employee count band	25%
Revenue band or funding stage	20%
Geography	10%
Headquarters vs. operating market	5%

Dimension 2 — Technographic fit (20%)

What it is: what they already use, and what they don't. CRM, marketing automation, data warehouse, communication stack, billing platform, integrations.

Why it matters: technographic fit predicts both implementation speed and stickiness. A prospect already running on the tools your product integrates with closes faster, implements faster, and churns less. It's the single most under-used signal in mid-market B2B.

Data source: BuiltWith, HG Insights, Wappalyzer, Clearbit Reveal. For enterprise targets, augment with public-source signals — job postings often reveal the tech stack.

Sub-criterion	Internal weight
Integration with required tech stack	50%
Incumbent in your category (present/absent)	25%
Adjacent tool presence (readiness signal)	15%
Tech sophistication (public engineering footprint)	10%

Dimension 3 — Intent signals (20%)

What it is: the dynamic layer. Are they researching your category right now? Bombora research-topic data, G2 buyer-intent signals, third-party content engagement, anonymous website visits.

Why it gets 20%: intent is the closest thing to a buying signal you can detect before the prospect raises their hand. But it's noisy — and a prospect with high intent but low firmographic fit is still a bad prospect. Intent acts as a multiplier on fit, not a substitute for it. As Bombora itself frames it: having both an ICP and intent data is ideal, because only ~15% of your ICP is in-market at a given time.

Data source: Bombora, G2, 6sense, ZoomInfo Intent, Clearbit Reveal, Demandbase.

Sub-criterion	Internal weight
Surging intent topics relevant to your category	50%
Anonymous website traffic from target accounts	30%
Competitor-mention engagement	20%

Dimension 4 — Behavioral engagement (15%)

What it is: what the account has actually done with you. Email opens, form fills, content downloads, demo requests, webinar attendance, sales-call interactions.

Why it gets 15%: engagement is the strongest single predictor of meeting acceptance — but it can be gamed by free-content downloaders who never buy. The 15% weight keeps it material without letting white-paper grazers dominate the queue.

Data source: HubSpot, Marketo, Salesforce Pardot, your own product analytics if you're product-led.

Sub-criterion	Internal weight
Multi-stakeholder engagement (buying-committee signal)	40%
High-intent action (demo request, pricing-page visit)	35%
Recency (last 30 vs. 90 days)	15%
Frequency (number of engagement events)	10%

Dimension 5 — Trigger events and negative signals (15%)

What it is: the active state of the account. Recent funding, leadership change, new office, M&A, a hiring surge in the relevant function. And the inverse: layoffs, downsizing, public financial distress, a recent competitor purchase.

Why 15%: trigger events are the most actionable, most time-sensitive signal in the rubric. A perfect-fit account that just hired a new VP of the relevant function deserves to be at the top of the queue this week. The negative signals matter just as much — most rubrics fail to subtract points for "just signed a competitor," and the result is wasted SDR cycles.

Data source: Crunchbase, LinkedIn Sales Navigator, news-intent feeds, your own win/loss data.

Sub-criterion	Internal weight
Relevant hire	30%
Funding event	25%
Tech-stack change	20%
Negative signal (competitor signed, contract locked)	−50% modifier on the whole account
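The table does not say where in the calculation the −50% modifier is applied. A minimal sketch, assuming it is applied to the final 0–100 fit score (the function name and that placement are assumptions):

```python
def apply_negative_signal(fit_score_0_100: float, has_negative_signal: bool) -> float:
    """Halve the account's 0-100 fit score when a disqualifying signal is present.

    Assumption: the modifier acts on the final 0-100 score. Halving a negative
    value on the -5..+5 scale would raise it toward zero, which is the opposite
    of a penalty.
    """
    if has_negative_signal:
        return fit_score_0_100 * 0.5
    return fit_score_0_100


# A strong-fit account that just signed a competitor drops out of the top tier.
print(apply_negative_signal(92, True))
print(apply_negative_signal(92, False))
```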

The scoring scale — and why it's not 1 to 10

The most-replicated single insight from the GTM Partners framework is that the scoring scale should not be 1 to 10, or even 1 to 5. It should be −5, −3, −1, +1, +3, +5 — and you should explicitly forbid yourself from using −4, −2, +2, or +4.

The reasoning is forcing-function. When you allow a 2 or a 4, you allow a middle-ground answer to every question. When you force scorers to choose between −3 and −1 (or +1 and +3), you force them to take a position. The clarity that emerges from forced choice is dramatically higher than the false precision of a 1–10 scale.

The interpretive frame:

Value	What it means
−5	Companies with this attribute churn at a higher rate than average
−3	May not churn as much, but consume more time and resources
−1	Uncertainty — you haven't sold to such companies before
+1	You can service them, but growth beyond the initial contract is limited
+3	Clear advantages; customers are happy and there's differentiation
+5	High growth potential; lifetime value is much higher via upsell and expansion

These six values get applied to every sub-criterion. The weighted sum becomes the dimension score. The five dimensions get composited into the overall fit score.
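The roll-up described here can be sketched in a few lines of Python. The weights come from the Dimension 1 table and the five dimension headings; the function and key names are illustrative assumptions, and the validation step simply enforces the forced-choice scale.

```python
ALLOWED_SCORES = {-5, -3, -1, 1, 3, 5}  # forced choice: no -4, -2, +2, +4

# Internal weights for Dimension 1 (firmographic fit).
FIRMOGRAPHIC_WEIGHTS = {
    "industry": 0.40,
    "employee_band": 0.25,
    "revenue_or_funding": 0.20,
    "geography": 0.10,
    "hq_vs_market": 0.05,
}

# Top-level weights for the five dimensions.
DIMENSION_WEIGHTS = {
    "firmographic": 0.30,
    "technographic": 0.20,
    "intent": 0.20,
    "engagement": 0.15,
    "triggers": 0.15,
}


def dimension_score(scores, weights):
    """Weighted sum of sub-criterion scores, staying on the -5..+5 scale."""
    for name in weights:
        if scores[name] not in ALLOWED_SCORES:
            raise ValueError(f"{name}: {scores[name]} is not on the forced-choice scale")
    return sum(weights[name] * scores[name] for name in weights)


def composite(dimension_scores):
    """Combine the five dimension scores into one value on the -5..+5 scale."""
    return sum(DIMENSION_WEIGHTS[d] * dimension_scores[d] for d in DIMENSION_WEIGHTS)


example = {
    "industry": 5,
    "employee_band": 5,
    "revenue_or_funding": 5,
    "geography": 3,
    "hq_vs_market": 3,
}
firmographic = dimension_score(example, FIRMOGRAPHIC_WEIGHTS)
print(round(firmographic, 2))
```

Rejecting a 2 or a 4 at entry time, rather than rounding it, keeps the forcing function intact: the scorer has to go back and pick a side.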


The worked example: scoring 500 SaaS accounts for an SDR team

Let's run the framework end-to-end. You sell a sales-engagement platform priced at $35K–$120K ARR. Your best customers are mid-market B2B SaaS companies with 200–1,500 employees, an existing Salesforce instance, an inbound team already in motion, and recent Series B/C funding.

You have an outbound list of 500 accounts pulled from Apollo. Your five SDRs can credibly work about 50 accounts each per week, so you need the top 50 prioritized by Monday morning. Here are two representative accounts scored against the rubric — Account #1 (a Series C SaaS company, 800 employees, on Salesforce) versus Account #2 (a Series A, 80 employees, on HubSpot CRM):

Sub-criterion	Weight	Acct #1	Acct #2
Industry match — B2B SaaS	12%	+5	+5
Employee count band	7.5%	+5	−3
Revenue / funding	6%	+5	−1
Geography	3%	+3	+3
HQ / ops market	1.5%	+3	+3
Tech-stack integration — Salesforce	10%	+5	−3
Incumbent in category	5%	+1	−1
Adjacent tool presence	3%	+5	+1
Tech sophistication	2%	+5	+3
Bombora surging topics	10%	+5	−1
Anonymous traffic	6%	+3	−1
Competitor engagement	4%	+1	−1
Multi-stakeholder engagement	6%	+5	−1
High-intent action	5.25%	+3	−1
Recency	2.25%	+5	−1
Frequency	1.5%	+3	−1
Relevant hire	4.5%	+5	−1
Funding event	3.75%	+5	−1
Tech-stack change	3%	+1	−1
Negative-signal modifier	—	0	0

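The weighted sum for Account #1 comes to about +4.0 on the −5 to +5 scale (398.75 weighted points divided by 100). Mapped to the 0–100 range (where +5 = 100 and −5 = 0), that's a fit score of roughly 90. The listed weights total 96.25% rather than 100%, because the Dimension 5 sub-weights add up to 75%; dividing by 96.25 instead gives about +4.1, or a fit score of roughly 91.

The weighted sum for Account #2 comes to about −0.3, which maps to a fit score of about 47.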
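To check the arithmetic yourself, the table can be recomputed directly. This is a sketch, not a reference implementation: the rows are copied from the table above, while the function names and the `renormalize` flag are assumptions added for illustration.

```python
# (sub-criterion, effective weight in %, Acct #1 score, Acct #2 score)
ROWS = [
    ("Industry match", 12, 5, 5),
    ("Employee count band", 7.5, 5, -3),
    ("Revenue / funding", 6, 5, -1),
    ("Geography", 3, 3, 3),
    ("HQ / ops market", 1.5, 3, 3),
    ("Tech-stack integration", 10, 5, -3),
    ("Incumbent in category", 5, 1, -1),
    ("Adjacent tool presence", 3, 5, 1),
    ("Tech sophistication", 2, 5, 3),
    ("Bombora surging topics", 10, 5, -1),
    ("Anonymous traffic", 6, 3, -1),
    ("Competitor engagement", 4, 1, -1),
    ("Multi-stakeholder engagement", 6, 5, -1),
    ("High-intent action", 5.25, 3, -1),
    ("Recency", 2.25, 5, -1),
    ("Frequency", 1.5, 3, -1),
    ("Relevant hire", 4.5, 5, -1),
    ("Funding event", 3.75, 5, -1),
    ("Tech-stack change", 3, 1, -1),
]


def weighted_sum(rows, col, renormalize=False):
    """Weighted score on the -5..+5 scale; col is 2 for Acct #1, 3 for Acct #2."""
    total_weight = sum(r[1] for r in rows)
    points = sum(r[1] * r[col] for r in rows)
    return points / (total_weight if renormalize else 100)


def to_fit_score(value):
    """Map -5..+5 linearly onto 0..100 (-5 -> 0, +5 -> 100)."""
    return (value + 5) * 10


for label, col in (("Account #1", 2), ("Account #2", 3)):
    raw = weighted_sum(ROWS, col)
    print(label, round(raw, 2), round(to_fit_score(raw)))
```

The weights as listed sum to 96.25%, so the `renormalize` flag is there to show how much dividing by the actual total, rather than by 100, moves the result.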

You now have a defensible reason to put Account #1 in the "work this week" pile and Account #2 in the "nurture, revisit in 90 days" pile. Run that same calculation on all 500, sort descending, and the top 50 are your week. The prioritization debate becomes a function call.
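The "function call" can be taken literally. A minimal sketch using Python's standard library follows; the account IDs and scores are placeholder data generated for illustration, not real results.

```python
import heapq


def top_accounts(fit_scores, n=50):
    """Return the n highest-scoring (account, score) pairs, best first."""
    return heapq.nlargest(n, fit_scores.items(), key=lambda item: item[1])


# Placeholder data: 500 accounts with made-up fit scores between 0 and 100.
fit_scores = {f"acct-{i:03d}": (i * 37) % 101 for i in range(500)}

queue = top_accounts(fit_scores, n=50)
print(queue[:3])
```

In a spreadsheet the equivalent is a descending sort on the fit-score column followed by taking the first 50 rows.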

How to operationalize this without a six-month project

The biggest reason most ICP scoring projects die is scope. The team agrees on the framework, then someone asks "but how do we automate it?" and the project disappears into a multi-vendor procurement question. Don't do that. Do this:

Week 1 — Build the model in a spreadsheet. One row per account, one column per sub-criterion, formulas for the weighted score, conditional formatting on the fit-score column. Skip automation entirely. Score 50 accounts manually using public-source data — LinkedIn, the prospect's website, a news search. The exercise surfaces which criteria are easy to populate and which are slow, which informs your eventual data-source choices.

Week 2 — Validate against your win/loss data. Pull your last 50 closed-won and 50 closed-lost deals and score them retroactively. If the model predicts "high fit" for accounts that closed-lost, the weights need to shift. If it predicts "low fit" for accounts that closed-won at high ACV, you're missing a criterion. This calibration step is the one most teams skip — and it's the only thing that turns a theoretical model into a working one.
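The calibration check can start as a very small script. The sketch below assumes each closed deal has already been scored retroactively on the 0–100 scale; the threshold of 70, the dictionary keys, and the sample deals are hypothetical.

```python
from statistics import mean


def calibration_report(deals, high_fit=70):
    """Summarize how retroactive fit scores line up with outcomes.

    deals: list of (fit_score_0_100, outcome) where outcome is "won" or "lost".
    """
    won = [score for score, outcome in deals if outcome == "won"]
    lost = [score for score, outcome in deals if outcome == "lost"]
    high = [outcome for score, outcome in deals if score >= high_fit]
    return {
        "mean_fit_won": mean(won),
        "mean_fit_lost": mean(lost),
        # High-fit deals that were lost suggest the weights need to shift.
        "high_fit_losses": high.count("lost"),
        # Low-fit deals that were won suggest a missing criterion.
        "low_fit_wins": sum(1 for s, o in deals if s < high_fit and o == "won"),
        "win_rate_high_fit": high.count("won") / len(high) if high else None,
    }


# Hypothetical sample; in practice this is your last 50 won and 50 lost deals.
deals = [(90, "won"), (82, "won"), (75, "lost"), (40, "lost"), (35, "won"), (20, "lost")]
report = calibration_report(deals)
print(report)
```

If the mean fit score of lost deals sits close to that of won deals, the model is not separating the two groups yet, and the individual high-fit losses and low-fit wins are the accounts to inspect first.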

Week 3 — Plug in the data you already have. Most companies already pay for ZoomInfo, Apollo, or Clearbit; HubSpot or Salesforce; and one intent provider. Use what you have before buying anything. Order of operations: firmographic first (static, cheap), technographic second, intent third (dynamic, needs a real subscription), engagement fourth (already in your CRM), trigger events last (manual surveillance is fine to start).

Week 4 — Hand the model to one SDR for a pilot. Don't roll it out to the whole team yet. One SDR working a sorted list, against a control SDR working their old method. After two weeks you'll know whether the model produces measurably better meeting-acceptance rates. If it does, scale. If it doesn't, the model is wrong and you recalibrate.
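Whether the pilot is "measurably better" can be sanity-checked with a pooled two-proportion z statistic. This is a rough sketch: the meeting counts are hypothetical, and with one SDR per arm the sample is small enough that the result is a hint rather than proof.

```python
from math import sqrt


def two_proportion_z(meetings_a, accounts_a, meetings_b, accounts_b):
    """Rough z statistic for the gap between two meeting-acceptance rates."""
    rate_a = meetings_a / accounts_a
    rate_b = meetings_b / accounts_b
    pooled = (meetings_a + meetings_b) / (accounts_a + accounts_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / accounts_a + 1 / accounts_b))
    return (rate_a - rate_b) / std_err


# Hypothetical two-week pilot: meetings accepted out of accounts worked.
pilot_meetings, pilot_accounts = 12, 50
control_meetings, control_accounts = 6, 50

lift = pilot_meetings / pilot_accounts - control_meetings / control_accounts
z = two_proportion_z(pilot_meetings, pilot_accounts, control_meetings, control_accounts)
print(round(lift, 2), round(z, 2))
```

A z value below roughly 2 means the gap could still be noise at this sample size, in which case extending the pilot is usually a safer call than scaling or recalibrating.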

Small, fast, validated, then scaled, not a big-bang rollout. The published 6sense case studies that report 450 new opportunities in six months and large meeting-to-SQL lifts all describe a version of this incremental deployment, not a perfect model built in a vacuum.

Five common mistakes — each one expensive

1. Treating ICP as a marketing artifact instead of an operating system. A document called "Our ICP" sitting in a Google Drive is not an ICP. An ICP is a scoring model that ranks every account in your CRM. The discipline of scoring produces the result, not the discipline of describing.

2. Letting recency bias dominate the calibration. The deals you closed last quarter feel important. The customers who've been with you three years and renewed twice are more important. Calibrate against multi-year retention and expansion, not just initial close. A criterion that produces high initial-close rates but mediocre renewals is mispriced.

3. Ignoring negative signals. Most rubrics in the wild only add points. Subtracting for "recently signed a multi-year competitor contract" or "in an active RIF" is the difference between a model that works and one that burns SDR cycles on accounts that cannot buy.

4. Building the perfect model before testing the imperfect one. Ship a v1 that's 80% right, run it on real accounts, iterate. You cannot calibrate a model you haven't run.

5. Confusing ICP with persona. ICP scores the account. Persona is who you talk to within it. The same account can have a great fit score and the wrong contact, or vice versa. Score them separately, prioritize on the composite — and once an account clears the bar, the work shifts to the human across the table. That's where a tight list of discovery questions and a qualification framework like MEDDIC take over, turning a high-fit account into a real opportunity.

The deeper point

The discipline of building this rubric — not just the rubric itself — is the actual value. By the time you've defined your five dimensions, weighted them, sourced the data, and validated against win/loss, you've learned more about your own business than any external consultant could tell you.

The teams that win at ICP scoring treat it as a quarterly operating ritual, not a one-time exercise. Each quarter, recalibrate against the previous quarter's closed-won deals. Watch which criteria moved. Adjust the weights. The lack of a specific ideal customer profile is the root cause of most marketing waste, and the rubric above is the cheapest way to fix it — for marketing and for the sellers who have to prioritize the VP of Sales and the buying committee underneath every account.

The 500-to-50 problem you started with becomes a function call, not a debate. Five SDRs, top 50 accounts each, scored against a defensible model, calibrated against your own data. That's the entire operating system. Everything else is tooling.


A note on sources

This article synthesizes the published ICP and lead-scoring literature: the GTM Partners / GTMonday framework on ICP scoring and the −5 to +5 forced-choice scale; Bombora's intent-data research on in-market percentages; HubSpot's fit-versus-engagement scoring documentation and ICP worksheet; Clearbit's work on ICP construction and B2B intent data; 6sense's account-scoring methodology and customer case studies; Mark Roberge's The Sales Acceleration Formula; Lenny Rachitsky's founder interviews on identifying an ICP; and the practitioner writing of RevOps leaders including Sangram Vajre, Jeff Ignacio, Asia Orangio, and Pete Kazanjy across the Pavilion, RevOps FM, and SaaStr communities. The five-dimension rubric is the operating distillation of those sources, calibrated for a sales team that has to turn a 500-account list into this week's queue.
