Live results
Total words
0
Keyword count
0
Density
0.00%
Density band
Enter a target keyword or phrase to see a density band.
Recommendations
Type a target keyword or phrase to compare against your word count.
Highlighted preview
Exact matches of your target phrase are marked below (read-only).
Paste content to see highlights.
Top 20 words (1-gram)
Phrase
Count
Density
No data yet—paste text or enable different options.
Top 20 phrases (2-gram)
Phrase
Count
Density
No data yet—paste text or enable different options.
Top 20 phrases (3-gram)
Phrase
Count
Density
No data yet—paste text or enable different options.
Bands shown (green ~1–2%, yellow ~2–3%, red >3%) are rules of thumb for English marketing copy—not a Google ranking formula. Always prioritize helpful, natural language over hitting a number.
About this tool
Keyword density is the percentage of words in a piece of text that match a specific keyword or phrase, usually calculated as (number of exact matches ÷ total word count) × 100. In the early 2000s, many SEO checklists treated density like a dial you could turn to “rank higher.” Search engines have long since moved toward semantic understanding, user intent, and overall content quality—so no reputable practitioner today optimizes to a single magic percentage. What remains valuable is moderation: copy that never mentions its topic can feel off-topic to readers and weak in topical signals, while copy that repeats the same phrase mechanically triggers quality alarms and reads like spam.
This free SynthQuery Keyword Density Calculator helps content writers, SEO specialists, editors, and bloggers sanity-check drafts in the browser. Paste body copy, enter a target phrase, and see live word count, exact-match frequency, density, a highlighted preview, and ranked 1-, 2-, and 3-word phrases (N-grams) so you can spot accidental repetition and gauge natural language variety. Optional stop-word filtering removes high-frequency glue words such as “the,” “a,” “is,” and “and” from the N-gram tables—useful when you want to see meaningful multi-word patterns instead of phrases dominated by function words. Case-insensitive matching is toggled on by default so “SEO” and “seo” count together when that is what you intend.
Treat the green, yellow, and red bands in the tool as editorial guardrails, not ranking guarantees. Google’s public guidance has consistently emphasized helpful content and avoiding keyword stuffing rather than endorsing a fixed density target. Pair this utility with SynthQuery’s Word Counter for length and readability context, the Meta Title & Description Length Checker for SERP packaging, and the Grammar Checker before you publish. When you need the full AI writing stack—detection, humanization, plagiarism, and more—continue from the tools directory at synthquery.com/tools.
What this tool does
The calculator recomputes on every edit with no server round trip: your text stays in the browser session, with optional sessionStorage recovery if you refresh accidentally (within a generous character cap designed to keep tabs responsive). That architecture matters for agencies handling embargoed copy, legal drafts, or unpublished campaigns—you are not uploading paragraphs to a third-party API just to see a percentage.
Core metrics include total word count (whitespace-delimited, matching the behavior most SEO tools use for quick checks), exact-match count for your chosen phrase, and keyword density derived from those two numbers. The read-only preview mirrors your paste and wraps each match in a semantic mark element styled with SynthQuery tokens so highlights remain visible in dark mode without hard-coded hex colors. N-gram extraction tokenizes on non-word boundaries, lowercases for grouping, and rolls sliding windows of size one, two, and three to list the twenty most frequent phrases in each category with both raw count and density relative to total words.
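That extraction pipeline can be sketched in a few lines of JavaScript. This is illustrative only—the function name and shape are assumptions, not the tool's shipped source:

```javascript
// Sketch of the N-gram extraction described above: tokenize on
// non-word boundaries, lowercase for grouping, slide a window of
// size n, and rank the most frequent phrases with their density.
function topNGrams(text, n, limit = 20) {
  const tokens = text.toLowerCase().split(/\W+/).filter(Boolean);
  const counts = new Map();
  for (let i = 0; i + n <= tokens.length; i++) {
    const phrase = tokens.slice(i, i + n).join(" ");
    counts.set(phrase, (counts.get(phrase) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([phrase, count]) => ({
      phrase,
      count,
      density: ((count / tokens.length) * 100).toFixed(2) + "%",
    }));
}
```

Calling `topNGrams(draft, 2)` would return the twenty most frequent two-word phrases, each with a raw count and a density relative to total tokens.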
The stop-word list covers common English function words; enabling the filter removes them before building N-grams so tables surface substantive phrases—handy for blog posts and guides where you expect varied vocabulary. It does not delete stop words from your original textarea or from keyword matching; it only changes how the auxiliary tables are built. Case-insensitive matching compares a normalized lowercase view while preserving your original characters in the highlight preview. Clear resets inputs, toggles, and local scratch storage so you can demo the tool or start a fresh audit quickly.
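A minimal version of that filtering step might look like the following, with a deliberately tiny word list standing in for the tool's real stop-word list:

```javascript
// Illustrative stop-word filter applied to the token stream before
// N-grams are built; it never touches the original text or the
// target-phrase match count, only the auxiliary tables.
const STOP_WORDS = new Set(["the", "a", "an", "and", "is", "of", "to", "in"]);

function filterStopWords(tokens) {
  return tokens.filter((t) => !STOP_WORDS.has(t.toLowerCase()));
}
```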
Recommendations below the headline stats translate density into plain language: they flag very low presence, comfortable bands, and potential stuffing, echoing the color cues (green for roughly one to two percent, yellow for roughly two to three percent, red above three percent). Export delivers either a multi-section CSV—summary row plus separate blocks for 1-, 2-, and 3-grams—or a narrative text file you can attach to Jira, Notion, or email. Together, these pieces give editors a fast, repeatable checklist pass without pretending that density alone defines SEO success.
Technical details
Keyword density here is strictly (exact matches of your target string ÷ total words) × 100, where “word” means whitespace-separated tokens in the full pasted text. That definition is easy to audit and matches how many classic SEO calculators explain their headline number, but it is not identical to every CMS analytics package—always align methodology when you compare scores across tools.
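Under those definitions, the headline calculation can be sketched as follows (hypothetical function name, not the shipped code): whitespace-separated tokens in the denominator, non-overlapping exact matches in the numerator.

```javascript
// Density exactly as defined above: (non-overlapping exact matches
// of the target string ÷ whitespace-separated word count) × 100.
function keywordDensity(text, phrase, caseInsensitive = true) {
  const words = text.trim().split(/\s+/).filter(Boolean);
  const haystack = caseInsensitive ? text.toLowerCase() : text;
  const needle = caseInsensitive ? phrase.toLowerCase() : phrase;
  let matches = 0;
  let pos = haystack.indexOf(needle);
  while (needle && pos !== -1) {
    matches++;
    pos = haystack.indexOf(needle, pos + needle.length); // non-overlapping
  }
  return {
    words: words.length,
    matches,
    density: words.length ? (matches / words.length) * 100 : 0,
  };
}
```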
Google does not publish a recommended keyword density percentage, and its quality guidelines have long warned against keyword stuffing—filling pages with lists of terms or repeating words unnaturally. Modern ranking systems lean on natural language processing, embeddings, and signals of satisfaction rather than crude term frequency alone. TF-IDF (term frequency–inverse document frequency) weights how important a word is to one document relative to a corpus; it is richer than raw density but still only one lens. SEO workflows therefore combine simple density checks with topical coverage, internal links, structured data where appropriate, and performance metrics from Search Console.
Semantic relevance—whether the page comprehensively answers the query—matters more than repeating one exact string. “LSI keywords” is a legacy marketing phrase often used to mean related terms, entities, and subtopics you should include; search engineers do not treat “LSI” as a special dial inside Google’s public documentation, but the underlying idea—write with vocabulary that reflects real expertise—remains sound. Use this calculator to catch extremes and patterns; use research, outlines, and subject-matter review to earn trust.
Use cases
Blog editors optimizing pillar posts for a primary phrase use density alongside editorial judgment: they confirm the head term appears where readers expect it—title-adjacent copy, H2s that mirror task language—without turning every paragraph into a repetitive template. SEO agencies auditing client drafts before handoff paste each major section separately when templates concatenate blocks with different authors, then compare density across sections for balance.
Content strategists studying competitor articles can paste winning URLs after stripping chrome to see which repeated phrases dominate the body. The insight is not “copy their percentage” but “understand which collocations and subtopics they reinforce,” then plan outlines that cover the same intent with fresher examples. Technical writers refining help-center pages watch 2-grams and 3-grams for accidental duplication of UI labels, which often stack up when multiple authors extend the same template.
E-commerce operators check category blurbs where programmatic SEO inserts attributes; N-grams sometimes reveal unintended repetition of “free shipping” or “best price” that reads as spam even when each insertion felt minor in isolation. Academic and nonprofit communicators use the tool sparingly—aware that scholarly tone may intentionally repeat precise terms—yet still benefit from spotting tripled phrases in executive summaries meant for the general public.
Finally, in-house teams performing periodic content refreshes export CSVs to attach to CMS tickets, documenting “before” density when they broaden vocabulary and add semantic variants. Connect that workflow with the Word Counter for length compliance, the Readability-oriented tools in SynthQuery when tone must stay accessible, and the Grammar Checker for final polish.
How SynthQuery compares
Many hosted utilities such as SEOReviewTools, SmallSEOTools, and similar free calculators offer a headline density percentage with ad-supported layouts and sometimes undisclosed input limits. They are fine for quick one-off checks when you trust the destination. SynthQuery’s Keyword Density Calculator is built for writers who already live in the SynthQuery ecosystem: it layers N-gram tables, optional stop-word filtering, live highlighting, CSV and text export, explicit character guidance, and direct paths to adjacent utilities without routing your prose through an opaque backend for the basic math.
Because analysis runs client-side in JavaScript, you can paste long articles within the published cap without a proprietary “daily query” gate tied to density alone—your limit is browser practicality, not an artificial word ceiling designed to upsell a basic counter. The comparison table below highlights typical differences; features vary by vendor over time, so verify specifics when you choose a primary workflow.
Aspect | SynthQuery | Typical alternatives
N-gram depth | Top 20 each for 1-, 2-, and 3-word phrases with counts and density columns. | SEOReviewTools and SmallSEOTools often emphasize the single headline term; multi-word phrase tables may be shallower or hidden behind extra steps.
Highlighting | Read-only preview marks every exact match of your target phrase in context. | Many calculators show numbers only, so you reopen the doc manually to find clusters.
Stop words & case | Toggle stop-word removal for N-grams; toggle case-insensitive matching for the target phrase. | Some tools fix case behavior silently; stop-word handling may be absent or fixed on.
Export & privacy | CSV and plain-text summary downloads; core analysis stays in your browser tab. | Hosted calculators may transmit text to servers; export is not always offered on free tiers.
Workflow context | Same account ecosystem as Word Counter, Meta Checker, Grammar Checker, and full AI tools. | Standalone sites require extra tabs and separate logins for advanced features.
How to use this tool effectively
Start by pasting only the body copy you want to measure—article paragraphs, landing-page sections, or newsletter HTML stripped to plain text. Exclude global navigation, cookie banners, and legal footers unless those strings are genuinely part of the document you are optimizing; including boilerplate dilutes density for the narrative you care about and can skew N-grams toward repeated chrome words.
Enter your target keyword or phrase in the dedicated field. Single-word targets are fastest to interpret; multi-word phrases match exactly as typed (non-overlapping), so pick the precise string your brief specifies—often a head term for reporting, not every synonym you also want to cover in prose. Toggle case-insensitive matching if capitalization varies across the draft (product names, acronyms, sentence starts). Watch the live totals: total words, exact-match count, and density percentage update as you type.
Open the highlighted preview to see every occurrence of your phrase emphasized with a background mark. Skim for clusters: several hits in one paragraph often read worse than the same count spread across sections. Scroll the N-gram tables to compare your target phrase against the top repeating words and phrases the tool extracted. When “Exclude stop words” is enabled, the tables emphasize content-bearing tokens and collocations; turn it off when you want the raw distribution including function words—sometimes useful when auditing templated pages where “the” or “your” repeats abnormally.
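One plausible implementation of that preview step—an assumption-laden sketch, not the tool's actual code—scans the lowercased text for non-overlapping matches and wraps each original-cased hit in a mark element after HTML-escaping:

```javascript
// Escape HTML so pasted text cannot inject markup into the preview.
function escapeHtml(s) {
  return s.replace(/[&<>"']/g, (c) => ({
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;",
  }[c]));
}

// Wrap each non-overlapping, case-insensitive match in <mark>,
// preserving the original casing of the matched text.
function highlight(text, phrase) {
  if (!phrase) return escapeHtml(text);
  const lower = text.toLowerCase();
  const needle = phrase.toLowerCase();
  let out = "", pos = 0, hit;
  while ((hit = lower.indexOf(needle, pos)) !== -1) {
    out += escapeHtml(text.slice(pos, hit));
    out += "<mark>" + escapeHtml(text.slice(hit, hit + needle.length)) + "</mark>";
    pos = hit + needle.length;
  }
  return out + escapeHtml(text.slice(pos));
}
```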
Interpret the density band message alongside your own read-aloud test. If density is very low but the topic is strategically important, consider adding the phrase where it clarifies meaning—headings, the introduction, or a summary sentence—rather than forcing it into every paragraph. If density is high, remove redundant mentions, replace some with pronouns or accepted synonyms, and merge choppy sentences that repeat the same construction. Export a CSV when you need a spreadsheet trail for clients, or a plain-text summary for tickets. When metadata is part of the same task, jump to the Meta Checker and SERP Preview so titles and snippets align with the body terminology you just validated.
Limitations and best practices
Tokenization is English-oriented and tuned for Latin-script marketing copy; code snippets, URLs, hashtags, and non-Latin scripts may produce N-grams that require manual judgment. Exact-match density ignores stemming and synonyms—“run,” “running,” and “runner” are different targets—so brief your phrase carefully when you track variants. The green/yellow/red bands are heuristic editorial cues only. For canonical product direction and crawl strategy, continue to https://synthquery.com/tools and pair content work with analytics—not a single on-page percentage.
Full catalog of AI detection, humanization, plagiarism scanning, and premium workflows at synthquery.com/tools.
Frequently asked questions
What is keyword density?
Keyword density is a simple ratio: how often a chosen keyword or phrase appears compared to the total number of words in a text, almost always expressed as a percentage. For example, if a five-hundred-word article contains ten instances of the exact phrase “email marketing,” the density is two percent. It is a descriptive statistic about your draft, not a direct measurement of Google’s internal scoring. Useful workflows treat density as a sanity check alongside outline quality, citations, internal links, and whether the page actually answers the reader’s question.
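The worked example above, as arithmetic:

```javascript
// Ten exact matches in a 500-word article → 2% density.
const matches = 10;
const totalWords = 500;
const density = (matches / totalWords) * 100;
console.log(density + "%"); // "2%"
```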
Is there an ideal keyword density percentage?
There is no universally correct ideal percentage. Google does not publish a target density, and rankings depend on competition, backlinks, site quality, intent match, and hundreds of other signals. Practitioners sometimes cite informal bands—roughly one to two percent for visible focus, caution above three percent for potential stuffing—but those rules of thumb come from editorial experience, not an official formula. A how-to guide that uses a head term naturally in the title, intro, and a few strategic headings may land lower than one percent yet rank well because subtopics and entities are strong. Conversely, a page can exceed three percent and still be fine if the phrase is part of unavoidable product nomenclature—though it should still read well aloud. Use this tool to detect extremes, not to chase a magic number.
Does keyword density still matter for SEO?
Exact-match density matters far less than it did in early search engine eras, when crude algorithms rewarded repetition. Today, search systems emphasize semantic relevance, helpfulness, experience signals, and how well the content satisfies intent. That said, terminology still matters: if a page never mentions its topic in any recognizable form, readers may find it vague, and search engines may struggle to align the page with queries that use concrete language. The modern approach is balanced—use clear vocabulary, cover related subtopics, avoid stuffing, and measure outcomes with analytics rather than optimizing a single percentage in isolation.
What is keyword stuffing?
Keyword stuffing is the practice of loading a page with keywords or numbers in an attempt to manipulate rankings, often producing awkward lists, blocks of cities or phone numbers, or sentences that repeat the same phrase without adding information. Google’s spam policies explicitly call out this behavior, and it harms user trust even when algorithms miss it temporarily. Warning signs include copy that sounds robotic aloud, unnatural mid-sentence insertions, and footer blocks that repeat queries verbatim. If your density calculator flags a high percentage, read the highlighted preview: when clusters feel forced, rewrite for clarity before you worry about the exact percentage.
How do keyword density calculators count words and matches?
Most calculators, including this one, divide the number of non-overlapping exact matches of your target string by the total word count of the pasted text, then multiply by one hundred. “Word” typically means tokens separated by whitespace after you paste plain text. Punctuation attached to words usually stays attached to that token for counting purposes. Variations exist: some tools stem words, ignore stop words in the denominator, or count only body HTML nodes. Always document which definition you used when reporting to clients so numbers stay comparable across revisions.
What is an N-gram?
An N-gram is a contiguous sequence of N tokens. A 1-gram is a single word, a 2-gram is a two-word phrase (“content strategy”), and a 3-gram is a three-word phrase (“enterprise content strategy”). By ranking frequent N-grams, you see which phrases dominate a draft beyond your chosen target—useful for spotting accidental repetition, templated boilerplate, or missing variety. Researchers and NLP pipelines use the same concept at larger scale; here it is simplified for editorial review. Toggle stop-word filtering when you want tables to emphasize content words rather than sequences like “in the.”
Should I use only exact-match keywords?
Healthy pages usually mix exact phrases, pronouns, synonyms, and related entities so writing sounds natural while still being clear about the topic. Exact-match density is easy to measure—hence this tool—but it should not be your only lens. If briefs require a trademarked product string, track that exact form. For broader topics, plan a vocabulary list (problems, outcomes, components) and use the N-gram tables to ensure you are not accidentally overusing one awkward construction. Search engines map many variants to similar intents; readers reward clarity and specificity more than mechanical repetition.
What is TF-IDF, and how is it different from keyword density?
TF-IDF stands for term frequency–inverse document frequency. It compares how often a term appears in one document with how common that term is across a larger corpus, highlighting words that are distinctive to the page. Raw keyword density only looks inside one document and does not ask whether a word is rare or ubiquitous everywhere. TF-IDF is therefore more informative for tasks like building topic models or comparing pages within a site, but it requires a corpus baseline—something a quick browser calculator usually does not provide. Use simple density for fast drafts; graduate to corpus-aware methods when you run serious content engineering programs.
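As a toy illustration, here is one common TF-IDF variant over pre-tokenized documents. Real libraries differ in normalization and smoothing choices, so treat this as a sketch rather than a canonical formula:

```javascript
// One common TF-IDF variant: tf = term count / document length;
// idf = ln(corpus size / (1 + documents containing the term)).
// The +1 avoids division by zero; with this smoothing, a term that
// appears in every document scores at or below zero.
function tfIdf(term, doc, corpus) {
  const tf = doc.filter((t) => t === term).length / doc.length;
  const docsWithTerm = corpus.filter((d) => d.includes(term)).length;
  const idf = Math.log(corpus.length / (1 + docsWithTerm));
  return tf * idf;
}
```

The point of the comparison: a term distinctive to one page scores higher than a term that is equally frequent on the page but ubiquitous across the corpus—something raw density cannot tell you.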
How do I reduce keyword density without weakening the content?
Start by deleting true duplicates: if two consecutive sentences make the same point with the same phrase, merge them. Replace some mentions with pronouns when the antecedent is obvious. Swap in precise synonyms or narrower hyponyms where they remain accurate—say “the campaign” instead of repeating a five-word product name in every sentence. Reorganize headings so one strong occurrence sits in the visible structure instead of three weaker ones in body text. Add concrete examples, data, or steps that carry meaning without repeating the head term. Re-run the calculator after each pass; density should fall while topical depth rises.
What are LSI keywords?
“LSI keywords” is marketing shorthand—often traced to latent semantic indexing, an older information-retrieval idea—for related words and phrases that help search engines understand context. In practice, teams compile subtopics, questions, entities, and synonyms that belong in a thorough article: if the head term is “project management,” related coverage might include milestones, stakeholders, Gantt charts, and risk logs depending on intent. Google’s public documents do not ask you to implement “LSI” as a checklist, but they do reward content that demonstrates experience and depth. Use this density tool to avoid overusing one phrase, then research real user questions—People Also Ask, support logs, interviews—to expand vocabulary responsibly.