Sentence Length Variance: The One Pattern That Separates Human Writing From AI

GPT-4o clusters 85% of sentences in the 15–28 word range. Human writing spans 4–55+ words with no central cluster. This difference is the core signal in AI detection.

Stop. Before you read any further, count the words in the last three sentences of something you wrote recently. Write the numbers down.

Now ask: are they close to each other? Within ten words of each other?

If yes, you might have a problem. And it doesn't matter whether you used AI.

What Burstiness Actually Means

"Burstiness" is a term from information theory that made its way into NLP research. Originally it described the tendency of certain events to cluster in time rather than distribute evenly. Scientists applied it to word and sentence patterns in text: bursty writing has high variation, with long gaps and sudden clusters. Low-burstiness writing is more uniform.

In practical terms: burstiness measures how much your sentence lengths vary from each other.

High burstiness = one sentence is 7 words, the next is 41, the next is 4. Your sentences are all over the map in the best possible way.

Low burstiness = one sentence is 19 words, the next is 22, the next is 17. Everything clusters in a narrow band.

Human writing is bursty. AI writing is not. This is the single most reliable technical differentiator in detection research, and understanding it gives you real leverage for editing AI output.
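
One rough way to put a number on this is the coefficient of variation: the standard deviation of sentence lengths divided by their mean. The sketch below is an illustration under that assumption, not any detector's actual formula, and the function names and the naive sentence splitter are hypothetical.

```python
import re
from statistics import mean, pstdev


def sentence_lengths(text):
    """Split on ., ! or ? followed by whitespace, then count words per sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [len(s.split()) for s in sentences if s.strip()]


def burstiness(lengths):
    """Coefficient of variation: population standard deviation over the mean."""
    if not lengths:
        raise ValueError("need at least one sentence length")
    return pstdev(lengths) / mean(lengths)


# The two three-sentence examples from the text above.
print(burstiness([7, 41, 4]))    # high variation relative to the mean
print(burstiness([19, 22, 17]))  # low variation relative to the mean
```

Higher values mean more variation relative to the typical sentence length. The splitter treats any period, question mark, or exclamation point followed by whitespace as a sentence end, so abbreviations such as "e.g." will be miscounted.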

The Numbers on ChatGPT vs Human Writing

We've measured this directly, and the gap is stark.

GPT-4o, writing analytical or explanatory content, produces sentences that cluster heavily in the 15–28 word range. Approximately 85% of sentences fall within that window. The distribution looks like a spike on a graph — a narrow mountain around the 20-word mark, with thin tails on either side.

Human writers, working in the same genres, produce a completely different distribution. The range typically spans from 4 words to 55+ words with no strong central cluster. The graph looks more like a broad plateau, or something genuinely irregular, with spikes scattered across a wide range. You can still compute an average sentence length, but it describes very few of the sentences a human actually writes.

This isn't a subtle difference. A researcher looking at sentence length histograms from human vs. AI writing can distinguish them at a glance.
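
To see the two shapes for yourself, you can compute the share of sentences inside the 15–28 word band and print a crude text histogram. This is a minimal sketch: the function names are hypothetical and the two lists are illustrative word counts, not measured data.

```python
from collections import Counter


def share_in_band(lengths, low=15, high=28):
    """Fraction of sentences whose word count falls in [low, high]."""
    if not lengths:
        return 0.0
    return sum(low <= n <= high for n in lengths) / len(lengths)


def length_histogram(lengths, bin_width=5):
    """Map each bin's lower edge to the number of sentences in that bin."""
    counts = Counter((n // bin_width) * bin_width for n in lengths)
    return dict(sorted(counts.items()))


# Illustrative word counts only (assumed, not measured).
uniform = [19, 22, 17, 24, 20, 18, 25, 21, 19, 23]
varied = [7, 31, 4, 28, 12, 47, 9, 3, 35, 16]

for name, lengths in (("uniform", uniform), ("varied", varied)):
    print(name, "share in 15-28 band:", share_in_band(lengths))
    for edge, count in length_histogram(lengths).items():
        print(f"  {edge:>2}+ {'#' * count}")
```

The uniform list piles into two or three adjacent bins, while the varied list leaves a thin scatter across many.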

Why AI Produces Low Burstiness

The explanation is in how language models work.

A transformer model generates text one token at a time, drawing each new token from a probability distribution conditioned on everything that came before. The "most likely" continuation at any given point is the one that most closely matches patterns in the training data.

What does the training data look like? Overwhelmingly, competent professional writing: articles, reports, documentation, essays. That writing tends toward medium-length sentences. Very short sentences (fragments, one-word thoughts, three-word punches) appear in the training data but in smaller proportions. Very long sentences (trailing, looping, exploratory 60-word constructions) also appear, but less frequently.

The model has internalized the average of human writing. And average human writing has moderate sentence length. So the model generates sentence after sentence that lands in the moderate range — not because it's incapable of variation, but because variation away from the center is statistically less probable at every token prediction step.

There's a deeper issue too. Human sentence length variation isn't random. It's purposeful. Writers make short sentences at moments of emphasis. They make long sentences when pursuing a complicated thought, when building up a sequence of qualifications, when mimicking the breathlessness of anxiety or excitement. The variance is tied to the meaning and emotional register of the writing.

AI doesn't have meaning and emotional register in that sense. It has no reason to slow down or speed up. So it doesn't.

The Diagnostic Test

Here's the simplest version. Take any piece of writing you want to check, whether yours or AI-generated.

Count the words in 10 consecutive sentences and write them in a list. Two such lists might look like this:

19, 22, 17, 24, 20, 18, 25, 21, 19, 23

vs.

7, 31, 4, 28, 12, 47, 9, 3, 35, 16

The first is AI rhythm. The second is human.

Calculate the range: the word count of your longest sentence minus that of your shortest. In the AI example above, the range is 8 (25 − 17). In the human example, it's 44 (47 − 3).

For any 10-sentence sample:

  • Range under 15: strong AI signal
  • Range 15–30: possible AI signal, depending on other factors
  • Range over 30: consistent with human writing

This isn't a perfect test — it can be fooled — but it's accurate enough to be a useful first filter. Run your own writing through it. You might be surprised.
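
The range test is simple enough to script. The sketch below applies the three thresholds listed above to a list of word counts; the function names are hypothetical, and the labels are the rule-of-thumb categories from this article, not a detector verdict.

```python
def length_range(lengths):
    """Longest sentence minus shortest sentence, in words."""
    return max(lengths) - min(lengths)


def classify_range(lengths):
    """Apply the 10-sentence rule of thumb: under 15, 15 to 30, over 30."""
    spread = length_range(lengths)
    if spread < 15:
        return "strong AI signal"
    if spread <= 30:
        return "possible AI signal"
    return "consistent with human writing"


# The two 10-sentence samples from the text above.
ai_sample = [19, 22, 17, 24, 20, 18, 25, 21, 19, 23]
human_sample = [7, 31, 4, 28, 12, 47, 9, 3, 35, 16]

print(length_range(ai_sample), classify_range(ai_sample))
print(length_range(human_sample), classify_range(human_sample))
```

A single very long or very short sentence is enough to push the range over a threshold, which is one of the ways the test can be fooled.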

Why Humans Write With High Burstiness Naturally

Think about the last time you wrote something genuinely important — a difficult email, a personal essay, an argument you cared about making.

The short sentences are the moments of certainty. "This is wrong." "We tried everything." "It didn't work." Those short statements carry emotional weight precisely because they're short — they stop the flow, they make you land.

The long sentences are the moments of working something out — following a thread, building a case, capturing an experience that doesn't fit in a simple statement and requires subordinate clauses and qualifications and sometimes even a change of direction mid-sentence before you find the ending you were looking for.

Sentence length is emotional punctuation. It signals to the reader: slow down here. This matters. Or: speed up, don't dwell.

AI can't signal emotional registers it doesn't have. The sentences come out even because there's nothing uneven driving them.

How to Add Burstiness Deliberately

This is the part most guides skip. They tell you to "vary your sentence length" without telling you how.

Here's a practical system:

The forced break. Find the three longest sentences in a passage. For each one, identify the comma closest to the midpoint and replace it with a period, capitalizing the next word and adjusting the wording if the second half can't stand alone. You've just created two sentences, and they don't need to be the same length. Often the second will be much shorter, which is exactly what you want.

The forced extension. Find three short sentences in a row. Take the middle one and extend it — not by adding filler, but by actually adding more thought. "The tool works." becomes "The tool works, but only if you give it something to work with — and most people paste in text that's already been through three rounds of revision." Now you have a sentence of actual length.

The fragment experiment. Pick any medium-length sentence in an important position: end of a section, end of a paragraph, conclusion of an argument. Split off its second half and turn it into a fragment. "It doesn't solve the underlying problem, it just delays it." can become "It doesn't solve the underlying problem. Just delays it." Fragments are fast. Effective. Human.

The run-on allowance. Somewhere in every 500 words, allow yourself one sentence that runs genuinely long — 45 words or more. Not padded-long. Genuinely-following-a-complicated-thought-long. The kind of sentence that a good editor might flag but that a reader who's paying attention will follow and feel the weight of when it finally ends.
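
Two of the four techniques above are mechanical enough to sketch in code: finding the middle comma for the forced break, and checking whether a draft has gone too long without a run-on. Both helpers are hypothetical illustrations. The first measures the midpoint in characters, which is an assumption, and its output still needs a human read, since the second half may not stand alone as a sentence.

```python
def split_at_middle_comma(sentence):
    """Forced break: cut at the comma nearest the sentence's character midpoint."""
    commas = [i for i, ch in enumerate(sentence) if ch == ","]
    if not commas:
        return [sentence]
    middle = len(sentence) / 2
    cut = min(commas, key=lambda i: abs(i - middle))
    first = sentence[:cut].rstrip() + "."
    rest = sentence[cut + 1:].lstrip()
    if not rest:
        return [sentence]
    # Capitalize the new sentence; the wording may still need a manual edit.
    return [first, rest[0].upper() + rest[1:]]


def longest_stretch_without_long_sentence(lengths, min_words=45):
    """Run-on allowance: most words in a row with no sentence of min_words or more."""
    longest = current = 0
    for n in lengths:
        if n >= min_words:
            current = 0  # a genuinely long sentence resets the counter
        else:
            current += n
            longest = max(longest, current)
    return longest


print(split_at_middle_comma("We shipped the fix on Friday, and nobody noticed."))
print(longest_stretch_without_long_sentence([20] * 30))
```

If the second function returns 500 or more, the draft has a stretch of at least 500 words with no sentence of 45 words or longer. The comma finder does not know about commas inside numbers or quotations, so treat its suggestion as a starting point.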

What This Does to Your TextSight Score

Sentence length variance is one of the primary signals feeding TextSight's Humanization Score algorithm. It's weighted heavily because it's one of the most reliable differentiators between human and AI text — and because it's difficult to fake convincingly at scale.

When we've taken heavily AI-flagged text (scores in the 20–35 range) and applied burstiness editing — deliberately creating high sentence length variance using the techniques above — scores consistently move into the 55–75 range.

That's a large swing. But it makes sense: you're not just adding random variation. You're forcing the text to do what human writing does, which is to have reasons for its lengths.

If your text scores below 40 on TextSight, the AI Vocabulary Highlighter will show you specific phrases pulling the score down. But alongside that vocabulary work, run the sentence length audit. It's the structural edit that makes everything else stick.

One Common Mistake: Mechanical Variance

There's a trap in the burstiness advice that's worth calling out directly.

Some writers, after learning about sentence length variance, start writing in an obviously mechanical pattern: short sentence. Long sentence. Short sentence. Long sentence. The lengths vary, but the alternation is so regular it creates its own kind of rhythm — just a different one than AI's.

Real human burstiness isn't regular alternation. It's variation driven by meaning. Three long sentences in a row because you're following a complicated thread. Then one short one because you've reached the point. Then a medium one that starts the next thought. Then an unexpectedly long one because the next thought turned out to be more complicated than you expected.

The variation emerges from the writing, not from a rule about varying. If you're consciously counting words and alternating, readers will feel that. Write for meaning first. Let the sentence lengths follow from what you're actually saying and how much space it needs.
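
If you want a quick check for the mechanical pattern, one heuristic is to count how often the direction of change flips from one sentence to the next. This is an assumed measure for illustration only, with a hypothetical function name; it is not something the article's scoring relies on.

```python
def alternation_ratio(lengths):
    """Share of consecutive length changes that reverse direction (0.0 to 1.0)."""
    # Differences between neighbouring sentences, ignoring equal lengths.
    diffs = [b - a for a, b in zip(lengths, lengths[1:]) if b != a]
    if len(diffs) < 2:
        return 0.0
    flips = sum(1 for d1, d2 in zip(diffs, diffs[1:]) if d1 * d2 < 0)
    return flips / (len(diffs) - 1)


# Strict short, long, short, long alternation.
mechanical = [5, 30, 6, 28, 4, 33, 7, 29]
# Three long sentences, one short, one medium, one long.
meaning_driven = [32, 38, 35, 5, 17, 41]

print(alternation_ratio(mechanical))
print(alternation_ratio(meaning_driven))
```

A ratio close to 1.0 means nearly every sentence reverses the direction of the one before it, which is the short-long-short-long pattern described above. Lower values do not prove the variation is meaningful; they only show it is not a strict zigzag.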

The Burstiness Score in TextSight

TextSight's Humanization Score incorporates burstiness as one of its primary signals — weighted more heavily than vocabulary choices and more heavily than paragraph structure, because burstiness is one of the hardest AI patterns to fake convincingly at scale.

When the AI Vocabulary Highlighter surfaces flagged phrases, those are the visible layer. The underlying burstiness analysis is running simultaneously. You can have vocabulary that passes — no flagged phrases, no AI-specific vocabulary — and still score in the 30–40 range because the sentence length distribution is too uniform.

This is important for anyone who thinks they can clear a detector by running text through a paraphraser. Paraphrasing tools swap vocabulary and sentence structure, but they don't fundamentally change sentence length patterns. A paraphrased AI sentence that was 20 words long usually comes out as a paraphrased sentence that's 18–22 words long. The variance doesn't increase. The burstiness doesn't improve.

The only way to genuinely improve burstiness is to rewrite — to make actual decisions about where sentences should be short and where they should run long. Automated paraphrasers can't make those decisions because those decisions require understanding the rhetorical purpose of each sentence.

Paste your writing into TextSight and look at the overall score before and after your burstiness edits. The numbers will tell you whether you're actually creating variance or just shuffling words.

The Broader Lesson

Low burstiness is AI writing's most fundamental structural signature. Every other tell — the em-dash overuse, the passive voice hedging, the predictable paragraph structure — exists within a frame of consistent sentence length. Fix the frame and you fix a lot of other problems at once.

This is the foundational pattern. Other detection signals are symptoms. This is the condition.

Your sentences shouldn't all be the same length. They should be as long as the thought requires — no more, no less. And since your thoughts aren't all the same size, your sentences shouldn't be either.


Dipak Bhosale

Founder & CEO · TextSight

Writing about AI detection, humanization, and the strange new craft of writing in 2026. Operates Lacewing Technologies from Maharashtra, India.
