How Transformers Learned to Write Like Every LinkedIn Post
Everyone's noticing the same tells. The real question is whether it matters.
There’s a running joke on social media right now: if you see an em dash, it was written by ChatGPT.
It’s funny because it’s mostly true.
The em dash, that long horizontal line that replaces commas, parentheses, or colons, has become the unofficial watermark of AI-generated text. But it’s not alone. There’s a whole set of tells that have become so common, so consistent across every platform and every model, that readers can now spot machine-written prose the way a sommelier spots a corked bottle. The patterns are unmistakable once you know what to look for.
And that last sentence? That’s one of them.
The tells
Start with the obvious. Em dashes everywhere. AI models love them because they’re grammatically versatile, a single punctuation mark that can substitute for several others. In training data, em dashes appear in polished writing. The model learns that polished writing uses em dashes. It then uses them constantly, often where a comma or a full stop would serve better. That’s the irony, by the way, of writing this piece in the first place. I had to actively strip the em dashes out of my own draft as I went. The pull is real.
Then there’s the three-point structure. AI-generated copy almost invariably arranges ideas in neat triads. “It’s fast, reliable, and scalable.” “We need courage, clarity, and conviction.” Three feels complete to a language model. It’s the smallest number that signals a pattern without becoming a list. Human writers use threes too, but they’re messier about it. They’ll throw in a fourth point, or break the rhythm deliberately. AI doesn’t break rhythm. It maintains it.
Arrow notation has crept in from technical writing: “lead → MQL → opportunity → close.” It looks clean. It feels structured. And it’s spreading across LinkedIn posts, marketing emails, and startup blogs like a rash. The arrow signals efficiency; it implies a process so well understood that you don’t need words to connect the steps. AI models have absorbed this from SaaS documentation and now deploy it everywhere, whether the context warrants it or not.
Other tells: bullet points that start with bolded phrases followed by a colon. Headlines that pose a question and then answer it. The phrase “It’s not just X, it’s Y.” The word “landscape” used to describe any competitive environment. “In today’s fast-paced...” as an opener. The list goes on, and anyone who reads a reasonable amount of AI-generated content has their own private collection of tells.
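None of these tells is proof on its own, but they’re regular enough that even a crude script can flag them. Here’s a toy sketch in Python; the patterns and the sample text are my own guesses at the heuristics, not any published detector:

```python
import re

# Crude, illustrative heuristics for common AI tells. These patterns are
# my own guesses for demonstration purposes, not a published detector.
TELLS = {
    "em dash": re.compile("\u2014"),                      # the em dash character
    "triad": re.compile(r"\b\w+, \w+, and \w+\b"),        # "fast, reliable, and scalable"
    "arrow chain": re.compile("\\S+\\s*\u2192\\s*\\S+"),  # "lead -> MQL -> close" (U+2192)
    "not just X, it's Y": re.compile(r"not just .{1,60}?, it'?s\b", re.IGNORECASE),
    "fast-paced opener": re.compile(r"in today'?s fast-paced", re.IGNORECASE),
    "landscape": re.compile(r"\blandscape\b", re.IGNORECASE),
}

def tell_report(text: str) -> dict[str, int]:
    """Count non-overlapping occurrences of each heuristic tell."""
    return {name: len(pattern.findall(text)) for name, pattern in TELLS.items()}

sample = ("In today's fast-paced landscape, it's not just a tool, it's a partner "
          "\u2014 fast, reliable, and scalable. lead \u2192 MQL \u2192 close.")
print(tell_report(sample))  # every heuristic fires exactly once on this sample
```

It will misfire constantly, which is rather the point: the tells are probabilistic evidence, not proof.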
Why these patterns exist
The tells aren’t random quirks. They’re statistical artifacts of how large language models work.
A language model predicts the next token based on probability. It doesn’t “choose” to use an em dash the way a writer decides a comma won’t do. It lands on the em dash because, given the preceding tokens, that punctuation mark is the likeliest continuation under the distribution it learned from training data. Em dashes correlate with certain syntactic structures (parenthetical asides, appositives, dramatic pauses), and those structures correlate with formal, edited prose. The model has learned the association. It hasn’t learned when to break it.
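To make “predicts the next token” concrete, here’s a toy version of a single decoding step. The four-token vocabulary and the scores are invented for illustration; a real model ranks tens of thousands of tokens the same way:

```python
import math

# One toy decoding step. The vocabulary and the logits are invented for
# illustration; a real model scores tens of thousands of tokens this way.
vocab  = [",", ";", "\u2014", " and"]   # candidate next tokens (\u2014 is the em dash)
logits = [2.1, 0.3, 3.4, 1.2]           # raw scores the model assigns in this context

# Softmax turns raw scores into a probability distribution over the vocabulary.
exps  = [math.exp(s) for s in logits]
total = sum(exps)
probs = [e / total for e in exps]

for token, p in sorted(zip(vocab, probs), key=lambda pair: -pair[1]):
    print(f"{token!r}: {p:.2f}")

# The em dash wins (~0.70 here) not because it is "right" but because it
# scored highest in this context. Greedy decoding simply takes that argmax.
```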
The three-point structure is even more mechanical. Transformers are attention machines. They look for patterns, then reproduce them. Three-part lists are wildly overrepresented in training data. Rhetoric, advertising, religious texts, political speechmaking. “Life, liberty, and the pursuit of happiness.” “Of the people, by the people, for the people.” The model has absorbed an enormous corpus of tricolons, and it regurgitates the form at every opportunity because the statistical signal is overwhelming.
Symmetrical completions are another artifact. When a model generates a clause like “It’s not just X, it’s Y,” it’s completing a pattern it has seen thousands of times. The structure is self-reinforcing. The opening half (“It’s not just...”) creates a strong expectation for the second half (“...it’s Y”). The model obliges. A human writer might derail the expectation for effect. The model almost never does, because derailing would be a lower-probability token sequence.
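You don’t have to take this on faith. Any open causal language model will show the gap if you score the two continuations, as in this sketch using Hugging Face’s transformers library with GPT-2 (the example phrasings are mine, and any small model should show a similar pattern):

```python
# pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def avg_logprob(text: str) -> float:
    """Average per-token log-probability the model assigns to the text."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # loss is mean negative log-likelihood
    return -out.loss.item()

# The symmetrical completion vs. a deliberate derailment (phrasings are mine).
print(avg_logprob("It's not just a product, it's a platform."))
print(avg_logprob("It's not just a product, and honestly that's where the pitch falls apart."))
# Expect the symmetrical completion to score noticeably higher per token.
```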
The laziness problem
The tells aren’t really the problem.
Language has always had markers of origin. Press releases have a style. Academic papers have a style. Legal documents have a style. Nobody accuses a contract of being “inauthentic” because it uses “hereinafter” and “notwithstanding.” We recognize those patterns as genre conventions and move on.
AI-generated text is developing its own genre conventions. The em dashes, the triads, the arrow notation. They’re the boilerplate of machine prose. They’ll probably stabilize into something we accept the way we accept the inverted pyramid in journalism or the five-paragraph essay in academia.
The real problem is that the tells are evidence of something deeper: a failure of effort on the part of the person using the tool.
When someone prompts an LLM with “Write me a blog post about the future of remote work” and publishes whatever comes back, the output will be thick with tells because the prompt gave the model no reason to deviate from its default distribution. It’s the most probable text given a generic request. Generic input, generic output.
But when someone uses the same model with a specific voice prompt, a detailed brief, examples of the style they want, and then edits the output, the tells diminish. Not because the model has become more creative, but because the human has constrained the probability space. They’ve given the model something other than the median to aim at.
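Here’s the same contrast as code, sketched with the OpenAI Python client. The model name, the voice rules, and the brief are placeholders, not a recipe:

```python
# pip install openai  (assumes OPENAI_API_KEY is set in the environment)
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder; any chat model shows the contrast

# Generic input: the model samples from its default distribution.
lazy = client.chat.completions.create(
    model=MODEL,
    messages=[{
        "role": "user",
        "content": "Write me a blog post about the future of remote work.",
    }],
)

# Constrained input: voice, structure, and negative rules narrow the
# probability space the model samples from.
constrained = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": (
            "Write in first person, plain declarative sentences, dry humor. "
            "No em dashes, no three-part lists, no 'it's not just X, it's Y'. "
            "Open with a concrete anecdote, not a thesis statement."
        )},
        {"role": "user", "content": (
            "Argue that hybrid schedules quietly shift coordination costs onto "
            "junior employees. 600 words. Two paragraphs in my voice: ..."
        )},
    ],
)

print(lazy.choices[0].message.content[:200])
print(constrained.choices[0].message.content[:200])
```

The first draft will read like the median of the training data. The second still needs editing, but it starts somewhere specific.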
The tells are a signal of lazy prompting and absent editing. They’re the visible symptom of treating a language model as a content vending machine rather than a drafting tool.
Does it matter?
This is the question worth sitting with.
On one hand. If the content is accurate, useful, and well-structured, does it matter that it was generated by a model? Readers are increasingly comfortable with AI-assisted writing. The stigma is fading. Most people care about whether a piece answers their question, not whether a human typed every word. The output works. It gets the job done. Why should anyone care about the mechanism that produced it?
On the other hand. The uniformity is real. When every blog post, LinkedIn update, and marketing email sounds the same (same rhythms, same punctuation, same structural tics), something is lost. Writing becomes wallpaper. Background noise. The kind of content your eyes slide over without registering. The distinctiveness that makes a voice worth listening to gets smoothed away by statistical averaging.
I think the em-dash epidemic isn’t a crisis of authenticity. It’s a crisis of distinctiveness. The markers of AI writing are a reminder that most people are using these tools to produce the average of everything ever written, and the average of everything is, by definition, unremarkable.
The fix isn’t to ban AI from the writing process. It’s to bring something to the process that the model can’t generate on its own: a point of view, a specific argument, a willingness to break the pattern. Give the machine something to work with, then edit what comes back. Push it off the median. Make it yours.
The em dash isn’t the enemy. Laziness is.


