The Medium-Tier Phrase Blocklist

#What this Playbook gives you

A three-tier phrase blocklist that catches predictable LLM drift patterns before they reach the Sonnet 4.6 verifier. The blocklist is the cheap defense; the verifier is the expensive one. Together they're better than either alone.

perea runs the Medium tier in production. Strict and Loose are documented as the tiers above and below — pick the tier that matches your risk tolerance and audience expectations.

#The three tiers

Strict. Blocks urgency manipulation, forward-looking predictions, engagement-bait, and common AI-tells. Suited to highly-trusted brands where any sign of LLM authorship hurts. Phrases like "delve," "in this article," "it's worth noting," and AI-tic transitions are blocklisted alongside the substantive drift patterns.

Medium (perea's tier). Blocks the three substantive drift categories: urgency manipulation, forward-looking confidence, and engagement-bait. Does not block AI-tells — perea is comfortable with the model's natural voice as long as the content stays honest.

Loose. Blocks only engagement-bait. Suited to channels where forward-looking framing is part of the brand voice (e.g., a prediction-market account where forward-looking is the product).
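
One way to encode the tiers is as a mapping from tier name to the categories it blocks. This is a sketch only: the category tags (`urgency`, `forward_looking`, `engagement_bait`, `ai_tell`) and the function name are illustrative assumptions, not identifiers taken from perea's skill.

```python
# Hypothetical category tags; the tier-to-category mapping follows the text above.
TIER_CATEGORIES = {
    "strict": frozenset({"urgency", "forward_looking", "engagement_bait", "ai_tell"}),
    "medium": frozenset({"urgency", "forward_looking", "engagement_bait"}),
    "loose": frozenset({"engagement_bait"}),
}


def active_categories(tier: str) -> frozenset:
    """Return the categories a tier blocks (tier name is case-insensitive)."""
    try:
        return TIER_CATEGORIES[tier.lower()]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
```

Each tier is a strict subset of the one above it, so moving up a tier only ever adds categories.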

#What each category catches

#Urgency manipulation

Phrases that manufacture urgency the source paper doesn't actually contain:

"right now"
"today only"
"before it's too late"
"the window is closing"
"act fast"
"limited time"

The category is about false urgency. If a regulation has an actual effective date of 2026-12-31, naming the date in a post is fine — the deadline exists in the source. The blocklist catches the pattern where the LLM invents a deadline that isn't in the paper.
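
A minimal sketch of how the urgency phrases above might be matched. The regex and the function name `urgency_hits` are assumptions for illustration; perea's actual patterns are not reproduced in this playbook.

```python
import re

# The six stock phrases listed above, matched case-insensitively.
# ['’] accepts both the straight and the typographic apostrophe.
URGENCY = re.compile(
    r"\b(?:right now|today only|before it['’]s too late"
    r"|the window is closing|act fast|limited time)\b",
    re.IGNORECASE,
)


def urgency_hits(draft: str) -> list:
    """Return every urgency phrase found in the draft, in order of appearance."""
    return [m.group(0) for m in URGENCY.finditer(draft)]
```

The patterns target the stock phrases, not dates, so a sentence such as "The rule takes effect on 2026-12-31." passes this check.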

#Forward-looking confidence

Phrases that turn hedged claims into confident predictions:

"by 2027, X will ..."
"this will reshape ..."
"the future of X is ..."
"X is going to ..."

The Hybrid Contract already locks numerics and named entities. The forward-looking blocklist is the second line of defense: even if the LLM keeps numerics correct, it sometimes drops the hedge. "Could become" is the source; "will become" is the drift. The blocklist catches the substitution.
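
The four phrase shapes above could be approximated with patterns like the following. This is a hedged sketch: the exact regexes and the name `first_forward_looking` are assumptions, and patterns this broad would need tuning against real drafts.

```python
import re

# One illustrative pattern per phrase shape listed above.
FORWARD_LOOKING = [
    # "by 2027, X will ...": a year followed by "will" in the same sentence
    re.compile(r"\bby 20\d{2},?\s[^.!?]*\bwill\b", re.IGNORECASE),
    re.compile(r"\bthis will reshape\b", re.IGNORECASE),
    # "the future of X is ...": kept within one sentence
    re.compile(r"\bthe future of\b[^.!?]*\bis\b", re.IGNORECASE),
    re.compile(r"\bis going to\b", re.IGNORECASE),
]


def first_forward_looking(draft: str):
    """Return the first confident-prediction phrase found, or None."""
    for pattern in FORWARD_LOOKING:
        match = pattern.search(draft)
        if match:
            return match.group(0)
    return None
```

A hedged sentence such as "The method could become a standard." passes, while "The method is going to become a standard." is rejected.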

#Engagement-bait

Phrases that prioritize click/like/share over information:

"you won't believe ..."
"the one thing X doesn't want you to know"
"thread"
"🧵"
"subscribe for more"
"follow for daily takes"
"did this surprise you?"

perea's standing instruction for the close is: link to the source, nothing else. No CTA, no "thread," no engagement question. The blocklist enforces that.
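
A possible encoding of the engagement-bait phrases, offered as a sketch rather than perea's actual list. The "thread" entry is anchored to the start of a line so that ordinary uses of the word pass. Matching any occurrence of 🧵 is a simplifying assumption here. The function name is hypothetical.

```python
import re

ENGAGEMENT_BAIT = [
    re.compile(r"\byou won['’]t believe\b", re.IGNORECASE),
    re.compile(r"\bdoesn['’]t want you to know\b", re.IGNORECASE),
    # "thread:" only as a hook at the start of a line, not as a bare word
    re.compile(r"^\s*thread\s*:", re.IGNORECASE | re.MULTILINE),
    re.compile("🧵"),
    re.compile(r"\bsubscribe for more\b", re.IGNORECASE),
    re.compile(r"\bfollow for daily takes\b", re.IGNORECASE),
    re.compile(r"\bdid this surprise you\b", re.IGNORECASE),
]


def engagement_bait_hits(draft: str) -> list:
    """Return matched engagement-bait phrases, grouped in pattern order."""
    return [
        match.group(0).strip()
        for pattern in ENGAGEMENT_BAIT
        for match in pattern.finditer(draft)
    ]
```

With these patterns, "The thread of the argument holds." produces no hits, while a draft opening with "Thread:" is rejected.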

#Why the blocklist is the cheap defense

The Sonnet 4.6 verifier costs real inference per draft. The blocklist is a regex match — milliseconds. Catching a draft at the blocklist saves the verifier round-trip on that draft and gets the optimizer back into the retry loop sooner.
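
The ordering can be sketched as a gate that calls the verifier only when no blocklist pattern matches. Everything named here (`gate`, `Verdict`, the placeholder patterns, and the `verify` callable standing in for the Sonnet 4.6 verifier) is an assumption for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Placeholder patterns; a real list would be loaded from the blocklist file.
BLOCKLIST = [
    re.compile(r"\bact fast\b", re.IGNORECASE),
    re.compile(r"\bthis will reshape\b", re.IGNORECASE),
]


@dataclass
class Verdict:
    accepted: bool
    stage: str                    # which layer decided: "blocklist" or "verifier"
    detail: Optional[str] = None  # matched phrase on a blocklist rejection


def gate(draft: str, verify: Callable[[str], bool]) -> Verdict:
    """Run the cheap regex layer first; call the verifier only if it passes."""
    for pattern in BLOCKLIST:
        match = pattern.search(draft)
        if match:
            # Rejected cheaply: the verifier is never invoked for this draft.
            return Verdict(False, "blocklist", match.group(0))
    return Verdict(verify(draft), "verifier")
```

A blocklist rejection returns the matched phrase, which the caller can pass back to the optimizer for the retry.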

The blocklist is also cheap to operate:

Blocklist hits in production are concentrated in 2–3 patterns. Track the rate by pattern.
When a new drift pattern appears, the blocklist update is one line. The verifier update is a prompt change with all the testing that entails.
The blocklist is operator-editable in production. The verifier prompt is not.

That said, the blocklist is not a substitute for the verifier. The verifier catches semantic drift (a number that's wrong, an entity that's renamed); the blocklist catches lexical patterns (a specific phrase the LLM picked up from its training distribution). Both layers are needed.

#Production results

perea seeded the Medium-tier blocklist on day 1 of running tick-X-post-from-research. The seed list lives at state/blocklist.json in the skill's state directory and is editable by the operator.^[1]

Hit rate. In the first 14 days, the most-hit category was engagement-bait: the optimizer occasionally tries to add a 🧵 emoji or a "thread" hint at the start. The Medium tier rejects this on a regex match, the optimizer retries, and the second draft drops it.^[2]

Tier escalation policy. The operator may temporarily escalate to Strict (e.g., during a launch when brand voice matters most) by setting blocklist.json to the Strict tier. The escalation is reversible by editing the same file. No code change.

Recurring drift → contract update. When a phrase recurs across retries (not just one draft), the operator's standing instruction is to update the Hybrid Contract section, not just the blocklist. The contract is the primary defense; the blocklist is the secondary. A recurring pattern indicates the contract is silent on something it should speak to.^[3]

#Failure modes

Over-blocking the model's voice. If you start blocklisting common transition words ("interestingly," "notably," "for example"), the drafts become stilted. The Medium tier deliberately leaves these alone.
Under-specified patterns. A simple word match on "thread" will also reject the legitimate use of the word ("the thread of the argument"). perea uses contextual patterns — ^thread: at the start of a line, 🧵 as a standalone marker — not bare-word matches.
Tier drift. Over weeks, the blocklist accumulates patterns. Without occasional pruning, the list grows until the rejection rate hurts throughput. perea reviews the blocklist quarterly.
The retry loop loops. If the optimizer is blocked on the same phrase 3 times, the retry loop logs the slug to unpostable.json. The operator reviews. Usually the fix is a contract update; sometimes it is a blocklist correction.
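
The three-strike rule in the last item can be sketched as a loop that counts rejections per phrase. This is an assumption-laden illustration: `generate`, `find_hit`, and `max_attempts` are hypothetical names, and an in-memory list stands in for unpostable.json to keep the example self-contained.

```python
from collections import Counter
from typing import Callable, Optional

MAX_STRIKES = 3  # from the text: blocked on the same phrase 3 times


def run_retry_loop(
    slug: str,
    generate: Callable[[Optional[str]], str],   # stand-in for the optimizer
    find_hit: Callable[[str], Optional[str]],   # returns a matched phrase or None
    unpostable: list,                           # stand-in for unpostable.json
    max_attempts: int = 10,                     # assumed overall safety cap
) -> Optional[str]:
    """Return a draft that clears the blocklist, or None if the slug is parked."""
    strikes = Counter()
    feedback = None
    for _ in range(max_attempts):
        draft = generate(feedback)
        hit = find_hit(draft)
        if hit is None:
            return draft
        key = hit.lower()
        strikes[key] += 1
        if strikes[key] >= MAX_STRIKES:
            # Same phrase blocked three times: park it for operator review.
            unpostable.append({"slug": slug, "phrase": hit})
            return None
        feedback = hit  # tell the optimizer which phrase was rejected
    unpostable.append({"slug": slug, "phrase": None})
    return None
```

Recording the phrase alongside the slug gives the operator what is needed to decide between a contract update and a blocklist correction.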

#How to apply in your stack

Pick a tier based on brand. Most pipelines should start at Medium.
Seed the blocklist as a JSON file with one entry per pattern. Each entry has a regex, a one-line reason, and a category tag.
Run the blocklist before the verifier in your pipeline. If the blocklist rejects, surface the matched phrase + reason to the optimizer for the retry.
Log every hit. The hit log is your signal for whether a phrase belongs at a higher tier (rare hit → keep at Medium; constant hit → escalate or update the optimizer prompt).
Schedule a quarterly review. Prune unused patterns. Add patterns that recurred across drafts.
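
The seeding and logging steps above might look like the following sketch. The field names (`pattern`, `reason`, `category`), the three sample entries, and the helper names are assumptions; the real schema of state/blocklist.json is not shown in this playbook.

```python
import json
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical seed. In JSON, a regex backslash must be written as "\\".
SEED_JSON = r"""
[
  {"pattern": "(?i)\\bact fast\\b", "reason": "manufactured urgency", "category": "urgency"},
  {"pattern": "(?i)\\bthis will reshape\\b", "reason": "confident prediction", "category": "forward_looking"},
  {"pattern": "(?im)^thread\\s*:", "reason": "engagement hook, not information", "category": "engagement_bait"}
]
"""


@dataclass(frozen=True)
class Entry:
    regex: re.Pattern
    reason: str
    category: str


def load_entries(raw: str) -> list:
    """Parse the JSON seed and compile each pattern once."""
    return [
        Entry(re.compile(item["pattern"]), item["reason"], item["category"])
        for item in json.loads(raw)
    ]


def scan(draft: str, entries: list, hit_log: Counter) -> list:
    """Return (matched phrase, reason) pairs and count each hit by pattern."""
    hits = []
    for entry in entries:
        match = entry.regex.search(draft)
        if match:
            hit_log[entry.regex.pattern] += 1
            hits.append((match.group(0), entry.reason))
    return hits
```

Keying the hit log by pattern rather than by category keeps the quarterly review simple: patterns with a zero count are pruning candidates, and patterns with high counts are candidates for escalation or a prompt fix.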

The blocklist itself is small — perea's seed is ~30 patterns total across the three categories. Most of the value is in the framing: tier picks, hit-rate tracking, and the contract-update standing instruction.

#Quotable Findings

The blocklist is the cheap defense; the verifier is the expensive one — together they catch more than either alone.^[1]

The Medium tier blocks urgency manipulation, forward-looking confidence, and engagement-bait — three substantive drift categories — without blocking the model's natural voice.^[1]

When a phrase recurs across retries the operator updates the Hybrid Contract, not just the blocklist — the contract is the primary defense, the blocklist is the secondary.^[3]

The most-hit category in early production was engagement-bait — the optimizer occasionally adds a thread emoji or a "thread" hint at the start; the regex rejects on first pass.^[2]

A simple word match on "thread" rejects legitimate use ("the thread of the argument") — perea uses contextual patterns, not bare-word matches.^[1]

#References

[1] state/blocklist.json seed contents and tier framing in tick-X-post-from-research SKILL.md. Evidence: tier framing.
[2] Hit-rate distribution across categories in first 14 days of production. Evidence: hit rate profile.
[3] Operator standing instruction documented in the SKILL.md decisions reference. Evidence: contract-update standing instruction.