v1.3.0 · 3 loop iterations

AI Decay Audit

Prompts decay silently. A model update lands, your old prompts start producing longer, more hedged, slightly-off output — and you don't notice until something visibly breaks. The AI Decay Audit treats prompts like code that needs maintenance, then tells you which fixes will survive the next update and which won't.

What this skill does

Models get updated. Fine-tuning shifts, safety adjustments land, default verbosity creeps up — and prompts that produced perfect output on the last version quietly start producing different output on the new one. That's not a bug; it's the nature of evolving models. But it means prompt maintenance is real work, and most people only notice the decay once something visibly breaks downstream.

This skill is a diagnostician, not a rewriter. It runs three structured tests on each prompt — baseline reproduction, instruction sensitivity, and edge-case stability — then names which of the six common decay patterns is responsible: format drift, verbosity creep, tone shift, capability changes, safety boundary shifts, or instruction sensitivity changes. The diagnosis is specific. "Your output got worse" is not a diagnosis. "Your output gained 660 tokens of hedge language because the new model defaults to more cautious framing, and your prompt has no negative instruction blocking it" is.
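To make the baseline-reproduction idea concrete, here is a minimal sketch of the kind of before/after comparison it implies. This is not the skill's actual implementation: the function names, the hedge-phrase list, and the use of word counts as a stand-in for tokens (tokenizers are model-specific) are all assumptions made for illustration.

```python
import re

# Hypothetical hedge phrases for illustration; not the skill's real list.
HEDGE_PHRASES = ("it's worth noting", "it is worth noting", "keep in mind")


def output_profile(text: str) -> dict:
    """Rough shape of an output: word count, bullet lines, hedge phrases."""
    lowered = text.lower()
    return {
        # Word count is a crude proxy for tokens.
        "words": len(text.split()),
        # A bullet line starts with -, *, • or · followed by whitespace.
        "bullets": sum(
            1 for line in text.splitlines() if re.match(r"\s*[-*•·]\s+\S", line)
        ),
        "hedges": sum(lowered.count(phrase) for phrase in HEDGE_PHRASES),
    }


def drift(old: str, new: str) -> dict:
    """Per-metric change from the old output to the new one."""
    before, after = output_profile(old), output_profile(new)
    return {key: after[key] - before[key] for key in before}
```

Calling `drift(saved_output, fresh_output)` on a saved known-good output and a fresh run gives numbers you can put in a diagnosis, rather than "it got worse".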

The fix output is rated for durability. Few-shot examples and explicit format templates are Durable — they give the model concrete targets that survive most updates. Negative instructions and length workarounds are Fragile — they fight current model defaults that may shift again. You see which is which, so you can plan re-tests accordingly. For API users, the audit also calculates the cost impact of verbosity inflation in dollars, not just tokens.
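The dollar figure is simple arithmetic, sketched below. The function name, the call volume, and the price per million output tokens are hypothetical placeholders; substitute your provider's real rate. The token counts are borrowed from the example on this page.

```python
def verbosity_cost_impact(
    old_tokens: int,
    new_tokens: int,
    calls_per_month: int,
    usd_per_million_output_tokens: float,
) -> dict:
    """Estimate how much extra output spend verbosity inflation causes."""
    # How many times larger the new output is than the old one.
    ratio = new_tokens / old_tokens
    # Extra output tokens generated across a month of calls.
    extra_tokens = (new_tokens - old_tokens) * calls_per_month
    extra_usd = extra_tokens * usd_per_million_output_tokens / 1_000_000
    return {"ratio": ratio, "extra_usd_per_month": extra_usd}


# Assumed numbers: 10,000 calls per month at a made-up $10 per million output tokens.
impact = verbosity_cost_impact(120, 780, 10_000, 10.0)
print(impact)
```

With these assumed inputs the output grows 6.5x and the extra spend comes to $66 per month. Input-token costs are left out, since decay of this kind changes the output side.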

It deliberately separates decay from cross-model differences. "Works on ChatGPT but not Claude" isn't decay — it's adaptation, and the skill says so rather than running a useless audit. It also has two modes: reactive (something broke, run the full process) and proactive (nothing's broken yet, harden the prompts now). The skill asks which one you need before starting, instead of forcing the full audit on a prompt that's actually fine.

When this triggers

·A prompt that worked last month is now producing different or worse output
·Outputs feel longer, more cautious, or full of new disclaimers
·You want to test important prompts after a model version bump
·You're paying per-token and verbosity inflation is quietly doubling your bill
·You want to harden prompts proactively before they break

Example

Trigger

User pastes a prompt and says: 'this used to produce a clean 3-bullet summary, now it gives me 800 words with a disclaimer at the end.'

Output

Diagnosis: Verbosity Inflation + Caveat Accumulation.

Old output: ~120 tokens, 3 bullets, no hedging.
New output: ~780 tokens, prose + bullets + "it's worth noting" closer.
Cost impact: ~6.5x your previous per-call output spend.

Updated prompt (changes highlighted):
+ "Maximum 3 bullets. No preamble, no closing summary."
+ "Do not include caveats, disclaimers, or hedge language."
+ [few-shot example of the exact format you want]

Durability rating per fix:
· Few-shot example: Durable (survives updates)
· Negative instruction ("do NOT add caveats"): Fragile (works now, may need re-testing next version)

Re-test after: any major model version bump, or in 30 days.
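The re-test rule at the end of that output can be checked mechanically. The sketch below is one possible reading, not part of the skill: it treats any change in the model identifier as a version bump, and the function name and model strings are hypothetical.

```python
from datetime import date, timedelta


def retest_due(
    last_tested: date,
    today: date,
    tested_model: str,
    current_model: str,
    max_age_days: int = 30,
) -> bool:
    """True if the prompt should be re-tested under the 30-day / version-bump rule."""
    # Assumption: any differing model identifier counts as a version bump.
    model_changed = current_model != tested_model
    too_old = today - last_tested >= timedelta(days=max_age_days)
    return model_changed or too_old
```

Fragile fixes are the ones worth running through a check like this first, since they depend on model defaults that may shift again.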

Get this skill + 15 more

Included in The Solopreneur Stack: run a one-person empire. Save $110+ vs buying individually.

Get The Solopreneur Stack — $129

What you get

·100-line SKILL.md, ready to drop into ~/.claude/skills/
·Tested through 3 Karpathy-loop iterations (versions v1.0.0 → v1.3.0)
·Triggers automatically when relevant, with no command to remember
·Lifetime updates as the skill is refined further