Separating what the AI knows from what the AI does
I'm a technical documentation writer. I spend my days writing developer docs, reviewing PRs, triaging Jira tickets, auditing existing content for accuracy, and trying to keep up with a platform that ships features faster than any one person can document.
Up until recently, when people told me AI was going to change my job, I mostly nodded and changed the subject. I'd used the chatbots. I'd seen the "write my docs for me" demos. The output was fluent, confident, and almost always wrong in ways that mattered: wrong tone, wrong structure, wrong assumptions about what the reader already knew.
But over the past several months, I've been building something different. Not an AI that writes my docs for me, but a system of AI-assisted workflows that handles the mechanical parts of my job, the parts that eat hours without requiring much creative judgment. And it's genuinely changed how I work. Anytime I say "we" in this post, I mean myself and the AI.
Here's what it took to get there.
What it actually does
I use OpenCode, a CLI-based coding assistant, as the foundation. On top of it, I've built a set of custom workflows (slash commands) that automate specific documentation tasks. Not "write me a page about feature X" but things like:
Researching context before writing. When a new documentation ticket lands, a workflow pulls context from the wiki, the existing docs, the ticket history, and related escalation tickets. It synthesizes a research brief so I start writing with full context instead of spending 30 minutes or more tab-switching.
Auditing documentation. A workflow walks through every page in a product's docs and checks structure, style conformance, frontmatter completeness, content gaps, and stale information. It groups findings by severity. What used to take me half a day takes minutes.
Triaging tickets. A daily workflow scans my ticket queue, cross-references against what's already documented, checks for related tickets, and produces a prioritized list with context. I still make the priority calls, but I'm making them with full information instead of gut feel.
Reviewing PRs for style conformance. A workflow checks documentation changes against the style guide, flags structural issues, and catches the things humans reliably miss after the nth review of the day: inconsistent heading levels, missing frontmatter fields, incorrect component usage.
Generating changelog entries. Given a source document describing a product change, a workflow produces a properly formatted changelog entry following all the conventions. I review and edit, but the mechanical formatting work is done.
None of these replace my judgment. All of them free up time I used to spend on tasks where the judgment-to-busywork ratio was low.
The wall we hit
My foray into automation started small. One workflow to help with ticket triage. Another for audits. They were each self-contained: a single file with all the instructions the AI needed to do one job.
Then the number of commands grew.
Each workflow needed to know about our style guide. Our repository structure. Our Jira conventions. Our frontmatter rules. Our git workflow. So each one got its own copy of that knowledge, pasted inline. When a convention changed (say, a new frontmatter field became required), I had to update it in every workflow that referenced it. By this point there were more than 20 workflows.
The workflows also got long. Really long. And here's the thing about large language models: they follow long instructions less faithfully than short ones, a problem often called context rot. I'd built a system that was useful but sometimes fragile. Every workflow was a monolith: part domain knowledge, part task logic, part orchestration, all tangled together.
The kitchen
The fix came from an insight that seems obvious in retrospect: separate what the AI knows from what the AI does from what the user asks for.
Think of a kitchen.
Ingredients
Skills are like ingredients: small, focused knowledge files that cover one topic each. Style rules. Frontmatter conventions. Jira project patterns. Git workflow conventions. Each ingredient is useful on its own, and any cook or recipe can grab what it needs.
The old system had the equivalent of a few giant bags of pre-mixed spices. One 600-line file mixed writing tone, frontmatter rules, component syntax, content types, and folder structure into a single blob. Every workflow that needed any of that knowledge had to load all of it. We broke it into focused, single-domain files. The pantry now has 14 ingredients. Workflows pick exactly what they need.
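To make "workflows pick exactly what they need" concrete, here is a minimal Python sketch. It assumes one Markdown file per skill in a single folder, which is my illustration rather than a description of how OpenCode stores or loads anything; the function name and file layout are hypothetical.

```python
from pathlib import Path


def load_skills(skills_dir: Path, names: list[str]) -> str:
    """Concatenate only the requested skill files, in the order given.

    Hypothetical layout: one `<name>.md` file per skill in `skills_dir`.
    """
    sections = []
    for name in names:
        path = skills_dir / f"{name}.md"
        if not path.is_file():
            # Fail loudly so a typo in a skill name never loads nothing silently.
            raise FileNotFoundError(f"unknown skill: {name}")
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)
```

A workflow that only checks frontmatter would ask for `["frontmatter"]` and never see the tone or component rules, which is the whole point of breaking up the 600-line file.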
Cooks
Agents are like cooks: specialized AI personas, each with a defined role, a set of default ingredients they know to grab, and a quality checklist they run before finishing any task. A writing specialist. A reviewer. A researcher. An analyst. An adversarial reviewer whose only job is to find problems in the other cooks' output.
This layer didn't exist before. Every workflow had to be its own generalist, which meant every workflow duplicated the same orchestration logic. Now, a workflow just says "hand this task to the reviewer" and the reviewer already knows which ingredients to load and what quality bar to hit.
Recipes
Slash commands are like recipes: the user-facing workflows that say which cook to call, what extra ingredients they might need, and what the task-specific requirements are. The recipe doesn't explain what the style guide says; it just says "check against the style guide." That knowledge lives in the ingredient.
The result: workflows got dramatically shorter. Shorter workflows are more reliable. And when a convention changes, I update it in one place.
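The three layers can be sketched as plain data. This is an illustrative Python model, not the actual file format: the class names, field names, and skill names below are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A cook: a role, its default ingredients, and its quality checklist."""
    name: str
    default_skills: tuple[str, ...]
    checklist: tuple[str, ...]


@dataclass(frozen=True)
class Command:
    """A recipe: which cook to call, extra ingredients, and the task itself."""
    name: str
    agent: Agent
    extra_skills: tuple[str, ...] = ()
    task: str = ""


def skills_for(command: Command) -> list[str]:
    """Agent defaults first, then the command's extras, without duplicates."""
    ordered = [*command.agent.default_skills, *command.extra_skills]
    return list(dict.fromkeys(ordered))  # dict keys keep first-seen order


reviewer = Agent(
    name="reviewer",
    default_skills=("style-rules", "frontmatter"),
    checklist=("every finding cites a rule",),
)
review_pr = Command(
    name="review-pr",
    agent=reviewer,
    extra_skills=("git-workflow", "style-rules"),
    task="Check the changed pages against the style guide.",
)
print(skills_for(review_pr))  # ['style-rules', 'frontmatter', 'git-workflow']
```

Notice that the recipe contains no style rules at all. It names a cook and a task, and the knowledge is resolved by name from the pantry.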
Principles that held up
Through the restructuring, a few principles proved themselves repeatedly:
| Principle | Why it matters |
|---|---|
| Shorter instructions are more reliable | LLMs follow 50 lines of clear instruction more faithfully than 300 lines of comprehensive instruction. Brevity isn't just aesthetic; it's functional. |
| Decision tables over prose | A 5-row table conveys branching logic more reliably than paragraphs of if/then descriptions. The AI parses structure better than nuance. |
| Separate knowledge from workflow | If you're writing reference content inside a workflow (style rules, conventions, schemas), it belongs in a shared knowledge file. |
| Delegate to specialists | A workflow that tries to research, write, and review in one pass will do each phase worse than a specialist would. |
| Detect context at runtime | Don't hardcode paths, usernames, or environment details. Have the AI figure out what it's working with when it runs. |
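
As an illustration of the last principle in the table, here is one way to discover a repository root at runtime instead of hardcoding it. In the workflows themselves the AI does this detection; the Python below is only a sketch of the same idea, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` until a directory containing `.git` is found.

    Returns None when no enclosing repository exists.
    """
    current = start.resolve()
    for candidate in (current, *current.parents):
        # `.git` is a directory in a normal clone and a file in a worktree,
        # so test for existence rather than for a directory.
        if (candidate / ".git").exists():
            return candidate
    return None


if __name__ == "__main__":
    print(find_repo_root(Path.cwd()))
```

The same instructions then work on any machine and from any subfolder, because nothing in them names a path.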
Two design tests also proved useful for deciding when to split things apart:
The splitting test: A piece of knowledge deserves its own file if (1) it can be used independently, (2) at least two workflows would use it, and (3) it's stable reference knowledge, not runtime logic. If all three are true, extract it. Otherwise, keep it inline.
The standalone threshold: A workflow stays simple (no specialist delegation, no adversarial review) if it's single-phase execution with no judgment calls. If the output requires domain expertise or the task has multiple phases (research, then synthesize, then validate), route it through the full pipeline.
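Both tests are simple enough to write down as predicates. The sketch below is just the two rules above restated in Python; the class and field names are mine, chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Knowledge:
    usable_independently: bool
    workflows_using_it: int
    stable_reference: bool  # reference knowledge, not runtime logic


def should_extract(k: Knowledge) -> bool:
    """Splitting test: extract only when all three conditions hold."""
    return (
        k.usable_independently
        and k.workflows_using_it >= 2
        and k.stable_reference
    )


@dataclass(frozen=True)
class Task:
    phases: int
    needs_domain_expertise: bool
    has_judgment_calls: bool


def route(t: Task) -> str:
    """Standalone threshold: simple only if single-phase with no judgment."""
    if t.phases > 1 or t.needs_domain_expertise or t.has_judgment_calls:
        return "full-pipeline"
    return "standalone"


print(should_extract(Knowledge(True, 1, True)))  # False: only one workflow uses it
print(route(Task(phases=3, needs_domain_expertise=True, has_judgment_calls=True)))  # full-pipeline
```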
What AI still can't do
I want to be honest about the limitations, because the value of this system depends on understanding where the boundaries are.
Smaller models aren't reliable enough. The workflows require a model that can hold complex, multi-step instructions in context. Smaller, faster models consistently lose track of later steps or start improvising. This limits your model choices and increases cost.
Long sessions degrade. The AI follows constraints less faithfully as conversations get long. In one session, a safety constraint I'd set early on ("never push code without my approval") faded from context after enough back-and-forth, and the AI pushed to the remote repository on its own. Constraints need reinforcement, not just declaration.
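One possible shape for "reinforcement, not just declaration" is to repeat the standing constraints at a fixed interval so they never sit far back in the conversation. The Python below is a hedged sketch of that idea only: it treats a session as a plain list of strings, the function name and interval are assumptions, and it is not how any particular tool manages context.

```python
def with_reminders(turns: list[str], constraints: list[str], every: int = 5) -> list[str]:
    """Return the turns with a constraint reminder up front and after every `every` turns."""
    if every < 1:
        raise ValueError("every must be at least 1")
    reminder = "Standing constraints: " + "; ".join(constraints)
    out = [reminder]
    for count, turn in enumerate(turns, start=1):
        out.append(turn)
        if count % every == 0:
            # Restate the rules so they stay close to the most recent turns.
            out.append(reminder)
    return out


session = with_reminders(
    ["turn 1", "turn 2", "turn 3", "turn 4"],
    ["never push code without my approval"],
    every=2,
)
print(session.count("Standing constraints: never push code without my approval"))  # 3
```

Repetition lowers the odds of a constraint fading, but it is still a request to the model. For something like pushing to a remote, a check that lives outside the conversation is the safer backstop.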
AI-managed knowledge files can silently contain wrong data. The AI populated a knowledge file with my team information and got my manager's name wrong, assigned me to products I don't cover, and used an incorrect surname. The system seemed robust, so I didn't check the file for a few days. Unlike code bugs, knowledge file errors don't produce visible failures. They just make the AI subtly wrong in ways you attribute to "AI being AI" instead of corrupted source data.
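A cheap guard against this kind of silent drift is to record a fingerprint of each knowledge file when a human last reviewed it, then list anything that has changed since. This is a minimal Python sketch under assumptions of my own (Markdown files in one folder, fingerprints kept in a dictionary); it flags files for review and says nothing about whether their content is right.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """SHA-256 of the file's bytes, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def unreviewed(skills_dir: Path, reviewed: dict[str, str]) -> list[str]:
    """Names of knowledge files that are new or changed since the last human review.

    `reviewed` maps a file name to the fingerprint recorded at review time.
    """
    return sorted(
        path.name
        for path in skills_dir.glob("*.md")
        if reviewed.get(path.name) != fingerprint(path)
    )
```

Run on a schedule, a check like this turns "I forgot to look" into a short list of files that need a human read.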
Content quality still needs a human. The AI can check style conformance and structural completeness mechanically. It cannot tell you whether an explanation actually helps the reader. It cannot judge whether a concept page makes the right tradeoffs between precision and accessibility. It cannot sense when something is technically correct but pedagogically useless. Those judgment calls are the core of the job, and they're staying with us.
If you're on the fence
If you're a tech writer wondering whether AI tooling is worth the investment: I think it is, but probably not in the way the hype suggests.
The value isn't in asking the AI to write your docs. The output quality isn't there for that, not for documentation that needs to be accurate, well-structured, and written for a specific audience. Maybe one day, but not today.
The value is in the mechanical work that surrounds the writing. Researching context. Auditing existing content. Triaging tickets. Checking conformance. Generating boilerplate. Formatting changelogs. The work that takes hours, follows patterns, and doesn't require your expertise, just your time.
The catch is that getting there requires building, not just prompting. You have to teach the AI your domain (your style guide, your repo structure, your conventions) and work out how to organize that knowledge in a way the AI can use reliably. That's not a one-afternoon project. It took many weeks of iteration.
But the payoff compounds. Every piece of domain knowledge you formalize makes every workflow more reliable. Every workflow you build gives you back hours. And the system gets better as it grows, because new workflows can lean on knowledge and specialists that already exist.
Start small. Pick one mechanical task that eats your time. Build a workflow for it. See if the tradeoff works for you. If it does, the architecture will follow naturally, because once you have three or four workflows duplicating the same knowledge, you'll want to separate what the AI knows from what the AI does.
That's when it gets interesting.