A workflow for clinicians, researchers, and analysts who want to use AI as a force multiplier, not a toy. Spec‑driven prompting, multi‑chat orchestration, verified outputs. Actionable from day one.
Most of what is written about AI right now assumes you are either a software engineer or trying to become one. That is not the population I am writing for. I am writing for clinicians, researchers, public health folks, analysts, and MBAs who want AI to do real work in their day job and are tired of the prompt‑engineering hype.
Here is what actually matters, based on about a year of using AI daily to build real products, write research, and produce long‑form health content. Everything below works without writing a line of code.
The first mistake most people make is picking a tool. ChatGPT, Claude, Gemini, Copilot, a dozen wrappers. The tool matters less than the workflow. Start from this question instead:
What is a thing I do repeatedly, that is cognitively draining, that has a clear right and wrong answer, and that I can verify after? That is the shape of a task AI is good at. Examples: compressing a stack of papers into a structured summary, drafting the first pass of a report from your own notes, extracting specific fields from messy documents.
Notice what is missing: "give me a creative idea," "tell me what I should do," "make a decision for me." AI is weakest exactly where it is most confident: giving judgment. It is strongest where you would have hired an intern: compression, drafts, first passes, structured extraction.
You do not need ten tools. You need three, and you need to use them well.
This is where you think out loud, specify a problem, review drafts, and iterate. Claude, ChatGPT, or Gemini. Pick one and learn it well. Do not bounce between them trying to find the "best" one. The best one is the one you use consistently enough to develop a working rhythm.
My default is Claude because the long‑context behavior and the cite‑your‑sources discipline are the best in class right now. But the specific model matters less than the habit.
For anything where citations matter (clinical reference, competitive teardown, lit review, policy synthesis), a research‑mode tool is a step up from a plain chat. Claude's research feature, Perplexity, or Google's deep research fall in this category. They search the web, pull primary sources, and produce a structured output with inline links.
The critical habit: treat the output as a first draft. Open the cited sources. Check that each claim matches what the source actually says. At least half the time there is a subtle drift between what the model wrote and what the source said. That is the part that stays your job.
If you are working on something that spans weeks (a grant, a research project, a build), you need a way to keep context across sessions. Claude Projects, ChatGPT Custom GPTs, and similar features all solve this. You upload the reference documents once, and the assistant has them available in every new chat.
This is the single biggest leverage point most people miss. Without it, you re‑explain your project in every chat. With it, you start every conversation already in context.
Here is the loop I run, whether I am doing research, writing, or building. Four phases, every time.
Before you open a chat, write a short specification in plain text. What are you trying to produce. What are the inputs. What does good look like. What are the non‑goals. What is the output format. Save it. This is your spec. It does not need to be long, but it needs to exist before you start prompting.
Rule: Open‑ended prompts produce debt. Locked prompts produce verifiable output. Writing the spec is the hard part; once you have it, the prompts almost write themselves.
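A minimal spec can fit in a few lines. Here is the shape (the project details are invented for illustration):

```text
Produce: a two-page summary of recent trials on drug X for a clinical audience.
Inputs: the five PDFs I will upload; nothing outside them.
Good looks like: every claim tied to a specific trial, effect sizes with confidence intervals, plain language.
Non-goals: no treatment recommendations, no speculation beyond the trials.
Output format: headed sections, one per trial, plus a one-paragraph synthesis.
```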
For anything bigger than a single task, run multiple chats in parallel instead of one long chat. Each chat holds its own context: one for research, one for recording decisions, one for implementation.
The reason this works: every chat gets longer over time, and the longer it gets the worse its judgment becomes. Keeping chats scoped and short keeps the quality consistent. You are effectively running a team of specialist assistants, not a single confused generalist.
Give the implementation chat the spec, the research output, and the decisions. Ask for one step. Get the output. Read it. Flag what is wrong. Ask for the next step.
Do not ask for the whole thing at once. That is the most common beginner mistake. A single shot at an entire document is almost always worse than five iterations on discrete sections.
Every claim that came from the model gets checked against a source. Every number gets verified. Every citation gets clicked. This is tedious. It is also the reason the output is useful instead of a liability.
The speed gain from AI is real, but it is not in the producing. It is in the iterating.
A lot of people treat prompts like a chat with a smart friend. That works for small tasks. For anything serious, prompt like you are handing work to a contractor.
Structure every non‑trivial prompt like this:
# Context
What I am working on. Why. Who will read the output.
# Input
The materials I am giving you.
# Task
What you should produce. In what format. At what length.
# Constraints
Do not do X. Do Y. Cite all claims inline. Flag any uncertainty.
# Success criteria
The output is useful if it does A, B, and C.
That takes 5 minutes to write. It saves 30 minutes of re‑prompting because the first output was vague.
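Filled in, the structure reads like a work order. This example uses an invented task, but the skeleton is the same every time:

```text
# Context
I am preparing a competitive teardown of three telehealth vendors for an internal strategy memo. Readers are non-technical executives.

# Input
The three vendor one-pagers pasted below.

# Task
Produce a comparison table (pricing model, target segment, differentiators) followed by a 150-word summary.

# Constraints
Use only the pasted materials. Cite the source document for every cell. Flag any field you could not find rather than guessing.

# Success criteria
The output is useful if an executive can see the three pricing models at a glance, every claim traces to a source, and gaps are explicitly marked.
```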
When you decide something (a threshold, a format, a scope), write it down in the prompt and tell the model it is locked. "We have decided to use NHANES cycles 1999–2018. Do not propose alternatives." Otherwise the model will helpfully offer to change decisions that are already closed, every time.
Never accept a claim without a citation. This is non‑negotiable for clinical, research, or business‑critical work. If the model says it cannot cite, treat the claim as unsupported.
The first output is never the output. It is the input to the second prompt. Your job is to read it, mark what is wrong, and feed that back. Three iterations beats one "perfect" prompt.
Every time you build a prompt that produces good output, save it. You will reuse it. Over a year, your personal prompt library becomes more valuable than any specific tool.
There are specific ways this goes wrong. Recognize them before they cost you.
The model produces a fluent, specific, plausible claim that is completely fabricated. Common with citations, statistics, and quotes. The only defense is verification. Every time.
The model takes a complicated situation and smooths it into a clean summary. The smoothing is often where the useful information lives. Ask "what did I lose when you summarized this?" It is a surprisingly productive question.
Models default to agreeing with you. If you ask "is X a good idea?" you will usually get "yes, X is a good idea." Ask instead: "argue against X" or "what are the three strongest critiques of X?" You will learn more in ten seconds than in a year of yes‑answers.
As chats get long, models start forgetting earlier instructions. When you notice the quality dropping, start a new chat with a fresh spec and the key artifacts from the old one. Do not try to push through a degraded context.
If you want to go from zero to genuinely useful with AI, here is a plan that works.
At the end of 30 days you will not be an engineer. You will not need to be. You will be measurably faster at the knowledge work that was already your job, and you will have the discipline to do it without the quality going down.
A short list of traps that cost me time I will not get back.
The goal is not to use AI more. The goal is to do better work, faster, with fewer errors, using AI as one of several tools. When you notice the tool is getting in the way, stop and do the work the old way. When you notice it is pulling its weight, lean in.
Thinking through how AI fits into your work? Happy to talk.
Get in touch