Why rare disease definitions belong under governance, not in a prompt
May 20, 2026 · Robert Smith
There’s a tempting shortcut in modern clinical AI: describe a disease to a language model in free text, ask it to find matching patients, and ship the result. It demos well. It is also the wrong foundation for rare disease patient finding.
The reason is simple. If two runs over the same data produce two different cohorts, neither cohort can be trusted. In rare disease, each candidate represents a real person who may have waited years for an answer. Reproducibility is the whole point.
Governed content vs. a prompt
We draw a hard line between two kinds of work:
- Governed, deterministic content. The definition of a disease (its codes, signals, lab thresholds, inclusions, and exclusions) is reproducible by construction. The same inputs always produce the same patients. It changes only through curation and sign-off, never by editing a free-text prompt at runtime.
- Orchestration. The surrounding workflow (drafting, presentation, exploration) can be AI-driven and flexible. It may invoke the governed definition, but it never re-derives it on the fly.
This mirrors a long tradition in clinical informatics: standardized, peer-reviewed phenotype definitions exist precisely so that a cohort means the same thing across sites and over time (the OHDSI community and shared phenotype libraries are built on exactly this principle).
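To make the governed half of that split concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not PathfindEHR™'s actual implementation: the class names, field names, codes such as `DX_A`, and the `marker_x` threshold are all invented placeholders. The point it shows is that a definition expressed as data plus a pure function gives the same cohort for the same inputs, whatever order the records arrive in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiseaseDefinition:
    """Hypothetical governed definition. Frozen, so it cannot be edited at runtime."""
    name: str
    version: str
    inclusion_codes: frozenset
    exclusion_codes: frozenset
    lab_minimums: tuple  # ((lab_name, minimum_value), ...)


@dataclass(frozen=True)
class Patient:
    """Hypothetical, heavily simplified patient record."""
    patient_id: str
    codes: frozenset
    labs: tuple  # ((lab_name, value), ...)


def matches(definition, patient):
    """Pure function: the result depends only on the definition and the record."""
    if patient.codes & definition.exclusion_codes:
        return False
    if not (patient.codes & definition.inclusion_codes):
        return False
    labs = dict(patient.labs)
    # A missing lab value counts as not meeting the threshold.
    return all(
        labs.get(lab, float("-inf")) >= minimum
        for lab, minimum in definition.lab_minimums
    )


def find_cohort(definition, patients):
    """Return matching patient ids, sorted so input order never changes the output."""
    return sorted(p.patient_id for p in patients if matches(definition, p))


# Placeholder data. None of these codes or thresholds are real clinical criteria.
EXAMPLE_DEFINITION = DiseaseDefinition(
    name="example-disease",
    version="1.0.0",
    inclusion_codes=frozenset({"DX_A", "DX_B"}),
    exclusion_codes=frozenset({"DX_EXCLUDE"}),
    lab_minimums=(("marker_x", 10.0),),
)

EXAMPLE_PATIENTS = [
    Patient("p1", frozenset({"DX_A"}), (("marker_x", 12.0),)),
    Patient("p2", frozenset({"DX_A", "DX_EXCLUDE"}), (("marker_x", 15.0),)),
    Patient("p3", frozenset({"DX_B"}), (("marker_x", 4.0),)),
    Patient("p4", frozenset({"DX_OTHER"}), ()),
]

cohort = find_cohort(EXAMPLE_DEFINITION, EXAMPLE_PATIENTS)
```

An AI-driven orchestration layer could call `find_cohort` as often as it likes. What it cannot do in this sketch is change what the definition means, because the definition is an immutable value rather than text interpreted at runtime.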
Validated by the people who define the field
Reproducibility is necessary but not sufficient. The definition also has to be clinically right. That’s why every disease pathway is curated and signed off by key opinion leaders (KOLs) before it’s used. The criteria reflect the judgment of leading clinicians, not an opaque algorithm, and changes are versioned and auditable.
The payoff is that a cohort PathfindEHR™ produces is defensible: you can show exactly why each patient was surfaced, who approved the definition that surfaced them, and that the same definition will behave identically tomorrow.
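One way to picture that defensibility is as a record attached to every surfaced patient. The sketch below is hypothetical: the record layout, the criterion ids, and the reviewer names are assumptions made for illustration, not a description of PathfindEHR™'s data model. It shows a result that carries the matched criteria, the definition version, and the approvers along with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovedDefinition:
    """Hypothetical signed-off definition with its approvers on record."""
    name: str
    version: str
    approved_by: tuple
    criteria: tuple  # ((criterion_id, required_code), ...)


@dataclass(frozen=True)
class Explanation:
    """Why one patient was surfaced, and under whose authority."""
    patient_id: str
    definition: str
    version: str
    approved_by: tuple
    matched_criteria: tuple


def explain(definition, patient_id, patient_codes):
    """Return an Explanation if any criterion matched, otherwise None."""
    matched = tuple(
        criterion_id
        for criterion_id, code in definition.criteria
        if code in patient_codes
    )
    if not matched:
        return None
    return Explanation(
        patient_id=patient_id,
        definition=definition.name,
        version=definition.version,
        approved_by=definition.approved_by,
        matched_criteria=matched,
    )


# Placeholder definition. Reviewer names and codes are invented.
EXAMPLE_APPROVED = ApprovedDefinition(
    name="example-disease",
    version="1.0.0",
    approved_by=("Reviewer One", "Reviewer Two"),
    criteria=(("c1", "DX_A"), ("c2", "SIGNAL_B"), ("c3", "SIGNAL_C")),
)

record = explain(EXAMPLE_APPROVED, "p1", {"DX_A", "SIGNAL_B"})
```

Because the explanation is built from the same definition that did the matching, the answer to "why this patient?" cannot drift away from the logic that actually ran.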
What this looks like in practice
- A clinician panel curates and approves a disease definition.
- That definition compiles to governed query logic, not a prompt.
- Running it across real-world data yields an explainable, ranked candidate list.
- Every change to the definition is reviewed and versioned.
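The last two steps can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the product's method: the dictionary layout, the signal names, and the "count of matched signals" score are invented for the example. The fingerprint makes any change to a definition detectable, and the ranking breaks ties by patient id so the list order is stable across runs.

```python
import hashlib
import json


def fingerprint(definition):
    """Hash a canonical JSON form so any content change yields a new fingerprint."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rank_candidates(definition, patients):
    """Rank patients by number of matched signals (a placeholder score).

    Ties are broken by patient id so the order is fully deterministic.
    """
    signals = set(definition["signals"])
    candidates = []
    for patient_id, codes in patients.items():
        matched = sorted(signals & set(codes))
        if matched:
            candidates.append(
                {"patient_id": patient_id, "score": len(matched), "matched": matched}
            )
    return sorted(candidates, key=lambda c: (-c["score"], c["patient_id"]))


# Placeholder definition and data. The signal names are not real clinical codes.
DEFINITION = {
    "name": "example-disease",
    "version": "1.0.0",
    "signals": ["SIG_A", "SIG_B", "SIG_C"],
}

PATIENTS = {
    "p2": ["SIG_A"],
    "p1": ["SIG_A", "SIG_B"],
    "p3": ["SIG_C"],
    "p4": ["OTHER"],
}

approved_fingerprint = fingerprint(DEFINITION)
ranked = rank_candidates(DEFINITION, PATIENTS)
```

Storing `approved_fingerprint` next to the sign-off record would let a pipeline refuse to run any definition whose content no longer matches what the reviewers approved. That check is a design suggestion, not something the article states the product does.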
Trustworthy clinical decision support has little to do with having the cleverest model. What matters is being able to stand behind every result. Governance is how you do that.
Read more about how PathfindEHR™ works.
Sources
- Observational Health Data Sciences and Informatics (OHDSI), standardized analytics and phenotype development.
This article reflects Sagacity’s engineering approach and is not medical advice.
Robert Smith, Co-founder & CTO
Sagacity Diagnostics, rare disease clinical decision support.
