Decision Support at Scale: How Structured Data, Modeling, and LLM Summaries Work Together
In the previous article, we focused on cTAKES and the engineering work required to make clinical NLP viable on large, multi‑page medical evidence documents. This article moves one step downstream, into Decision Support: the stage where most of the system’s modeling occurs and where summaries are generated.
But it’s important to understand that modeling is not confined to Decision Support. Modeling is used throughout the pipeline:
- To validate extracted text
- To detect malformed regions
- To identify OCR anomalies
- To classify document types
- To support region detection
- To drive Decision Support and summarization
Decision Support is simply the stage where modeling becomes the primary activity, and where the outputs directly support examiner workflows.
The role of Decision Support in the pipeline
Decision Support consumes:
- Structured data from cTAKES
- TF‑IDF features
- Region boundaries
- Encounter‑level signals
- Temporal anchors
- Supplemental retrieval from SOLR
And produces:
- Predictive model outputs
- Evidence‑grounded summaries
- Citation‑linked passages
- Examiner‑ready insights
This is the stage where extracted evidence becomes actionable.
Our current modeling approach: structured features + TF‑IDF + shallow neural networks
Before LLMs were viable at scale, our system used a hybrid modeling approach:
- Structured features extracted by cTAKES
- TF‑IDF n‑grams
- Shallow neural networks trained on these combined features
This architecture is:
- Fast
- Predictable
- Easy to maintain
- Scalable to national workloads
And it works extremely well — but it has a known limitation: explainability.
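The hybrid approach above can be sketched roughly as follows. This is a minimal illustration, not the production system: the documents, labels, and structured feature columns are all made up, and scikit-learn stands in for whatever training stack is actually used.

```python
# Illustrative sketch: combine structured (cTAKES-style) features with
# TF-IDF n-grams and train a shallow neural network on the combined vector.
# All data and feature names below are hypothetical.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

docs = [
    "patient reports chronic lower back pain after service injury",
    "routine follow-up, no complaints, vitals within normal limits",
    "MRI shows lumbar disc herniation consistent with prior trauma",
    "annual wellness visit, immunizations updated",
]
labels = [1, 0, 1, 0]  # 1 = relevant to the decision task (illustrative)

# Structured features, e.g. counts of extracted conditions and a trauma flag.
# Columns here are hypothetical, not the real schema.
structured = csr_matrix(np.array([
    [2, 1],
    [0, 0],
    [3, 1],
    [0, 0],
], dtype=float))

# TF-IDF over unigrams and bigrams.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
tfidf = vectorizer.fit_transform(docs)

# Concatenate both feature blocks; a single hidden layer keeps the model shallow.
X = hstack([structured, tfidf])
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X, labels)
print(clf.predict(X))
```

The key property is that the TF-IDF block and the structured block live in one feature vector, so the model can weigh lexical signals against extracted clinical facts.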
Explainability today: useful, but imperfect
Current explainability is generated by:
- Scoring cTAKES features against domain‑specific rules
- Mapping model outcomes to TF‑IDF terms
- Highlighting passages that contain those terms
- Displaying those passages as the “explanation”
This is effective, but it introduces a mismatch:
The model may be using different passages than the ones shown in the UI.
Because the model’s decision boundary is based on TF‑IDF vectors and structured features, the highlighted passages are a proxy, not a guarantee.
In a high‑stakes environment, proxies are not enough.
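The proxy mechanism described above can be sketched as follows. The term weights, passages, and `**…**` highlight markers are all illustrative; this is a toy version of the idea, not the production code.

```python
# Illustrative sketch of the proxy explanation: take the TF-IDF terms that
# contributed most to a model outcome, then highlight and rank the passages
# that contain them. Weights and passages below are made up.
import re

top_terms = {"lumbar": 0.42, "herniation": 0.37, "trauma": 0.21}

passages = [
    "Routine follow-up with no new complaints.",
    "MRI shows lumbar disc herniation consistent with prior trauma.",
    "Immunizations updated at annual wellness visit.",
]

def highlight(passage, terms):
    """Score a passage by summed term weights; wrap matched terms in markers."""
    score = 0.0
    marked = passage
    for term, weight in terms.items():
        if re.search(rf"\b{re.escape(term)}\b", marked, re.IGNORECASE):
            score += weight
            marked = re.sub(rf"\b({re.escape(term)})\b", r"**\1**", marked,
                            flags=re.IGNORECASE)
    return score, marked

ranked = sorted((highlight(p, top_terms) for p in passages), reverse=True)
best_score, best_passage = ranked[0]
print(best_passage)
```

Note what this does not do: it never consults the model's actual decision path, which is exactly the mismatch described above.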
Why LLMs change the explainability model
Modern LLMs allow us to shift from:
Model → Explanation
to
Evidence → Summary → Explanation
Instead of trying to infer what the model used, we can now:
- Identify the right evidence using structured data
- Feed only that evidence to the LLM
- Generate a grounded summary
- Use the summary itself as the explanation
This eliminates the mismatch entirely.
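One way to picture the Evidence → Summary → Explanation flow is prompt construction: the curated evidence is numbered so the LLM's summary can cite exactly the passages it was given. The evidence records, IDs, and prompt wording here are hypothetical.

```python
# Illustrative sketch: build a grounded prompt from pre-selected evidence so
# the generated summary can carry [n]-style citations back to real passages.
evidence = [
    {"id": "doc12-p4", "text": "MRI shows lumbar disc herniation."},
    {"id": "doc12-p9", "text": "Patient reports chronic back pain since 2014."},
]

def build_grounded_prompt(task, evidence):
    """Number each passage so the model can cite it as [n] in the summary."""
    lines = [
        f"Task: {task}",
        "Use ONLY the evidence below. Cite passages as [n].",
        "",
    ]
    for n, item in enumerate(evidence, start=1):
        lines.append(f"[{n}] ({item['id']}) {item['text']}")
    return "\n".join(lines)

prompt = build_grounded_prompt(
    "Summarize evidence of a chronic back condition", evidence)
print(prompt)
```

Because the summary can only cite passages that were in the prompt, the explanation shown to the examiner is, by construction, the evidence the summary used.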
Structured data is the foundation of scalable LLM summarization
The most important part of this transition is not the LLM — it’s the structured data extracted by cTAKES.
Structured data allows us to:
- Identify clinically relevant regions
- Filter out irrelevant text
- Anchor findings to encounters and timelines
- Score features against domain knowledge
- Select the correct passages for retrieval
This ensures the LLM sees only the evidence that matters, not the entire document.
Why this matters at 20M+ pages/day
- Lower token counts → lower cost and higher throughput
- Reduced hallucinations → because the LLM is grounded in curated evidence
- Better accuracy → because irrelevant text is excluded
- Predictable behavior → because the input is deterministic
Token count is not a theoretical concern — it is a cost and throughput constraint.
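A back-of-envelope calculation makes the constraint concrete. Every number below is hypothetical (tokens per page, curated fraction, and per-token cost are illustrative, not measured figures); the point is the order-of-magnitude gap between whole-document and curated-evidence inputs.

```python
# Back-of-envelope illustration (all numbers hypothetical): feeding whole
# documents to the LLM versus only curated evidence passages.
pages_per_day = 20_000_000
tokens_per_page = 600          # rough average for clinical text (assumed)
curated_fraction = 0.05        # structured selection keeps ~5% (assumed)
cost_per_million_tokens = 1.0  # illustrative dollar cost

full_tokens = pages_per_day * tokens_per_page
curated_tokens = int(full_tokens * curated_fraction)

full_cost = full_tokens / 1_000_000 * cost_per_million_tokens
curated_cost = curated_tokens / 1_000_000 * cost_per_million_tokens

print(f"full:    {full_tokens:,} tokens (${full_cost:,.0f}/day)")
print(f"curated: {curated_tokens:,} tokens (${curated_cost:,.0f}/day)")
```

Even with generous assumptions, curating input before generation cuts daily token volume by a factor of twenty.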
RAG driven by structured data — not blind retrieval
Our Retrieval‑Augmented Generation (RAG) pipeline is structured‑first:
- cTAKES extracts entities, regions, encounters, and temporal anchors
- We score these features against domain‑specific rules
- We identify the exact passages relevant to the Decision Support task
- Only those passages are fed to the LLM
This ensures the LLM is grounded in the same evidence the model uses.
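The structured-first selection step might look something like the sketch below. The rule weights, entity records, and CUI values are all hypothetical; the idea is simply that passages inherit scores from the extracted entities they contain.

```python
# Illustrative sketch of structured-first passage selection: score extracted
# entities against domain rules, then pick the passages holding the
# highest-scoring entities. Rules, CUIs, and entities are made up.
domain_rules = {
    "disorder": 3.0,     # extracted condition mentions weigh most
    "procedure": 2.0,
    "medication": 1.0,
}

entities = [
    {"cui": "C0000001", "type": "disorder",   "passage": "p4", "text": "low back pain"},
    {"cui": "C0000002", "type": "procedure",  "passage": "p4", "text": "lumbar MRI"},
    {"cui": "C0000003", "type": "medication", "passage": "p7", "text": "acetaminophen"},
]

def select_passages(entities, rules, top_k=1):
    """Sum rule weights per passage; return the top_k passage IDs."""
    scores = {}
    for e in entities:
        scores[e["passage"]] = scores.get(e["passage"], 0.0) + rules.get(e["type"], 0.0)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(select_passages(entities, domain_rules))
```

Because the selection is driven by extracted structure rather than free-text similarity, the passages handed to the LLM are the same ones the scoring logic actually used.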
Where SOLR fits: supplementing structure when needed
Structured data is powerful, but not always sufficient.
Some tasks require:
- Narrative context
- Rare or unusual phrasing
- Evidence that is difficult or impossible to extract in a structured form
For these cases, SOLR supplements structured data by retrieving:
- Nearby text spans
- Related passages
- Rare patterns not captured by cTAKES
SOLR is not the primary retrieval engine — it is the fallback and enhancer when structure alone cannot provide the full picture.
This hybrid approach ensures:
- High recall
- High precision
- Low token counts
- Strong grounding
- Minimal hallucination risk
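The fallback policy above can be sketched as a simple decision rule. The threshold is arbitrary and `solr_search` is a stub standing in for a real SOLR query (e.g. via a client library); both are assumptions for illustration only.

```python
# Illustrative sketch of the structure-first, SOLR-second retrieval policy.
# MIN_PASSAGES and solr_search are hypothetical stand-ins.
MIN_PASSAGES = 3  # fall back to SOLR if structure yields fewer than this

def solr_search(query, rows=5):
    # Stub: in production this would query the SOLR index for nearby spans
    # and rare phrasing that structured extraction missed.
    return [f"solr-hit-for:{query}"]

def retrieve(structured_passages, query):
    """Use structured passages first; top up from SOLR only when short."""
    results = list(structured_passages)
    if len(results) < MIN_PASSAGES:
        results.extend(solr_search(query, rows=MIN_PASSAGES - len(results)))
    return results

print(retrieve(["p4", "p9"], "lumbar herniation"))
```

When structure alone supplies enough evidence, SOLR is never consulted, which keeps token counts low and grounding tight.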
Decision Support: structured → scored → retrieved → summarized
The Decision Support stage looks like this:
- cTAKES extracts structured data
- Domain scoring identifies the most relevant features
- RAG selects passages using structure first, SOLR second
- The LLM receives only curated evidence
- The LLM produces a grounded, citation‑rich summary
- The summary becomes both the output and the explanation
This architecture is:
- Scalable
- Explainable
- Cost‑efficient
- Token‑efficient
- Auditable
- Future‑proof
Why this transition matters
This is not about replacing models with LLMs. It’s about:
- Improving explainability
- Eliminating mismatches between model logic and UI
- Reducing cost through token efficiency
- Ensuring every summary is grounded in real evidence
- Making the system more maintainable long‑term
- Preserving the strengths of structured extraction while adding LLM flexibility
LLMs are powerful — but structured data is what makes them safe, scalable, and affordable.