SOLR and Selective Retrieval at Scale: Full‑Document Indexing for User Search and LLM Summaries
In the previous article, we explored Decision Support — the stage where structured data, modeling, and LLM‑based summarization converge. This article focuses on a critical enabler of that workflow: SOLR and selective retrieval.
Our system indexes the entire document, not just structured slices. This supports two major use cases:
- User search within a single case — examiners can search for any phrase, abbreviation, or concept across all documents in that case.
- Summarization and Decision Support — SOLR retrieves contextual passages when structured data alone isn’t enough.
And it does this at a scale that supports over 20,000 users working across millions of documents.
Full‑document indexing: the foundation of flexibility
Medical evidence is long, heterogeneous, and unpredictable. Even with high‑quality structured extraction, some information is best retrieved directly from text.
By indexing the entire document, we ensure:
- Examiners can search for anything they need
- Summarization can retrieve context around structured hits
- Rare or unusual phrasing remains discoverable
- No clinically relevant text is “lost”
Full indexing gives us completeness and flexibility without sacrificing performance.
User search: UMLS‑powered synonym expansion
Examiners search within a single case, but they need comprehensive results. To support this, we integrate UMLS‑based synonym expansion directly into SOLR queries.
Examples:
- Searching PTSD also returns post‑traumatic stress disorder
- Searching ESRD also returns end stage renal disease
This ensures examiners get complete results without needing to know every variant of a clinical term.
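The expansion step can be sketched as a query builder that ORs a term with its known variants. This is a minimal illustration, not the production integration: the `SYNONYMS` dict stands in for a real UMLS lookup, and the field name `body` is hypothetical.

```python
# Sketch: expand a user query with UMLS-style synonyms before it reaches SOLR.
# SYNONYMS is a stand-in for a real UMLS lookup; the "body" field is illustrative.

SYNONYMS = {
    "ptsd": ["post-traumatic stress disorder"],
    "esrd": ["end stage renal disease"],
}

def expand_query(term: str, field: str = "body") -> str:
    """Build a SOLR OR-query covering the term and all known synonyms."""
    variants = [term] + SYNONYMS.get(term.lower(), [])
    # Quote each variant so multi-word synonyms match as phrases.
    clauses = [f'{field}:"{v}"' for v in variants]
    return " OR ".join(clauses)

print(expand_query("PTSD"))
# body:"PTSD" OR body:"post-traumatic stress disorder"
```

A term with no known synonyms simply passes through unexpanded, so the behavior degrades gracefully for non-clinical searches.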
Index lifecycle: purge after 3 days of non‑use
Full‑document indexing at national scale creates massive data volumes. To keep indexes lean and performant, we:
- Track document access
- Purge documents after 3 days of non‑use
- Re‑index on demand if needed
This keeps index size stable and predictable while ensuring active cases remain fast and responsive.
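The purge rule itself is simple to state in code. The sketch below assumes an in-memory map of last-access timestamps; in a real deployment this bookkeeping would live wherever document access is already tracked.

```python
from datetime import datetime, timedelta

# Sketch of the lifecycle rule: documents untouched for 3 days are dropped
# from the index and re-indexed on demand if a user returns to the case.
# last_access maps document id -> last access time (illustrative structure).

PURGE_AFTER = timedelta(days=3)

def docs_to_purge(last_access: dict, now: datetime) -> list:
    """Return ids of documents whose last access is older than the purge window."""
    return [doc_id for doc_id, ts in last_access.items() if now - ts > PURGE_AFTER]
```

Because purged documents can always be re-indexed, the rule trades a one-time re-index cost on cold cases for a permanently bounded index size.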
Architecture: 10 independent SOLR servers, round‑robin assigned
We deploy 10 independent SOLR servers and assign cases round‑robin using a PostgreSQL‑backed routing table.
This design is driven by SOLR’s own guidance:
ZooKeeper‑managed clusters are recommended either for 5 or fewer nodes or for clusters of thousands of nodes.
Anything in between introduces unnecessary operational overhead without meaningful benefit.
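The routing itself is straightforward: each new case is pinned to the next server in rotation, and the mapping is remembered so every later query for that case hits the same index. A minimal sketch, with an in-memory counter and dict standing in for the PostgreSQL routing table:

```python
# Sketch: round-robin case assignment over 10 independent SOLR servers.
# In production the counter and mapping live in a PostgreSQL routing table;
# server URLs are illustrative.

SOLR_SERVERS = [f"https://solr-{i:02d}.internal:8983/solr" for i in range(10)]

class RoundRobinRouter:
    """Assign each new case to the next server and remember the mapping."""

    def __init__(self, servers):
        self.servers = servers
        self.next_idx = 0   # in production: a sequence in PostgreSQL
        self.routing = {}   # case_id -> server (the routing table)

    def assign(self, case_id: str) -> str:
        if case_id not in self.routing:
            self.routing[case_id] = self.servers[self.next_idx]
            self.next_idx = (self.next_idx + 1) % len(self.servers)
        return self.routing[case_id]
```

Pinning a whole case to one server is what gives the isolation described below: a single case's queries never fan out across nodes.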
Our architecture provides:
- High throughput
- Isolation between cases
- Predictable latency
- No cross‑node contention
- Simple operational behavior
- Easy horizontal scaling
- Reliable performance for 20,000+ users
It’s a pragmatic design optimized for national‑scale workloads.
Structured fields inside the index: enabling advanced filtering
Our SOLR indexes contain structured fields such as:
- Region
- Encounter type
- Encounter date
- Section boundaries
- cTAKES‑aligned offsets
- Domain‑scored features
- Document metadata
These fields enable:
- Advanced user filtering (e.g., “show only imaging reports from 2018–2020”)
- Targeted summarization (e.g., “retrieve passages from the ADL section”)
- Efficient RAG (e.g., “pull context around this specific hit”)
This hybrid of structured and unstructured indexing is what makes retrieval both precise and flexible.
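A filter like "only imaging reports from 2018–2020" translates into SOLR filter-query (`fq`) clauses over those structured fields. The field names below mirror the list above but are illustrative, not the exact schema:

```python
# Sketch: translating user filters into SOLR filter-query (fq) clauses.
# Field names (encounter_type, encounter_date, region) are illustrative.

def build_filters(encounter_type=None, date_from=None, date_to=None, region=None):
    """Return a list of fq clauses to attach to a SOLR query."""
    fq = []
    if encounter_type:
        fq.append(f'encounter_type:"{encounter_type}"')
    if date_from or date_to:
        # SOLR range syntax; "*" means an open-ended bound.
        fq.append(f"encounter_date:[{date_from or '*'} TO {date_to or '*'}]")
    if region:
        fq.append(f'region:"{region}"')
    return fq

print(build_filters(encounter_type="imaging",
                    date_from="2018-01-01", date_to="2020-12-31"))
```

Because `fq` clauses are cached and applied before scoring, this kind of structured narrowing is also cheap at query time.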
Selective retrieval for summarization: structure first, SOLR second
Even though we index the entire document, summarization does not retrieve the entire document.
The workflow is:
- Structured data identifies the relevant entities, regions, and encounters
- Domain scoring determines which features matter
- We retrieve the exact passages tied to those features
- SOLR supplements with nearby context or rare phrasing
- We assemble a curated evidence set
- The LLM receives only this curated evidence
This keeps token counts low while maintaining high recall.
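The assembly step above can be sketched as a small function: structured, domain-scored features contribute their passages first, SOLR-supplied context fills in around them, and a passage budget caps the total. All names and thresholds here are illustrative assumptions, not the production pipeline.

```python
# Sketch of "structure first, SOLR second" evidence assembly.
# features: list of {"passage": str, "score": float} from the structured layer.
# solr_context: passages SOLR retrieved as nearby context or rare phrasing.
# score_threshold and max_passages are illustrative knobs.

def assemble_evidence(features, solr_context, score_threshold=0.5, max_passages=20):
    """Curate the evidence set the LLM will see: scored passages first,
    then deduplicated SOLR context, capped at a passage budget."""
    evidence = [f["passage"] for f in features if f["score"] >= score_threshold]
    for passage in solr_context:
        if passage not in evidence:
            evidence.append(passage)
    return evidence[:max_passages]
```

Ordering matters: structured hits lead, so even when the budget truncates the list, the highest-confidence evidence always survives.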
Token efficiency: the hidden constraint
At 20M+ pages/day:
- Every unnecessary token costs money
- Every unnecessary token increases latency
- Every unnecessary token risks hitting provider limits
Selective retrieval ensures that the LLM sees:
- Only the relevant passages
- Only the necessary context
- Only the evidence tied to structured features
This is how we maintain both accuracy and affordability.
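A back-of-envelope calculation shows why this matters at volume. Every number below is an illustrative assumption (trimmed tokens per page, provider pricing), not the system's actual figures:

```python
# Back-of-envelope: the cost of unnecessary tokens at 20M pages/day.
# All constants are illustrative assumptions, not real system figures.

PAGES_PER_DAY = 20_000_000
TOKENS_TRIMMED_PER_PAGE = 100       # assumption: what selective retrieval saves
COST_PER_MILLION_TOKENS = 1.00      # assumption: provider input price, USD

daily_savings = (PAGES_PER_DAY * TOKENS_TRIMMED_PER_PAGE
                 / 1_000_000 * COST_PER_MILLION_TOKENS)
print(f"${daily_savings:,.0f}/day")
# $2,000/day
```

Even at modest per-token prices, trimming a hundred tokens per page compounds into thousands of dollars a day — before counting the latency and rate-limit headroom it buys.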
Monitoring: retrieval must be predictable
We monitor:
- Query latency
- Index size (which stays flat due to regular purging)
- Query patterns
When we see users repeatedly searching for the same concepts, it becomes a signal to:
- Promote those concepts into structured data
- Add new classification models
- Improve region detection
- Enhance domain scoring
User behavior directly informs system evolution.
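Detecting those signals can be as simple as counting repeated queries. A minimal sketch, assuming a flat log of query strings and an illustrative frequency threshold:

```python
from collections import Counter

# Sketch: surface concepts users search repeatedly — candidates for promotion
# into structured data. The threshold is an illustrative assumption.

def promotion_candidates(query_log, min_count=50):
    """Return (concept, count) pairs searched at least min_count times,
    most frequent first."""
    counts = Counter(q.lower() for q in query_log)
    return [(concept, n) for concept, n in counts.most_common() if n >= min_count]
```

In practice this feedback loop is what turns a free-text workaround ("everyone keeps searching for the same phrase") into a first-class structured feature.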
Why SOLR still matters in an LLM world
LLMs are powerful, but they cannot replace retrieval. In fact, they make retrieval more important.
LLMs need:
- Grounding
- Structure
- Relevance
- Context
- Token efficiency
SOLR ensures that the LLM receives the right evidence, in the right order, at the right scale.