Heuristic vs LLM
The pipeline can run query expansion, synthesis, and gap analysis with heuristics (no LLM tokens) or LLM agents (structured JSON via pydantic-ai). Default settings favor heuristics for speed and low-resource machines.
Source: src/config/resolve_llm_features.py, src/research/query_expansion.py, src/analysis/synthesis.py, src/analysis/gap_analysis.py.
Default behavior
Out of the box on Ollama with llama3.2:3b (catalog fallback) and llm_mode: auto:
| Stage | LLM used? | Mechanism |
|---|---|---|
| Query expansion | No | Synonym tables, acronym expansion, Jaccard gates |
| Synthesis | No | Abstract sentence extraction, embedding-aligned agreements |
| Gap analysis | No | Derived from heuristic synthesis fields |
Reports may contain placeholders such as "Details inferred from abstract only" and "Cross-paper disagreement analysis limited in heuristic mode."
Quality impact
A successful pipeline run with zero LLM tokens can still produce misleading executive summaries when retrieval returns off-topic papers. See Known issues for the full analysis and fixes applied to heuristic paths.
Feature flag resolution
resolve_effective_settings() runs once at pipeline start and sets synthesis.llm_enabled and query_expansion.llm_enabled on the config passed to all stages.
Precedence (per feature — synthesis and query expansion independently):
1. RA_{SECTION}__LLM_ENABLED env (true/false/1/0)
2. llm_mode: on → enabled; llm_mode: off → disabled
3. llm_mode: auto → rules below
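The precedence above can be sketched as a small function. This is a minimal illustration, not the code in src/config/resolve_llm_features.py: `parse_bool_env` and `resolve_feature` are hypothetical names, and treating an unrecognized env value as "unset" is an assumption.

```python
from typing import Callable, Mapping, Optional

_TRUE = {"true", "1"}
_FALSE = {"false", "0"}


def parse_bool_env(raw: Optional[str]) -> Optional[bool]:
    """Return True/False for true/false/1/0, or None when unset.

    Assumption: unrecognized values are treated like an unset variable.
    """
    if raw is None:
        return None
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    return None


def resolve_feature(
    section: str,
    llm_mode: str,
    env: Mapping[str, str],
    auto_rule: Callable[[], bool],
) -> bool:
    """Apply the three-step precedence for one feature (hypothetical helper)."""
    # 1. An explicit environment override wins over everything else.
    override = parse_bool_env(env.get(f"RA_{section.upper()}__LLM_ENABLED"))
    if override is not None:
        return override
    # 2. Tri-state mode: "on" and "off" are unconditional.
    if llm_mode == "on":
        return True
    if llm_mode == "off":
        return False
    # 3. "auto": defer to the provider/catalog rules.
    return auto_rule()
```

Because each feature is resolved independently, the same function would be called once with `section="synthesis"` and once with `section="query_expansion"`.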
Auto-mode rules
| Provider | synthesis LLM | query expansion LLM |
|---|---|---|
| openai, anthropic | Always on | Always on |
| ollama | On if catalog entry has synthesis.llm_enabled: true | Same catalog hint |
| Other | Off | Off |
For Ollama, the resolved model name (after auto selection) determines catalog hints:
| Model | llm_mode: auto | max_llm_papers hint |
|---|---|---|
| llama3.1:8b | LLM on | 5 |
| llama3.2:3b | LLM off | 3 |
# Force LLM on any Ollama model
RA_SYNTHESIS__LLM_ENABLED=true
RA_QUERY_EXPANSION__LLM_ENABLED=true
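For readers who prefer code to tables, the auto-mode rules can be approximated as follows. The `CATALOG` dictionary and `auto_llm_enabled` are hypothetical stand-ins for the project's model catalog and resolver; turning the LLM off for an Ollama model with no catalog entry is an assumption.

```python
from typing import Dict

# Hypothetical catalog mirroring the two models listed above;
# the real catalog format may differ.
CATALOG: Dict[str, Dict[str, object]] = {
    "llama3.1:8b": {"llm_enabled": True, "max_llm_papers": 5},
    "llama3.2:3b": {"llm_enabled": False, "max_llm_papers": 3},
}

CLOUD_PROVIDERS = {"openai", "anthropic"}


def auto_llm_enabled(provider: str, model: str) -> bool:
    """Decide the llm_mode: auto outcome for one provider/model pair."""
    if provider in CLOUD_PROVIDERS:
        return True  # cloud providers: always on
    if provider == "ollama":
        # Use the catalog hint for the resolved model name.
        # Assumption: models missing from the catalog default to off.
        entry = CATALOG.get(model, {})
        return bool(entry.get("llm_enabled", False))
    return False  # any other provider: off
```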
What each path does
Query expansion
Heuristic (expand_query_heuristic):
- Domain synonym map and acronym expansion
- Phrase-aware variant generation with Jaccard overlap gate
- Broad-term guard to avoid degenerate single-word variants
- No API calls; deterministic for a given query
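The heuristic steps listed above might look roughly like this. It is a sketch under stated assumptions, not the body of expand_query_heuristic: the synonym and acronym tables, the `expand` function, and both overlap thresholds are invented for illustration. The gate keeps a variant only if it shares some, but not all, tokens with the original query, and the guard drops single-word variants.

```python
from typing import Dict, List, Set

# Hypothetical tables; the real maps live in the project source.
SYNONYMS: Dict[str, List[str]] = {"summarization": ["abstractive summary generation"]}
ACRONYMS: Dict[str, str] = {"llm": "large language model"}


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def expand(query: str, min_overlap: float = 0.2, max_overlap: float = 0.9) -> List[str]:
    """Generate phrase-aware variants and keep those passing the overlap gate."""
    words = query.lower().split()
    candidates: List[str] = []
    for i, word in enumerate(words):
        replacements = list(SYNONYMS.get(word, []))
        if word in ACRONYMS:
            replacements.append(ACRONYMS[word])
        for rep in replacements:
            # Substitute one term at a time, keeping the rest of the phrase.
            candidates.append(" ".join(words[:i] + [rep] + words[i + 1:]))

    base = tokens(query)
    variants: List[str] = []
    for cand in candidates:
        if len(cand.split()) < 2:  # broad-term guard: skip single-word variants
            continue
        score = jaccard(base, tokens(cand))
        # Assumed thresholds: reject near-duplicates and unrelated variants.
        if min_overlap <= score <= max_overlap and cand not in variants:
            variants.append(cand)
    return variants
```

No randomness or network access is involved, so the same query always yields the same variants.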
LLM (AgentRole.EXPANSION):
- Structured JSON: search variants + sub-questions
- Uses ctx.config LLM settings after resolution
- Falls back to heuristics on timeout or parse failure
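The fallback behaviour can be pictured with a small wrapper. This sketch uses only the standard library rather than pydantic-ai; `expand_with_fallback`, `heuristic_expansion`, the `call_llm` callable, and the JSON keys `variants` and `sub_questions` are all hypothetical.

```python
import json
from typing import Callable, Dict, List


def heuristic_expansion(query: str) -> Dict[str, List[str]]:
    """Stand-in for the heuristic path; here it returns the query unchanged."""
    return {"variants": [query], "sub_questions": []}


def expand_with_fallback(
    query: str, call_llm: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Try the LLM path; fall back to heuristics on timeout or parse failure."""
    try:
        raw = call_llm(query)   # may raise TimeoutError
        data = json.loads(raw)  # may raise json.JSONDecodeError (a ValueError)
        variants = data["variants"]
        sub_questions = data["sub_questions"]
        if not isinstance(variants, list) or not isinstance(sub_questions, list):
            raise ValueError("unexpected JSON shape")
        return {"variants": variants, "sub_questions": sub_questions}
    except (TimeoutError, ValueError, KeyError, TypeError):
        # Any timeout or malformed response degrades to the heuristic result.
        return heuristic_expansion(query)
```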
Synthesis (two-pass)
Heuristic:
- Extracts sentences from abstracts aligned with query concepts
- Aggregates agreements from top-quartile embedding-similar papers
- Limited cross-paper disagreement analysis
- No per-paper deep reading
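To make "top-quartile embedding-similar papers" concrete, here is one way the selection could work. The function names and the choice of cosine similarity are assumptions; the project may rank papers differently.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_quartile(
    query_vec: Sequence[float], paper_vecs: Dict[str, Sequence[float]]
) -> List[str]:
    """Return the IDs of the top 25% of papers by similarity to the query."""
    ranked = sorted(
        paper_vecs,
        key=lambda pid: cosine(query_vec, paper_vecs[pid]),
        reverse=True,
    )
    # Keep at least one paper whenever any are available.
    keep = max(1, math.ceil(len(ranked) / 4)) if ranked else 0
    return ranked[:keep]
```

Agreements would then be aggregated only from the returned papers, which limits the influence of weakly related results but does not remove off-topic papers that happen to score well.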
LLM:
- Pass A (AgentRole.EXTRACTION): structured per-paper analysis (up to max_llm_papers)
- Pass B (AgentRole.SYNTHESIS): cross-paper synthesis JSON
- Parallel workers controlled by synthesis.concurrency
- Timeout recovery via src/core/stage_recovery.py
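The two-pass shape can be sketched with a thread pool. `two_pass_synthesis`, `extract`, and `synthesize` are hypothetical names standing in for the extraction and synthesis agents, and the sketch omits timeout recovery entirely.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def two_pass_synthesis(
    papers: List[Any],
    extract: Callable[[Any], Dict[str, Any]],
    synthesize: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
    max_llm_papers: int = 3,
    concurrency: int = 2,
) -> Dict[str, Any]:
    """Pass A: per-paper extraction in parallel. Pass B: one cross-paper call."""
    # Cap the number of papers sent through the (expensive) extraction pass.
    selected = papers[:max_llm_papers]
    with ThreadPoolExecutor(max_workers=max(1, concurrency)) as pool:
        # pool.map preserves input order, so results line up with `selected`.
        extractions = list(pool.map(extract, selected))
    return synthesize(extractions)
```

Raising `concurrency` shortens Pass A only while the model backend can serve requests in parallel; on a single local model the calls may still be processed one at a time.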
Gap analysis
Coupled to synthesis
Gap analysis LLM is gated by synthesis.llm_enabled, not a separate gap_analysis.llm_mode. When synthesis LLM is off, gap analysis uses heuristics derived from synthesis output.
Heuristic gap analysis: infers gaps from synthesis agreement/disagreement fields.
LLM gap analysis (AgentRole.GAP_ANALYSIS): structured gaps and research opportunities JSON.
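The coupling to the synthesis flag can be illustrated with a short dispatcher. Everything here is hypothetical: the field names `agreements` and `disagreements`, the wording of the generated gaps, and the function names are placeholders, not the project's schema.

```python
from typing import Callable, Dict, List


def heuristic_gaps(synthesis: Dict[str, List[str]]) -> List[str]:
    """Infer gaps from synthesis agreement/disagreement fields (illustrative)."""
    gaps = [f"Unresolved disagreement: {d}" for d in synthesis.get("disagreements", [])]
    if not synthesis.get("agreements"):
        gaps.append("No cross-paper agreement found; evidence base may be thin.")
    return gaps


def run_gap_analysis(
    synthesis: Dict[str, List[str]],
    synthesis_llm_enabled: bool,
    llm_gap_agent: Callable[[Dict[str, List[str]]], List[str]],
) -> List[str]:
    """Gate the gap-analysis LLM on the synthesis flag; there is no separate toggle."""
    if synthesis_llm_enabled:
        return llm_gap_agent(synthesis)
    return heuristic_gaps(synthesis)
```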
Quality tradeoffs
| Dimension | Heuristic | LLM |
|---|---|---|
| Speed | ~seconds for synthesis stage | Minutes on local 8B; faster on cloud |
| RAM / cost | Minimal | 8–10 GB RAM (local 8B) or API fees (cloud) |
| Query expansion | Good for common CS/ML terms | Better for niche or interdisciplinary queries |
| Synthesis depth | Abstract snippets only | Per-paper extraction + cross-paper reasoning |
| Disagreement analysis | Placeholder text | LLM compares conflicting claims |
| Failure mode | Can rank or summarize off-topic papers | Same retrieval issues, but richer analysis when papers are relevant |
When to enable LLM
| Goal | Suggested config |
|---|---|
| Fast scan, low RAM | Defaults (3B + heuristic) — Cookbook recipe 1 |
| Local quality | llama3.1:8b + catalog auto or RA_SYNTHESIS__LLM_ENABLED=true — Recipe 2 |
| Best quality | Cloud OpenAI/Anthropic — Recipe 3 |
| Debug heuristic-only bugs | Keep LLM off; inspect logs/debug/ pipeline dumps |
Configuration reference
| Variable | Default | Effect |
|---|---|---|
| RA_SYNTHESIS__LLM_MODE | auto | Tri-state synthesis LLM |
| RA_SYNTHESIS__LLM_ENABLED | unset | Force override |
| RA_SYNTHESIS__MAX_LLM_PAPERS | 3 | Cap LLM extraction pass |
| RA_QUERY_EXPANSION__LLM_MODE | auto | Tri-state expansion LLM |
| RA_QUERY_EXPANSION__LLM_ENABLED | unset | Force override |
YAML equivalents under synthesis: and query_expansion: in config/default.yaml. See Stage toggles.
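The variable names follow a prefix, section, double underscore, key pattern, which maps onto the nested YAML keys. The helper below illustrates that mapping only; `env_to_overrides` is a hypothetical function, and the project's actual settings loader may parse and type-convert values differently.

```python
from typing import Dict, Mapping


def env_to_overrides(env: Mapping[str, str], prefix: str = "RA_") -> Dict[str, Dict[str, str]]:
    """Map RA_SECTION__KEY variables to {"section": {"key": value}}."""
    overrides: Dict[str, Dict[str, str]] = {}
    for name, value in env.items():
        # Ignore unrelated variables and names without a section separator.
        if not name.startswith(prefix) or "__" not in name:
            continue
        section, key = name[len(prefix):].lower().split("__", 1)
        overrides.setdefault(section, {})[key] = value
    return overrides
```

For example, RA_SYNTHESIS__MAX_LLM_PAPERS=5 would correspond to max_llm_papers: 5 under synthesis: in the YAML file. Values stay strings in this sketch.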
Verify LLM is active
Check pipeline logs or debug JSON in logs/debug/:
- Log line: Resolved LLM features (provider=..., model=...): synthesis=True, query_expansion=True
- Metrics: non-zero llm_tokens_in / llm_tokens_out in pipeline metrics
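Both checks can be scripted. The regular expression below is written against the log line format quoted above and may need adjusting if the real message differs; `parse_resolved_features` and `llm_was_used` are hypothetical helpers, and treating either token counter being non-zero as "active" is an assumption.

```python
import re
from typing import Dict, Optional

LOG_PATTERN = re.compile(
    r"Resolved LLM features \(provider=(?P<provider>[^,]+), model=(?P<model>[^)]+)\): "
    r"synthesis=(?P<synthesis>True|False), "
    r"query_expansion=(?P<query_expansion>True|False)"
)


def parse_resolved_features(line: str) -> Optional[Dict[str, object]]:
    """Extract provider, model, and both flags from one log line, if present."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "provider": match["provider"],
        "model": match["model"],
        "synthesis": match["synthesis"] == "True",
        "query_expansion": match["query_expansion"] == "True",
    }


def llm_was_used(metrics: Dict[str, int]) -> bool:
    """True when either token counter in the pipeline metrics is non-zero."""
    return metrics.get("llm_tokens_in", 0) > 0 or metrics.get("llm_tokens_out", 0) > 0
```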
Enable debug and inspect dumps: Logging and debug — Debug walkthrough.
Related pages
- Ollama — catalog hints for auto mode
- Cloud providers — auto-enables LLM
- Synthesis stage — two-pass workflow
- Known issues — heuristic quality RCA