Multi-hop RAG: Why It Still Fails After Temporal RAG

You added Temporal RAG, but "who is my boss's boss?" still returns wrong answers. RAG now understands time, but it still doesn't know "what to search for next."

Redefining the Problem: Single-hop vs Multi-hop

The conventional explanation goes:

"Single-hop is easy, Multi-hop is hard"

But this describes the symptom, not the cause. A more precise definition:

Single-hop is a 'document selection problem.'

Multi-hop is a 'search sequence planning problem.'

text
Single-hop: Query → Retrieve → Answer
Multi-hop:  Query → Retrieve₁ → Reasoning → Retrieve₂ → ... → Answer

Temporal RAG narrows down 'when,' but it still can't reason about 'what to search next.'

Examples of Multi-hop Questions

text
"What did Microsoft's CEO say when OpenAI's CEO was fired?"

To answer this:

  1. Identify when OpenAI CEO was fired (2023-11-17)
  2. Search for Microsoft CEO's statement at that time
  3. Connect both pieces of information

text
"Who was CEO before Sam Altman returned?"

To answer this:

  1. Identify when Sam Altman returned (2023-11-22)
  2. Search for CEO just before that (backward reasoning)
  3. Understand the interim sequence Mira Murati → Emmett Shear; the answer is Emmett Shear

Failure Pattern Analysis

Pattern A: Partial Retrieval

python
query = "What did Microsoft's CEO say when OpenAI's CEO was fired?"

# What Temporal RAG does
temporal_filter = "2023-11-17 ± 1 day"
retrieved = search(query, time_filter=temporal_filter)

# Result: Only gets CEO firing articles
# Documents with "Microsoft CEO statement" aren't in search scope

Problem: Temporal filtering narrows the search space by design, killing the next hop before it starts.
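Pattern A can be reproduced in a few lines. The corpus, dates, and `search` helper below are toy stand-ins (assumptions, not a real retriever); the point is that the ±1-day filter silently drops the hop-2 document, so the Microsoft query can only return firing articles:

```python
from datetime import date

# Toy corpus: (publication date, text). Names and dates mirror the example above.
CORPUS = [
    (date(2023, 11, 17), "OpenAI board fires CEO Sam Altman"),
    (date(2023, 11, 20), "Microsoft CEO Satya Nadella comments on the OpenAI shakeup"),
]

def search(keywords, on=None, window_days=1):
    """Naive keyword match, optionally restricted to a +/- window around a date."""
    hits = []
    for pub, text in CORPUS:
        if on is not None and abs((pub - on).days) > window_days:
            continue  # temporal filter drops the document
        if any(k.lower() in text.lower() for k in keywords):
            hits.append(text)
    return hits

# Hop 1 works: the firing article is inside the window.
print(search(["fires", "CEO"], on=date(2023, 11, 17)))
# Hop 2 fails: Nadella's statement (Nov 20) is outside the +/- 1 day filter,
# so the query only matches the firing article again.
print(search(["Microsoft", "CEO"], on=date(2023, 11, 17)))
```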

Pattern B: Context Mixing

python
query = "How did Tesla's stock react when they cut prices?"

# Search results (publication month in trailing comments)
docs = [
    "Tesla cut Model 3 prices by 15%",   # 2023-01
    "Tesla stock dropped 12%",           # 2023-04
    "Tesla stock hits all-time high",    # 2023-07
]

# The LLM connects these by co-occurrence, not causation

Problem: Connects by keyword similarity, not causal relationship.

Pattern C: Chain Collapse

python
query = "Who was CEO before Sam Altman returned?"

# LLM's reasoning
# Current CEO: Sam Altman ✓
# Past reasoning: ??? (can't go backward)

# Why?
# - Can't convert "before return" into a search query
# - Doesn't know the interim sequence: Mira Murati → Emmett Shear

Problem: Can't reason backward through time.

The Core Insight

RAG failures start not at answer generation, but at 'failing to plan the search path.'

text
❌ Wrong diagnosis: "LLM can't reason"
✓ Correct diagnosis: "Required documents were never retrieved"

Solutions: Role Separation, Not Enumeration

1. Query Decomposition

Role: Break questions into searchable units

python
original = "What did Microsoft's CEO say when OpenAI's CEO was fired?"

decomposed = [
    "When was OpenAI's CEO fired?",
    "What did Microsoft's CEO say on November 17, 2023?"
]

Limitation: Decomposition works, but sequencing still relies on LLM intuition
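A minimal, runnable sketch of decomposition. A regex heuristic stands in for the LLM call (an assumption made so the example runs without a model; `decompose` is a hypothetical helper):

```python
import re

def decompose(question):
    """Split 'What did X when Y?' into: resolve Y's time, then ask X at that time.
    A real system would prompt an LLM; the regex is a toy stand-in."""
    m = re.match(r"(What did .+?) when (.+)\?", question)
    if not m:
        return [question]  # already a single-hop question
    head, event = m.groups()
    return [f"When {event}?", f"{head} at that time?"]

print(decompose("What did Microsoft's CEO say when OpenAI's CEO was fired?"))
```

Note that the output preserves the required order: the time-resolving sub-question comes first, because its answer parameterizes the second query.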

2. Iterative Retrieval

Role: Use previous answers for next search

python
# Step 1
q1 = "When was OpenAI's CEO fired?"
a1 = "Sam Altman was fired on 2023-11-17"

# Step 2 (reuse the date extracted from a1)
date = "2023-11-17"  # parsed out of a1
q2 = f"What did Microsoft's CEO say on {date}?"
a2 = "Satya Nadella expressed support for Sam Altman"

# Final
answer = combine(a1, a2)

Limitation: One wrong hop accumulates errors
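The loop above can be sketched end to end. `FACTS` and `answer_one_hop` are stubs standing in for a real retriever-plus-LLM pipeline (assumptions, not an actual API); the key line is where hop 1's date is parsed out and fed into hop 2's query:

```python
import re

# Stubbed single-hop QA over a tiny fact table (a real system would
# retrieve documents and generate the answer with an LLM).
FACTS = {
    "When was OpenAI's CEO fired?": "Sam Altman was fired on 2023-11-17",
    "What did Microsoft's CEO say on 2023-11-17?": "Satya Nadella expressed support for Sam Altman",
}

def answer_one_hop(question):
    return FACTS.get(question, "unknown")

def iterative_answer():
    # Hop 1: resolve the time anchor.
    a1 = answer_one_hop("When was OpenAI's CEO fired?")
    date = re.search(r"\d{4}-\d{2}-\d{2}", a1).group()  # feed hop 1 into hop 2
    # Hop 2: reuse the extracted date in the next query.
    a2 = answer_one_hop(f"What did Microsoft's CEO say on {date}?")
    return f"{a1}; {a2}"

print(iterative_answer())
```

If hop 1 returned a wrong date, hop 2 would query for the wrong day and the error would propagate, which is exactly the accumulation risk noted above.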

3. Graph-based Reasoning

Role: Explicitly define traversable paths

text
[Sam Altman] --fired_from--> [OpenAI] --at_time--> [2023-11-17]
                                |
                                v
[Satya Nadella] --commented_on--> [Sam Altman firing]

Advantage: Stable ordering and relationships

Disadvantage: Graph construction cost, schema design required
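A toy version of this graph shows why backward reasoning becomes cheap: it is a plain edge lookup rather than a query rewrite. The `EDGES` triples below are hand-written assumptions mirroring the diagram, not extracted facts:

```python
# Tiny knowledge graph as (subject, relation, object) triples.
EDGES = [
    ("Sam Altman", "fired_from", "OpenAI"),
    ("Sam Altman firing", "at_time", "2023-11-17"),
    ("Satya Nadella", "commented_on", "Sam Altman firing"),
    ("Emmett Shear", "ceo_before", "Sam Altman return"),
    ("Mira Murati", "ceo_before", "Emmett Shear"),
]

def neighbors(node, relation):
    return [t for s, r, t in EDGES if s == node and r == relation]

def who_preceded(event):
    """Follow ceo_before edges backward from an event; backward reasoning
    becomes graph traversal instead of query reformulation."""
    chain = []
    frontier = event
    while True:
        prev = [s for s, r, t in EDGES if r == "ceo_before" and t == frontier]
        if not prev:
            return chain
        chain.append(prev[0])
        frontier = prev[0]

print(who_preceded("Sam Altman return"))  # walks the interim-CEO chain backward
```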

4. Chain-of-Thought + RAG

Role: Make reasoning visible

python
thought_chain = """
1. First, find when OpenAI CEO was fired
2. Then, search for Microsoft CEO's statement at that time
3. Connect both to form answer
"""

Limitation: Retrieval remains a black box

Solution Comparison

| Approach | Role | Strength | Limitation |
|----------|------|----------|------------|
| Query Decomposition | Break into searchable units | Simple to implement | Sequencing depends on LLM |
| Iterative Retrieval | Use previous answers | Dynamic search | Error accumulation risk |
| Graph-based | Explicit paths | Stable ordering | High construction cost |
| CoT + RAG | Visible reasoning | Easy debugging | Retrieval is black box |

Conclusion

Multi-hop isn't about 'smarter LLMs'—it's about 'structuring the search path.'

Temporal RAG is a necessary condition for Multi-hop, but not a sufficient one.

The next bottleneck is 'Retrieval Planning.'

text
Temporal RAG: Solved "when" ✓
Multi-hop RAG: "What to search next" → Unsolved

Next Step: Retrieval Planning
- Explicitly plan search sequence
- Pass each hop's result to the next
- Backtrack on failure
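The three steps above can be sketched as a minimal planner. The keyword retriever, the two-document corpus, and the single fallback query standing in for backtracking are all assumptions for illustration, not a real implementation:

```python
def plan_and_retrieve(steps, retrieve):
    """Run retrieval steps in order; on an empty result, backtrack by
    retrying with a fallback query (a toy stand-in for a real planner)."""
    results = {}
    for name, (query, fallback) in steps.items():
        hits = retrieve(query)
        if not hits:            # backtrack on failure
            hits = retrieve(fallback)
        results[name] = hits
    return results

# Hypothetical two-document corpus and a naive keyword retriever.
DOCS = ["Sam Altman fired 2023-11-17", "Satya Nadella backs Altman"]
retrieve = lambda q: [d for d in DOCS if any(w in d for w in q.split())]

steps = {
    "hop1": ("CEO fired", "Altman removed"),
    "hop2": ("Microsoft statement", "Nadella reaction"),  # primary misses, fallback hits
}
print(plan_and_retrieve(steps, retrieve))
```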

Next Post Preview

"Retrieval Planning: Making RAG Plan Its Search Sequence"
  • ReAct Pattern
  • Self-Ask
  • Plan-and-Solve