Multi-hop RAG: Why It Still Fails After Temporal RAG

You added Temporal RAG, but "who is my boss's boss?" still returns wrong answers. RAG now understands time, but it still doesn't know "what to search for next."

Redefining the Problem: Single-hop vs Multi-hop

The conventional explanation goes:

"Single-hop is easy, Multi-hop is hard"

But this describes the symptom, not the cause. A more precise definition:

Single-hop is a 'document selection problem.'

Multi-hop is a 'search sequence planning problem.'

text
Single-hop: Query → Retrieve → Answer
Multi-hop:  Query → Retrieve₁ → Reasoning → Retrieve₂ → ... → Answer

Temporal RAG narrows down 'when,' but it still can't reason about 'what to search next.'

Examples of Multi-hop Questions

text
"What did Microsoft's CEO say when OpenAI's CEO was fired?"

To answer this:

  1. Identify when OpenAI CEO was fired (2023-11-17)
  2. Search for Microsoft CEO's statement at that time
  3. Connect both pieces of information

text
"Who was CEO before Sam Altman returned?"

To answer this:

  1. Identify when Sam Altman returned (2023-11-22)
  2. Search for CEO just before that (backward reasoning)
  3. Understand the interim sequence Mira Murati → Emmett Shear; the answer is Emmett Shear

Failure Pattern Analysis

Pattern A: Partial Retrieval

python
query = "What did Microsoft's CEO say when OpenAI's CEO was fired?"

# What Temporal RAG does
temporal_filter = "2023-11-17 ± 1 day"
retrieved = search(query, time_filter=temporal_filter)

# Result: Only gets CEO firing articles
# Documents with "Microsoft CEO statement" aren't in search scope

Problem: Temporal filtering narrows the search space by design, killing the next hop before it starts.
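Pattern A can be reproduced in a few lines. The corpus, dates, and `search` helper below are toy stand-ins (assumptions, not a real retriever); the point is that the ±1-day filter silently drops the hop-2 document, so the Microsoft query can only return firing articles:

```python
from datetime import date

# Toy corpus: (publication date, text). Names and dates mirror the example above.
CORPUS = [
    (date(2023, 11, 17), "OpenAI board fires CEO Sam Altman"),
    (date(2023, 11, 20), "Microsoft CEO Satya Nadella comments on the OpenAI shakeup"),
]

def search(keywords, on=None, window_days=1):
    """Naive keyword match, optionally restricted to a +/- window around a date."""
    hits = []
    for pub, text in CORPUS:
        if on is not None and abs((pub - on).days) > window_days:
            continue  # temporal filter drops the document
        if any(k.lower() in text.lower() for k in keywords):
            hits.append(text)
    return hits

# Hop 1 works: the firing article is inside the window.
print(search(["fires", "CEO"], on=date(2023, 11, 17)))
# Hop 2 fails: Nadella's statement (Nov 20) is outside the +/- 1 day filter,
# so the query only matches the firing article again.
print(search(["Microsoft", "CEO"], on=date(2023, 11, 17)))
```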

Pattern B: Context Mixing

python
query = "How did Tesla's stock react when they cut prices?"

# Search results (publication month in trailing comments)
docs = [
    "Tesla cut Model 3 prices by 15%",   # 2023-01
    "Tesla stock dropped 12%",           # 2023-04
    "Tesla stock hits all-time high",    # 2023-07
]

# The LLM connects these by co-occurrence, not causation

Problem: Connects by keyword similarity, not causal relationship.

Pattern C: Chain Collapse

python
query = "Who was CEO before Sam Altman returned?"

# LLM's reasoning
# Current CEO: Sam Altman ✓
# Past reasoning: ??? (can't go backward)

# Why?
# - Can't convert "before return" into a search query
# - Doesn't know the interim sequence: Mira Murati → Emmett Shear

Problem: Can't reason backward through time.

The Core Insight

RAG failures start not at answer generation, but at 'failing to plan the search path.'

text
❌ Wrong diagnosis: "LLM can't reason"
✓ Correct diagnosis: "Required documents were never retrieved"

Solutions: Role Separation, Not Enumeration

1. Query Decomposition

Role: Break questions into searchable units

python
original = "What did Microsoft's CEO say when OpenAI's CEO was fired?"

decomposed = [
    "When was OpenAI's CEO fired?",
    "What did Microsoft's CEO say on November 17, 2023?"
]

Limitation: Decomposition works, but sequencing still relies on LLM intuition
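A minimal, runnable sketch of decomposition. A regex heuristic stands in for the LLM call (an assumption made so the example runs without a model; `decompose` is a hypothetical helper):

```python
import re

def decompose(question):
    """Split 'What did X when Y?' into: resolve Y's time, then ask X at that time.
    A real system would prompt an LLM; the regex is a toy stand-in."""
    m = re.match(r"(What did .+?) when (.+)\?", question)
    if not m:
        return [question]  # already a single-hop question
    head, event = m.groups()
    return [f"When {event}?", f"{head} at that time?"]

print(decompose("What did Microsoft's CEO say when OpenAI's CEO was fired?"))
```

Note that the output preserves the required order: the time-resolving sub-question comes first, because its answer parameterizes the second query.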

2. Iterative Retrieval

Role: Use previous answers for next search

python
# Step 1
q1 = "When was OpenAI's CEO fired?"
a1 = "Sam Altman was fired on 2023-11-17"

# Step 2 (reuse the date extracted from a1)
date = "2023-11-17"  # parsed out of a1
q2 = f"What did Microsoft's CEO say on {date}?"
a2 = "Satya Nadella expressed support for Sam Altman"

# Final
answer = combine(a1, a2)

Limitation: One wrong hop accumulates errors
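The loop above can be sketched end to end. `FACTS` and `answer_one_hop` are stubs standing in for a real retriever-plus-LLM pipeline (assumptions, not an actual API); the key line is where hop 1's date is parsed out and fed into hop 2's query:

```python
import re

# Stubbed single-hop QA over a tiny fact table (a real system would
# retrieve documents and generate the answer with an LLM).
FACTS = {
    "When was OpenAI's CEO fired?": "Sam Altman was fired on 2023-11-17",
    "What did Microsoft's CEO say on 2023-11-17?": "Satya Nadella expressed support for Sam Altman",
}

def answer_one_hop(question):
    return FACTS.get(question, "unknown")

def iterative_answer():
    # Hop 1: resolve the time anchor.
    a1 = answer_one_hop("When was OpenAI's CEO fired?")
    date = re.search(r"\d{4}-\d{2}-\d{2}", a1).group()  # feed hop 1 into hop 2
    # Hop 2: reuse the extracted date in the next query.
    a2 = answer_one_hop(f"What did Microsoft's CEO say on {date}?")
    return f"{a1}; {a2}"

print(iterative_answer())
```

If hop 1 returned a wrong date, hop 2 would query for the wrong day and the error would propagate, which is exactly the accumulation risk noted above.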

3. Graph-based Reasoning

Role: Explicitly define traversable paths

text
[Sam Altman] --fired_from--> [OpenAI] --at_time--> [2023-11-17]
                                |
                                v
[Satya Nadella] --commented_on--> [Sam Altman firing]

Advantage: Stable ordering and relationships

Disadvantage: Graph construction cost, schema design required
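A toy version of this graph shows why backward reasoning becomes cheap: it is a plain edge lookup rather than a query rewrite. The `EDGES` triples below are hand-written assumptions mirroring the diagram, not extracted facts:

```python
# Tiny knowledge graph as (subject, relation, object) triples.
EDGES = [
    ("Sam Altman", "fired_from", "OpenAI"),
    ("Sam Altman firing", "at_time", "2023-11-17"),
    ("Satya Nadella", "commented_on", "Sam Altman firing"),
    ("Emmett Shear", "ceo_before", "Sam Altman return"),
    ("Mira Murati", "ceo_before", "Emmett Shear"),
]

def neighbors(node, relation):
    return [t for s, r, t in EDGES if s == node and r == relation]

def who_preceded(event):
    """Follow ceo_before edges backward from an event; backward reasoning
    becomes graph traversal instead of query reformulation."""
    chain = []
    frontier = event
    while True:
        prev = [s for s, r, t in EDGES if r == "ceo_before" and t == frontier]
        if not prev:
            return chain
        chain.append(prev[0])
        frontier = prev[0]

print(who_preceded("Sam Altman return"))  # walks the interim-CEO chain backward
```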

4. Chain-of-Thought + RAG

Role: Make reasoning visible

python
thought_chain = """
1. First, find when OpenAI CEO was fired
2. Then, search for Microsoft CEO's statement at that time
3. Connect both to form answer
"""

Limitation: Retrieval remains a black box

Solution Comparison

| Approach | Role | Strength | Limitation |
|----------|------|----------|------------|
| Query Decomposition | Break into searchable units | Simple to implement | Sequencing depends on LLM |
| Iterative Retrieval | Use previous answers | Dynamic search | Error accumulation risk |
| Graph-based | Explicit paths | Stable ordering | High construction cost |
| CoT + RAG | Visible reasoning | Easy debugging | Retrieval is black box |

Conclusion

Multi-hop isn't about 'smarter LLMs'—it's about 'structuring the search path.'

Temporal RAG is a necessary condition for Multi-hop, but not a sufficient one.

The next bottleneck is 'Retrieval Planning.'

text
Temporal RAG: Solved "when" ✓
Multi-hop RAG: "What to search next" → Unsolved

Next Step: Retrieval Planning
- Explicitly plan search sequence
- Pass each hop's result to the next
- Backtrack on failure
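The three steps above can be sketched as a minimal planner. The keyword retriever, the two-document corpus, and the single fallback query standing in for backtracking are all assumptions for illustration, not a real implementation:

```python
def plan_and_retrieve(steps, retrieve):
    """Run retrieval steps in order; on an empty result, backtrack by
    retrying with a fallback query (a toy stand-in for a real planner)."""
    results = {}
    for name, (query, fallback) in steps.items():
        hits = retrieve(query)
        if not hits:            # backtrack on failure
            hits = retrieve(fallback)
        results[name] = hits
    return results

# Hypothetical two-document corpus and a naive keyword retriever.
DOCS = ["Sam Altman fired 2023-11-17", "Satya Nadella backs Altman"]
retrieve = lambda q: [d for d in DOCS if any(w in d for w in q.split())]

steps = {
    "hop1": ("CEO fired", "Altman removed"),
    "hop2": ("Microsoft statement", "Nadella reaction"),  # primary misses, fallback hits
}
print(plan_and_retrieve(steps, retrieve))
```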

Next Post Preview

"Retrieval Planning: Making RAG Plan Its Search Sequence"
  • ReAct Pattern
  • Self-Ask
  • Plan-and-Solve