Temporal RAG: Why RAG Always Gets 'When' Questions Wrong

"Who was the CEO in 2023?" "What about now?" Why RAG gets these simple questions wrong, and how to fix it.

Introduction: RAG's Time Blindness

Ask your RAG system this question:

"Who is OpenAI's CEO?"

Answer: "Sam Altman."

Good. Now ask this:

"Who was OpenAI's CEO on November 18, 2023?"

Answer: "Sam Altman."

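Wrong. Sam Altman was fired on November 17, 2023. Mira Murati served as interim CEO from November 17 to 20, Emmett Shear from November 20 to 22, and Altman returned on November 22. On November 18, the CEO was Mira Murati.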

More Failure Cases

| Question | Expected Answer | RAG Answer | Problem |
| --- | --- | --- | --- |
| "Tesla stock price in 2022?" | $100-400 range | "Currently $248" | Ignores timeframe |
| "Last year vs this year Apple earnings?" | Comparison analysis | Mixed data | Time confusion |
| "Who is Twitter's CEO?" | "Linda Yaccarino" (2024) | "Elon Musk" (2022) | Stale data |
| "COVID cases status" | Latest data | 2021 peak data | Recency failure |
| "Company policy back then?" | Historical policy | Current policy | Can't track past |

Why Does This Happen?

The Fundamental Limitation of Embeddings

Vector embeddings capture semantic similarity only. Nothing in the vector records when a document was written or which period it describes.

python
# text1 and text2 are identical, so their embeddings are identical too
text1 = "Sam Altman is the CEO of OpenAI"  # 2024 document
text2 = "Sam Altman is the CEO of OpenAI"  # 2020 document

# A document from a different period still scores high
text3 = "Mira Murati is the CEO of OpenAI"  # Nov 2023 document

# Embeddings don't know 'when' (pseudocode; the scores are illustrative)
# similarity(embed(text1), embed(text2)) ≈ 1.0   # Same content
# similarity(embed(text1), embed(text3)) ≈ 0.85  # Both relevant to CEO question
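
To make the point concrete without a real model, here is a minimal sketch that uses a bag-of-words vector as a toy stand-in for an embedding. The `embed` and `cosine` helpers and the document dates are assumptions for illustration only. A real embedding model produces different numbers, but the date still never enters the vector.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

docs = [
    {"text": "Sam Altman is the CEO of OpenAI", "date": "2020-06-01"},
    {"text": "Sam Altman is the CEO of OpenAI", "date": "2024-01-15"},
    {"text": "Mira Murati is the CEO of OpenAI", "date": "2023-11-18"},
]

# Only the text is embedded; the "date" field is invisible to the retriever
query = embed("Who is the CEO of OpenAI")
scores = [cosine(query, embed(d["text"])) for d in docs]
for doc, score in zip(docs, scores):
    print(doc["date"], round(score, 3))
```

In this toy setup all three documents receive the same score, so the retriever has no basis for choosing the one that matches the date in the question.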

Types of Temporal Questions

1. Point-in-Time Questions

  • "What was Q3 2023 revenue?"
  • "What was the policy back then?"
  • "What happened this time last year?"

2. Time Range Questions

  • "Changes from 2020 to 2023"
  • "Trends over the last 3 months"
  • "What's different this year?"

3. Relative Time Questions

  • "Recent news" (when exactly?)
  • "How was it before?" (how far back?)
  • "What changed since then?"

4. Temporal Comparison Questions

  • "Year-over-year growth rate"
  • "Before vs after policy change"
  • "Performance before and after CEO change"

5. Time Series Questions

  • "Quarterly revenue trends"
  • "Annual user growth"
  • "Monthly traffic changes"

Solution 1: Metadata Filtering

The most basic approach. Add time metadata to documents and filter during search.

Implementation

python
from datetime import datetime, timedelta
from typing import List, Optional
import chromadb

class TemporalVectorStore:
    """Time-aware vector store"""

    def __init__(self):
        self.client = chromadb.Client()
        # get_or_create avoids an error when the collection already exists
        self.collection = self.client.get_or_create_collection("temporal_docs")

    def add_document(self, doc_id: str, text: str, timestamp: datetime,
                     source: Optional[str] = None):
        """Add document with time metadata"""
        self.collection.add(
            ids=[doc_id],
            documents=[text],
            metadatas=[{
                "timestamp": timestamp.isoformat(),
                # Chroma's range operators ($gte, $lte) expect numeric values,
                # so store a numeric copy of the timestamp for filtering
                "timestamp_epoch": timestamp.timestamp(),
                "year": timestamp.year,
                "month": timestamp.month,
                "quarter": (timestamp.month - 1) // 3 + 1,
                "source": source or "unknown"
            }]
        )

    def query_with_time_filter(
        self,
        query: str,
        start_date: Optional[datetime] = None,
        end_date: Optional[datetime] = None,
        top_k: int = 5
    ) -> dict:
        """Time-filtered search; returns Chroma's query result dict"""

        where_filter = None

        if start_date and end_date:
            where_filter = {
                "$and": [
                    {"timestamp_epoch": {"$gte": start_date.timestamp()}},
                    {"timestamp_epoch": {"$lte": end_date.timestamp()}}
                ]
            }
        elif start_date:
            where_filter = {"timestamp_epoch": {"$gte": start_date.timestamp()}}
        elif end_date:
            where_filter = {"timestamp_epoch": {"$lte": end_date.timestamp()}}

        results = self.collection.query(
            query_texts=[query],
            n_results=top_k,
            where=where_filter
        )

        return results

Temporal Expression Parsing

python
import re
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta

class TemporalQueryParser:
    """Extract temporal information from queries"""

    def parse(self, query: str, reference_date: datetime = None) -> dict:
        """Extract time range from query"""
        if reference_date is None:
            reference_date = datetime.now()

        result = {
            "original_query": query,
            "start_date": None,
            "end_date": None,
            "temporal_type": "none"
        }

        # Absolute year (word-bounded 19xx/20xx, so "1234 items" is not read as a year)
        year_match = re.search(r'\b((?:19|20)\d{2})\b', query)
        if year_match:
            year = int(year_match.group(1))
            result["start_date"] = datetime(year, 1, 1)
            # End of Dec 31, so documents dated on the last day are included
            result["end_date"] = datetime(year, 12, 31, 23, 59, 59)
            result["temporal_type"] = "absolute_year"
            return result

        # Recent N days/months
        recent = re.search(r'\b(last|recent|past)\s*(\d+)\s*(day|month)s?\b', query, re.I)
        if recent:
            amount = int(recent.group(2))
            if recent.group(3).lower() == "day":
                delta = timedelta(days=amount)
            else:
                delta = relativedelta(months=amount)
            result["start_date"] = reference_date - delta
            result["end_date"] = reference_date
            result["temporal_type"] = "relative_recent"
            return result

        # Last year/this year
        if 'last year' in query.lower():
            last_year = reference_date.year - 1
            result["start_date"] = datetime(last_year, 1, 1)
            result["end_date"] = datetime(last_year, 12, 31, 23, 59, 59)
            result["temporal_type"] = "relative_year"
            return result

        if 'this year' in query.lower():
            result["start_date"] = datetime(reference_date.year, 1, 1)
            result["end_date"] = reference_date
            result["temporal_type"] = "relative_year"
            return result

        # Current/now (word boundaries keep "know" from matching "now")
        if re.search(r'\b(current|currently|now|today)\b', query, re.I):
            # A 7-day window is a heuristic; widen it for slow-moving corpora
            result["start_date"] = reference_date - timedelta(days=7)
            result["end_date"] = reference_date
            result["temporal_type"] = "current"
            return result

        return result
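
The parser above does not cover quarter expressions such as "Q3 2023", even though they appear among the point-in-time examples. Below is a minimal sketch of one way to handle them with the standard library. `parse_quarter` is a hypothetical helper, not part of the class above, and it assumes calendar quarters rather than a company's fiscal quarters.

```python
import re
from datetime import datetime, timedelta
from typing import Optional, Tuple

def parse_quarter(query: str) -> Optional[Tuple[datetime, datetime]]:
    """Return (start, end) for expressions like 'Q3 2023', or None if absent."""
    match = re.search(r'\bQ([1-4])\s*((?:19|20)\d{2})\b', query, re.I)
    if not match:
        return None
    quarter, year = int(match.group(1)), int(match.group(2))
    start_month = 3 * (quarter - 1) + 1
    start = datetime(year, start_month, 1)
    # First day of the following quarter (rolls into the next year after Q4)
    if quarter == 4:
        next_start = datetime(year + 1, 1, 1)
    else:
        next_start = datetime(year, start_month + 3, 1)
    # End on the last second of the quarter so the range is inclusive
    end = next_start - timedelta(seconds=1)
    return start, end

print(parse_quarter("What was Q3 2023 revenue?"))
```

Running this check before the absolute-year check would keep "Q3 2023" from being widened to the whole of 2023.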

Limitations

Metadata filtering is simple but has limitations:

  1. Hard filtering: Documents just outside the boundary are completely excluded
  2. Sparsity problem: No results if no documents exist in the specified period
  3. Complex expressions: Hard to handle "early 2020s" type expressions
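
One way to soften the first two limitations is to widen the window step by step when too few documents match. The sketch below works on an in-memory list rather than a vector store. The function name, the 30-day step, and the document shape are all assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List

def filter_with_fallback(docs: List[Dict], start: datetime, end: datetime,
                         min_results: int = 1, step_days: int = 30,
                         max_widenings: int = 3) -> List[Dict]:
    """Filter docs to [start, end]; widen the window if too few match."""
    hits: List[Dict] = []
    for attempt in range(max_widenings + 1):
        # attempt 0 is the exact window; each retry adds step_days on both sides
        margin = timedelta(days=step_days * attempt)
        hits = [d for d in docs
                if start - margin <= d["timestamp"] <= end + margin]
        if len(hits) >= min_results:
            break
    return hits

docs = [
    {"id": "a", "timestamp": datetime(2023, 1, 10)},
    {"id": "b", "timestamp": datetime(2023, 3, 20)},
]

# Nothing is dated February 2023, so the window grows once and
# picks up the nearest documents on either side
hits = filter_with_fallback(docs, datetime(2023, 2, 1), datetime(2023, 2, 28))
print([d["id"] for d in hits])
```

When the fallback fires, it is worth telling the user that the answer draws on documents outside the requested period.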

Solution 2: Temporal Decay

Assign higher weights to recent documents.

Implementation

python
import numpy as np
from datetime import datetime
from typing import List

class TemporalDecayScorer:
    """Time-based score decay"""

    def __init__(self, half_life_days: int = 30):
        """
        half_life_days: Period for score to halve
        Example: 30 days means a 30-day-old document gets 50% score
        """
        self.half_life_days = half_life_days
        self.decay_rate = np.log(2) / half_life_days

    def exponential_decay(self, doc_date: datetime,
                          reference_date: datetime = None) -> float:
        """Exponential decay function"""
        if reference_date is None:
            reference_date = datetime.now()

        # Clamp at 0 so future-dated documents never score above 1.0
        age_days = max((reference_date - doc_date).days, 0)
        return np.exp(-self.decay_rate * age_days)

    def gaussian_decay(self, doc_date: datetime,
                       target_date: datetime,
                       sigma_days: int = 30) -> float:
        """
        Gaussian decay - peaks near specific point
        Suitable for point-in-time questions
        """
        diff_days = abs((target_date - doc_date).days)
        return np.exp(-(diff_days ** 2) / (2 * sigma_days ** 2))

    def apply_temporal_score(
        self,
        results: List[dict],
        query_type: str = "recent",
        target_date: datetime = None
    ) -> List[dict]:
        """Apply temporal scoring to search results"""

        scored_results = []

        for result in results:
            doc_date = datetime.fromisoformat(result['metadata']['timestamp'])
            semantic_score = result.get('score', 1.0)

            if query_type == "recent":
                # Prefer recent documents
                temporal_score = self.exponential_decay(doc_date)
            elif query_type == "point_in_time" and target_date:
                # Prefer documents near specific point
                temporal_score = self.gaussian_decay(doc_date, target_date)
            else:
                temporal_score = 1.0

            # Final score = semantic score * temporal score
            final_score = semantic_score * temporal_score

            scored_results.append({
                **result,
                'semantic_score': semantic_score,
                'temporal_score': temporal_score,
                'final_score': final_score
            })

        # Re-sort by final score
        scored_results.sort(key=lambda x: x['final_score'], reverse=True)

        return scored_results
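
To see what the two decay functions do to a ranking, here is a small worked example using plain `math` instead of NumPy. The formulas are the ones in the class above. The semantic scores (0.9 and 0.7) and document ages are made-up inputs.

```python
import math

def exponential_decay(age_days: float, half_life_days: float = 30) -> float:
    """Score multiplier that halves every half_life_days."""
    return math.exp(-math.log(2) / half_life_days * age_days)

def gaussian_decay(diff_days: float, sigma_days: float = 30) -> float:
    """Score multiplier that peaks when the document date equals the target date."""
    return math.exp(-(diff_days ** 2) / (2 * sigma_days ** 2))

# Half-life check: documents that are 0, 30, and 60 days old
for age in (0, 30, 60):
    print(age, round(exponential_decay(age), 3))

# A stale but well-matched document versus a fresh, weaker match
old_doc = 0.9 * exponential_decay(90)   # semantic score 0.9, 90 days old
new_doc = 0.7 * exponential_decay(5)    # semantic score 0.7, 5 days old
print(new_doc > old_doc)
```

With a 30-day half-life, a 90-day-old document keeps only one eighth of its semantic score, so the fresher document wins despite the weaker match. That is the desired behavior for "latest news" queries. For questions about a past date it is exactly wrong, which is why the class switches to the Gaussian form there.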

Solution 3: Time-Aware Embedding

Encode time information in the embedding itself.

Method 1: Add Time Tokens

python
class TimeAwareEmbedder:
    """Embed with time context in text"""

    def __init__(self, embedding_model):
        self.model = embedding_model

    def add_temporal_context(self, text: str, timestamp: datetime) -> str:
        """Add time context to text"""
        time_prefix = f"[DATE: {timestamp.strftime('%Y-%m-%d')}] "
        return time_prefix + text

    def embed_with_time(self, text: str, timestamp: datetime) -> np.ndarray:
        """Generate embedding with temporal context"""
        temporal_text = self.add_temporal_context(text, timestamp)
        return self.model.encode(temporal_text)

Method 2: Combine Time Embedding

python
class TemporalEmbedding:
    """Combine text embedding + time embedding"""

    def __init__(self, text_dim: int = 768, time_dim: int = 6):
        self.text_dim = text_dim
        self.time_dim = time_dim  # Must match the number of features in encode_time

    def encode_time(self, timestamp: datetime) -> np.ndarray:
        """Encode time as vector"""
        features = np.array([
            timestamp.year / 3000,  # Normalize (adjacent years differ by only ~0.0003)
            timestamp.month / 12,
            timestamp.day / 31,
            timestamp.hour / 24,
            timestamp.weekday() / 7,
            timestamp.timetuple().tm_yday / 366
        ])
        return features

    def combine_embeddings(self, text_emb: np.ndarray,
                           time_emb: np.ndarray,
                           alpha: float = 0.1) -> np.ndarray:
        """Combine text and time embeddings"""
        combined = np.concatenate([
            text_emb * (1 - alpha),
            time_emb * alpha
        ])
        return combined / np.linalg.norm(combined)
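
The linear features above place December and January far apart, even though they are adjacent months. A common alternative, shown here as a sketch and not as the method used above, encodes cyclic fields as sine/cosine pairs and keeps the year linear. The function names and the `base_year` and `span_years` defaults are assumptions.

```python
import math
from datetime import datetime
from typing import List

def encode_time_cyclical(ts: datetime, base_year: int = 2000,
                         span_years: int = 50) -> List[float]:
    """Linear year term plus sin/cos pairs for month and day of year."""
    month_angle = 2 * math.pi * (ts.month - 1) / 12
    day_angle = 2 * math.pi * (ts.timetuple().tm_yday - 1) / 366
    return [
        (ts.year - base_year) / span_years,   # linear: later years get larger values
        math.sin(month_angle), math.cos(month_angle),
        math.sin(day_angle), math.cos(day_angle),
    ]

def month_distance(a: datetime, b: datetime) -> float:
    """Euclidean distance between the month components only."""
    ea, eb = encode_time_cyclical(a), encode_time_cyclical(b)
    return math.dist(ea[1:3], eb[1:3])

dec = datetime(2023, 12, 15)
jan = datetime(2024, 1, 15)
jun = datetime(2024, 6, 15)

# December sits next to January on the circle and opposite June
print(month_distance(dec, jan) < month_distance(dec, jun))
```

The year term stays linear because years do not wrap around. Only the seasonal fields benefit from the circular encoding.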

Solution 4: Temporal Reranking

Use an LLM to re-evaluate temporal relevance after retrieval.

Implementation

python
class TemporalReranker:
    """LLM-based temporal-aware reranking"""

    def __init__(self, llm_client):
        self.llm = llm_client

    def rerank(self, query: str, documents: List[dict],
               temporal_context: dict) -> List[dict]:
        """Rerank considering temporal context"""

        prompt = f"""Given the query and temporal context, rank these documents by relevance.

Query: {query}
Temporal Context: {temporal_context}

Documents:
"""
        for i, doc in enumerate(documents):
            prompt += f"""
[{i+1}] Date: {doc['metadata']['timestamp']}
Content: {doc['text'][:500]}...
"""

        prompt += """
For each document, provide:
1. Temporal relevance score (0-1): How well does the document's date match the query's temporal intent?
2. Content relevance score (0-1): How relevant is the content?
3. Final ranking

Output as JSON array."""

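        # llm_client is assumed to expose generate(prompt) -> str.
        # _parse_rankings and _apply_rankings are not shown here; they parse
        # the JSON output and reorder the documents accordingly.
        response = self.llm.generate(prompt)
        rankings = self._parse_rankings(response)

        return self._apply_rankings(documents, rankings)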
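
The two helper methods called at the end of `rerank` are not defined above. The following is one possible sketch, written as standalone functions. It assumes the LLM returns a JSON array of objects with `index`, `temporal_score`, and `content_score` keys. That output shape is an assumption, since the prompt does not pin it down, and combining the scores by multiplication is likewise an illustrative choice.

```python
import json
from typing import Dict, List

def parse_rankings(response: str) -> List[Dict]:
    """Extract the JSON array from an LLM response; return [] if none parses."""
    start, end = response.find("["), response.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        parsed = json.loads(response[start:end + 1])
    except json.JSONDecodeError:
        return []
    return [r for r in parsed if isinstance(r, dict) and "index" in r]

def apply_rankings(documents: List[Dict], rankings: List[Dict]) -> List[Dict]:
    """Reorder documents by temporal_score * content_score, highest first."""
    scored, seen = [], set()
    for r in rankings:
        i = int(r["index"]) - 1          # the prompt numbers documents from 1
        if 0 <= i < len(documents) and i not in seen:
            score = float(r.get("temporal_score", 0)) * float(r.get("content_score", 0))
            scored.append((score, i))
            seen.add(i)
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [documents[i] for _, i in scored]
    # Documents the LLM skipped keep their original order at the end
    ranked += [d for i, d in enumerate(documents) if i not in seen]
    return ranked

docs = [{"text": "2021 policy"}, {"text": "2023 policy"}]
response = '''Here is the ranking:
[{"index": 1, "temporal_score": 0.2, "content_score": 0.9},
 {"index": 2, "temporal_score": 0.95, "content_score": 0.8}]'''
reranked = apply_rankings(docs, parse_rankings(response))
print([d["text"] for d in reranked])
```

Falling back to the original order when parsing fails keeps a malformed LLM response from dropping documents.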

Solution 5: Temporal Knowledge Graph

Build a Knowledge Graph with a time axis.

Concept

text
Traditional KG: (Sam Altman) --[CEO_OF]--> (OpenAI)

Temporal KG: (Sam Altman) --[CEO_OF {start: 2019, end: 2023-11-17}]--> (OpenAI)
             (Mira Murati) --[CEO_OF {start: 2023-11-17, end: 2023-11-20}]--> (OpenAI)
             (Emmett Shear) --[CEO_OF {start: 2023-11-20, end: 2023-11-22}]--> (OpenAI)
             (Sam Altman) --[CEO_OF {start: 2023-11-22, end: null}]--> (OpenAI)

Implementation

python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, List

@dataclass
class TemporalTriple:
    """Triple with temporal information"""
    subject: str
    predicate: str
    object: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None = currently valid
    confidence: float = 1.0
    source: str = ""

class TemporalKnowledgeGraph:
    """Time-aware Knowledge Graph"""

    def __init__(self):
        self.triples: List[TemporalTriple] = []
        self.entity_index = {}  # entity -> triples
        self.time_index = {}    # (year, month) -> triples

    def add_triple(self, triple: TemporalTriple):
        """Add and index triple"""
        self.triples.append(triple)

        # Entity index
        for entity in [triple.subject, triple.object]:
            if entity not in self.entity_index:
                self.entity_index[entity] = []
            self.entity_index[entity].append(triple)

    def query_at_time(self, entity: str, predicate: str,
                      at_time: datetime) -> List[TemporalTriple]:
        """Query triples valid at a specific time"""
        results = []

        if entity in self.entity_index:
            for triple in self.entity_index[entity]:
                if predicate and triple.predicate != predicate:
                    continue

                # Check temporal validity (half-open: valid_from <= t < valid_to),
                # so a handover date matches only the incoming fact
                if triple.valid_from <= at_time:
                    if triple.valid_to is None or at_time < triple.valid_to:
                        results.append(triple)

        return results

    def query_history(self, entity: str, predicate: str) -> List[TemporalTriple]:
        """Query full history of an entity's facts"""
        results = []

        if entity in self.entity_index:
            for triple in self.entity_index[entity]:
                if predicate is None or triple.predicate == predicate:
                    results.append(triple)

        # Sort by time
        results.sort(key=lambda x: x.valid_from)
        return results


# Usage example
tkg = TemporalKnowledgeGraph()

# Add OpenAI CEO history
tkg.add_triple(TemporalTriple(
    subject="Sam Altman",
    predicate="CEO_OF",
    object="OpenAI",
    valid_from=datetime(2019, 3, 1),
    valid_to=datetime(2023, 11, 17),
    source="news_001"
))

tkg.add_triple(TemporalTriple(
    subject="Mira Murati",
    predicate="CEO_OF",
    object="OpenAI",
    valid_from=datetime(2023, 11, 17),
    valid_to=datetime(2023, 11, 20),
    source="news_002"
))

# Query
print("OpenAI CEO on November 18, 2023:")
result = tkg.query_at_time("OpenAI", "CEO_OF", datetime(2023, 11, 18))
for triple in result:
    print(f"  {triple.subject}")  # Mira Murati
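
Entering `valid_to` by hand for every fact is error-prone. For single-valued relations such as CEO_OF, one option is to close the currently open interval automatically whenever a successor is added. The sketch below is self-contained and uses its own small `Fact` dataclass. `assert_fact` and `holder_at` are hypothetical names, and the dates are the ones from the concept diagram.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None = still valid

def assert_fact(facts: List[Fact], new: Fact) -> None:
    """Add a fact for a single-valued predicate, closing the open predecessor."""
    for fact in facts:
        same_slot = fact.predicate == new.predicate and fact.object == new.object
        if same_slot and fact.valid_to is None and fact.valid_from <= new.valid_from:
            fact.valid_to = new.valid_from    # predecessor ends when the successor starts
    facts.append(new)

def holder_at(facts: List[Fact], predicate: str, obj: str, at: datetime) -> List[str]:
    """Subjects whose fact covers `at` (half-open interval: from <= at < to)."""
    return [f.subject for f in facts
            if f.predicate == predicate and f.object == obj
            and f.valid_from <= at and (f.valid_to is None or at < f.valid_to)]

facts: List[Fact] = []
assert_fact(facts, Fact("Sam Altman", "CEO_OF", "OpenAI", datetime(2019, 3, 1)))
assert_fact(facts, Fact("Mira Murati", "CEO_OF", "OpenAI", datetime(2023, 11, 17)))
assert_fact(facts, Fact("Emmett Shear", "CEO_OF", "OpenAI", datetime(2023, 11, 20)))
assert_fact(facts, Fact("Sam Altman", "CEO_OF", "OpenAI", datetime(2023, 11, 22)))

print(holder_at(facts, "CEO_OF", "OpenAI", datetime(2023, 11, 18)))
print(holder_at(facts, "CEO_OF", "OpenAI", datetime(2024, 1, 1)))
```

This only suits relations with one holder at a time. Multi-valued relations such as board membership need explicit end dates.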

Real Usage Examples

Example 1: CEO Change History

python
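# Assumes a TemporalRAG pipeline class that wires together the vector store,
# knowledge graph, LLM, and embedder from the sections above
rag = TemporalRAG(vector_store, kg, llm, embedder)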

# Question 1: Specific past point
result = rag.query("Who was OpenAI's CEO on November 18, 2023?")
print(result["answer"])
# Output: "On November 18, 2023, OpenAI's CEO was Mira Murati.
#          She was appointed as interim CEO after Sam Altman was fired on November 17,
#          and was later replaced by Emmett Shear on November 20."

# Question 2: Current
result = rag.query("Who is OpenAI's CEO now?")
print(result["answer"])
# Output: "As of January 2024, OpenAI's CEO is Sam Altman.
#          He returned on November 22, 2023 and remains in the position."

# Question 3: History
result = rag.query("Has OpenAI's CEO ever changed?")
print(result["answer"])
# Output: "Yes, OpenAI's CEO has changed multiple times.
#          - Sam Altman (Mar 2019 ~ Nov 17, 2023)
#          - Mira Murati interim CEO (Nov 17-20, 2023)
#          - Emmett Shear interim CEO (Nov 20-22, 2023)
#          - Sam Altman returns (Nov 22, 2023 ~ present)"

Example 2: Financial Data Time Series

python
# Question: Comparison analysis
result = rag.query("Compare Tesla's 2022 vs 2023 revenue")
print(result["answer"])
# Output: "Tesla annual revenue comparison:
#          - 2022: $81.5B (51% YoY increase)
#          - 2023: $96.8B (19% YoY increase)
#          Growth continued in 2023 but at a slower rate."

# Question: Specific quarter
result = rag.query("What was Tesla's Q3 2023 performance?")
print(result["answer"])
# Output: "Tesla Q3 2023 results:
#          - Revenue: $23.4B
#          - Net income: $1.9B
#          - Deliveries: 435,059 vehicles"

Performance Optimization Tips

1. Time Index Partitioning

python
import chromadb

client = chromadb.Client()

# Separate collections by year
collections = {
    year: client.get_or_create_collection(f"docs_{year}")
    for year in (2022, 2023, 2024)
}

# Query only relevant years
def query_by_year(query, year, top_k=5):
    if year not in collections:
        return None  # No partition exists for this year
    return collections[year].query(query_texts=[query], n_results=top_k)

2. Time-based Caching

python
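import hashlib

cache = {}  # In-memory stand-in; use Redis or similar in production

# Cache by query text and time range; search_fn is whatever performs retrieval
def cached_search(query, start_date, end_date, search_fn):
    query_hash = hashlib.sha256(query.encode()).hexdigest()
    cache_key = f"{query_hash}_{start_date}_{end_date}"
    if cache_key not in cache:
        cache[cache_key] = search_fn(query, start_date, end_date)
    return cache[cache_key]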

3. Incremental Indexing

python
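# Add only new documents (avoid full reindexing)
def incremental_index(new_docs, last_indexed_time):
    fresh = [doc for doc in new_docs if doc.timestamp > last_indexed_time]
    for doc in fresh:
        vector_store.add(doc)

    # Update Knowledge Graph too (only with the documents just indexed)
    kg.update_from_docs(fresh)

    # Return the new high-water mark so the next run skips these documents
    return max((doc.timestamp for doc in fresh), default=last_indexed_time)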

Summary

Core Problems

  • Vector embeddings don't encode time information
  • Can't understand time expressions like "recent", "back then", "current"
  • Can't track fact changes over time

Solution Comparison

| Method | Pros | Cons | Best For |
| --- | --- | --- | --- |
| Metadata Filtering | Simple, fast | Hard filtering, boundary issues | Clear time range questions |
| Temporal Decay | Natural recency preference | Not suitable for past point questions | "Latest news" type |
| Time-Aware Embedding | Fundamental solution | Requires training, complex | Large-scale systems |
| Temporal Reranking | High accuracy | LLM cost, slow | High accuracy requirements |
| Temporal KG | Perfect fact change tracking | High build cost | Structured knowledge domains |

Recommended Combination

  1. Quick start: Metadata Filtering + Temporal Decay
  2. Balanced: Above + Temporal Reranking
  3. Complete solution: All techniques + Temporal KG

Next Steps

  • Multi-hop Temporal Reasoning
  • Event-based Temporal Indexing
  • Temporal Question Decomposition
