Data & Analytics

시계열 분석 SQL: 이동평균, YoY, MoM 트렌드 완전 정복

매출 트렌드가 보이지 않는다면? 이동평균, 전년비, 전월비를 SQL로 구현하는 법.

시계열 분석 SQL: 이동평균, YoY, MoM 트렌드 완전 정복

시계열 분석 SQL: 이동평균, YoY, MoM 트렌드 완전 정복

매출 트렌드가 보이지 않는다면? 이동평균, 전년비, 전월비를 SQL로 구현하는 법.

TL;DR

  • 시계열 분석: 시간에 따른 데이터 패턴 분석
  • 이동평균: 노이즈 제거, 트렌드 파악
  • YoY/MoM: 전년비, 전월비로 성장률 측정
  • Window 함수: LAG, LEAD, AVG OVER로 구현

1. 왜 시계열 분석이 필요한가?

Raw 데이터의 문제

sql
1월 1일: 100만원
1월 2일: 150만원
1월 3일: 80만원
1월 4일: 200만원

질문: 매출이 오르는 건가, 내리는 건가?

→ 일별 변동이 심해서 트렌드가 안 보임!

해결책

  • 이동평균: 변동 스무딩
  • YoY/MoM: 같은 기간 대비 비교
  • 누적합: 전체 흐름 파악

2. 기본 테이블 구조

sql
-- 일별 매출 테이블
CREATE TABLE daily_sales (
    date DATE PRIMARY KEY,
    revenue DECIMAL(15, 2),
    orders INT,
    users INT
);

-- 또는 트랜잭션 테이블에서 집계
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    user_id INT,
    amount DECIMAL(10, 2),
    created_at TIMESTAMP
);

3. 이동평균 (Moving Average)

7일 이동평균

sql
SELECT
    date,
    revenue,
    ROUND(AVG(revenue) OVER (
        ORDER BY date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ), 2) as ma_7day,
    ROUND(AVG(revenue) OVER (
        ORDER BY date
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
    ), 2) as ma_30day
FROM daily_sales
ORDER BY date;

결과:

daterevenuema_7dayma_30day
2024-01-01100000100000100000
2024-01-02120000110000110000
2024-01-07150000128571125000
2024-01-0890000121429122500

중심 이동평균 (Center Moving Average)

sql
SELECT
    date,
    revenue,
    ROUND(AVG(revenue) OVER (
        ORDER BY date
        ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    ), 2) as centered_ma_7
FROM daily_sales
ORDER BY date;

4. 전일비 / 전주비 / 전월비

일별 성장률

sql
SELECT
    date,
    revenue,
    LAG(revenue, 1) OVER (ORDER BY date) as prev_day,
    LAG(revenue, 7) OVER (ORDER BY date) as prev_week,
    ROUND((revenue - LAG(revenue, 1) OVER (ORDER BY date))
        / NULLIF(LAG(revenue, 1) OVER (ORDER BY date), 0) * 100, 2) as dod_growth,
    ROUND((revenue - LAG(revenue, 7) OVER (ORDER BY date))
        / NULLIF(LAG(revenue, 7) OVER (ORDER BY date), 0) * 100, 2) as wow_growth
FROM daily_sales
ORDER BY date;

결과:

daterevenueprev_dayprev_weekdod_growthwow_growth
2024-01-0815000012000010000025.0050.00
2024-01-09140000150000110000-6.6727.27

5. 월별 집계와 MoM

월별 매출과 전월비

sql
WITH monthly_sales AS (
    SELECT
        DATE_TRUNC('month', date) as month,
        SUM(revenue) as revenue,
        SUM(orders) as orders
    FROM daily_sales
    GROUP BY DATE_TRUNC('month', date)
)
SELECT
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) as prev_month,
    ROUND((revenue - LAG(revenue) OVER (ORDER BY month))
        / NULLIF(LAG(revenue) OVER (ORDER BY month), 0) * 100, 2) as mom_growth,
    SUM(revenue) OVER (ORDER BY month) as ytd_revenue
FROM monthly_sales
ORDER BY month;

결과:

monthrevenueprev_monthmom_growthytd_revenue
2024-013000000NULLNULL3000000
2024-02320000030000006.676200000
2024-03350000032000009.389700000

6. YoY (전년 동기 대비)

전년 동월 대비

sql
WITH monthly_sales AS (
    SELECT
        DATE_TRUNC('month', date) as month,
        EXTRACT(YEAR FROM date) as year,
        EXTRACT(MONTH FROM date) as month_num,
        SUM(revenue) as revenue
    FROM daily_sales
    GROUP BY DATE_TRUNC('month', date), EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date)
)
SELECT
    month,
    revenue,
    LAG(revenue, 12) OVER (ORDER BY month) as same_month_last_year,
    ROUND((revenue - LAG(revenue, 12) OVER (ORDER BY month))
        / NULLIF(LAG(revenue, 12) OVER (ORDER BY month), 0) * 100, 2) as yoy_growth
FROM monthly_sales
ORDER BY month;

전년 동기 대비 (더 정확한 방법)

sql
WITH monthly_sales AS (
    SELECT
        EXTRACT(YEAR FROM date) as year,
        EXTRACT(MONTH FROM date) as month,
        SUM(revenue) as revenue
    FROM daily_sales
    GROUP BY EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date)
)
SELECT
    a.year,
    a.month,
    a.revenue as current_year,
    b.revenue as last_year,
    ROUND((a.revenue - b.revenue) / NULLIF(b.revenue, 0) * 100, 2) as yoy_growth
FROM monthly_sales a
LEFT JOIN monthly_sales b
    ON a.month = b.month
    AND a.year = b.year + 1
ORDER BY a.year, a.month;

7. 누적합과 달성률

월 목표 대비 누적 달성률

sql
WITH daily_with_target AS (
    SELECT
        date,
        revenue,
        DATE_TRUNC('month', date) as month,
        1000000 as monthly_target  -- 월 목표 100만
    FROM daily_sales
),
cumulative AS (
    SELECT
        date,
        revenue,
        month,
        monthly_target,
        SUM(revenue) OVER (
            PARTITION BY month
            ORDER BY date
        ) as mtd_revenue,
        EXTRACT(DAY FROM date) as day_of_month,
        EXTRACT(DAY FROM (month + INTERVAL '1 month' - INTERVAL '1 day')) as days_in_month
    FROM daily_with_target
)
SELECT
    date,
    revenue,
    mtd_revenue,
    monthly_target,
    ROUND(mtd_revenue * 100.0 / monthly_target, 2) as achievement_pct,
    -- 예상 월말 매출
    ROUND(mtd_revenue * days_in_month / day_of_month, 0) as projected_month_end
FROM cumulative
ORDER BY date;

8. 계절성 분석

요일별 패턴

sql
SELECT
    EXTRACT(DOW FROM date) as day_of_week,
    TO_CHAR(date, 'Day') as day_name,
    ROUND(AVG(revenue), 0) as avg_revenue,
    ROUND(AVG(revenue) / SUM(AVG(revenue)) OVER () * 7 * 100, 2) as index_vs_avg
FROM daily_sales
GROUP BY EXTRACT(DOW FROM date), TO_CHAR(date, 'Day')
ORDER BY EXTRACT(DOW FROM date);

결과:

day_of_weekday_nameavg_revenueindex_vs_avg
0Sunday8000070.00
1Monday120000105.00
5Friday150000131.25
6Saturday9000078.75

→ 금요일이 평균 대비 131%, 일요일은 70%

월별 패턴

sql
SELECT
    EXTRACT(MONTH FROM date) as month,
    TO_CHAR(date, 'Month') as month_name,
    ROUND(AVG(revenue), 0) as avg_daily_revenue,
    ROUND(AVG(revenue) / AVG(AVG(revenue)) OVER () * 100, 2) as seasonality_index
FROM daily_sales
GROUP BY EXTRACT(MONTH FROM date), TO_CHAR(date, 'Month')
ORDER BY EXTRACT(MONTH FROM date);

9. 트렌드 분리

이동평균으로 트렌드와 노이즈 분리

sql
WITH trend_analysis AS (
    SELECT
        date,
        revenue,
        AVG(revenue) OVER (
            ORDER BY date
            ROWS BETWEEN 14 PRECEDING AND 14 FOLLOWING
        ) as trend,
        revenue - AVG(revenue) OVER (
            ORDER BY date
            ROWS BETWEEN 14 PRECEDING AND 14 FOLLOWING
        ) as residual
    FROM daily_sales
)
SELECT
    date,
    revenue,
    ROUND(trend, 0) as trend,
    ROUND(residual, 0) as residual,
    ROUND(residual * 100.0 / NULLIF(trend, 0), 2) as residual_pct
FROM trend_analysis
WHERE trend IS NOT NULL
ORDER BY date;

10. 성장률 복합 분석

CAGR (연평균 복합 성장률)

sql
WITH yearly_sales AS (
    SELECT
        EXTRACT(YEAR FROM date) as year,
        SUM(revenue) as annual_revenue
    FROM daily_sales
    GROUP BY EXTRACT(YEAR FROM date)
),
cagr_calc AS (
    SELECT
        MIN(year) as start_year,
        MAX(year) as end_year,
        MIN(annual_revenue) as start_revenue,
        MAX(annual_revenue) as end_revenue,
        MAX(year) - MIN(year) as years
    FROM yearly_sales
)
SELECT
    start_year,
    end_year,
    start_revenue,
    end_revenue,
    years,
    ROUND((POWER(end_revenue * 1.0 / start_revenue, 1.0 / years) - 1) * 100, 2) as cagr_pct
FROM cagr_calc;

11. 예측과 목표 설정

이전 트렌드 기반 예측

sql
WITH recent_trend AS (
    SELECT
        date,
        revenue,
        -- 최근 30일 평균 일일 성장
        AVG(revenue - LAG(revenue) OVER (ORDER BY date)) OVER (
            ORDER BY date
            ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING
        ) as avg_daily_growth
    FROM daily_sales
)
SELECT
    date,
    revenue,
    ROUND(avg_daily_growth, 0) as avg_daily_growth,
    -- 다음 7일 예측
    ROUND(revenue + avg_daily_growth * 7, 0) as forecast_7d
FROM recent_trend
WHERE date = (SELECT MAX(date) FROM daily_sales);

12. 실전 대시보드 쿼리

종합 시계열 리포트

sql
WITH daily_metrics AS (
    SELECT
        date,
        revenue,
        orders,
        revenue / NULLIF(orders, 0) as aov,
        LAG(revenue, 1) OVER (ORDER BY date) as prev_day_rev,
        LAG(revenue, 7) OVER (ORDER BY date) as prev_week_rev,
        LAG(revenue, 365) OVER (ORDER BY date) as prev_year_rev,
        AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as ma_7,
        AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 29 PRECEDING AND CURRENT ROW) as ma_30
    FROM daily_sales
)
SELECT
    date,
    revenue,
    orders,
    ROUND(aov, 0) as aov,
    ROUND(ma_7, 0) as ma_7day,
    ROUND(ma_30, 0) as ma_30day,
    ROUND((revenue - prev_day_rev) / NULLIF(prev_day_rev, 0) * 100, 1) as dod_pct,
    ROUND((revenue - prev_week_rev) / NULLIF(prev_week_rev, 0) * 100, 1) as wow_pct,
    ROUND((revenue - prev_year_rev) / NULLIF(prev_year_rev, 0) * 100, 1) as yoy_pct,
    CASE
        WHEN revenue > ma_7 * 1.2 THEN 'Spike'
        WHEN revenue < ma_7 * 0.8 THEN 'Drop'
        ELSE 'Normal'
    END as anomaly_flag
FROM daily_metrics
ORDER BY date DESC
LIMIT 30;

결론

분석 유형SQL 핵심용도
이동평균AVG OVER (ROWS BETWEEN)노이즈 제거, 트렌드 파악
DoD/WoWLAG(1), LAG(7)단기 변화 감지
MoMDATE_TRUNC + LAG월별 성장률
YoYLAG(12) 또는 JOIN연간 성장률
누적합SUM OVER (ORDER BY)목표 달성률
계절성GROUP BY 요일/월패턴 파악

시계열 분석 핵심:

  • 이동평균으로 노이즈 제거
  • 동일 기간 비교로 공정한 성장률 측정
  • 계절성 파악으로 예측 정확도 향상

References

  1. Mode Analytics - Time Series Analysis in SQL
  2. PostgreSQL Window Functions Documentation
  3. Redshift - Date and Time Functions