Build a Time Series Forecasting Model with Trend and Seasonality Analysis
Build a time series forecasting model with EDA, model comparison, validation strategy, and production-ready code.
๐ The Prompt
You are a data scientist specializing in time series analysis and forecasting. I need you to build a comprehensive forecasting model for the following business metric.
**Forecasting Target:**
- **Metric to Forecast:** [METRIC, e.g., monthly revenue, daily website traffic, weekly inventory demand]
- **Historical Data Range:** [START_DATE] to [END_DATE] ([NUM_PERIODS] data points)
- **Forecast Horizon:** [HORIZON, e.g., next 3 months, next 90 days, next 12 weeks]
- **Data Granularity:** [GRANULARITY, e.g., daily, weekly, monthly]
- **Known Patterns:** [PATTERNS, e.g., strong weekly seasonality, holiday spikes in December, upward growth trend]
- **External Factors:** [EXTERNAL_FACTORS, e.g., marketing spend, weather data, competitor pricing โ or 'none']
- **Business Use Case:** [USE_CASE, e.g., budget planning, inventory management, capacity planning]
Please provide the following comprehensive analysis:
1. **Exploratory Data Analysis (EDA):** Outline the EDA steps to perform โ trend visualization, seasonal decomposition (additive vs. multiplicative), stationarity testing (ADF test, KPSS test), and autocorrelation analysis (ACF/PACF plots). Describe what to look for at each step.
2. **Data Preprocessing:** Detail the preprocessing pipeline including handling missing values (interpolation strategies), outlier detection and treatment, and any necessary transformations (log, differencing, Box-Cox).
3. **Model Selection:** Compare at least 4 candidate models appropriate for this data:
- Classical statistical (ARIMA/SARIMA, Exponential Smoothing/ETS)
- Modern ML-based (Prophet, XGBoost with lag features)
- Deep learning if applicable (LSTM, N-BEATS)
For each, explain pros/cons and suitability for this specific use case.
4. **Feature Engineering:** If using ML models, define lag features, rolling statistics, calendar features (day of week, holidays, month), and how to incorporate [EXTERNAL_FACTORS].
5. **Validation Strategy:** Describe the time series cross-validation approach (expanding window vs. sliding window), the number of folds, and why random splits are inappropriate.
6. **Evaluation Metrics:** Calculate and interpret MAPE, RMSE, MAE, and SMAPE. Define acceptable error thresholds for [USE_CASE].
7. **Implementation Code:** Provide Python code using [LIBRARY, e.g., statsmodels, Prophet, scikit-learn] that trains the best model and generates forecasts with prediction intervals.
8. **Forecast Output:** Format predictions as a table with date, point forecast, lower bound, and upper bound at [CONFIDENCE_LEVEL]% confidence.
Include visualization recommendations and a model monitoring/retraining schedule.
๐ก Tips for Better Results
Provide as much context about known seasonal patterns and business events (promotions, outages) as possible โ this dramatically improves model accuracy.
Specify the acceptable error range for your business use case; a 10% MAPE may be fine for budget planning but unacceptable for perishable inventory.
Always request prediction intervals, not just point forecasts โ decision-makers need to understand the range of possible outcomes.
๐ฏ Use Cases
Data scientists, analysts, and business planners should use this when they need to forecast a key metric for budgeting, demand planning, staffing, or inventory optimization.