MASE, sMAPE, RMSE: Choosing the Right Forecast Metric
Choosing the right metric to evaluate forecast performance can mean the difference between a trading strategy that makes money and one that loses it. While most data scientists are familiar with RMSE and MAE, financial forecasting requires specialized metrics that account for the unique characteristics of price predictions. This guide explains the most important forecast metrics, when to use each, and how to interpret their results.
Why Forecast Metrics Matter
The metric you optimize during model training fundamentally shapes your model's behavior. A model trained to minimize RMSE behaves differently from one trained to minimize MAPE or maximize directional accuracy. Choosing the wrong metric can lead to models that excel on paper but fail in practice.
The Classic Metrics: RMSE and MAE
Root Mean Squared Error (RMSE)
RMSE measures the square root of the average squared differences between predictions and actual values. It's the most widely used forecast metric in machine learning.
Formula: RMSE = √(Σ(actual - predicted)² / n)
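As a minimal sketch (the price and prediction arrays below are made-up illustrations, not real data):

```python
import numpy as np

def rmse(actual, predicted):
    """Root Mean Squared Error, in the same units as the target."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.sqrt(np.mean((actual - predicted) ** 2))

# Hypothetical closing prices vs. one-step-ahead predictions
actual = [101.2, 102.5, 100.8, 103.1]
predicted = [100.9, 103.0, 101.5, 102.4]
print(rmse(actual, predicted))  # error expressed in dollars
```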
When to Use RMSE
- Large errors are particularly costly: RMSE heavily penalizes large errors due to squaring
- You need a scale-dependent metric: RMSE is in the same units as your target variable
- Optimization with gradient descent: RMSE's smooth gradient makes it easy to optimize
- Normally distributed errors: RMSE works best when errors follow a normal distribution
RMSE Limitations
- Scale-dependent: Can't compare RMSE across stocks with different price levels
- Sensitive to outliers: A few extreme errors dominate the metric
- Difficult to interpret: What does RMSE = 5.2 mean in practical terms?
Mean Absolute Error (MAE)
MAE measures the average absolute difference between predictions and actual values.
Formula: MAE = Σ|actual - predicted| / n
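A companion sketch, with one deliberately large miss in the made-up data to show how differently MAE and RMSE react to a single outlier:

```python
import numpy as np

def mae(actual, predicted):
    """Mean Absolute Error: the average miss in the target's units."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.mean(np.abs(actual - predicted))

def rmse(actual, predicted):
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.sqrt(np.mean((actual - predicted) ** 2))

actual    = [100.0, 101.0, 102.0, 103.0]
predicted = [100.5, 100.5, 102.5,  90.0]  # one $13 miss among $0.50 misses
print(mae(actual, predicted))   # ~3.63, pulled up modestly by the outlier
print(rmse(actual, predicted))  # ~6.51, dominated by the single large error
```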
When to Use MAE
- Robustness to outliers: MAE treats all errors equally
- Easier interpretation: "On average, predictions are off by $X"
- Symmetric cost of errors: Overestimates and underestimates are equally bad
Percentage-Based Metrics: MAPE and sMAPE
Mean Absolute Percentage Error (MAPE)
MAPE expresses error as a percentage of actual values, making it scale-independent.
Formula: MAPE = (Σ(|actual - predicted| / |actual|)) / n × 100%
When to Use MAPE
- Comparing across different stocks: MAPE is scale-independent
- Business communication: "5% error" is easier to explain than "RMSE of 2.3"
- Relative accuracy matters: Being off by $1 is worse for a $10 stock than a $100 stock
MAPE's Fatal Flaw
MAPE has two critical problems. First, it is undefined when actual values are zero and explodes when they are near zero. Second, it is asymmetric: a prediction of $100 when the actual is $110 gives a 9.1% error, but predicting $110 when the actual is $100 gives a 10% error, because the smaller actual value sits in the denominator. The same dollar miss costs more as an over-prediction, which biases models that minimize MAPE toward under-forecasting. These issues make MAPE unsuitable for many applications.
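A sketch that reproduces the example above; the eps guard is an assumption added purely to avoid division by zero, which is exactly where MAPE breaks down:

```python
import numpy as np

def mape(actual, predicted, eps=1e-8):
    """Mean Absolute Percentage Error (undefined when actual values are zero)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.mean(np.abs(actual - predicted) / np.maximum(np.abs(actual), eps)) * 100

print(mape([110.0], [100.0]))  # ~9.09%: under-prediction
print(mape([100.0], [110.0]))  # 10.0%: over-prediction, penalized more
```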
Symmetric Mean Absolute Percentage Error (sMAPE)
sMAPE was designed to address MAPE's asymmetry problem.
Formula: sMAPE = (Σ(2 × |actual - predicted| / (|actual| + |predicted|))) / n × 100%
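A sketch of this bounded form (the factor of 2 in the numerator is what gives the 0% to 200% range noted below):

```python
import numpy as np

def smape(actual, predicted):
    """Symmetric MAPE, bounded between 0% and 200%."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    denom = np.abs(actual) + np.abs(predicted)
    return np.mean(2.0 * np.abs(actual - predicted) / denom) * 100

print(smape([100.0], [110.0]))  # ~9.52%
print(smape([100.0], [0.0]))    # 200.0%, the upper bound
```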
When to Use sMAPE
- More balanced than MAPE: Treats over and under-predictions more symmetrically
- Scale-independent comparison: Like MAPE but fairer
- Bounded metric: sMAPE ranges from 0% to 200%
sMAPE Limitations
Despite improvements over MAPE, sMAPE still has issues:
- Still problematic when actual and predicted values are both near zero
- Not truly symmetric: it penalizes under-predictions more heavily than over-predictions of the same size, nudging models toward over-forecasting
- Difficult to optimize during training due to discontinuities
MASE: The Often-Overlooked Champion
Mean Absolute Scaled Error (MASE) is arguably the best general-purpose forecast metric, yet it's surprisingly underutilized in practice.
What is MASE?
MASE compares your model's MAE to the MAE of a naive baseline forecast (typically a simple persistence model that predicts "tomorrow will be like today").
Formula: MASE = MAE of model / MAE of naive baseline
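A minimal sketch using a one-step persistence forecast as the baseline. The original definition of MASE scales by the naive MAE computed on the training data; this simplified version scales by the naive MAE over the same evaluation window, matching the description above. The prices are made-up illustrations.

```python
import numpy as np

def mase(actual, predicted):
    """MAE of the model divided by the MAE of a 'tomorrow = today' baseline."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    model_mae = np.mean(np.abs(actual - predicted))
    naive_mae = np.mean(np.abs(np.diff(actual)))  # persistence baseline errors
    return model_mae / naive_mae

actual    = [100.0, 101.5, 103.0, 102.0, 104.5]
predicted = [100.2, 101.0, 102.5, 102.8, 104.0]
print(mase(actual, predicted))  # < 1 means the model beats the naive baseline
```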
Interpreting MASE
- MASE < 1: Your model outperforms the naive baseline (good!)
- MASE = 1: Your model is no better than the naive approach
- MASE > 1: You'd be better off using the naive baseline (bad!)
When to Use MASE
- Comparing different models: MASE works across different scales and time series
- Reporting to stakeholders: "30% better than naive baseline" is clear and meaningful
- Evaluating whether ML adds value: If MASE > 1, your complex model is worse than doing nothing
- Academic research: MASE is increasingly the standard in forecasting literature
Practical Example
Imagine predicting stock prices for AAPL and a penny stock:
- AAPL: Your model's MAE = $2.50, Naive MAE = $3.00, MASE = 0.83
- Penny stock: Your model's MAE = $0.10, Naive MAE = $0.15, MASE = 0.67
RMSE and MAE can't tell you which model is performing better because they're scale-dependent: a $2.50 average miss on AAPL and a $0.10 average miss on a penny stock aren't directly comparable. MASE shows that the penny stock model (MASE 0.67) is beating its baseline by a wider margin than the AAPL model (MASE 0.83), even though the AAPL model's percentage errors are almost certainly smaller.
Financial-Specific Metrics
Directional Accuracy
For trading applications, predicting direction (up vs. down) often matters more than exact magnitude.
Formula: Directional Accuracy = (Number of correct direction predictions / Total predictions) × 100%
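A sketch that scores direction by comparing the sign of the predicted move (from the last known price) against the sign of the actual move; the series below are made-up:

```python
import numpy as np

def directional_accuracy(actual, predicted):
    """Percentage of periods where predicted and actual changes share a sign."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    actual_move = np.sign(np.diff(actual))
    predicted_move = np.sign(predicted[1:] - actual[:-1])  # move from last known price
    return np.mean(actual_move == predicted_move) * 100

actual    = [100.0, 101.0, 100.5, 102.0, 101.0]
predicted = [100.0, 100.6,  99.8, 101.5, 102.5]
print(directional_accuracy(actual, predicted))  # 75.0: three of four calls correct
```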
When to Use Directional Accuracy
- Binary trading decisions: Buy vs. sell decisions based on direction
- Options trading: Direction determines profit/loss
- Simple strategies: When you just need to know "up or down"
Risk-Adjusted Metrics: Sharpe Ratio
The Sharpe ratio measures risk-adjusted returns, showing return per unit of volatility.
Formula: Sharpe Ratio = (Return - Risk-free rate) / Standard deviation of returns
While not strictly a forecast metric, Sharpe ratio is crucial for evaluating trading strategies based on predictions.
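A sketch on daily strategy returns; the √252 annualization factor, the zero default risk-free rate, and the return series itself are assumptions for illustration:

```python
import numpy as np

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period strategy returns."""
    returns = np.asarray(returns, float)
    excess = returns - risk_free_rate / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Hypothetical daily returns from a prediction-driven strategy
daily_returns = [0.002, -0.001, 0.004, 0.0, -0.002, 0.003, 0.001]
print(sharpe_ratio(daily_returns))
```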
When to Use Sharpe Ratio
- Comparing trading strategies: Which approach delivers better risk-adjusted returns?
- Portfolio optimization: Balancing return and volatility
- Evaluating model practical value: Accuracy doesn't matter if the strategy loses money
Choosing the Right Metric: Decision Framework
For Model Development
Use MASE when:
- You need to compare models across different stocks or time periods
- You want scale-independent performance measurement
- You need to justify why ML adds value over simple baselines
Use RMSE when:
- Large errors are disproportionately costly
- You're working with a single stock at consistent price levels
- You need smooth gradients for optimization
Use MAE when:
- All errors should be weighted equally
- Outliers are expected and shouldn't dominate
- You need easy interpretation for stakeholders
For Trading Strategy Evaluation
Primary metrics:
- Directional Accuracy: Are you getting the direction right?
- Sharpe Ratio: Are returns worth the risk?
- Maximum Drawdown: What's the worst-case scenario?
Secondary metrics:
- Win Rate: Percentage of profitable trades
- Profit Factor: Gross profit from winning trades / Gross loss from losing trades
- Calmar Ratio: Return / Maximum drawdown
For Reporting and Communication
Use sMAPE or MASE when:
- Explaining to non-technical stakeholders
- Comparing performance across different assets
- Demonstrating improvement over baselines
Common Mistakes in Metric Selection
1. Optimizing the Wrong Thing
Mistake: Optimizing RMSE when directional accuracy matters for your trading strategy.
Solution: Align your optimization metric with your business objective. If you're making binary trading decisions, optimize for directional accuracy or profit, not RMSE.
2. Ignoring Metric Limitations
Mistake: Using MAPE with assets that have volatile, low price levels.
Solution: Understand each metric's limitations and choose accordingly. For low-price or zero-crossing data, avoid MAPE and use MASE or MAE instead.
3. Single Metric Evaluation
Mistake: Relying solely on one metric to judge model performance.
Solution: Use multiple complementary metrics. A model might have low RMSE but poor directional accuracy, or vice versa. Evaluate holistically.
4. Not Using Baseline Comparisons
Mistake: Celebrating RMSE = 2.5 without knowing if that's good or bad.
Solution: Always compare to a naive baseline. MASE does this automatically, but for other metrics, calculate baseline performance explicitly.
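One way to make that comparison explicit for any metric is to score a naive persistence forecast on the same data and report the relative improvement; a sketch with made-up prices:

```python
import numpy as np

def rmse(actual, predicted):
    return np.sqrt(np.mean((np.asarray(actual, float) - np.asarray(predicted, float)) ** 2))

actual     = np.array([101.0, 102.5, 101.8, 103.2, 104.0])
model_pred = np.array([101.3, 102.0, 102.1, 103.0, 103.6])
naive_pred = np.roll(actual, 1)  # "tomorrow = today"; first element is invalid, so skip it

model_rmse = rmse(actual[1:], model_pred[1:])
naive_rmse = rmse(actual[1:], naive_pred[1:])
print(f"model RMSE {model_rmse:.2f} vs naive RMSE {naive_rmse:.2f}")
print(f"improvement over naive: {1 - model_rmse / naive_rmse:.1%}")
```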
Practical Implementation Tips
Multi-Metric Dashboard
Create a comprehensive evaluation dashboard that tracks the following (a minimal sketch appears after this list):
- Magnitude metrics: RMSE, MAE, MASE
- Percentage metrics: sMAPE
- Direction metrics: Directional accuracy, confusion matrix
- Trading metrics: Sharpe ratio, max drawdown, win rate
- Time-based analysis: How do metrics change across walk-forward folds?
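Here is a minimal sketch of such a dashboard as a single function. The metric definitions mirror the formulas earlier in this guide; the function name and the aligned actual/predicted arrays are assumptions for illustration.

```python
import numpy as np

def evaluate(actual, predicted):
    """Return a small dashboard of complementary forecast metrics."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    errors = a - p
    naive_mae = np.mean(np.abs(np.diff(a)))  # persistence baseline for MASE
    return {
        "rmse": np.sqrt(np.mean(errors ** 2)),
        "mae": np.mean(np.abs(errors)),
        "mase": np.mean(np.abs(errors)) / naive_mae,
        "smape": np.mean(2 * np.abs(errors) / (np.abs(a) + np.abs(p))) * 100,
        "directional_accuracy": np.mean(
            np.sign(np.diff(a)) == np.sign(p[1:] - a[:-1])
        ) * 100,
    }

actual    = [100.0, 101.0, 100.5, 102.0, 101.0]
predicted = [100.0, 100.6,  99.8, 101.5, 102.5]
for name, value in evaluate(actual, predicted).items():
    print(f"{name}: {value:.2f}")
```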
Metric Stability Analysis
Don't just look at average metrics; examine their stability as well (a minimal sketch appears after this list):
- Standard deviation of metrics across validation folds
- Worst-case metric values (95th percentile error)
- Performance in different market regimes (bull vs. bear)
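A sketch of such a stability check; the per-fold MASE values are purely hypothetical stand-ins for whatever your walk-forward backtest produces:

```python
import numpy as np

# Hypothetical MASE scores from eight walk-forward validation folds
fold_mase = np.array([0.78, 0.85, 0.91, 0.74, 1.12, 0.88, 0.95, 0.81])

print(f"mean MASE:             {fold_mase.mean():.2f}")
print(f"std across folds:      {fold_mase.std(ddof=1):.2f}")
print(f"95th percentile MASE:  {np.percentile(fold_mase, 95):.2f}")
print(f"folds worse than naive (MASE > 1): {(fold_mase > 1).sum()} of {len(fold_mase)}")
```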
Business-Aligned Custom Metrics
Consider creating custom metrics that directly measure business outcomes:
- Profit per prediction
- Cost-weighted error (where errors in certain directions are more costly)
- Risk-adjusted information ratio
Conclusion
Choosing the right forecast metric isn't a one-size-fits-all decision. MASE offers the best general-purpose solution for comparing models and understanding relative performance. RMSE and MAE remain valuable for optimization and interpretation within a single context. sMAPE provides scale-independent percentages but with limitations. And directional accuracy plus Sharpe ratio are essential for evaluating trading strategy viability.
The most important principle: align your evaluation metric with your objective. If you're making trading decisions, forecast accuracy alone isn't enough—you need to evaluate the profitability and risk of decisions based on those forecasts. If you're doing pure prediction, MASE provides the most robust, interpretable measure of forecast quality.
Don't fall into the trap of optimizing one metric in isolation. The best forecasting systems are evaluated holistically across multiple metrics, ensuring that improvements on one dimension don't come at the cost of critical failures on another. Master these metrics, understand their tradeoffs, and you'll build better, more reliable forecasting systems.