Error, loss functions, and why they are needed
Case 2. Choosing a model via the loss function
This case is a logical continuation of the previous one. There we looked at how a single bad data point can distort the picture; here we answer a different practical question: how to formally choose the better model when there are several options and they all look almost the same by eye.
Case goal:
To show that a loss function turns the subjective "this model seems better" into a measurable criterion that we can actually use to make a decision.
Scenario:
Imagine a product demand forecasting task. We have historical data and two model variants:
- Model A – a simple linear model, interpretable and stable
- Model B – a slightly more complex model with additional parameters
Both models are already trained. We do not discuss their internal structure here – in this case only one thing matters: which of them makes fewer mistakes.
We use the same MSE function as in the previous case:
<?php
// A simple implementation of MSE (Mean Squared Error).
// We pass in two arrays of the same length:
// $y — true values (ground-truth observations),
// $yHat — values predicted by the model.
// The function returns a single number: the average squared error over all observations.
function mse(array $y, array $yHat): float {
    $n = count($y);
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $diff = $y[$i] - $yHat[$i];
        $sum += $diff * $diff;
    }
    return $sum / max($n, 1);
}
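To see the decision criterion in action, here is a minimal sketch. The demand figures and both models' forecasts below are made-up illustrative numbers, not data from the case; the MSE function is repeated so the snippet runs on its own.

```php
<?php
// Same MSE implementation as above, repeated so this snippet is self-contained.
function mse(array $y, array $yHat): float {
    $n = count($y);
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $diff = $y[$i] - $yHat[$i];
        $sum += $diff * $diff;
    }
    return $sum / max($n, 1);
}

// Hypothetical weekly demand (true values) and each model's forecasts.
$y     = [120, 135, 128, 150, 142];
$predA = [118, 138, 125, 149, 145]; // Model A: simple linear model
$predB = [121, 130, 131, 155, 140]; // Model B: more complex model

$lossA = mse($y, $predA);
$lossB = mse($y, $predB);

printf("MSE of Model A: %.2f\n", $lossA); // 6.40
printf("MSE of Model B: %.2f\n", $lossB); // 12.80

// The choice is now objective: pick the model with the lower loss.
echo $lossA < $lossB ? "Choose Model A\n" : "Choose Model B\n";
```

With these particular numbers Model A wins, even though by eye both forecast series track the true demand about equally well. That is exactly the point of the case: the loss function replaces a visual impression with a single comparable number per model.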