Error, loss functions, and why they are needed
Case 2. Choosing a model via the loss function
This case is a logical continuation of the previous one. There we saw how a single bad data point can distort the picture; here we answer a different practical question: how do we formally choose the better model when several candidates look almost identical by eye?
Case goal:
To show that a loss function turns the subjective "this model seems better" into a measurable criterion that we can actually use to make a decision.
Scenario:
Imagine a product demand forecasting task. We have historical data and two model variants:
- Model A – a simple linear model, interpretable and stable
- Model B – a slightly more complex model with additional parameters
Both models are already trained. We do not discuss their internal structure here – in this case only one thing matters: which of them makes fewer mistakes.
We use the same MSE function as in the previous case:
<?php
// A simple implementation of MSE (Mean Squared Error).
// We pass in two arrays of the same length:
// $y    – true values (ground-truth observations),
// $yHat – values predicted by the model.
// The function returns a single number: the average squared error over all observations.
function mse(array $y, array $yHat): float {
    $n = count($y);
    if ($n !== count($yHat)) {
        // Mismatched lengths mean the inputs are not paired observations.
        throw new InvalidArgumentException('$y and $yHat must have the same length');
    }
    if ($n === 0) {
        return 0.0;
    }
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $sum += ($y[$i] - $yHat[$i]) ** 2;
    }
    return $sum / $n;
}
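To make the decision concrete, we can run both models' predictions through mse() and simply pick the smaller number. The demand figures below are invented for illustration only; in practice you would use a held-out validation set, not the training data. The mse() function is repeated here so the sketch runs on its own:

```php
<?php
// Same mse() as above, repeated so this sketch is self-contained.
function mse(array $y, array $yHat): float {
    $n = count($y);
    if ($n === 0) {
        return 0.0;
    }
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $sum += ($y[$i] - $yHat[$i]) ** 2;
    }
    return $sum / $n;
}

// Hypothetical historical demand and two sets of predictions (made-up numbers).
$actual = [120.0, 135.0, 150.0, 160.0, 155.0];
$modelA = [118.0, 137.0, 148.0, 163.0, 152.0]; // Model A: simple linear model
$modelB = [110.0, 140.0, 155.0, 150.0, 160.0]; // Model B: more complex model

$mseA = mse($actual, $modelA);
$mseB = mse($actual, $modelB);

printf("Model A MSE: %.2f\n", $mseA); // Model A MSE: 6.00
printf("Model B MSE: %.2f\n", $mseB); // Model B MSE: 55.00

// The loss function turns "seems better" into a single comparable number.
echo $mseA < $mseB ? "Choose Model A\n" : "Choose Model B\n";
```

On these invented numbers Model A is the clear winner even though both sets of predictions look plausible at a glance, which is exactly the point: the eye cannot reliably rank errors of this size, but the loss function can.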