Error, loss functions, and why they are needed

Case 2. Choosing a model via the loss function

This case is a logical continuation of the previous one. There we looked at how a single bad data point can distort the picture; here we answer a different practical question: how to formally choose the better model when there are several options and they all look almost the same by eye.

Case goal:
To show that a loss function turns the subjective "this model seems better" into a measurable criterion that we can actually use to make a decision.

Scenario:
Imagine a product demand forecasting task. We have historical data and two model variants:

  • Model A – a simple linear model, interpretable and stable
  • Model B – a slightly more complex model with additional parameters

Both models are already trained. We do not discuss their internal structure here – in this case only one thing matters: which of them makes fewer mistakes.

We use the same MSE function as in the previous case:
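As a reminder, MSE averages the squared differences between the true values and the predictions:

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2
```

The squaring makes all errors positive and penalizes large deviations more heavily than small ones, which is exactly why a single bad point distorted the picture in the previous case.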

 
<?php

// A simple implementation of MSE (Mean Squared Error).
// We pass in two arrays of the same length:
// $y    — true values (ground-truth observations),
// $yHat — values predicted by the model.
// The function returns a single number: the average squared error over all observations.
function mse(array $y, array $yHat): float {
    $n = count($y);

    $sum = 0.0;

    for ($i = 0; $i < $n; $i++) {
        $diff = $y[$i] - $yHat[$i];

        $sum += $diff * $diff;
    }

    // max($n, 1) guards against division by zero on an empty input array.
    return $sum / max($n, 1);
}
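With the function in place, the comparison itself is a few lines of code. The numbers below are hypothetical hold-out data invented for illustration; in a real project $y, $yHatA, and $yHatB would come from your test set and the two trained models.

```php
<?php
// Hypothetical ground-truth demand and each model's predictions (made-up data).
$y     = [120, 135, 128, 150, 142];
$yHatA = [118, 130, 131, 149, 140]; // Model A's predictions
$yHatB = [125, 128, 120, 158, 135]; // Model B's predictions

// mse() is the function from the listing above.
$mseA = mse($y, $yHatA);
$mseB = mse($y, $yHatB);

echo "Model A MSE: $mseA\n"; // 8.6 for this data
echo "Model B MSE: $mseB\n"; // 50.2 for this data

// The decision rule is simply: lower loss wins.
echo $mseA < $mseB ? "Choose Model A\n" : "Choose Model B\n";
```

Note that "better by eye" never enters the picture: the choice reduces to comparing two numbers, which is the whole point of having a loss function.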