Error, loss functions, and why they are needed

Case 2. Choosing a model via the loss function

This case is a logical continuation of the previous one. There we looked at how a single bad data point can distort the picture; here we answer a different practical question: how to formally choose the better model when several candidates look almost identical by eye.

Case goal:
To show that a loss function turns the subjective "this model seems better" into a measurable criterion that we can actually use to make a decision.

Scenario:
Imagine a product demand forecasting task. We have historical data and two model variants:

  • Model A – a simple linear model, interpretable and stable
  • Model B – a slightly more complex model with additional parameters

Both models are already trained. We do not discuss their internal structure here – in this case only one thing matters: which of them makes fewer mistakes.

 
<?php

require_once __DIR__ . '/code.php';

// Observed (actual) target values from your dataset
$y = [10, 12, 15, 14, 13];

// Predicted values produced by model A for the same inputs
$modelA = [9, 11, 14, 13, 12];

// Predicted values produced by model B for the same inputs
$modelB = [10, 13, 15, 15, 14];

// Compute and print the Mean Squared Error (MSE) for each model.
// The model with the lower MSE is considered to fit the data better.
echo 'MSE A: ' . mse($y, $modelA) . PHP_EOL;
echo 'MSE B: ' . mse($y, $modelB) . PHP_EOL;
Result:
MSE A: 1
MSE B: 0.6
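The `mse()` helper comes from the included code.php, which is not shown here. A minimal sketch of such a function, assuming it takes the array of actual values and the array of predictions and returns the mean of the squared differences, might look like this:

```php
<?php

// Hypothetical sketch of the mse() helper assumed to live in code.php.
// MSE = (1/n) * sum over i of (actual_i - predicted_i)^2
function mse(array $actual, array $predicted): float
{
    $n = count($actual);
    $sum = 0.0;
    foreach ($actual as $i => $value) {
        $diff = $value - $predicted[$i];
        $sum += $diff * $diff;
    }
    return $sum / $n;
}

// Reproduces the numbers above: model A is off by 1 on every point,
// so its MSE is exactly 1; model B has three errors of 1 out of five
// points, giving 3 / 5 = 0.6.
echo mse([10, 12, 15, 14, 13], [9, 11, 14, 13, 12]) . PHP_EOL;
echo mse([10, 12, 15, 14, 13], [10, 13, 15, 15, 14]) . PHP_EOL;
```

Squaring the differences means large errors are penalized disproportionately, which is exactly the "philosophy of error" that MSE encodes.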

After computing the MSE we get two concrete numbers. One of them is smaller, and that is the formal argument that matters for training and model selection. Even if the difference between the MSE values is small, it still reflects a systematic advantage of one model over the other within the chosen philosophy of error.
Even when the curves look almost identical visually, the loss function gives a numerical basis for the choice.