Error, loss functions, and why they are needed
Case 2. Choosing a model via the loss function
This case is a logical continuation of the previous one. There we looked at how a single bad data point can distort the picture; here we answer a different practical question: how do we formally choose the better model when there are several candidates that all look almost identical by eye?
Case goal:
To show that a loss function turns the subjective "this model seems better" into a measurable criterion that we can actually use to make a decision.
Scenario:
Imagine a product demand forecasting task. We have historical data and two model variants:
- Model A – a simple linear model, interpretable and stable
- Model B – a slightly more complex model with additional parameters
Both models are already trained. We do not discuss their internal structure here – in this case only one thing matters: which of them makes fewer mistakes.
Example of use
<?php
require_once __DIR__ . '/code.php';
// Observed (actual) target values from your dataset
$y = [10, 12, 15, 14, 13];
// Predicted values produced by model A for the same inputs
$modelA = [9, 11, 14, 13, 12];
// Predicted values produced by model B for the same inputs
$modelB = [10, 13, 15, 15, 14];
// Compute and print the Mean Squared Error (MSE) for each model.
// The model with the lower MSE is considered to fit the data better.
echo 'MSE A: ' . mse($y, $modelA) . PHP_EOL;
echo 'MSE B: ' . mse($y, $modelB) . PHP_EOL;
Output:
MSE A: 1
MSE B: 0.6
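The mse() helper is pulled in from code.php, which is not shown here. A minimal implementation consistent with the output above might look like this (the exact code in code.php may differ):

```php
<?php
// Sketch of the mse() helper assumed to live in code.php.
// Mean Squared Error: the average of the squared differences
// between the actual and the predicted values.
function mse(array $actual, array $predicted): float
{
    $n = count($actual);
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $diff = $actual[$i] - $predicted[$i];
        $sum += $diff * $diff;   // squaring punishes large misses more
    }
    return $sum / $n;
}
```

With the data above it reproduces the numbers printed earlier: model A has five errors of 1 each (MSE = 1), while model B has three errors of 1 and two exact hits (MSE = 3/5 = 0.6).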
After computing the MSE we get two concrete numbers. One of them is smaller, and that is the only formal argument that really matters for training and model selection. Even if the difference between the MSE values is small, it still reflects a systematic advantage of one model over the other within the chosen philosophy of error.
Even if the curves look almost identical visually, the loss gives a numerical basis for making a choice.
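The phrase "philosophy of error" is worth unpacking: MSE squares deviations, so large misses are punished disproportionately. A different loss, such as Mean Absolute Error (MAE), weights every deviation linearly and can, on other data, rank the same two models differently. A sketch in the same style as mse() above (the mae() name is hypothetical, not part of code.php):

```php
<?php
// Hypothetical mae() helper, analogous to mse(): the average of the
// absolute differences between actual and predicted values.
function mae(array $actual, array $predicted): float
{
    $n = count($actual);
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $sum += abs($actual[$i] - $predicted[$i]); // linear penalty
    }
    return $sum / $n;
}
```

On this particular dataset MAE happens to agree with MSE (model A scores 1, model B scores 0.6), but on data with occasional large errors the two losses can disagree, which is exactly why the choice of loss function is itself a modeling decision.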