What is a model in the mathematical sense
Learning as minimizing error
If we can measure error (a loss), then training becomes conceptually simple: we change the model parameters so that this number goes down. The model itself does not “understand” the task; it only minimizes the loss function we chose.
Below is a minimal example: two observations and a linear model $ŷ = w x + b$. First the parameters are bad and the loss is large. Then we adjust $w$ so that predictions become closer to reality and the loss decreases noticeably.
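The example relies on a helper file `code.php` that defines `LinearModel` and `squaredError`. That file is not shown here; a minimal sketch consistent with how the helpers are used below might look like this (the exact original implementation may differ):

```php
<?php

/** A linear model y = w * x + b with two trainable parameters. */
class LinearModel
{
    public function __construct(
        public float $w,
        public float $b,
    ) {
    }

    /** Predict y for a single input x. */
    public function predict(float $x): float
    {
        return $this->w * $x + $this->b;
    }
}

/** Squared-error loss for a single observation. */
function squaredError(float $yTrue, float $yPredicted): float
{
    return ($yTrue - $yPredicted) ** 2;
}
```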
Example of use
<?php
require_once __DIR__ . '/code.php';
/**
 * Training dataset for a simple 1D regression example.
 *
 * Each row is: [x, yTrue]
 *
 * @var array<int, array{0: float, 1: float}> $dataset
 */
$dataset = [
    [1.0, 2.0],
    [2.0, 4.0],
];
/**
 * Format a float value for prettier output.
 *
 * - If the number is effectively an integer (e.g. 2.0), print it as "2".
 * - Otherwise print a trimmed decimal representation.
 *
 * @param float $value
 * @return string
 */
$formatNumber = function (float $value): string {
    $asInt = (int) $value;
    if ($value === (float) $asInt) {
        return (string) $asInt;
    }

    return rtrim(rtrim(number_format($value, 10, '.', ''), '0'), '.');
};
// Bad model (before training)
$model = new LinearModel(w: 0.0, b: 0.0);
foreach ($dataset as [$x, $yTrue]) {
    $yPredicted = $model->predict($x);
    $loss = squaredError($yTrue, $yPredicted);
    echo 'x = ' . $formatNumber($x) . ', yTrue = ' . $formatNumber($yTrue) . ', yPredicted = ' . $formatNumber($yPredicted) . ', loss = ' . $formatNumber($loss) . PHP_EOL;
}
echo PHP_EOL;
// Improved model (after several "training steps")
$model = new LinearModel(w: 0.8, b: 0.0);
foreach ($dataset as [$x, $yTrue]) {
    $yPredicted = $model->predict($x);
    $loss = squaredError($yTrue, $yPredicted);
    echo 'x = ' . $formatNumber($x) . ', yTrue = ' . $formatNumber($yTrue) . ', yPredicted = ' . $formatNumber($yPredicted) . ', loss = ' . $formatNumber($loss) . PHP_EOL;
}
Result:
Memory: 0.008 Mb
Time running: 0.001 sec.
x = 1, yTrue = 2, yPredicted = 0, loss = 4
x = 2, yTrue = 4, yPredicted = 0, loss = 16

x = 1, yTrue = 2, yPredicted = 0.8, loss = 1.44
x = 2, yTrue = 4, yPredicted = 1.6, loss = 5.76
Training idea: repeat the parameter-update step (for example, using gradient descent) until the average error on the data becomes small enough.
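The update step itself fits in a few lines. Below is a minimal gradient-descent sketch for this dataset (the variable names, learning rate, and step count are illustrative choices, not part of `code.php`): for the squared error $(wx + b - y)^2$, the partial derivatives are $2 \cdot \text{error} \cdot x$ for $w$ and $2 \cdot \text{error}$ for $b$, and we move both parameters against the averaged gradient.

```php
<?php

// Illustrative gradient descent for y = w * x + b with squared-error loss.
$dataset = [[1.0, 2.0], [2.0, 4.0]];

$w = 0.0;
$b = 0.0;
$learningRate = 0.1;

for ($step = 0; $step < 1000; $step++) {
    $gradW = 0.0;
    $gradB = 0.0;
    foreach ($dataset as [$x, $yTrue]) {
        $error = ($w * $x + $b) - $yTrue; // prediction minus target
        $gradW += 2 * $error * $x;        // d(error^2)/dw
        $gradB += 2 * $error;             // d(error^2)/db
    }
    $n = count($dataset);
    $w -= $learningRate * $gradW / $n;    // step against the mean gradient
    $b -= $learningRate * $gradB / $n;
}

echo 'w = ' . round($w, 3) . ', b = ' . round($b, 3) . PHP_EOL;
```

Since the data was generated by $y = 2x$, the parameters converge toward $w \approx 2$, $b \approx 0$; the hand-set $w = 0.8$ above is just one intermediate point on that path.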