What is a model in the mathematical sense?

Learning as minimizing error

If we can measure error (a loss), then training becomes conceptually simple: we change the model's parameters so that this number goes down. The model itself does not “understand” the task; it simply minimizes the loss function we chose.

Below is a minimal example: two observations and a linear model $ŷ = w x + b$, with the squared error $(y_{\text{true}} - ŷ)^2$ as the loss. At first the parameters are poor and the loss is large; then we adjust $w$ so that the predictions move closer to the targets and the loss drops noticeably.

<?php

require_once __DIR__ . '/code.php';

/**
 * Training dataset for a simple 1D regression example.
 *
 * Each row is: [x, yTrue]
 *
 * @var array<int, array{0: float, 1: float}> $dataset
 */
$dataset = [
    [1.0, 2.0],
    [2.0, 4.0],
];

/**
 * Format a float value for prettier output.
 *
 * - If the number is effectively an integer (e.g. 2.0), print it as "2".
 * - Otherwise print a trimmed decimal representation.
 *
 * @param float $value
 * @return string
 */
$formatNumber = function (float $value): string {
    $asInt = (int)$value;

    return ($value === (float)$asInt)
        ? (string)$asInt
        : rtrim(rtrim(number_format($value, 10, '.', ''), '0'), '.');
};

// Bad model (before training)
$model = new LinearModel(w: 0.0, b: 0.0);

foreach ($dataset as [$x, $yTrue]) {
    $yPredicted = $model->predict($x);
    $loss = squaredError($yTrue, $yPredicted);

    echo 'x = ' . $formatNumber($x)
        . ', yTrue = ' . $formatNumber($yTrue)
        . ', yPredicted = ' . $formatNumber($yPredicted)
        . ', loss = ' . $formatNumber($loss) . PHP_EOL;
}

echo PHP_EOL;

// Improved model (after several "training steps")
$model = new LinearModel(w: 0.8, b: 0.0);

foreach ($dataset as [$x, $yTrue]) {
    $yPredicted = $model->predict($x);
    $loss = squaredError($yTrue, $yPredicted);

    echo 'x = ' . $formatNumber($x)
        . ', yTrue = ' . $formatNumber($yTrue)
        . ', yPredicted = ' . $formatNumber($yPredicted)
        . ', loss = ' . $formatNumber($loss) . PHP_EOL;
}
Result: Memory: 0.008 Mb Time running: 0.001 sec.
x = 1, yTrue = 2, yPredicted = 0, loss = 4
x = 2, yTrue = 4, yPredicted = 0, loss = 16

x = 1, yTrue = 2, yPredicted = 0.8, loss = 1.44
x = 2, yTrue = 4, yPredicted = 1.6, loss = 5.76

Training idea: repeat the parameter update steps (for example, with gradient descent) until the average error over the data, the mean squared error $\frac{1}{n}\sum_{i=1}^{n}(y_i - ŷ_i)^2$, becomes small enough.
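Such an update loop can be sketched directly, without relying on code.php. Below is a minimal gradient-descent sketch on the same two observations; the learning rate of 0.1 and the epoch count of 1000 are illustrative choices, not values taken from the article's code.

```php
<?php

// Same two observations as above: y = 2x exactly.
$dataset = [
    [1.0, 2.0],
    [2.0, 4.0],
];

$w = 0.0;
$b = 0.0;
$learningRate = 0.1; // illustrative hyperparameter

for ($epoch = 0; $epoch < 1000; $epoch++) {
    $gradW = 0.0;
    $gradB = 0.0;

    // Gradients of the squared error (yTrue - yPredicted)^2,
    // accumulated over the dataset:
    //   d(loss)/dw = -2 * (yTrue - yPredicted) * x
    //   d(loss)/db = -2 * (yTrue - yPredicted)
    foreach ($dataset as [$x, $yTrue]) {
        $yPredicted = $w * $x + $b;
        $gradW += -2.0 * ($yTrue - $yPredicted) * $x;
        $gradB += -2.0 * ($yTrue - $yPredicted);
    }

    // Step against the average gradient.
    $n = count($dataset);
    $w -= $learningRate * $gradW / $n;
    $b -= $learningRate * $gradB / $n;
}

echo 'w = ' . round($w, 3) . ', b = ' . round($b, 3) . PHP_EOL;
```

On this toy dataset the loop converges to $w \approx 2$, $b \approx 0$, which reproduces $y = 2x$ and drives the loss toward zero.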