Logistic regression
MNIST. Binary classification: 0 vs 1
Implementation in pure PHP
This is one of the most famous teaching cases in machine learning. It is simple to state, yet it clearly shows how the model works – and where it starts to break down. Here we make an important turn in the book: we move from tabular data to images. Case goal: learn to classify digit images as "0" or "1" with logistic regression and understand the limitations of a linear model.
Pure PHP implementation uses a toy MNIST-like dataset with flattened 28x28 images and the standard logistic regression training loop.
Example of code: Class LogisticRegression
<?php
namespace app\classes;
class LogisticRegression {
private array $weights;
private float $bias;
private float $learningRate;
public function __construct(int $numFeatures, float $learningRate = 0.1) {
$this->learningRate = $learningRate;
$this->weights = array_fill(0, $numFeatures, 0.0);
$this->bias = 0.0;
}
private function sigmoid(float $z): float {
return 1.0 / (1.0 + exp(-$z));
}
private function dot(array $a, array $b): float {
$sum = 0.0;
foreach ($a as $i => $v) {
$sum += $v * $b[$i];
}
return $sum;
}
public function predictProb(array $x): float {
return $this->sigmoid($this->dot($this->weights, $x) + $this->bias);
}
public function predict(array $x): int {
return $this->predictProb($x) >= 0.5 ? 1 : 0;
}
public function train(array $X, array $y, int $epochs = 5): void {
foreach (range(1, $epochs) as $epoch) {
foreach ($X as $i => $x) {
$p = $this->predictProb($x);
$error = $p - $y[$i];
foreach ($this->weights as $j => $w) {
$this->weights[$j] -= $this->learningRate * $error * $x[$j];
}
$this->bias -= $this->learningRate * $error;
}
// echo "Epoch $epoch done\n";
}
}
}
Example of code: Class MnistLoader
<?php
namespace app\classes;
use Exception;
class MnistLoader {
static public function loadMNIST(string $file, string $directory = '', bool $categoricalLabels = false): array {
$X = [];
$y = [];
$handle = @fopen($directory . $file, 'r');
if ($handle === false) {
throw new Exception('Dataset file not found.');
}
while (($row = fgetcsv($handle)) !== false) {
if ($row === [] || $row[0] === null || $row[0] === '') {
continue;
}
$label = (int)$row[0];
// 1. Оставляем только 0 и 1
if ($label !== 0 && $label !== 1) {
continue;
}
$pixels = array_slice($row, 1);
// 2. Нормализация (0–255 → 0–1)
$pixels = array_map(function ($p): float {
return ((float) trim((string) $p)) / 255.0;
}, $pixels);
$X[] = $pixels;
$y[] = $categoricalLabels
? ($label === 1 ? 'one' : 'zero')
: $label;
}
return [$X, $y];
}
}
MNIST case with RubixML
In real projects, people rarely implement logistic regression from scratch. It is much more convenient to use a machine learning library. The same 0-vs-1 MNIST case with RubixML becomes shorter and closer to production-style code:
Example of code
<?php
use Rubix\ML\Classifiers\LogisticRegression;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Extractors\CSV;
function mnistRows(string $file): iterable {
foreach (new CSV($file, false) as $row) {
if (!isset($row[0])) {
continue;
}
$label = (int) $row[0];
if ($label !== 0 && $label !== 1) {
continue;
}
$pixels = array_map(static fn ($value): float => (float) $value, array_slice($row, 1));
yield array_merge($pixels, [$label === 1 ? 'one' : 'zero']);
}
}
$trainRows = mnistRows(__DIR__ . 'train.csv');
$testRows = mnistRows(__DIR__ . 'test.csv');
$dataset = Labeled::fromIterator($trainRows);
$testDataset = Labeled::fromIterator($testRows);
$model = new LogisticRegression(epochs: 5);
$model->train($dataset);