Logistic regression

MNIST. Binary classification: 0 vs 1


Implementation in pure PHP

This is one of the most famous teaching cases in machine learning. It is simple to state, yet it clearly shows how the model works and where it starts to break down. It also marks an important turn in the book: we move from tabular data to images. Case goal: learn to classify digit images as "0" or "1" with logistic regression and understand the limitations of a linear model.

The pure PHP implementation uses a toy MNIST-like dataset of flattened 28x28 images (784 pixel values per image) and a standard logistic regression training loop that updates the weights after every sample.
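To make "flattened" concrete, here is a minimal sketch of turning a 28x28 grid of pixels into a single 784-element feature vector. The helper name flattenImage and the sample image are assumptions made for illustration; they are not part of the case code, which reads rows that are already flat.

```php
<?php

// Hypothetical helper: turns a 2D grid of pixel rows into one flat feature vector.
function flattenImage(array $image): array
{
    $flat = [];
    foreach ($image as $row) {
        foreach ($row as $pixel) {
            $flat[] = (float) $pixel;
        }
    }
    return $flat;
}

// A blank 28x28 image: 28 rows of 28 zero-valued pixels.
$image = array_fill(0, 28, array_fill(0, 28, 0));
$image[3][5] = 255; // light up one pixel at row 3, column 5

$features = flattenImage($image);

echo count($features), "\n";      // 784
echo $features[3 * 28 + 5], "\n"; // 255
```

The pixel at row r, column c ends up at index r * 28 + c. The model sees only this flat list, so it has no notion of which pixels are neighbours in the image.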

 
<?php

namespace app\classes;

class LogisticRegression {
    private array $weights;
    private float $bias;
    private float $learningRate;

    public function __construct(int $numFeatures, float $learningRate = 0.1) {
        $this->learningRate = $learningRate;
        $this->weights = array_fill(0, $numFeatures, 0.0);
        $this->bias = 0.0;
    }

    private function sigmoid(float $z): float {
        return 1.0 / (1.0 + exp(-$z));
    }

    private function dot(array $a, array $b): float {
        $sum = 0.0;
        foreach ($a as $i => $v) {
            $sum += $v * $b[$i];
        }
        return $sum;
    }

    public function predictProb(array $x): float {
        return $this->sigmoid($this->dot($this->weights, $x) + $this->bias);
    }

    public function predict(array $x): int {
        return $this->predictProb($x) >= 0.5 ? 1 : 0;
    }

    public function train(array $X, array $y, int $epochs = 5): void {
        foreach (range(1, $epochs) as $epoch) {
            foreach ($X as $i => $x) {
                $p = $this->predictProb($x);
                $error = $p - $y[$i];

                foreach ($this->weights as $j => $w) {
                    $this->weights[$j] -= $this->learningRate * $error * $x[$j];
                }

                $this->bias -= $this->learningRate * $error;
            }

            // echo "Epoch $epoch done\n";
        }
    }
}
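The source does not name the loss function, but the update inside train() matches stochastic gradient descent on the binary cross-entropy (log loss), applied one sample at a time. The derivation below is a sketch under that reading. The symbol \eta stands for $learningRate, and p - y is the $error variable in the code.

```latex
\begin{aligned}
z &= \sum_{j} w_j x_j + b, \qquad p = \sigma(z) = \frac{1}{1 + e^{-z}} \\
\mathcal{L}(p, y) &= -\bigl[\, y \log p + (1 - y) \log (1 - p) \,\bigr] \\
\frac{\partial \mathcal{L}}{\partial w_j} &= (p - y)\, x_j, \qquad \frac{\partial \mathcal{L}}{\partial b} = p - y \\
w_j &\leftarrow w_j - \eta\, (p - y)\, x_j, \qquad b \leftarrow b - \eta\, (p - y)
\end{aligned}
```

With all weights and the bias initialised to zero, the first prediction is p = \sigma(0) = 0.5, so the first error is 0.5 for a "0" image and -0.5 for a "1" image.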
 
<?php

namespace app\classes;

use Exception;

class MnistLoader {

    static public function loadMNIST(string $file, string $directory = '', bool $categoricalLabels = false): array {
        $X = [];
        $y = [];

        $handle = @fopen($directory . $file, 'r');

        if ($handle === false) {
            throw new Exception('Dataset file not found.');
        }

        while (($row = fgetcsv($handle)) !== false) {
            // Skip blank lines and rows without a label
            if ($row === [] || $row[0] === null || $row[0] === '') {
                continue;
            }

            $label = (int)$row[0];

            // 1. Keep only 0 and 1
            if ($label !== 0 && $label !== 1) {
                continue;
            }

            $pixels = array_slice($row, 1);

            // 2. Normalization (0–255 → 0–1)
            $pixels = array_map(function ($p): float {
                return ((float) trim((string) $p)) / 255.0;
            }, $pixels);

            $X[] = $pixels;
            $y[] = $categoricalLabels
                ? ($label === 1 ? 'one' : 'zero')
                : $label;
        }

        fclose($handle);

        return [$X, $y];
    }
}
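Once the loader and the model are in place, the natural next step is to measure how often the predictions match the labels. The snippet below is a minimal sketch of an accuracy check. The function name accuracy and the hand-written sample arrays are assumptions for illustration, and the commented lines only suggest how the two classes above might feed it (the file name test.csv is likewise assumed).

```php
<?php

// Hypothetical helper: fraction of predictions that match the true labels.
function accuracy(array $predicted, array $actual): float
{
    if (count($actual) === 0) {
        return 0.0;
    }

    $correct = 0;
    foreach ($actual as $i => $label) {
        if ($predicted[$i] === $label) {
            $correct++;
        }
    }

    return $correct / count($actual);
}

// With the classes above, the inputs could come from something like:
//   [$X, $y] = MnistLoader::loadMNIST('test.csv', __DIR__ . '/');
//   $predicted = array_map(fn (array $x): int => $model->predict($x), $X);
// Here a tiny hand-written example is used instead.
$actual    = [0, 1, 1, 0, 1];
$predicted = [0, 1, 0, 0, 1];

echo accuracy($predicted, $actual), "\n"; // 0.8
```

The comparison is strict (===), so integer labels must be compared with integer predictions. That holds for predict() and for the loader's default non-categorical labels.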

MNIST case with RubixML

In real projects, people rarely implement logistic regression from scratch. It is much more convenient to use a machine learning library. The same 0-vs-1 MNIST case with RubixML becomes shorter and closer to production-style code:

 
<?php

use Rubix\ML\Classifiers\LogisticRegression;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Extractors\CSV;

// Yields one sample per row: pixel values followed by the label in the last column
function mnistRows(string $file): iterable {
    foreach (new CSV($file, false) as $row) {
        if (!isset($row[0])) {
            continue;
        }

        $label = (int) $row[0];

        // Keep only 0 and 1
        if ($label !== 0 && $label !== 1) {
            continue;
        }

        $pixels = array_map(static fn ($value): float => (float) $value, array_slice($row, 1));

        yield array_merge($pixels, [$label === 1 ? 'one' : 'zero']);
    }
}

// __DIR__ has no trailing slash, so the separator is part of the file name
$trainRows = mnistRows(__DIR__ . '/train.csv');
$testRows = mnistRows(__DIR__ . '/test.csv');

$dataset = Labeled::fromIterator($trainRows);
$testDataset = Labeled::fromIterator($testRows);

$model = new LogisticRegression(epochs: 5);
$model->train($dataset);