Logistic regression
MNIST. Binary classification: 0 vs 1
MNIST case with RubixML
In real projects, people rarely implement logistic regression from scratch. It is much more convenient to use a machine learning library. The same 0-vs-1 MNIST case with RubixML becomes shorter and closer to production-style code:
Example of use
<?php
use Rubix\ML\Classifiers\LogisticRegression;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Extractors\CSV;
try {
function mnistRows(string $file): iterable {
foreach (new CSV($file, false) as $row) {
if (!isset($row[0])) {
continue;
}
$label = (int) $row[0];
if ($label !== 0 && $label !== 1) {
continue;
}
$pixels = array_map(static fn ($value): float => (float) $value, array_slice($row, 1));
yield array_merge($pixels, [$label === 1 ? 'one' : 'zero']);
}
}
$trainRows = mnistRows('train.csv');
$testRows = mnistRows('test.csv');
$dataset = Labeled::fromIterator($trainRows);
$testDataset = Labeled::fromIterator($testRows);
} catch (Exception $e) {
echo '<div class="alert alert-danger" role="alert">' . htmlspecialchars($e->getMessage(), ENT_QUOTES, 'UTF-8') . '</div>';
exit;
}
echo 'Train samples handled: ' . number_format($dataset->numSamples()) . PHP_EOL;
echo 'Test samples handled: ' . number_format($testDataset->numSamples()) . PHP_EOL . PHP_EOL;
$model = new LogisticRegression(epochs: 5);
$model->train($dataset);
echo 'Number of epochs: ' . $model->params()['epochs'] . PHP_EOL . PHP_EOL;
$correct = 0;
foreach ($testDataset->samples() as $i => $x) {
$prediction = $model->predict(new Unlabeled([$x]))[0];
if ($prediction === $testDataset->labels()[$i]) {
$correct++;
}
}
$accuracy = $correct / $testDataset->numSamples();
echo 'Accuracy: ' . round($accuracy * 100, 2) . '%';
Sample digit of 0
Probability of digit 0: 1
Predicted digit: 0
Sample of digit 1
Probability of digit 1: 0.978
Predicted digit: 1
Result:
Memory: 0 Mb
Time running: < 0.001 sec.
Train samples handled: 12,666
Test samples handled: 2,116
Number of epochs: 5
Accuracy: 99.95%