MNIST: decision tree – how the model "sees" pixels
Implementation in RubixML
In this case, we will train a decision tree classifier on MNIST pixel vectors and see how a tree-based model predicts a digit from raw pixels. The goal is not to maximize accuracy, but to connect the idea of splitting the feature space with a very concrete representation: a 28×28 image as 784 numeric features.
Example of code:
<?php
use app\classes\MnistLoader;
use Rubix\ML\Classifiers\ClassificationTree;
use Rubix\ML\Datasets\Labeled;
try {
$trainRows = MnistLoader::loadIterable('train.csv', categoricalLabels: true, normalize: true, digits: [0, 1]);
$testRows = MnistLoader::loadIterable('test.csv', categoricalLabels: true, normalize: true, digits: [0, 1]);
$dataset = Labeled::fromIterator($trainRows);
$testDataset = Labeled::fromIterator($testRows);
} catch (Exception $e) {
echo '<div class="alert alert-danger" role="alert">' . htmlspecialchars($e->getMessage(), ENT_QUOTES, 'UTF-8') . '</div>';
exit;
}
$model = new ClassificationTree(
maxHeight: 10,
maxLeafSize: 5
);
$model->train($dataset);