Case 2. Classification using RubixML

Implementation with RubixML

In the previous case we walked through a teaching implementation of a decision tree in pure PHP and saw how the algorithm works internally. Now we will solve the same task with a production-ready library. The goal of this case is to show how the split-selection logic familiar from entropy and information gain is applied in a real tool, and how a decision tree fits into a typical PHP project. Note that RubixML's ClassificationTree scores splits by Gini impurity rather than entropy; the greedy search for the purest split is the same idea.
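As a quick reminder of what "impurity" means for the data used in this case, here is a minimal sketch in plain PHP. It is not RubixML code: the helper names `entropy` and `gini` are hypothetical, and the label list simply repeats the eight labels from the example in this section. RubixML's documentation describes ClassificationTree as minimizing Gini impurity, so both measures are shown side by side.

```php
<?php

// Shannon entropy (in bits) of a list of class labels.
// Hypothetical helper for illustration, not part of RubixML.
function entropy(array $labels): float
{
    $total = count($labels);
    if ($total === 0) {
        return 0.0;
    }

    $h = 0.0;
    foreach (array_count_values($labels) as $count) {
        $p = $count / $total;
        $h -= $p * log($p, 2);
    }

    return $h;
}

// Gini impurity of a list of class labels.
// Hypothetical helper for illustration, not part of RubixML.
function gini(array $labels): float
{
    $total = count($labels);
    if ($total === 0) {
        return 0.0;
    }

    $sumSquares = 0.0;
    foreach (array_count_values($labels) as $count) {
        $sumSquares += ($count / $total) ** 2;
    }

    return 1.0 - $sumSquares;
}

// The eight labels used in this case: four "active", four "passive".
$labels = ['active', 'active', 'passive', 'passive', 'active', 'passive', 'active', 'passive'];

printf("entropy = %.3f bits, gini = %.3f\n", entropy($labels), gini($labels));
// prints "entropy = 1.000 bits, gini = 0.500"
```

Both measures are at their maximum for a two-class problem here, because the classes are perfectly balanced. Any split that separates the classes lowers both, which is why the two criteria usually lead to similar trees.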

Usage example:

 
<?php

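// Assumes RubixML was installed with Composer (composer require rubix/ml).
require_once __DIR__ . '/vendor/autoload.php';

use Rubix\ML\Classifiers\ClassificationTree;
use Rubix\ML\Datasets\Labeled;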

// Make the example deterministic between runs.
// RubixML may use randomness (e.g. tie-breaking) during training.
mt_srand(42);
srand(42);

// Our tiny training dataset
// Each sample is a 2D numeric vector: [visits, time]
$samples = [
    [5, 10],
    [7, 15],
    [1, 2],
    [2, 3],
    [6, 8],
    [3, 4],
    [4, 12],
    [6, 3],
];

// Class labels for each row in $samples
$labels = ['active', 'active', 'passive', 'passive', 'active', 'passive', 'active', 'passive'];

// Wrap the arrays into a RubixML Labeled dataset.
$dataset = new Labeled($samples, $labels);

// Create a decision tree classifier
$estimator = new ClassificationTree(
    maxHeight: 5,
    maxLeafSize: 2
);

// Fit the model
$estimator->train($dataset);
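
Once the tree is trained, it can classify visitors it has not seen before. The following is a minimal sketch of that step, written as a standalone script so it runs on its own: it repeats the small training set from the example, and the two query points in `$unknown` are made-up values for illustration. It assumes a Composer-based project with `rubix/ml` installed.

```php
<?php

// Assumes RubixML was installed with Composer (composer require rubix/ml).
require_once __DIR__ . '/vendor/autoload.php';

use Rubix\ML\Classifiers\ClassificationTree;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;

// Training data from the example: each sample is [visits, time].
$samples = [[5, 10], [7, 15], [1, 2], [2, 3], [6, 8], [3, 4], [4, 12], [6, 3]];
$labels = ['active', 'active', 'passive', 'passive', 'active', 'passive', 'active', 'passive'];

// Positional arguments: maxHeight = 5, maxLeafSize = 2.
$estimator = new ClassificationTree(5, 2);
$estimator->train(new Labeled($samples, $labels));

// Hypothetical unseen visitors (illustrative values, not from the dataset).
$unknown = new Unlabeled([
    [5, 11],
    [2, 2],
]);

// predict() returns one class label per sample, in the same order.
$predictions = $estimator->predict($unknown);

foreach ($predictions as $i => $label) {
    echo "Sample #{$i}: {$label}" . PHP_EOL;
}
```

If class probabilities are more useful than hard labels, the same estimator also exposes `proba()`, which takes the same kind of dataset and returns per-class probabilities for each sample.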