Data Cleaning with PHP
Data Standardization with Rubix
If standardization is more appropriate (for instance, when using algorithms that are sensitive to the scale and variance of their inputs), we can apply the ZScaleStandardizer. It rescales each feature column to have a mean of 0 and a standard deviation of 1, which suits methods such as Support Vector Machines (SVM) and Principal Component Analysis (PCA).
Dataset (four samples, three numerical features each):
[100, 500, 25],
[150, 300, 15],
[200, 400, 20],
[50, 200, 10]
Example of use:
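<?php
// Load Composer's autoloader (assumes Rubix ML was installed via Composer
// and that this script sits next to the vendor/ directory)
require_once __DIR__ . '/vendor/autoload.php';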
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Transformers\ZScaleStandardizer;
// Create a sample dataset with some numerical features
$samples = [
[100, 500, 25],
[150, 300, 15],
[200, 400, 20],
[50, 200, 10],
];
$labels = ['A', 'B', 'C', 'D'];
// Create a labeled dataset
$dataset = new Labeled($samples, $labels);
// Apply standardization
$standardizer = new ZScaleStandardizer();
$dataset->apply($standardizer);
echo "After Standardization: \n";
echo "---------------\n";
print_r($dataset->samples());
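
To make the transformation less of a black box, here is a minimal plain-PHP sketch of the z-score calculation applied to the same four samples. The function zScoreColumns is a hypothetical helper written for this illustration, not part of Rubix ML. It divides by n (population standard deviation); whether that matches the library's output to the last digit is an assumption worth checking against the values printed above.

```php
<?php

// Hypothetical helper (not a Rubix ML function): z-score each column of a
// numeric matrix using the population standard deviation (divide by n).
function zScoreColumns(array $samples): array
{
    $n = count($samples);
    $result = $samples;

    foreach (array_keys($samples[0]) as $col) {
        // Pull out one feature column and compute its mean
        $column = array_column($samples, $col);
        $mean = array_sum($column) / $n;

        // Sum of squared deviations from the mean
        $sumSquares = 0.0;
        foreach ($column as $value) {
            $sumSquares += ($value - $mean) ** 2;
        }
        $stdDev = sqrt($sumSquares / $n);

        foreach ($column as $row => $value) {
            // A constant column has zero deviation; map it to 0 instead of dividing by zero
            $result[$row][$col] = $stdDev > 0.0 ? ($value - $mean) / $stdDev : 0.0;
        }
    }

    return $result;
}

$samples = [
    [100, 500, 25],
    [150, 300, 15],
    [200, 400, 20],
    [50, 200, 10],
];

$scaled = zScoreColumns($samples);

// Print each standardized row with four decimal places
foreach ($scaled as $row) {
    echo implode(', ', array_map(fn ($v) => sprintf('%.4f', $v), $row)), "\n";
}
```

For the first column the mean is 125 and the population standard deviation is about 55.90, so the value 100 becomes (100 - 125) / 55.90, roughly -0.4472. After the transformation every column sums to zero, which is a quick sanity check that the centering worked.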