Data Cleaning with PHP

Data Normalization with Rubix

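RubixML provides a MinMaxNormalizer transformer that rescales each numeric feature to a fixed range (0 to 1 by default). This is especially useful when features such as income and spending_score are measured on very different scales, because the feature with the larger values would otherwise dominate distance-based calculations.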

[100, 500, 25],
[150, 300, 15],
[200, 400, 20],
[50, 200, 10]

Example of use:

 
<?php

use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Transformers\MinMaxNormalizer;

// Create a sample dataset with some numerical features
$samples = [
    [100, 500, 25],
    [150, 300, 15],
    [200, 400, 20],
    [50, 200, 10],
];

$labels = ['A', 'B', 'C', 'D'];

// Create a labeled dataset
$dataset = new Labeled($samples, $labels);

// Create a MinMaxNormalizer to scale values between 0 and 1
$normalizer = new MinMaxNormalizer(0.0, 1.0);

// Apply normalization to the dataset
$dataset->apply($normalizer);

// Print the normalized values
echo "Normalized Dataset:\n";
echo "---------------\n";
print_r($dataset->samples());
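
To make the arithmetic behind the normalized values easier to follow, here is a minimal plain-PHP sketch of min-max scaling applied to the same four samples. It assumes the usual formula, scaled = min + (x - column_min) / (column_max - column_min) * (max - min), computed per column. The function name minMaxScale is hypothetical and is not part of RubixML, and its handling of a constant column (mapping it to the lower bound) is an assumption of this sketch rather than a statement about the library's behavior.

```php
<?php

declare(strict_types=1);

/**
 * Scale every column of $samples to the range [$min, $max].
 * Hypothetical helper for illustration only; not part of RubixML.
 */
function minMaxScale(array $samples, float $min = 0.0, float $max = 1.0): array
{
    if ($samples === []) {
        return [];
    }

    $scaled = $samples;
    $numColumns = count($samples[0]);

    for ($col = 0; $col < $numColumns; $col++) {
        // Collect one feature column and find its observed bounds.
        $column = array_column($samples, $col);
        $low = min($column);
        $high = max($column);
        $range = $high - $low;

        foreach ($samples as $row => $sample) {
            // Assumption of this sketch: a constant column has no spread,
            // so every value in it is mapped to the lower bound.
            $ratio = $range == 0 ? 0.0 : ($sample[$col] - $low) / $range;
            $scaled[$row][$col] = $min + $ratio * ($max - $min);
        }
    }

    return $scaled;
}

$samples = [
    [100, 500, 25],
    [150, 300, 15],
    [200, 400, 20],
    [50, 200, 10],
];

$scaled = minMaxScale($samples);

// First column: min is 50 and max is 200, so 100 becomes (100 - 50) / 150.
print_r($scaled);
```

In the first column the smallest value (50) maps to 0 and the largest (200) maps to 1, with 100 and 150 landing at roughly 0.333 and 0.667. The other two columns are scaled the same way against their own minimum and maximum, which is why all three features end up on a comparable scale.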