Linear regression as a basic model

Case 3. Predicting server resource consumption

In this example we use ridge regression from Rubix ML (with a near-zero penalty, so it behaves like ordinary least squares) to predict CPU load from a small set of traffic metrics, and along the way inspect the model's weights and bias.

 
<?php

use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Regressors\Ridge;

// Training samples: each row is an observation window
// [requests_per_min, avg_response_size_kb, active_users, cron_jobs, hour]
$samples = [
    [1200, 15, 300, 15, 14],
    [800, 10, 200, 9, 2],
    [1500, 18, 450, 20, 19],
    [400, 8, 120, 5, 4],
];

// Target values: CPU load in percent for each window above
$targets = [
    75.0,
    40.0,
    82.0,
    25.0,
];

// Build a labeled dataset (features X + targets y)
$dataset = Labeled::build($samples, $targets);

// Ridge regression with a tiny alpha (1e-6): the L2 penalty is
// negligible, so the model behaves almost like ordinary least squares.
$model = new Ridge(1e-6);

// Train the model
$model->train($dataset);


// Future metrics for which we want to forecast CPU load
// [requests_per_min, avg_response_size_kb, active_users, cron_jobs, hour]
$futureMetrics = [1000, 12, 250, 10, 16];

$unlabeled = new Unlabeled([$futureMetrics]);
$predictions = $model->predict($unlabeled);

// Model parameters
$weights = $model->coefficients();
$bias = $model->bias();

echo 'Expected CPU load: ' . round($predictions[0], 1) . '%' . PHP_EOL . PHP_EOL;

echo 'Coefficients (feature weights):' . PHP_EOL;
echo '0 - requests_per_min, 1 - avg_response_size_kb, 2 - active_users, 3 - cron_jobs, 4 - hour' . PHP_EOL;
print_r($weights);

echo PHP_EOL . 'Bias (intercept): ' . $bias . PHP_EOL;
Result (memory: 0.994 Mb, running time: 0.011 sec.):

Expected CPU load: 69.5%

Coefficients (feature weights):
0 - requests_per_min, 1 - avg_response_size_kb, 2 - active_users, 3 - cron_jobs, 4 - hour
Array
(
    [0] => 0.075903414748609
    [1] => 0.33433830738068
    [2] => -0.17070814035833
    [3] => 0.19324654340744
    [4] => 1.573189586401
)

Bias (intercept): 5.1899118423462
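As a sanity check, the printed prediction can be reproduced by hand: a linear model outputs y = bias + Σ wᵢ·xᵢ. A short sketch (in Python for brevity; the weights and bias are the ones printed above, which are specific to this training run):

```python
# Weights and bias as printed by the RubixML run above
weights = [0.075903414748609, 0.33433830738068, -0.17070814035833,
           0.19324654340744, 1.573189586401]
bias = 5.1899118423462

# The same future observation window the PHP example predicts on
x = [1000, 12, 250, 10, 16]

# Linear model: y = bias + sum(w_i * x_i)
prediction = bias + sum(w * v for w, v in zip(weights, x))
print(round(prediction, 1))  # 69.5, matching the output above
```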

After training, the model can be interpreted similarly to the task‑time case:

  • the weight on requests per minute shows how sensitive CPU load is to increased incoming traffic;
  • the weight on average response size reflects the impact of "heavy" responses (template rendering, large JSON/HTML payloads);
  • the weight on the number of active users implicitly accounts for concurrent sessions competing for resources;
  • the weight on the number of cron jobs captures background load (backups, reports, recalculations);
  • the weight on the hour of day lets the model capture daily load patterns (daytime peaks, nighttime dips).

Such a model is convenient as a quick estimate of whether the server will survive another +X% of traffic, and as a starting point for more complex models and for alerts in a monitoring system.
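Because the model is linear, the "+X% traffic" estimate is just a weight times a delta: extra CPU ≈ w_requests · Δrequests. A minimal sketch (in Python; the weight is taken from the run above, and the 20% figure and current load are arbitrary example values):

```python
w_requests = 0.075903414748609  # weight on requests_per_min from the run above
current_rpm = 1000              # current requests per minute (example value)

# +20% traffic means 200 extra requests per minute
delta_rpm = current_rpm * 0.20
extra_cpu = w_requests * delta_rpm

print(f"+20% traffic -> about {extra_cpu:.1f} extra CPU percentage points")
```

A linear extrapolation like this is only trustworthy near the range the model was trained on; far outside it (e.g. +300% traffic), the real CPU/traffic relationship is unlikely to stay linear.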