Linear regression as a basic model

Case 3. Predicting server resource consumption

In this example we use ridge regression from Rubix ML (with a near-zero penalty, so it behaves like ordinary least squares) to predict CPU load from a small set of traffic metrics, and along the way inspect the model's weights and bias.

 
<?php

use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Regressors\Ridge;

// Training samples: each row is an observation window
// [requests_per_min, avg_response_size_kb, active_users, cron_jobs, hour]
$samples = [
    [1200, 15, 300, 15, 14],
    [800, 10, 200, 9, 2],
    [1500, 18, 450, 20, 19],
    [400, 8, 120, 5, 4],
];

// Target values: CPU load in percent for each window above
$targets = [
    75.0,
    40.0,
    82.0,
    25.0,
];

// Build a labeled dataset (features X + targets y)
$dataset = Labeled::build($samples, $targets);

// Ridge regression with a tiny alpha (1e-6): the L2 penalty is
// negligible, so the model behaves almost like ordinary least squares.
$model = new Ridge(1e-6);

// Train the model
$model->train($dataset);


// Future metrics for which we want to forecast CPU load
// [requests_per_min, avg_response_size_kb, active_users, cron_jobs, hour]
$futureMetrics = [1000, 12, 250, 10, 16];

$unlabeled = new Unlabeled([$futureMetrics]);
$predictions = $model->predict($unlabeled);

// Model parameters
$weights = $model->coefficients();
$bias = $model->bias();

echo 'Expected CPU load: ' . round($predictions[0], 1) . '%' . PHP_EOL . PHP_EOL;

echo 'Coefficients (feature weights):' . PHP_EOL;
echo '0 - requests_per_min, 1 - avg_response_size_kb, 2 - active_users, 3 - cron_jobs, 4 - hour' . PHP_EOL;
print_r($weights);

echo PHP_EOL . 'Bias (intercept): ' . $bias . PHP_EOL;
Result (memory: 0.994 Mb, running time: 0.011 sec.):

Expected CPU load: 69.5%

Coefficients (feature weights):
0 - requests_per_min, 1 - avg_response_size_kb, 2 - active_users, 3 - cron_jobs, 4 - hour
Array
(
    [0] => 0.075903414748609
    [1] => 0.33433830738068
    [2] => -0.17070814035833
    [3] => 0.19324654340744
    [4] => 1.573189586401
)

Bias (intercept): 5.1899118423462
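As a sanity check, the printed prediction can be reproduced by hand: a linear model outputs y = bias + Σ wᵢ·xᵢ. A short sketch (in Python for brevity; the weights and bias are the ones printed above, which are specific to this training run):

```python
# Weights and bias as printed by the RubixML run above
weights = [0.075903414748609, 0.33433830738068, -0.17070814035833,
           0.19324654340744, 1.573189586401]
bias = 5.1899118423462

# The same future observation window the PHP example predicts on
x = [1000, 12, 250, 10, 16]

# Linear model: y = bias + sum(w_i * x_i)
prediction = bias + sum(w * v for w, v in zip(weights, x))
print(round(prediction, 1))  # 69.5, matching the output above
```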

After training, the model can be interpreted similarly to the task‑time case:

  • the weight on requests per minute shows how sensitive CPU load is to increased incoming traffic;
  • the weight on average response size reflects the impact of "heavy" responses (template rendering, large JSON/HTML payloads);
  • the weight on the number of active users implicitly accounts for concurrent sessions competing for resources;
  • the weight on the number of cron jobs captures background load (backups, reports, recalculations);
  • the weight on the hour of day lets the model capture daily load patterns (daytime peaks, nighttime dips).

Such a model is convenient as a quick estimate of whether the server will survive another +X% of traffic, and as a starting point for more complex models and for alerts in a monitoring system.
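Because the model is linear, the "+X% traffic" estimate is just a weight times a delta: extra CPU ≈ w_requests · Δrequests. A minimal sketch (in Python; the weight is taken from the run above, and the 20% figure and current load are arbitrary example values):

```python
w_requests = 0.075903414748609  # weight on requests_per_min from the run above
current_rpm = 1000              # current requests per minute (example value)

# +20% traffic means 200 extra requests per minute
delta_rpm = current_rpm * 0.20
extra_cpu = w_requests * delta_rpm

print(f"+20% traffic -> about {extra_cpu:.1f} extra CPU percentage points")
```

A linear extrapolation like this is only trustworthy near the range the model was trained on; far outside it (e.g. +300% traffic), the real CPU/traffic relationship is unlikely to stay linear.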