Linear regression as a basic model

Case 1. Apartment valuation based on parameters


Implementation in pure PHP (iterative optimization – gradient descent)

We start without any libraries. This version is not meant for production; its purpose is to build intuition. We use gradient descent with a feature matrix $X$ of size $N \times 4$ and a weight vector $w$ of length $4$.
The bias is added as an extra feature with constant value 1, so in the code each row of $X$ has 5 entries and the weight vector has 5 entries.
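For reference, the quantities used by the script can be written out explicitly. This is a sketch in notation introduced here, not taken from the original text: $\hat{y}_i$ is the prediction for apartment $i$, $L$ is the mean squared error, $\eta$ is the learning rate, and $x_{ij}$ includes the constant bias feature as its last column.

$$
\hat{y}_i = \sum_{j} w_j x_{ij}, \qquad
L(w) = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2, \qquad
\frac{\partial L}{\partial w_j} = -\frac{2}{N} \sum_{i=1}^{N} x_{ij} \left(y_i - \hat{y}_i\right)
$$

$$
w_j \leftarrow w_j + \eta \cdot \frac{1}{N} \sum_{i=1}^{N} x_{ij} \left(y_i - \hat{y}_i\right)
$$

The update rule in the second line is what the code applies on every epoch. It omits the constant factor $2$ from the derivative, which only rescales the learning rate.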

 
<?php

// Compute the dot product of two vectors: sum(a_i * b_i)
function dotProduct(array $a, array $b): float {
    $sum = 0.0;

    foreach ($a as $i => $v) {
        $sum += $v * $b[$i];
    }

    return $sum;
}

// Training data: each row is an apartment described by features
// X: [area, floor, distance to city center, building age, bias]
$X = [
    [50, 3, 5, 10, 1],
    [70, 10, 3, 5, 1],
    [40, 2, 8, 30, 1],
];

// Target values: apartment prices in dollars
$y = [
    66_000,
    95_000,
    45_000,
];

// Initialize model parameters (weights) and training hyperparameters
$weights = array_fill(0, 5, 0.0);
$learningRate = 0.000001;
$epochs = 5_000;

// n – number of training examples, m – number of features (including bias)
$n = count($X);
$m = count($weights);

// Gradient descent loop: repeat $epochs passes over the dataset
for ($epoch = 0; $epoch < $epochs; $epoch++) {
    // Accumulate gradients for each weight over the whole dataset
    $gradients = array_fill(0, $m, 0.0);

    for ($i = 0; $i < $n; $i++) {
        // Model prediction: y_hat = w · x
        $prediction = dotProduct($weights, $X[$i]);
        // Error for this example: true - predicted
        $error = $y[$i] - $prediction;

        // Accumulate gradient for each weight
        // (derivative of MSE w.r.t. w_j, up to the constant factor 2)
        for ($j = 0; $j < $m; $j++) {
            $gradients[$j] += -$X[$i][$j] * $error;
        }
    }

    // Gradient descent update: move weights against the average gradient
    for ($j = 0; $j < $m; $j++) {
        $weights[$j] -= $learningRate * ($gradients[$j] / $n);
    }
}

// Describe a new apartment with the same set of features as in X
// [area, floor, distance to city center, building age, bias]
$newApartment = [60, 5, 4, 12, 1];
// Use the trained model to predict its price
$predictedPrice = dotProduct($weights, $newApartment);

echo 'Apartment valuation: $' . number_format($predictedPrice) . PHP_EOL . PHP_EOL;

// array_splice() removes the first four weights, leaving only the bias in $weights
echo 'Weights: ' . implode(', ', array_map(fn ($weight) => number_format($weight, 2, '.', ''), array_splice($weights, 0, 4))) . PHP_EOL;
echo 'Bias: ' . number_format(array_pop($weights), 2, '.', '') . PHP_EOL;

Result (memory: 0.003 MB, running time: 0.005 sec):
Apartment valuation: $78,374

Weights: 1333.29, 198.45, -0.94, -219.01
Bias: 16.06

The block above shows the script's output. The predicted price is for an apartment with the following features, listed in the same order as the columns of $X$:

  • Apartment area: 60 m²
  • Floor: 5
  • Distance to city center: 4
  • Building age: 12
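
As a quick sanity check, the prediction can be reproduced by hand from the weights and bias printed in the result. The sketch below assumes the printed values (rounded to two decimals) are close enough to the trained ones. The variable names `$features`, `$bias`, and `$price` are illustrative and do not appear in the original script.

```php
<?php

// Weights and bias as printed in the script's result (rounded to two decimals).
$weights = [1333.29, 198.45, -0.94, -219.01];
$bias = 16.06;

// Feature values of the new apartment, without the constant bias column.
$features = [60, 5, 4, 12];

// Linear model: price = bias + sum(w_j * x_j)
$price = $bias;
foreach ($features as $j => $value) {
    $price += $weights[$j] * $value;
}

echo number_format($price) . PHP_EOL; // prints "78,374"
```

With the rounded weights the sum is about 78,373.83, which `number_format` rounds to the same whole-dollar figure the script reports. The area term (60 × 1333.29 ≈ 79,997) dominates the total, while the other features contribute comparatively small corrections.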