Linear regression as a basic model

Case 1. Apartment valuation based on parameters


Implementation in pure PHP

We will start without any libraries. This is not meant for production; it is useful for building intuition. We will use gradient descent. Each apartment is described by 4 features, and we add the bias as an extra feature with constant value 1, so the feature matrix $X$ has size $N \times 5$ and the weight vector $w$ has length $5$.
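Written out, the rule behind the training loop looks as follows. This is a sketch under stated assumptions: the loss is taken as the mean squared error scaled by $\frac{1}{2}$ (a common convention that cancels the factor 2 in the derivative), and the symbols $\eta$ (learning rate), $x_i$ (feature row of apartment $i$, bias included), and $m$ (number of weights) are notation introduced here.

$$
\hat{y}_i = w \cdot x_i = \sum_{j=1}^{m} w_j x_{ij}
$$

$$
L(w) = \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2
$$

$$
\frac{\partial L}{\partial w_j} = -\frac{1}{N} \sum_{i=1}^{N} x_{ij} \left( y_i - \hat{y}_i \right)
$$

$$
w_j \leftarrow w_j - \eta \, \frac{\partial L}{\partial w_j}
$$

Each epoch sums $-x_{ij} (y_i - \hat{y}_i)$ over all examples, divides by $N$, and moves every weight one small step against that average gradient.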

 
<?php

// Compute the dot product of two vectors: sum(a_i * b_i)
function dotProduct(array $a, array $b): float {
    $sum = 0.0;

    foreach ($a as $i => $v) {
        $sum += $v * $b[$i];
    }

    return $sum;
}

// Training data: each row is an apartment described by features
// X: [area, floor, distance to city center, building age, bias]
$X = [
    [50, 3, 5, 10, 1],
    [70, 10, 3, 5, 1],
    [40, 2, 8, 30, 1],
];

// Target values: apartment prices in dollars
$y = [
    66_000,
    95_000,
    45_000,
];

// Initialize model parameters (weights) and training hyperparameters
$weights = array_fill(0, 5, 0.0);
$learningRate = 0.000001;
$epochs = 5_000;

// n: number of training examples, m: number of features (including bias)
$n = count($X);
$m = count($weights);

// Gradient descent loop: repeat several passes over the dataset
for ($epoch = 0; $epoch < $epochs; $epoch++) {
    // Accumulate gradients for each weight over the whole dataset
    $gradients = array_fill(0, $m, 0.0);

    for ($i = 0; $i < $n; $i++) {
        // Model prediction: y_hat = w · x
        $prediction = dotProduct($weights, $X[$i]);
        // Error for this example: true - predicted
        $error = $y[$i] - $prediction;

        // Accumulate gradient for each weight (derivative of MSE w.r.t. w_j,
        // with the constant factor 2 absorbed into the learning rate)
        for ($j = 0; $j < $m; $j++) {
            $gradients[$j] += -$X[$i][$j] * $error;
        }
    }

    // Gradient descent update: move weights against the average gradient
    for ($j = 0; $j < $m; $j++) {
        $weights[$j] -= $learningRate * ($gradients[$j] / $n);
    }
}

// Describe a new apartment with the same set of features as in X
// [area, floor, distance to city center, building age, bias]
$newApartment = [60, 5, 4, 12, 1];
// Use the trained model to predict its price
$predictedPrice = dotProduct($weights, $newApartment);

echo 'Apartment valuation: $' . number_format($predictedPrice) . "\n";

Result (memory: 0.003 MB, running time: 0.004 sec):
Apartment valuation: $78,374

The block above shows the result of the script: the predicted apartment price for the following features:

  • Apartment area: 60 m²
  • Floor: 5
  • Distance to city center: 4
  • Building age: 12
  • Bias: 1 (constant term)
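
A single predicted price does not show how well the weights fit the training data. One way to check is to compute the mean squared error of the model on the training rows. The sketch below is an illustration, not part of the original script: the helper name `meanSquaredError` is hypothetical, and the feature values are the ones read from the training listing above.

```php
<?php

// Hypothetical helper (not part of the original script): mean squared error
// of a linear model with weights $w on feature rows $X with targets $y.
function meanSquaredError(array $X, array $y, array $w): float {
    $total = 0.0;

    foreach ($X as $i => $row) {
        // Prediction for one row: dot product of weights and features
        $prediction = 0.0;
        foreach ($row as $j => $value) {
            $prediction += $w[$j] * $value;
        }

        // Squared difference between the true and the predicted price
        $total += ($y[$i] - $prediction) ** 2;
    }

    return $total / count($X);
}

// Training data as read from the script:
// [area, floor, distance to city center, building age, bias]
$X = [
    [50, 3, 5, 10, 1],
    [70, 10, 3, 5, 1],
    [40, 2, 8, 30, 1],
];
$y = [66_000, 95_000, 45_000];

// Error of the untrained model (all weights zero), as a baseline
$initialWeights = array_fill(0, 5, 0.0);
echo 'MSE before training: ' . number_format(meanSquaredError($X, $y, $initialWeights)) . "\n";
```

Calling the same helper with the trained `$weights` after the gradient descent loop should give a much smaller value than the baseline if training is converging; printing it every few hundred epochs is a simple way to watch that happen.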