Case 1: Comparing objects and users

Comparing users and objects: normalization, Euclidean distance, and dot product

Consider a typical task in applied ML and recommender systems. We have users and items (products, services, subscriptions), each represented by a vector of numeric features.

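First, we normalize the features to a common range so that no single feature dominates because of its scale. Then we compute the Euclidean distance between two users to measure how far apart they are, and finish with the dot product of a user profile and an item vector, which serves as a simple match score.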
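For reference, the three steps can be written in symbols. This is a sketch of the standard definitions. The notation is introduced here for illustration: x is a raw feature vector, min_i and max_i are the assumed bounds of feature i, and a, b are vectors of length n.

```latex
\begin{aligned}
x'_i &= \frac{x_i - \mathrm{min}_i}{\mathrm{max}_i - \mathrm{min}_i}
  && \text{(min-max normalization of feature } i\text{)} \\
d(a, b) &= \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}
  && \text{(Euclidean distance)} \\
a \cdot b &= \sum_{i=1}^{n} a_i \, b_i
  && \text{(dot product)}
\end{aligned}
```

When x_i lies between min_i and max_i, the normalized value falls in the range [0, 1].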

Example code:

 
<?php

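// Min-max scaling: maps values between $min[$i] and $max[$i] to [0, 1].
function normalize(array $x, array $min, array $max): array {
    $result = [];

    foreach ($x as $i => $value) {
        $range = $max[$i] - $min[$i];
        // A constant feature (max == min) would divide by zero; map it to 0.0.
        $result[$i] = $range == 0 ? 0.0 : ($value - $min[$i]) / $range;
    }

    return $result;
}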

// Euclidean (straight-line) distance between two vectors of equal length.
function euclideanDistance(array $a, array $b): float {
    $n = count($a);

    if ($n !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length');
    }

    $sum = 0.0;

    for ($i = 0; $i < $n; $i++) {
        $diff = $a[$i] - $b[$i];
        $sum += $diff ** 2;
    }

    return sqrt($sum);
}

// Dot product: sum of element-wise products of two vectors of equal length.
function dotProduct(array $a, array $b): float {
    $n = count($a);

    if ($n !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length');
    }

    $sum = 0.0;

    for ($i = 0; $i < $n; $i++) {
        $sum += $a[$i] * $b[$i];
    }

    return $sum;
}
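
To see how the three steps fit together, here is a small worked example. It is only a sketch: the feature meanings (age, monthly spend), the bounds, and the names $alice, $bob, and $item are all invented for illustration. So that the snippet runs on its own, it uses compact closures ($normalize, $distance, $dot) as stand-ins for the three functions, and it assumes PHP 7.4 or later for arrow functions.

```php
<?php
// Hypothetical data: each vector is [age in years, monthly spend].
$min = [18, 0];     // assumed lower bound of each feature
$max = [68, 500];   // assumed upper bound of each feature

// Compact stand-ins for normalize(), euclideanDistance() and dotProduct().
// They skip the length and zero-range checks to keep the example short.
$normalize = fn(array $x): array => array_map(
    fn($v, $lo, $hi) => ($v - $lo) / ($hi - $lo),
    $x, $min, $max
);
$distance = fn(array $a, array $b): float => sqrt(array_sum(array_map(
    fn($p, $q) => ($p - $q) ** 2,
    $a, $b
)));
$dot = fn(array $a, array $b): float => array_sum(array_map(
    fn($p, $q) => $p * $q,
    $a, $b
));

$alice = $normalize([28, 100]);   // [0.2, 0.2]
$bob   = $normalize([48, 400]);   // [0.6, 0.8]
$item  = [0.5, 1.0];              // item described in the same scaled feature space

printf("distance(alice, bob) = %.4f\n", $distance($alice, $bob));  // 0.7211
printf("score(alice, item)   = %.2f\n", $dot($alice, $item));      // 0.30
printf("score(bob, item)     = %.2f\n", $dot($bob, $item));        // 1.10
```

With these made-up numbers, the item scores higher for the second user (1.10 against 0.30), so under this scoring it would be the better match for that user.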