Case 1: Comparing objects and users

Comparing users and objects: normalization, Euclidean distance, and dot product

Below is a runnable three-step example: feature normalization, Euclidean distance between users, and item relevance scoring with a dot product. The helper functions are loaded from code.php.

 
<?php

require_once __DIR__ . '/code.php';

// Step 1: scale each feature by its minimum and maximum so features are comparable.
echo 'Step 1: Feature normalization' . PHP_EOL;
echo '----------' . PHP_EOL;

$userA = [25, 3000, 5];
$userB = [40, 5000, 8];
$min = [18, 1000, 1];    // per-feature minimums
$max = [65, 10000, 20];  // per-feature maximums

$userANorm = normalize($userA, $min, $max);
$userBNorm = normalize($userB, $min, $max);

echo 'userA: ' . array_to_vector($userA) . PHP_EOL;
echo 'userB: ' . array_to_vector($userB) . PHP_EOL;
echo 'normalized userA: ' . array_to_vector($userANorm) . PHP_EOL;
echo 'normalized userB: ' . array_to_vector($userBNorm) . PHP_EOL;
echo 'Expected approx userA: [0.1489, 0.2222, 0.2105]' . PHP_EOL;
echo 'Expected approx userB: [0.4681, 0.4444, 0.3684]' . PHP_EOL . PHP_EOL;

// Step 2: distance between the two normalized user vectors.
echo 'Step 2: Euclidean distance between users' . PHP_EOL;
echo '----------' . PHP_EOL;
$distance = euclideanDistance($userANorm, $userBNorm);
echo 'distance: ' . $distance . PHP_EOL;
echo 'Expected approx: 0.4197' . PHP_EOL;
echo 'Explanation: d = sqrt((0.1489 - 0.4681)^2 + (0.2222 - 0.4444)^2 + (0.2105 - 0.3684)^2)' . PHP_EOL . PHP_EOL;

// Step 3: relevance of an item to user A as a dot product.
echo 'Step 3: Dot product user-item score' . PHP_EOL;
echo '----------' . PHP_EOL;
$item = [0.2, 0.9, 0.7];
$score = dotProduct($userANorm, $item);
echo 'item: ' . array_to_vector($item) . PHP_EOL;
echo 'score: ' . $score . PHP_EOL;
echo 'Expected approx: 0.3771' . PHP_EOL;
echo 'Explanation: score = (0.1489 * 0.2) + (0.2222 * 0.9) + (0.2105 * 0.7)' . PHP_EOL;
Result:
Step 1: Feature normalization
----------
userA: [25, 3000, 5]
userB: [40, 5000, 8]
normalized userA: [0.14893617021277, 0.22222222222222, 0.21052631578947]
normalized userB: [0.46808510638298, 0.44444444444444, 0.36842105263158]
Expected approx userA: [0.1489, 0.2222, 0.2105]
Expected approx userB: [0.4681, 0.4444, 0.3684]

Step 2: Euclidean distance between users
----------
distance: 0.41972551439053
Expected approx: 0.4197
Explanation: d = sqrt((0.1489 - 0.4681)^2 + (0.2222 - 0.4444)^2 + (0.2105 - 0.3684)^2)

Step 3: Dot product user-item score
----------
item: [0.2, 0.9, 0.7]
score: 0.37715565509518
Expected approx: 0.3771
Explanation: score = (0.1489 * 0.2) + (0.2222 * 0.9) + (0.2105 * 0.7)
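
The three results can also be checked by hand. The following is a worked sketch, assuming that normalize performs min-max scaling; the implementation in code.php is not shown here, but the printed values are consistent with that assumption. The symbols x, a, b, u, and v are introduced only for illustration.

```latex
\begin{align*}
% Step 1: min-max scaling of feature i (first feature of userA shown)
x'_i &= \frac{x_i - \mathrm{min}_i}{\mathrm{max}_i - \mathrm{min}_i},
\qquad x'_1 = \frac{25 - 18}{65 - 18} = \frac{7}{47} \approx 0.1489 \\[4pt]
% Step 2: Euclidean distance between the normalized users
d(a, b) &= \sqrt{\sum_i (a_i - b_i)^2}
 = \sqrt{\left(\tfrac{15}{47}\right)^2 + \left(\tfrac{2}{9}\right)^2 + \left(\tfrac{3}{19}\right)^2}
 \approx \sqrt{0.17617} \approx 0.4197 \\[4pt]
% Step 3: dot product of normalized userA with the item vector
s(u, v) &= \sum_i u_i v_i
 = \tfrac{7}{47}\cdot 0.2 + \tfrac{2}{9}\cdot 0.9 + \tfrac{4}{19}\cdot 0.7
 \approx 0.02979 + 0.20000 + 0.14737 \approx 0.37716
\end{align*}
```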

Takeaway: normalization puts features on comparable scales, Euclidean distance measures how close two users are, and the dot product gives a simple relevance score between a user and an item.
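
The example relies on four helpers from code.php, which is not reproduced in this section. The block below is a minimal sketch of what they might look like, not the original implementation: it assumes min-max scaling for normalize, plain index-by-index loops for the other two functions, and a bracketed, comma-separated format for array_to_vector. The signatures and the zero-range handling are assumptions.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the helpers in code.php (not shown in the source).

/** Min-max scaling per feature: (x - min) / (max - min). */
function normalize(array $values, array $min, array $max): array
{
    $result = [];
    foreach ($values as $i => $value) {
        $range = $max[$i] - $min[$i];
        // Assumption: a feature with zero range maps to 0.0 to avoid division by zero.
        $result[$i] = $range == 0 ? 0.0 : ($value - $min[$i]) / $range;
    }
    return $result;
}

/** Square root of the summed squared differences of two equal-length vectors. */
function euclideanDistance(array $a, array $b): float
{
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += ($value - $b[$i]) ** 2;
    }
    return sqrt($sum);
}

/** Sum of element-wise products of two equal-length vectors. */
function dotProduct(array $a, array $b): float
{
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += $value * $b[$i];
    }
    return $sum;
}

/** Formats a vector as "[v1, v2, ...]". */
function array_to_vector(array $values): string
{
    return '[' . implode(', ', $values) . ']';
}

$min = [18, 1000, 1];
$max = [65, 10000, 20];
$rawA = [25, 3000, 5];
$rawB = [40, 5000, 8];
$normA = normalize($rawA, $min, $max);
$normB = normalize($rawB, $min, $max);

echo 'normalized userA: ' . array_to_vector($normA) . PHP_EOL;
echo 'raw distance: ' . euclideanDistance($rawA, $rawB) . PHP_EOL;
echo 'normalized distance: ' . euclideanDistance($normA, $normB) . PHP_EOL;
echo 'score: ' . dotProduct($normA, [0.2, 0.9, 0.7]) . PHP_EOL;
```

With these stand-ins, the raw distance comes out at about 2000.06 and is driven almost entirely by the second feature, while the normalized distance is about 0.4197. That gap is the practical reason for normalizing before comparing users.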