Case 3: Comparing texts

Implementation in pure PHP

Below is a runnable pure PHP example that tokenizes text, builds term-frequency (TF) vectors, computes cosine similarity between the query and each document, and ranks the documents by score.
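For reference, the ranking score is the standard cosine similarity between two TF vectors. The symbols q (query vector) and d (document vector) are notation assumed here for illustration:

```latex
\[
\cos(\mathbf{q}, \mathbf{d})
  = \frac{\mathbf{q} \cdot \mathbf{d}}{\lVert \mathbf{q} \rVert \, \lVert \mathbf{d} \rVert}
  = \frac{\sum_{i} q_i d_i}{\sqrt{\sum_{i} q_i^{2}} \, \sqrt{\sum_{i} d_i^{2}}}
\]
```

Because term counts are never negative, the score falls between 0 (no shared terms) and 1 (vectors pointing in the same direction).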

 
<?php

// code.php is assumed to define textToVector(), alignVectors(), and cosineSimilarity().
require_once __DIR__ . '/code.php';

echo 'Implementation in pure PHP' . PHP_EOL;
echo '----------' . PHP_EOL;

$documents = [
    'Machine learning is used to analyze data, which is crucial for the outcome.',
    'Deep learning is a branch of machine learning.',
    'Cats and dogs are popular pets.',
];

$query = 'machine learning algorithms';

// Vectorize the query once, then compare it against every document.
$queryVector = textToVector($query);
$scores = [];

foreach ($documents as $document) {
    $docVector = textToVector($document);
    // Align the two vectors so they have matching dimensions before comparing.
    [$alignedQuery, $alignedDoc] = alignVectors($queryVector, $docVector);

    $scores[] = [
        'text' => $document,
        'score' => cosineSimilarity($alignedQuery, $alignedDoc),
    ];
}

// Sort by score, highest first.
usort($scores, fn($a, $b) => $b['score'] <=> $a['score']);

print_r($scores);
Result (memory: 0.012 MB, running time: 0.001 sec):
Implementation in pure PHP
----------
Array
(
    [0] => Array
        (
            [text] => Deep learning is a branch of machine learning.
            [score] => 0.54772255750517
        )

    [1] => Array
        (
            [text] => Machine learning is used to analyze data, which is crucial for the outcome.
            [score] => 0.29814239699997
        )

    [2] => Array
        (
            [text] => Cats and dogs are popular pets.
            [score] => 0
        )

)
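The script relies on three helpers loaded from code.php, which is not shown here. The following is a minimal sketch of what they might look like, not the original file: it assumes lowercase tokenization on non-alphanumeric characters, raw term counts as TF, and a score of 0.0 when either vector is empty. Only the function names and call signatures are taken from the script above.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers: the real code.php is not shown, so the bodies
// below are assumptions that match the call signatures used in the script.

/** Lowercase the text, split on non-alphanumerics, and count raw term frequencies. */
function textToVector(string $text): array
{
    $tokens = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    return array_count_values($tokens);
}

/** Project two sparse term => count maps onto one shared, ordered vocabulary. */
function alignVectors(array $a, array $b): array
{
    $alignedA = [];
    $alignedB = [];

    foreach (array_keys($a + $b) as $term) {
        $alignedA[] = $a[$term] ?? 0;
        $alignedB[] = $b[$term] ?? 0;
    }

    return [$alignedA, $alignedB];
}

/** Cosine similarity of two equal-length vectors; 0.0 if either has zero length. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Usage: score the query against the second document.
[$q, $d] = alignVectors(
    textToVector('machine learning algorithms'),
    textToVector('Deep learning is a branch of machine learning.')
);
echo round(cosineSimilarity($q, $d), 4), PHP_EOL; // 3 / sqrt(30), prints 0.5477
```

Under these assumptions the sketch reproduces the scores in the output above. For the top result, the query and document share "machine" (1 × 1) and "learning" (1 × 2), so the dot product is 3; the vector lengths are √3 and √10, giving 3 / √30 ≈ 0.5477. The first document shares the same two terms once each and has length √15, giving 2 / √45 ≈ 0.2981.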

Takeaway: even without libraries, you can build a clear baseline for text search and relevance ranking.