Case 3. When a tree is convenient in a real product

Implementation with RubixML

Below is runnable code: we build a Labeled dataset, train a ClassificationTree, and predict a decision for a new applicant. We then generate a short explanation from simple business rules (DTI, late payments, credit history length).

 
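<?php

require_once __DIR__ . '/vendor/autoload.php'; // assumes a Composer install of rubix/ml

use Rubix\ML\Classifiers\ClassificationTree;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;

// Seed PHP's random number generator so repeated runs are reproducible.
mt_srand(42);
srand(42);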

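$samples = [
    // income, loan, dti, history, late, cards, age, job, home
    [5000, 10000, 0.2, 5, 0, 2, 35, 5, 1],
    [3000, 15000, 0.6, 2, 3, 4, 28, 2, 0],
    [8000, 20000, 0.25, 8, 0, 3, 45, 10, 1],
    [2500, 5000, 0.5, 1, 2, 1, 23, 1, 0],
    [6000, 12000, 0.3, 6, 1, 2, 38, 7, 1],
    [4000, 18000, 0.7, 3, 4, 5, 30, 3, 0],
    [7000, 9000, 0.18, 9, 0, 2, 41, 9, 1],
    [3200, 11000, 0.55, 2, 1, 3, 29, 2, 0],
    [5200, 16000, 0.42, 4, 2, 2, 36, 6, 1],
    [2800, 14000, 0.65, 2, 3, 4, 26, 2, 0],
    [9000, 25000, 0.33, 10, 0, 4, 48, 12, 1],
    [4200, 8000, 0.38, 4, 0, 1, 33, 4, 0],
    [3500, 22000, 0.75, 3, 5, 5, 31, 3, 0],
    [6500, 15000, 0.28, 7, 1, 2, 39, 8, 1],
    [2700, 6000, 0.49, 1, 2, 1, 24, 1, 0],
    [5600, 13000, 0.31, 6, 1, 2, 37, 7, 1],
    [3800, 17000, 0.68, 3, 4, 4, 30, 3, 0],
    [4800, 9000, 0.22, 5, 0, 1, 27, 6, 1],
    [7500, 14000, 0.62, 4, 4, 3, 44, 9, 1],
    [2900, 7000, 0.28, 6, 0, 2, 25, 2, 0],
    [6200, 10000, 0.58, 2, 3, 4, 34, 5, 1],
];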

// One label per sample, in the same order as $samples.
$labels = [
    'approve',
    'reject',
    'approve',
    'reject',
    'approve',
    'reject',
    'approve',
    'reject',
    'approve',
    'reject',
    'approve',
    'approve',
    'reject',
    'approve',
    'reject',
    'approve',
    'reject',
    'approve',
    'reject',
    'approve',
    'reject',
];

$dataset = new Labeled($samples, $labels);

// Named arguments require PHP 8.0 or later.
$tree = new ClassificationTree(
    maxHeight: 10,
    maxLeafSize: 2
);

$tree->train($dataset);

$applicant = [
    4500,   // income
    12000,  // loan
    0.35,   // dti
    4,      // history
    1,      // late payments
    2,      // cards
    32,     // age
    4,      // job years
    1,      // owns home
];

// predict() expects a dataset, so wrap the single applicant in an array.
$dataset = new Unlabeled([$applicant]);

$prediction = $tree->predict($dataset);

// Builds a short explanation from fixed business rules on the input values.
function explainDecision(array $input): string {
    [
        $income,
        $loan,
        $dti,
        $history,
        $late,
        $cards,
        $age,
        $job,
        $home,
    ] = $input;

    $reasons = [];

    if ($dti > 0.5) {
        $reasons[] = 'high debt burden';
    }

    if ($late > 2) {
        $reasons[] = 'frequent late payments';
    }

    if ($history < 2) {
        $reasons[] = 'short credit history';
    }

    if (empty($reasons)) {
        return 'stable borrower profile';
    }

    return implode(', ', $reasons);
}

$result = [
    'decision' => $prediction[0],
    'explanation' => explainDecision($applicant),
];

echo "Decision: \n";
echo print_r($result, true);

Graph: Decision Tree

Result: Memory: 0.476 MB. Running time: 0.009 sec.
sample: [
    4500,   // income
    12000,  // loan
    0.35,   // dti
    4,      // history
    1,      // late payments
    2,      // cards
    32,     // age
    4,      // job years
    1       // owns home
]

Decision: 
Array
(
    [decision] => approve
    [explanation] => stable borrower profile
)
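
The sample above is a positional array, so its order has to match the training columns (income, loan, dti, history, late, cards, age, job, home). One way to guard against a misplaced value is to accept a keyed record and reorder it before prediction. The sketch below is a minimal illustration in plain PHP; `FEATURE_ORDER` and `toSample()` are hypothetical names, not part of RubixML.

```php
<?php

// Column order used by the training samples in this section.
const FEATURE_ORDER = ['income', 'loan', 'dti', 'history', 'late', 'cards', 'age', 'job', 'home'];

/**
 * Hypothetical helper: turn a keyed applicant record into a positional sample.
 * Throws if a feature is missing, so an incomplete record never reaches the model.
 */
function toSample(array $applicant, array $order = FEATURE_ORDER): array
{
    $sample = [];

    foreach ($order as $name) {
        if (!array_key_exists($name, $applicant)) {
            throw new InvalidArgumentException("Missing feature: $name");
        }

        $sample[] = $applicant[$name];
    }

    return $sample;
}

// Keys may come in any order; the helper restores the training order.
$sample = toSample([
    'age' => 32, 'income' => 4500, 'loan' => 12000, 'dti' => 0.35,
    'history' => 4, 'late' => 1, 'cards' => 2, 'job' => 4, 'home' => 1,
]);

echo implode(', ', $sample), PHP_EOL;
```

The resulting array can be wrapped in an Unlabeled dataset in the same way as the positional `$applicant`.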

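The output has two parts: the tree's prediction (approve/reject) and a human-readable explanation. In real products, it is often important to show not only the model's decision but also a clear reason why an application looks risky. Note that explainDecision() applies fixed thresholds to the input and does not read the tree's splits, so the explanation is computed independently of the prediction.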
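
To see what a "clear reason" might look like for a risky application, here is a minimal sketch of the same rule-based idea that also reports the value that triggered each rule. It is plain PHP and does not use RubixML. The function name `explainWithValues()`, the keyed input format, the thresholds, and the example applicant are all assumptions made for illustration; none of them are learned by the tree.

```php
<?php

/**
 * Sketch: rule-based explanation that includes the triggering value.
 * Thresholds are illustrative assumptions, not splits learned by the tree.
 */
function explainWithValues(array $applicant): array
{
    $rules = [
        // feature => [risk predicate, message]
        'dti'     => [fn ($v) => $v > 0.5, 'high debt burden'],
        'late'    => [fn ($v) => $v > 2,   'frequent late payments'],
        'history' => [fn ($v) => $v < 2,   'short credit history'],
    ];

    $reasons = [];

    foreach ($rules as $feature => [$isRisky, $message]) {
        $value = $applicant[$feature];

        if ($isRisky($value)) {
            $reasons[] = sprintf('%s (%s = %s)', $message, $feature, $value);
        }
    }

    return $reasons;
}

// Hypothetical risky applicant, keyed by feature name for readability.
$risky = ['dti' => 0.7, 'late' => 4, 'history' => 1];

foreach (explainWithValues($risky) as $reason) {
    echo "- $reason", PHP_EOL;
}
```

Keeping the rules in one table makes it easier to review the thresholds with the business side, and an empty result can be mapped to a neutral message such as a stable borrower profile.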