Logistic regression

Case 5. Loan approval

In lending tasks, a model rarely answers "yes or no" directly. Instead, it estimates a probability that is then turned into a decision.
What matters most is not the verdict itself but the client's level of risk. This is why logistic regression is used so often in credit scoring: it gives not just a label but the probability that the client is reliable, a number you can work with.
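
For reference, the probability mentioned above comes from the standard logistic (sigmoid) function applied to a weighted sum of the client's features. The symbols x (feature vector), w (weights), b (bias) and t (decision threshold) are notation assumed here for illustration; they do not appear in the original text.

```latex
p(\text{approve} \mid x) = \sigma(w^\top x + b) = \frac{1}{1 + e^{-(w^\top x + b)}},
\qquad
\text{decision} =
\begin{cases}
\text{approve}, & p(\text{approve} \mid x) \ge t \\
\text{decline}, & p(\text{approve} \mid x) < t
\end{cases}
```

The first expression is the "probability" step, the second is the "decision" step; only the second depends on the threshold.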

Case Goal
Estimate the client’s credit risk and make a loan approval decision.

The model should:
1) Compute the probability that the client will repay the loan
2) Allow flexible decision control via a threshold
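
The second requirement can be illustrated without any library. This is a minimal sketch that assumes the repayment probability is already known; the helper name `decide` and the sample probability are hypothetical, chosen only for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turn a repayment probability into a decision.
 * Approves when the probability reaches or exceeds the threshold.
 */
function decide(float $probability, float $threshold): string
{
    return $probability >= $threshold ? 'APPROVE' : 'DECLINE';
}

// Illustrative value, not produced by any model.
$probability = 0.65;

// The same probability leads to different decisions as the threshold moves.
foreach ([0.5, 0.6, 0.7, 0.8] as $threshold) {
    echo 'Threshold ' . $threshold . ': ' . decide($probability, $threshold) . PHP_EOL;
}
```

With a probability of 0.65, the thresholds 0.5 and 0.6 approve the client, while 0.7 and 0.8 decline. A stricter lender raises the threshold, a more lenient one lowers it, and the model does not need to be retrained in either case.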

 
<?php

// Assumes Composer's autoloader has already been loaded,
// e.g. via require 'vendor/autoload.php';

use Rubix\ML\Classifiers\LogisticRegression;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;

// Training samples: three numeric features per client.
$samples = [
    [3000, 600, 0.4],
    [8000, 750, 0.2],
    [2000, 500, 0.7],
    [10000, 800, 0.1],
];

// One label per sample, in the same order.
$labels = ['decline', 'approve', 'decline', 'approve'];

$dataset = new Labeled($samples, $labels);

$model = new LogisticRegression();
$model->train($dataset);

// New client to score.
$client = new Unlabeled([[5000, 680, 0.3]]);
$prediction = $model->predict($client);

echo 'Predicted label: ' . PHP_EOL;
print_r($prediction);

// proba() returns one [class => probability] map per sample.
$probas = $model->proba($client);
$probabilityOfApproval = $probas[0]['approve'] ?? null;

echo PHP_EOL . 'Probability of approval (class=approve): ';
print_r($probabilityOfApproval);

echo PHP_EOL;

// Approve only if the probability reaches the threshold.
$threshold = 0.6;
$approved = $probabilityOfApproval !== null && $probabilityOfApproval >= $threshold;

echo 'Threshold: ' . $threshold . PHP_EOL;
echo 'Decision: ' . ($approved ? 'APPROVE' : 'DECLINE') . PHP_EOL;

Result:
Memory: 1.191 Mb
Time running: 0.006 sec.
Predicted label: 
Array
(
    [0] => decline
)

Probability of approval (class=approve): 0
Threshold: 0.6
Decision: DECLINE
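
A probability of exactly 0 for a mid-range client is worth a second look. One possible explanation, which the source does not confirm, is that features on very different scales (thousands next to fractions) push the model's internal score far from zero, where the sigmoid saturates. The sketch below only demonstrates that saturation effect in plain PHP; the `sigmoid` helper and the logit values are hypothetical and are not taken from the trained model.

```php
<?php

declare(strict_types=1);

// Standard logistic (sigmoid) function: maps a real-valued logit to a probability.
function sigmoid(float $z): float
{
    return 1.0 / (1.0 + exp(-$z));
}

// Hypothetical logits, chosen only to show how quickly the output collapses.
foreach ([-1.0, -50.0, -800.0] as $z) {
    echo 'z = ' . $z . ' -> p = ' . sigmoid($z) . PHP_EOL;
}
```

For the most negative logit, exp() overflows to INF and the result is exactly 0 in floating point, which matches the shape of the output above. If this is indeed the cause, standardizing the features before training is the usual remedy.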