Logistic regression

Case 7. Medical screening

Medicine is a domain where machine learning has to be applied with particular care. Predicting a class is not enough: the result also has to be interpreted correctly.

In such scenarios, logistic regression is often used for screening, that is, for a preliminary risk assessment rather than a diagnosis.

Case Goal
Estimate the probability of having a disease based on basic patient indicators.

The model should:
1) Assess risk
2) Help identify patients who need additional examination
3) Work as a decision-support tool, not a replacement for a doctor
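
To make "estimating a probability" concrete, here is a minimal sketch of the calculation a logistic regression model performs at prediction time: a weighted sum of the patient indicators is passed through the sigmoid function, which maps any score to a value between 0 and 1. The function names `sigmoid` and `riskProbability` are hypothetical helpers introduced for this illustration, and any weights and bias passed in are placeholders, not values learned from data.

```php
<?php

// Sigmoid (logistic) function: maps any real-valued score to the range (0, 1).
function sigmoid(float $z): float
{
    return 1.0 / (1.0 + exp(-$z));
}

// Hypothetical helper: combines features with weights and a bias,
// then converts the resulting score into a probability.
// The weights and bias are supplied by the caller (illustrative only).
function riskProbability(array $features, array $weights, float $bias): float
{
    $z = $bias;

    foreach ($features as $i => $value) {
        $z += $weights[$i] * $value;
    }

    return sigmoid($z);
}
```

A score of exactly 0 corresponds to a probability of 0.5, so `riskProbability([1.0, 2.0], [0.5, -0.25], 0.0)` returns 0.5. Positive scores push the probability toward 1, negative scores toward 0.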

Example of use:

 
<?php

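// Assumes Rubix ML was installed with Composer (composer require rubix/ml)
// and that this script sits next to the vendor/ directory.
require __DIR__ . '/vendor/autoload.php';

use Rubix\ML\Classifiers\LogisticRegression;
use Rubix\ML\Datasets\Labeled;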

// Each sample is a patient described by 4 numeric indicators (toy example).
// Example interpretation (you can rename/adjust for your domain):
// - feature #1: age
// - feature #2: systolic blood pressure
// - feature #3: BMI
// - feature #4: cholesterol (or another lab measurement)
$samples = [
    [30, 120, 22, 90],
    [60, 150, 30, 140],
    [45, 140, 28, 130],
    [25, 110, 20, 85],
    [50, 135, 26, 120],
    [55, 145, 29, 135],
    [35, 125, 24, 100],
    [40, 130, 27, 110],
    [65, 160, 32, 150],
    [28, 115, 21, 95],
];

// Rubix classifiers require categorical (discrete) labels, so we use strings.
// Here the model is used for screening: low_risk/high_risk, not a diagnosis.
$labels = [
    'low_risk',
    'high_risk',
    'high_risk',
    'low_risk',
    'high_risk',
    'high_risk',
    'low_risk',
    'low_risk',
    'high_risk',
    'low_risk',
];

$dataset = new Labeled($samples, $labels);

// Train the logistic regression classifier on labeled patient examples.
$model = new LogisticRegression();
$model->train($dataset);
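
Once the model is trained, the screening step itself could look roughly like the sketch below. It assumes the probabilities arrive in the shape returned by Rubix ML's `proba()` method, one class-to-probability map per patient. The helper `flagForFollowUp`, the 0.3 threshold, and the hand-written probability values are hypothetical and exist only for illustration; a real threshold would have to be chosen together with clinicians.

```php
<?php

/**
 * Hypothetical helper: marks patients whose estimated probability of the
 * positive class reaches the threshold as candidates for further examination.
 *
 * @param array<int, array<string, float>> $probabilities one class => probability map per patient
 */
function flagForFollowUp(
    array $probabilities,
    float $threshold = 0.5,
    string $positiveClass = 'high_risk'
): array {
    $flags = [];

    foreach ($probabilities as $i => $distribution) {
        // A missing class is treated as probability 0.
        $p = $distribution[$positiveClass] ?? 0.0;

        $flags[$i] = [
            'probability' => $p,
            'follow_up' => $p >= $threshold,
        ];
    }

    return $flags;
}

// With the trained model, the input would come from something like:
//   $probabilities = $model->proba(new \Rubix\ML\Datasets\Unlabeled($newPatients));
// where $newPatients is a hypothetical array of new patient samples.
// Made-up placeholder values are used here so the sketch runs without Rubix ML.
$probabilities = [
    ['low_risk' => 0.80, 'high_risk' => 0.20],
    ['low_risk' => 0.35, 'high_risk' => 0.65],
];

// A threshold below 0.5 makes the screening more cautious:
// more patients are referred, fewer at-risk patients are missed.
foreach (flagForFollowUp($probabilities, 0.3) as $i => $flag) {
    printf(
        "patient %d: p(high_risk)=%.2f, follow-up: %s\n",
        $i,
        $flag['probability'],
        $flag['follow_up'] ? 'yes' : 'no'
    );
}
```

Keeping the probability next to the flag, rather than returning only a yes/no answer, fits the decision-support role described above: the doctor sees how confident the model is and makes the final call.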