Probability as degree of confidence
Case 2. Medical test: updating confidence
When developers hear the word "probability", they often imagine dice, coin flips, and the school formula "favorable outcomes divided by all possible outcomes". This is useful, but a very narrow picture. In machine learning and applied analytics, probability almost always means something else – the degree of our confidence in a statement given the available data.
Scenario
Suppose there is a rare disease. We know the following:
- The disease occurs in 1 person out of 1,000.
- Test sensitivity (probability of a positive result given the disease) is 99%.
- Test specificity (probability of a negative result given a healthy person) is 95%.
A patient takes the test and gets a positive result. Question: what is the probability that they truly have the disease?
An intuitive answer often sounds like “about 99%”. But that is wrong. The reason is the rarity of the event.
Example of use
<?php
$prior = 0.001; // P(disease)
$sensitivity = 0.99; // P(positive | disease)
$specificity = 0.95; // P(negative | healthy)
$falsePositive = 1 - $specificity; // P(negative | ealthy)
$posterior = ($sensitivity * $prior)
/ (($sensitivity * $prior) + ($falsePositive * (1 - $prior)));
echo "True positive cases: " . ($sensitivity * $prior) . PHP_EOL;
echo "False positive results: " . ($falsePositive * (1 - $prior)) . PHP_EOL;
echo "Probability: " . round($posterior, 4);
Result:
Memory: 0 Mb
Time running: < 0.001 sec.
True positive cases: 0.00099
False positive results: 0.04995
Probability: 0.0194