Examine Negative Predictive Parity of a Model
eval_neg_pred_parity.Rd
This function evaluates negative predictive parity, a key fairness criterion that compares the Negative Predictive Value (NPV) between groups defined by a sensitive attribute. In other words, it assesses whether, among individuals predicted to be negative, the probability of being truly negative is equal across subgroups.
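The quantity being compared can be illustrated by hand. The following is a minimal base R sketch on made-up data, not the package's implementation: the data frame `toy`, its columns, and the helper `npv()` are hypothetical, and it assumes an observation is predicted negative when its predicted probability is strictly below the cutoff.

```r
# Toy data: every name here is hypothetical and used only for illustration.
toy <- data.frame(
  y     = c(0, 0, 1, 0, 1, 0, 0, 1, 0, 0),
  grp   = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
  p_hat = c(0.1, 0.2, 0.3, 0.7, 0.9, 0.2, 0.4, 0.1, 0.3, 0.8)
)
cutoff <- 0.5

# NPV = TN / (TN + FN): among predicted negatives, the share that is truly negative.
# Assumption: "predicted negative" means p_hat < cutoff.
npv <- function(y, p_hat, cutoff) {
  pred_neg <- p_hat < cutoff
  mean(y[pred_neg] == 0)
}

# NPV within each level of the sensitive attribute
npv_by_group <- sapply(split(toy, toy$grp), function(d) npv(d$y, d$p_hat, cutoff))

# Parity holds when the difference is near 0, or equivalently the ratio is near 1
npv_diff  <- npv_by_group[["A"]] - npv_by_group[["B"]]
npv_ratio <- npv_by_group[["A"]] / npv_by_group[["B"]]
```

Here group A has three predicted negatives of which two are truly negative, and group B has four of which three are truly negative, so the two NPVs are 2/3 and 3/4.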
Usage
eval_neg_pred_parity(
  data,
  outcome,
  group,
  probs,
  cutoff = 0.5,
  confint = TRUE,
  bootstraps = 2500,
  alpha = 0.05,
  digits = 2,
  message = TRUE
)
Arguments
- data
Data frame containing the outcome, predicted outcome, and sensitive attribute
- outcome
Name of the outcome variable; it must be binary
- group
Name of the sensitive attribute
- probs
Name of the predicted outcome variable
- cutoff
Threshold for the predicted outcome, default is 0.5
- confint
Whether to compute the confidence interval (95% at the default alpha), default is TRUE
- bootstraps
Number of bootstrap samples, default is 2500
- alpha
Significance level for the confidence interval (the confidence level is 1 - alpha), default is 0.05
- digits
Number of digits to round the results to, default is 2
- message
Whether to print the results, default is TRUE
Value
A list containing the following elements:
NPV_Group1: Negative Predictive Value for the first group
NPV_Group2: Negative Predictive Value for the second group
NPV_Diff: Difference in Negative Predictive Value
NPV_Ratio: Ratio in Negative Predictive Value
If confidence intervals are computed (confint = TRUE):
NPV_Diff_CI: A vector of length 2 containing the lower and upper bounds of the 95% confidence interval for the difference in Negative Predictive Value
NPV_Ratio_CI: A vector of length 2 containing the lower and upper bounds of the 95% confidence interval for the ratio in Negative Predictive Value
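To give a sense of how a bootstrap interval for the NPV difference can be obtained, here is a rough percentile-bootstrap sketch in base R. It is an illustration under stated assumptions and not necessarily the procedure the package uses: the simulated data `sim`, the helpers `npv()` and `npv_diff()`, and the choice of a plain percentile interval are all hypothetical.

```r
set.seed(1)

# Simulated data; all names are hypothetical and used only for illustration.
n <- 400
sim <- data.frame(
  grp   = sample(c("A", "B"), n, replace = TRUE),
  p_hat = runif(n)
)
sim$y <- rbinom(n, 1, sim$p_hat)

# NPV among observations predicted negative (assumed: p_hat < cutoff)
npv <- function(d, cutoff) {
  pred_neg <- d$p_hat < cutoff
  mean(d$y[pred_neg] == 0)
}

# Difference in NPV between the two groups
npv_diff <- function(d, cutoff) {
  npv(d[d$grp == "A", ], cutoff) - npv(d[d$grp == "B", ], cutoff)
}

n_boot <- 500   # kept small here; the function's default is 2500
alpha  <- 0.05

# Resample rows with replacement and recompute the difference each time
boot_diffs <- replicate(n_boot, {
  idx <- sample(nrow(sim), replace = TRUE)
  npv_diff(sim[idx, ], cutoff = 0.5)
})

# Percentile interval at confidence level 1 - alpha
diff_ci <- quantile(boot_diffs, probs = c(alpha / 2, 1 - alpha / 2), na.rm = TRUE)
```

An interval for the ratio could be sketched the same way by replacing the subtraction in `npv_diff()` with a division.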
Examples
# \donttest{
library(fairmetrics)
library(dplyr)
library(magrittr)
library(randomForest)
data("mimic_preprocessed")
set.seed(123)
train_data <- mimic_preprocessed %>%
  dplyr::filter(dplyr::row_number() <= 700)
# Fit a random forest model
rf_model <- randomForest::randomForest(factor(day_28_flg) ~ ., data = train_data, ntree = 1000)
# Test the model on the remaining data
test_data <- mimic_preprocessed %>%
  dplyr::mutate(gender = ifelse(gender_num == 1, "Male", "Female")) %>%
  dplyr::filter(dplyr::row_number() > 700)
test_data$pred <- predict(rf_model, newdata = test_data, type = "prob")[, 2]
# Fairness evaluation
# We will use sex as the sensitive attribute and day_28_flg as the outcome.
# We choose threshold = 0.41 so that the overall FPR is around 5%.
# Evaluate Negative Predictive Parity
eval_neg_pred_parity(
  data = test_data,
  outcome = "day_28_flg",
  group = "gender",
  probs = "pred",
  cutoff = 0.41
)
#> There is not enough evidence that the model does not satisfy negative predictive parity.
#> Metric GroupFemale GroupMale Difference 95% Diff CI
#> 1 Negative Predictive Value 0.92 0.9 0.02 [-0.15, 0.19]
#> Ratio 95% Ratio CI
#> 1 1.02 [0.78, 1.34]
# }
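The example above picks its threshold so that the overall false positive rate (FPR) is around 5%, but does not show how. One possible way to find such a cutoff, sketched here on simulated data, is to take the corresponding quantile of the predicted probabilities among the true negatives. The vectors `y` and `p_hat` are hypothetical, and this is not claimed to be how the value 0.41 was obtained.

```r
set.seed(42)

# Simulated outcome and predicted probabilities; names are hypothetical.
n <- 1000
y <- rbinom(n, 1, 0.2)
p_hat <- ifelse(y == 1, rbeta(n, 4, 2), rbeta(n, 2, 4))

target_fpr <- 0.05

# FPR = share of true negatives predicted positive (assumed: p_hat >= cutoff),
# so the cutoff is the (1 - target_fpr) quantile of p_hat among true negatives.
cutoff <- unname(quantile(p_hat[y == 0], probs = 1 - target_fpr))

# Check the FPR actually achieved at that cutoff
fpr <- mean(p_hat[y == 0] >= cutoff)
```

The resulting `cutoff` could then be passed as the `cutoff` argument of `eval_neg_pred_parity()`.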