Examine Balance for the Positive Class of a Model
This function evaluates Balance for the Positive Class, a fairness criterion that checks whether the model assigns similar predicted probabilities across groups among individuals whose true outcome is positive (i.e., \(Y = 1\)).
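As a rough illustration of the idea (not the package's implementation), the criterion amounts to comparing the group means of the predicted probabilities among observations with \(Y = 1\). The object df and the columns y, grp, and p below are hypothetical:
# Keep only observations whose true outcome is positive (Y = 1), then
# compare the average predicted probability between the two groups.
pos <- df[df$y == 1, ]
avg <- tapply(pos$p, pos$grp, mean)
avg              # average predicted probability per group
avg[1] - avg[2]  # difference between the group means
avg[1] / avg[2]  # ratio of the group means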
Usage
eval_pos_class_bal(
data,
outcome,
group,
probs,
confint = TRUE,
alpha = 0.05,
bootstraps = 2500,
digits = 2,
message = TRUE
)
Arguments
- data
Data frame containing the outcome, predicted probabilities, and sensitive attribute
- outcome
Name of the outcome variable
- group
Name of the sensitive attribute
- probs
Name of the variable containing the predicted probabilities
- confint
Logical indicating whether to calculate confidence intervals
- alpha
The significance level for the confidence intervals (the confidence level is 1 - alpha), default is 0.05
- bootstraps
Number of bootstrap samples used to compute the confidence intervals, default is 2500 (see the sketch after this list)
- digits
Number of digits to round the results to, default is 2
- message
Logical indicating whether to print the results, default is TRUE
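When confint = TRUE, the reported intervals are bootstrap-based. Below is a minimal sketch of a percentile bootstrap for the difference in group means among \(Y = 1\) observations, continuing the hypothetical df, y, grp, and p names from above; the package's internal resampling procedure may differ:
set.seed(123)
pos <- df[df$y == 1, ]
boot_diff <- replicate(2500, {
  idx <- sample(nrow(pos), replace = TRUE)     # resample rows with replacement
  m <- tapply(pos$p[idx], pos$grp[idx], mean)  # group means in the resample
  m[1] - m[2]
})
quantile(boot_diff, c(0.025, 0.975))           # percentile interval for alpha = 0.05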
Value
A list containing the following elements:
Average predicted probability for Group 1
Average predicted probability for Group 2
Difference in average predicted probability
Ratio of average predicted probability
If confidence intervals are computed (confint = TRUE):
A vector of length 2 containing the lower and upper bounds of the 95% confidence interval for the difference in average predicted probability
A vector of length 2 containing the lower and upper bounds of the 95% confidence interval for the ratio of average predicted probability
Examples
# \donttest{
library(fairmetrics)
library(dplyr)
library(magrittr)
library(randomForest)
data("mimic_preprocessed")
set.seed(123)
train_data <- mimic_preprocessed %>%
dplyr::filter(dplyr::row_number() <= 700)
# Fit a random forest model
rf_model <- randomForest::randomForest(factor(day_28_flg) ~ ., data = train_data, ntree = 1000)
# Test the model on the remaining data
test_data <- mimic_preprocessed %>%
dplyr::mutate(gender = ifelse(gender_num == 1, "Male", "Female")) %>%
dplyr::filter(dplyr::row_number() > 700)
test_data$pred <- predict(rf_model, newdata = test_data, type = "prob")[, 2]
# Fairness evaluation
# We will use sex as the sensitive attribute and day_28_flg as the outcome.
# Evaluate Balance for Positive Class
eval_pos_class_bal(
data = test_data,
outcome = "day_28_flg",
group = "gender",
probs = "pred"
)
#> There is evidence that the model does not satisfy
#> balance for positive class.
#> Metric GroupFemale GroupMale Difference 95% Diff CI Ratio
#> 1 Avg. Predicted Prob. 0.46 0.37 0.09 [0.04, 0.14] 1.24
#> 95% Ratio CI
#> 1 [1.09, 1.42]
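# As a sanity check on the printed table: Difference = 0.46 - 0.37 = 0.09,
# and Ratio = 0.46 / 0.37, which is about 1.24.
# To work with the results programmatically, the printed summary can be
# suppressed and the returned list captured; its element names are not listed
# above, so inspect the structure directly:
res <- eval_pos_class_bal(
  data = test_data,
  outcome = "day_28_flg",
  group = "gender",
  probs = "pred",
  message = FALSE
)
str(res)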
# }