L-infinity histogram gap
- humancompatible.detect.methods.l_inf.l_inf.check_l_inf_gap(X: ndarray, y: ndarray, binarizer: Binarizer, feature_involved: str, subgroup_to_check: Any, delta: float) → float
Test whether a protected subgroup’s outcome distribution differs from that of the overall population by at most delta in the L-infinity norm.
- Parameters:
X (np.ndarray) – Protected-attribute slice of the dataset (same rows as y).
y (np.ndarray) – Boolean target vector.
binarizer (Binarizer) – The binarizer used to encode X and y.
feature_involved (str) – Name of the protected column whose subgroup is tested.
subgroup_to_check (Any) – Raw value of the subgroup to isolate.
delta (float) – Threshold for the L-infinity norm.
- Returns:
1.0 (True) if the subgroup histogram is within delta of the overall histogram in the L-infinity norm; 0.0 (False) otherwise.
- Return type:
float
- Raises:
ValueError – If delta is not positive.
KeyError – If feature_involved is not in the binarizer’s feature names.
KeyError – If subgroup_to_check is not a valid value for the feature.
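The check described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: it assumes a two-bin histogram over a boolean target and takes a precomputed boolean subgroup mask instead of a Binarizer. The names l_inf_gap and within_delta are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def l_inf_gap(y, mask):
    """Largest per-bin difference between overall and subgroup outcome densities.

    Hypothetical helper: assumes a boolean target and a boolean subgroup mask.
    """
    y = np.asarray(y, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("subgroup is empty")
    # Two-bin histograms (False, True), normalised to densities.
    hist_all = np.bincount(y.astype(int), minlength=2) / y.size
    hist_sub = np.bincount(y[mask].astype(int), minlength=2) / mask.sum()
    # The L-infinity norm is the maximum absolute per-bin difference.
    return float(np.max(np.abs(hist_all - hist_sub)))


def within_delta(y, mask, delta):
    """Return 1.0 if the gap is at most delta, else 0.0 (mirrors the float return)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return 1.0 if l_inf_gap(y, mask) <= delta else 0.0


y = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
mask = group == "a"

gap = l_inf_gap(y, mask)               # overall [0.5, 0.5] vs subgroup [0.25, 0.75]
passes_loose = within_delta(y, mask, 0.3)
passes_tight = within_delta(y, mask, 0.1)
```

In this toy data the subgroup "a" has a positive rate of 0.75 against 0.5 overall, so the check passes for delta = 0.3 and fails for delta = 0.1.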
- humancompatible.detect.methods.l_inf.lp_tools.lin_prog_feas(hist1: ndarray, hist2: ndarray, delta: float, num_samples: float = 1.0) → int
Sample a fraction of the histogram bins and check whether every sampled bin satisfies
|hist1 - hist2| <= delta.
- Parameters:
hist1 (np.ndarray) – 1-D array of histogram bin densities for the full dataset.
hist2 (np.ndarray) – 1-D array of histogram bin densities for the subgroup.
delta (float) – Threshold for the absolute difference |hist1 - hist2|.
num_samples (float) – Fraction of total bins to sample. The function draws int(num_samples * (len(hist1) - 1)) random samples.
- Returns:
Status code from scipy.optimize.linprog. A status of 0 indicates that the constraints are feasible (i.e., |hist1 - hist2| <= delta for all sampled bins); other codes signal infeasibility or solver errors.
- Return type:
int
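The sampled per-bin test can be sketched directly in NumPy. This is an illustration only: it returns a bool rather than a scipy.optimize.linprog status code, it assumes bins are drawn uniformly with replacement, and the function name sampled_bins_within_delta and its seed parameter are hypothetical. Only the draw count int(num_samples * (len(hist1) - 1)) is taken from the parameter description above.

```python
import numpy as np


def sampled_bins_within_delta(hist1, hist2, delta, num_samples=1.0, seed=0):
    """Check |hist1 - hist2| <= delta on a random sample of bins.

    Hypothetical helper; sampling with replacement is an assumption.
    """
    hist1 = np.asarray(hist1, dtype=float)
    hist2 = np.asarray(hist2, dtype=float)
    if hist1.ndim != 1 or hist1.shape != hist2.shape:
        raise ValueError("hist1 and hist2 must be 1-D arrays of equal length")
    rng = np.random.default_rng(seed)
    # Number of draws as stated in the num_samples parameter description.
    n_draws = int(num_samples * (len(hist1) - 1))
    idx = rng.integers(0, len(hist1), size=n_draws)
    # Every sampled bin must satisfy the threshold.
    return bool(np.all(np.abs(hist1[idx] - hist2[idx]) <= delta))


full = np.array([0.25, 0.25, 0.25, 0.25])
sub = np.array([0.40, 0.10, 0.40, 0.10])   # every bin differs by 0.15

ok_loose = sampled_bins_within_delta(full, sub, delta=0.2)
ok_tight = sampled_bins_within_delta(full, sub, delta=0.1)
```

Because only a sample of bins is inspected, a passing result with num_samples below 1.0 does not guarantee that unsampled bins also satisfy the threshold.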