Is your feature request related to a problem? Please describe.
causalml.metrics.sensitivity already has a Sensitivity framework: placebo treatment, random/irrelevant confounder, subset-data refutation, and SensitivitySelectionBias (based on Blackwell 2014). These are all refutation-style checks. They tell you whether your estimate looks fragile, but none of them give you an actual quantitative bound on how far the true CATE/ATE could move if there's unobserved confounding.
SensitivitySelectionBias also currently logs this on init:
"Only works for linear outcome models right now. Check back soon." (causalml/metrics/sensitivity.py line 396)
That's a real gap for a library built around ML-based meta-learners (S/T/X/R/DR learners). The one sensitivity method that gives a numeric bound doesn't actually support the outcome models causalml is designed around. I'm always frustrated when I fit an X-learner or DR-learner on observational-ish data and the only "how robust is this" tooling available is a refutation test, not an actual interval.
Describe the solution you'd like
Adding sensitivity bounds based on the Marginal Sensitivity Model (MSM), originally Tan (2006), with bounds for the (C)ATE worked out by Zhao, Small & Bhattacharya (2019, JRSS-B) and sharpened by Dorn & Guo (2023, JASA). The NeurIPS 2023 paper by Frauen, Melnychuk & Feuerriegel ("Sharp Bounds for Generalized Causal Sensitivity Analysis," arXiv:2305.16988) shows the standard MSM is the binary-treatment special case of a more general framework, and that their bounds coincide with the Dorn-Guo result there.
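For reference, the MSM constraint can be written as an odds-ratio bound. The notation here is my own shorthand rather than anything taken from causalml: e(x) is the nominal propensity score and e(x, u) is the propensity that also conditions on an unobserved confounder U.

```latex
% e(x)    = P(T = 1 \mid X = x)          nominal propensity score
% e(x, u) = P(T = 1 \mid X = x, U = u)   propensity given the unobserved confounder
\[
  \frac{1}{\Gamma}
  \;\le\;
  \frac{e(x) / \bigl(1 - e(x)\bigr)}{e(x, u) / \bigl(1 - e(x, u)\bigr)}
  \;\le\;
  \Gamma,
  \qquad \Gamma \ge 1 .
\]
```

At Gamma = 1 the two propensities must agree, which is the no-unobserved-confounding case.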
Why MSM specifically:
- It's model-agnostic: it only needs propensity scores and the fitted outcome regression, both of which every meta-learner in causalml.inference.meta already produces. No restriction to linear outcomes like the current SensitivitySelectionBias.
- The bounds are closed-form and sharp for the binary-treatment case, so no grid search or extra optimization loop is needed, and no heavy new dependency (the general GMSM paper needs normalizing flows for the continuous/mediated case, but that's not necessary here).
- It uses one interpretable sensitivity parameter, Gamma >= 1, that bounds how much an unobserved confounder could be affecting treatment assignment. Gamma = 1 just recovers the original point estimate. This Gamma parameterization is basically the standard in the sensitivity-analysis literature at this point (it's what causalsens uses in R, and what most of the recent papers compare against).
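To make the Gamma parameter concrete, here is a minimal sketch of what a given Gamma allows for a single unit: the range of the true propensity score and the matching range of the inverse-propensity weight. The function names are hypothetical and not part of causalml; the arithmetic just applies the odds-ratio constraint to one nominal propensity score.

```python
def msm_propensity_range(e, gamma):
    """Range of the true propensity allowed by the MSM for a nominal score e.

    Hypothetical helper, not a causalml function.
    """
    if gamma < 1:
        raise ValueError("gamma must be >= 1")
    if not 0.0 < e < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    odds = e / (1.0 - e)
    # The MSM lets the true odds differ from the nominal odds by a factor of gamma.
    lo_odds, hi_odds = odds / gamma, odds * gamma
    return lo_odds / (1.0 + lo_odds), hi_odds / (1.0 + hi_odds)


def msm_weight_range(e, gamma):
    """Range of the inverse-propensity weight 1 / e(x, u) under the same constraint."""
    e_lo, e_hi = msm_propensity_range(e, gamma)
    return 1.0 / e_hi, 1.0 / e_lo


for gamma in (1.0, 1.5, 2.0, 3.0):
    lo, hi = msm_propensity_range(0.3, gamma)
    print(f"gamma={gamma}: true propensity in [{lo:.3f}, {hi:.3f}]")
```

With gamma = 1 the range collapses to the nominal score, and it widens monotonically as gamma grows.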
Rough API sketch, following the existing Sensitivity class conventions:
from causalml.metrics.sensitivity import SensitivityMSM

sens = SensitivityMSM(
    df=df,
    inference_features=feature_names,
    p_col='propensity_score',
    treatment_col='treatment',
    outcome_col='y',
    learner=my_fitted_meta_learner,
)

bounds_df = sens.get_msm_bounds(gamma=[1.0, 1.5, 2.0, 3.0])
#    gamma  ate_lower  ate_upper
# 0    1.0       0.42       0.42
# 1    1.5       0.21       0.61
# 2    2.0       0.05       0.74
# 3    3.0      -0.18       0.91

sens.breakdown_gamma()  # smallest Gamma that flips the sign of the effect
sens.plot_msm_bounds(gamma_range=(1, 5))
This would sit alongside SensitivityPlaceboTreatment, SensitivityRandomCause, etc. as another Sensitivity subclass, so it fits the existing module shape instead of introducing a separate API.
For a first pass I would keep scope tight: binary treatment only, (C)ATE only (not mediation or distributional effects, which need the heavier neural density estimation from the general GMSM paper), and the closed-form Dorn & Guo bounds rather than the older non-sharp percentile-bootstrap version from Zhao et al. Propensity estimation is already available via ElasticNetPropensityModel, so there's nothing new to add there either.
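As a rough illustration of what an ATE bound under the MSM looks like, here is a stdlib-only sketch of the simpler, non-sharp variant: each unit's inverse-propensity weight is allowed to move inside its Gamma-implied box, and the Hájek-weighted arm means are pushed to their extremes. This is explicitly not the sharp Dorn & Guo bound proposed above, and every name in it (hajek_extreme, msm_ate_bounds, the toy data) is hypothetical. The threshold scan is O(n^2) for readability.

```python
def hajek_extreme(y, w_lo, w_hi, maximize=True):
    """Extreme of sum(w * y) / sum(w) subject to w_lo[i] <= w[i] <= w_hi[i].

    The optimum gives the high weight to every unit on one side of a
    threshold in y and the low weight to the rest, so scanning all
    n + 1 thresholds is exact.
    """
    order = sorted(range(len(y)), key=lambda i: y[i], reverse=maximize)
    best = None
    for k in range(len(y) + 1):
        num = den = 0.0
        for rank, i in enumerate(order):
            # The first k units in `order` are the most favourable outcomes.
            w = w_hi[i] if rank < k else w_lo[i]
            num += w * y[i]
            den += w
        value = num / den
        if best is None or (value > best if maximize else value < best):
            best = value
    return best


def msm_ate_bounds(y, t, e, gamma):
    """Non-sharp (lower, upper) ATE bounds for binary treatment under the MSM."""
    def arm_bounds(idx, p):
        # p[i] is the nominal probability of the arm that unit i received.
        ys = [y[i] for i in idx]
        lo = [1.0 + (1.0 - p[i]) / (gamma * p[i]) for i in idx]
        hi = [1.0 + gamma * (1.0 - p[i]) / p[i] for i in idx]
        return (hajek_extreme(ys, lo, hi, maximize=False),
                hajek_extreme(ys, lo, hi, maximize=True))

    treated = [i for i in range(len(y)) if t[i] == 1]
    control = [i for i in range(len(y)) if t[i] == 0]
    y1_lo, y1_hi = arm_bounds(treated, e)
    y0_lo, y0_hi = arm_bounds(control, [1.0 - ei for ei in e])
    return y1_lo - y0_hi, y1_hi - y0_lo


# Toy data, made up purely for illustration.
y = [3.0, 1.0, 2.0, 0.5, 2.5, 1.5]
t = [1, 0, 1, 0, 1, 0]
e = [0.6, 0.4, 0.5, 0.3, 0.7, 0.5]

for gamma in (1.0, 1.5, 2.0, 3.0):
    lower, upper = msm_ate_bounds(y, t, e, gamma)
    print(f"gamma={gamma}: ATE in [{lower:.3f}, {upper:.3f}]")
```

At gamma = 1 both ends equal the ordinary Hájek IPW estimate. A real implementation would presumably swap the weight-box step for the sharp bound and vectorize the scan.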
Describe alternatives you've considered
- Just fixing SensitivitySelectionBias to support non-linear models directly. Possible, but Blackwell's selection-bias approach is designed around linear regression coefficients in a fairly fundamental way, so this would mean a different method under the same class rather than a small patch.
- Implementing the full GMSM from Frauen et al. (mediation analysis, continuous treatments, distributional effects). More general, but it needs conditional normalizing flows for density estimation, which is a much bigger lift and a new dependency. Binary MSM covers the common case (causalml's existing learners are basically all binary/multi-treatment CATE) without that cost.
- Pointing users to an external package (e.g. the authors' own SharpCausalSensitivity repo) instead of adding this in-tree. That doesn't integrate with causalml's existing learner objects or propensity scores, so users would have to glue things together themselves.
Additional context
I know the maintainers have mentioned elsewhere (#725) that causalml's primary focus is HTE estimation for randomized experiments rather than general observational causal inference, and that sensitivity tooling should be framed carefully. I would actually frame this addition the same way the existing Sensitivity module already is: not as a green light for observational use, but as another way to show users how fragile their estimate is, just with a number attached instead of a refutation heuristic.
If this direction seems reasonable I would like to take a shot at implementing it. I could start with just get_msm_bounds() for ATE on BaseSLearner/BaseTLearner output and extend from there once the core bound is in.
Let me know if this scope makes sense or if you'd rather see it shaped differently before I put together a PR.