Abstract
Prediction models often fail if train and test data do not stem from the same distribution. Out-of-distribution (OOD) generalization to unseen, perturbed test data is a desirable but difficult-to-achieve property for prediction models and in general requires strong assumptions on the data generating process (DGP). In a causally inspired perspective on OOD generalization, the test data arise from a specific class of interventions on exogenous random variables of the DGP, called anchors. Anchor regression models, introduced by Rothenhäusler et al. (J R Stat Soc Ser B 83(2):215–246, 2021. https://doi.org/10.1111/rssb.12398), protect against distributional shifts in the test data by employing causal regularization. However, so far anchor regression has only been used with a squared-error loss, which is inapplicable to common responses such as censored continuous or ordinal data. Here, we propose a distributional version of anchor regression which generalizes the method to potentially censored responses with at least an ordered sample space. To this end, we combine a flexible class of parametric transformation models for distributional regression with an appropriate causal regularizer under a more general notion of residuals. In an exemplary application and several simulation scenarios, we demonstrate the extent to which OOD generalization is possible.
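For orientation, the squared-error anchor regression of Rothenhäusler et al. that the abstract takes as its starting point can be sketched in a few lines. This is only a minimal illustration of causal regularization in the linear case, not the distributional version proposed in the paper. The function name `anchor_regression`, the toy data, and the choice of `gamma` values are assumptions made for the example, and the data are assumed to be centered so that no intercept is needed.

```python
import numpy as np

def anchor_regression(X, y, A, gamma):
    """Minimize ||(I - P_A)(y - Xb)||^2 + gamma * ||P_A (y - Xb)||^2 over b."""
    # P_A projects onto the column space of the anchor matrix A.
    P_A = A @ np.linalg.pinv(A)
    # The objective equals ||W (y - Xb)||^2 with W = I - (1 - sqrt(gamma)) P_A,
    # so ordinary least squares on the transformed data solves it.
    W = np.eye(len(y)) - (1.0 - np.sqrt(gamma)) * P_A
    coef, *_ = np.linalg.lstsq(W @ X, W @ y, rcond=None)
    return coef

# Hypothetical toy data: the anchor A shifts X, the hidden H confounds X and y.
rng = np.random.default_rng(0)
n = 500
A = rng.normal(size=(n, 1))
H = rng.normal(size=(n, 1))
X = A + H + rng.normal(size=(n, 1))
y = (2.0 * X + H + rng.normal(size=(n, 1))).ravel()

# gamma = 1 leaves the data untransformed, i.e. plain least squares.
b_ols = anchor_regression(X, y, A, gamma=1.0)
# gamma > 1 additionally penalizes residual variation explained by the anchor.
b_anchor = anchor_regression(X, y, A, gamma=10.0)
```

In this sketch, `gamma = 1` recovers ordinary least squares, while larger values of `gamma` push the residuals toward being uncorrelated with the anchors, which is what protects against anchor-induced shifts in the test data.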
Original language | English |
---|---|
Article number | 39 |
Journal | Statistics and Computing |
Volume | 32 |
Issue number | 3 |
Publication status | Published - Jun 2022 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2022, The Author(s).
Keywords
- Anchor regression
- Covariate shift
- Diluted causality
- Distributional regression
- Out-of-distribution generalization
- Transformation models