Distributional anchor regression

Lucas Kook*, Beate Sick, Peter Bühlmann

*Corresponding author for this work

Publication: Scientific journal, Journal article, peer-reviewed

Abstract

Prediction models often fail if train and test data do not stem from the same distribution. Out-of-distribution (OOD) generalization to unseen, perturbed test data is a desirable but difficult-to-achieve property for prediction models and in general requires strong assumptions on the data generating process (DGP). In a causally inspired perspective on OOD generalization, the test data arise from a specific class of interventions on exogenous random variables of the DGP, called anchors. Anchor regression models, introduced by Rothenhäusler et al. (J R Stat Soc Ser B 83(2):215–246, 2021. https://doi.org/10.1111/rssb.12398), protect against distributional shifts in the test data by employing causal regularization. However, so far anchor regression has only been used with a squared-error loss, which is inapplicable to common responses such as censored continuous or ordinal data. Here, we propose a distributional version of anchor regression which generalizes the method to potentially censored responses with at least an ordered sample space. To this end, we combine a flexible class of parametric transformation models for distributional regression with an appropriate causal regularizer under a more general notion of residuals. In an exemplary application and several simulation scenarios, we demonstrate the extent to which OOD generalization is possible.
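For orientation, the causal regularization mentioned in the abstract can be sketched for the original linear, squared-error anchor regression of Rothenhäusler et al.; this is not the distributional version proposed in the article. With P_A the projection onto the anchors, the estimator minimizes ||(I - P_A)(Y - Xb)||^2 + gamma * ||P_A(Y - Xb)||^2, which is equivalent to ordinary least squares after transforming X and Y with I - (1 - sqrt(gamma)) P_A. The function names, the single-covariate and single-anchor setup, and the simulated toy data below are assumptions made purely for illustration.

```python
import math
import random


def center(v):
    """Subtract the sample mean so that no intercept is needed."""
    m = sum(v) / len(v)
    return [vi - m for vi in v]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def anchor_transform(v, a, gamma):
    """Return v - (1 - sqrt(gamma)) * P_a v, where P_a projects onto the anchor a."""
    coef = dot(a, v) / dot(a, a)  # P_a v = coef * a
    shrink = 1.0 - math.sqrt(gamma)
    return [vi - shrink * coef * ai for vi, ai in zip(v, a)]


def anchor_slope(a, x, y, gamma):
    """Slope of linear anchor regression with one covariate and one anchor."""
    a, x, y = center(a), center(x), center(y)
    x_t = anchor_transform(x, a, gamma)
    y_t = anchor_transform(y, a, gamma)
    # Least squares on the transformed data solves the anchor objective.
    return dot(x_t, y_t) / dot(x_t, x_t)


# Hypothetical toy data: anchor a, hidden confounder h, true slope 1.
random.seed(0)
n = 2000
a = [random.gauss(0, 1) for _ in range(n)]
h = [random.gauss(0, 1) for _ in range(n)]
x = [ai + hi + random.gauss(0, 1) for ai, hi in zip(a, h)]
y = [xi + hi + random.gauss(0, 1) for xi, hi in zip(x, h)]

for gamma in (0.0, 1.0, 10.0, 1e6):
    print(gamma, round(anchor_slope(a, x, y, gamma), 3))
```

In this sketch, gamma = 1 reproduces ordinary least squares, gamma = 0 partials out the anchor, and large gamma approaches the instrumental-variable estimate that uses the anchor as instrument.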

Original language: English
Article number: 39
Journal: Statistics and Computing
Volume: 32
Issue number: 3
Publication status: Published - Jun 2022
Externally published: Yes

Bibliographical note

Publisher Copyright: © 2022, The Author(s).

Keywords

  • Anchor regression
  • Covariate shift
  • Diluted causality
  • Distributional regression
  • Out-of-distribution generalization
  • Transformation models
