Distributional regression modeling via generalized additive models for location, scale, and shape: An overview through a data set from learning analytics

Fernando Marmolejo-Ramos*, Mauricio Tejo, Marek Brabec, Jakub Kuzilek, Srecko Joksimovic, Vitomir Kovanovic, Jorge González, Thomas Kneib, Peter Bühlmann, Lucas Kook, Guillermo Briseño-Sánchez, Raydonal Ospina

*Corresponding author for this work

Publication: Scientific journal, Journal article, peer-review

Abstract

Technological developments now allow large amounts of data to be gathered in several research fields. Learning analytics (LA)/educational data mining has access to big observational unstructured data captured from educational settings and relies mostly on unsupervised machine learning (ML) algorithms to make sense of such data. Generalized additive models for location, scale, and shape (GAMLSS) are a supervised statistical learning framework that allows modeling all the parameters of the distribution of the response variable with respect to the explanatory variables. This article overviews the power and flexibility of GAMLSS in relation to some ML techniques. The capability of GAMLSS to be tailored toward causality via causal regularization is also briefly discussed. This overview is illustrated via a data set from the field of LA. This article is categorized under: Application Areas > Education and Learning; Algorithmic Development > Statistics; Technologies > Machine Learning.

Original language: English
Article number: e1479
Journal: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Volume: 13
Issue number: 1
DOIs
Publication status: Published - 1 Jan 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 The Authors. WIREs Data Mining and Knowledge Discovery published by Wiley Periodicals LLC.

Keywords

  • causal regularization
  • causality
  • educational data mining
  • generalized additive models for location, scale, and shape
  • learning analytics
  • machine learning
  • statistical learning
  • statistical modeling
  • supervised learning
