'I apologise for my poor blogging': Searching for apologies in the Birmingham Blog Corpus

Ursula Lutzky, Andrew Kehoe

Publication: Scientific journal, Journal article, peer-review



This study addresses a familiar challenge in corpus pragmatic research: the search for functional phenomena in large electronic corpora. Speech acts are one area of research that falls into this functional domain and the question of how to identify them in corpora has occupied researchers over the past 20 years. This study focuses on apologies as a speech act that is characterised by a standard set of routine expressions, making it easier to search for with corpus linguistic tools. Nevertheless, even for a comparatively formulaic speech act, such as apologies, the polysemous nature of forms (cf. e.g. I am sorry vs. a sorry state) impacts the precision of the search output so that previous studies of smaller data samples had to resort to manual microanalysis. In this study, we introduce an innovative methodological approach that demonstrates how the combination of different types of collocational analysis can facilitate the study of speech acts in larger corpora. By first establishing a collocational profile for each of the Illocutionary Force Indicating Devices associated with apologies and then scrutinising their shared and unique collocates, unwanted hits can be discarded and the amount of manual intervention reduced. Thus, this article introduces new possibilities in the field of corpus-based speech act analysis and encourages the study of pragmatic phenomena in large corpora.
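The two-step idea described in the abstract (build a collocational profile for each Illocutionary Force Indicating Device, then compare shared and unique collocates) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sentence, the two-item IFID list, the window size of 2 and the use of raw co-occurrence counts are all assumptions made for illustration, and a real study would typically rank collocates with a statistical association measure over a full corpus.

```python
from collections import Counter

# Hypothetical IFID list for illustration; the article's actual set is not given here.
IFIDS = ["sorry", "apologise"]


def collocates(tokens, node, window=2):
    """Count words occurring within `window` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


def shared_and_unique(profiles):
    """Return collocates shared by every node and collocates unique to each node."""
    sets = {node: set(counts) for node, counts in profiles.items()}
    shared = set.intersection(*sets.values()) if sets else set()
    unique = {}
    for node, own in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != node))
        unique[node] = own - others
    return shared, unique


# Toy data (assumed): two apology uses and one non-apology use of "sorry".
text = "i am sorry for the delay . i apologise for the delay . what a sorry state"
tokens = text.split()

profiles = {ifid: collocates(tokens, ifid) for ifid in IFIDS}
shared, unique = shared_and_unique(profiles)

print("shared:", sorted(shared))
for ifid in IFIDS:
    print(ifid, "unique:", sorted(unique[ifid]))
```

In this toy case, collocates such as "state" appear only with "sorry", which hints at how unique collocates might flag non-apology uses (a sorry state) for exclusion before any manual checking.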
Original language: English
Pages (from-to): 37-56
Journal: Corpus Pragmatics
Issue number: 1
Publication status: Published - 2017

Austrian Classification of Fields of Science and Technology (ÖFOS)

  • 602004 General linguistics
  • 602
  • 602011 Computational linguistics


Keywords

  • Speech Acts
  • Apology
  • Collocation
  • Big Data
  • Blogs
