TY - JOUR
T1 - Is it all bafflegab? – Linguistic and meta characteristics of research articles in prestigious economics journals
AU - Amon, Julian
AU - Hornik, Kurt
PY - 2022
Y1 - 2022
AB - In competitive research environments, scholars have a natural interest to maximize the prestige associated with their scientific work. In order to identify factors that might help them address this goal more effectively, the scientometric literature has tried to link linguistic and meta characteristics of academic papers to the associated degree of scientific prestige, conceptualized as cumulative citation counts. In this paper, we take an alternative approach that instead understands scientific prestige in terms of the rankings of the journals that the articles appeared in, as such rankings are routinely used as surrogate research quality indicators. For the purpose of determining the most important drivers of suchlike prestige, we use state-of-the-art text mining tools to extract 344 interpretable features from a large corpus of over 200,000 journal articles in economics. We then estimate beta regression models to investigate the relationship between these predictors and a cross-sectionally standardized version of SCImago Journal Rank (SJR) in multiple topically homogeneous clusters. In so doing, we also reinvestigate the bafflegab theory, according to which more prestigious research papers tend to be less readable, in a methodologically novel way. Our results show the consistently most informative predictors to be associated with the length of the paper, the span of coreference chains in its full text, the deployment of a personal and moderately informal writing style, the “density” of the article in terms of sentences per page, international and institutional collaboration in research teams and the references cited in the paper. Moreover, we identify various linguistic intricacies that matter in the association between readability and scientific prestige, which suggest this relationship to be more complicated than previously assumed.
U2 - 10.1016/j.joi.2022.101284
DO - 10.1016/j.joi.2022.101284
M3 - Journal article
SN - 1751-1577
VL - 16
JO - Journal of Informetrics
JF - Journal of Informetrics
IS - 2
ER -