STEM vs humanities: how ranking methodologies create disciplinary winners and losers
Why research-output metrics and journal-focused databases systematically favor STEM disciplines and obscure humanities and social science contributions.
Publication norms: journals vs books
The fundamental structural difference between STEM and humanities disciplines lies in their publication cultures. In the natural sciences, medicine, and engineering, peer-reviewed journal articles are the primary vehicle for communicating research findings. These articles are well covered by bibliometric databases, indexed systematically, and tracked through citation metrics that ranking systems rely upon. In the humanities and many social sciences, books—monographs, edited volumes, and scholarly editions—remain the gold standard of scholarly contribution. A humanities scholar may spend five to ten years on a single book-length project that represents a career-defining contribution, yet this output is largely invisible to rankings built around journal article databases.
The book citation problem compounds this invisibility. Bibliometric databases such as Web of Science and Scopus have historically focused on journal articles, with limited coverage of books. Even when books are indexed, their citation patterns differ from journal articles: book citations accumulate more slowly and over longer periods, while ranking systems often use citation windows of only a few years. This temporal mismatch means that even the small proportion of books captured by these databases may appear to have low impact simply because the measurement window is too short. A book that becomes a standard reference in its field over a decade will register minimal citations during the period rankings measure.
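The temporal mismatch described above can be made concrete with a small sketch. The yearly citation counts below are invented purely for illustration (they are not drawn from any database), and the three-year window is an assumed value standing in for "only a few years". The sketch compares how much of a ten-year citation total falls inside a short measurement window for a fast-peaking article and for a slow-building book.

```python
# Hypothetical citations received in each of the ten years after publication.
# These numbers are illustrative assumptions, not real data.
article_citations_per_year = [12, 20, 15, 8, 4, 2, 1, 1, 0, 0]
book_citations_per_year = [0, 1, 2, 4, 6, 8, 10, 11, 12, 12]


def citations_in_window(per_year, window_years):
    """Sum the citations received during the first `window_years` years."""
    return sum(per_year[:window_years])


def share_captured(per_year, window_years):
    """Fraction of the full ten-year total that falls inside the window."""
    total = sum(per_year)
    if total == 0:
        return 0.0
    return citations_in_window(per_year, window_years) / total


WINDOW = 3  # assumed citation window, in years

article_share = share_captured(article_citations_per_year, WINDOW)
book_share = share_captured(book_citations_per_year, WINDOW)

print(f"Article: {article_share:.0%} of ten-year citations fall in the window")
print(f"Book:    {book_share:.0%} of ten-year citations fall in the window")
```

With these assumed numbers the two works attract a similar number of citations over ten years, yet the short window captures most of the article's total and only a small fraction of the book's.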
Citation counts and field normalization
Citation rates differ dramatically between STEM and humanities fields, even for work of comparable scholarly significance. A highly cited immunology paper might accumulate hundreds of citations within a few years, while an equally important article in art history might receive a handful. Rankings attempt to address this through field normalization, which compares each publication's citation count to the average for its field. However, field normalization cannot fully correct for the structural differences in citation cultures. In fields where the absolute number of citations is very low, as in many humanities disciplines, a difference of one or two citations can produce large swings in field-normalized scores, reducing statistical reliability.
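A minimal sketch of why low citation counts make normalized scores unstable, assuming the simplest form of field normalization (a paper's citations divided by its field's mean citations). Actual ranking providers use more elaborate variants, and the field means and citation counts below are hypothetical values chosen only to illustrate the arithmetic.

```python
def normalized_score(citations, field_mean):
    """Simplified field-normalized score: citations relative to the field average."""
    return citations / field_mean


# Hypothetical field averages (citations per paper); illustrative assumptions only.
IMMUNOLOGY_MEAN = 40.0
ART_HISTORY_MEAN = 1.5


def shift_from_one_citation(citations, field_mean):
    """Change in the normalized score when a paper gains a single citation."""
    before = normalized_score(citations, field_mean)
    after = normalized_score(citations + 1, field_mean)
    return after - before


immunology_shift = shift_from_one_citation(60, IMMUNOLOGY_MEAN)
art_history_shift = shift_from_one_citation(2, ART_HISTORY_MEAN)

print(f"Immunology paper:  score moves by {immunology_shift:.3f}")
print(f"Art history paper: score moves by {art_history_shift:.3f}")
```

Under this simplified formula, one extra citation always moves the score by one divided by the field mean, so the lower a field's average citation rate, the more a single citation matters.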
The co-authorship pattern further favors STEM. Scientific papers often have dozens or even hundreds of co-authors, and when publications are counted in full rather than fractionally, each co-author receives credit for the whole paper. Humanities articles and books are typically single-authored. This means that a STEM researcher may accumulate hundreds of publications over a career through team-science projects, while a humanities scholar with an equally distinguished career may have published a handful of books and articles. Rankings that count publications per faculty member will systematically penalize disciplines defined by single authorship and slower publication cycles.
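The effect of the counting convention can be sketched as follows. The example contrasts full counting (each author gets one whole publication) with fractional counting (each author gets 1/n of a publication with n authors), an alternative convention that is not described in the text above and is included here only for comparison. The author counts are hypothetical.

```python
def full_count(author_counts):
    """Full counting: every publication adds 1 to the researcher's total."""
    return len(author_counts)


def fractional_count(author_counts):
    """Fractional counting: a publication with n authors adds 1/n."""
    return sum(1 / n for n in author_counts)


# Number of authors on each publication; illustrative assumptions only.
stem_researcher = [10, 50, 200, 8, 25]  # five team-science papers
humanities_scholar = [1, 1]             # two single-authored works

stem_full = full_count(stem_researcher)
stem_fractional = fractional_count(stem_researcher)
humanities_full = full_count(humanities_scholar)
humanities_fractional = fractional_count(humanities_scholar)

print(f"STEM researcher:    full={stem_full}, fractional={stem_fractional:.2f}")
print(f"Humanities scholar: full={humanities_full}, fractional={humanities_fractional:.2f}")
```

With these assumed inputs the STEM researcher leads under full counting, while the ordering reverses under fractional counting. Neither convention measures quality; the point is only that the choice of counting rule changes which discipline appears more productive.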
Research funding and institutional resources
Rankings that include research income indicators, such as the Times Higher Education (THE) World University Rankings, introduce another dimension of STEM bias. Research funding in the sciences (for laboratory equipment, clinical trials, and large-scale data collection) dwarfs the funding required for humanities research. A philosophy department needs little more than library access and time, while a molecular biology department requires millions in grant funding. Rankings that reward high research income per faculty member create an incentive structure that privileges departments whose research is expensive to conduct, regardless of the quality or impact of what is produced.
The same pattern appears in industry collaboration and patent indicators. Engineering and biomedical departments naturally generate patents and attract industry funding, while it is difficult to imagine what a patent in literary criticism or history would look like. Rankings that include innovation and commercial translation indicators are effectively measuring something that is structurally inaccessible to large portions of the humanities and interpretative social sciences. This does not mean such indicators are invalid, but it does mean that users should interpret them within a disciplinary context rather than as universal measures of quality.
Solutions and recommendations
Ranking organizations have begun to acknowledge these disciplinary asymmetries. Subject-specific rankings provide some relief by reducing the scope of comparison to disciplines with similar publication cultures. THE and QS both publish subject rankings in arts and humanities that use modified indicator sets, reducing the weight of bibliometric indicators and incorporating reputation surveys more heavily. However, reputation surveys introduce their own geographic and linguistic biases, and they cannot fully replace the missing evidence from books and other humanities output types.
U-Multirank's multi-dimensional approach offers a partial solution: by allowing users to focus on teaching or internationalization dimensions rather than research output, it enables humanities-focused institutions to demonstrate strengths that composite rankings obscure. Initiatives to index books more comprehensively, such as the Book Citation Index in Web of Science and Scopus's expanded book coverage, are improving the situation gradually.
For users of rankings, the practical lesson is clear: no single ranking can fairly compare STEM and humanities departments. Disaggregate by discipline, use subject-specific rankings, and supplement quantitative data with qualitative assessments from scholars in the field. Rankings can be useful for cross-institutional comparison within disciplines, but they remain a poor tool for comparing the quality of a physics department with that of a classics department.