Reputation survey methodology in university rankings: a critical examination

How academic and employer reputation surveys underpin major rankings, their sample design, response rates, geographic coverage, and the validity concerns that limit their reliability.

The role of reputation surveys in rankings

Reputation surveys are the single most influential data source in the QS World University Rankings and the Times Higher Education (THE) World University Rankings. In QS, Academic Reputation alone long accounted for forty percent of the total score (reduced to thirty percent from the 2024 edition), while in THE, the combined teaching and research reputation surveys contribute roughly a third. These surveys ask academics and employers to nominate institutions they consider excellent, and the aggregated nominations produce reputation scores that heavily shape overall ranking positions. Despite their weight, reputation surveys face persistent methodological criticism relating to sampling, measurement, and validity.

The appeal of reputation surveys is clear: they capture aspects of university quality—intellectual climate, collegial culture, the quality of graduates—that are difficult to measure quantitatively. A citation count cannot tell you how collegial a department is or how well its graduates perform in the workplace. In principle, surveys aggregate the informed judgments of thousands of experts, producing a measure of collective wisdom that transcends any single individual's limited perspective. Whether they achieve this in practice depends on the quality of their design and execution.

Sampling and response rate challenges

The validity of any survey-based indicator rests on the quality of its sampling and the representativeness of its respondents. Major ranking surveys aim for global coverage, distributing invitations to academics and employers across countries and disciplines. However, response rates are typically low by survey research standards, often in single-digit percentages. When only a small fraction of invited participants respond, the risk of non-response bias becomes significant. If respondents differ systematically from non-respondents (for example, if they are more likely to come from particular countries, career stages, or institutional types), the results may not represent the broader academic or employer community.
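The scale of this risk can be made concrete with the standard decomposition of non-response bias: the bias of the respondent mean equals the share of non-respondents multiplied by the gap between respondents and non-respondents. The following is a minimal Python sketch of that identity. The function name and all figures are illustrative assumptions, not data from any ranking survey.

```python
def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent mean relative to the full-population mean.

    Uses the identity
        bias = (1 - response_rate) * (mean_respondents - mean_nonrespondents),
    which follows from writing the population mean as a weighted average
    of the respondent and non-respondent means.
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    return (1.0 - response_rate) * (mean_respondents - mean_nonrespondents)


# Hypothetical figures: 5% of invitees respond and rate an institution
# 70/100 on average; the silent 95% would have rated it 60/100.
bias = nonresponse_bias(0.05, 70.0, 60.0)
print(f"Respondent mean overstates the population view by {bias:.1f} points")
```

With a single-digit response rate, almost the entire gap between the two groups carries through to the published score. In practice that gap is unobserved, which is why a low response rate is a warning sign rather than a measurable error.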

Geographic representation is a persistent concern. Survey invitations are distributed across regions, but response rates vary, and final respondent pools tend to skew toward North America, Western Europe, and developed Asia-Pacific countries, where academic networks are densest and survey participation norms are strongest. This geographic skew affects which institutions receive nominations: respondents nominate institutions they know, and knowledge is heavily influenced by geography. A university in Nigeria may be as strong as one in the United Kingdom, but if survey respondents are disproportionately European and North American, the African institution will receive far fewer nominations regardless of its quality.
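The mechanism can be illustrated with a toy calculation. The Python sketch below assumes two institutions of identical quality and a respondent pool skewed toward Europe and North America. Every number in it (pool sizes, familiarity rates, the nomination probability) is invented for illustration, and the model itself is a deliberate simplification rather than how any ranking organization computes scores.

```python
def expected_nominations(respondents_by_region, familiarity_by_region,
                         p_nominate_if_known):
    """Expected nomination count for one institution.

    A respondent can only nominate an institution they know, so each
    region contributes: respondents * familiarity * P(nominate | known).
    """
    return sum(
        count * familiarity_by_region.get(region, 0.0) * p_nominate_if_known
        for region, count in respondents_by_region.items()
    )


# Hypothetical respondent pool, skewed toward Europe and North America.
respondents = {"Europe": 500, "North America": 400, "Africa": 100}

# Hypothetical share of respondents in each region who know the institution.
uk_familiarity = {"Europe": 0.60, "North America": 0.40, "Africa": 0.30}
nigeria_familiarity = {"Europe": 0.05, "North America": 0.05, "Africa": 0.60}

# Equal quality by assumption: same chance of nomination once known.
same_quality = 0.5

uk_votes = expected_nominations(respondents, uk_familiarity, same_quality)
ng_votes = expected_nominations(respondents, nigeria_familiarity, same_quality)
print(f"UK institution: {uk_votes:.1f}, Nigerian institution: {ng_votes:.1f}")
```

Under these assumed inputs the two institutions differ only in who has heard of them, yet the expected nomination counts diverge several-fold. The gap is produced entirely by the composition of the respondent pool.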

Disciplinary skew is another challenge. Fields with large research populations, such as medicine and engineering, naturally generate more survey responses than smaller humanities disciplines. Some ranking organizations weight responses to achieve disciplinary balance, but the underlying asymmetry in the size of disciplines means that even weighted results may not fully represent small fields. Additionally, the definition of an expert respondent varies: ranking surveys typically target published academics, but publication norms differ across disciplines, and scholars in fields where other forms of output matter more than journal articles may be underrepresented in the respondent pool.
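One common way to rebalance a skewed sample is post-stratification, in which each respondent is weighted by the ratio of a target share to the observed sample share. The Python sketch below is an assumption about how such weighting could work, not a description of any ranking organization's published procedure. The discipline counts and target shares are hypothetical. It also computes Kish's effective sample size to show why weighting cannot fully compensate for a small field.

```python
def poststratification_weights(sample_counts, target_shares):
    """Weight per respondent in each field: target share / sample share."""
    total = sum(sample_counts.values())
    return {
        field: target_shares[field] / (count / total)
        for field, count in sample_counts.items()
    }


def kish_effective_sample_size(sample_counts, weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    sum_w = sum(sample_counts[f] * weights[f] for f in sample_counts)
    sum_w2 = sum(sample_counts[f] * weights[f] ** 2 for f in sample_counts)
    return sum_w ** 2 / sum_w2


# Hypothetical respondent counts and hypothetical target shares.
sample_counts = {"medicine": 600, "engineering": 300, "philosophy": 100}
target_shares = {"medicine": 0.4, "engineering": 0.3, "philosophy": 0.3}

weights = poststratification_weights(sample_counts, target_shares)
n_eff = kish_effective_sample_size(sample_counts, weights)

for field, weight in weights.items():
    print(f"{field}: each response counts {weight:.2f} times")
print(f"Effective sample size: {n_eff:.0f} of {sum(sample_counts.values())}")
```

In this example each philosophy response counts three times, so the weighted shares match the targets, but the effective sample size falls well below the nominal 1,000. The philosophy estimate still rests on only 100 actual respondents, and any idiosyncrasy among them is amplified rather than corrected.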

Measurement validity and the halo effect

Beyond sampling, the measurement instrument itself raises validity questions. Reputation surveys ask respondents to nominate institutions they consider excellent for teaching or research, but they rarely provide a structured framework for evaluating what excellence means. One respondent's understanding of teaching excellence may differ substantially from another's. The surveys also do not typically require respondents to provide evidence or justification for their nominations, meaning that reputation scores may capture general brand awareness, historical prestige, or media visibility rather than informed assessments of current performance.

The halo effect is a well-documented psychological phenomenon in which a positive impression in one area influences judgments in another. In the context of reputation surveys, this means that a university known for research excellence may receive high teaching reputation scores even if its teaching quality is unexceptional, simply because respondents generalize their positive overall impression. Conversely, institutions with limited global brand recognition may receive low scores across all dimensions regardless of their specific strengths. The halo effect creates inertia in reputation scores, making them slow to respond to genuine improvements or declines in specific areas of institutional performance.

Recommendations for ranking users

Reputation surveys are not inherently invalid, but their limitations require users to interpret their results with appropriate caution. When evaluating a ranking that includes reputation indicators, check whether the ranking organization publishes information about the survey's sample size, response rate, geographic distribution, and disciplinary coverage. This transparency supports informed judgment about the reliability of the results. Rankings that do not disclose this information provide weaker evidence for decision-making.

Triangulate reputation data with other sources. A high reputation score with low bibliometric performance might signal strong teaching or a halo effect from historical prestige. A low reputation score with high bibliometric performance might indicate recent improvement not yet reflected in perceptions, or disciplinary focus on fields with different reputation dynamics. Use subject-specific rankings when available, as reputation at the discipline level is often more reliable than institutional-level reputation. And always supplement ranking data with direct engagement: talk to academics in your field, attend open days, read student reviews, and examine the actual research and teaching outputs of the institutions you are considering. Reputation is one piece of the puzzle, not the full picture.
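The triangulation step can be written as a simple rule of thumb. The Python sketch below compares an institution's reputation percentile with its bibliometric percentile and flags large gaps for closer inspection. The function name, the percentile inputs, and the 20-point threshold are all hypothetical choices made for illustration. The output is a prompt for further investigation, not a verdict.

```python
def triangulate(reputation_pct, bibliometric_pct, gap=20.0):
    """Flag divergence between two percentile ranks (0-100).

    `gap` is an arbitrary illustrative threshold, not an established cutoff.
    """
    diff = reputation_pct - bibliometric_pct
    if diff >= gap:
        return ("reputation exceeds bibliometrics: "
                "possible teaching strength or prestige halo")
    if diff <= -gap:
        return ("bibliometrics exceed reputation: "
                "possible recent improvement or field effects")
    return "broadly consistent"


# Hypothetical institutions: (reputation percentile, bibliometric percentile).
examples = {
    "Institution A": (90.0, 50.0),
    "Institution B": (40.0, 85.0),
    "Institution C": (60.0, 55.0),
}
for name, (reputation, bibliometric) in examples.items():
    print(f"{name}: {triangulate(reputation, bibliometric)}")
```

A flag in either direction only tells you where to look. Deciding whether the gap reflects a halo effect, genuine teaching quality, or a recent change still requires the direct engagement described above.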
