ESG Ratings muddle created by needless complexity in rating methodologies

Lack of reliability in ESG ratings is a problem pointed out in repeated studies. ‘Compass with no direction’ – that was the assessment of a recent academic article. What has caused it? Needless complexity in the methodologies could be the reason, believes Algo Circle.

Same data, vastly different results – how is that possible? But that is the situation with the global ESG rating business. Over the years, data users and academic papers have repeatedly pointed this out.

An article in The Economist pointed out that credit ratings have a 99% correlation across rating agencies. By contrast, ESG ratings tally little more than half the time. This is a stark difference that clearly shows the problem users are dealing with here.

Compass with no direction

A recent article “ESG ratings—a compass without direction*”, from the Rock Center for Corporate Governance at Stanford University, reviewed ESG ratings and their providers, pointing out that significant shortcomings exist in their objectives, methodologies, and incentives which detract from the informativeness of their assessments.

Dubious Data Practices

We have found some of the data practices of ESG raters rather dubious. As the Stanford paper points out, the data sources used may be public, quasi-public or private data, including company responses to solicited questionnaires. If a metric is generated to influence the investment of public money, incorporating ‘private’ data sources, including asking companies for information that could be non-public, is an unhealthy practice. This is one more example of the needless complexity rating firms are wilfully introducing into their methodologies.

Listed companies are being urged or mandated by their respective regulators to release ESG information on certain parameters. Our view is that rating firms should operate within the confines of the disclosures companies have been asked to make.

The basic disclosures firms are asked to make are quite adequate to the task at hand. However, in a game of one-upmanship, rating firms are claiming they are using more data sources than the next guy. Consider the wide disparity in the claims raters are making on the data sources they use:

  • One ESG rater claims to use 630+ ESG metrics. Note the lame attempt at precision – not 600+, but 630+.
  • Not to be outdone, another data provider claims to use approximately 1000 data points.


SEC Commissioner Hester Peirce has gone on record stating that what “good” means in the context of ESG is very subjective, and that is a key reason why ESG ratings firms are so inconsistent in their results.

The Stanford authors observe that this huge number of variables creates issues by itself, requiring the ratings firms to make a variety of judgments, including materiality assessments and related “weighting” of factors, potential absence of relevant data and weighting of “both the variables in their importance to E, S, and G, and also the overall pillars of E, S, and G in relation to one another.”

The race to include more factors than the other rater is likely a large cause of the muddled ratings. Several of these factors are qualitative, requiring analyst judgement.

Materiality – a much celebrated concept in ESG – is another area where subjectivity can creep in. ESG raters try to apply materiality at two levels: weights of E, S and G, and also how they treat various sub-themes or factors of E, S and G. Materiality is a license for subjective creativity. One firm may treat a sub-theme as Governance, while another may treat it within Social. Clearly, results will differ between these firms.
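To make the point concrete, here is a minimal sketch of how pillar weights alone can flip a ranking. Every number in it is hypothetical: the two companies, their pillar scores and both sets of weights are invented for illustration and do not describe any actual rater's methodology.

```python
# Hypothetical pillar scores (0-10, higher is better).
# Both raters see exactly the same data.
pillar_scores = {
    "Company X": {"E": 9.0, "S": 5.0, "G": 3.0},
    "Company Y": {"E": 4.0, "S": 6.0, "G": 8.0},
}

# Hypothetical materiality weights; each set sums to 1.
rater_1_weights = {"E": 0.6, "S": 0.2, "G": 0.2}  # treats E as most material
rater_2_weights = {"E": 0.2, "S": 0.2, "G": 0.6}  # treats G as most material


def overall_score(scores, weights):
    """Weighted sum of the pillar scores."""
    return sum(scores[pillar] * weights[pillar] for pillar in weights)


def ranking(weights):
    """Company names ordered from best to worst overall score."""
    return sorted(
        pillar_scores,
        key=lambda name: overall_score(pillar_scores[name], weights),
        reverse=True,
    )


print(ranking(rater_1_weights))  # Company X ranks first
print(ranking(rater_2_weights))  # Company Y ranks first
```

The underlying data never changes; only the weighting judgement does. Apply a second layer of weights to the sub-themes inside each pillar and the room for disagreement grows further.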

One rater has added another level of creativity: whether an issue will have an impact in the short term or over the long term. Now imagine applying this to 30+ dimensions and 50+ industry groups.

Treatment of missing data

The more data items you try to use, the larger the problem of missing data. Different raters apply different approaches to tackle this pain point.

The Stanford paper writes: When information is not available to populate a data point, MSCI appears to assume that the company’s performance is the industry average. By contrast, FTSE assumes that the company’s performance is the worst. Under the former approach, the worst company in an industry is better off not disclosing data on its weak areas.
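The incentive problem can be shown with a small sketch. It is an illustration only, not either rater's actual procedure: the company names and scores are hypothetical, the 0 to 10 scale is our assumption, and we read “the worst” as the lowest possible score on that scale.

```python
from statistics import mean

SCALE_MIN = 0.0  # assumption: scores run from 0 (worst) to 10 (best)


def fill_with_industry_average(scores):
    """Replace missing values (None) with the mean of the disclosed values."""
    known = [s for s in scores.values() if s is not None]
    average = mean(known)
    return {name: average if s is None else s for name, s in scores.items()}


def fill_with_worst(scores):
    """Replace missing values (None) with the lowest score on the scale."""
    return {name: SCALE_MIN if s is None else s for name, s in scores.items()}


# Hypothetical industry of three companies; C is the weakest performer.
full_disclosure = {"A": 8.0, "B": 6.0, "C": 2.0}
c_withholds = {**full_disclosure, "C": None}  # C chooses not to disclose

c_if_disclosed = full_disclosure["C"]
c_under_average_rule = fill_with_industry_average(c_withholds)["C"]
c_under_worst_rule = fill_with_worst(c_withholds)["C"]

print(c_if_disclosed, c_under_average_rule, c_under_worst_rule)
```

Under the average rule, the weakest company scores higher by staying silent than by disclosing; under the worst-case rule, silence is penalised. The same missing data point therefore produces opposite outcomes depending on the rater.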

Algo Circle Suggestion

Clearly, it is not in the interest of the rating firms to reduce dissonance. However, we would still urge the raters to consider the following:

  • Stop this race to win the crown of the firm using the largest number of data points. It is detrimental to your output. In fact, cut down data points to a level where the missing-data problem becomes manageable.
  • Go easy with materiality. We think it is overhyped. Don’t use it at multiple levels.
  • Reduce subjectivity. Anything that is subjective will induce errors.

Most large firms are gravitating towards using their own frameworks. Firms that use their own frameworks use a much smaller number of data points compared to the third-party rating firms.

* by Brian Tayan, David Larcker, Edward Watts and Lukasz Pomorski

Ajay Jindal, CFA
ESG Lead, Algo Circle

