When management consulting firm McKinsey declared in 2015 that it had found a link between profits and executive racial and gender diversity, it was a breakthrough. The research was used by investors, lobbyists and regulators to push for more women and minority groups on boards, and to justify investing in companies that appointed them.

Unfortunately, the research doesn’t show what everyone thought it showed.

There are obvious benefits of diverse corporate leadership for society, both in providing role models and in showing a commitment to promoting the best people, irrespective of skin color or gender. But doing it because it is the right thing is not the same as doing it because it makes more money.

Since 2015, the approach has been tested in the fire of the marketplace and failed. Academics have tried to repeat McKinsey’s findings and failed, concluding that there is in fact no link between profitability and executive diversity. And the methodology of McKinsey’s early studies, which helped create the widespread belief that diversity is good for profits, is being questioned.

McKinsey has tried to remedy one of the most obvious flaws. It originally linked profits over several years with diversity at the end of the period, meaning the most it could prove is that profitability led to more diversity, not the other way around. In its latest study, it said it had now run the tests using diversity at the start of the period, and still found a correlation.

“In light of a recent study criticizing our methodologies, we have reviewed our research and continue to stand by its findings—that diverse leadership teams are associated with a higher likelihood of financial outperformance,” McKinsey said. “We have also been clear and consistent that our research identifies correlation, not causation, and that those two things are not the same.”

The trouble is that McKinsey behaves as though the studies do show causation, constantly talking of the corporate benefits of diversity.

Even the correlation is in doubt. Academics can’t replicate McKinsey’s study precisely, because it keeps secret the names of the companies it used. But a paper published this year finds that McKinsey’s methodology doesn’t show benefits from diversity for S&P 500 companies for a range of profitability metrics. It isn’t that a lack of diversity is good for profits either; there is simply no link.

This shouldn’t come as a surprise. If companies could boost their profits as easily as McKinsey suggested—the most-diverse firms had a 39 percentage point higher chance of higher-than-average profit margins than the least-diverse—then surely companies would have rushed to promote more women and minority racial groups.

“It seemed implausible because companies would have jumped on it and the advantages would be competed away,” said John Hand, an accounting professor at the University of North Carolina at Chapel Hill. With Jeremiah Green of Texas A&M University, he found no results that were statistically significant when repeating McKinsey’s study for the S&P 500. McKinsey keeps secret the names of the companies in its study, which in 2015 included 186 from the U.S. and Canada, so it can’t be independently verified.

This matters, because the McKinsey study was hugely influential. McKinsey’s research figures first in BlackRock’s references for supporting a board diversity target of 30% in its proxy voting guidelines. It featured prominently among studies used by a Securities and Exchange Commission commissioner in 2020 to explain why she supported corporate disclosure of diversity metrics. Nasdaq cited it as evidence when the exchange applied to the SEC for a rule requiring companies it lists to have minimum diversity on boards, or explain why they don’t. It has been cited by dozens of campaign groups pushing for rules to support consideration of social issues by pension funds and others, too.

McKinsey’s influence wasn’t only on policy, which ought anyway to consider moral and societal issues as well as purely financial ones. BlackRock and Refinitiv, now part of the London Stock Exchange Group, cited the study as evidence of financial benefits from diversity when they created an ETF that tracked a diversity index. That index has lagged badly behind since its 2018 launch, returning about 55% against more than 70% for the global index without diversity conditions.

This seems to be less about diversity than the choice of how to invest in it. The ETF is equal-weighted, which has held it back as giant stocks beat the rest of the market. Because of the diversity requirements, it held a lot more banks and insurers, and less technology, than the market as a whole.

A similar fund was created earlier by State Street Global Advisors with the ticker SHE. It was promoted through the “Fearless Girl” statue (briefly installed opposite Wall Street’s bronze bull and now opposite the New York Stock Exchange) and backed by research from MSCI, which claimed a 36% higher return on equity for firms with at least three women on the board, or “strong female leadership.”

A moment’s thought would suggest this was far too high a figure to be explained by the presence of a handful of women, and subsequent deep underperformance shows skepticism was the right response. Since its 2016 launch the fund’s return has lagged more than 70 percentage points behind that of the top 1,000 companies, from which it selected before switching to an MSCI gauge two years ago. It has shrunk from a peak of $400 million to $245 million.

McKinsey said in its original paper that “it stands to reason…that more diverse companies are better able to win top talent, and improve their customer orientation, employee satisfaction, and decision-making, leading to a virtuous cycle of increasing returns.” Common sense also says it’s easier to avoid potentially catastrophic groupthink if people have a range of different experiences.

Common sense also insists that it’s important to build team spirit and trust, where people with a shared background have a head start. University of Chicago law professor Lisa Bernstein showed this for New York’s Jewish diamond dealers, who would score zero for diversity but gained financially by the trust from their common heritage. Similar studies have shown the same for other small ethnic business groups.

Bernstein thinks such trust can be built from networks of social connections, but it comes built-in for some backgrounds.

Skin color and sex don’t perfectly capture diversity of thought, anyway. A privately-educated Black Harvard Business School graduate would probably think much the same way about business as a white one. A top female New York lawyer may have a similar experience of life—or lack of it—as a male one. McKinsey’s diversity of thought suggestions don’t extend to, for example, appointing worker representatives to the board, even though their ideas might well be quite different to those of senior management.

Finally, correlation is not causation! McKinsey repeatedly says in its study that it only found a correlation. The Aztecs mistook correlation for causation with tragic results, cutting out the heart of a victim to rekindle fire every 52 years in order to ensure the world’s survival. There was a strong correlation between the human sacrifice and the world not ending—but no causation.

Investors don’t risk having vital organs removed, but they should pay more attention to the studies they rely on.

Write to James Mackintosh at james.mackintosh@wsj.com