Yandex Search Ranking Factors Leak

Published on Tuesday, January 31st, 2023 by

What is the deal with the Yandex ranking factors data leak? Learn more about what it means for ranking factors and how search engines work.

The search marketing community is trying to make sense of the leaked Yandex repository, which contains files listing what look like search ranking factors. Some may be looking for actionable SEO clues, but that’s probably not the real value.

The consensus is that the leak will be helpful for gaining a general understanding of how search engines work.

There’s A Lot To Learn
Ryan Jones (@RyanJones) believes that this leak is a big deal. He’s already loaded some of the Yandex machine-learning models onto his own machine for testing.

Ryan is convinced that there’s a lot to learn but that it’s going to take a lot more than just examining a list of ranking factors.

Ryan explains:

“While Yandex isn’t Google, there’s a lot we can learn from this in terms of similarity.

Yandex uses lots of Google-invented tech. They reference PageRank by name, they use MapReduce and BERT and lots of other things too.

Obviously, the factors will vary and the weights applied to them will also vary, but the computer science methods of how they analyze text relevance and link text and perform calculations will be very similar across search engines.

I think we can glean a lot of insight from the ranking factors, but just looking at the leaked list alone isn’t enough.

When you look at the default weights applied (before ML) there are negative weights that SEOs would assume are positive or vice versa.

There are also a LOT more ranking factors calculated in the code than what’s been listed in the lists of ranking factors floating around.

That list appears to be just static factors and doesn’t account for how they calculate query relevance or many dynamic factors that relate to the resultset for that query.”
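To make the point about negative default weights concrete, here is a minimal Python sketch of a weighted linear score. It assumes the score before machine learning is a plain weighted sum, which neither the quote nor the leak summary confirms. The factor names and weights are invented for illustration and are not taken from the leaked files.

```python
# Hypothetical factor names and weights, for illustration only.
# None of these values come from the leaked Yandex files.
DEFAULT_WEIGHTS = {
    "link_count": 0.4,        # positive weight: more of it raises the score
    "text_relevance": 1.0,    # positive weight
    "page_age_days": -0.002,  # negative weight: more of it lowers the score
}


def linear_score(factors, weights=DEFAULT_WEIGHTS):
    """Return the weighted sum of factor values; missing factors count as 0."""
    return sum(weight * factors.get(name, 0.0) for name, weight in weights.items())


# Two hypothetical pages that differ only in the negatively weighted factor.
fresh_page = {"link_count": 2.0, "text_relevance": 0.8, "page_age_days": 10}
old_page = {"link_count": 2.0, "text_relevance": 0.8, "page_age_days": 1000}

print(linear_score(fresh_page), linear_score(old_page))
```

In this toy setup the two pages are identical except for the one negatively weighted factor, and that alone is enough to push the older page below the newer one. This is why a list of factor names without the signs and sizes of their weights says little about how a page would actually rank.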

More Than 200 Ranking Factors
It’s commonly repeated, based on the leak, that Yandex uses 1,923 ranking factors (some say fewer).

Christoph Cemper, founder of Link Research Tools, says that friends have told him that there are many more ranking factors.

Christoph shared: “Friends have seen:

275 personalization factors
220 “web freshness” factors
3,186 image search factors
2,314 video search factors

There is a lot more to be mapped.

Probably the most surprising for many is that Yandex has hundreds of factors for links.”

The point is that it’s far more than the 200+ ranking factors Google used to claim.

And even Google’s John Mueller said that Google has moved away from citing 200+ ranking factors.

So maybe that will help the search industry move away from thinking of Google’s algorithm in those terms.

Nobody Knows Google’s Entire Algorithm?
What’s striking about the data leak is that the ranking factors were collected and organized in such a simple way.

The leak calls into question the idea that Google’s algorithm is so highly guarded that nobody, even at Google, knows the entire algorithm.

Is it possible that there’s a spreadsheet at Google with over a thousand ranking factors?

Christoph Cemper questions the idea that nobody knows Google’s algorithm.

Christoph commented to Search Engine Journal:

“Someone said on LinkedIn that he could not imagine Google “documenting” ranking factors just like that.

But that’s how a complex system like that needs to be built. This leak is from a very authoritative insider.

Google has code that could also be leaked. The often repeated statement that not even Google employees know the ranking factors always seemed absurd for a tech person like me.

The number of people that have all the details will be very small. But it must be there in the code, because code is what runs the search engine.”

The leaked Yandex files offer a glimpse into how search engines work. The data doesn’t show how Google works, but it does provide an opportunity to view part of how one search engine, Yandex, ranks search results.

What’s in the data shouldn’t be confused with what Google might use. Nevertheless, there are interesting similarities between the two search engines.

To read the complete post, visit the Search Engine Journal website.