Google Internal Docs Leak: What Every SEO Should Know

Vernon June 13, 2024


In late May 2024, the internal documentation for Google Search’s Content Warehouse API was leaked.

There hasn’t been a leak this big or this detailed from Google’s search division since the company launched. The leaked documents reveal many details that Google has been hiding, or even neglecting, for a long time.

This is a hot topic, so let’s dive right in without a lengthy introduction. We’ll go over how the leak happened and how SEO specialists can use this data.

Background

Back in early May, an anonymous user shared internal Google Search API documents with Rand Fishkin (co-founder of Moz and creator of its Domain Authority metric). Rand verified the source’s identity himself, exchanging emails and holding a video call with them. After that, he asked Michael King (founder and CEO of iPullRank) to analyze the data. On May 27th, they published the information along with their analysis.

You can find all the leaked data through this link.

What’s in the docs?

Here, you’ll find more than 2,500 pages of API documentation containing 14,014 attributes (API features) that appear to come from Google’s internal Content Warehouse API. Many of these attributes play an important role in Google’s ranking process.

However, the documentation doesn’t show how much weight any particular element carries in the search ranking algorithm. It also doesn’t indicate which elements are actually used in the ranking systems. But it does show, in incredible detail, what data Google collects.

Here’s an example of the document format:

It’s similar to guidelines for Google team members, outlining what variables are available, what their functions are, and how to work with them.

Note: The documentation was up-to-date as of last summer (references to other changes in 2023 and earlier years dating back to 2005 are also present), and possibly even up-to-date as of the March 2024 date of disclosure. But there is no guarantee that this is the most recent version of these ‘instructions.’ For example, there are no mentions of AI Overviews here. There are also some deprecated features (although they are marked as no longer in use).

In any case, this documentation contains a lot of relevant and important data. Let’s take a look.

Google myths revealed

To minimize manipulation of search results, the Google team has closely guarded the details of how their algorithms work and what truly influences rankings.

Now, thanks to the leaked information, we can check those public statements against Google’s own documentation. Many claims that Google representatives once made about various aspects of search engine optimization have turned out to be untrue. Much of the leaked data directly contradicts Google’s official public statements.

Let’s take a look at some of the most popular myths debunked by the leaked documentation.

Domain Authority

Google spokespeople have said numerous times that Google doesn’t use domain authority to rank pages. John Mueller, for example, has said so repeatedly. Here is one of his comments on Reddit: