Disclaimer: It’s important to keep an open mind about the data. Don’t take anything from this leak as actionable SEO advice without testing and research.
The recent leak of Google Search’s internal engineering documentation provided new findings. Let’s discuss which of them matter.
What Was Leaked?
Internal documentation of the Google Content Warehouse API has been uncovered. As SEO influencer Rand Fishkin reported, “I received an email from a person claiming to have access to a massive leak of API documentation from inside Google’s Search division.” He writes that the person who sent the email stated that former Google employees have confirmed the data originates from Google Search and shared additional private information about Google’s operations.
Sparktoro: Leaked data about “good”, “bad”, and “unicorn” clicks
The leaked documents describe roughly 14,000 attributes that may feed into Google’s ranking systems, although they do not say how, or whether, each one is weighted. Much of the data supports suspicions long held by the SEO community, and it gives more concrete detail on which search parameters Google tracks.
According to Andrew Ansley, the leaked files come from a commit to a Google API document, and it wasn’t a hack or a whistleblower.
The Main Insights From the Google Leak
The leak has drawn plenty of criticism, but the industry can still take new ideas from it. A few aspects stand out:
- Google differentiates between “good clicks,” “bad clicks,” and “Unicorn clicks,” which affect rankings differently. “Good clicks” and “Unicorn clicks” likely improve rankings, while “bad clicks” may harm them.
- What users do after clicking a link matters. Though not explicitly detailed in the leak, metrics like “dwell time” or session duration can influence rankings.
- In order to prevent manipulation, the systems normalize the click data.
3. Sandbox. New sites may be initially sandboxed to prevent spam, affecting their early rankings.
5. Content handling. Google classifies and handles different types of content and websites separately, such as small websites or those with “risky” content (YMYL).
6. Homepage importance. Google gives homepages significant weight. The score of a homepage can influence ranking.
7. Date verification. Google checks article dates through multiple methods, emphasizing the importance of accurate date information.
9. Google keeps a copy of every page it indexes. When analyzing links, it considers the last 20 changes to a URL.
10. If a website contains video on more than 50% of its pages, it is considered a video site.
Many of these highlights contradict Google’s earlier public statements.
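The video-site rule in the list above is the one item concrete enough to check against your own site. Below is a minimal sketch of that check, assuming you already have a crawl of your pages with a flag for whether each page embeds a video. The `Page` class, the `has_video` flag, and both function names are hypothetical; the leak does not describe how Google detects video on a page or how it counts pages.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    has_video: bool  # hypothetical flag; how Google detects video is not described


def video_page_share(pages: list[Page]) -> float:
    """Return the fraction of pages that contain at least one video."""
    if not pages:
        return 0.0
    return sum(page.has_video for page in pages) / len(pages)


def is_video_site(pages: list[Page], threshold: float = 0.5) -> bool:
    """Apply the 'more than 50% of pages' rule reported from the leak."""
    return video_page_share(pages) > threshold


site = [
    Page("/", has_video=False),
    Page("/blog/a", has_video=True),
    Page("/blog/b", has_video=True),
    Page("/contact", has_video=False),
]
print(video_page_share(site))  # 0.5
print(is_video_site(site))     # False: exactly half is not "more than 50%"
```

A site sitting close to the threshold is worth a second look, since adding or removing a handful of video pages could change how it is classified, if the rule works as reported.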
SEO Community Reactions
The leak stirred up SEO professionals across social media, sparking discussions about how misleading public statements from Google spokespeople have distorted the industry’s understanding of SEO.
As noted above, the leak appears to validate long-held beliefs within the SEO community. As a result, some SEOs feel more confident in their strategies, since their assumptions seem to hold. However, the leak also raises new questions.
Some experts believe that the leaked files date back to last year and that the algorithms may have changed significantly since then. Even so, the files contain more detail than we’ve ever seen and are worth our consideration.
After an extensive analysis of the leaked Google Search API documentation, SEO specialists at Growth SRC compiled their findings into a user-friendly, searchable database (which I actively used during my research):
For each entry, the database provides a description, potential impact, focus area, and interpretation within the context of search rankings.
Another useful resource is Google’s Ranking Features Modules Relations by Natzir:
The leaked documentation describes each module and breaks it into summaries, types, functions, and attributes. Modules include YouTube, Assistant, Books, video search, links, web documents, crawl infrastructure, and other components. Here are a few of the search system elements identified:
- Trawler — Google’s web crawling system.
- Alexandria — the core indexing system.
- TeraGoogle is a secondary indexing system that stores long-term documents on disk.
- WebMirror — manages duplication and canonicalization of content.
- Mustang — the primary scoring, ranking, and serving system.
- NavBoost — re-ranks pages based on user click logs.
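To make the idea of click-based re-ranking more concrete, here is a toy sketch of how a score from an earlier stage could be adjusted by good and bad clicks. Everything in it is an assumption made for illustration: the `Result` fields, the damping `prior`, the `weight`, and the formula itself are hypothetical. The leak names NavBoost and the click categories but does not publish any formula, so this is not how Google computes rankings.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    base_score: float  # hypothetical score from an earlier ranking stage
    good_clicks: int   # hypothetical counts taken from a click log
    bad_clicks: int
    impressions: int


def click_signal(r: Result, prior: int = 100) -> float:
    """Net good-click rate, damped so low-traffic pages stay near zero."""
    return (r.good_clicks - r.bad_clicks) / (r.impressions + prior)


def rerank(results: list[Result], weight: float = 0.5) -> list[Result]:
    """Sort results by base score adjusted up or down by the click signal."""
    return sorted(
        results,
        key=lambda r: r.base_score * (1 + weight * click_signal(r)),
        reverse=True,
    )


results = [
    Result("/a", base_score=1.00, good_clicks=100, bad_clicks=400, impressions=900),
    Result("/b", base_score=0.90, good_clicks=500, bad_clicks=100, impressions=900),
    Result("/c", base_score=0.95, good_clicks=5, bad_clicks=0, impressions=10),
]
print([r.url for r in rerank(results)])  # ['/b', '/c', '/a']
```

In this toy example the page with the best base score drops to last place because most of its clicks are bad, while the page with only ten impressions barely moves. Dividing by impressions plus a prior is one simple way to normalize click data so that a few clicks cannot swing the result, which is the kind of manipulation resistance the leak hints at.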
Implications of Leaked Document for SEOs
As with the Yandex leak (17,853 ranking factors), these documents will hold value, serving as supporting evidence when you propose changes to your clients.
Let’s look at some strategies suggested by insights from the Google API leak:
1. Topical relevance. To enhance your site’s trust and quality, prioritize contextually aligned links from authoritative websites (strategic link-building), as well as high-quality viral content.
- Visitors spend more time on your pages, improving dwell time metrics
- They navigate to other pages within your site
- They don’t return to the search results for additional answers (pogo-sticking).
Ultimately, this engagement can lead to conversions and sales.
- Integrate deep audience analysis into your processes.
- Prioritize effective headlines and engaging content.
- Treat elements like font size and weight as possible minor ranking factors.
- Optimize your Google Business Profile listing
- Encourage customer reviews
- Work on brand signals by increasing navigational queries directly related to your brand
- Focus on local relevance and engagement.
6. Use structured author data and create author pages. Also, rather than spreading work across many generalist freelance writers, work with writers who have subject matter expertise.
7. Include timestamps in your video metadata and make sure they are accurate. When a timestamp appears in search results, users can jump directly to the part of the video that answers their query.
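The last two points, structured author data and video timestamps, can both be expressed as schema.org JSON-LD. The sketch below builds an `Article` with a `Person` author and a `VideoObject` whose `hasPart` lists `Clip` entries with `startOffset` and `endOffset` in seconds. The schema.org types and properties are real, but the helper functions, the example.com URLs, and the `?t=` URL pattern are assumptions for illustration. Whether Google’s ranking systems use this markup in the way the leak suggests is not something the documents establish.

```python
import json


def article_with_author(headline: str, author_name: str, author_url: str) -> dict:
    """Build Article markup that points to a dedicated author page."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name, "url": author_url},
    }


def video_with_clips(name: str, upload_date: str, thumbnail_url: str,
                     watch_url: str, clips: list[tuple[str, int, int]]) -> dict:
    """Build VideoObject markup; clips are (title, start_seconds, end_seconds)."""
    return {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "uploadDate": upload_date,
        "thumbnailUrl": thumbnail_url,
        "hasPart": [
            {
                "@type": "Clip",
                "name": title,
                "startOffset": start,
                "endOffset": end,
                # Assumed deep-link pattern; use whatever your player supports.
                "url": f"{watch_url}?t={start}",
            }
            for title, start, end in clips
        ],
    }


article = article_with_author(
    "What the Google leak means for SEO",
    "Jane Doe",  # placeholder author
    "https://example.com/authors/jane-doe",
)
video = video_with_clips(
    "SEO basics",
    "2024-06-01",
    "https://example.com/thumbs/seo-basics.jpg",
    "https://example.com/videos/seo-basics",
    [("Intro", 0, 95), ("Click signals", 95, 240)],
)
print(json.dumps(article, indent=2))
print(json.dumps(video, indent=2))
```

The printed JSON would go inside a `<script type="application/ld+json">` tag on the relevant page. It is worth running the result through a structured data validator before publishing.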
Remember, while these insights provide valuable guidance, they are not proof of how the ranking algorithms actually work. The good news is that acting on them still comes down to the fundamentals: high-quality content, backlinks, and user engagement.
Conclusion
Given the ever-evolving and complex nature of Google’s search algorithm, the leaked documentation can provide valuable insights. However, it also raises questions about the future of SEO and potential changes with the advent of new privacy and tracking technologies. I believe Google’s communication needs more transparency, and I expect we will get it. Still, I encourage everyone to draw their own conclusions.