SEOSEO News

Google PageRank algorithm and website authority assessment


Today’s web landscape looks much different than in its early days. Assessing a web page’s authority by the links pointing to it is now perceived as the norm. But it was revolutionary back in 1998, when Google introduced the PageRank algorithm to make inbound link assessment a valid ranking factor. PageRank has played a crucial role in the evolution of SEO and its techniques. However, its latest versions are kept secret, and it is now accompanied by many other ranking factors.

In this post, we’ll look into the history of PageRank, explain how it works and how to use this knowledge to improve your rankings. 

What is PageRank

PageRank is an algorithm used by Google Search for ranking web pages based on the number and quality of links pointing to them. It was developed by Google founders Larry Page and Sergey Brin in 1998 and marked the first successful attempt by a search engine to assess how authoritative a given web page was. Basically, the more backlinks a page had, the higher it would rank. Later on, in an effort to stop link manipulation, the approach was extended so that links are not all counted equally.

History of PageRank

Back in 1998, Larry Page and Sergey Brin published “The Anatomy of a Large-Scale Hypertextual Web Search Engine”. As the engineers explained in the original paper, PageRank aimed to “bring order to the web” by distributing weights across pages. They built the algorithm on the idea of a random internet surfer who visits a page and gets to other pages by clicking on links. The probability that a random surfer reaches a certain page is that page’s PageRank. The score Google later displayed publicly was expressed on a scale from 0 to 10, widely believed to be logarithmic, where 10 represents the most trustworthy web source there can be.

PageRank is an objective measure that aligns with searchers’ subjective intentions: the more sources pointing to a page, the more valuable the information on that page and the more likely users are to visit it.

But referring sources are not equal: the number of pages that link to them is measured as well. The more backlinks a referring page has, the more PageRank power it passes on to a page it links out to. Let’s explore it in more detail.

How Google PageRank used to be calculated

Here’s the original PageRank formula:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

where

  • A is the analyzed page
  • T1…Tn are the pages pointing to the analyzed page
  • C(T) is the number of outbound links placed on the linking page T
  • d is a damping factor that corresponds to the probability that a user keeps following links instead of jumping to a random page (usually set to 0.85)

When pages cast votes on other pages by citing them, they distribute their PageRank. For example, page A has a PageRank score of 5 and it links to pages B and C. Setting aside any other links that point to pages B and C, the two pages together receive 85% of page A’s score (5 × 0.85 = 4.25), or 2.125 each. If page B cites page D and nothing else, D’s PageRank score will include 85% of B’s score, and so on.
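To see how the formula behaves on a whole graph rather than a single link, here is a minimal Python sketch that applies it repeatedly to a tiny, hypothetical four-page web. The graph, the starting score of 1.0, and the fixed number of iterations are assumptions made for illustration; this is not Google's implementation.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)).

    `links` maps each page to the list of pages it links out to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # starting guess (an assumption)
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum PR(T) / C(T) over every page T that links to `page`.
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical web: A links to B and C, B links to D, C and D link back to A.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["A"], "D": ["A"]}
scores = pagerank(graph)
for page in sorted(scores):
    print(page, round(scores[page], 3))
```

In this toy graph, page A collects the full vote of two pages, so it ends up with the highest score, while B and C each receive only half of A's vote.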

PageRank calculation example

Let’s examine a simple example of PageRank distribution made with a PageRank simulator:

PageRank distribution example

Page 3 here has the highest PageRank score because it is linked to the most. And because page 3 has the highest score, the PageRank it passes on to pages 4 and 5 is also higher. Naturally, this calculation is done in isolation from a real-world scenario, assuming that only these 5 pages exist on the web, but it shows, in a simplified manner, how the value of PageRank is distributed across web pages.

Since PageRank is an authority metric, the power passed through links is calculated hierarchically: a citation from a PageRank 8 page weighs more than a citation from a PageRank 2 page. But your page can get a higher PageRank value through links from less authoritative pages if they contain fewer outbound links. Say your page is linked from a PageRank 7 source that contains 10 outbound links and also from a PageRank 3 source that contains only 3 links. With the damping factor of 0.85, the first source will pass 7 / 10 × 0.85 = 0.595, while the second will pass 3 / 3 × 0.85 = 0.85. However, high-quality and popular pages don’t usually link out to lots of other pages, so it’s always best to concentrate on getting backlinks from the most trusted sites.
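The comparison above comes down to the per-link term of the original formula, d × PR(T) / C(T). A minimal sketch of that arithmetic, assuming the usual damping factor of 0.85 (the function name is made up for illustration):

```python
def passed_pagerank(source_score, outbound_links, d=0.85):
    """Share of a source page's score passed through one link: d * PR(T) / C(T)."""
    return d * source_score / outbound_links


strong_source = passed_pagerank(7, 10)  # PageRank 7 page with 10 outbound links
weak_source = passed_pagerank(3, 3)     # PageRank 3 page with 3 outbound links
print(strong_source, weak_source)
print("Weaker source passes more:", weak_source > strong_source)
```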

The value of PageRank is based on the number of outgoing links

The updated Google PageRank algorithm

In 2004, Google filed an updated PageRank patent based on a “reasonable surfer model”, which introduced the idea that links may have different values based on their potential to be clicked. For example, links placed at the top of the page or links with sufficiently long, informative anchor texts are usually more visible and attractive to users. From that moment on, the likelihood of being clicked has been considered for assessing authority and serving rankings.
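The idea can be sketched in a few lines of Python: instead of splitting a page's vote evenly, each link gets a share proportional to a click-likelihood weight. The weights and URLs below are invented for illustration; the article does not state which features or values Google actually uses.

```python
def distribute_by_click_likelihood(page_score, link_weights, d=0.85):
    """Split d * page_score across links in proportion to their weights."""
    total = sum(link_weights.values())
    return {
        target: d * page_score * weight / total
        for target, weight in link_weights.items()
    }


# Hypothetical weights: a prominent in-content link, a sidebar link, a footer link.
weights = {"/guide": 3.0, "/pricing": 1.5, "/terms": 0.5}
shares = distribute_by_click_likelihood(1.0, weights)
for target, share in shares.items():
    print(target, round(share, 3))
```

With equal weights, this reduces to the even split of the original formula.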

In 2006, Google designed a new system that selects a few trusted sources called seed pages and assesses the quality of other pages based on their distance in web-link graphs from seed pages. It was a response to PageRank being vulnerable to manipulations, and the new formula looked like this:

∀ p ∈ P, p ≠ si: Ri(p) = d · Σ(q→p) [ Ri(q) / qout ] · w(q→p)

where

  • si are high-quality seed pages
  • P represents all web pages
  • qout is the out-degree of a page q
  • w is a weight of the link (set to 1 by default)

Google names The New York Times as a good example of a seed page because it is diverse enough to cover a wide range of topics that interest users and features a lot of helpful outgoing links. Pages cited by seeds are considered to be high-quality as well, and the easier it is to reach a page from a seed, the more reliable it is and the higher score it has.

According to this updated patent, the process of ranking distribution based on links goes through the following steps:

  • The system receives a set of pages open to be indexed and ranked
  • The system knows a set of seed pages that link out to other pages
  • The system calculates how far the analyzed pages are from the seeds based on the links between them
  • The system determines the rankings based on the shortest distances to seed pages
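The steps above can be sketched with a breadth-first search over a link graph. This is a simplified illustration, assuming every link has the same length of 1 and using a made-up graph; the patent's actual distance calculation is not described in this article.

```python
from collections import deque


def distances_from_seeds(links, seeds):
    """Breadth-first search: fewest link hops from any seed page to each page."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:  # first visit is the shortest path
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


# Hypothetical link graph with a single seed page.
graph = {
    "seed": ["news", "blog"],
    "news": ["shop"],
    "blog": ["shop", "forum"],
    "shop": ["spam"],
}
dist = distances_from_seeds(graph, ["seed"])
ranking = sorted(dist, key=dist.get)  # closest to a seed first
print(ranking)
```

Pages that cannot be reached from any seed never appear in the result, which mirrors the idea that distance from trusted sources is what earns a score.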

This new algorithm, which replaced the original PageRank formula, is faster to compute because it no longer progresses from one iteration to another. And even though the original PageRank patent expired in 2018, that doesn’t mean Google has stopped using it. When asked about Google’s algorithms behind E-E-A-T, Elizabeth Tucker, Director, Product Management at Google Search, referred to “PageRank, one of Google’s classic ranking signals” which “aligns most with authoritativeness.”

PageRank toolbar and link manipulation

In 2000, Google made the PageRank score of any website publicly visible in its browser toolbar. Such exposure led to ranking manipulation: website owners and SEOs would concentrate on getting more links from high-scoring pages, and whole link farms emerged to sell those links. Google made several attempts to stop this manipulation and eventually retired the toolbar score in 2016.

Links that don’t pass PageRank

Due to the continuous emergence of various link manipulation techniques and link abuse by SEOs, Google started asking webmasters to qualify outbound links using the following three values of the rel attribute:

  • rel="nofollow" for links you don’t want to pass PageRank with
  • rel="ugc" for user-generated links in comments or forums
  • rel="sponsored" for links that are advertisements or paid placements

These attribute values were introduced to help Google fight link spam. Google’s link spam policies now prohibit all kinds of link manipulation, including:

  • Buying or selling links for ranking purposes 
  • Excessive link exchange
  • Anchor spam
  • Requiring a link as part of a Terms of Service
  • Links from low-quality bookmark sites and directory sites
  • Advertorials created just to get a link

The nofollow value

Google, together with other major search engines, introduced the nofollow value of the rel attribute in 2005. This value tells search crawlers not to follow a link and prevents the distribution of link equity. Before nofollow, people could flood the internet with comments containing links to their website and inflate its PageRank score.

This new attribute value spurred new link manipulation practices. The weight that a page passes through each link depends on how many links it has: the more links on a page, the smaller the share of its PageRank each one receives. SEOs would therefore use nofollow to direct the flow of PageRank and pass more weight via followed links, a practice known as PageRank sculpting.

The UGC value

Compared to naturally placed relevant outbound links, comment links are often less trustworthy, so it isn’t fair to give them the same credit. In 2019, Google added a new value of the rel attribute specifically designed for comment links: UGC (user-generated content). Now, many blogs and forums automatically mark any links posted in the comments section as UGC, while nofollow is used for a broader range of purposes.

The sponsored value

In 2019, Google introduced rel="sponsored", which asks web developers to mark links that are advertisements or paid placements. This “hint” helps Google understand the intention behind the link placement, and the link may not be counted as a credit for the linked page.

Summing up, these attribute values were introduced so that links that should not pass PageRank can be marked for Google. But it doesn’t mean that only these links are ignored: Google uses the Penguin algorithm to expose all kinds of link schemes and prevent link spam from passing PageRank. Moreover, you can tell Google to ignore some of your backlinks by updating your disavow file.
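If you want to audit how the links on a page are qualified, a short script can list them by rel value. Below is a minimal sketch using Python's standard html.parser module; the HTML snippet, URLs, and class name are made up for illustration.

```python
from html.parser import HTMLParser

# rel values that qualify a link as one that should not pass PageRank
NON_PASSING = {"nofollow", "ugc", "sponsored"}


class LinkRelCollector(HTMLParser):
    """Collect (href, set of rel values) for every <a> tag in an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "ugc nofollow"
        rel = set((attributes.get("rel") or "").lower().split())
        self.links.append((href, rel))


html = """
<p><a href="https://example.com/study">study</a>
<a href="https://example.com/ad" rel="sponsored">partner offer</a>
<a href="https://example.com/profile" rel="ugc nofollow">commenter</a></p>
"""
collector = LinkRelCollector()
collector.feed(html)
qualified = [href for href, rel in collector.links if rel & NON_PASSING]
plain = [href for href, rel in collector.links if not rel & NON_PASSING]
print("Qualified links:", qualified)
print("Plain links:", plain)
```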

Does Google still use PageRank in 2024?

Even though PageRank has changed and is no longer a dominant ranking factor, Google still uses it. This is because links help bots understand the topical relevance and trustworthiness of pages. Google confirmed that links or mentions from well-known sites to your content matter a lot. Moreover, Google’s big search document leak revealed that there is still an algorithm called PageRank_NS.

Uncovering The Search Algorithm Leak of 2024

The Google algorithm leak revealed over 14,000 features and possible ranking signals, which almost every SEO professional has since dug into. Of course, due to the limited context, not all attributes from the original leaked document have been understood, but some seem to be very easy to interpret, like siteAuthority or uniqueChromeViews.

PageRank_NS (Nearest Seed)

This version of PageRank wasn’t publicly shared and is only known from the leak. It is a newer version of PageRank calculated using the NearestSeeds method. Websites that Google treats as highly trusted, like The New York Times, are designated as seed sites, which means that they carry more PageRank than regular sites. PageRank_NS is believed to factor in both content relevance and quality.

Homepage PageRank

This is another ranking feature associated with PageRank. It was found among the leaked attributes. SEOs assume that Homepage PageRank, along with siteAuthority, is used as a proxy for new pages until they have their own PageRank calculated.

Factors that influence PageRank in 2024

Different aspects of link building can affect the PageRank score:

  1. The number of links

The more incoming links a webpage has, the better. This rule hasn’t changed much since Larry Page and Sergey Brin introduced the first version of the algorithm. However, it is only one of many factors now, so you have to pay attention to the quality of your backlinks, not just the quantity.

  2. The quality of links

Links from high authority websites pass more PageRank. This has been confirmed by many tests as well as by various statements from Googlers.

  3. Link attributes

As mentioned above, backlinks with nofollow, sponsored, and UGC attributes don’t pass the same amount of PageRank as so-called “follow” links.

  4. The likelihood of being clicked

Depending on their position on the page, links carry different PageRank weights. Footer links are less likely to be clicked than links in the header, which means that header links pass more PageRank.

Optimizing your backlinks for PageRank

Getting backlinks to cast votes in your website’s favor is still one of the most important factors for establishing authority on the web, but many conditions influence whether a link will pass PageRank. Here is a list of the factors that SEOs need to keep under control.

  • Backlink relevance: Relevance is key to SEO in many aspects. Google doesn’t like it when pages are interlinked randomly. Let’s say your page contains a cooking recipe and gets links from pages about cars. No matter how trusted the external source, this type of link won’t boost your page’s rankings.
  • Anchor texts: Meaningless anchor texts like “click here” or over-optimized ones that contain target keywords are not good for establishing relevance. Anchor text should describe what the linked source is about and serve as a hint to why a user should follow the link. Google’s link guidelines say that empty, too generic, and weirdly long anchor texts won’t pass PageRank as expected.
  • Authority of a linking website: It’s important to verify the domain and page quality of the sources you want backlinks from and to monitor harmful links coming from low-quality sources.
  • Backlink accessibility for Google: Links matter only if search crawlers can find them and they are not blocked in robots.txt or by other methods. Make sure you use link formats that can be crawled by Google.
  • Link correctness: Both linked and linking pages should be open for indexing. Also, not just any redirect can pass the full link equity: even though Google stated that all types of redirects pass PageRank, some SEOs believe it may not be the case with non-301 redirects.
  • Backlink accessibility for users: Hidden links might lead to penalties, and the more visible links are, the better for UX and SEO. It doesn’t mean that links should stand out sharply: they should be easily distinguishable but designed with common link visualization principles.
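For the crawlability point in the list above, a quick first check is whether robots.txt rules block the linking page. Here is a minimal sketch using Python's standard urllib.robotparser module. The rules and URLs are made up, and this parser is only an approximation of how Googlebot interprets robots.txt, so treat the result as a first check rather than a verdict.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for the site hosting the backlink.
robots_txt = """
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())  # parse rules from text, no network request

for url in (
    "https://example.com/blog/post",
    "https://example.com/private/page",
):
    print(url, "crawlable:", parser.can_fetch("Googlebot", url))
```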

Since PageRank assesses the authority on a page and not a site basis, internal links are as important as backlinks. With proper internal linking, you can distribute the link flow:

  • The more internal links point to a page, the higher its PageRank
  • The more links placed on a page, the less PageRank value each of them passes
  • Links that are more likely to be clicked pass more PageRank
  • Links marked with nofollow don’t pass any PageRank

As for outbound external links, they don’t impact the PageRank score of the pages they are placed on. They do serve as relevancy signals and help Google establish connections between different sources, but they don’t directly influence search engine rankings.

Alternative authority metrics

PageRank was the first authority metric to influence the web and SEO practices. It is still used among Google’s ranking signals even though it’s not clear how exactly. It’s safe to say that relevant links from high-quality sources are crucial for both rankings and establishing authority. 

Other SEO metrics that aim to assess website authority also revolve around backlink quantity and quality.

For example, SE Ranking’s Domain Trust and Page Trust are aggregated scores of domain and page quality that are based on the number and quality of backlinks and referring domains. You can get an idea of any website’s quality by running its analysis in the Competitive Research tool:

Domain Trust in Competitive Research

DT and PT data is also available in the Backlink Checker, Backlink Monitor tool, and many other SE Ranking tools that feature major domain metrics, from SERP Competitors to Content Editor.


