Posted by
Darrell Mordecai
Yes, we all know that Google is a semantic search engine.
What this means is…
Google attempts to understand the meaning of your content rather than just looking for literal keyword matches.
The reason is that Google wants to improve the user experience by bringing results that more accurately satisfy the searcher’s intent.
In order to do this Google has Natural Language Processing (NLP) algorithms that read through your content.
In this post, I’ll attempt to answer:
- What is natural language processing?
- How does natural language processing work?
- How do you use natural language processing in SEO?
To do this, I’ll be focusing on theoretical knowledge as well as putting on my scientist hat and doing a little experimentation. (Yes, I love tinkering.)
My goal for this experiment is to have a deeper understanding of how to create on-page SEO for NLP.
But first, in case you were wondering what Google’s NLP is…
What is NLP (Natural Language Processing)?
NLP, or natural language processing, is a field of artificial intelligence that allows a machine to understand human language, whether written or spoken. It combines linguistics and computer science, allowing computers to analyze and ‘understand’ language in order to extract meaning from text and speech.
Now, you might be wondering…
What is Google’s natural language processing good for?
Natural language processing allows search engines to bring surprisingly accurate answers to user queries and to deliver quick answers in the form of SERP features. It also helps users refine their searches when the question is broad or vague.
In other words, think about this…
Have you noticed that when you type a query into Google, you get a surprisingly accurate and relevant answer?
How does Google do it?
In reality, Google solves two problems at the same time.
First, Google has to understand the query.
Think about it.
How do you word your query so that it will bring you the results you are actually looking for? For the most part, you use natural language.
In contrast, think back to how you worded your query in 2018 before Google’s BERT algorithm rolled out. From what I recall, you’d construct a query based on short phrases. If you didn’t find what you were looking for, you’d try a few variations until Google presented the information you were looking for.
But, now, you type a query using natural language. This means that Google has to understand your query.
Secondly, once Google understands your query, it has to figure out what content to bring in order to answer the question. This means bringing a ranked list of URLs as well as SERP features.
To understand which content to bring you, Google needs to understand the meaning of the content it has in its index.
This means Google’s natural language processing techniques analyze both the query and the content in Google’s index.
The best example of this is in the content included in Featured Snippets. When Featured Snippets were new, they often presented incomprehensible word salads.
The reason is that Featured Snippets often bring content from unstructured paragraphs. This means in order to provide a simple and accurate answer to a user query, Google has to do some editing.
Now, as Google’s understanding of language has vastly improved over the last few years, Featured Snippet text has become clear, useful, and easy to understand.
Long gone are the days of the incomprehensible word salads as you can see in the rather ironic Featured Snippet below.
How does Natural Language Processing Work?
Although it’s a big subject, here is a 30,000 ft overview.
Google’s natural language processing algorithms work by dividing sentences up into different terms, breaking the sentences down into parts of speech, and working out the relationship between words based on grammar rules.
You can easily see this for yourself by analyzing your text with Google’s Natural Language API demo.
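If you would rather script this step than paste text into the demo, here is a minimal sketch in Python. It assumes the public Cloud Natural Language REST method `documents:analyzeSyntax` and the JSON field names as I understand them from its documentation. The helper names are my own, the request is only built (sending it needs credentials that are not shown), and `SAMPLE_RESPONSE` is hand-written to illustrate the response shape. It is not real output from Google.

```python
import json

# Assumed REST endpoint (v1). Sending the request requires an API key or
# OAuth token, which this sketch deliberately leaves out.
ANALYZE_SYNTAX_URL = "https://language.googleapis.com/v1/documents:analyzeSyntax"


def build_syntax_request(text: str) -> str:
    """Return the JSON body for an analyzeSyntax call on plain text."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return json.dumps(body)


def summarize_tokens(response: dict) -> list[tuple[str, str, str, str]]:
    """List (word, part of speech, dependency label, head word) per token."""
    tokens = response.get("tokens", [])
    rows = []
    for token in tokens:
        edge = token["dependencyEdge"]
        head_word = tokens[edge["headTokenIndex"]]["text"]["content"]
        rows.append(
            (token["text"]["content"], token["partOfSpeech"]["tag"], edge["label"], head_word)
        )
    return rows


# Hand-written illustration of the response shape, NOT real API output.
SAMPLE_RESPONSE = {
    "tokens": [
        {"text": {"content": "Dragons"}, "partOfSpeech": {"tag": "NOUN"},
         "dependencyEdge": {"headTokenIndex": 1, "label": "NSUBJ"}},
        {"text": {"content": "symbolize"}, "partOfSpeech": {"tag": "VERB"},
         "dependencyEdge": {"headTokenIndex": 1, "label": "ROOT"}},
        {"text": {"content": "power"}, "partOfSpeech": {"tag": "NOUN"},
         "dependencyEdge": {"headTokenIndex": 1, "label": "DOBJ"}},
    ]
}

for row in summarize_tokens(SAMPLE_RESPONSE):
    print(row)
```

Each row pairs a word with its part of speech and the word it depends on, which is the same breakdown the demo draws as a syntax tree.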
Next, Google identifies subjects and objects as entities and then assigns them to entity types such as person, location, organization, etc. The algorithms also identify known entities, meaning entities that already exist in Google’s Knowledge Graph.
(As a side point, to truly understand this blog post, you need a clear understanding of what entities and knowledge graphs are. So, in case you were wondering, I’ve included links above to other blog posts designed to make these concepts crystal clear.)
Google’s Natural Language API demo showing how Google’s NLP breaks up sentences into entities.
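To make the known versus unknown distinction concrete, here is a small Python sketch. It assumes the response shape of the Cloud Natural Language `analyzeEntities` method, where a recognized entity carries a `wikipedia_url` in its `metadata`. The function name is mine, and the sample response is hand-written for illustration (including the `OTHER` type), not copied from the demo.

```python
def split_entities(response: dict) -> tuple[list[dict], list[dict]]:
    """Separate entities that carry a Wikipedia link (known) from the rest."""
    known, unknown = [], []
    for entity in response.get("entities", []):
        row = {
            "name": entity["name"],
            "type": entity["type"],
            "wikipedia_url": entity.get("metadata", {}).get("wikipedia_url"),
        }
        if row["wikipedia_url"]:
            known.append(row)
        else:
            unknown.append(row)
    return known, unknown


# Hand-written illustration shaped like an analyzeEntities response.
# The entity types and values are assumptions, NOT real API output.
SAMPLE_RESPONSE = {
    "entities": [
        {"name": "dragon symbol", "type": "OTHER", "metadata": {}},
        {"name": "Chinese", "type": "LOCATION",
         "metadata": {"wikipedia_url": "https://en.wikipedia.org/wiki/China"}},
    ]
}

known, unknown = split_entities(SAMPLE_RESPONSE)
print("known:", [e["name"] for e in known])
print("unknown:", [e["name"] for e in unknown])
```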
Google also analyzes the sentiment, which means the attitude that the writer has toward the entities mentioned in the text.
Google’s Natural Language API demo showing Google’s NLP sentiment analysis.
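As a rough sketch of how you might read those sentiment numbers in Python: the API reports a `score` between -1.0 and 1.0 per entity (via the `analyzeEntitySentiment` method, as I understand it). The 0.25 cut-off below is my own assumption for bucketing scores, and the sample response is hand-written, not real output.

```python
def label_sentiment(score: float, threshold: float = 0.25) -> str:
    """Map a score in [-1.0, 1.0] to a coarse label. The threshold is an assumption."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def entity_sentiments(response: dict) -> dict[str, str]:
    """Return {entity name: sentiment label} for an entity-sentiment response."""
    return {
        entity["name"]: label_sentiment(entity["sentiment"]["score"])
        for entity in response.get("entities", [])
    }


# Hand-written illustration of the response shape, NOT real API output.
SAMPLE_RESPONSE = {
    "entities": [
        {"name": "Featured Snippets", "sentiment": {"score": 0.7, "magnitude": 0.7}},
        {"name": "word salads", "sentiment": {"score": -0.6, "magnitude": 0.6}},
        {"name": "index", "sentiment": {"score": 0.0, "magnitude": 0.0}},
    ]
}

print(entity_sentiments(SAMPLE_RESPONSE))
```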
Google also attempts to understand the content category. As you can see in the screenshot below, the category of the analyzed text is Internet & Telecom.
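For completeness, here is a short Python sketch of reading a category result. It assumes the `classifyText` response shape, a list of `categories` with a `name` and a `confidence`. The confidence values and the 0.5 cut-off are made up for illustration.

```python
def top_categories(response: dict, min_confidence: float = 0.5) -> list[str]:
    """Return category names at or above min_confidence, strongest first."""
    categories = response.get("categories", [])
    ranked = sorted(categories, key=lambda c: c["confidence"], reverse=True)
    return [c["name"] for c in ranked if c["confidence"] >= min_confidence]


# Hand-written illustration of the response shape, NOT real API output.
SAMPLE_RESPONSE = {
    "categories": [
        {"name": "/Business & Industrial", "confidence": 0.31},
        {"name": "/Internet & Telecom", "confidence": 0.82},
    ]
}

print(top_categories(SAMPLE_RESPONSE))
```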
Okay, so now that you’ve had a glimpse into how Google understands content, it becomes obvious that NLP and SEO go hand in hand.
So, let’s test out how Google understands language and see if we can glean some SEO insights.
Four Examples of NLP Based on Four Similar Queries
To demonstrate how Google’s NLP algorithms work, I created a little test to see how Google understands semantically similar queries. What resulted are some great examples of NLP.
In this experiment, my goal is to understand if Google understands the difference between the queries. To do this I’ll examine:
- How Google answers each query
- The semantics based on my own human understanding
- The semantics based on Google’s NLP Demo tool
During this experiment, I specifically looked for queries that bring Featured Snippet results. I chose Featured Snippets because, as I mentioned above, in order to create one, Google often brings various elements together such as text and images. How Google does that demonstrates how Google interprets each element.
And, since Google brings the Featured Snippet as an answer to the query, Featured Snippets also show how Google understands the query.
In other words, I wanted to see how Google curates content, which takes more sophistication than just bringing a list of blue links. Because the more detailed the result, the less likely Google took a wild lucky guess.
I searched these four queries:
- What is the Chinese dragon symbol
- Dragon symbol Chinese
- What is the dragon symbol Chinese
- What does the Chinese dragon symbol represent
Now, at a first glance, you might think these queries have the same or at least similar intent.
And, if that’s the case, they should all bring the same results.
So to test this out, I searched each one on Google.
And, as I’m sure you would imagine, each result was different. What’s more, they didn’t even bring the same URLs, text, or images.
So let’s examine each one to try to understand what Google sees and try to understand why.
1. Example #1 – ‘What is the Chinese dragon symbol’
The first query, ‘what is the Chinese dragon symbol’, is grammatically correct. If you type a query like this into Google you should in theory get the best results.
As you can see from the Featured Snippet above, Google brings a quote from Chineasy.com’s blog post entitled ‘The meaning of the Dragon Symbol in Chinese Culture.’
Now before looking at tools, here is my semantic analysis based on my own human understanding. (Yes, my brain is my favorite SEO tool.)
The query is worded to understand what the Chinese dragon symbol is. This means the main entity is the dragon symbol which comes from China.
Google answers this question perfectly. For instance, the title tag expresses that the article will explain the meaning of the dragon symbol in Chinese culture. This is exactly what the searcher was looking for.
Looking at the text in the Featured Snippet, you’ll notice that Google is explaining what dragons symbolize in Chinese culture.
Again, this is a pretty good user experience if you ask me. The Featured Snippet gives the user a pretty good overview of the topic.
Now, let’s jump into Google’s Natural Language API demo to see how Google understands the query.
Looking at the screenshot above, Google seems to understand that there are two entities in the query.
Firstly, the word ‘Chinese’ refers to China the location. You can see this by the fact that the API demo identifies it as ‘location’. Also, the tool features a link to a Wikipedia article about China. A Wikipedia article appears when Google recognizes an entity that’s in its Knowledge Graph.
Secondly, Google recognizes an unknown entity called ‘dragon symbol’ that is related to the location ‘China’.
From my perspective, this lines up with my human understanding of the entities in the query and the result is most likely to satisfy the user’s query.
Now let’s take a look at the next query.
2. Example #2 – ‘Dragon symbol Chinese’
The next query is ‘dragon symbol Chinese’.
Even though this query is not grammatically correct, as a human, I’d generally type a query like this as short-hand for ‘What is the Chinese Dragon symbol?’
To me, the missing ‘What is the…’ is implied and the two queries mean the same thing. My brain fills in the gaps.
But, to Google, the semantic differences require an entirely different answer.
When I typed it into Google, this is what I saw:
As you can see in the screenshot above, Google presents information from Wikipedia. What happened to the Chineasy.com URL that we saw above?
If we go by the title tag alone, it seems like Google is bringing content from a general article about Chinese dragons rather than a specific article that explains what the Chinese dragon means in Chinese culture that we saw when looking at the previous query.
Also, the content in the Featured Snippet is a little awkward.
Although the answer is similar to the one we saw when we looked at the previous query, here the first sentence starts with ‘The dragon symbol is also…’. The word ‘also’ here seems strange.
So the big questions are, why is the URL different, and why is the text awkward?
Perhaps the answer is that the query is missing the ‘What is the…’ context words.
And, from experience, when someone types a broad term into the search bar, Google gives a broad response that could satisfy a number of user intents.
Just surmising here but, this might explain why Google brings the Wikipedia article instead of the Chineasy.com article.
The Wikipedia article is more generic and does not focus on the symbolism of the Chinese dragon but covers the topic more broadly.
It first explains what the Chinese dragon is in general and only mentions what it symbolizes in the fourth line.
And if you look at the screenshot above, you’ll see why the word ‘also’ is mentioned in the Featured Snippet.
In other words, by the user excluding the words ‘What is the…’, Google brings a more generic URL.
Now let’s look at the query in Google’s API demo.
Here we see that Google (according to this tool) understands the query as one entity. ‘Chinese’ is no longer identified as a location, as it was in the previous query.
Now out of complete curiosity, I changed the word order to ‘Chinese dragon symbol’ to see if I got the same result. In the screenshot below, you’ll see that when I change the word order, the word ‘Chinese’ is not seen as an entity at all.
Instead, Google views it as an adjective.
The takeaway here is that just changing the word order changes how the search engine understands the query.
That said, the result is somewhat satisfying but my feeling is the previous result is far better.
Let’s look at the third query.
3. Example #3 – ‘What is the dragon symbol Chinese’
The query ‘what is the dragon symbol Chinese’ is a combination of the previous two queries.
Now from a user perspective, I’d imagine that Google reads this query the same way it reads the previous one. I mean as a human being, I read ‘dragon symbol Chinese’ as a shorthand way of writing the sentence ‘What is the dragon symbol Chinese’.
But, if you look at the Featured Snippet, you’ll notice that Google doesn’t agree.
As you can see from the screenshot above, Google presents the site Chineasy.com’s URL in the Featured Snippet while the previous query brought a URL from Wikipedia.
This is almost identical to the Featured Snippet we saw in the first query, which leads me to believe that the URL change in the second query was not a result of the word order change, but was a result of the missing context words ‘What is the…’.
If we look at the query using Google’s NLP analyzer, we see that Google assumes that the main entity is ‘symbol Chinese’.
Now let’s look at the query ‘What does the Chinese dragon symbol represent?’
4. Example #4 – ‘What Does the Chinese Dragon Symbol Represent’
Again when you look at the Featured Snippet, you’ll notice that Google brings you a completely different URL. This time Google brings you depts.washington.edu.
Now, let’s try to understand why Google brought a different URL this time. If we compare it to the first query ‘What is the Chinese dragon symbol’, there is one basic difference.
The first query asked a vague question. When you ask a question starting with ‘What is…?’ you are not qualifying your question in any way. But, when you ask the question ‘What does x represent?’ you are asking a more specific question. The more specific the question, the more specific the answer.
Now with that in mind, let’s look at the Featured Snippet text.
As you can see, the text includes the words ‘the dragon symbol represents’.
Perhaps Google used this URL because the query included the word ‘represent’ (What does the Chinese dragon symbol represent?)
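To keep the four results straight, here is a tiny Python sketch that groups the queries by the site that supplied the Featured Snippet. The data is simply what I reported in the four examples above. The dictionary and function names are my own.

```python
from collections import defaultdict

# Featured Snippet source per query, as reported in the four examples above.
SNIPPET_SOURCES = {
    "what is the Chinese dragon symbol": "Chineasy.com",
    "dragon symbol Chinese": "Wikipedia",
    "what is the dragon symbol Chinese": "Chineasy.com",
    "what does the Chinese dragon symbol represent": "depts.washington.edu",
}


def group_by_source(results: dict[str, str]) -> dict[str, list[str]]:
    """Group queries by the site shown in their Featured Snippet."""
    groups = defaultdict(list)
    for query, source in results.items():
        groups[source].append(query)
    return dict(groups)


for source, queries in group_by_source(SNIPPET_SOURCES).items():
    print(f"{source}: {queries}")
```

Only the two queries that begin with ‘What is the…’ share a source. That is consistent with my guess that the context words, not the word order, drove the change.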
Okay, we’ve covered some NLP theories and also some examples of NLP in action.
Now let’s look at NLP for SEO.
How Can You Implement NLP in Your SEO?
After my little demonstration, you’ve seen how Google and other search engines use NLP techniques to understand your content. So as an SEO, the big question is, how can you use NLP to improve your SEO?
Below is a short list designed to help you improve your SEO NLP.
Include Search Intent in Your Keyword Research
As I mentioned above, Google analyzes your search queries. When Google does this, it’s attempting to understand the search intent behind the query so that it can bring relevant results that adequately answer the question.
This means understanding how Google interprets search intent is crucial. The simplest way to do that is to analyze the SERPs.
The reason is that by performing SERP analysis you can easily see what resources Google brings to answer the query. By seeing this you can figure out how Google understands the search intent.
Write Simply and Clearly
As you’ve seen from the NLP examples above, Google analyzes the subjects and objects of sentences in your content to identify entities. What’s more, making small changes to your sentence structure can change the semantic structure of a sentence in a way that you as a person might not detect. Remember, Google isn’t human and doesn’t understand your content the way that you do.
So, to deal with this, always write simple sentences and try to express one idea per sentence.
Identify and Include Entities In Your Content
Since Google not only identifies entities in your content but also links entities in your content to known entities in its knowledge graph, you should try to identify all the entities that Google expects to see in content that answers the search query.
You can easily do this by analyzing Google’s top content using Google’s API demo or by importing entity data using Python.
Analyzing your competitor’s content with Google’s API demo is a great place to get started. Simply drop their content into the demo and hit analyze.
Once you’ve done that, look through the Entities report. Wherever the tool brings a URL (usually from Wikipedia) you’ve found a known entity.
Using this tool, you can look at all of the top content and look for common entities to include in your content.
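Once you have noted the entities for each top page, finding the common ones is easy to script. Here is a minimal Python sketch. The page labels and entity lists are hypothetical placeholders for whatever you collect from the demo, and the rule of ‘appears on at least two pages’ is my own assumption.

```python
from collections import Counter


def common_entities(pages: dict[str, list[str]], min_pages: int = 2) -> list[tuple[str, int]]:
    """Return (entity, page count) for entities found on at least min_pages pages."""
    counts = Counter()
    for names in pages.values():
        # A set counts each entity once per page, however often it is mentioned.
        counts.update({name.lower() for name in names})
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(name, count) for name, count in ranked if count >= min_pages]


def missing_entities(pages: dict[str, list[str]], my_entities: list[str],
                     min_pages: int = 2) -> list[str]:
    """Return common competitor entities that your own content does not mention."""
    mine = {name.lower() for name in my_entities}
    return [name for name, _ in common_entities(pages, min_pages) if name not in mine]


# Hypothetical entity lists, NOT taken from real pages.
COMPETITOR_ENTITIES = {
    "competitor-a": ["China", "Dragon", "Emperor", "Feng shui"],
    "competitor-b": ["China", "Dragon", "Chinese New Year", "Emperor"],
    "competitor-c": ["China", "Dragon", "Chinese zodiac"],
}
MY_ENTITIES = ["China", "Dragon"]

print(common_entities(COMPETITOR_ENTITIES))
print(missing_entities(COMPETITOR_ENTITIES, MY_ENTITIES))
```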
Another option is to use Python to find entity data. If you want to see how to do this, check out Marco Giordano’s blog post on how to use Python for NLP and semantic SEO.
Match Answers With Questions
In an article that I saw on Bill Slawski’s site, a Google patent states that your content is more likely to be selected for People Also Ask answers if it’s presented as a question and answer.
In other words, to get your content featured in the People Also Ask feature, include the question in your content and then answer it. I’d add that, to make it easy for Google to understand that you are answering the question, you should answer it immediately after the question.
Now if writing out the question and answering it immediately in your content helps Google to include your content in PAA boxes, it stands to reason that in order for Google to understand your content, you should generally aim to structure your content this way, even if you are not targeting PAA boxes.
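If you want a quick way to check a draft for this, here is a rough Python sketch. The heuristic is entirely my own and is not something Google documents: it treats any line ending in a question mark as a question and flags it when the next non-empty line is missing or is another question.

```python
def unanswered_questions(text: str) -> list[str]:
    """Return question lines that are not directly followed by an answer line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    flagged = []
    for index, line in enumerate(lines):
        if not line.endswith("?"):
            continue
        next_line = lines[index + 1] if index + 1 < len(lines) else ""
        if not next_line or next_line.endswith("?"):
            flagged.append(line)
    return flagged


# A made-up draft for illustration.
DRAFT = """What is NLP (Natural Language Processing)?
NLP is a field of artificial intelligence that allows a machine to understand human language.

How does natural language processing work?
How do you use natural language processing in SEO?
It starts with understanding search intent.
"""

print(unanswered_questions(DRAFT))
```

In this draft only the second question is flagged, because it is followed by another question rather than by its answer.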
Use a Clear Structure
In order to understand entities, Google can pull information from structured and semi-structured data. This means Google understands simple HTML markup such as headings (H1, H2, etc.).
What’s more, from what I’ve seen, it’s easier for Google to understand structured and semi-structured data than it is for Google to understand unstructured data.
Understanding this, you should create a clear structure for your content using logical headings (H1, H2, etc).
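One way to audit this is to pull the heading outline out of a page and look for jumps in level. The Python sketch below uses only the standard library `html.parser` module. The class and function names are mine, the sample HTML is made up, and ‘never skip a level’ is my own reading of what a logical structure means here.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) for every h1 to h6 element in a page."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments inside that heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def skipped_levels(headings: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Return headings that sit more than one level below the previous heading."""
    problems = []
    previous = 0
    for level, text in headings:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems


# Made-up page for illustration.
SAMPLE_HTML = (
    "<h1>What is NLP?</h1><p>Intro.</p>"
    "<h2>How does NLP work?</h2><p>Overview.</p>"
    "<h4>Sentiment</h4><p>Details.</p>"
)

parser = HeadingOutline()
parser.feed(SAMPLE_HTML)
print(parser.headings)
print(skipped_levels(parser.headings))
```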
(For more clarity on this, check out my article on Google’s Knowledge Graph.)
What Google’s NLP Means to You
Semantic search is here and from what I’m seeing, it’s here to stay.
And, what I’m seeing is that semantic search is a game changer for SEO. We might not see it now, but as Google doubles down on algorithms like BERT and MUM, Google’s ability to work with human language is improving exponentially.
This implies to me that we must begin to optimize our content for these algorithms. And yes, I do understand that this branch of SEO is in its infancy, but as we have done in the past, with a little trial and error, we can get a handle on how to improve our traffic and visibility.
And that was what my little experiment was all about. I’m attempting to figure out practically how semantic search algorithms affect search results in order to figure out how you can better optimize your content.