What Is a Sitemap? Website Sitemaps Explained
What Is a Sitemap?
A sitemap is a file that lists all the important pages on a website. It guides search engines in determining which pages to include in their index.
Having a sitemap is an important part of SEO. Sitemaps help Google discover your pages faster and more effectively.
Types of Website Sitemaps
There are two types of sitemaps:
- XML sitemaps: sitemaps written in a specific format designed for search engine crawlers
- HTML sitemaps: sitemaps that look like regular pages and help users navigate the website
Let’s take a closer look at each of these two types.
XML Sitemaps
XML sitemaps are the preferred format of sitemaps for search engines, such as Google.
They carry three primary kinds of information for search engines:
- The list of all the URLs you want to have indexed
- The “lastmod” tag, which tells search engines when each URL was last updated
- The “hreflang” attribute, which points to the language or regional variants of each URL
XML stands for Extensible Markup Language. It’s a format that makes it easy for search engine crawlers to read a sitemap.
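To make those three kinds of information concrete, here is a minimal Python sketch that builds a one-page sitemap with a “lastmod” value and one “hreflang” alternate. The function name, the page data, and the domain.com URLs are illustrative assumptions rather than the output of any particular tool, and ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(pages):
    """Return sitemap XML text for a list of page dicts (hypothetical helper)."""
    # Register prefixes so the output uses a default namespace and "xhtml:".
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # 1. The URL you want indexed.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        # 2. When the URL was last updated (optional).
        if page.get("lastmod"):
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = page["lastmod"]
        # 3. Language or regional variants, one xhtml:link per variant.
        for lang, href in page.get("alternates", {}).items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=lang,
                href=href,
            )

    ET.indent(urlset)  # pretty-print; Python 3.9+
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Made-up example data for illustration only.
example_pages = [
    {
        "loc": "https://domain.com/",
        "lastmod": "2024-01-15",
        "alternates": {"de": "https://domain.com/de/"},
    }
]
sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

Running the script prints a small urlset document with one url entry, which you could save as sitemap.xml.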
Google says sitemaps are suitable for large websites, websites with extensive archives, or new websites with few links.
However, every website can benefit from having an XML sitemap. There are really no downsides to having one. And it only takes a few minutes to create one.
HTML Sitemaps
HTML sitemaps used to be a popular way to improve a website’s navigation and provide links to all your pages in one place.
H&M Group, for example, publishes an HTML sitemap.
It’s a standard page with links to various pages organized in a hierarchical way.
Although HTML sitemaps aren’t that common anymore, some voices in the SEO community still say they’re a must.
The fact is:
HTML sitemaps can improve your internal linking and provide another layer of navigation for complex websites with many pages.
However, do not use an HTML sitemap as a replacement for good site navigation elements (such as menus, footer links, breadcrumbs, categories, etc.).
As John Mueller, Google Search Advocate, said on Mastodon:
“If you feel the need for an HTML sitemap, spend the time improving your site’s architecture instead.”
In other words, users should not need a sitemap to navigate your website.
Note: When we talk about sitemaps in the context of SEO, we usually mean XML sitemaps. From now on, we will talk about XML sitemaps in this guide.
How to Find a Sitemap
Here are some effective ways to find a sitemap on a website:
Manual Check
The easiest way to find an XML sitemap is to look for it manually. Most commonly, a website sitemap will be located at this URL address:
https://domain.com/sitemap.xml
Quite often—especially if the website uses WordPress and the Yoast SEO plugin—you’ll be redirected to a sitemap index (/sitemap_index.xml).
A sitemap index is a simple file that lists all the sitemaps of a website. (Yes, there can be multiple sitemaps.)
To see the actual sitemap, just click the link to the specific sitemap in the index.
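If you would rather check this from a script than in the browser, the sketch below shows one way to do it in Python. It assumes the standard sitemaps.org namespace; the function names and the sample index (with made-up child sitemap URLs) are hypothetical. The fetch helper is defined but not called, so the example runs without network access.

```python
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def fetch(url, timeout=10):
    """Download a URL, following redirects; returns (final_url, body_bytes)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.url, response.read()


def read_sitemap(xml_bytes):
    """Return ("index", child sitemap URLs) or ("urlset", page URLs)."""
    root = ET.fromstring(xml_bytes)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    # Both file types store their addresses in <loc> elements.
    locs = [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]
    return kind, locs


# Made-up sitemap index for illustration only.
SAMPLE_INDEX = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://domain.com/post-sitemap.xml</loc>
    <lastmod>2024-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://domain.com/page-sitemap.xml</loc>
  </sitemap>
</sitemapindex>
"""

kind, locs = read_sitemap(SAMPLE_INDEX)
print(kind)
for loc in locs:
    print(loc)
```

Against a live site, you could call fetch("https://domain.com/sitemap.xml") and pass the returned bytes to read_sitemap. If the first value returned is "index", repeat the process for each child sitemap URL.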
Search Operators
Search operators are special phrases or symbols you can add to a search query to return more specific results.
Here are some search operators you can use to find a sitemap on a website:
- “site:domain.com filetype:xml”
- “site:domain.com inurl:sitemap”
- “site:domain.com intitle:sitemap”
You can use these operators in all the popular search engines, such as Google, Bing, or Yahoo.
Simply enter the operator into the search bar and replace “domain.com” with the actual website’s address.
The search results should return the location of the website sitemap if it exists and the search engine you’re using has indexed it.
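If you check many websites, you can generate these queries instead of typing them. The Python sketch below fills in the domain for the three operators listed above and turns each query into a Google search URL. The function names are hypothetical; the operators themselves come from the list above.

```python
from urllib.parse import quote_plus

# The three operator patterns listed above, with a placeholder for the domain.
OPERATOR_TEMPLATES = [
    "site:{domain} filetype:xml",
    "site:{domain} inurl:sitemap",
    "site:{domain} intitle:sitemap",
]


def sitemap_search_queries(domain):
    """Return the three search queries for the given domain."""
    return [template.format(domain=domain) for template in OPERATOR_TEMPLATES]


def google_search_url(query):
    """Return a Google search URL with the query URL-encoded."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for query in sitemap_search_queries("domain.com"):
    print(query, "->", google_search_url(query))
```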
Google Search Console
If you have access to a website’s Google Search Console, there’s a chance the sitemap has been submitted there.
Head to the “Sitemaps” report in the “Indexing” section of the left menu.
Here, you’ll see a section called “Submitted sitemaps.” If someone has submitted an XML sitemap before, you’ll find its URL in the list.
Note: We’ll talk about submitting a sitemap to Google Search Console later in this guide.
Robots.txt
Robots.txt is a website file that tells search engine crawlers which sections of the website to crawl and which to avoid.
You should place it in the root folder of your site: https://domain.com/robots.txt
If the robots.txt file follows best practices, it will link to the website sitemap. Just search for “sitemap” within the robots.txt document.
The line linking to a sitemap starts with “Sitemap:” followed by the full URL of the sitemap, for example “Sitemap: https://domain.com/sitemap.xml”.
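You can also read the sitemap location from robots.txt with a script. Here is a minimal Python sketch that uses the standard library’s urllib.robotparser module (its site_maps() method needs Python 3.8 or later). The sample robots.txt content and the helper name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://domain.com/sitemap_index.xml
"""


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemaps_from_robots(SAMPLE_ROBOTS_TXT))
```

To check a live site instead of sample text, you could call parser.set_url("https://domain.com/robots.txt") followed by parser.read() before asking for site_maps().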
If you’ve tried all the ways mentioned above and couldn’t locate your XML sitemap, your website probably doesn’t have one. In that case, read our guide to XML sitemaps to learn how to create a sitemap for a website. Or use a sitemap generator.
How to Review Your Sitemap for Issues
To ensure your sitemap is set up correctly, you can use a website auditing tool like Semrush’s Site Audit. The tool will crawl your website (similar to the way Googlebot does) and detect any technical SEO issues.
You can create a free account (no credit card needed) and crawl up to 100 pages within minutes.
Once you’ve signed up, setting up the first crawl is fairly easy:
- Go to “Projects” and create your first project—just enter your domain and the name of the project.
- Go to the Site Audit tool and select your domain by clicking the input field.
- Configure basic settings in the “Site Audit Settings” window that will pop up. If you’re unsure about something, this detailed setup guide will help you.
- When you’re done, hit the “Start Site Audit” button.
Once you’ve run an audit with the tool, you’ll be able to review any site errors under the “Issues” tab.
Just search for “sitemap.” You’ll get a list of issues related to your sitemap.xml file.
Some common sitemap-related issues include the following:
- Incorrect pages found in a sitemap: Your sitemap contains pages that are not supposed to be in a sitemap (like pages with redirects or pages that are not canonical)
- Sitemap has format errors: There are format errors (like missing XML tags) in your sitemap file
- Sitemap files are too large: Your sitemap exceeds Google’s size limit (more than 50 MB uncompressed or more than 50,000 URLs)
When you click the link with the number of affected pages, you’ll see a full list of affected pages.
You can also click the “Why and how to fix it” link next to each type of issue.
This will open a modal window with further explanation of the issue and tips on how to fix the problem.
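Two of the issues listed above, format errors and oversized files, can also be spot-checked locally. The Python sketch below is a rough approximation under stated assumptions, not a reproduction of how Site Audit or Google validates sitemaps: it treats 50 MB as 50 × 1024 × 1024 bytes of uncompressed XML, counts only url entries in the standard sitemaps.org namespace, and uses a hypothetical function name.

```python
import xml.etree.ElementTree as ET

MAX_BYTES = 50 * 1024 * 1024  # assumption: 50 MB measured on uncompressed XML
MAX_URLS = 50_000
URL_TAG = "{http://www.sitemaps.org/schemas/sitemap/0.9}url"


def check_sitemap(xml_bytes):
    """Return a list of problems found in the sitemap bytes (empty if none)."""
    problems = []

    if len(xml_bytes) > MAX_BYTES:
        problems.append("file is larger than 50 MB")

    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # Malformed XML, such as a missing closing tag.
        problems.append(f"format error: {exc}")
        return problems

    url_count = sum(1 for _ in root.iter(URL_TAG))
    if url_count > MAX_URLS:
        problems.append(f"{url_count} URLs exceeds the 50,000 limit")

    return problems


# Made-up one-page sitemap for illustration only.
sample = (
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://domain.com/</loc></url>"
    b"</urlset>"
)
print(check_sitemap(sample))
```

The check does not look for redirected or non-canonical pages, which requires crawling each URL.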
How to Submit and Check a Sitemap with Google
Submitting your XML sitemap to Google is an SEO best practice. That’s mainly for two reasons:
- It can speed up the process of Google discovering your sitemap
- It can help you detect issues with your sitemap
You can submit your sitemap in Google Search Console.
(If you don’t have an account yet, read our detailed guide on how to set up Google Search Console for your site.)
To submit your sitemap, go to the “Sitemaps” report. You’ll find it in the “Indexing” section of the left menu.
There, enter the URL of your sitemap in the “Add a new sitemap” section. And click the “Submit” button.
After you’ve submitted your sitemap, you’ll see a confirmation message.
For a more in-depth guide, read our post on how to submit a sitemap to Google.
You can monitor the status of your sitemap anytime you visit the report. If there’s a green “Success” message, you’re all good.
If there’s an issue with your sitemap, you’ll see a red “Couldn’t fetch” or “Has errors” status. In this case, the report will provide a detailed explanation of what went wrong and how you can fix it.
You can check the full list of possible errors and how to fix them in Google’s guide to the “Sitemaps” report.
Keep Learning
Sitemaps are an important aspect of SEO. But they’re not the only thing that matters.
Read our guide to conducting a technical SEO audit to learn about other technical aspects of SEO you should pay attention to.
Source link: Semrush.com