Python is an incredibly powerful programming language that’s not only quite easy to learn but also very helpful for SEO professionals. It can:
- Automate repetitive SEO tasks like scraping websites, parsing data, and generating reports; this saves you time.
- Connect to APIs, allowing you to effortlessly fetch data from sources like ChatGPT API, Google Search Console API, SEMrush, or Ahrefs.
In this guide, I’ll show you how to use Python programming even if you have no prior experience!
We’ll create the following several handy tools:
- tool that scrapes a sitemap file and saves URLs with errors (404, 5xx) to a separate file.
- scraper that extracts <title> and meta description tags from a list of URLs.
- tool that connects to the PageSpeed Insights API to gather performance data about a given page.
- program that performs basic manipulations with a CSV file.
- program that combines data from Google Search Console and Google Analytics.
- handy tool that uses the ChatGPT API and saves the results in a CSV file.
And the best part? You don’t need any prior programming experience! Let’s dive in and get started!
What you need to start programming right away
To run Python programs you need a Python environment. For the purpose of this tutorial, we will be using Google Colab. It’s awesome for three reasons:
- It’s very easy to start. Creating your first Python program in it will take you just a couple of minutes (or a minute, if you’re fast).
- You don’t need to install anything on your computer.
- You can work on the same code with your friends or colleagues. Think of Google Docs but for Python programming. If you struggle with a specific program, you can ask your colleague for help.
But don’t worry we’ll take it step by step.
Here’s the deal!
To ensure your success, you need to follow these guidelines:
- Read the chapter thoroughly, so you don’t miss any crucial steps.
- Encounter a Python error? Ask ChatGPT to fix it.
- No luck? Start a new ChatGPT chat to resolve the issue.
- If ChatGPT struggles with your CSV files, provide more context and a sample line with the data. Explain the file’s structure, columns, and data format.
- In case the Python program crashes, request ChatGPT to add exception handling.
- If ChatGPT gives you incomplete input and finishes at a random place (yes, it happens!), just ask ChatGPT to start from where it finished.
Deal?
Writing your first Python program – “hello world”
All programmers begin their journey in a similar fashion. They create a simple computer program designed to do one simple thing: display “Hello world” on the screen.
Let’s do it!
To get started, request ChatGPT to generate a Python program for you:
💬
Write a Python program that will print: “hello world” on the display
Then copy the code provided by ChatGPT. You can do this by clicking on the: “Copy code” button.
Next, open Google Colab (it’s a special environment where you can run your Python programs) and select “New notebook” to create a fresh workspace.
Paste the copied code into the Google Colab cell (refer to the screenshot below for guidance) and hit the “Run” button:
Congratulations! You’ve just written your very first Python program! It printed “hello world” on the screen.
Write a program that downloads a sitemap
Alright, let’s move on to something more practical. How about we create a program that downloads all the URLs from a sitemap and saves them to a text file?
With ChatGPT, you can create this program in just a couple of minutes.
Type the following prompt into the chat:
💬
the following tasks:
- Save all the URLs in a file named “sitemap.txt”.
- Visit all the URLs in the sitemap, up to a maximum of 100 URLs, and check their status codes. If a URL does not return a 200 OK status code, save it to a file named “broken_urls.csv”, which will contain two fields: the URL and its corresponding HTTP status code.
Once you’ve got the code from ChatGPT, just as you did with the “Hello World” example, hit the “Copy code” button located at the top-right corner of the ChatGPT console.
Now, in your Google Colab notebook, the main cell is already filled with the previous program.
You have three options for handling this:
- Remove the old code and paste in the new one.
- Create a new notebook.
- Insert a new cell.
For simplicity’s sake, let’s just add new cells to the same notebook.
To do this, in Google Colab click on “Insert” and then select “Code cell”.
Go ahead and paste the code you got from ChatGPT into a new Google Colab cell, then click the “Run” button.
If your sitemap is relatively small, in a couple of seconds Google Colab will notify you that it successfully finished the task:
Where is the generated file?
In the prompt, you requested ChatGPT to download a full list of URLs and save them to a text file. To find this file in Google Colab, simply click the folder icon on the left, and here you’ll quickly spot the sitemap.txt file.
You can then open the file to see if it was properly generated:
Now, let’s explore some ways to enhance this program:
💡
Make it more bullet-proof: Python programs can crash if for some reason they can’t visit certain URLs. To ensure smooth operation, ask ChatGPT to generate code that can handle errors. ChatGPT can give you new Python code that will alert you if something goes wrong but won’t break the tool. Here’s a sample: Write a program that will extract <title> and meta description from a list of URLs.
Using Python we can easily write a program that will extract basic information (such as <title> and <meta> description from a list of pages).
Write a program that will extract <title> and meta description from a list of URLs.
Using Python we can easily write a program that will extract basic information (such as <title> and <meta> description from a list of pages).
Here’s a sample prompt:
💬
Write a Python program that:
- goes through a list of URLs stored in the “input.csv” file,
- extracts the title and meta description,
- saves the results to a “custom_extraction.csv” file. The file should have three columns: URL, title, and meta description.
Make sure that the program handles any errors that may occur during execution. Write an explanation for each piece of code as a comment.
ChatGPT quickly generated a Python program to accomplish the task:
As before, copy the code generated by ChatGPT and add a new code cell to your Google Colab notebook.
Once you paste the code, click the “Run” button.
When you try to use it, you’ll encounter an error saying the file doesn’t exist. To fix this, upload the “input.csv” file containing the URLs you want to scrape.
Click the “folder” icon from the menu and then upload the file. If you don’t want to use your own file, use my sample.
After uploading the file, the program should run smoothly, extracting the <title> and meta description values. As expected, the results will be saved in the “custom_extraction.csv” file:
Great work, ChatGPT, and Python!
Write a program that measures the speed of a website
You’ve already learned a lot!
Now, let’s dive into something more advanced. We’ll be using the PageSpeed Insights API to gather performance information about multiple URLs in bulk.
❗
Important note:
Now we’re entering a field where ChatGPT can generate different programs for every person. Some of them may work in the first run, some won’t. If something is wrong with the output don’t give up 🙂
Try:
- Telling ChatGPT that the program is not working and asking for a rewrite.
- Asking ChatGPT to create this program, but this time, in a new chat.
- If you use ChatGPT Plus, switch to ChatGPT-4 – it should generate better output.
- If none of this works, put my programs into Google Colab. Here’s the link.
So no worries, we will make it work 🙂
Writing a program that will connect to Google PageSpeed API may seem complicated at first, but the best part is that the prompt is short and quite intuitive:
💬
Write a program that will take a list of URLs from input.txt
and for each of the URLs it will get the PageSpeed Insights score.
Then save the results to the psi_api_output.csv file
with the most important metrics.
As expected, ChatGPT generated this program for me:
Next, copy and paste the code into a new cell in your Google Colab notebook.
Before you run it, please put your PageSpeed Insights API key into the code.
For instance, if your key is: “myKey43434343”, then the line should look like this: api_key = “myKey43434343”
In case you don’t have the PageSpeed Insights API key, you can obtain it from: https://developers.google.com/speed/docs/insights/v5/get-started
Once you enter the code and put your own API key in, click on the “Run” button again.
Unfortunately, I encountered an error. Oh no!
Don’t worry, we’re sticking to a no-code approach for this tutorial.
So let’s ask ChatGPT to help with the error:
ChatGPT explained the cause of the error and provided steps to fix it. Note: I trimmed the screenshot for brevity.
In this case, ChatGPT suggests replacing the get_pagespeed_insights_score function with a new, improved version. You can edit the code just like you would in a text editor, making sure to preserve the indentation (which is essential in Python).
Alternatively, if you’re pressed for time, or just not sure where exactly you should paste this, ask ChatGPT to generate the full program again, and you can replace the entire code so you can copy/paste it directly into the Google Colab Notebook:
After pasting the new code into Google Colab, it works perfectly!
For an even better program, consider the following ideas:
💡
- Ask ChatGPT to display current progress. (In my case, the program took three minutes for just 15 URLs. It can be a long wait for larger URL lists, so having a visual indication of progress is helpful).
Sample prompt: “display current progress, using the TQDM Library - Request specific performance metrics from the PageSpeed Insights API: Ask ChatGPT to retrieve particular metrics to suit your needs. Sample metrics that can be useful: First Contentful Paint, Speed Index, Time to Interactive, First Meaningful Paint.
Write a program for extracting just one column of data
Now imagine you have a CSV file with lots of data, but you only want to extract a single column (for instance, a column with a list of URLs).
Then you do it in Excel, and it crashes. You try again and… it crashes.
Forget Excel; you can do it in Python. Here’s the prompt you can use:
💬
I have a CSV file named ‘gsc_data.csv’ that contains three columns: URL, Clicks, and CTR. Each column is separated by a comma.
- Please write a Python program that extracts only the first column, which contains URLs, and saves it to a separate file named “urls_only.txt”. The program should handle cases where some data is missing for certain URLs.
- After running the program, I would like it to give me a short report about the steps that the program has accomplished.
You already know the workflow:
- Ask ChatGPT to generate the program.
- Copy the output generated by ChatGPT.
- Insert a new code cell in Google Colab.
- Upload the input file into Google Colab.
- Paste the code and click on “Run”.
As expected, the program extracted 21 URLs from ‘gsc_data.csv’ and saved them to ‘urls_only.txt’. Below you can see the preview of this file:
Write a program that combines two files into one
Another area where ChatGPT can be helpful is to generate a program that combines two files into one.
Imagine you have two separate CSV files, one coming from Google Search Console, and the other coming from Google Analytics. Excel may quickly fail with larger volumes.
Here’s a sample prompt:
💬
Please write a Python program that joins data from two files:
1. gsc_data.csv
2. ga_data.csv
based on the URL column
- both present comma-separated values
- If some data is missing in either table, please present it with an empty value
- and make sure the program works fine with any columns
- it should save the data to an external file named: combined_gsc_ga_data.csv
ChatGPT quickly generated the Python program that can join two CSV files:
I suspect you already know what to do now. In case you hesitate:
- Copy the output from ChatGPT.
- Paste the output to the new code cell in Google Colab.
- Upload input files [1] [2] to Google Colab. For the purpose of this tutorial, use files provided by me [1] [2] In case you want to use your own files, ensure that:
– Both files contain the URL field.
– Both files contain URLs in the same format. - Run the code cell.
This Python program perfectly joined the data, as shown in the screenshot below:
Ideas for further improvement of the program:
- Make sure that the data from both sources is homogeneous. Usually, Google Analytics data will present a relative URL while Google Search Console will always present the full URL. If this happens, the program will fail. In such cases, first instruct ChatGPT to add a “https://www” prefix to every URL from the ga_data.csv and then do the rest.
- Ask ChatGPT to analyze the data more thoroughly. For instance, you may want to find URLs that visitors spend a lot of time on, but… with zero clicks from Google users,. This will show you a list of URLs that are interesting for your users, but either aren’t indexed, or have some major ranking problems.
Using ChatGPT API
With the power of the ChatGPT API, you can perform various tasks in bulk, making them more efficient and time-saving. Here’s how some of the examples mentioned earlier can be executed in bulk:
- Summarize your articles in bulk: You can create summaries or meta descriptions for many articles at once.
- Check content quality in bulk: Ask ChatGPT to automatically go through a list of your articles to see how good your website’s content is.
- Fix <title> tags in bulk: You can ask ChatGPT to go through a list of URLs, extract title tags and make ChatGPT optimize your website’s <title> tags better for more clicks and conversions.
- Correct grammar in bulk: Fix grammar mistakes in multiple blog articles all at once.
Now, practice! Let’s create a program using ChatGPT that:
- Reads an input.txt file containing a list of URLs.
- For each URL, analyzes the <title> tag to determine the likelihood of users clicking on it in search results.
This way, you can optimize your website more efficiently and effectively by understanding the performance of each <title> tag.
Here’s a sample prompt:
💬
Write a Python program that will read a list of URLs from input.txt.
- For each URL, it will extract the meta description and <title> tags.
- Next, it will use the ChatGPT API to evaluate the <title> on a scale of 0-10, determining how likely it is to encourage people to click on the search result.
- The program should also provide an explanation for each judgment.
- The results should be saved in a CSV file named title_score.csv.
As always, ChatGPT faced the challenge to generate such a program!
Follow these simple steps to get started:
- Copy the output from ChatGPT.
- Paste the output into a new code cell in Google Colab.
- Modify the code to include your OpenAI API key.
Error: missing module
If you encounter an error stating that the “openai module is not found”, you’ll need to install it. To do so, simply add the command !pip install openai before the ChatGPT output and run the cell again.
The program gave me the expected file with the results:
💡
Making the output better
If you find the output quality isn’t great, there are several ways to improve it:
- Refine the prompt: rewrite the prompt to specifically ask ChatGPT for ideas to enhance your titles.
- Experiment with different ChatGPT ****** in the calls to ChatGPT API: in this case, the Python file generated by ChatGPT uses the text-davinci-002 model, which is based on GPT-2. You will get MUCH better results with GPT-3 or GPT-4 ******.
Important note:
It’s unclear what logic ChatGPT follows while judging the clickability of pages. Unfortunately, it’s more of a black box, like Google.
If you want to have more predictable results, use ChatGPT custom training techniques. As always, make sure it works in your niche!
Asking ChatGPT to explain Python code to you
If you’re new to Python and SEO, all the Python programs out there can seem overwhelming. But don’t worry, there’s an easy way to understand what a particular Python code does!
Just ask ChatGPT to explain it to you. Here’s a sample prompt:
💬
Please explain what the Python code below is meant to do
Here’s my Python code:
Make sure the explanation is written in simple words that anyone can understand.
In my case, I asked ChatGPT to explain one of the programs intended to extract the <title> and meta description tags that we generated earlier:
Wrapping up
I hope my instructions have helped you begin programming in Python. Now, it’s time for you to put what you’ve learned into action.
Think of some programs that can help you in your everyday SEO job and make use of chatGPT to help you write them!
Have fun!
So, don’t hesitate to use this feature anytime you need a little help!