What’s this about a Python Website 404 Checker?
I hate wasting time doing the same thing over and over and over again*.
So I wrote this little Python tool to check for a 404 header response for a made-up URL on a domain. It saves me from the drudgery of doing it by hand, gives me absolute consistency of checking, and leaves me with an audit trail of URLs checked and responses.
*unless it’s perfectly acceptable time-wasting like watching cat videos, obviously.
Python SEO Website 404 Checker
You’d be surprised how many web servers or CMSs are poorly set up and return a 200 OK response when they should be returning a 404 Not Found, a 302 or 301 redirect, or some other status. That’s why Google Search Console has the “soft 404” report. You really want non-existent URLs to generate a 404, unless you’ve removed content, in which case you want a 301 redirect.
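To make the soft 404 idea concrete, here’s a minimal sketch using only the Python standard library. It is not the script itself: the function names `fetch_status` and `is_soft_404` are made up for illustration, and using a HEAD request is my assumption. It asks for a URL once, doesn’t follow redirects, and flags any 2xx answer for a made-up URL as a probable soft 404.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send one HEAD request and return (status, Location header).

    http.client never follows redirects, so a 301/302/307 is reported
    as-is rather than being chased to its final destination.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_soft_404(status):
    """A made-up URL that answers with any 2xx status is a likely soft 404."""
    return 200 <= status < 300
```

You’d call something like `fetch_status("https://www.domain.com/some-made-up-page")` (domain.com is a placeholder) and pass the returned status to `is_soft_404`. Some servers answer HEAD and GET differently, so treat the result as a prompt to look closer, not a verdict.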
The tool is built on the last one I published, Python SEO URL Checker for Canonicals & Redirects, and follows a similar principle.
- If you give the tool a bare domain (domain.com/), it will check all variants of the domain to see what happens with each of them (ideally a 301 redirect to the URL on the canonical domain, which should then produce a 404).
- If you give the tool a fully-qualified domain name, e.g. http://www.domain.com/, it will only check that variant. It assumes you know what’s happening with the others, and just want to double-check that version.
- It doesn’t check the root URL itself; it appends 404Test and a local date-time stamp to it to create a made-up URL, which really should generate a 404 somewhere along the line.
- It doesn’t follow the chain until it gets to a 200 / 404 (I said it was simple) and it doesn’t make the tea.
- It does, however, try to give you guidance on what you should do next: not all 301s will be valid, not all 200s will be valid, and not all 404s will be valid (non-canonical domains, anyone?).
- This is the script, which links to the text file for download: Belmore Digital Simple 404 Checker.
- It’s GNU-GPL, so feel free to use it & modify. If you have any bright ideas for it, please contact me.
- And here’s some example output from the newly-launched amazon.com.au.
- You can see that:
  - http://www gives a 307 to https://www,
  - http:// 301s to https:// (looks like an extra hop there),
  - https:// 301s to https://www.
  - https://www. produces a 404, which is probably correct.
- So, all reasonably good, but you’d want to shorten the http to https hops by one if possible, and if you’re going to give a 307 for http://www, you should probably do the same for the non-secure http:// variant.
- 7/10 for amazon.com.au then. Not bad. They may have a future.
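To make the bullets above a bit more concrete, here’s a rough sketch of how the variant expansion, the made-up URL and the next-step guidance might fit together. It is not the downloadable script: the function names, the timestamp format and the wording of the advice are all my own assumptions for illustration.

```python
from datetime import datetime

SCHEMES = ("http://", "https://")
HOST_PREFIXES = ("", "www.")


def domain_variants(target):
    """Expand a bare domain into its four variants; leave a full URL alone."""
    if "://" in target:
        # Already has a scheme, so only that variant is checked.
        return [target.rstrip("/") + "/"]
    bare = target.strip("/")
    return [f"{scheme}{prefix}{bare}/"
            for scheme in SCHEMES
            for prefix in HOST_PREFIXES]


def made_up_url(base, now=None):
    """Append 404Test plus a local timestamp (format is an assumption)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d%H%M%S")
    return f"{base}404Test{stamp}"


def guidance(status, location=None):
    """Return hypothetical next-step advice for one response."""
    if status == 404:
        return ("404: fine on the canonical domain; a non-canonical "
                "variant should usually 301 to it first.")
    if status in (301, 308):
        return (f"Permanent redirect to {location}: check it lands on the "
                "canonical domain and then returns a 404.")
    if status in (302, 303, 307):
        return (f"Temporary redirect to {location}: consider a 301 if the "
                "move is permanent.")
    if 200 <= status < 300:
        return "2xx for a made-up URL: likely a soft 404, so fix the server or CMS."
    return f"Unexpected status {status}: investigate by hand."


if __name__ == "__main__":
    # domain.com is a placeholder; no requests are made here.
    for base in domain_variants("domain.com/"):
        print(made_up_url(base))
```

Keeping the URL building and the advice as plain functions, separate from the network call, means the audit trail is just a matter of logging each made-up URL next to the status and guidance it produced.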
Mobile First is NOT Mobile Friendly
I recently wrote about how Google’s Mobile First is Not Mobile Friendly. Read it now.
- A pretty simple Python SEO tool to check that a made-up URL generates a 404.
- It’ll take a bare domain and run through the http(s)://(www.) variants, or take an FQDN and check just that.
- It’s based on my last Python SEO tool checking for canonical domain redirects.
- Read The State of SEO in mid-2017.
- Read about how Google’s Mobile First Index is not Mobile Friendly.
- Finally, get your content ranking well on Google by starting to understand Find Crawl Index.
Thanks for reading. If you would like to discuss what these changes mean for your web property, or would like to know how to implement them, please feel free to contact me.