Technical Audit Process

Written by Richard Magallanes

SEOptimer Setup

Run a SEOptimer Scan

Go to https://www.seoptimer.com/ → Enter URL → Click “Audit”

Does the website have a robots.txt file?

  • If Yes, then “Pass” // If No, then “Fail”
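If you want to double-check this outside of SEOptimer, here is a minimal Python sketch (assuming the requests package is installed; example.com is a placeholder domain):

  import requests

  def has_robots_txt(domain: str) -> bool:
      # Treat any 200 response for /robots.txt as "the file exists".
      resp = requests.get(f"https://{domain}/robots.txt", timeout=10)
      return resp.status_code == 200

  print("Pass" if has_robots_txt("example.com") else "Fail")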

Does the website have a sitemap?

  • If Yes, then “Pass” // If No, then “Fail”

Does the website have Google Analytics installed?

  • If Yes, then “Pass” // If No, then “Fail”

Is the website using Flash?

  • If Yes, then “Fail” // If No, then “Pass”

Is the website using iFrames?

  • If Yes, then “Fail” // If No, then “Pass”

Is the website using legible font sizes across all devices?

  • If Yes, then “Pass” // If No, then “Fail”

Are all links and buttons easy to tap on mobile?

Some of the links or buttons on your page may be too small for a user to easily tap on a touchscreen.

  • If Yes, then “Pass” // If No, then “Fail”

Is the website using deprecated HTML tags?

These tags are no longer officially supported in modern web browsers, and hence are recommended to be removed as they could cause display issues.

  • If Yes, then “Fail” // If No, then “Pass”

Does the website have an SSL certificate enabled?

  • If Yes, then “Pass” // If No, then “Fail”

Does the non-secure version of the website (HTTP) 301 redirect to the secure version (HTTPS)?

  • If Yes, then “Pass” // If No, then “Fail”
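To spot-check the redirect yourself, here is a rough sketch (requests assumed installed; example.com is a placeholder). It requests the non-secure URL without following redirects and confirms a 301 to the https:// version:

  import requests

  def http_redirects_to_https(domain: str) -> bool:
      # Fetch the HTTP URL without following redirects.
      resp = requests.get(f"http://{domain}/", allow_redirects=False, timeout=10)
      location = resp.headers.get("Location", "")
      # Pass only on a permanent (301) redirect to the secure version.
      return resp.status_code == 301 and location.startswith("https://")

  print("Pass" if http_redirects_to_https("example.com") else "Fail")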

Does the website have Malware? (use Sucuri)

  • If Yes, then “Fail” // If No, then “Pass”

Does the website use structured data?

(Use https://search.google.com/structured-data/testing-tool/u/0/)

  • Check a few internal pages
  • If Yes, then “Pass” // If No, then “Fail”

Is the structured data valid?

HTTP Status IO

Is https canonicalized using 301 redirects?

  • Type in the domain URL
  • Click “Canonical domain check” → Click “Check Status”


  • Make sure there is no more than 1 redirect and all URLs redirect to the last URL in the list
  • Check that the last URL has a self-referencing canonical tag
  • Check that the redirect latency is less than 100ms
  • If it meets the criteria, mark as “Pass” (a scripted version of this check is sketched below)
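If you prefer to verify the same criteria outside httpstatus.io, here is a minimal sketch (requests assumed installed; example.com stands in for the client domain). It resolves the four common host variants and applies the single-hop / shared-final-URL rules above:

  import requests

  VARIANTS = [
      "http://example.com/",
      "http://www.example.com/",
      "https://example.com/",
      "https://www.example.com/",
  ]

  final_urls, max_hops = set(), 0
  for url in VARIANTS:
      resp = requests.get(url, allow_redirects=True, timeout=10)
      hops = resp.history                      # every 3xx response along the chain
      final_urls.add(resp.url)
      max_hops = max(max_hops, len(hops))
      latency_ms = sum(r.elapsed.total_seconds() for r in hops) * 1000
      print(url, "->", resp.url, f"({len(hops)} redirect(s), ~{latency_ms:.0f} ms)")

  # Pass when every variant ends on the same URL with at most one redirect hop.
  print("Pass" if len(final_urls) == 1 and max_hops <= 1 else "Fail")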

Do URLs w/ trailing slash 301 redirect to URLs w/o trailing slash, or vice versa?

  • Grab a blog post URL and paste one version with a trailing slash and one without


  • Click “Check Status”
  • Make sure one of the URLs 301 redirects to the other like this:


  • Check the redirect latency is less than 100ms
  • If it meets the criteria, mark as “Pass”

Pingdom

Does the website load in less than 3 seconds?


  • If faster than 3 seconds, mark as pass
  • Use Pingdom to give page speed recommendations
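As a quick sanity check (not a substitute for Pingdom’s full waterfall, since it only times the HTML document rather than the fully rendered page), a sketch like this can flag obviously slow responses; requests assumed installed, the URL is a placeholder:

  import requests

  URL = "https://example.com/"   # placeholder

  resp = requests.get(URL, timeout=30)
  seconds = resp.elapsed.total_seconds()   # time to receive the HTML document only
  print(f"HTML document fetched in {seconds:.2f}s")
  print("Pass" if seconds < 3 else "Check the full load time in Pingdom")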

GTMetrix

Is the GTMetrix PageSpeed score 90% or above?

  • Go to https://gtmetrix.com/ → Make a free account
  • Click “Analysis Options” (bottom left) → Click “Test URL In” dropdown → Select closest server → Analyze


  • Above 90%, mark as pass
  • Use GTMetrix to give page speed recommendations

Google Page Speed Insights

Is the PageSpeed score 90% or above?


  • Above 90%, mark as pass
  • Use Page Speed Insights to give recommendations
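The score can also be pulled programmatically from the PageSpeed Insights v5 API, which is handy when auditing many templates. A rough sketch (requests assumed installed; Google recommends attaching an API key via the key parameter for regular use):

  import requests

  API = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
  params = {"url": "https://example.com/", "strategy": "mobile"}   # placeholder URL

  data = requests.get(API, params=params, timeout=60).json()
  # Lighthouse reports performance as 0-1; multiply by 100 for the familiar score.
  score = data["lighthouseResult"]["categories"]["performance"]["score"] * 100
  print(f"PageSpeed score: {score:.0f}")
  print("Pass" if score >= 90 else "Fail")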

Copyscape

Does any copied content exist (internally & externally)?

  • Go to https://www.copyscape.com/
  • Type in the domain URL → Go
  • Find a recent blog post and some older blog posts → test those as well
  • If duplicate content is found → mark as “Fail”

Google Search Console

Are there any GSC coverage errors or warnings?

Go to https://search.google.com/search-console → Coverage

  • If Yes, then “Fail” // If No, then “Pass”

Is the XML sitemap submitted to Google Search Console?

Go to https://search.google.com/search-console → Sitemaps

  • website/sitemap.xml
  • If Yes, then “Pass” // If No, then “Fail”

Are XML sitemaps compressed?

Go to https://search.google.com/search-console → Sitemaps

  • If Yes, then “Pass” // If No, then “Fail”

Does the website have any Mobile Usability issues?

Go to https://search.google.com/search-console → Search Traffic → Mobile Usability

  • If Yes, then “Fail” // If No, then “Pass”

Does the website have any Manual Actions?

Go to https://search.google.com/search-console → Search Traffic → Manual Actions (under security & manual actions)

  • If Yes, then “Fail” // If No, then “Pass”

Does the website have any Security Issues?

Go to https://search.google.com/search-console → Security Issues

  • If Yes, then “Fail” // If No, then “Pass”

Screaming Frog SEO Spider Setup

Screaming Frog Setup

  • Open up Screaming Frog SEO Spider → Configuration → Click “Spider”
    • Check “Crawl Linked XML Sitemaps” → Click “Crawl These Sitemaps” → Enter Sitemap URL
  • Click Configuration → Click “Custom” → Click “Search”
    • Change Filter → “does not contain” → Enter Google Analytics Tag # → Click Ok
  • Click Crawl Analysis → “Configure” → Check “Auto-analyze at end of crawl” → Click Ok
  • Click Configuration → “Content” → “Duplicates” → “Enable Near Duplicates” → Set Near Duplicate Similarity Threshold (%) to 85 → Click Ok
  • Enter website URL → Click “Start” →Wait for crawl to finish @ 100%

Export All URLs

  • Click the Internal tab → “Export” → Save to desktop
  • Go to Google Sheets → Add a new tab → File → Import → Upload → Upload “internal_html.csv” → Check “Insert new sheet(s)” → OK
  • Go to “internal_html” → Click the filter button


Technical Audit Problems & Solutions

Are there pages with low word counts (less than 400)?

  • Filter Content → Clear → Select entries that contain “html”
  • Filter Indexability → Select clear → Select Indexable → OK
  • Filter Word Count → Filter by condition → Dropdown → “Less than” → Enter 400 → OK
  • Add a new tab → Label “Thin Content”
  • Copy “Address” column → Paste to “Thin Content”
  • Copy “Word Count” column → Paste to “Thin Content”
  • Go back to “Crawl” tab → deselect the filter button


  • Mark as “FAIL” in the checklist

Solution:

Thin content doesn’t rank well in Google, and Google’s Panda algorithm targets websites with excessive amounts of it. These pages offer little to no value to searchers, they eat up your website’s crawl budget, and having many of them may trigger a Google penalty. Consider adding a “noindex” meta tag to pages that are unimportant, or add more content to pages that matter. If a page is important or valuable (such as a contact page), do not noindex it; instead, add more relevant content so it is no longer considered thin and can rank better in Google.
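To spot-check an individual page’s word count outside Screaming Frog, here is a rough sketch (requests and beautifulsoup4 assumed installed; the URL is a placeholder, and the count will differ slightly from Screaming Frog’s, which only counts rendered body text):

  import requests
  from bs4 import BeautifulSoup

  URL = "https://example.com/some-post/"   # placeholder

  soup = BeautifulSoup(requests.get(URL, timeout=10).text, "html.parser")
  for tag in soup(["script", "style", "noscript"]):
      tag.decompose()                       # drop non-visible text

  words = soup.get_text(separator=" ").split()
  print(len(words), "words")
  print("Thin content" if len(words) < 400 else "OK")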

Does copied content exist internally?

  • Go to content tab in Screaming Frog
  • Select “Near Duplicates” under filter
  • Add a new tab →Label “Internal Duplicate Content”
  • Copy “Address” column → Paste to “Internal Duplicate Content”
  • Copy “Closest Similarity Match” column → Paste to “Internal Duplicate Content”
  • Copy “No. Near Duplicates” column → Paste to “Internal Duplicate Content”

Internally copied content can cause cannibalisation issues within your site. Multiple pages end up relevant to, and ranking for, the same keyword, which leaves Google unsure which page to rank for that term and results in poorer rankings overall. Ensure that each page is unique so that Google understands which pages to rank.

Is content behind a URL fragment (#-symbol in the URL)?

  • Search → # → Enter


  • Look for URLs like this: yourdomain.com/#content
  • Check to see if the #s come from table of contents
  • If table of contents, mark “Pass” // if not, “Fail”

Solution:

Most websites no longer use #s, except for table of contents links. Table of contents scroll links with #s are OK. However, #s used to load new content are not. Have a developer create a static page instead of one generated dynamically behind a fragment.

Are there 404 errors?


  • If errors, then mark as “Fail” // no errors, mark as “Pass”
    • Import to Google Sheets

Solution:

If the 404 error has backlinks, you’ll want to 301 redirect to a relevant page. If there isn’t a relevant page, then the homepage is the second best place. Otherwise, if the page doesn’t exist any longer, doesn’t have backlinks and isn’t relevant anymore, then just let it 404 and Google will remove it from the index.
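When re-verifying a list of suspected 404s (for example, before and after redirects are put in place), a short sketch like this confirms the current status codes; requests assumed installed and the URL list is purely illustrative:

  import requests

  urls = [
      "https://example.com/old-page/",
      "https://example.com/missing-post/",
  ]

  for url in urls:
      # HEAD keeps the check lightweight; switch to requests.get if a server mishandles HEAD.
      resp = requests.head(url, allow_redirects=False, timeout=10)
      print(resp.status_code, url)
      # 404 with backlinks -> 301 to a relevant page; otherwise let it drop from the index.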

Are there broken internal links?

  • Go to “response_codes_client_error(4xx)” tab
  • Click filter button → filter “Destination” → Select clear →Type in domain name →Select → OK
  • Are there entries, mark as “Fail” // No entries, mark as “Pass”
    • Create “Internal Broken Links” tab and copy data

Solution:

Make internal URLs point to the correct path. If the content doesn’t exist, find a relevant page to link to.

Are there broken external links?

  • Go to “response_codes_client_error(4xx)” tab
  • Click filter button → filter “Destination” →Type in domain name → Select clear → OK
  • Are there entries, mark as “Fail” // No entries, mark as “Pass”
    • Create “External Broken Links” tab and copy data

Solution:

Find a relevant piece of content to externally link to or remove the link.

Are there canonical tags missing?

Screaming Frog SEO Spider → “Canonicals” tab → Filter: “Missing”

  • Import “canonicals_missing.csv” to Google Sheets
  • Click filter button →Filter indexability → Clear →Select Indexable
  • Are there any entries, then mark as “Fail” // If no entries, then mark as “Pass”


Solution:

“A canonical tag (aka “rel canonical”) is a way of telling search engines that a specific URL represents the master copy of a page.” – Moz. Effective use of canonical tags can help eliminate duplicate content issues (although a 301 is better in certain scenarios). Google will auto canonicalize pages that are missing canonical tags. Best practice is to consider adding them to most or all of your important pages you wish to rank.
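To spot-check a single page’s canonical tag at the HTML level, here is a minimal sketch (requests and beautifulsoup4 assumed installed; the URL is a placeholder):

  import requests
  from bs4 import BeautifulSoup

  URL = "https://example.com/some-page/"   # placeholder

  soup = BeautifulSoup(requests.get(URL, timeout=10).text, "html.parser")
  canonicals = soup.find_all("link", rel="canonical")

  if not canonicals:
      print("Missing canonical tag")
  elif len(canonicals) > 1:
      print("Multiple canonical tags:", [c.get("href") for c in canonicals])
  else:
      print("Canonical:", canonicals[0].get("href"))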

Are there images over 100 KBs?

Screaming Frog SEO Spider → Images → Filter: “Over 100 kb”

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import “images_over_100_kb” to Google Sheets

Solution:

Images over 100KB are large and can be compressed further. Compress these images to increase site speed for specific pages.

Recommendations:

Quick Compress

  1. Download all images >100kb
  2. Resize them to the exact dimensions using either:
    1. Canva
    2. Photoshop
    3. Preview (Mac)
  3. Run them through https://tinyjpg.com/
  4. If <100kb, re-upload and replace.

WordPress Plugins

  1. Install an image compression plugin (WordPress-specific)
    1. ShortPixel Image Optimizer (Freemium)
    2. WPRocket’s Imagify (Premium)

Are there images missing Alt tags?

Screaming Frog SEO Spider → Images → Filter: “Missing Alt Text”

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import “images_missing_alt_text” to Google Sheets

Solution:

The Alt text should be used to describe what your image is because this helps Google understand and index it appropriately. Optimize alt text to improve your rankings in Google’s Image Search.

Are there any pages that are insecure although an SSL certificate is installed?

Screaming Frog SEO Spider → Reports → “Insecure Content”

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import “insecure_content.csv” to Google sheets

Solution:

Every page on your website should be secure if you have an SSL certificate installed. Most importantly, all insecure URLs should 301 redirect to their secure (https) versions.

Are there duplicate page titles?


Screaming Frog SEO Spider → “Page Titles” tab → Filter: “Duplicate”

  • If entries exist and are not from page #s, then mark as “Fail”
    • Export
    • Import “page_titles_duplicate.csv” to Google sheets
  • If there are no entries, then mark as “Pass”

Solution:

Duplicate titles (or duplicate anything) on your website need to be avoided because of Google’s Panda algorithm. Do your best to make everything unique. If there are excessive duplicate titles, there may be a larger underlying issue that needs to be addressed.

Are any titles over 60 characters?

This isn’t a big deal, but titles longer than roughly 60–65 characters may get cut off in Google’s SERPs, which can hurt organic CTR.


Screaming Frog SEO Spider → Page Title Tab → Filter: Over 60 Characters → Export

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import to Google sheets

Are there any pages with missing meta descriptions?

Your meta description is an opportunity to use copywriting to persuade searchers to click through on your result. A higher organic CTR is a positive signal for your page and can potentially help your rankings “stick.”

Screaming Frog SEO Spider → Meta Description Tab → Filter: Missing → Export


  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import to Google sheets

Are there any pages with duplicate meta descriptions?

Google prefers unique content. Make sure you write a meta description that is unique and relevant to each individual page.


Screaming Frog SEO Spider → Meta Description Tab → Filter: Duplicate → Export

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import to Google sheets

Are there any pages with meta descriptions less than 70 characters?

Your meta description is an opportunity to persuade searchers to click through on your result. Take advantage of all the real estate Google gives you to reach this objective.


Screaming Frog SEO Spider → Meta Description Tab → Filter: Below 70 Characters → Export

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import to Google sheets

Are there pages using meta keywords?

Meta keywords do NOT improve your organic search performance and are unnecessary. You can delete meta keywords from pages that are using them.

Screaming Frog SEO Spider → Meta Keywords → Filter: “All” (also fine if occurrences = 0)

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import to Google sheets

Is the non-secure version of the website 301 redirecting to the secure version (https)?

Why this matters: 1) We want to avoid link equity having to pass through a 301 redirect and 2) Redirects can slow website loading speed.

Screaming Frog SEO Spider → Security → Filter: “HTTP”

  • If “Yes”, then mark as “Pass” // If “No”, then mark as “Fail”
    • Export
    • Import to Google sheets

Are any pages not responding?

Any unresponsive page (internal or external) is bad for User Experience (UX).

Screaming Frog SEO → Response Codes → Filter: “No Response”

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import to Google sheets
  • Bulk export → Links → all outlinks to find source & destination

Are there 301 chains that can be eliminated?


  • Your original URL was https://www.bluewidgets.com/best-blue-widgets/ (A)
  • Your new URL is https://www.bluewidgets.com/blue-widgets/ (C)

The problem is that all internal and external links are still linking to https://www.bluewidgets.com/best-blue-widgets/ (A). Excessive redirects can slow your website loading speed and cause a loss of pagerank or link equity.
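You can trace the chain for any single URL with a short sketch like this (requests assumed installed; the start URL reuses example URL (A) above):

  import requests

  start = "https://www.bluewidgets.com/best-blue-widgets/"   # URL (A) from the example above

  resp = requests.get(start, allow_redirects=True, timeout=10)
  for hop in resp.history + [resp]:         # every redirect hop plus the final response
      print(hop.status_code, hop.url)

  # More than one 3xx hop before the final response means there is a chain to flatten.
  print("Redirect chain" if len(resp.history) > 1 else "OK")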


Screaming Frog SEO Spider → Redirects → Redirect Chains

  • If “Yes”, then mark as “Fail” // If “No”, then mark as “Pass”
    • Export
    • Import to Google sheets

Are there redirect loops that can be eliminated?


Screaming Frog SEO Spider → Reports → Redirect Chains

  • Import To Google Sheets
  • Look under the “Loop” column
  • If “TRUE”, then mark as “Fail” // If “FALSE”, then mark as “Pass”

Do internal links point to a 301 redirect or 301 redirect loop?

If a user clicks an internal link that ends up redirecting, it adds to their load time and could result in a bounce. And if 301 redirects are linked internally, it takes crawlers more time to get to the final page. This is a waste of your valuable crawl budget and means search engines will spend less time crawling and indexing live pages that you want them to crawl.

  • Screaming Frog SEO Spider → Bulk Export → Response Codes → Redirection (3XX) Inlinks
  • If present, mark as “Fail” // if not, mark as “Pass”

Are there canonical chains that can be eliminated?

“A canonical tag (rel=”canonical”) is a snippet of HTML code that defines the main version for duplicate, near-duplicate and similar pages.” – Ahrefs

Here’s an example of a canonical chain:


Pages A, B and C all have similar or identical content. We want page C to rank in Google.

Problem: A canonicalizes → B canonicalizes → C

Solution: have both A & B canonicalize → C

Screaming Frog SEO Spider → Reports → Canonicals → Canonical Chains


  • If entries exist, then mark as “Fail” // If not, then mark as “Pass”
    • Import to Google sheets

Are there any 302 or 307 redirects?

Google claims that PageRank and backlink equity flow through a 302 redirect just like a 301. However, it’s safer to change 302s to 301s unless the 302 is genuinely being used for its intended purpose: a temporary redirect.

  • Spot check a few links using httpstatus.io because there are often false positives

Screaming Frog SEO Spider → Response Codes → Filter: “Redirection” → Export

  • Import to Google sheets
  • Filter Status → Select only 302 and 307 → Ok
    • Entries exist, then mark as “Fail” // If not, then mark as “Pass”

Is any part of the website blocked because of the Robots.txt file?

Google must be able to crawl your website for your pages to be indexed and ranked. However, some pages can and should be blocked.

Any pages containing #, ?, or session IDs are OK to be blocked by robots.txt.

Screaming Frog SEO Spider → Bulk Export → Response Codes → Blocked by Robots.txt


  • Look at the “To” (destination) pages
  • Is there anything that does not include #, ?, or a session ID?
  • Are these pages important?
  • If “Yes”, mark as “Fail” // If “No”, then mark as “Pass”
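The same check can be scripted against the live robots.txt with Python’s built-in urllib.robotparser (standard library only; the URLs below are placeholders):

  from urllib.robotparser import RobotFileParser

  rp = RobotFileParser()
  rp.set_url("https://example.com/robots.txt")
  rp.read()

  test_urls = [
      "https://example.com/important-page/",   # should be crawlable
      "https://example.com/?s=query",          # internal search results are fine to block
  ]
  for url in test_urls:
      allowed = rp.can_fetch("Googlebot", url)
      print("allowed" if allowed else "BLOCKED", url)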

Are pages using the “Noindex” directive correctly?

The “noindex” tag is a directive telling Google NOT to index a page. It is useful for pages that shouldn’t be in Google’s index, such as password-protected pages, login pages, account pages, cart pages, etc.

Screaming Frog SEO Spider → Directives → Filter “No Index”

  • If “Yes”, then mark as “Pass” // If “No”, then mark as “Fail”
    • Export
    • Import to Google sheets
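For a quick manual spot check of any single URL, this sketch looks for noindex in both the meta robots tag and the X-Robots-Tag response header (requests and beautifulsoup4 assumed installed; the URL is a placeholder):

  import requests
  from bs4 import BeautifulSoup

  URL = "https://example.com/cart/"   # placeholder: a page you expect to be noindexed

  resp = requests.get(URL, timeout=10)
  soup = BeautifulSoup(resp.text, "html.parser")

  meta = soup.find("meta", attrs={"name": "robots"})
  in_meta = meta is not None and "noindex" in meta.get("content", "").lower()
  in_header = "noindex" in resp.headers.get("X-Robots-Tag", "").lower()

  print("noindexed" if (in_meta or in_header) else "indexable")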

Are marketing tags implemented on all pages?


Screaming Frog SEO Spider → Dropdown tab → Custom Search → Filter: Does Not Contain

  • Entries exist, mark as “Fail” // If not, then mark as “Pass”
    • Export
    • Import to Google sheets

Google Analytics tags need to be on all pages in order to properly track them.

If the custom search function was set up properly in the beginning, you should see a “Does Not Contain” filter in the right-hand overview section. Screaming Frog searches each page for the tag.

Are there any orphan URLs?


Screaming Frog SEO Spider → Sitemaps → Filter: Orphan URLs

  • Entries exist, mark as “Fail” // If not, then mark as “Pass”
    • Export
    • Import to Google sheets

Orphan URLs have no internal links pointing to them. This makes it difficult for users to find the content and for search engines to crawl the pages. Orphan pages may still be discovered by Google from other sources (such as XML sitemaps or external links), but without internal links they won’t be passed internal PageRank. If you internally link to these pages, they may receive a boost in search engine rankings.

A small number of orphan pages is common and not generally a big issue, but many of them eat at the crawl budget and create index bloat.

Are there any soft 404’s?

Soft 404s are reported in Google Search Console (under Coverage). You may also see them flagged in Screaming Frog.

  • If soft 404s are found, mark as “Fail” // If not, then mark as “Pass”

Are there any non-indexable pages in the sitemap?


Screaming Frog SEO Spider → Dropdown tab → Sitemaps → Filter: Non-indexable URLs in Sitemap

  • If you see sitemap.xml files just ignore them
  • Entries exist, mark as “Fail” // If not, then mark as “Pass”
    • Export
    • Import to Google sheets

If a URL is noindexed, it won’t appear in search engines. Sitemaps are a list of URLs that we want search engines to discover and index, so non-indexable URLs should not be included in them. Remove any non-indexable URLs from the sitemap if present.
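A rough scripted version of this check: parse the sitemap, then flag any listed URL that doesn’t return 200 or carries a noindex meta tag (requests and beautifulsoup4 assumed installed; the sitemap URL is a placeholder, and if it is a sitemap index file, repeat for each child sitemap):

  import requests
  import xml.etree.ElementTree as ET
  from bs4 import BeautifulSoup

  SITEMAP = "https://example.com/sitemap.xml"   # placeholder
  NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

  root = ET.fromstring(requests.get(SITEMAP, timeout=10).content)
  urls = [loc.text for loc in root.findall(".//sm:loc", NS)]

  for url in urls[:50]:                          # sample; check everything in a full audit
      resp = requests.get(url, timeout=10)
      meta = BeautifulSoup(resp.text, "html.parser").find("meta", attrs={"name": "robots"})
      noindexed = meta is not None and "noindex" in meta.get("content", "").lower()
      if resp.status_code != 200 or noindexed:
          print("Should not be in the sitemap:", resp.status_code, url)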

Are any XML sitemaps over 50K URLs?


Screaming Frog SEO Spider → Dropdown tab → Sitemaps → Filter: XML Sitemap with over 50K URLs

  • Entries exist, mark as “Fail” // If not, then mark as “Pass”
    • Export
    • Import to Google sheets

Sitemaps have a 50K limit.

Are any XML sitemaps over 50MB in size?


Screaming Frog SEO Spider → Dropdown tab → Sitemaps → Filter: XML Sitemap with over 50MB

  • Entries exist, mark as “Fail” // If not, then mark as “Pass”
    • Export
    • Import to Google sheets

Sitemaps have a 50MB size limit.

Are internal links nofollow?

Nofollow link attributes can have a negative effect on the flow of both internal and external link equity. Google has changed its stance on this multiple times, but it is well known within the community (and internally at Google) that, although PageRank is not as important a ranking factor as it once was, ensuring that both internal and external links are followed is the best way to guarantee that link equity is passed to and throughout your site.

  • Screaming Frog SEO Spider → Bulk Export → Links → All Inlinks → Export
  • Within the output, check the column titled “Follow”
  • If the output contains “False” for any inlinks, filter and analyse to check whether this was done on purpose (e.g. nofollowing a link to an obsolete product in a mega menu)
  • If there are still nofollow internal links to pages we value and want to rank, mark as “Fail” // otherwise mark as “Pass”

Is important content 4 or more clicks from the home page?

If a page needs more than 3 clicks to be reached, it will tend to perform poorly because search engines have a harder time crawling it compared to a page available in one click.

Deep pages tend to accumulate less PageRank because search engines are less likely to find and crawl them. If a page is hard to find, crawlers won’t check it as often as pages at depth 2 or lower, lowering its chance to rank and receive optimal link equity.

  • Screaming Frog SEO Spider → Crawl Depth
  • Filter Crawl Depth >= 4
  • If cases are found, mark as “Fail” and export // otherwise mark as “Pass”

Is the client not using automatic IP redirection?

Why it’s important to check this:

  • Search Engine Crawling: Search engines like Google use crawlers with IP addresses that may not be from the same region as the primary audience. If the website redirects based on IP, the crawler might see a different version of the content, which can affect indexing.
  • Duplicate Content Issues: If different versions of the site (localized versions) are accessible through different IPs without proper hreflang tags or canonicalization, it can lead to duplicate content issues, harming SEO.

Use a Proxy or VPN:

  • Change your IP Address: Use a proxy or a VPN to connect to the internet from a different location (different country or region).
  • Access the Website: Visit the website and see if the content changes or if you are redirected to a different version of the site. If the content remains the same or there is no redirection, the site might not be using automatic IP redirection.

Are there any parameterized URLs in Google’s index?

Example of a parameterized URL: https://example.com/products?color=blue&sort=price

1. Crawl the Website:

  • Screaming Frog will start crawling your website, gathering data on all the URLs it encounters.
  • Let the crawl finish. This might take some time depending on the size of your website.

2. Identify Parameterized URLs:

  • Once the crawl is complete, go to the “Internal” tab. This tab shows all the internal URLs discovered during the crawl.
  • Use the filter options to identify parameterized URLs:
    • Filter by URL: Click on the filter dropdown and select “All.”
    • Search by Parameters: Use the search box to look for common parameter indicators such as ?, &, =.
    • Example: Enter ? in the search box to filter out all URLs containing query parameters.

3. Analyze the Results:

  • Review the list of parameterized URLs. You can see detailed information about each URL, including the status code, page title, meta description, and more.
  • Check for duplicates or unnecessary parameters that might be causing SEO issues.

4. Export the Data:

  • If you need to further analyze the data, you can export the list of parameterized URLs.
  • Click on “Export” in the top menu and choose “Crawl Overview” or “All URLs” to export the data to a CSV file.
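Outside of Screaming Frog, the exported URL list can be filtered for query parameters with a few lines of standard-library Python; the URL list below is illustrative (in practice, read it from the exported CSV):

  from urllib.parse import urlparse, parse_qs

  urls = [
      "https://example.com/shop/",
      "https://example.com/shop/?color=blue&sort=price",
      "https://example.com/blog/post/",
  ]

  for url in urls:
      params = parse_qs(urlparse(url).query)
      if params:
          print(url, "->", sorted(params))   # flag URLs carrying query parameters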

Are subdomains noindexed?

Use these Google search operators:

site:domain.com -inurl:https

site:domain.com -inurl:www

Ensure that you replace “domain.com” with your client’s URL.

Manual Analysis

Does the site rank for their brand name?

Websites that do not rank for their own brand names are either too new or they may have been hit with Google penalties.

  • Type in the brand name into Google
  • If they rank #1, then “Pass” // If not, then “Fail”

Does the robots.txt include the sitemap.xml?

  • Check robots.txt

WHY: Including the sitemap in the robots.txt is best practice, and the most explicit way to share the sitemap file with Google and other search engines.

Is there more than one robots.txt?

  • Mark N/A

Do internal navigation links break with javascript turned off?

  • Open website in Safari
  • Safari → preferences → advanced → show develop menu in menu bar (if not already enabled)
  • Develop → disable JavaScript
  • Refresh Website
  • Check if navigation works
  • If it works, mark as “Pass” // If not, then mark as “Fail”

WHY: If internal links in a mega-menu or dropdown style navigation are broken when JS is turned off, it likely means they aren’t in the client-rendered version of the site, and are injected with Javascript when the user interacts with the menu. This means the likelihood of Google and other SE’s crawling those links is low.

Historically, Googlebot was unable to crawl and index content created using JavaScript, which caused many SEOs to disallow crawling of JS/CSS files in robots.txt. This changed when Google deprecated AJAX crawling in 2015. However, not all search engine crawlers are able to process JavaScript (e.g., Bing struggles to render and index it).

Therefore, you should test important pages with JS and CSS turned off to check for crawling errors.

Why is this important?

While Google can crawl and index JavaScript, there are some limitations you need to know.

All resources (images, JS, CSS) must be available in order to be crawled, rendered and indexed.

All links need to have proper HTML anchor tags.

The rendered page snapshot is taken at 5 seconds (so content must be loaded in 5 seconds or it will not be indexed).

Go to website.com/robots.txt and check whether JS or CSS files are disallowed.

If Yes, then “Fail” // If No, then “Pass”

Does content load with Javascript off?

  • Open website in Safari
  • Safari → preferences → advanced → show develop menu in menu bar (if not already enabled)
  • Develop → disable JavaScript
  • Refresh Website
  • Check if content loads
  • If it loads, mark as “Pass” // If not, then mark as “Fail”

WHY: For a similar reason as checking the navigation. If any primary content is loaded exclusively through JavaScript, there will likely be impacts on search visibility because of the second round of indexation required to even have the chance to read that content.

Are crawlers able to see internal links with JavaScript turned off?

  • Check visually
  • If visually broken, choose an element on the site that you cannot see with JavaScript turned off
  • Inspect page with JavaScript turned off
  • Command F to search for the element
  • If element is present, mark as “Pass” // If not, then mark as “Fail”

WHY: Google can only follow <a href> links. If links aren’t present or clickable with JavaScript turned off, there is a high likelihood they’re formatted in a way Google and other search engines can’t follow.
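A quick way to approximate what a crawler sees without JavaScript is to fetch the raw HTML (nothing is executed) and list the <a href> links it contains; navigation links missing from this list are probably injected by JavaScript. A minimal sketch, assuming requests and beautifulsoup4 are installed and using a placeholder URL:

  import requests
  from bs4 import BeautifulSoup
  from urllib.parse import urljoin

  URL = "https://example.com/"   # placeholder

  # Raw HTML only: no rendering happens, so JS-injected links won't appear here.
  soup = BeautifulSoup(requests.get(URL, timeout=10).text, "html.parser")
  links = {urljoin(URL, a["href"]) for a in soup.find_all("a", href=True)}

  for link in sorted(links):
      print(link)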

Are there any paywalls? If so, are those pages using paywall schema and are indexed?

  • Use “Rich Results Test” link in checklist
  • Open up “Detected items”
  • Command F: paywall → if present as cssSelector: .paywall, then there is a paywall
  • If no paywall: mark as “Pass”
  • If there is Paywall: Go to value: if full item is present then page is able to be crawled.
  • In google: site:xxxxxxx.com
  • If indexed, mark as “Pass” // If not, then mark as “Fail”

Does content load with cookies turned off?

  • Open Safari → Preferences
  • Click Privacy
  • Select “Block all cookies”
  • Refresh the website
  • If content loads, mark as “Pass” // If not, then mark as “Fail”

Is there any content behind a user login screen that is indexed?

  • Check if site has a login section
  • Double check: site:xxxxxx.com.au intitle:login
  • If no, mark as “Pass”

Does robots.txt block URLs we want indexed?

  • Make sure it blocks wp-admin if the website is on WordPress.
  • If URL patterns we want indexed are blocked, “Fail”// if not, “Pass”

Are sensitive logins, forgotten-password forms, cart, checkout, and thin pages noindexed or blocked by robots.txt?

  • Check robots.txt
  • If blocked, mark as “Pass”
  • If not, check if noindexed
  • If noindexed, mark as “Pass”, if not then mark as “Fail”

WHY: Twofold: pages that don’t need to be crawled (e.g. the checkout process) shouldn’t be crawled, and any pages that have the opportunity to inadvertently disclose personally identifiable information (PII) should be actively kept out of search engines.

Do sitemap URLs contain pages we want out of the index (i.e. admin logins, forgotten-password pages, and thin pages)?

  • Go to sitemap
  • Go into each sitemap folder
  • If there are no pages that shouldn’t be indexed, mark as “Pass”, if not then mark as “Fail”

WHY: If there are unnecessary, 301 redirect or error pages in the sitemap, it erodes Google’s trust in the directives in the document, so it’s more likely the sitemap will be ignored later on.

If the website has multiple languages, do they use hreflang meta tags?

  • Check visually if they have a language switcher
  • Click inspect → search for “hreflang”
  • If yes and yes: mark as “Pass”
  • If no and no: mark as “N/A”
  • If yes and no: mark as “Fail”

WHY: hreflang is one of the primary methods of indicating region-specific content that may otherwise be seen as duplicate. When implemented correctly, the AU site should present in AU, the Japan site in Japan, etc.

Do outgoing hreflang tags match incoming hreflang tags?

  • Review your Screaming Frog crawl
  • For each page listed in the hreflang code, the reciprocal page needs to include a link back
  • Sanity check a few pages manually at a code level, and review the rest using Screaming Frog.
  • If outgoing tags match incoming tags, mark as “Pass” // if not, “Fail”.

WHY: For hreflang tags to be implemented correctly, you need to show the relationship between all the different languages to basically show-and-tell Google that the content is essentially the same, but localised to the region.
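A rough way to script the reciprocity check for a pair of pages (requests and beautifulsoup4 assumed installed; the two URLs are placeholders, and the comparison is an exact string match, so normalise trailing slashes in a real audit):

  import requests
  from bs4 import BeautifulSoup

  def hreflang_targets(url):
      # URLs declared in the page's <link rel="alternate" hreflang="..."> tags.
      soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
      return {l.get("href") for l in soup.find_all("link", rel="alternate", hreflang=True)}

  page_a = "https://example.com/en/widgets/"   # placeholder English page
  page_b = "https://example.com/de/widgets/"   # placeholder German page

  a_targets = hreflang_targets(page_a)
  b_targets = hreflang_targets(page_b)

  # Reciprocal implementation: A lists B and B lists back to A.
  print("Pass" if page_b in a_targets and page_a in b_targets else "Fail")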

Is the X-Default tag setup correctly for hreflang?

  • If x-default is set, confirm it’s only set once
  • Confirm the x-default is essentially what the client wants to see as their “default” experience if a language or region isn’t detected by the browser in Google Search. This is often on the root domain without a language subfolder, or the english subfolder
  • If set up correctly, “Pass” // If not, “Fail”.

Do HTML language and region codes match page content?

  • Confirm the language on the page (use Google Translate if unsure)
  • Confirm the language code in the href lang of the page (e.g. en-GB)
  • Confirm the language code against the ISO-639-1 list (link)
  • If language on the page and ISO language code match, mark as “Pass” // If not, “Fail”.

Are self-referencing hreflang annotations missing? (TBD)

  • ………..

Do canonical declarations use absolute URLs?

  • Screaming Frog → Canonicals → Filter: all → scroll across to canonical link element 1
  • Check to see if they are absolute or relative URLs
  • Here are some examples:
    • Absolute = https://trueline.net.au/design-ideas/patios/carport-insulated-roof/
    • Relative = /design-ideas/patios/carport-insulated-roof/
  • If absolute, mark as “Pass” // If not, “Fail”
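Once the canonical hrefs are exported, absolute vs relative can be checked with the standard library alone; the two example values reuse the URLs above:

  from urllib.parse import urlparse

  hrefs = [
      "https://trueline.net.au/design-ideas/patios/carport-insulated-roof/",  # absolute
      "/design-ideas/patios/carport-insulated-roof/",                         # relative
  ]

  for href in hrefs:
      parsed = urlparse(href)
      absolute = bool(parsed.scheme and parsed.netloc)   # needs both a scheme and a host
      print("absolute" if absolute else "relative", href)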

Is pagination self-canonicalized with multipage <a> tags?


  • Check crawl for /2 to see if site has pagination
    • Type “2” into the search bar and look for URLs like:
      • “?p=2” or /page-2/ or something along those lines
  • Go to page with pagination
  • Check Detailed extension to see if the canonical of page 2 is self-referencing (with all pages indexable) or canonicalised back to page 1 (with or without pages indexable)
  • Inspect on page 3/page 1/next page etc. at bottom of page → check it is <a> tag
  • Check page 3/page 1/next page links are clickable with JavaScript turned off

Does the URL contain multiple or conflicting canonical declarations?

  • Screaming Frog → Canonicals → Filter: Multiple
  • If multiple or conflicting canonical tags are found, mark as “fail” // If not , “Pass”
  • If there are multiple canonical tags but they are pointed at the same destination, mark priority as “medium”. If multiple conflicting canonical tags are found, mark priority as “high”.

It is considered best practice to only specify canonicals once on any given URL. This is because doing it multiple times makes the configuration more open to human error.

If search engines encounter conflicting canonical URLs, they will ignore the canonical instruction entirely. This could result in the indexing of pages which you do not wish to be indexed, potentially creating duplicate content.

Are session IDs or search parameters blocked by robots.txt or noindexed?

  • Check if the website has search: website.com/?s=x (this is the default search URL format for WordPress)
  • Check the Detailed extension on the search results page to see if it is indexable
  • If not indexable, mark as “Pass” // If indexable, “Fail”

If blog or news subdomains exists, do they have robots.txt, sitemap.xml and indexed pages?

  • Check subdomain.site.com/robots.txt
  • Check subdomain.site.com/sitemap.xml
  • Check Google with site:subdomain.site.com
  • If all these come back with the correct files, mark as “Pass” // If not, “Fail”

Are sensitive HTML, PDFs and TXTs indexed?

Type into Google’s search bar → site:domain.com “filetype:filetype”

  • Examples:
    • “filetype:pdf”
    • “filetype:html”
    • “filetype:txt”

User Experience (UX) / User Interface (UI)

Is the website using aggressive pop-ups, overlays, modals, or interstitials?

Google is cracking down on aggressive interstitials and most users hate them.

  • If Yes, then “Fail” // If No, then “Pass”

Does the website have intrusive advertisements?

Some websites go overboard with ads and they end up disrupting UX. This is most common on content-driven websites.

  • If Yes, then “Fail” // If No, then “Pass”

Is the website design clean and free of unnecessary distractions?

A good design is seamless and free of distractions. If a user has to “think”, then there’s a problem.

  • If Yes, then “Pass” // If No, then “Fail”

Is the font type and size easy to read?

  • If Yes, then “Pass” // If No, then “Fail”

Are CTAs easy-to-understand?

CTAs shouldn’t use cute/clever names or unfamiliar phrases. Clear calls-to-action such as “Click Here” or “Learn More” are best.

  • If Yes, then “Pass” // If No, then “Fail”

Are all buttons and links obviously clickable?

Buttons should clearly be buttons and links should clearly be links.

  • If Yes, then “Pass” // If No, then “Fail”

Copyscape

Does any copied content exist (internally & externally)?

Copyscape → Enter URL

  • If Yes, then “Fail” // If No, then “Pass”

Deprecated

Sitebulb (Deprecated)

Were all issues checked and recorded from the SEO tab?

  • Mark N/A

Do pages need more internal links?

  • Mark N/A

Google Mobile Friendly Test (Deprecated)

Is the website mobile friendly?

Go to https://search.google.com/test/mobile-friendly → Enter URL → Run Test

  • If Yes, then “Pass” // If No, then “Fail”