Keywords and content may be the twin pillars on which most search engine optimization strategies are built, but they are far from the only ones that matter.
Less often discussed, but just as important – not just to users, but to search engines – is your website’s visibility.
There are approximately 50 billion web pages across 1.93 billion websites on the internet. That is far too many for any human team to explore, so search engines rely on automated robots, also called spiders or crawlers, to do the work.
These robots determine each page’s content by following links from website to website and page to page. This information is compiled into a huge database, or index, of URLs, which is then put through the search engine’s algorithm for ranking.
This two-step process of navigating and understanding your website is called crawling and indexing.
As an SEO professional, you’ve undoubtedly heard these terms before, but let’s define them just for the sake of clarity:
- Crawlability refers to how well search engine bots can access and navigate your site’s content by following links from page to page.
- Indexability refers to the search engine’s ability to analyze your pages and add them to its index, which is what makes them eligible to appear in search results.
As you can probably imagine, these are both important parts of SEO.
If your website suffers from poor crawlability, such as many broken links and dead ends, search engine crawlers will not be able to access all of your content, which will exclude it from the index.
Indexability, on the other hand, is important because pages that are not indexed do not appear in search results. How can Google rank a page it has not included in its database?
The crawling and indexing process is a bit more complicated than we’ve discussed here, but that’s the basic overview.
If you’re looking for a more in-depth discussion of how they work, Dave Davies has an excellent piece on crawling and indexing.
How To Improve Crawling And Indexing
Now that we’ve covered how important these two processes are, let’s look at some elements of your site that affect crawling and indexing—and discuss ways to optimize your site for them.
1. Improve Page Loading Speed
With billions of web pages to index, web spiders don’t have all day to wait for your pages to load. The time and resources they are willing to spend on your site are sometimes referred to as a crawl budget.
If your site doesn’t respond within that window, they will leave, which means your pages will remain uncrawled and unindexed. And as you can imagine, this is not good for SEO purposes.
That’s why it’s a good idea to regularly evaluate your page speed and improve it where you can.
You can use Google Search Console or tools like Screaming Frog to check the speed of your website.
If your site is running slowly, take steps to alleviate the problem. This may include upgrading your server or hosting platform, enabling compression, minifying CSS, JavaScript and HTML, and eliminating or reducing redirects.
Find out what’s slowing down your load time by checking the Core Web Vitals report. If you want more granular information about those metrics, especially from a user-centric view, Google Lighthouse is an open-source tool you may find very useful.
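If you want a quick, do-it-yourself spot check between full audits, a few lines of Python can time how fast your pages respond. This is only a rough proxy for real performance data like Core Web Vitals; the sketch below assumes you have the third-party requests library installed and uses placeholder URLs you’d swap for your own:

```python
import time
import requests

# Hypothetical pages to spot-check; replace with your own URLs.
URLS = [
    "https://www.example.com/",
    "https://www.example.com/blog/",
]

def time_page(url, timeout=10.0):
    """Return the response time in seconds, or None if the request fails."""
    try:
        start = time.perf_counter()
        response = requests.get(url, timeout=timeout)
        elapsed = time.perf_counter() - start
        print(f"{url} -> {response.status_code} in {elapsed:.2f}s")
        return elapsed
    except requests.RequestException as exc:
        print(f"{url} -> failed: {exc}")
        return None

if __name__ == "__main__":
    for url in URLS:
        time_page(url)
```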
2. Strengthen Internal Link Structure
A good page structure and internal linking are fundamental elements of a successful SEO strategy. A disorganized website is difficult for search engines to crawl, making internal linking one of the most important things a website can do.
But don’t just take our word for it. Here’s what Google Search Advocate John Mueller had to say about it:
“Internal linking is super critical for SEO. I think it’s one of the biggest things you can do on a website to guide Google and guide visitors to the pages you think are important.”
If your internal linking is poor, you also risk orphaned pages, that is, pages that no other part of your site links to. Because nothing points to these pages, the only way for search engines to find them is through your sitemap.
To eliminate this problem and others caused by poor structure, create a logical internal structure for your website.
Your homepage should link to subpages that are supported by pages further down the pyramid. These subpages should then have contextual links where it feels natural.
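One way to catch the orphaned pages mentioned above is to compare the URLs in your sitemap against the URLs you can actually reach by following internal links. The sketch below is a shallow, two-click illustration of that idea rather than a full crawler; it assumes the requests and beautifulsoup4 libraries are installed and uses a placeholder domain and sitemap location:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4 requests

SITE = "https://www.example.com"      # hypothetical site
SITEMAP_URL = f"{SITE}/sitemap.xml"   # assumed sitemap location

def sitemap_urls(sitemap_url):
    """Return the <loc> URLs listed in a standard XML sitemap."""
    xml = requests.get(sitemap_url, timeout=10).text
    root = ET.fromstring(xml)
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    return {loc.text.strip() for loc in root.findall(".//sm:loc", ns)}

def internal_links(page_url):
    """Return absolute same-host URLs linked from a single page."""
    soup = BeautifulSoup(requests.get(page_url, timeout=10).text, "html.parser")
    links = set()
    for a in soup.find_all("a", href=True):
        url = urljoin(page_url, a["href"]).split("#")[0]
        if urlparse(url).netloc == urlparse(page_url).netloc:
            links.add(url)
    return links

if __name__ == "__main__":
    listed = sitemap_urls(SITEMAP_URL)
    # Shallow crawl: the homepage plus the pages it links to.
    seen = internal_links(SITE)
    for url in list(seen):
        try:
            seen |= internal_links(url)
        except requests.RequestException:
            pass
    orphan_candidates = listed - seen - {SITE, SITE + "/"}
    print("Possible orphan pages (in sitemap, not reachable within two clicks):")
    for url in sorted(orphan_candidates):
        print(" ", url)
```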
Another thing to watch out for is broken links, including those caused by typos in the URL. A mistyped URL points to a page that doesn’t exist, which leads to the dreaded 404 error. In other words, the page is not found.
The problem is that broken links don’t just fail to help your crawlability; they actively hurt it.
Double-check your URLs, especially if you’ve recently undergone a site migration, mass deletion, or structure change. And make sure you don’t link to old or deleted URLs.
Other internal linking best practices include having a good amount of linkable content (content is always king), using anchor text instead of linked images, and using a “reasonable number” of links on a page (whatever that means).
Oh yeah, and make sure you’re using follow (rather than nofollow) links for your internal links.
3. Submit Your Sitemap To Google
Given enough time, and assuming you haven’t told it not to, Google will crawl your site. And that’s great, but it won’t help your search rankings while you wait.
If you’ve recently made changes to your content and want Google to know about it immediately, it’s a good idea to submit a sitemap to Google Search Console.
A sitemap is a file that lives in your root directory. It acts as a road map for search engines, with direct links to every page on your website.
This is beneficial for indexability because it allows Google to learn about multiple pages at once. While a crawler might have to follow five internal links to discover a deep page, by submitting an XML sitemap it can find all of your pages with a single visit to the sitemap file.
Submitting your sitemap to Google is especially useful if you have a deep site, frequently add new pages or content, or your site doesn’t have good internal linking.
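If your CMS doesn’t generate a sitemap for you, producing one by hand is straightforward, since the format is a simple XML schema. Here’s a minimal sketch using Python’s standard library, with placeholder URLs standing in for your real pages; once the file is uploaded to your root directory, you can submit its URL in Google Search Console:

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical page URLs; in practice you'd pull these from your CMS or a crawl.
PAGES = [
    "https://www.example.com/",
    "https://www.example.com/about/",
    "https://www.example.com/blog/first-post/",
]

def build_sitemap(urls, path="sitemap.xml"):
    """Write a minimal sitemap following the sitemaps.org 0.9 schema."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = date.today().isoformat()
    ET.ElementTree(urlset).write(path, encoding="utf-8", xml_declaration=True)
    print(f"Wrote {len(urls)} URLs to {path}")

if __name__ == "__main__":
    build_sitemap(PAGES)
```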
4. Update Robots.txt Files
You probably want a robots.txt file for your website. Although not required, 99% of sites use it as a rule of thumb. If you’re not familiar with this, it’s a plain text file in your website’s root directory.
It tells search engines how you want them to crawl your site. Its primary use is to manage bot traffic and prevent your site from being overloaded with requests.
Where this comes in handy in terms of crawlability is limiting which pages Google crawls and indexes. For example, you probably don’t want pages like directories, shopping carts, and promo codes showing up in Google’s index.
Of course, this useful text file can also negatively affect crawlability. It’s well worth looking at your robots.txt file (or having an expert do it if you’re not confident in your abilities) to see if you’re inadvertently blocking crawlers from accessing your pages.
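One quick sanity check you can run yourself: Python’s built-in urllib.robotparser reads a live robots.txt file and tells you whether a given user agent is allowed to fetch a given URL. The sketch below uses a placeholder domain and a handful of example paths; swap in the pages you actually care about:

```python
from urllib import robotparser

SITE = "https://www.example.com"   # hypothetical site
ROBOTS_URL = f"{SITE}/robots.txt"

parser = robotparser.RobotFileParser()
parser.set_url(ROBOTS_URL)
parser.read()  # fetches and parses the live robots.txt

# Pages you expect crawlers to reach; adjust to your own URLs.
checks = [
    f"{SITE}/",
    f"{SITE}/blog/some-post/",
    f"{SITE}/cart/",
]

for url in checks:
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{'ALLOWED' if allowed else 'BLOCKED'}  {url}")
```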
Some common errors in robots.txt files include:
For an in-depth examination of each of these issues – and tips for solving them – read this article.
5. Check Your Canonicalization
Canonical tags consolidate signals from multiple URLs into a single canonical URL. This can be a useful way to tell Google to index the pages you want, while skipping duplicates and outdated versions.
But this opens the door to rogue canonical tags. These refer to older versions of a page that no longer exist, causing search engines to index the wrong pages and leaving your preferred pages invisible.
To eliminate this problem, use a URL inspection tool to scan for rogue canonical tags and remove them.
If your site targets international traffic, i.e., if you direct users in different countries to different canonical pages, you need to have canonical tags for each language. This ensures that your pages are indexed in each language your site uses.
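A simple way to spot rogue canonical tags at scale is to fetch each page and compare its rel="canonical" href against the URL you expect. The sketch below assumes the requests and beautifulsoup4 libraries are installed and uses placeholder URLs; a mismatch isn’t automatically wrong, but it’s worth a second look:

```python
import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4 requests

# Hypothetical pages to check; replace with URLs from your own crawl.
URLS = [
    "https://www.example.com/product/blue-widget/",
    "https://www.example.com/blog/some-post/",
]

def canonical_of(url):
    """Return the href of the page's rel=canonical link tag, if any."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all("link", href=True):
        rel = [value.lower() for value in (tag.get("rel") or [])]
        if "canonical" in rel:
            return tag["href"]
    return None

for url in URLS:
    canonical = canonical_of(url)
    if canonical is None:
        print(f"NO CANONICAL     {url}")
    elif canonical.rstrip("/") != url.rstrip("/"):
        print(f"POINTS ELSEWHERE {url} -> {canonical}")
    else:
        print(f"OK               {url}")
```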
6. Perform A Site Audit
Now that you’ve completed all of these other steps, there’s still one last thing you need to do to ensure your site is optimized for crawling and indexing: a site audit. And that starts with checking the percentage of pages Google has indexed for your site.
Check Your Indexability Rate
Your indexability rate is the number of pages in Google’s index divided by the number of pages on your website.
You can find out how many pages are in Google’s index from Google Search Console, under the Indexing report’s “Pages” tab, and check the total number of pages on your website from your CMS admin panel.
There’s a good chance your site will have some pages you don’t want indexed, so this number probably won’t be 100%. However, if the indexability rate is below 90%, you have problems that need to be investigated.
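The math itself is trivial. As a worked example with made-up numbers (pull the real figures from Search Console’s “Pages” report and your CMS):

```python
# Made-up numbers for illustration only.
indexed_pages = 1_840   # from Search Console's "Pages" report
total_pages = 2_000     # from your CMS admin panel

indexability_rate = indexed_pages / total_pages
print(f"Indexability rate: {indexability_rate:.1%}")  # -> 92.0%

if indexability_rate < 0.90:
    print("Below 90% - investigate your non-indexed URLs.")
```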
You can retrieve non-indexed URLs from Search Console and run an audit on them. This can help you understand what is causing the problem.
Another useful website audit tool included in Google Search Console is the URL Inspection Tool. This allows you to see what Google spiders see, which you can then compare to real web pages to understand what Google is unable to render.
Audit Newly Published Pages
Whenever you publish new pages on your website or update your most important pages, you should make sure they are indexed. Go into Google Search Console and make sure they all show up.
If you’re still having trouble, an audit can also give you insight into what other parts of your SEO strategy are falling short, so it’s a double win. Scale your audit process with free tools like:
7. Check For Low-Quality Or Duplicate Content
If Google does not see your content as valuable to searchers, it may decide that it is not worthy of indexing. This thin content, as it is known, can be poorly written content (e.g., filled with grammar and spelling errors), boilerplate content that is not unique to your site, or content with no external signals of its value and authority.
To find this, determine which pages on your site are not being indexed and then review the target queries for them. Do they provide high-quality answers to searchers’ questions? If not, replace or update them.
Duplicate content is another reason bots can get hung up while crawling your site. Essentially, your site’s coding structure has confused them, and they don’t know which version to index. This can be caused by things like session IDs, redundant content elements, and pagination issues.
Sometimes this will trigger an alert in Google Search Console, telling you that Google is encountering more URLs than it thinks it should. If you haven’t received one, check your crawl results for things like duplicate or missing tags, or URLs with extra characters that could create extra work for bots.
Correct these issues by fixing tags, removing pages, or adjusting Google’s access.
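If you want a rough first pass at spotting exact duplicates before digging into Search Console, you can hash each page’s visible text and look for collisions. The sketch below assumes requests and beautifulsoup4 are installed and uses placeholder URLs; it only catches identical copies (like the session-ID case above), not near-duplicates:

```python
import hashlib
from collections import defaultdict

import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4 requests

# Hypothetical URLs to compare; in practice, feed in your full crawl.
URLS = [
    "https://www.example.com/page?sessionid=123",
    "https://www.example.com/page",
    "https://www.example.com/other-page",
]

def fingerprint(url):
    """Hash the page's visible text so identical copies collide."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    text = " ".join(soup.get_text().split()).lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

groups = defaultdict(list)
for url in URLS:
    groups[fingerprint(url)].append(url)

for digest, urls in groups.items():
    if len(urls) > 1:
        print("Possible duplicates:", ", ".join(urls))
```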
8. Eliminate Redirect Chains And Internal Redirects
As websites evolve, redirects are a natural byproduct, directing visitors from one page to a newer or more relevant one. But while they’re common on most sites, if you mishandle them, you can inadvertently sabotage your own indexing.
There are several mistakes you can make when creating redirects, but one of the most common is redirect chains. These occur when there is more than one redirect between the clicked link and the destination. Google does not see this as a positive signal.
In more extreme cases, you can start a redirect loop, where one page redirects to another page, which redirects to another page, and so on, until it finally links back to the very first page. In other words, you’ve created an infinite loop that goes nowhere.
Check your site’s redirects using Screaming Frog, Redirect-Checker.org or a similar tool.
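If you’d rather script it, the requests library records every hop it followed in response.history, which makes chains and loops easy to spot. A small sketch, using placeholder URLs:

```python
import requests

# Hypothetical URLs to test; feed in links from your own site.
URLS = [
    "http://example.com/old-page",
    "https://www.example.com/promo",
]

for url in URLS:
    try:
        # requests follows redirects by default and records each hop.
        response = requests.get(url, timeout=10, allow_redirects=True)
    except requests.TooManyRedirects:
        print(f"{url}: redirect loop (too many redirects)")
        continue
    except requests.RequestException as exc:
        print(f"{url}: request failed ({exc})")
        continue

    hops = len(response.history)
    if hops > 1:
        chain = " -> ".join(r.url for r in response.history) + f" -> {response.url}"
        print(f"{url}: redirect chain of {hops} hops: {chain}")
    elif hops == 1:
        print(f"{url}: single redirect to {response.url}")
    else:
        print(f"{url}: no redirect ({response.status_code})")
```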
9. Fix Broken Links
Likewise, broken links can wreak havoc on your site’s crawlability. You should regularly check your site to make sure you don’t have broken links, as this will not only hurt your SEO results, but will frustrate human users.
There are a number of ways you can find broken links on your site, including manually evaluating each and every link on your site (header, footer, navigation, in-text, etc.), or you can use Google Search Console, Analytics, or Screaming Frog to find 404 errors.
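Here’s a minimal sketch of that last approach: collect every link on a page and flag anything that returns a 4xx or 5xx status. It assumes the requests and beautifulsoup4 libraries are installed and audits a single placeholder page; point it at your own URLs and extend it to more pages as needed:

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4 requests

PAGE = "https://www.example.com/"  # hypothetical page to audit

html = requests.get(PAGE, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

for a in soup.find_all("a", href=True):
    link = urljoin(PAGE, a["href"]).split("#")[0]
    if not link.startswith("http"):
        continue  # skip mailto:, tel:, javascript: links
    try:
        status = requests.head(link, timeout=10, allow_redirects=True).status_code
    except requests.RequestException:
        status = None
    if status is None or status >= 400:
        print(f"BROKEN ({status})  {link}  (anchor text: {a.get_text(strip=True)!r})")
```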
Once you’ve found broken links, you have three options to fix them: redirect them (see above section for caveats), update them, or remove them.
10. IndexNow
IndexNow is a relatively new protocol that allows URLs to be submitted to participating search engines simultaneously via an API. It works like a supercharged version of submitting an XML sitemap by notifying search engines of new URLs and changes to your site.
Basically, what it does is give crawlers a road map to your website in advance. They enter your site with the information they need, so there’s no need to check your sitemap again. And unlike XML sitemaps, it lets you inform search engines about non-200 status code pages.
It’s easy to implement and only requires you to generate an API key, host it in your directory or elsewhere, and submit your URLs in the recommended format.
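Per the protocol’s documentation, a submission is just a JSON POST containing your host, key, key location, and list of URLs. Here’s a rough sketch using the shared api.indexnow.org endpoint and placeholder values; check indexnow.org for the current details before relying on it:

```python
import requests

# Placeholder values: your real key must also be hosted as a text file
# on your site (e.g., https://www.example.com/<key>.txt).
HOST = "www.example.com"
KEY = "your-indexnow-api-key"
URLS = [
    "https://www.example.com/new-page/",
    "https://www.example.com/updated-page/",
]

payload = {
    "host": HOST,
    "key": KEY,
    "keyLocation": f"https://{HOST}/{KEY}.txt",
    "urlList": URLS,
}

# Shared endpoint that forwards submissions to participating search engines.
response = requests.post(
    "https://api.indexnow.org/indexnow",
    json=payload,
    headers={"Content-Type": "application/json; charset=utf-8"},
    timeout=10,
)
print(response.status_code, response.reason)
```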
Wrapping Up
By now you should have a good understanding of your website’s indexability and crawlability. You should also understand how important these two factors are to your search rankings.
If Google’s spiders can’t crawl and index your site, it doesn’t matter how many keywords, backlinks, and tags you use – you won’t appear in the search results.
And that’s why it’s important to regularly check your site for anything that could be blocking, misleading, or misdirecting bots.
So get yourself a good set of tools and get started. Be diligent, pay attention to the details, and you’ll soon have Google’s spiders swarming your site.
Featured Image: Roman Samborskyi/Shutterstock