Saturday, January 13, 2024

What is the Google 404 Error?


Sometimes Google reports a 404 (Not Found) error for a page that actually loads correctly in the browser. Here's a breakdown of the situation and its potential causes:

Understanding the 404 Error:

  • It indicates that the server can't find the requested webpage.
  • The browser typically displays "404 Not Found" or a similar message.

Scenario: Page Loads Correctly Despite 404 Error:

  • This discrepancy can arise due to several reasons:

1. Conflicting Server Responses:

  • The server might be sending a 404 status code while still delivering the page content (you can confirm the actual status code with the sketch after this list).
  • This inconsistency can confuse Google and other search engines.

2. Server-Side Misconfiguration:

  • Incorrect server settings could trigger a 404 error even when the page exists.
  • This might involve issues with URL rewriting rules, content management system settings, or the server software itself.

3. Caching Issues:

  • Cached versions of a page might load despite it being removed or moved.
  • Clearing your browser cache and server-side caches can resolve this.

4. Browser Extensions:

  • Certain extensions might interfere with page loading and generate false 404 errors.
  • Try disabling extensions temporarily to rule out this possibility.

5. Google's Indexing Delays:

  • Google might display a 404 error in search results even if the page has been updated or fixed.
  • This often resolves itself as Google recrawls and updates its index.
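
Before working through the checklist below, it helps to confirm what status code the server is actually returning, independently of what the browser renders. Here is a minimal sketch using the Python requests library (pip install requests); the URL is a placeholder you would replace with the affected page:

# check_status.py - print the HTTP status code and a content snippet for a URL
# (minimal sketch; the URL below is a placeholder)
import requests

url = "https://example.com/some-page/"  # placeholder URL

response = requests.get(url, timeout=10)

print("Status code:", response.status_code)    # 404 here, despite a full page body, means a conflicting response
print("Content length:", len(response.text))
print("First 200 characters:", response.text[:200])

If the script prints 404 but the body is clearly the full page, the server is sending a conflicting response and the fix belongs in the server configuration, not in Google.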

Troubleshooting Steps:

  1. Check Server Logs: Examine your server logs for accurate 404 error details and identify the root cause (a log-scanning sketch follows this list).
  2. Verify Page Existence: Ensure the page exists in its intended location on the server.
  3. Review Server Configuration: Check for any misconfigurations in server settings, rewrite rules, or CMS settings.
  4. Clear Caches: Clear your browser cache and potentially server-side caches.
  5. Disable Extensions: Temporarily disable browser extensions to see if they're contributing to the issue.
  6. Monitor Google Search Console: Use Google Search Console to track indexing issues and submit updated URLs for recrawl.
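
For step 1, a small script can summarize which URLs are producing 404s. This is a minimal sketch that assumes an Apache/Nginx combined-format access log at a hypothetical path; adjust the path and the parsing to your own server's log format:

# count_404s.py - count 404 responses per URL in a combined-format access log
# (sketch only; the log path and format are assumptions, adjust for your server)
from collections import Counter

LOG_PATH = "/var/log/apache2/access.log"  # hypothetical path

counts = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        parts = line.split('"')
        if len(parts) < 3:
            continue
        request = parts[1]            # e.g. 'GET /missing-page/ HTTP/1.1'
        status = parts[2].split()[0]  # status code follows the quoted request
        if status == "404":
            fields = request.split()
            url = fields[1] if len(fields) > 1 else request
            counts[url] += 1

for url, n in counts.most_common(20):
    print(f"{n:6d}  {url}")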

If you're still facing issues, consider seeking assistance from a web developer or server administrator for further diagnosis and resolution.

Friday, October 06, 2023

What are zombie pages in SEO? Chase them away from your site!


Zombie pages are pages on a website that generate little or no traffic and are difficult or impossible to access through search engine results.

In this article, we will give you our advice on how to detect these pages and how to treat them so that they do not affect the visibility of your entire site.

Summary:

  1. Why do we have to deal with zombie pages?
  2. The different types of zombie pages
  3. How to locate these pages?
  4. How to deal with zombie pages?

Why do we have to deal with Zombie Pages?

Detecting and dealing with zombie pages allows you to:

  • Improve the user experience of visitors. Removing or fixing a site's zombie pages provides a better user experience and improves the site's bounce and conversion rates.
  • Improve the quality score awarded by Google. The search engine judges a site as a whole, so removing the negative effect of zombie pages raises its overall score and therefore improves its positioning.
  • Optimize the crawl budget. Removing zombie pages, or blocking them from indexing, lets the crawl time allocated to a site be spent on its most significant pages.

The different types of Zombie Pages.

1 – Unindexed Pages.

These pages usually have technical problems, such as loading times that are too long or scripts that do not execute. Google allocates the time its crawlers spend on a site according to the number of pages it contains; it will choose not to index pages that slow down the crawl and that would, in any case, have a high chance of being abandoned by visitors.

These pages are absent from Google's index; they are not visited, or at least receive no direct traffic from the search engine.

2 – “Non Responsive” pages.

Pages that are not optimized for mobile, or that take too long to navigate on a phone, are also at a disadvantage. Google penalizes them because it considers that they offer a degraded user experience.

These pages are present in Google’s results but their ranking is penalized.

3 – Pages with obsolete or low-quality content.

There are two types:

  • Published pages that have not been updated for several years. Google may downgrade such pages, considering that they are no longer current.
  • Pages with thin content (fewer than 300 words) or of no real interest to readers are also penalized.

These pages are gradually downgraded in the results.

4 – Pages not (or insufficiently) optimized for SEO.

These pages can be quite useful and interesting for visitors, but they do not follow SEO best practices: missing alt, h1, h2, or h3 tags, a poor or overly long title, no keywords, and so on.

These pages are downgraded in search engine results.

5 – Ancillary pages.

These are often pages linked from the site's footer: contact, legal notices, general terms and conditions (GTC), GDPR notices… Even if they are of little interest to the average visitor, they contain legal information, and their presence on a site is an SEO requirement.

The absence of these pages negatively affects a site's SEO.

6 – Orphan pages.

These pages are simply not found by crawlers: no internal link points to them from other pages of the site, and they are not reachable through the site's menu. They float in a kind of parallel universe, with almost no chance of being visited.

Several techniques exist for identifying the orphan pages of a site. One of the simplest (well, unless your site has thousands of pages) is to compare your XML sitemap with the Google index of your site (you can get this index with a search such as “site:mywebsite.com” in Google). You then just have to compare the two lists to identify the pages present in your sitemap but absent from the Google index, and link the orphan pages to the rest of your site with internal links.
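
As a rough illustration of that comparison, here is a minimal Python sketch. It assumes your sitemap is available at a placeholder URL and that you have exported the list of URLs Google knows about (for example, from Search Console) into a plain text file, one URL per line; both names are assumptions for the example:

# find_orphan_candidates.py - URLs present in the sitemap but missing from an exported index list
# (sketch; the sitemap URL and the exported file name are placeholders)
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://mywebsite.com/sitemap.xml"  # placeholder
INDEXED_FILE = "indexed_urls.txt"                  # exported list, one URL per line

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urllib.request.urlopen(SITEMAP_URL, timeout=10) as resp:
    tree = ET.parse(resp)

sitemap_urls = {loc.text.strip() for loc in tree.findall(".//sm:loc", NS)}

with open(INDEXED_FILE, encoding="utf-8") as f:
    indexed_urls = {line.strip() for line in f if line.strip()}

missing = sorted(sitemap_urls - indexed_urls)
print(f"{len(missing)} URLs are in the sitemap but not in the indexed list:")
for url in missing:
    print(url)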

How to locate the Zombie Pages?

If you want to make the diagnosis yourself without going through an agency, we advise you to use Google Search Console. There you will find the tools to detect pages with low or declining performance.

The “Performance” tab (+ New + Page), which is very easy to use (especially if your site only has a few pages), lets you compare how traffic to each of your pages evolves over time and thus detect those experiencing a sharp drop in traffic.

The “Excluded” tab (+ Coverage + Excluded) lets you analyze two types of zombie pages:

  • The “Crawled – currently not indexed” pages

These are pages that Google decided not to index during its last crawl because it considered their content too thin, duplicated, or already available on many other sites. It is therefore advisable to first expand and/or rewrite the content of these pages and wait for Google's robots to crawl them again.

  • The “Discovered – currently not indexed” pages

These are pages that Google has found but chosen not to crawl or index yet, often because of technical problems (for example, when the server response time is too long).

How to deal with the Zombie Pages?

Some pages just need to be updated or optimized, while others really need to be removed and redirected.

Improve these pages.

As zombie pages are often pages with too long a loading time, pages missing from the site's internal linking, or pages with unsuitable content, you need to rehabilitate them in the eyes of Google as well as in the eyes of your visitors.

  • Update and enrich the content of these pages;
  • Check that they contain the right keywords and that the semantic richness of the text suits the subject matter;
  • Improve UX and loading time;
  • Add outbound links to other related pages on the site;
  • Add internal inbound links from other pages on your site;
  • Share them on your social networks;
  • Do not change their URLs.

Delete them!

Don’t launch into this delicate operation without checking the pages you are going to delete on a case-by-case basis.

If there are zombie pages on your site that have outdated content and do not generate any conversions, then it is possible to delete them.

On the other hand, pages that interest only a few visitors but which have a very “profitable” conversion rate should be kept.

Of course, some zombie pages are essential, such as the legal notices, the general terms and conditions of sale, and the GDPR/privacy pages; they generate little or no traffic but must be kept.

Once your zombie pages have been deleted, don’t forget to redirect (301) the URLs of these pages to the pillar pages of the appropriate category or to other pages that deal with a similar theme.
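
Once the redirects are in place, you can spot-check them with a short script like this (a minimal sketch using the Python requests library; the URL pairs below are placeholders for your own old and new URLs):

# check_redirects.py - verify that deleted zombie-page URLs return a 301 to the intended target
# (sketch; the URL pairs below are placeholders)
import requests

REDIRECTS = {
    "https://mywebsite.com/old-zombie-page/": "https://mywebsite.com/pillar-page/",
    "https://mywebsite.com/outdated-post/": "https://mywebsite.com/related-topic/",
}

for old_url, expected_target in REDIRECTS.items():
    response = requests.get(old_url, allow_redirects=False, timeout=10)
    status = response.status_code
    location = response.headers.get("Location", "")
    ok = status == 301 and location.rstrip("/") == expected_target.rstrip("/")
    print(f"{'OK  ' if ok else 'FAIL'} {old_url} -> {status} {location}")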

Friday, July 21, 2023

What is a robots.txt file, and how do you replace the robots.txt file created by Yoast SEO with a new one?

Here is an example robots.txt file:

User-agent: *
Disallow: /

User-agent: Googlebot-Image
Allow: /images/

This robots.txt file tells all search engine crawlers not to crawl any pages on the website. However, Googlebot-Image matches its own, more specific group, which explicitly allows it to crawl the images directory.

Here is a breakdown of the directives in this robots.txt file:

  • User-agent: This directive specifies which search engine crawler the directive applies to. In this case, the directive applies to all search engine crawlers, as the asterisk (*) is used as a wildcard.
  • Disallow: This directive tells the search engine crawler not to crawl the specified path. In this case, the directive tells the crawler not to crawl any pages on the website.
  • Allow: This directive permits crawling of the specified path, even within an otherwise disallowed area. In this case, it lets Googlebot-Image crawl the /images/ directory.
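
To see how a crawler interprets rules like these for a specific URL, you can use Python's built-in urllib.robotparser module. This is a minimal sketch; example.com and the test paths are placeholders:

# check_robots.py - ask urllib.robotparser whether a URL may be crawled by a given user agent
# (sketch; example.com and the test paths are placeholders)
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()  # fetches and parses the live robots.txt

for agent, url in [
    ("*", "https://example.com/some-page/"),
    ("Googlebot-Image", "https://example.com/images/photo.jpg"),
]:
    allowed = parser.can_fetch(agent, url)
    print(f"{agent:16s} {url} -> {'allowed' if allowed else 'blocked'}")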

Here are some other directives that you can use in your robots.txt file:

  • Host: This non-standard directive was historically used by some crawlers (notably Yandex) to indicate a site's preferred domain; Google ignores it.
  • Crawl-delay: This directive asks a crawler to wait a given number of seconds between requests to your website; Bing and some other crawlers honor it, but Googlebot does not.
  • Sitemap: This directive specifies the location (as an absolute URL) of your website's sitemap.

For more information on robots.txt files, you can refer to the following resources:

  • Robots Exclusion Standard: https://en.wikipedia.org/wiki/Robots_exclusion_standard
  • Google Search Console Robots.txt documentation: https://developers.google.com/search/docs/crawling-indexing/robots/intro
  • Moz Robots.txt documentation: https://moz.com/learn/seo/robotstxt

Here are the steps to replace the robots.txt file created by Yoast SEO with a new one:

  1. Go to your WordPress dashboard and click on Yoast SEO > Tools > File editor.
  2. In the File editor section, click on the Robots.txt tab.
  3. Click on the Delete button to delete the existing robots.txt file.
  4. Click on the Create new file button to create a new robots.txt file.
  5. In the new robots.txt file, enter the directives that you want to use to control how search engines crawl your website.
  6. Save the new robots.txt file.

If you create the new robots.txt file outside of Yoast (for example, in a text editor), you need to upload it to your website's server. You can do this by using a file transfer protocol (FTP) client or your web hosting provider's file manager.
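
After uploading, it is worth confirming that the live file is the one you expect. A minimal sketch (yourwebsite.com is a placeholder domain):

# verify_robots.py - fetch the live robots.txt and print its status code and contents
# (sketch; replace the placeholder domain with your own)
import urllib.request

url = "https://yourwebsite.com/robots.txt"  # placeholder domain

with urllib.request.urlopen(url, timeout=10) as resp:
    print("Status code:", resp.status)  # should be 200
    print(resp.read().decode("utf-8", errors="replace"))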

Here are some additional things to keep in mind when creating a robots.txt file:

  • The robots.txt file must be named robots.txt and it must be saved in the root directory of your website.
  • Directive names are not case-sensitive, but the paths you specify in Allow and Disallow rules are case-sensitive.
  • You can use the Allow and Disallow directives to control how search engines crawl your website.
  • You can use the User-agent directive to specify which search engines the directives apply to.
