Tackling Potential Content Duplication Issues in Your WebStore

Hi @emma

Thanks for reporting that to us.

I will double check and get back to you asap.

Regards,
Franclin

Hi @franclin_foping,

Our canonical tags are working great on department and category pages, but the prev tag breaks on our brand pages on page 2 of results.

Example page: http://www.racksforcars.com/THULE/?page=2

Expected canonical tags:

`

`

Actual canonical tags:

`

`

This isn’t a problem on pages 3, 4, 5, etc. Those seem to create the correct prev URL fine.
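To make the expected behaviour concrete, here is a minimal Python sketch of how the prev/next URLs could be derived for a listing paginated with `?page=N`. The function name `pagination_links` and the `last_page` parameter are hypothetical, and the sketch assumes page 1 lives at the bare listing URL with no `page` parameter. It is an illustration only, not the platform's actual code.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def pagination_links(url, last_page):
    """Return (prev_url, next_url) for a listing URL paginated with ?page=N.

    Assumption: page 1 is served at the bare listing URL (no page parameter).
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    page = int(query.get("page", 1))

    def page_url(n):
        q = dict(query)
        if n <= 1:
            q.pop("page", None)  # page 1 is the bare listing URL
        else:
            q["page"] = str(n)
        return urlunsplit(parts._replace(query=urlencode(q)))

    prev_url = page_url(page - 1) if page > 1 else None
    next_url = page_url(page + 1) if page < last_page else None
    return prev_url, next_url
```

Under these assumptions, page 2 of a brand listing gets the bare brand URL as its prev link, and the last page gets no next link at all.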

Thanks very much,

Tyler

Hi,

An update here:

  • @tylerkrys we have just released a fix for the issue that you reported and having double checked, we are satisfied with the results;

  • @emma there is also a fix for your issue and you should be good to go.

Once more, thanks for bringing those issues to our attention.

Regards,
Franclin

P.S: @emma congratulations on lifting the title, from a Chelsea fan!

That looks perfect @franclin_foping, thanks so much for the quick fix!

Hi @franclin_foping,

I’m just investigating possible causes why google has a bit of a downer on us at the moment, and spotted that some of these bugs are back in our live site, examples below:

Brand pages:

Page 1 of a brand does not have the correct URL for the “next” page eg look at http://www.broughtons.com/store/search/brand/elstead-lighting/

  • the brand name itself is missing.

Category Pages:
The link for the “next” url is wrong on Page 1 eg look at http://www.broughtons.com/store/category/28/250/door-handles-on-backplate/
and also on a middle page “next” is wrong eg look at http://www.broughtons.com/store/category/28/250/door-handles-on-backplate/page6.html

Subcategory Pages:
The link for the “next” url is wrong on Page 1 eg look at http://www.broughtons.com/store/category/28/250/Door-Handles-on-Backplate/Polished-Brass/
and also on a middle page “next” is wrong eg look at http://www.broughtons.com/store/category/28/250/Door-Handles-on-Backplate/Polished-Brass/page2.html

And where google searches for pages that are out of scope and the site is supposed to return a 404, it returns an empty results set instead - and with prev and next references too! eg
http://www.broughtons.com/store/category/28/250/door-handles-on-backplate/page9999.html
and http://www.broughtons.com/store/category/28/250/Door-Handles-on-Backplate/Polished-Brass/page9999.html
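
As a rough illustration of the behaviour I would expect, here is a minimal Python sketch of the status decision for a paginated listing. The function name `listing_status` and the `per_page` default are hypothetical (the real page size is not known here), so treat it as a sketch rather than how the store actually works.

```python
import math


def listing_status(total_items, page, per_page=24):
    """Return the HTTP status a paginated listing page should answer with.

    Assumption: a listing with no items, or a page number beyond the last
    page, should be a 404 rather than an empty 200 page.
    """
    if page < 1 or total_items == 0:
        return 404
    last_page = math.ceil(total_items / per_page)
    if page > last_page:
        return 404  # out of range, e.g. page9999.html
    return 200
```

A page that answers 404 this way would also carry no prev or next references.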

Thanks,
Emma

Hi @emma

Thanks for reporting this issue to us.

I will double check that and let you know when a fix is released.

Watch this space in the meantime.

Regards,
Franclin

Hi @franclin_foping,

Can I add another related use case where google is reporting crawl errors because the page does not exist yet doesn’t return a 404…

It's when it crawls a subcategory that doesn’t exist - it should return a 404 eg it crawls http://www.broughtons.com/store/category/13/100/Wall-Lights/Period/

Thanks,
Emma

Hi @emma

Thanks for letting us know. We will also look at this issue and get back to you in due course.

Regards,
Franclin

Interesting you should bring that up Emma. Our Webmaster tools is reporting an increased number of crawl errors last week, thousands of them in fact for pages which haven’t existed in a long time. Perhaps this is related to what you’ve highlighted.

On the subject of having too many and wrong indexed pages, can I raise the following as a potential cause of concern too:

Google still indexes thousands of our pages with URL formats from before we turned on SEO normalization last year. How will google EVER drop these pages - they don’t render the normalized version AND their canonical refs are themselves not the normalized version. Therefore google indexes both the unnormalized and normalized version of a page, both of which serve identical content. Surely google will conclude we have loads of duplicate content therefore. Example here:

site:www.broughtons.com/store/search/brand/kirkpatrick-ironmongery/ (choose to search without omitting similar entries) and you see it indexes both
www.broughtons.com/store/search/brand/kirkpatrick-ironmongery/
and
www.broughtons.com/store/search/brand/Kirkpatrick-Ironmongery/

What do you think @franclin_foping?

Emma

Hi @emma

We have just released a set of patches for the issues that you raised.

The next URL is now correct in your brand/category/subcategory pages and we also return a 404 on listing pages without items.

With regard to the content duplication that occurs when URL normalization is enabled, Google was indexing both pages because the two URLs served identical content. The way around it is to send a 301 redirect to the correct version of the page. For instance, of www.broughtons.com/store/search/brand/kirkpatrick-ironmongery/ and www.broughtons.com/store/search/brand/Kirkpatrick-Ironmongery/, the former is the correct version, so opening the latter URL would yield a 301 to the former.

In fact, this mechanism is already in place for retailers not using legacy URLs. Mind you, it may take some time before search engines reindex your store and at that point, I do expect to see a sharp reduction in the number of crawl errors.
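
To illustrate the idea, here is a minimal Python sketch of that redirect decision. It assumes, purely for the example, that the normalized form of a path is its lower-cased version; the real normalization rules may differ, and the function name `normalize_redirect` is hypothetical.

```python
def normalize_redirect(path):
    """Return (status, location) for a requested path.

    Assumption: the normalized URL is simply the lower-cased path.
    A non-normalized request gets a 301 to the normalized version,
    so search engines consolidate both URLs into one.
    """
    canonical = path.lower()
    if path != canonical:
        return 301, canonical
    return 200, None
```

With this sketch, a request for `/store/search/brand/Kirkpatrick-Ironmongery/` would be answered with a 301 pointing at `/store/search/brand/kirkpatrick-ironmongery/`, while the lower-cased URL is served directly.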

Once again, I would like to seize this opportunity to thank you for reporting this issue to us.

Hope this helps.
Franclin

Hi @franclin_foping,

That would be great news - only now our customers cannot navigate beyond page 1 of a brand page! When you try and select page 2, page 3 etc it just returns page 1.

Please can you look at this urgently.

Emma

Hi @emma

We have just fixed that for you.

Thanks,
Franclin

Hi @franclin_foping,

The bugs that you reported as fixed on Sept 19th have now returned and we are NOT getting the 301 redirects anymore. We also don’t get 404s for empty listings pages.

What is going on? This is not the only bug reintroduced - meta data problems are also present again which had been fixed. What is going on? It seems a massive struggle to be able to move forward (and a waste of your and my time!).

Emma

Hi @emma

Apologies for the issues that you are having with your store.

I have just checked it again and here are my initial thoughts:

Moving forward, I will get back to you once we have nailed down the metadata issue. Meanwhile, please feel free to get back to us if you need more assistance.

Kindest regards,
Franclin

Hi @franclin_foping,

I am talking about these types of pages:

  1. www.broughtons.com/store/search/brand/Kirkpatrick-Ironmongery/
    You fixed this a couple of weeks ago so it redirected to www.broughtons.com/store/search/brand/kirkpatrick-ironmongery/
    I tested it back then and it was definitely fine.
    Now the error is back.

  2. A listing page that has no data:
    http://www.broughtons.com/store/search/brand/Outdoor-Nautical-Lighting/
    http://www.broughtons.com/store/search/theme/medieval-iron-lighting/
    etc
    All indexed by google but these no longer exist and I don’t think these are proper 404s?

  3. Yes, the out-of-bounds page numbers that you previously fixed are still working fine, I don’t have any issues there :)

  4. I wasn’t trying to suggest that this is related to the meta data problem. I am trying to point out that there are quite a few instances where you fix things then they go wrong again - that meta data issue is one of them, and there is an example further up this post and I can think of quite a few others in my experience. So I am trying to suggest that there may be an issue with your procedures, policies and source code control that allows old code to overwrite fixes.

Emma

Hi @emma

Thanks for clarifying this for us.

The second point will be addressed shortly. By the way, please note that the original targets of the 404 on listing pages were department pages, then we added category and most recently subcategory pages. You can see how these changes were incrementally added to the codebase by reading our previous responses to customers’ reports. Brand and theme pages were not initially reported but of course, we will extend the fix to them as well, that’s not a problem at all.

Very glad that you are happy with the out of bound pages, I was under the impression yesterday that they were not working for you.

For the last point, we will take your advice on board and make the necessary adjustments internally. As I said yesterday, the metadata issue is not related to this forum thread but of course, it will be given our full attention as it is of paramount importance from an SEO point of view.

I will revert to you once the fix has been rolled out. Meanwhile, please do not hesitate to get in touch with us if we can be of any help.

Kindest regards,
Franclin

Hi @emma

An update here:

We released a fix for the empty brand/theme pages and the links that you reported before are now yielding a 404 response. For instance, http://www.broughtons.com/store/search/brand/Outdoor-Nautical-Lighting/

Also, I understand the issue with the metadata is solved as per 119274: HTML Headers (metatags) missing from Department, Category pages - #8 by david_acheson

I will get back to you once the patch for the 301 redirects is in place. I should stress that this issue is only affecting stores with legacy URL scheme.

Thanks for your custom.

Regards,
Franclin

Hi @franclin_foping,

We have recently switched to beta and SSL, and we are experiencing a small error on page 2 of Brand pages. The prev URL is appending the brand name to the end of the actual prev URL. This same error had occurred back in April and you guys were able to fix it then: Tackling Potential Content Duplication Issues in Your WebStore - #70 by tylerkrys

Example page: https://www.racksforcars.com/THULE/?page=2

Canonical URLs:

[code]

[/code]

The prev URL in this case should be https://www.racksforcars.com/THULE/

Could you have a look at this issue please?

Thanks very much,

Tyler

Hi @tylerkrys

Thanks for reporting that. It is now fixed.

Regards,
Franclin
