URL restructure and phasing out HTML sitemap
Hi SEOMozzies,
Love the Q&A resource and already found lots of useful stuff too!
I just started as an in-house SEO at a retailer and my first main challenge is to tidy up the complex URL structures and remove the ugly sub sitemap approach currently used. I already found a number of suggestions but it looks like I am dealing with a number of challenges that I need to resolve in a single release.
So here is the current setup:
The website is an ecommerce site (a department store) with around 30k products. We use multi-select navigation (non-Ajax). The main website relies on a third-party search engine to power the multi-select navigation, and that search engine has a very ugly URL structure. For example: www.domain.tld/browse?location=1001/brand=100/color=575&size=1&various other params, or for multi-select URLs: www.domain.tld/browse?location=1001/brand=100,104,506/color=575&size=1 &various other unused URL params. URLs easily reach 200 characters and are not at all descriptive to our users. Many of these URLs are indexed by search engines (we currently have 1.2 million of them indexed, including session IDs and all the other nasty URL params).
Next to this, the site uses a “sub site” that is sort of optimized for SEO. I am not 100% sure this is cloaking, but it smells like it. It has a simplified navigation structure and a better URL structure for products. The layout is similar to our main site, but all complex HTML elements such as multi-select and large top navigation menus are removed. Many of these links are indexed by search engines and rank higher than links from our main website. The URL structure is www.domain.tld/1/optimized-url. Currently 64,000 of these URLs are indexed. We link to this sub site in the footer of every page, but a normal customer would never reach it unless they come from organic search. Once a user lands on one of these pages, we try to push them back to the main site as quickly as possible.
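To make the URL pattern concrete, here is a rough Python sketch of how those browse URLs could be picked apart to tell single-select from multi-select. It assumes the facets are separated by either "/" or "&" and that multiple values are comma-separated, as in the two examples above; the function names `parse_facets` and `is_multi_select` are made up for illustration and are not part of the search engine's API.

```python
import re
from urllib.parse import urlsplit


def parse_facets(url):
    """Return {facet: [values]} for a browse URL.

    Assumption: facets are separated by '/' or '&' inside the query
    string, and multi-select values are comma-separated.
    """
    # urlsplit only recognises the host when a scheme or '//' is present.
    if "://" not in url:
        url = "//" + url
    query = urlsplit(url).query
    facets = {}
    for part in re.split(r"[/&]", query):
        key, sep, value = part.strip().partition("=")
        if sep and key:
            facets[key] = value.split(",")
    return facets


def is_multi_select(url):
    """True when any facet carries more than one value."""
    return any(len(values) > 1 for values in parse_facets(url).values())


single = "www.domain.tld/browse?location=1001/brand=100/color=575&size=1"
multi = "www.domain.tld/browse?location=1001/brand=100,104,506/color=575&size=1"
print(is_multi_select(single), is_multi_select(multi))
```

A check like this could be run over a crawl or log export to estimate how many of the indexed URLs are multi-select variants.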
My planned approach to improve this:
1.) Tidy up the URL structure in the main website (e.g. www.domain.tld/women/dresses and www.domain.tld/diesel-red-skirt-4563749). I plan to use Solution 2 as described in http://www.seomoz.org/blog/building-faceted-navigation-that-doesnt-suck to block multi-select URLs from being indexed, and would like to use the URL param “location” as an indicator for search engines to ignore the link. A risk here is that all my currently indexed URLs (1.2 million) will be blocked immediately after I put this live. I cannot redirect those URLs to the optimized URLs, as the old URLs should still be accessible.
2.) Remove the links to the sub site (www.domain.tld/1/optimized-url) from the footer and redirect (301) all those URLs to the newly created SEO-friendly product URLs. URLs that cannot be matched, because there is no similar catalog location in the main website, will be redirected (301) to our homepage.
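The redirect rule in step 2 could look roughly like the Python sketch below. It is only an illustration: the lookup table `REDIRECT_MAP`, its slugs, and the function name `redirect_target` are hypothetical, and in practice the mapping would have to be generated from the catalog.

```python
# Hypothetical lookup: old sub-site slug -> new SEO-friendly path.
# The entries are illustrative; the real table would come from the catalog.
REDIRECT_MAP = {
    "diesel-red-skirt": "/diesel-red-skirt-4563749",
    "women-dresses": "/women/dresses",
}
SUBSITE_PREFIX = "/1/"
HOMEPAGE = "/"


def redirect_target(path):
    """Return (status, location) for a sub-site path, or None otherwise.

    Unmatched sub-site URLs fall back to the homepage, as in step 2.
    """
    if not path.startswith(SUBSITE_PREFIX):
        return None  # not a sub-site URL, so no redirect applies
    slug = path[len(SUBSITE_PREFIX):].strip("/")
    return 301, REDIRECT_MAP.get(slug, HOMEPAGE)


print(redirect_target("/1/women-dresses"))
print(redirect_target("/1/discontinued-item"))
```

Logging how many requests hit the homepage fallback would show how much of the sub site has no counterpart in the main catalog.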
I wonder if this is a correct approach and if it would be better to do this in a phased way rather than the currently planned big bang?
Any feedback would be highly appreciated, also let me know if things are not clear.
Thanks!
Chris