Help, a certain directory is not being indexed
-
Before I start, don't expect this to be too easy. This really has me puzzled, and I am surprised I have yet to find a solution for it. Get ready.
We have a WordPress website that launched over 6 months ago, and we have never had an issue getting content such as pages, posts and categories indexed. However, somewhat recently (about 2 months ago) I installed a directory plugin (Business Directory Plugin), which lists businesses via unique URLs that are accessible from a subfolder. It's these business listings that I absolutely cannot get indexed.
The index page of the directory, which links to the business pages, is indexed. However, for some reason Google is not indexing all the listing pages that are linked from this page. I don't think the content is uncrawlable: when I run crawlers on my site, such as XML sitemap crawlers, they find all the pages, including the directory pages, so I am sure it's not an issue of the search engines not finding the content.
I have created XML sitemaps and uploaded them to Webmaster Tools. Webmaster Tools recognises that there are many pages in the XML sitemap, but Google continues to index only a small percentage (everything but my business listings).
The directory has been there for about 8 weeks now, so I know there is an issue, as it should have been indexed by now.
See our main website at www.smashrepairbid.com.au and the business directory index page at www.smashrepairbid.com.au/our-shops/
To throw in a curveball: while looking into this issue and setting up Webmaster Tools, we noticed a lot of 404 error pages (nearly 4,000). We were very confused about where these were coming from, as they were only being generated by search engines. Humans could not reach the 404s, so we are guessing the search engines were firing some JavaScript code to generate them, or something else weird. We could see the 404s in the logs, so we know they were legit, but again we feel it was only search engines. This was validated when we added some rules to robots.txt and saw the errors in the logs stop. We put the rules in the robots.txt file to try to stop Google from indexing the 404 pages, as we could not find any way to fix the site or code (no idea what is causing them). If you do a site search in Google you will see all the pages that are omitted from the results.
Since adding the rules to robots.txt, our impressions shown in Webmaster Tools have jumped right up (increased fivefold), so we thought this was a good indication of improvement, but we are still not getting the results we want.
Does anyone have any clue what's going on, or why Google and other search engines are not indexing this content? Any help would be greatly appreciated, and if you need any other information to assist, just ask me.
Really appreciate anyone who can spare their time to help me, I sure do need it.
Thanks.
-
OK issue resolved!
Lynn, thank you. It was the relative URL in the canonical tag that played havoc. Changing it to absolute is now causing the pages to be indexed.
Lesson learnt.
-
Hey Kane,
The /shops URL was an old URL that had a directory in it. We blocked it in robots.txt as it was generating tons of 404 errors. In Webmaster Tools we can see thousands of 404 errors within that directory, so we deleted it all and tried to block search engines from throwing the errors (like I described in the initial post).
A number of those listings do have very little information; however, there are a bunch that do have great content, which is why I am not sure that is the cause. I will keep an eye on this though, and also check the logs and let you know what they say.
-
Thanks Lynn.
I have taken your recommendation and changed the canonical tag to be absolute. Thanks for your help; we will see how it goes.
-
As Lynn said, relative canonical tags could absolutely cause issues. That said, I'm seeing absolute URLs in the canonical tag now, so you may have fixed that in the past few days.
Also, I do see the Our Shops pages indexed when I search for site:smashrepairbid.com.au, but I don't see any other pages in the /our-shops/ directory aside from www.smashrepairbid.com.au/our-shops/?action=search
Your robots.txt is currently blocking /shops/. I don't think that would cause an issue, but it would be nice to remove that rule if it's not needed...
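One way to double-check that the /shops/ rule does not also catch /our-shops/ is to run the rules through Python's standard robots.txt parser. This is only a rough sketch: the rules text below is an assumption based on what is described in this thread (a single Disallow for /shops/), not a copy of the live file, and the /shops/ test path is made up.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules, based on the description in this thread (not the live file).
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /shops/
"""

SITE = "http://www.smashrepairbid.com.au"


def allowed_paths(robots_txt, paths, agent="Googlebot"):
    """Return {path: True/False} for whether the agent may fetch each path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, SITE + path) for path in paths}


if __name__ == "__main__":
    checks = allowed_paths(
        ASSUMED_ROBOTS_TXT,
        [
            "/our-shops/",
            "/our-shops/1263/bakker-towing/",
            "/shops/some-old-listing/",  # hypothetical old URL
        ],
    )
    for path, allowed in checks.items():
        print(("allowed " if allowed else "BLOCKED ") + path)
```

Disallow rules match from the start of the path, so /our-shops/ should not be caught by a /shops/ rule. If the real file contains other rules, paste its actual contents in and re-run.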
There's almost zero content on the pages I glanced at, e.g. http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/ and http://www.smashrepairbid.com.au/our-shops/1616/coastal-towing-service/. When you look at it from Google's perspective, there's very little value being added by these pages. No unique photos, no phone number, no website, etc. There are a million local business scrapers that have more content than this, so why should Google bother indexing these pages?
Try pulling up your logs and seeing if these URLs have been requested by Google's spiders. Here's a good guide from Ian Lurie on how to do that in Excel: http://www.portent.com/blog/analytics/how-to-read-a-web-site-log-file.htm
If the spiders are crawling those shop URLs but aren't indexing them, I think the first thing to do is add way more content to the pages.
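If Excel isn't handy, the same log check can be roughed out in a few lines of Python. This is a sketch under assumptions: it expects the common Apache/Nginx "combined" log format, the sample lines are invented for illustration, and matching on the user-agent string alone can be fooled by fake Googlebots, so treat the counts as a first pass only.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adjust if your server differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines, prefix="/our-shops/"):
    """Count (path, status) pairs requested under prefix by a Googlebot user agent."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        if "googlebot" not in match.group("agent").lower():
            continue
        if match.group("path").startswith(prefix):
            hits[(match.group("path"), match.group("status"))] += 1
    return hits


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2013:13:55:36 +1000] "GET /our-shops/1263/bakker-towing/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2013:13:56:01 +1000] "GET /our-shops/ HTTP/1.1" '
    '200 8300 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '66.249.66.1 - - [10/Oct/2013:13:57:12 +1000] "GET /shops/old-page HTTP/1.1" '
    '404 310 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    for (path, status), count in googlebot_hits(SAMPLE_LOG).items():
        print(count, status, path)
```

To run it on a real log, pass an open file instead of SAMPLE_LOG, e.g. `googlebot_hits(open("access.log"))`, where the file name is whatever your host uses. If the listing URLs never show up, the problem is crawling; if they show up with 200s but stay out of the index, it points back at content or canonical issues.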
-
Hi Trent,
Having a quick look, I saw that you have relative URLs in your canonical tag, and this could be problematic. I think it would be worth making those URLs absolute to avoid any confusion on Google's part in determining what page or page version should be indexed.
Cannot say for sure if this is the problem, but worth looking into.
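For anyone wanting to check this on their own pages, here is a rough Python sketch that pulls the canonical tag out of a page's HTML and flags it if the href is relative. The sample HTML is made up to mirror the situation described here (it is not copied from the site), and the helper names are just for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical hrefs found in the HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


def is_absolute(url):
    """True if the URL has an http(s) scheme and a host name."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical markup showing a relative canonical on a listing page.
PAGE_URL = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"
SAMPLE_HTML = '<head><link rel="canonical" href="/our-shops/1263/bakker-towing/" /></head>'

if __name__ == "__main__":
    for href in find_canonicals(SAMPLE_HTML):
        if is_absolute(href):
            print("OK, absolute canonical:", href)
        else:
            print("Relative canonical:", href)
            print("Suggested absolute form:", urljoin(PAGE_URL, href))
```

The suggested absolute form is simply the relative href resolved against the page URL; whether that is the version you actually want indexed (www vs non-www, trailing slash and so on) is still your call.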
Hope that helps!