Blog page won't get indexed
-
Hi Guys,
I've been asked to work on a website. I noticed that the blog posts aren't getting indexed in Google. www.domain.com/blog does get indexed, but the blog posts themselves don't. They have been online for over 2 months now.
I found this in the robots.txt file:
Allow: /
Disallow: /kitchenhandle/
Disallow: /blog/comments/
Disallow: /blog/author/
Disallow: /blog/homepage/feed/
I'm guessing that the last line causes this issue. Does anyone have an idea if this is the case and why they would include this in the robots.txt?
Cheers!
-
Thanks a lot!
-
Hi Dirk,
Good observation, I missed the canonical part somehow. So Google is indexing the canonical URLs, which don't have /blog/ in them, and that's the problem. Have a look at the indexed page for this particular instance here. The non-/blog/ URL is indexed, and it takes you to its /blog/ version, which carries the wrong canonical URL.
Solution: Either remove the canonical URLs on these pages or point them to the current page itself. And yes, as Dirk rightly mentioned, link properly to the /blog/ URLs from the blog page and from any other pages that link to these articles.
-
This is definitely the issue. Fix that canonical and they'll be indexed.
-
To update - even worse: on the blog itself you are linking to the canonical version, not to the /blog/ version. So it would be impossible for Google to index the /blog/ versions.
If you search for woontrends 2016 site:www.keukensduitsland.nl you will notice that the canonical version is properly indexed (even with the strange JS redirect).
Dirk
-
It's not related to the robots.txt - you can easily check that in Webmaster Tools (Crawl > Robots.txt tester).
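For anyone who wants to sanity-check this outside the tester, here is a minimal Python sketch of how the quoted rules would apply to a blog post URL. It assumes the rules sit in a group that applies to Googlebot (the excerpt in the question did not show a User-agent line), and it only does plain prefix matching, with no wildcard support. The precedence it uses (the longest matching path wins, and a tie goes to Allow) follows Google's documented behaviour. The names RULES and is_allowed are made up for this illustration.

```python
from urllib.parse import urlparse

# Rules as quoted in the question. Assumption: they belong to a group
# that applies to Googlebot (the excerpt showed no User-agent line).
RULES = [
    ("allow", "/"),
    ("disallow", "/kitchenhandle/"),
    ("disallow", "/blog/comments/"),
    ("disallow", "/blog/author/"),
    ("disallow", "/blog/homepage/feed/"),
]


def is_allowed(url, rules=RULES):
    """Return True if the URL path is crawlable under the given rules.

    Simplified model: plain prefix matching only (no * or $ wildcards).
    The longest matching rule path wins; on a tie, allow wins.
    """
    path = urlparse(url).path or "/"
    matches = [
        (len(rule_path), kind == "allow")
        for kind, rule_path in rules
        if path.startswith(rule_path)
    ]
    if not matches:
        return True  # no rule matches, so crawling is allowed
    # max() picks the longest path; for equal lengths True (allow) > False
    return max(matches)[1]


if __name__ == "__main__":
    for url in (
        "http://www.keukensduitsland.nl/blog/woontrends-2016/",
        "http://www.keukensduitsland.nl/blog/homepage/feed/",
    ):
        print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

Under these assumptions the blog post URL is allowed and only the feed URL is blocked, which fits the point that the last Disallow line is not what keeps the posts out of the index.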
The first issue is the location of the link - if you put a small link to the blog hidden in the left corner at the bottom of the page, Google is not going to attribute a lot of importance to this link.
The most important issue on your blog articles is the canonical. Example:
http://www.keukensduitsland.nl/blog/woontrends-2016/ has as its canonical URL: http://www.keukensduitsland.nl/woontrends-2016/ - however, this page will redirect you with JavaScript to the blog article.
Make the canonical self-referencing and do a proper redirect on the other pages (301 rather than a JS redirect).
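A rough way to spot this kind of mismatch across many posts is to pull the rel="canonical" link out of each page and compare it with the URL the page lives on. The sketch below uses only the Python standard library and works on an HTML string, so fetching the page is left out. SAMPLE_HTML is a made-up stand-in for the page source, built from the two URLs in the example above, and the helper names are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_of(html):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_self_referencing(page_url, html):
    """True if the declared canonical equals the page URL (ignoring a trailing slash)."""
    canonical = canonical_of(html)
    return canonical is not None and canonical.rstrip("/") == page_url.rstrip("/")


# Hypothetical page source, modelled on the example in this thread.
PAGE_URL = "http://www.keukensduitsland.nl/blog/woontrends-2016/"
SAMPLE_HTML = (
    '<html><head><link rel="canonical" '
    'href="http://www.keukensduitsland.nl/woontrends-2016/"></head>'
    "<body></body></html>"
)

if __name__ == "__main__":
    print("canonical:", canonical_of(SAMPLE_HTML))
    print("self-referencing:", is_self_referencing(PAGE_URL, SAMPLE_HTML))
```

The comparison is deliberately strict: apart from the trailing slash it does not normalise scheme, host case, or query strings, so treat a mismatch as a prompt to look at the page by hand.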
Dirk
-
Hi Happy SEO,
Well, the robots.txt looks fine here. Could you try to fetch any of the blog pages or posts as Google in Search Console and share the screenshot here?
Also, to cross-check the robots.txt (which looks fine, though), you can use the robots.txt tester in Search Console, where you can enter any blog page or post to check whether bots can crawl it. Please share a screenshot of that as well.
On a separate note, the sitemap.xml link mentioned in the robots.txt (http://www.keukensduitsland.nl/sitemap.xml) is broken. Fix that as well.
-
Hi Nitin,
The URL is www.keukensduitsland.nl (/blog). The link to the blog page is in the bottom left corner and is called "Keukennieuws".
-
Hi Happy SEO,
Could you please share the blog URL here? Sounds like an interesting issue, and I would love to try to help you with this.