Blog page won't get indexed
-
Hi Guys,
I've been asked to work on a website, and I noticed that the blog posts won't get indexed in Google. www.domain.com/blog does get indexed, but the blog posts themselves don't. They have been online for over 2 months now.
I found this in the robots.txt file:
Allow: /
Disallow: /kitchenhandle/
Disallow: /blog/comments/
Disallow: /blog/author/
Disallow: /blog/homepage/feed/
I'm guessing that the last line causes this issue. Does anyone have an idea if this is the case and why they would include this in the robots.txt?
Cheers!
-
Thanks a lot!
-
Hi Dirk,
Good observation, I missed the canonical part somehow. So Google is indexing the canonical URLs, which don't have /blog/ in them, and that's the problem. Have a look at the indexed page for this particular instance here. The non-/blog/ instance is indexed, and it takes you to its /blog/ version, which carries the wrong canonical URL.
Solution: Either remove the canonical URLs on these pages or point them to the current page itself. And yeah! As rightly mentioned by Dirk, link properly to the /blog/ URLs from the blog page and from any other pages where you link to these articles.
-
This is definitely the issue. Fix that canonical and they'll be indexed.
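To confirm the fix once it is deployed, something like the following rough Python sketch could help. It pulls the canonical out of a page's HTML and checks whether it points back at the page itself. The names `CanonicalFinder` and `is_self_referencing` are made up for illustration, and the two HTML strings are stand-ins reduced to the one tag discussed in this thread, not the site's real markup.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before testing.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr.get("href")


def is_self_referencing(page_url, html):
    """True if the page declares a canonical that points at page_url itself."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return False
    # Ignore a trailing-slash difference when comparing the two URLs.
    return finder.canonical.rstrip("/") == page_url.rstrip("/")


PAGE = "http://www.keukensduitsland.nl/blog/woontrends-2016/"

# Assumed markup: the canonical as described in the thread (points away from /blog/).
CURRENT_HTML = (
    '<html><head><link rel="canonical" '
    'href="http://www.keukensduitsland.nl/woontrends-2016/"></head></html>'
)
# Assumed markup after the fix: the canonical points at the page itself.
FIXED_HTML = (
    '<html><head><link rel="canonical" '
    'href="http://www.keukensduitsland.nl/blog/woontrends-2016/"></head></html>'
)

print(is_self_referencing(PAGE, CURRENT_HTML))
print(is_self_referencing(PAGE, FIXED_HTML))
```

In practice you would feed it the HTML fetched from each blog post URL and flag every page where the check fails.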
-
To update - even worse: on the blog itself you are linking to the canonical version - not to the /blog/ version. So it would be impossible for Google to index /blog/ type of content.
If you search for "woontrends 2016 site:www.keukensduitsland.nl" you will notice that the canonical version is properly indexed (even with the strange JS redirect).
Dirk
-
It's not related to the robots.txt - you can easily check that in Webmaster Tools (Crawl > Robots.txt tester).
First issue is the location of the link - if you put a small link to the blog hidden in the bottom left corner of the page, Google is not going to attribute a lot of importance to this link.
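For anyone who wants to sanity-check this outside the tester, here is a rough Python sketch of how the quoted rules apply to a URL. It assumes the rules apply to all crawlers (the quoted snippet shows no User-agent line) and uses the longest-matching-prefix rule, with ties going to Allow. Wildcards (`*`, `$`) are not handled, and `RULES` and `is_allowed` are illustrative names only.

```python
from urllib.parse import urlparse

# Rules as quoted in the question, as (directive, path prefix) pairs.
RULES = [
    ("allow", "/"),
    ("disallow", "/kitchenhandle/"),
    ("disallow", "/blog/comments/"),
    ("disallow", "/blog/author/"),
    ("disallow", "/blog/homepage/feed/"),
]


def is_allowed(url, rules=RULES):
    """Longest matching prefix wins; a tie goes to allow; no match means allowed."""
    path = urlparse(url).path or "/"
    best_len, allowed = -1, True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        # Prefer the more specific (longer) rule; on equal length prefer allow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len, allowed = len(prefix), is_allow
    return allowed


# A blog post only matches "Allow: /", so it is crawlable.
print(is_allowed("http://www.keukensduitsland.nl/blog/woontrends-2016/"))
# The feed URL matches the last Disallow line, which is more specific.
print(is_allowed("http://www.keukensduitsland.nl/blog/homepage/feed/"))
```

The last Disallow line only covers paths starting with /blog/homepage/feed/, so it cannot block the blog posts themselves.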
Most important issue on your blog articles is the canonical - example:
http://www.keukensduitsland.nl/blog/woontrends-2016/ has as canonical url: http://www.keukensduitsland.nl/woontrends-2016/ - however this page will redirect you with javascript to the blog article.
Make the canonical self-referencing and do a proper redirect on the other pages (301 rather than JS redirect).
Dirk
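As a rough illustration of what a server-side 301 looks like compared to a JS redirect, here is a minimal Python sketch using only the standard library. It is not how this site is built (the thread does not say which server or CMS is in use), and in practice the redirect would normally live in the web server or CMS configuration. `REDIRECTS`, `redirect_target`, `RedirectHandler` and `serve` are hypothetical names, and the mapping holds only the one URL pair mentioned above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from the old non-/blog/ paths to the real article paths.
REDIRECTS = {
    "/woontrends-2016/": "/blog/woontrends-2016/",
}


def redirect_target(path):
    """Return the /blog/ path an old path should 301 to, or None if unknown."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        # 301 = moved permanently; the Location header carries the new URL,
        # so crawlers see the move in the HTTP response without running any JS.
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()


def serve(port=8000):
    """Start a local test server; call this manually to try the redirect."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling `serve()` and requesting /woontrends-2016/ on localhost should return a 301 response with a Location header pointing at the /blog/ path.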
-
Hi Happy SEO,
Well, the robots.txt looks fine here. Could you try to fetch any of the blog pages/posts as Google in Search Console and share the screenshot here?
Also, to cross-check the robots.txt (which looks fine though), you have the robots.txt tester in Search Console, where you can enter any blog page/post to check if bots can crawl it. Please share a screenshot of that as well.
On a separate note, the sitemap.xml link mentioned in the robots.txt (http://www.keukensduitsland.nl/sitemap.xml) is broken. Fix that as well.
-
Hi Nitin,
The URL is www.keukensduitsland.nl (/blog). The link to the blog page is in the bottom left corner called "Keukennieuws".
-
Hi Happy SEO,
Could you please share the blog URL here? Sounds like an interesting issue, and I would love to try to help you with this.