Blocked by meta-robots but there is no robots file

Twinbytes

OK, I'm a little frustred here. I've waited a week for the next weekly index to take place after changing the privacy setting in a wordpress website so Google can index, but I still got the same problem. Blocked by meta-robots, no index, no follow. But I do not see a robot file anywhere and the privacy setting in this Wordpress site is set to allow search engines to index this site. Website is www.marketalert.ca

What am I missing here? Why can't I index the rest of the website and is there a faster way to test this rather than wait another week just to find out it didn't work again?

Twinbytes

The .htaccess file is in placing directing www to non www, so I don't see what else I could do with that. I forgot to mention the website was recently overhauled by someone else, and they are having me help with SEO. Not sure if that has anything to do with it. It looks like the .htaccess should be reversed so the non www points to the www which has more value. Someone else designed this site and they are having me do the SEO on it for them.

jim_cetin

The issue might be the forwarding from www.yourdomain.ca to yourdomain.ca

look at http://www.opensiteexplorer.org/pages?site=marketalert.ca%2F

and here http://www.opensiteexplorer.org/pages?site=www.marketalert.ca%2F

..some are indexed on with www and other without www. , this is your main issue.

recommendation:

revisit the htaccess file or where the redirect has been set DNS..
choose one with www or without and stick to it.
revicit your external links and make the changes to your links
create new sitemap and resubmit to SearchEngines

Twinbytes

I ran the SEO web crawler and it finished already. Successfully crawled all pages. I still have to wait for another week to get the main campaign updated and see results there, but I believe it may work too now.

I guess I solved my own problem after being directed to robots.txt by Jim. I found that the Wordpress plugin for SEO xml sitemap creator was the problem because it created a virtual robots.txt file which sent me on a wild goose chase looking for a robots.txt file which didn't exist. Creating a robots.txt file allowing all seems to be the solultion, incase anyone else has this same problem.

blu42media

If you can, follow up either way - happy to help you get it debugged!

Twinbytes

I was able to update my sitemap.xml with Google webmaster tools no problem. I'm not 100% confident though that means the entire site is searchable by the spiders. I guess I'll know for sure in a few days tops.

blu42media

I agree with Jim. Update your sitemap.xml files with Google Webmaster Tools. That will also help you identify problems you might be missing.

Twinbytes

I've done some more looking into it and seems to be a problem when Wordpress uses the XML site generator plugin. It creates a virtual robot.txt file, which is why I couldn't find the robot.txt file. Apparently the only fix is to replace it with an actual robot.txt file forcing it to allow all.

I just replaced the robots.txt file with a real one allowing all. SEOmoz estimates a few days to test site crawl and it's another 7 days before the next scheduled crawl. I'd kinda like to find out sooner if it's not going to work. There must be a faster test. I don't need a detailed test, just a basic test that says, YEP, we can see this many pages or something like that.

jim_cetin

hi

your robots.txt file is located here http://marketalert.ca/robots.txt, which is the root of your website directory.

this is the actual location of your sitemap file (http://marketalert.ca/sitemap.xml), does the Google WT show any issues about the sitemap file could not be found?

You might need to resubmit the sitemap file, if there are any changes, of course with the updated version of your site.

hope this helps.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Blocked by meta-robots but there is no robots file

Browse Questions

Explore more categories

Related Questions

H1 Tags the same as Title Tags and other meta questions

Sub Domains and Robot.txt files...

Pageing page and seo meta tag questions

Quality Issues: My blog is blocked on Google Search Engine

Blocking https from being crawled

Allow or Disallow First in Robots.txt

Search Engine Blocked by Robot Txt warnings for Filter Search result pages--Why?

Does RogerBot read URL wildcards in robots.txt