Blocked by meta-robots but there is no robots file
-
OK, I'm a little frustred here. I've waited a week for the next weekly index to take place after changing the privacy setting in a wordpress website so Google can index, but I still got the same problem. Blocked by meta-robots, no index, no follow. But I do not see a robot file anywhere and the privacy setting in this Wordpress site is set to allow search engines to index this site. Website is www.marketalert.ca
What am I missing here? Why can't I index the rest of the website and is there a faster way to test this rather than wait another week just to find out it didn't work again?
-
The .htaccess file is in placing directing www to non www, so I don't see what else I could do with that. I forgot to mention the website was recently overhauled by someone else, and they are having me help with SEO. Not sure if that has anything to do with it. It looks like the .htaccess should be reversed so the non www points to the www which has more value. Someone else designed this site and they are having me do the SEO on it for them.
-
The issue might be the forwarding from www.yourdomain.ca to yourdomain.ca
look at http://www.opensiteexplorer.org/pages?site=marketalert.ca%2F
and here http://www.opensiteexplorer.org/pages?site=www.marketalert.ca%2F
..some are indexed on with www and other without www. , this is your main issue.
recommendation:
- revisit the htaccess file or where the redirect has been set DNS..
- choose one with www or without and stick to it.
- revicit your external links and make the changes to your links
- create new sitemap and resubmit to SearchEngines
-
I ran the SEO web crawler and it finished already. Successfully crawled all pages. I still have to wait for another week to get the main campaign updated and see results there, but I believe it may work too now.
I guess I solved my own problem after being directed to robots.txt by Jim. I found that the Wordpress plugin for SEO xml sitemap creator was the problem because it created a virtual robots.txt file which sent me on a wild goose chase looking for a robots.txt file which didn't exist. Creating a robots.txt file allowing all seems to be the solultion, incase anyone else has this same problem.
-
If you can, follow up either way - happy to help you get it debugged!
-
I was able to update my sitemap.xml with Google webmaster tools no problem. I'm not 100% confident though that means the entire site is searchable by the spiders. I guess I'll know for sure in a few days tops.
-
I agree with Jim. Update your sitemap.xml files with Google Webmaster Tools. That will also help you identify problems you might be missing.
-
I've done some more looking into it and seems to be a problem when Wordpress uses the XML site generator plugin. It creates a virtual robot.txt file, which is why I couldn't find the robot.txt file. Apparently the only fix is to replace it with an actual robot.txt file forcing it to allow all.
I just replaced the robots.txt file with a real one allowing all. SEOmoz estimates a few days to test site crawl and it's another 7 days before the next scheduled crawl. I'd kinda like to find out sooner if it's not going to work. There must be a faster test. I don't need a detailed test, just a basic test that says, YEP, we can see this many pages or something like that.
-
hi
your robots.txt file is located here http://marketalert.ca/robots.txt, which is the root of your website directory.
this is the actual location of your sitemap file (http://marketalert.ca/sitemap.xml), does the Google WT show any issues about the sitemap file could not be found?
You might need to resubmit the sitemap file, if there are any changes, of course with the updated version of your site.
hope this helps.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Only Images Show Up in Log Files
Has anyone ever seen a log file analysis return only images and no actual page URLs?
Technical SEO | | LoganRay0 -
Google Six Pack Meta Descriptions
Anyone know of some good resources on the way to control the meta descriptions in the Google six pack for a branded search. Google appears to often bypass the meta description and pull random text from the page. We have tested this by adding the brand name to the meta description and this appears to help but doesn't always sort the issue. You often get the same results with the site comand. It not really an SEO issue more a branding one. Anyone get any ideas or resources they can point me at?
Technical SEO | | highwayfive0 -
What punctuation can you use in meta tags? Are there any Google does not like?
So I know you can use dashes and | in meta tags, but can anyone tell me what other punctuation you can use? Also, it'd be great to know what punctuation you can't use. Thanks!
Technical SEO | | Trevorneo1 -
Google is Still Blocking Pages Unblocked 1 Month ago in Robots
I manage a large site over 200K indexed pages. We recently added a new vertical to the site that was 20K pages. We initially blocked the pages using Robots.txt while we were developing/testing. We unblocked the pages 1 month ago. The pages are still not indexed at this point. 1 page will show up in the index with an omitted results link. Upon clicking the link you can see the remaining un-indexed pages. Looking for some suggestions. Thanks.
Technical SEO | | Tyler1230 -
Staging & Development areas should be not indexable (i.e. no followed/no index in meta robots etc)
Hi I take it if theres a staging or development area on a subdomain for a site, who's content is hence usually duplicate then this should not be indexable i.e. (no-indexed & nofollowed in metarobots) ? In order to prevent dupe content probs as well as non project related people seeing work in progress or finding accidentally in search engine listings ? Also if theres no such info in meta robots is there any other way it may have been made non-indexable, or at least dupe content prob removed by canonicalising the page to the equivalent page on the live site ? In the case in question i am finding it listed in serps when i search for the staging/dev area url, so i presume this needs urgent attention ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
Registered Trademark in a Meta Title or Content
I know that registered trademarks don't hurt SEO, however if the trademark is used in the middle of a popular search phrase (see below) will it hurt the site's chanced of getting ranked for this term. Example: Funkybrand® Shoes PS I found one brand that used the trademark Acuvue® contact lenses. thanks!
Technical SEO | | yanaiguana1110 -
Does Bing ignore robots txt files?
Bonjour from "Its a miracle is not raining" Wetherby Uk 🙂 Ok here goes... Why despite a robots text file excluding indexing to site http://lewispr.netconstruct-preview.co.uk/ is the site url being indexed in Bing bit not Google? Does bing ignore robots text files or is there something missing from http://lewispr.netconstruct-preview.co.uk/robots.txt I need to add to stop bing indexing a preview site as illustrated below. http://i216.photobucket.com/albums/cc53/zymurgy_bucket/preview-bing-indexed.jpg Any insights welcome 🙂
Technical SEO | | Nightwing0 -
Robots.txt
My campaign hse24 (www.hse24.de) is not being crawled any more ... Do you think this can be a problem of the robots.txt? I always thought that Google and friends are interpretating the file correct, seen that he site was crawled since last week. Thanks a lot Bernd NB: Here is the robots.txt: User-Agent: * Disallow: / User-agent: Googlebot User-agent: Googlebot-Image User-agent: Googlebot-Mobile User-agent: MSNBot User-agent: Slurp User-agent: yahoo-mmcrawler User-agent: psbot Disallow: /is-bin/ Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_DisplayProductInformation-Start Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/ Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/Root%20Edition/units/HSE24/Beratung/
Technical SEO | | remino630