Crawl diagnostic issue?
-
I'm sorry if my English isn't very good, but this is my problem at the moment:
On two of my campaigns I get a weird error in Moz Analytics:
605 Page Banned by robots.txt, X-Robots-Tag HTTP Header, or Meta Robots Tag
Moz Analytics points to a URL that starts with: http://None/www.????.com. We don't understand how Moz indexed this non-existent page that starts with None. How can we solve this error?
I hope that someone can help me.
-
Hi MOZ,
I'm sorry that I did not respond sooner. The problem has been solved. Thanks!
Also thanks to Pixel for the response!
Greetz,
Sam
-
Hi Nettt!
I apologize for any confusion and can confirm there is no issue on your side. One of our crawlers failed, causing some campaigns crawled on Aug 29th to attempt to follow the strange /None/ URL you are seeing in your diagnostics. I've submitted a re-crawl for all of your affected campaigns, so you should see updated data by this Friday.
Hope this helps!
-
"I have checked the URL, and it is not our own website that has the error."
Is this the problem?
Could you take a screen grab of the problem? It might help.
-
Thanks for the response, Pixelbypixel!
I have checked the URL, and it is not our own website that has the error.
We have checked the robots.txt and it should not cause any problem. We haven't changed it recently.
I think that Moz is causing it, but I am not sure.
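The 605 message names three possible sources, and robots.txt is only one of them. A rough way to rule out the other two (the X-Robots-Tag header and the meta robots tag) is sketched below. It is only an illustration: the function name, the sample header and HTML, and the set of directives treated as blocking are assumptions, not Moz's actual rules.

```python
from html.parser import HTMLParser

# Assumption: directives treated as "blocking" for this sketch.
BLOCKING = {"noindex", "nofollow", "none"}


def _directives(value):
    """Split a comma-separated directive string into a lowercase set."""
    return {d.strip().lower() for d in value.split(",") if d.strip()}


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots"> or <meta name="<agent>">."""

    def __init__(self, agent):
        super().__init__()
        self.names = {"robots", agent.lower()}
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in self.names:
            self.directives |= _directives(attr.get("content", ""))


def blocking_directives(headers, html, agent="rogerbot"):
    """Return blocking directives found in the headers or the HTML."""
    found = set()
    for name, value in headers.items():
        if name.lower() != "x-robots-tag":
            continue
        prefix, sep, rest = value.partition(":")
        if sep and "," not in prefix:
            # A "name: directives" form; skip it unless it addresses our agent.
            if prefix.strip().lower() != agent.lower():
                continue
            value = rest
        found |= _directives(value) & BLOCKING
    parser = MetaRobotsParser(agent)
    parser.feed(html)
    return found | (parser.directives & BLOCKING)


# Hypothetical response data, for illustration only.
sample_headers = {"X-Robots-Tag": "googlebot: noindex"}
sample_html = '<html><head><meta name="robots" content="NOINDEX"></head></html>'
print(blocking_directives(sample_headers, sample_html))
```

If this returns an empty set for the headers and HTML of the reported page, and robots.txt also allows the crawler, the block is probably not coming from the site itself.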
-
Is the URL correct in Moz Pro? It also seems like your robots.txt is blocking Moz, which you may want to look into.
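To test whether a robots.txt file blocks Moz's crawler, something like the following may help. It is a minimal sketch using Python's standard urllib.robotparser; the sample robots.txt and the example.com URLs are made up for illustration, and it assumes the crawler identifies itself as rogerbot.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, agent="rogerbot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt that blocks rogerbot entirely
# but only blocks /private/ for everyone else.
sample_robots = """\
User-agent: rogerbot
Disallow: /

User-agent: *
Disallow: /private/
"""

for agent in ("rogerbot", "Googlebot"):
    print(agent, is_allowed(sample_robots, "http://www.example.com/", agent))
```

Paste your own robots.txt content in place of the sample and use the URL from the crawl report. If the result is True for rogerbot, robots.txt is unlikely to be the cause of the 605 error.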