Crawl Diagnostic Errors
-
Hi there,
Seeing a large number of errors in the SEOMOZ Pro crawl results. The 404 errors are for pages that look like this:
http://www.example.com/2010/07/blogpost/http:%2F%2Fwww.example.com%2F2010%2F07%2Fblogpost%2F
I know that %2F is the URL-encoded form of a slash, but I'm not sure why these addresses are being crawled. The site is a WordPress site. Anyone seen anything like this?
-
Yep, I think you nailed it. I crawled another 2 sites I manage; one has Sexy Bookmarks, one doesn't. The one with it had 404 errors. A quick search for "sexy bookmarks causes 404" turned up some results as well.
You're right about the issue with the other plugin, commentluv. Will definitely take that suggestion to the developer.
And a hat trick, you're right about the block of latest from the blog on the footer. Been meaning to take that out for ages.
Very grateful for your attention and wisdom! Thank you!
-
Ross, it seems you have a plugin for comments which adds a link to the last post of the person who made the comment. This is an interesting plugin which I have not seen before. There are two problems I see with the plugin. First, it identifies links to your own site as external, when they should be tagged as internal. Second, it probably shouldn't be used to link to the current page. Debbi's comment is a link asking readers to view her latest article, which is the current page.
There is also a link to the current article under Recent Posts. It would be a great advancement for the plugin if it could identify the current URL and not include it in the list.
There is also a footer section, "Latest from blog", which offers a link to the post. In my opinion, offering the same links in the Recent Posts sidebar and the "Latest from blog" footer is excessive, and since footer links aren't used very much I would recommend removing the footer block.
The fourth link to the article I located on the page is from a plugin referred to as "Shareaholic TopSharingBar SexyBookmarks". The link is contained within JavaScript.
All four of the links above are valid and should not be the source of the 404 error.
And finally I believe I just now discovered the root cause of this issue. It seems to be your "Shareaholic" plugin. Try disabling it and then crawling your site again. The 404 error should disappear.
The URL you shared, in the exact format you shared it, is present in your site's HTML code in a line which begins with the following code:
-
Will do, and thank you for your insight!
-
I just started a SEOmoz crawl for your site. It will take some time to complete. Once the report is available I'll take a look.
Since you removed a plugin, the results may not be the same. You may have resolved the issue. Please refrain from making further changes until the crawl is complete.
-
Okay, sure. Embarrassingly enough, it's my own site at bayareaseo.net.
http://www.bayareaseo.net/2011/11/things-that-can-mess-up-your-google-places-rankings/
is the referring URL in the SEOmoz crawler,
and in GWT the original URL refers to
http://www.bayareaseo.net/2011/11/things-that-can-mess-up-your-google-places-rankings/<a< p=""></a<>
Just removed a "related posts" style plugin, not sure if that's the culprit.
-
It doesn't make sense to me that the referrer is the page itself. If you are willing to share your site's URL and the specific URL which is having an issue I can perform a crawl and offer more details.
-
The referrer is the page itself. Examined the code and I'm not seeing any links that match, with or without the funky markup, i.e. searching for
http://www.example.com/2010/07/blogpost/http:%2F%2Fwww.example.com%2F2010%2F07%2Fblogpost%2F
as well as
http://www.example.com/2010/07/blogpost/http://www.example.com/2010/07/blogpost/
I'm thinking it's down to one of two WP plugins causing the error. Found similar results in GWT, with many 404s referring from themselves as
http://www.example.com/page<a< p=""></a<>
Will disable the plugins and report back after the next crawl.
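For anyone repeating this search, here is a minimal Python sketch of how the two search strings above relate to each other. It assumes the broken address is simply the page URL followed by a copy of itself with the slashes percent-encoded; the function name `doubled_url_variants` is made up for illustration and is not part of any plugin or Moz tool.

```python
from urllib.parse import quote, unquote


def doubled_url_variants(page_url):
    """Return (encoded, decoded) forms of a page URL appended to itself.

    Hypothetical helper: it only reproduces the pattern seen in the
    crawl report, it does not explain which plugin generates it.
    """
    # Keep ":" literal but percent-encode "/" as %2F, as in the report.
    encoded_tail = quote(page_url, safe=":")
    encoded = page_url + encoded_tail
    # Decoding turns every %2F back into a plain slash.
    decoded = unquote(encoded)
    return encoded, decoded


encoded, decoded = doubled_url_variants("http://www.example.com/2010/07/blogpost/")
print(encoded)
print(decoded)
```

Searching the page source for both strings, plus the encoded tail on its own, covers the case where the page only contains the encoded fragment and the crawler is the one joining it to the page address.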
-
The crawler normally starts on your site's home page, moves through all the HTML code on that page, then follows every link it finds throughout your site. If you are seeing these errors on your crawl report, then the links are on your site.
Examine your crawl report and look for the REFERRER field. This field indicates the page which contains the link. If you can't see the link on the page itself, right-click on the page and choose View Page Source, then do a search of the HTML code (CTRL+F) for the link.
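If the manual CTRL+F search gets tedious, the same check can be scripted. The following is a rough Python sketch, not a Moz feature: it takes the page source as a string and reports which lines contain the page URL either in plain form or with the slashes percent-encoded. The function name `find_link_lines` and the sample HTML are hypothetical, made up for illustration, and the sample script line is not what any particular plugin actually emits.

```python
from urllib.parse import quote


def find_link_lines(html_source, page_url):
    """Return (line_number, form) pairs for lines that mention page_url."""
    needles = {
        "plain": page_url,
        # Same URL with "/" written as %2F, as seen in the crawl report.
        "percent-encoded": quote(page_url, safe=":"),
    }
    hits = []
    for line_no, line in enumerate(html_source.splitlines(), start=1):
        for label, needle in needles.items():
            if needle in line:
                hits.append((line_no, label))
    return hits


# Hypothetical page source used only to exercise the function.
sample = """<html>
<a href="http://www.example.com/2010/07/blogpost/">Recent post</a>
<script>var share = "http:%2F%2Fwww.example.com%2F2010%2F07%2Fblogpost%2F";</script>
</html>"""

print(find_link_lines(sample, "http://www.example.com/2010/07/blogpost/"))
```

To run it against a real page, paste the output of View Page Source into a file and pass its contents as `html_source`.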