Crawl Diagnostic Errors
-
Hi there,
Seeing a large number of errors in the SEOMOZ Pro crawl results. The 404 errors are for pages that look like this:
http://www.example.com/2010/07/blogpost/http:%2F%2Fwww.example.com%2F2010%2F07%2Fblogpost%2F
I know that t%2F represents the two slashes, but I'm not sure why these addresses are being crawled. The site is a wordpress site. Anyone seen anything like this?
-
Yep, i think you nailed it. I crawled another 2 sites I manage, one has sexy bookmarks, one doesn't. The one with had 404 errors. A quick search for sexy bookmarks causes 404 had some results as well.
You're right about the issue with the other plugin, commentluv. Will definitely take that suggestion to the developer.
And a hat trick, you're right about the block of latest from the blog on the footer. Been meaning to take that out for ages.
Very grateful for your attention and wisdom! Thank you!
-
Ross, it seems you have a plugin for comments which adds a link to the last post of the person who made the comment. This is an interesting plugin which i have not seen before. There are two problems I see with the plugin. First, it identifies links to your own site as external, when they should be tagged as internal. Secondly, it probably shouldn't be used to link to the current page. Debbi's comment is a link asking readers to view her latest article, which is the current page.
There is also a link to the current article under Recent Posts. It would be a great advancement for the plugin if it could identify the current URL and not include it in the list.
There is also a footer section "Latest from blog" which offers a link to the post. In my opinion offering the same links in the Recent Posts side bar and the "Latest from blog" footer is excessive, and since footer links aren't used very much I would recommend removing the footer block.
The fourth link to the article I located on the page is from a plugin which is referred to as "Shareaholic TopSharingBar SexyBookmarks". The link is contained within javascript.
All of the above 4 links are valid links and should not be the source of the 404 error.
And finally I believe I just now discovered the root cause of this issue. It seems to be your "Shareaholic" plugin. Try disabling it and then crawling your site again. The 404 error should disappear.
The URL you shared, in the exact format you shared it, is present in your site's HTML code in a line which begins with the following code:
-
will do and thank you for your insight!
-
I just started a SEOmoz crawl for your site. It will take some time to complete. Once the report is available I'll take a look.
Since you removed a plug in, the results may not be the same. You may have resolved the issue. Please refrain from making further changes until the crawl is complete.
-
Okay sure. Embarassingly enough, it's my own site at bayareaseo.net.
http://www.bayareaseo.net/2011/11/things-that-can-mess-up-your-google-places-rankings/
is referring to in SEOMOZ crawler
and in GWT the original url refers to
http://www.bayareaseo.net/2011/11/things-that-can-mess-up-your-google-places-rankings/<a< p=""></a<>
Just removed a "related posts" style plug in, not sure if that's the culprit.
-
It doesn't make sense to me that the referrer is the page itself. If you are willing to share your site's URL and the specific URL which is having an issue I can perform a crawl and offer more details.
-
The referrer is the page itself. Examined the code and I'm not seeing any links that match, with or without the funky markup, i.e. searching for
http://www.example.com/2010/07/blogpost/http:%2F%2Fwww.example.com%2F2010%2F07%2Fblogpost%2F
as well as
http://www.example.com/2010/07/blogpost/http://www.example.com/2010/07/blogpost/
I'm thinking it's down to one of two WP plugins causing the error. Found similar results in GWT, with many 404s referring from themselves as
http://www.example.com/page<a< p=""></a<>
Will disable the plugins and report back after the next crawl
-
The crawler normally will start on your site's home page and move through all the html code on the home page, then crawl each and every link on the home page following it throughout your site. If you are seeing these errors on your crawl report then the links are on your site.
Examine your crawl report and look for the REFERRER field. This field indicates the page which contains the link. If you can't see the link on the page itself, right-click on the page and choose View Page Source, then do a search of the html code (CTRL+F) for the link.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to Fix 404 Errors
Hey Moz'ers - I just added a new site to my Moz Pro account and when I got the crawl report back there was a ton of 404 errors (see attached). I realize the best way to fix these is to manually go through every single error and see what the issue is... I just don't have time right now, and I don't have a team member that can jump on this either, but realize this will be a huge boost to this client if/when I get these resolved... So my question is: Is there a quicker way to get these resolved? Is there an outsourcing company that can fix my clients errors correctly? Thanks for the help in advance:) wBhzEeV
Moz Pro | | 2Spurs0 -
Initiate crawl
Anyway to start the crawl of a site immediately after changes have been made? Or must you wait for the next scheduled crawl? Thanks.
Moz Pro | | dave_whatsthebigidea.com0 -
Adjusting SEOmoz Crawling Speed
How do you adjust the SEOmoz crawling speed? SEOmoz tried to crawl 10,000 pages in 3 hours and crashed our MySQL server.
Moz Pro | | cappuccino891 -
How long does it take for a campaign website crawl to be completed?
Our campaign website crawl has been 'crawling' now for 5 days. Is this a normal phenomenon or is something hanging up?
Moz Pro | | Discountvc0 -
Crawling One Page
I set up a profile for a site with many pages, opting for setting up as a root directory. When SEOMoz crawled, they only found one page. Any ideas for why this would be? Thanks!
Moz Pro | | Group160 -
SEOmoz crawl diagnostics report - what are the duplicate pages urls?
I just see the number of duplicates but not what the urls of the duplicates are? I don't see it in the export either, but maybe I'm missing it Cheers S
Moz Pro | | firstconversion0 -
Why is Roger crawling pages that are disallowed in my robots.txt file?
I have specified the following in my robots.txt file: Disallow: /catalog/product_compare/ Yet Roger is crawling these pages = 1,357 errors. Is this a bug or am I missing something in my robots.txt file? Here's one of the URLs that Roger pulled: <colgroup><col width="312"></colgroup>
Moz Pro | | MeltButterySpread
| example.com/catalog/product_compare/add/product/19241/uenc/aHR0cDovL2ZyZXNocHJvZHVjZWNsb3RoZXMuY29tL3RvcHMvYWxsLXRvcHM_cD02/ Please let me know if my problem is in robots.txt or if Roger spaced this one. Thanks! |0