Googlebot and other spiders are requesting odd links on our website. Trying to understand why, and what to do about it.
-
I recently began work on an existing WordPress website that was revamped about 3 months ago: https://thedoctorwithin.com. I'm a bit new to WordPress, so I thought I should reach out to some of the experts in the community.
Checking ‘Not found’ Crawl Errors in Google Search Console, I notice many irrelevant links that are not present in the website or in the database, as near as I can tell. When checking the reported source of these irrelevant links, I notice they’re all listed as linked from various pages in the site, as well as from pages that are allegedly in the site but have never existed.
For instance:
- https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/feedback-and-testimonials/ allegedly linked from:
- https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/ (doesn’t exist)
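One mechanism that can produce URLs of this shape is a relative link (an href without a leading slash) being resolved against whatever URL the crawler is currently on. Whether that is what is happening on this site is an assumption; nothing in the crawl reports confirms it. A minimal sketch, where both href values are hypothetical:

```python
from urllib.parse import urljoin

# The "linked from" URL reported in the example above.
BASE = "https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/"

# Hypothetical hrefs: the first has no leading slash, the second does.
RELATIVE_HREF = "feedback-and-testimonials/"
ROOT_RELATIVE_HREF = "/feedback-and-testimonials/"


def resolve(base_url: str, href: str) -> str:
    """Resolve an href against the page it appears on, as a crawler would."""
    return urljoin(base_url, href)


if __name__ == "__main__":
    # A relative href is appended to the current path, nesting one level deeper.
    print(resolve(BASE, RELATIVE_HREF))
    # A root-relative href always resolves to the same flat URL.
    print(resolve(BASE, ROOT_RELATIVE_HREF))
```

If the theme or a menu emits hrefs of the first kind, every spurious URL that returns a page would yield further, deeper spurious URLs, which would match the pattern above.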
In other cases, these goofy URLs are even reported as linked from the sitemap, although all the URLs in the sitemap itself are valid.
Currently, the site has a flat structure. Nearly all the content is merely URL/content/ without further breakdown (or subdirectories). Previous site versions had a more varied page organization, but what I'm seeing doesn't seem to reflect the current page organization, nor the previous page organization.
I had a similar issue earlier, due to the use of Divi's search feature. I ended up with some pretty deep non-existent links branching off of /search/, such as:
- https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/consultations/ allegedly linked from:
- https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/ (doesn't exist).
I blocked the /search/ branches via robots.txt. No real loss, since neither /search/ nor any of its subdirectories is valid.
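A quick way to sanity-check a rule like this is Python's standard `urllib.robotparser`. The robots.txt content below is an assumed minimal version of the rule, not a copy of the site's actual file:

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real file on the site may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the assumed robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    blocked = "https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/"
    normal = "https://thedoctorwithin.com/feedback-and-testimonials/"
    print(blocked, is_allowed(blocked))
    print(normal, is_allowed(normal))
```

Note that a Disallow rule only stops compliant crawlers from fetching those URLs; it does not remove the links that lead crawlers to them.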
There are numerous pre-existing categories and tags on the site. The categories and tags aren't used as pages. I suspect Google (and other engines) might be creating arbitrary paths from these. Looking through the site’s 404 errors, I’m seeing the same behavior from Bing, Moz and other spiders as well.
I suppose I could use Search Console to remove URL/category/ and URL/tag/, and do the same with regard to other legitimate spiders and search engines. Perhaps it would be better to use mod_rewrite to lead spiders to pages that actually do exist.
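To make the rewrite idea concrete, here is a rough sketch of the decision such a rule would encode, written in Python rather than as an actual .htaccess rule. It assumes the flat URL/content/ structure described above; `VALID_SLUGS` is a hypothetical list of real page slugs, and treating the last path segment as the intended page is only a guess at what the crawler was after.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical set of real page slugs on the flat site.
VALID_SLUGS = {
    "feedback-and-testimonials",
    "online-continuing-education",
    "consultations",
}


def redirect_target(url: str, valid_slugs=VALID_SLUGS) -> Optional[str]:
    """Return the flat path a nested spurious URL could be redirected to, or None."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) < 2:
        return None  # root or already-flat URL: nothing to rewrite
    last = segments[-1]
    if last in valid_slugs:
        return f"/{last}/"  # candidate for a 301 to the real page
    return None  # no obvious real page: let it 404


if __name__ == "__main__":
    print(redirect_target(
        "https://thedoctorwithin.com/search/newsletters/page/2/"
        "feedback-and-testimonials/feedback-and-testimonials/"
        "online-continuing-education/consultations/"
    ))
```

URLs whose last segment is not a known page, such as the /page/3/ example, fall through to None here, which corresponds to leaving them as 404s.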
- Looking forward to suggestions about the best way to deal with these errant searches.
- Also curious to learn about why these are occurring.
Thank you.
-
Thanks, Kevin.
Glad I'm not the only one.
Disabling tags and categories isn't an option in my case. Guess I need to look at more of the potential upside. It seems tags and categories, if handled correctly, could provide a new way to engage visitors and search engines.
I've heard people refer to 'spidering budgets' or whatnot. Guess it's an entirely new topic of discussion: whether limiting the spurious crawling (from good spiders) means that those spiders will spend more time on the conventional pathways of a site.
-
Thanks, Vijay.
Did a lot of work fixing links in the database.
The issue was occurring even before implementation of WP Super Cache, and before the link fixing.
Being new-ish to WP, I find it strange that it's so willing to provide access via directories that don't really exist: categories, tags, even search, if using a theme-provided site search.
I'm getting better at .htaccess, so I'm able to handle a lot of the old incoming links fairly well. In the case of these weird 'in the mind of the spiders' links, I will try to address these as well.
Thanks for your advice about 404 and 301 plugins. Time to look around and see what other useful tools are out there.
-
-
I have the same issue. I have stopped using tags because of all the irrelevant links they cause. Looking forward to reading the comments on this thread.
KJr
-
Hi There,
Your website is built on WordPress, and it looks like there might be spurious entries in the DB, which might also not be getting deleted due to the WP Super Cache plugin. You may try to empty your cache and install 'all 404 redirect' and 301 management plugins.
I hope this helps.
Regards,
Vijay