Why do old URL format are still being crawled by Rogerbot?
-
Hi,
In the early days of my blog, I used permalinks with the following format:
http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/
I then decided to change this format using .htaccess to this format:
http://www.mysitesamp.com//heidi-cortez-photo-shoot/
My question is, why do rogerbot still crawls my old URL format since these urls' no longer exists in my website or blog.
-
Thanks Alan,
That solved my problem...
-
-
Hi Alan,
After disallowing the directory in robots.txt, Rogerbot still includes the non-existing URLs. Here is a sample URL that is being reported by Rogerbot
www.lugaluda.com/2009/08/05/chase-online-banking-chase-checking-bonus/
-
If you give me the url, i can crawl it fior you if you like.
-
Thanks Alan, I really appreciate your help. Gave me an idea since all the old URLs are coming from a virtual 2009 directory, I tried to add a disallow statement for that directory in the robots.txt section. Hopefully this will help solve the problem.
I will let you know the results after rogerbot finishes recrawling my site...
Thanks Dude....
-
You need to search your site, but bots start on a page and follow the links, if the report them then they must of found them, bots like googlebot or bingbot can find them on other sites, but rogerbot is only crawling within your site.
-
How will I know if they still exists on my site? If I tried to access the specific URLs, they are no longer active.
-
The old format must still exist in your site somewhere, bots follow links from your home page though your site.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL Changes Twice in the Same Year
I've got a new client with a great site, great off-page optimization and some scars and a hangover from a bad developer relationship. I'd be so grateful for your thoughts on this situation: Some time in the not-too-distant-past, the website is established and new content is posted. We'll call this Alpha. In April 2015, the client migrates to WordPress, implementing 301 redirects on every content page because of the capitalization issues of the old CMS. That means Alpha URLs are redirecting to Betas. Problem is, the new Beta WordPress URLs are the the permalink structure: /%year%/%monthnum%/%postname%/ and update by default when the page content is updated meaning that any updates to existing content cause another 301. It's my belief that for evergreen content, dates in the URL do nothing to help you and might even hurt from a user-experience standpoint, if not a search engine one. So, naturally, I'd like to move to the simple/%postname%/ structure, which would be Gamma. So, here's how I think we should fix it. Step 1: Update the sitemap and navigation and make the desired URL (Gamma) structure the default and the canonical. Step 2: Change the Alpha -> Beta redirects to Alpha -> Gamma Step 3: Add Beta -> Gamma redirects Anyone done this in the past? Anyone have any problems with it?
Intermediate & Advanced SEO | | LindsayDayton0 -
Site was moved, but still exists on the old server and is being outranked for it's own name
Recently, a client went through a split with a business partner, they both had websites on the same domain, but within their own sub directories. There is a main landing page, which links to both sites, the landing page sits on the root. Ie. example.com is a landing page with links to example.com/partner1, and example.com/partner2 Parter 2 will be my client for this example. After the split, partner 2 downloaded his website, and put it up on his own server, but no longer has any kind of access to the old servers ftp, and partner 1 is refusing to cooperate in any way to have the site removed from the old server. They did add a 301 redirect for the home page on the old server for partner 2, so, example.com/partner2/index.html is 301'ing to the new site on the new server, HOWEVER, every other page is still live on that old server, and is outranking the new site in every instance. The home page is also being outranked, even with the 301 redirect in place. What are some steps I can take to rectify this? The clients main concern is that this old website, containing the old partners name, is outranking him for his own name, and the name of his practice. So far, here's what i've been thinking: Since the site has poor on-page optimization, i'll start be cleaning all of that up. I'll then optimize the home page to better depict the clients name and practice through proper usage of heading tags, titles, alt, etc, as well as the meta title and description. The only other thing I can think of would be to start building some backlinks? Any help/suggestions would be greatly appreciated! Thanks.
Intermediate & Advanced SEO | | RCDesign740 -
Will Canonical tag on parameter URLs remove those URL's from Index, and preserve link juice?
My website has 43,000 pages indexed by Google. Almost all of these pages are URLs that have parameters in them, creating duplicate content. I have external links pointing to those URLs that have parameters in them. If I add the canonical tag to these parameter URLs, will that remove those pages from the Google index, or do I need to do something more to remove those pages from the index? Ex: www.website.com/boats/show/tuna-fishing/?TID=shkfsvdi_dc%ficol (has link pointing here)
Intermediate & Advanced SEO | | partnerf
www.website.com/boats/show/tuna-fishing/ (canonical URL) Thanks for your help. Rob0 -
Received "Googlebot found an extremely high number of URLs on your site:" but most of the example URLs are noindexed.
An example URL can be found here: http://symptom.healthline.com/symptomsearch?addterm=Neck%20pain&addterm=Face&addterm=Fatigue&addterm=Shortness%20Of%20Breath A couple of questions: Why is Google reporting an issue with these URLs if they are marked as noindex? What is the best way to fix the issue? Thanks in advance.
Intermediate & Advanced SEO | | nicole.healthline0 -
Development site crawled
We just found out our password protected development site has been crawled. We are worried about duplicate content - what are the best steps to take to correct this beyond adding to robots.txt?
Intermediate & Advanced SEO | | EileenCleary0 -
Old page redirection method ?
New web site uploaded .but still there are many old site's pages index in Google .I have created 301 redirect for similar page but what about rest of pages?as eg there is a page called www.xxxx.com/testimonial.php but new site don't have a testimonial pages so what i can delete old page and redirect to home page or what please advice me
Intermediate & Advanced SEO | | innofidelity0 -
Subdirectory URLs
If I have category pages for my site; is it better to use http://example.com/category/category or just http://example.com/category? Also, I'm creating a new section of the site; a resource center. Should the URLs of the pages in the resource center be http://example.com/learn/page or just http://example.com/page What are the reasons for the better choice?
Intermediate & Advanced SEO | | Visually0 -
New AddThis URL Sharing
So, AddThis just added a cool feature that attempts to track when people share URL's via cutting and pasting the address from the browser. It appears to do so by adding a URL fragment on the end of the URL, hoping that the person sharing will cut and paste the entire thing. That seems like a reasonable assumption to me. Unless I misunderstand, it seems like it will add a fragment to every URL (since it's trying to track all of 'em). Probably not a huge issue for the search engines when they crawl, as they'll, hopefully, discard the fragment, or discard the JS that appends the fragment. But what about backlinks? Natural backlinks that someone might post to say, their blog, by doing exactly what AddThis is attempting to track - cutting and pasting the link. What are people's thoughts on what will happen when this occurs, and the search engines crawl that link, fragment included?
Intermediate & Advanced SEO | | BedeFahey0