What might make Bingbot find a URL that looks like this on our site?
-
I have been doing something Richard Baxter recently suggested and reviewing our server logs.
I have found an oddity that hopefully some of you smart Mozzers can help me figure out.
Here is the line from the server log (there are many more like this):
157.55.32.166 - - [04/Mar/2013:08:00:59 -0800] "GET /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones HTTP/1.1" 200 94133 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"
See how www.ccisolutions.com appears after /StoreFront/category/? We used to see weird URLs like this reported in GWT, but ever since we changed our canonical tags from relative to absolute URLs, they no longer appear in our Webmaster Tools reports.
However, it seems there is still a problem. Where/how could Bingbot be seeing URLs configured this way? Could it be a server issue, or is it most likely a data problem?
Thanks in advance!
Dana
P.S. Could this be resulting from our massive use of relative URLs all over the site?
-
Hi Streamline,
I thought I would circle back and update everyone on what I found. You were correct about malformed URLs being the culprit. We have many isolated instances of internal links that are missing the "/" at the beginning of a relative URL, and the relative URLs are inconsistent all over the site. It's certainly an example of one of the many problems that can be caused by using relative rather than absolute URLs.
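For anyone wanting to hunt down the same kind of link, here is a minimal Python sketch of the check described above. It assumes you already have a page's HTML as a string; the class name, function name, and sample markup are made up for illustration. Note that it flags every relative href without a leading "/", so some legitimate page-relative links will show up as well and need a manual look.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class SuspectLinkFinder(HTMLParser):
    """Collects href values that are relative but do not start with "/"."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name != "href" or not value:
                continue
            parts = urlsplit(value)
            # Skip absolute URLs (http:, https:, mailto:, ...), protocol-relative
            # URLs (//host/path), and fragment-only or query-only links.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            if not parts.path.startswith("/"):
                self.suspects.append(value)


def find_suspect_links(html):
    """Return the hrefs in the given HTML that a crawler would resolve page-relatively."""
    finder = SuspectLinkFinder()
    finder.feed(html)
    return finder.suspects


# Hypothetical markup, for illustration only.
sample_html = """
<a href="/StoreFront/category/shure-se-earphones">root-relative, fine</a>
<a href="http://www.ccisolutions.com/StoreFront/category/shure-se-earphones">absolute, fine</a>
<a href="www.ccisolutions.com/StoreFront/category/shure-se-earphones">missing scheme</a>
<a href="StoreFront/category/shure-se-earphones">missing leading slash</a>
<a href="#top">fragment only</a>
"""

print(find_suspect_links(sample_html))
```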
Since we are in the process of completely re-doing the site and moving to a new platform, it's something we can definitely work to get right during the transition.
Thanks again to you, Daniel and Keri for jumping in with answers.
Dana
-
Thanks to you both Daniel and Streamline.
I believe the problem may have to do with our .htaccess file. I am obtaining a copy of it now.
-
Thanks Keri. That's very helpful. I will do that.
-
Hi Dana,
I agree with Streamline: there is probably a hidden issue in your site where a page is linking to a malformed URL (a URL missing 'http://'). Given that there are a number of them in one day, I would guess this is happening in a templated page.
Have a look at;
It renders as a page.
The best course of action would be to resolve it at the source. If you can pinpoint when this issue is due to occur next, have your developer get each page to append its URL to the log at the beginning of the page. Then you should be able to determine where the issue is occurring. I am hoping you will see a discernible pattern.
Worst-case scenario, possibly a canonical will work, OR create a REGEX redirect to handle this URL pattern in htaccess...
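To make the redirect idea a bit more concrete, here is a rough Python sketch of the matching logic only. It is not an .htaccess rule; the pattern and the function name are assumptions for illustration and would need translating into whatever rewrite syntax the server uses, then testing against real URLs before going live.

```python
import re

# Matches a path in which the site's hostname appears as a path segment
# and captures everything that follows the hostname.
EMBEDDED_HOST = re.compile(r"^/(?:.*?/)?(?:www\.)?ccisolutions\.com(/.*)$")


def redirect_target(path):
    """Return the corrected path to redirect to, or None if the path looks fine."""
    match = EMBEDDED_HOST.match(path)
    return match.group(1) if match else None


bad_path = "/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones"
good_path = "/StoreFront/category/shure-se-earphones"

print(redirect_target(bad_path))
print(redirect_target(good_path))
```

A redirect like this only hides the symptom from crawlers; the faulty links would still be in the pages.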
Hope this helps,
Dan
-
Dana, you might also want to contact Bing at https://support.discoverbing.com/eform.aspx?productKey=bingwebmaster&ct=eformts&scrx=1. I sent a quick note on Twitter to Duane Forrester and that's the URL he provided.
-
Can you tell from which page Bing is trying to access these URLs? And it only happened on the 4th and not on any other day? Could it be an issue with the sitemap on that day?
I'm looking at your site now and the page http://www.ccisolutions.com/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones is returning a 200 response code to me, not a 404 code. The key is to figure out how Bing discovered the URL in the first place...
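If it helps with the "which day, which referrer" question, something along these lines could tally the odd requests straight from the log. This is only a sketch: the regular expression assumes every line follows the same layout as the one line quoted in the question, and the names here are made up for illustration.

```python
import re
from collections import Counter

# Pattern for the log layout shown in the question: IP, two dashes, timestamp,
# request line, status, size, referrer, user agent.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

HOSTNAME = "www.ccisolutions.com"


def summarize(lines):
    """Count requests whose path contains the hostname, keyed by (day, referrer)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and HOSTNAME in match.group("path"):
            counts[(match.group("day"), match.group("referrer"))] += 1
    return counts


# The single log line quoted in the question.
sample = [
    '157.55.32.166 - - [04/Mar/2013:08:00:59 -0800] "GET /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones HTTP/1.1" 200 94133 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"',
]

print(summarize(sample))
```

To run it over a real log, pass an open file object instead of the sample list. A referrer of "-" means the crawler sent none, so the log alone may not reveal the source page.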
-
While this is certainly a possibility, I'm not sure it's the cause of the problem. If this were the case, wouldn't it most likely cause a 404 error, instead of rendering the proper page (albeit with a very funky URL) and a 200 status code?
The other thing making me think it's not just a poorly constructed link on the site is that there are over 100 of these in the server log, from just one day.
Thoughts?
-
I'm willing to bet that on some page of your site, there is a link pointing to www.ccisolutions.com/StoreFront/category/shure-se-earphones which is missing the "http://" at the beginning. So if Bing or a user tried to click on that link, they would be directed to /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones instead of the correct link.
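Here is a small Python illustration of that resolution behaviour, using the standard library's `urljoin`, which resolves a link against the URL of the page it sits on. The page URL is a made-up example; only the href comes from the log line in the question.

```python
from urllib.parse import urljoin

# Hypothetical page on which the faulty link appears (an assumption for illustration).
page_url = "http://www.ccisolutions.com/StoreFront/category/some-category"

# An href that is missing its "http://" scheme, so it is treated as a relative path.
broken_href = "www.ccisolutions.com/StoreFront/category/shure-se-earphones"

# The same href with the scheme restored.
fixed_href = "http://" + broken_href

# Resolve each href against the page URL, as a browser or crawler would.
resolved_broken = urljoin(page_url, broken_href)
resolved_fixed = urljoin(page_url, fixed_href)

print(resolved_broken)
print(resolved_fixed)
```

The first result has the same doubled-up path that shows up in the server log, while the second is the intended category URL.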