What might make Bing.bot find a URL that looks like this on our site?
-
I have been doing something Richard Baxter recently suggested and reviewing our server logs.
I have found an oddity that hopefully some of you smart Mozzers can help me figure out.
Here is the line from the server log (there are many more like this):
157.55.32.166 - - [04/Mar/2013:08:00:59 -0800] "GET /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones HTTP/1.1" 200 94133 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"
See how the www.ccisolutions.com appears after /StoreFront/category/ ? We used to see weird URLs reported in GWT that looked like this, but ever since we fixed our canonical tags to be absolute instead of relative URLs, they no longer appeared in our Webmaster Tools reports.
However, it seems there is still a problem. Where/how could Bingbot be seeing URLs configured this way? Could it be a server issue, or is it most likely a data problem?
Thanks in advance!
Dana
P.S. Could this be resulting from our massive use of relative URLs all over the site?
-
Hi Streamline,
I thought I would circle back and update everyone as to what I found. You were correct about mal-formed URLs being the culprit of this problem. We have many isolated incidences of URLs for internal links that are missing the "/" at the beginning of a relative URL. There are inconsistencies on the relative URLs all over the site. It's certainly an example of one of many problems that can be caused by using relative rather than absolute URLs.
Since we are in the process of completely re-doing the site and moving to a new platform, it's something we can definitely work to get right during the transition.
Thanks again to you, Daniel and Keri for jumping in with answers.
Dana
-
Thanks to you both Daniel and Streamline.
I believe the problem may have to do with our .htaccess file. I am obtaining a copy of it now.
-
Thanks Keri. That's very helpful. I will do that.
-
Hi Dana,
I agree with Streamline, there will be a hidden issue in you site that it attempting to connect to an under formed link (a URL missing 'http://'). Given there is a number of them in one day I will guess this is happening in a templated page.
Have a look at;
It renders as a page.
The best course of action would be resolve it at the source. If you can pinpoint when this issue is due to occur next, have your developer get each page to append it's URL into the log at the beginning of the page. Then you should be able to determine where the issue is occurring. I am hoping you well see a discernible pattern.
Worse case scenario, possibly a canonical will work, OR create a REGEX redirect to handle this URL pattern in htaccess...
Hope this helps,
Dan
-
Dana, you might also want to contact Bing at https://support.discoverbing.com/eform.aspx?productKey=bingwebmaster&ct=eformts&scrx=1. I sent a quick note on Twitter to Duane Forrester and that's the URL he provided.
-
Can you tell from which page Bing is trying to access these URLs? And it only happened on the 4th and not on any other day? Could it be an issue with the sitemap on that day?
I'm looking at your site now and the page http://www.ccisolutions.com/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones is returning a 200 response code to me, not a 404 code. The key is to figure out how Bing discovered the URL in the first place...
-
While this is certainly a possibility, I'm not sure it's the cause of the problem. If this were the case, wouldn't it most likely cause a 404 error, instead of rendering the proper page (albeit with a very funky URL) and a 200 status code?
The other thing making me think it's not just a poorly constructed link on the site is that there are over 100 of these in the server log, from just one day.
Thoughts?
-
I'm willing to bet that on some page of your site, there is a link pointing to www.ccisolutions.com/StoreFront/category/shure-se-earphones which is missing the "http://" at the beginning. So if Bing or a user tried to click on that link, they would be directed to /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones instead of the correct link.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to find orphan pages
Hi all, I've been checking these forums for an answer on how to find orphaned pages on my site and I can see a lot of people are saying that I should cross check the my XML sitemap against a Screaming Frog crawl of my site. However, the sitemap is created using Screaming Frog in the first place... (I'm sure this is the case for a lot of people too). Are there any other ways to get a full list of orphaned pages? I assume it would be a developer request but where can I ask them to look / extract? Thanks!
Technical SEO | | KJH-HAC1 -
Which URL would you choose?
1 – www.company.com/subfolder/subfolder/keyword-keyword-product (I’m able to keyword match with this url) or 2. www.company.com/subfolder/subfolder/product (no url keyword match) What would you choose? A url which is "short" but still relevant, or, a url which is more descriptive allowing “keyword” match? Be great to get your feedback guys. Many thanks Gary
Technical SEO | | GaryVictory0 -
Cache Not Working on Our Site
We redesigned our site (www.motivators.com) back in April. Ever since then, we can't view the cache. It loads as a blank, white page but the cache text is at the top saying: "This is Google's cache of http://www.motivators.com/. It is a snapshot of the page as it appeared on Jul 22, 2013 15:50:40 GMT. The current page could have changed in the meantime. Learn more. Tip: To quickly find your search term on this page, press Ctrl+F or ⌘-F (Mac) and use the find bar." Has anyone else ever seen this happen? Any ideas as to why it's happening? Could it be hurting us? Advice, tips, suggestions would be very much appreciated!
Technical SEO | | Motivators0 -
How to keep a URL social equity during a URL structure/name change?
We are in the process of making significant URL name/structure change to one of our property and we want to keep the social equity (likes, share, +1, tweets) from the old to the new URL. We have been trying many different option without success. We are running our social "button" in an iframe. Thanks
Technical SEO | | OlivierChateau0 -
How to visualize our entire site to discover the origin of URLs?
What is a tool to use so that I can visualize all links to all pages on the site so that I can discover how certain duplicate content URLs are being created?
Technical SEO | | poolguy0 -
We're working on a site that is a beer company. Because it is required to have an age verification page, how should we best redirect the bots (useragents) to the actual homepage (thus skipping ahead of the age verification without allowing all browsers)?
This question is about useragents and alcohol sites that have an age verification screen upon landing on the site.
Technical SEO | | OveritMedia0 -
How to make multiple url redirection using global.asax in IIS 6?
sir, I am working with IIS 6 site and i ant to redirect three different urls of a domain to one url, i.e, there are the different versions of the same url...so how can i create one? I have found a script on google. but it says redirecting one url. see it here: Sub Application_BeginRequest(ByVal sender as Object, ByVal e as EventArgs) Try Dim requestedDomain As String = HttpContext.Current.Request.Url.ToString().toLower() If InStr(requestedDomain, "http://yoursite.com") Then requestedDomain = requestedDomain.Replace("http://yoursite.com", "http://www.yoursite.com") Response.Clear() Response.Status = "301 Moved Permanently" Response.AddHeader("Location", requestedDomain) Response.End() End If Catch ex As Exception Response.Write("Error in Global.asax :" & ex.Message) End Try End Sub
Technical SEO | | VipinLouka780 -
Are lots of links from an external site to non-existant pages on my site harmful?
Google Webmaster Tools is reporting a heck of a lot of 404s which are due to an external site linking incorrectly to my site. The site itself has scraped content from elsewhere and has created 100's of malformed URLs. Since it unlikely I will have any joy having these linked removed by the creator of the site, I'd like to know how much damage this could be doing, and if so, is there is anything I can do to minimise the impact? Thanks!
Technical SEO | | Nobody15569050351140