What might make Bingbot find a URL that looks like this on our site?
-
I have been doing something Richard Baxter recently suggested and reviewing our server logs.
I have found an oddity that hopefully some of you smart Mozzers can help me figure out.
Here is the line from the server log (there are many more like this):
157.55.32.166 - - [04/Mar/2013:08:00:59 -0800] "GET /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones HTTP/1.1" 200 94133 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"
See how the www.ccisolutions.com appears after /StoreFront/category/ ? We used to see weird URLs reported in GWT that looked like this, but ever since we fixed our canonical tags to be absolute instead of relative URLs, they no longer appeared in our Webmaster Tools reports.
However, it seems there is still a problem. Where/how could Bingbot be seeing URLs configured this way? Could it be a server issue, or is it most likely a data problem?
Thanks in advance!
Dana
P.S. Could this be resulting from our massive use of relative URLs all over the site?
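For anyone doing the same log review, here is a rough sketch of how entries like the one above could be pulled out of an access log with Python. It assumes the log uses the combined format shown in the sample line; the function name `find_embedded_host_requests` and the regular expression are illustrative only, not part of any tool mentioned in this thread.

```python
import re

# Assumed layout: Apache/NCSA combined log format, as in the sample line above.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def find_embedded_host_requests(lines, host):
    """Return (path, status, referer, agent) for requests whose path contains the hostname."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and host in match.group("path"):
            hits.append((
                match.group("path"),
                match.group("status"),
                match.group("referer"),
                match.group("agent"),
            ))
    return hits

# The log line quoted in the question.
sample_line = (
    '157.55.32.166 - - [04/Mar/2013:08:00:59 -0800] '
    '"GET /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones HTTP/1.1" '
    '200 94133 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"'
)

hits = find_embedded_host_requests([sample_line], "www.ccisolutions.com")
for path, status, referer, agent in hits:
    print(status, path, referer)
```

The referer field is "-" in the sample, so the log alone may not reveal which page carried the bad link; grouping the hits by path can still show whether they cluster around one template.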
-
Hi Streamline,
I thought I would circle back and update everyone on what I found. You were correct that malformed URLs were the culprit. We have many isolated instances of internal links whose relative URL is missing the "/" at the beginning, and the relative URLs are inconsistent all over the site. It's certainly an example of one of the many problems that can be caused by using relative rather than absolute URLs.
Since we are in the process of completely re-doing the site and moving to a new platform, it's something we can definitely work to get right during the transition.
Thanks again to you, Daniel and Keri for jumping in with answers.
Dana
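As a possible way to audit templates for this during the transition, the sketch below flags anchor hrefs that have neither a scheme nor a leading "/". It uses only Python's standard library; the class name `SuspectLinkFinder` and the sample markup are hypothetical, and a document-relative link it flags may be intentional, so treat the output as a list to review rather than a list of errors.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class SuspectLinkFinder(HTMLParser):
    """Collects <a href> values that a browser or crawler would resolve against the current directory."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Absolute URLs (http:, mailto:, ...) have a scheme and are left alone.
        if urlparse(href).scheme:
            return
        # Root-relative, protocol-relative, fragment, and query-only links are left alone.
        if href.startswith(("/", "#", "?")):
            return
        self.suspects.append(href)

# Hypothetical markup standing in for a page template.
sample_html = (
    '<a href="/StoreFront/category/a">root-relative</a>'
    '<a href="StoreFront/category/b">missing leading slash</a>'
    '<a href="www.ccisolutions.com/StoreFront/category/shure-se-earphones">missing scheme</a>'
    '<a href="http://www.ccisolutions.com/">absolute</a>'
)

finder = SuspectLinkFinder()
finder.feed(sample_html)
print(finder.suspects)
```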
-
Thanks to you both Daniel and Streamline.
I believe the problem may have to do with our .htaccess file. I am obtaining a copy of it now.
-
Thanks Keri. That's very helpful. I will do that.
-
Hi Dana,
I agree with Streamline: there is likely a hidden issue in your site where a page links to an under-formed URL (one missing 'http://'). Given that there are a number of them in one day, my guess is that this is happening in a templated page.
Have a look at;
It renders as a page.
The best course of action would be to resolve it at the source. If you can pinpoint when this issue is due to occur next, have your developer get each page to append its URL to the log at the beginning of the page. Then you should be able to determine where the issue is occurring. I am hoping you will see a discernible pattern.
Worst-case scenario, possibly a canonical will work, OR create a regex redirect in .htaccess to handle this URL pattern...
Hope this helps,
Dan
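To make the regex redirect idea concrete, here is a small Python sketch of the pattern matching such a rule would need to do. It is not an .htaccess rule, only a way to test the pattern before handing it to a developer; the function name `redirect_target` is hypothetical and the pattern assumes every bad URL follows the shape of the one in the log.

```python
import re

# Assumption: the stray hostname always appears directly after /StoreFront/category/.
EMBEDDED_HOST = re.compile(r"^/StoreFront/category/www\.ccisolutions\.com(/.*)$")

def redirect_target(path):
    """Return the path a 301 should point to, or None if the path looks normal."""
    match = EMBEDDED_HOST.match(path)
    return match.group(1) if match else None

bad_path = "/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones"
good_path = "/StoreFront/category/shure-se-earphones"

print(redirect_target(bad_path))   # /StoreFront/category/shure-se-earphones
print(redirect_target(good_path))  # None
```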
-
Dana, you might also want to contact Bing at https://support.discoverbing.com/eform.aspx?productKey=bingwebmaster&ct=eformts&scrx=1. I sent a quick note on Twitter to Duane Forrester and that's the URL he provided.
-
Can you tell from which page Bing is trying to access these URLs? And it only happened on the 4th and not on any other day? Could it be an issue with the sitemap on that day?
I'm looking at your site now and the page http://www.ccisolutions.com/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones is returning a 200 response code to me, not a 404 code. The key is to figure out how Bing discovered the URL in the first place...
-
While this is certainly a possibility, I'm not sure it's the cause of the problem. If this were the case, wouldn't it most likely cause a 404 error, instead of rendering the proper page (albeit with a very funky URL) and a 200 status code?
The other thing making me think it's not just a poorly constructed link on the site is that there are over 100 of these in the server log, from just one day.
Thoughts?
-
I'm willing to bet that on some page of your site, there is a link pointing to www.ccisolutions.com/StoreFront/category/shure-se-earphones which is missing the "http://" at the beginning. So if Bing or a user tried to click on that link, they would be directed to /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones instead of the correct link.
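The resolution described above can be reproduced with Python's standard `urllib.parse.urljoin`, which follows the same relative-reference rules browsers and crawlers use. The linking page in this sketch (`some-page`) is a made-up placeholder, since the actual page carrying the link is not known.

```python
from urllib.parse import urljoin

# Hypothetical page containing the link; only the directory part matters here.
base = "http://www.ccisolutions.com/StoreFront/category/some-page"

# An href missing "http://" has no scheme, so it is treated as a relative path.
href = "www.ccisolutions.com/StoreFront/category/shure-se-earphones"

resolved = urljoin(base, href)
print(resolved)
# http://www.ccisolutions.com/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones
```

With the scheme included (`http://www.ccisolutions.com/...`), `urljoin` would return the href unchanged, which is why absolute URLs avoid the problem.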