Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and so on.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize their user agent strings, URLs and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs. Those sites often get penalized by Google, but third-party crawlers have no way to know this, so the blocking effectively keeps penalized sites out of the third-party indexes.
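The three rules listed in this reply (crawl politely, obey robots.txt, identify yourself) can be sketched in Python with the standard library's `urllib.robotparser`. This is only a minimal illustration, not how any of the crawlers discussed here is actually implemented. The bot name `ExampleLinkBot`, its info URL and the sample robots.txt are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical bot token and info URL, for illustration only.
BOT_TOKEN = "ExampleLinkBot"
USER_AGENT = "Mozilla/5.0 (compatible; ExampleLinkBot/1.0; +https://example.com/bot)"

# Hypothetical robots.txt of a site that shuts this bot out entirely.
SAMPLE_ROBOTS = """\
User-agent: ExampleLinkBot
Disallow: /

User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, url: str) -> bool:
    """Obey robots.txt: only fetch what the site allows for our token."""
    return parser.can_fetch(BOT_TOKEN, url)


def polite_delay(parser: RobotFileParser, default: float = 1.0) -> float:
    """Crawl politely: honor Crawl-delay if given, else use a default pause."""
    delay = parser.crawl_delay(BOT_TOKEN)
    return float(delay) if delay is not None else default


def request_headers() -> dict:
    """Identify yourself: always send the real, documented user agent."""
    return {"User-Agent": USER_AGENT}
```

With `SAMPLE_ROBOTS` loaded, `may_crawl` returns `False` for every URL on that site, which is exactly the situation the question describes: a well-behaved crawler simply stays out.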
-
Well, I think bot blocking is an obvious problem even now, and it will matter more tomorrow as private networks multiply.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor a welcome solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever), that could be detected, and they would be castigated. The crawler's IP address and the user agent it claims would be out of sync...
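The "out of sync" check mentioned here is commonly done with a forward-confirmed reverse DNS lookup. The sketch below is one way a site owner might do it in Python. It is an assumption about how such a check could look, not a description of any particular site's setup. The hostname suffixes follow Google's published guidance for verifying Googlebot as far as I know. The function names are hypothetical, and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Suffixes assumed from Google's published Googlebot verification guidance.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the primary hostname registered for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> list:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_spoofed_googlebot(ip, user_agent,
                         reverse=reverse_lookup, forward=forward_lookup):
    """True when a request claims to be Googlebot but its IP does not verify."""
    if "googlebot" not in user_agent.lower():
        return False  # makes no Googlebot claim, so nothing to verify
    try:
        host = reverse(ip)
        if not host.lower().endswith(GOOGLE_SUFFIXES):
            return True  # IP belongs to someone else: out of sync
        # The hostname must resolve back to the same IP.
        return ip not in forward(host)
    except OSError:
        # Lookup failures (socket.herror, socket.gaierror) mean the
        # claim cannot be confirmed, so treat it as spoofed.
        return True
```

A third-party crawler sending Google's user agent from its own servers would fail the suffix test at once, which is why spoofing would be easy to spot and to publicize.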