Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt, IP blocking and such.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could use some kind of randomness in user agent strings, URLs, and IPs in order to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
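The three rules above can be sketched in a few lines of Python using the standard library's robots.txt parser. This is only a minimal sketch, not how OSE, Ahrefs or Majestic actually crawl: the bot name `ExampleLinkBot`, its info URL and the one-second default delay are made-up assumptions for illustration.

```python
import urllib.request
import urllib.robotparser

# Hypothetical bot identity (an assumption, not any real crawler's string).
BOT_TOKEN = "ExampleLinkBot"
USER_AGENT = "ExampleLinkBot/1.0 (+https://example.com/bot-info)"


def build_parser(robots_txt):
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, url):
    """Obey robots.txt: only crawl URLs the site allows for this bot."""
    return parser.can_fetch(BOT_TOKEN, url)


def polite_delay(parser, default=1.0):
    """Crawl politely: honor Crawl-delay if set, else an assumed default."""
    delay = parser.crawl_delay(BOT_TOKEN)
    return float(delay) if delay is not None else default


def build_request(url):
    """Identify yourself: send the bot's own user agent, not someone else's."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

A crawler built this way would call `may_fetch` before every request and sleep for `polite_delay` seconds between requests to the same host. A site that disallows `ExampleLinkBot` simply never gets crawled, which is exactly the blocking the question describes.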
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs. Those sites often get penalized by Google, but 3rd party crawlers have no way to know this, so blocking effectively keeps them out of the indexes.
-
Well, I think bot blocking is an obvious problem even now, and it will become more important in the future with all the private networks, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the desired solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
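To make the "out of sync" point concrete, here is a rough sketch of the reverse-then-forward DNS check a site owner can run against a visitor that claims to be Googlebot. Treat it as an illustration under assumptions: the suffix list may be incomplete, the function names are invented for this example, and the lookups are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Hostname suffixes assumed to belong to Google's crawlers (may be incomplete).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Return the hostname that the IP address resolves back to."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host):
    """Return the IPv4 addresses that the hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Check that the IP maps to a Google hostname and back to the same IP."""
    try:
        host = reverse(ip)
    except OSError:
        return False  # no reverse DNS record at all
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False  # user agent says Googlebot, but the IP is someone else's
    try:
        return ip in forward(host)  # guards against a forged reverse record
    except OSError:
        return False
```

A third-party crawler that sends Googlebot's user agent string from its own IP range fails the suffix check, so the spoof is cheap to detect and easy to publicize.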