How to Safely Scrape Google Results?
-
I've built a couple of small tools that I use personally, maybe 2 or 3 times per day.
Both tools scrape the top 10 results from Google and provide more details about each domain (like the SEOMoz Keyword Difficulty Tool).
Google seems to have banned my IP address for automated searches. Can anyone tell me a safe way of scraping the Google results? Is there a suitable API for this?
How does SEOmoz do this on such a huge scale?
-
I doubt the APIs have improved considerably since this blog post (http://www.seomoz.org/blog/the-nasty-problem-with-scraping-results-from-the-engines), so scraping Google is still a big issue, and still necessary for our daily SEO work.
Scraping safely can only work if you succeed in convincing Google that you're a "natural" user and not a scraping robot. How can you do that?
- Search with alternating IPs from different locations, using proxies in the countries you'd like to scrape from
- Don't send too many requests at once from the same source
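To make the two points above concrete, here is a minimal Python sketch of rotating proxies and spacing requests out. The proxy addresses, the function names and the 5 to 15 second delay range are all my own assumptions for illustration, not values known to be safe. It only plans the requests and does not send anything.

```python
import itertools
import random

# Hypothetical proxy endpoints (assumption); replace with proxies you control.
PROXIES = [
    "http://proxy-us.example.com:8080",
    "http://proxy-gb.example.com:8080",
    "http://proxy-de.example.com:8080",
]


def random_delay(min_seconds=5.0, max_seconds=15.0):
    """Pick a randomized pause so requests are not sent in a burst."""
    return random.uniform(min_seconds, max_seconds)


def schedule(queries, proxies, min_seconds=5.0, max_seconds=15.0):
    """Pair each query with the next proxy (round-robin) and a delay to wait first."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    rotation = itertools.cycle(proxies)
    return [
        (query, next(rotation), random_delay(min_seconds, max_seconds))
        for query in queries
    ]
```

A caller would loop over the returned plan, wait with `time.sleep(delay)`, and then send each query through its assigned proxy, so that no single source fires several requests back to back.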
Consider that, when requesting a URL, the browser sends various pieces of information to the server, such as your operating system, browser version, and referer. Every one of these can and should be varied to effectively change your identity when executing a new search.
- Change browsers, browser versions, operating system information, etc.
- Take care when changing browser localization values (en-GB and en-US probably don't return the same results)
- Have a good network of proxy servers ready, so that requests with your different identities can be sent through them
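As a rough illustration of changing identity between searches, the Python sketch below builds a fresh set of request headers each time. The User-Agent strings, the `build_headers` name and the locale table are assumptions made up for this example; real values should be copied from actual browsers.

```python
import random

# Illustrative User-Agent strings (assumption); copy current ones from real browsers.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/16.5 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36",
]

# Accept-Language values per locale; en-GB and en-US may return different results.
ACCEPT_LANGUAGES = {
    "en-US": "en-US,en;q=0.9",
    "en-GB": "en-GB,en;q=0.9",
}


def build_headers(locale, referer=None):
    """Return headers with a random browser identity and an explicit locale."""
    headers = {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Language": ACCEPT_LANGUAGES[locale],  # KeyError on unknown locale
    }
    if referer is not None:
        headers["Referer"] = referer
    return headers
```

The locale is deliberately a required argument rather than a random choice: the browser and operating system can vary freely, but the localization value changes which results come back, so it should stay fixed for any set of rankings you want to compare. If you use the `requests` library, the returned dictionary can be passed as `headers=` alongside a `proxies=` mapping.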