Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
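For (a), one possible approach is a small script rather than a dedicated tool. The sketch below is only an illustration using Python's standard library: it assumes the blogroll sits inside a container whose id or class contains the word "blogroll" (many themes use other names, such as "sites I like") and that the markup is reasonably well nested. The names BlogrollParser, extract_blogroll and the marker parameter are hypothetical, made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collect hrefs of <a> tags nested inside a 'blogroll' container."""

    def __init__(self, base_url, marker="blogroll"):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # assumed substring of the container's id/class
        self.depth = 0        # > 0 while we are inside the container
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                # Resolve relative links against the page URL, skip duplicates.
                url = urljoin(self.base_url, attrs["href"])
                if url not in self.links:
                    self.links.append(url)
        else:
            label = f"{attrs.get('id') or ''} {attrs.get('class') or ''}"
            if self.marker in label.lower():
                self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1


def extract_blogroll(html, base_url):
    """Return the list of link URLs found inside the blogroll container."""
    parser = BlogrollParser(base_url)
    parser.feed(html)
    return parser.links


sample = """
<div id="content"><a href="/about">About</a></div>
<div class="widget blogroll"><ul>
  <li><a href="http://example.com/">Example</a></li>
  <li><a href="http://example.org/blog">Another</a><br></li>
</ul></div>
<a href="/contact">Contact</a>
"""
print("\n".join(extract_blogroll(sample, "http://myblog.example/")))
```

Pages with unclosed tags or a differently named blogroll section would need a more forgiving parser or a different marker value, so treat this as a starting point only.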
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal, which is grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or a URL list.
Thanks!
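To make the export half of that goal concrete, here is a minimal sketch of writing a list of blogs to OPML 2.0 with Python's standard library. The function name build_opml and the (name, site URL, feed URL) tuple layout are assumptions for illustration. Feed readers generally need the feed address (xmlUrl), which is not the same as the blog's home page, so this sketch assumes the feed URL is already known and writes only htmlUrl when it is not.

```python
import xml.etree.ElementTree as ET


def build_opml(entries, title="Blogroll"):
    """Return an OPML 2.0 document as a string.

    entries: iterable of (name, site_url, feed_url) tuples.
    feed_url may be None when the feed address is unknown.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, site_url, feed_url in entries:
        attrs = {"text": name, "htmlUrl": site_url}
        if feed_url:
            # Only mark the outline as a subscription when we have a feed.
            attrs["type"] = "rss"
            attrs["xmlUrl"] = feed_url
        ET.SubElement(body, "outline", attrs)
    return ET.tostring(opml, encoding="unicode")


# Hypothetical example data, not taken from any real blogroll.
blogs = [
    ("Example Blog", "http://example.com/", "http://example.com/feed"),
    ("Another Blog", "http://example.org/blog", None),
]
print(build_opml(blogs))
```

Entries without a feed URL will still be listed in the file, but most readers will not be able to subscribe to them until the feed address is filled in.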
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of it.
-
Hi there,
Well, Keri's response reminded me of this question and the fact that I found a tool for scraping this kind of list.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and (sigh) I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts that can throw something together?