Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
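For anyone who wants to script part (a), here is a minimal Python sketch using only the standard library. It assumes the blogroll sits inside an element whose `id` or `class` contains the word "blogroll", which will not hold on every site, and it assumes reasonably well-formed HTML. The names `BlogrollParser`, `extract_blogroll`, and the `marker` parameter are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collects external links found inside a container marked as a blogroll."""

    def __init__(self, base_url, marker="blogroll"):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # assumption: appears in the container's id/class
        self.depth = 0        # > 0 while we are inside the blogroll container
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth == 0:
            label = ((attrs.get("id") or "") + " " + (attrs.get("class") or "")).lower()
            if self.marker in label and tag not in VOID_TAGS:
                self.depth = 1
            return
        if tag not in VOID_TAGS:
            self.depth += 1
        href = attrs.get("href")
        if tag == "a" and href:
            url = urljoin(self.base_url, href)
            parts = urlparse(url)
            is_external = parts.netloc != urlparse(self.base_url).netloc
            if parts.scheme in ("http", "https") and is_external and url not in self.links:
                self.links.append(url)

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1


def extract_blogroll(html, base_url, marker="blogroll"):
    parser = BlogrollParser(base_url, marker)
    parser.feed(html)
    return parser.links


sample = """
<div id="content"><a href="http://other.example/post">a post link</a></div>
<div class="widget blogroll"><ul>
<li><a href="http://blog-one.example/">Blog One</a></li>
<li><a href="http://blog-two.example/">Blog Two</a><br></li>
<li><a href="/about">About</a></li>
</ul></div>
<div id="footer"><a href="http://footer.example/">Footer</a></div>
"""
print(extract_blogroll(sample, "http://myblog.example/"))
```

Sites that label the section "sites I like" or similar would need a different `marker`, and pages with unclosed tags can throw off the depth counter, so treat the result as a starting list to review by hand.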
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal: grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
thanks!
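To make the export half concrete, here is a rough Python sketch of writing a list of sites out as OPML with the standard library. The `to_opml` function and its `(name, html_url, feed_url)` input shape are assumptions for illustration. Note that a feed reader needs the feed address in `xmlUrl` to subscribe; a blogroll only gives you the site address, so finding each feed is a separate step that this sketch does not attempt.

```python
import xml.etree.ElementTree as ET


def to_opml(sites, title="Blogroll"):
    """Build an OPML document from (name, html_url, feed_url_or_None) tuples."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, html_url, feed_url in sites:
        attrs = {"text": name, "htmlUrl": html_url}
        if feed_url:
            # Only entries with a known feed can actually be subscribed to.
            attrs.update(type="rss", xmlUrl=feed_url)
        ET.SubElement(body, "outline", attrs)
    xml_body = ET.tostring(opml, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body


# Hypothetical input: one site with a known feed, one without.
sites = [
    ("Blog One", "http://blog-one.example/", "http://blog-one.example/feed"),
    ("Blog Two", "http://blog-two.example/", None),
]
print(to_opml(sites))
```

Using `ElementTree` rather than string concatenation means ampersands and quotes in blog names or URLs are escaped automatically.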
-
Nothing I saw there would do this. It looks like it could list all external links, and I suppose you could manually pick the blogroll out of that list.
-
Hi there,
Well, Keri's response reminded me of this question and the fact that I found a tool for scraping this kind of list.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and <sigh>I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?</sigh>