Tool that can retrieve mysite URLs
-
Hi,
I am not talking about Ahrefs, Open Site Explorer, Majestic, etc.
I have a list of 1,000 site URLs where my site name is mentioned. For each URL I query, I want to get the exact URL of my site that appears on that page, listed next to the URL I queried.
Example:
http://moz.com/community is the URL I have, and if this page mentions my site name, then I need to capture the complete URL of my site from it.
Is there any software or tool that can do this? I definitely used one that got me this info, but I no longer remember which one it was.
Thanks
-
Or a crawl test with the Moz Pro tools.
-
Yes, I forgot that he already had the list of 1,000 sites. Xenu's Link Sleuth would be another option, and it's free.
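Since the list of pages is already in hand, the check could also be scripted. Here is a rough sketch in Python (standard library only), not a finished tool. It assumes that "mentioned" means the page carries an actual link to the site, and `mysite.com`, `LinkCollector`, `links_to_domain` and `check_pages` are made-up names for illustration. A plain-text mention with no link would not be found this way.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_domain(html, page_url, my_domain):
    """Return absolute link URLs in `html` whose host is my_domain or a subdomain of it."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(page_url, href)
        host = (urlparse(absolute).hostname or "").lower()
        if host == my_domain or host.endswith("." + my_domain):
            found.append(absolute)
    return found


def check_pages(page_urls, my_domain):
    """Fetch each page; map it to its links to my_domain, or None if the fetch failed."""
    results = {}
    for page_url in page_urls:
        try:
            request = Request(page_url, headers={"User-Agent": "Mozilla/5.0"})
            with urlopen(request, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except (OSError, ValueError):
            # Network errors, HTTP errors, timeouts, or a malformed URL.
            results[page_url] = None
            continue
        results[page_url] = links_to_domain(html, page_url, my_domain)
    return results
```

Calling `check_pages(["http://moz.com/community"], "mysite.com")` would return a dictionary keyed by the queried URL, with the list of matching site URLs next to it. For 1,000 pages it would be worth adding a short pause between requests and writing the results out to a CSV.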
-
That would show what's indexed (which is most pages), but not all pages.
-
Do a Google search for "yourdomain.com" and then use a scraper tool to put the results into a Google Doc. Here's Seer Interactive's tool: http://www.seerinteractive.com/blog/google-scraper-in-google-docs-update
-
The Screaming Frog SEO Spider tool should be able to help you with this. However, to crawl more than its 500-URL limit, you will need to purchase a licence key.
http://www.screamingfrog.co.uk/seo-spider/
Good luck.
Regards,
Vahe