What crawler do you recommend for finding orphaned pages on a website?
-
Is there a crawler that you guys recommend for finding all pages, including orphaned pages on a website? A data export is not feasible. I saw a question from back in 2013 and was wondering if anything has changed since then in regards to crawling orphaned pages. Do most enterprise systems already have this built into their crawler? Or is it best to get a crawler like Xenu or Screaming Frog or Deepcrawl?
-
Hi there!
i agree with Patrick. I was going to recommend using Screaming Frog or Google Search Console! Let me know if you try these, don't like them, and need another recommendation.
-
Hi there
I really like ScreamingFrog but I also really like Search Console and Moz Pro. The reason being, I like having different sets of data because they are all different. I also like seeing if pages are being linked to randomly from other sources other than my own website which Search Console does a great job (and so does Majestic or Ahrefs). Different sources find different things so it's nice to get other opinions on what you might have out there floating around.
Just my two cents! Hope this helps!
Patrick
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My website is constantly decreasing
For few weeks ago my website is constantly decreasing in search position. I lost keywords and is gooooing down.
Technical SEO | | Dan_Tala
Although it is well rated on several on page and off page seo verification software that I have tried.
I checked Google search console and Analytics and found no major problems. However… from one day to another it keeps going down.
I also checked what the main competitors are doing and they are not doing well, at all.
The main competitor actually has a creepy website. Totally devoid of onpage or offpage SEO but with an enormous number of backlinks. And of a very bad quality, which should disqualify it, still…
Few weeks ago I changed something.
In the pages I had H1, 4xH2, no H3 and an H4 without content.
An unnatural H tag structure.
Now I have H1, H2, H3, 3xH4, with the coherent information.
Theoretically, Google should have been “happy” or I’m missing something. I use a SAAS platform.
I just found out that they made changes to the keywords (tags).
I am selling toner cartridges for printers.
So…
The tags are printer models and generate a url in which they have the products.
Ex. https://www.sertit.ro/cartus-imprimanta-cilindru-color-hp-laserjet-pro-m-177fw goes to the products for that printer model.
The question is… should I make tag canonical?
Is it possible for products to loose so much in Google search?0 -
Home Pages of Several Websites are disappearing / reappearing in Google Index
Hi, I periodically use the Google site command to confirm that our client's websites are fully indexed. Over the past few months I have noticed a very strange phenomenon which is happening for a small subset of our client's websites... basically the home page keeps disappearing and reappearing in the Google index every few days. This is isolated to a few of our client's websites and I have also noticed that it is happening for some of our client's competitor's websites (over which we have absolutely no control). In the past I have been led to believe that the absence of the home page in the index could imply a penalty of some sort. This does not seem to be the case since these sites continue to rank the same in various Google searches regardless of whether or not the home page is listed in the index. Below are some examples of sites of our clients where the home page is currently not indexed - although they may be indexed by the time you read this and try it yourself. Note that most of our clients are in Canada. My questions are: 1. has anyone else experienced/noticed this? 2. any thoughts on whether this could imply some sort of penalty? or could it just be a bug in Google? 3. does Google offer a way to report stuff like this? Note that we have been building websites for over 10 years so we have long been aware of issues like www vs. non-www, canonicalization, and meta content="noindex" (been there done that in 2005). I could be wrong but I do not believe that the site would keep disappearing and reappearing if something like this was the issue. Please feel free to scrutinize the home pages to see if I have overlooked something obvious - I AM getting old. site:dietrichlaw.ca - this site has continually ranked in the top 3 for [kitchener personal injury lawyers] for many years. site:burntucker.com - since we took over this site last year it has moved up to page 1 for [ottawa personal injury lawyers] site:bolandhowe.com - #1 for [aurora personal injury lawyers] site:imranlaw.ca - continually ranked in the top 3 for [mississauga immigration lawyers]. site:canadaenergy.ca - ranks #3 for [ontario hydro plans] Thanks in advance! Jim Donovan, President www.wethinksolutions.com
Technical SEO | | wethink0 -
Issue with Cached pages
I have a client who has a three domains:
Technical SEO | | paulbaguley
budgetkits.co.uk
prosocceruk.co.uk
cheapfootballkits.co.uk Budget Kits is not active but Pro Soccer and Cheap Football Kits are. The issue is when you do site:budgetkits.co.uk on Google it brings back results. If you click on the link it goes to page saying website doesn't exist which is correct but if you click on cached it shows you a page from prosocceruk.co.uk or cheapfootballkits.co.uk. The cached pages are very recent by a couple of days ago to a week. The first result brings up www.budgetkits.co.uk/rainwear but the cached page is www.prosocceruk.co.uk/rainwear The third result brings up www.budgetkits.co.uk/kids-football-kits but the cached page is http://www.cheapfootballkits.co.uk The history of this issue is that budgetkits.co.uk was its own website 7 years ago and then it used to point at prosocceruk.co.uk after that but it no longer does for about two months. All files have been deleted from budgetkits.co.uk so it is just a domain. Any help with this would be very much appreciated as I have not seen this kind of issue before.0 -
Internal link structure, find out if there are any internal links to this page
When i use this url in open site explorer it says that there are no internal links:
Technical SEO | | wilcoXXL
http://goo.gl/d2s6tJ
Page Authority is also 1, it should be higher of there are any internal links to it right? But i am very sure there are links to this url on my website. For example on this URL:
http://goo.gl/ucixRH How certain can i be of this? Because if i can be very certain, than we have a internal linkstructure problem on our entire site i believe.0 -
Block bad crawlers
Hi! how are you? I've been working on some of my sites, and noticed that i'm getting lots of crawls by search engines that i'm not intereted in ranking well. My question is the following: do you have a list of 'bad behaved' search engines that take lots of bandwidth and don´t send much/good traffic? If so, do you know how to block them using robots.txt? Thanks for the help! Best wishes, Ariel
Technical SEO | | arielbortz0 -
My website pages are not crawled, what to do?
Hi all. I have made some changes on the website so i like to crawled them by the search engines Google especially. I have made these changes around 2 weeks ago. I have submitted my website on good bookmarking websites. Also i used a tool available in Google webmasters "Fetch as Google", Resubmitted a sitemap.xml. Still my pages are not crawled your opinion please. Thanks
Technical SEO | | lucidsoftech0 -
Ecommerce website: Product page setup & SKU's
I manage an E-commerce website and we are looking to make some changes to our product pages to try and optimise them for search purposes and to try and improve the customer buying experience. This is where my head starts to hurt! Now, let's say I am selling a T shirt that comes in 4 sizes and 6 different colours. At the moment my website would have 24 products, each with pretty much the same content (maybe differing references to the colour & size). My idea is to change this and have 1 main product page for the T-shirt, but to have 24 product SKU's/variations that exist to give the exact product details. Some different ways I have been considering to do this: a) have drop-down fields on the product page that ask the customer to select their Tshirt size and colour. The image & price then changes on the page. b) All product 24 product SKUs sre listed under the main product with the 'Add to Cart' open next to each one. Each one would be clickable so a page it its own right. Would I need to set up a canonical links for each SKU that point to the top level product page? I'm obviously looking to minimise duplicate content but Im not exactly sure on how to set this up - its a big decision so I need to be 100% clear before signing off on anything. . Any other tips on how to do this or examples of good e-commerce websites that use product SKus well? Kind regards Tom
Technical SEO | | DHS_SH0 -
I am trying to correct error report of duplicate page content. However I am unable to find in over 100 blogs the page which contains similar content to the page SEOmoz reported as having similar content is my only option to just dlete the blog page?
I am trying to correct duplicate content. However SEOmoz only reports and shows the page of duplicate content. I have 5 years worth of blogs and cannot find the duplicate page. Is my only option to just delete the page to improve my rankings. Brooke
Technical SEO | | wianno1680