SEO Content Audit Questions (removing pages from a website, extracting data, organizing data)
-
Hi everyone!
I have a few questions. We are running an SEO content audit on our entire website, and I am wondering what the best FREE way is to extract a list of all indexed pages. Would I need to use a mix of Google Analytics, Webmaster Tools, AND our XML sitemap, or could I just use Webmaster Tools to pull the full list? I just want to make sure I am not missing anything.
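For anyone comparing the three sources, here is a minimal Python sketch of one way to combine them, assuming you can get each source as a plain list of URLs. The function names, the sample sitemap, and the example URLs are all hypothetical; the only fixed detail is the standard sitemap XML namespace. It shows which sources each URL appears in, so pages missing from one source stand out.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the set of <loc> URLs found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def merge_sources(**sources):
    """Map each URL to the sorted list of source names it appears in."""
    merged = {}
    for name, urls in sources.items():
        for url in urls:
            merged.setdefault(url, set()).add(name)
    return {url: sorted(names) for url, names in merged.items()}


# Hypothetical sample data standing in for real exports.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

sitemap_urls = urls_from_sitemap(sample_sitemap)
merged = merge_sources(
    sitemap=sitemap_urls,
    analytics={"https://example.com/old-page"},
    search_console={"https://example.com/"},
)

for url, found_in in sorted(merged.items()):
    print(url, found_in)
```

A URL that shows up only under `analytics` in this sketch would be a page that gets visits but is missing from the sitemap, which is the kind of gap a single source would hide.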
Also, once the data is pulled and organized (it would be helpful to know the best way to pull detailed info about the pages as well!), I am wondering if it would be a best practice to sort by the most highly trafficked pages in order to rank them for prioritization (i.e., pages with the most visits will be edited and optimized first).
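As a rough illustration of the sorting step, the sketch below ranks pages by visits from a CSV export. The column names `url` and `visits` and the sample rows are assumptions for illustration, not the format of any particular analytics tool.

```python
import csv
import io


def rank_by_visits(csv_text):
    """Parse a CSV with 'url' and 'visits' columns, highest visits first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["visits"] = int(row["visits"])  # visits arrive as text in a CSV
    return sorted(rows, key=lambda r: r["visits"], reverse=True)


# Hypothetical export: one row per page.
sample = "url,visits\n/about,120\n/,950\n/old-page,3\n"
ranked = rank_by_visits(sample)

for position, row in enumerate(ranked, start=1):
    print(position, row["url"], row["visits"])
```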
Lastly, I am wondering what constitutes a 'removable' page. For example, when is it appropriate to fully remove a page from our website? I understand that if you need to remove a page, it is best to redirect the visitor to another similar page OR the homepage. Is this the best practice? Thank you for the help!
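To make the redirect idea concrete, here is a small sketch of building a redirect map for removed pages: each removed URL points to a similar page when one is known and falls back to the homepage otherwise. The function name, the paths, and the choice of "/" as the fallback are hypothetical, and whether the homepage is a good fallback is exactly the open question above.

```python
def build_redirects(removed, similar, fallback="/"):
    """Map each removed path to its similar page, or to the fallback."""
    return {old: similar.get(old, fallback) for old in removed}


# Hypothetical removed pages and known replacements.
removed_pages = ["/2012-sale", "/old-pricing", "/temp-landing"]
similar_pages = {"/2012-sale": "/sale", "/old-pricing": "/pricing"}

redirects = build_redirects(removed_pages, similar_pages)

for old, new in redirects.items():
    print(old, "->", new)
```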
If you say it is best to organize by the most trafficked pages first in order to optimize them, I am wondering if it would be an easier process to use Moz tools like Keyword Explorer, Page Optimization, and Page Authority to rank pages and find ways to optimize them for the most relevant keywords. Let me know if this option makes MORE sense than going through the entire data extraction process.
-
Have you looked at this post, Sunday? I believe it will answer most, if not all, of your questions.
Related Questions
-
Dropdown content on page being crawled
Hi, will the content within a dropdown on a page be crawled? I.e., if the page visitor has to click to reveal the content as a dropdown, will it be crawled by bots? Thanks
Technical SEO | | BillSCC1 -
Spammy Structured Data Markup Removal
Hi There, I'm in a weird situation and I am wondering if you can help me. Here we go, We had some of our developers implement structured data markup on our site, and they obviously did not know what they were doing. They messed up our results in the SERP big time and we wound up getting manually penalized for it. We removed those markups and got rid of that penalty (phew), however now we are still stuck with two issues. We had some pages that we changed their URLs, so the old URLs are now dead pages getting redirected to the newer version of the same old page, however, two things now happened: a) for some reason two of the old dead pages still come up in the Google SERP, even though it's over six weeks since we changed the URLs. We made sure that we aren't linking to the old version of the url anywhere from our site. b) those two old URLs are showing up in the SERP with the old spammy markup. We don't have anywhere to remove the markup from cause there are no such pages anymore so obviously there isn't this markup code anywhere anymore. We need a solution for getting the markup out of the SERP. We thought of one idea that might help - create new pages for those old URLs, and make sure that there is nothing spammy in there, and we should tell google not to index these pages - hopefully, that will get Google to de-index those pages. Is this a good idea, if yes, is there anything I should know about, or watch out for? Or do you have a better one for me? Thanks so much
Technical SEO | | Joseph-Green-SEO0 -
Drop in Indexed Page + Organic Traffic
Hey Moz Community, I've been seeing a steady decrease in Search Console in the number of pages being indexed by Google for our eCommerce site. This corresponds to lower impressions and traffic in general this year. We started with around a million pages being indexed in Nov of 2015, down to 18,000 pages this Nov. I realized that since we don't have around 3,000 or so products year round this is most likely a good thing. I've checked to make sure our main landing pages are being indexed, which they are, and our sitemap was updated several times this year, although we're in the process of updating it again to resubmit. I also checked our robots.txt and there's nothing out of the ordinary. In the last month we've gotten rid of some duplicate content issues caused by pagination by using canonical tags, but that's all we've done to reduce the number of pages crawled. We have seen some soft 404s and some server errors coming up in our crawl error report that we've either fixed or are trying to fix. I'm not really sure where to start looking to find a solution to the problem, or if it's even a huge issue, but the drop in traffic is also not great. The drop in traffic corresponded to a loss in rankings as well, so the two may or may not be correlated. Any ideas here?
Technical SEO | | znotes0 -
My SEO friend says my website is not being indexed by Google considering the keywords he has placed in the page and URL what does that mean?
My SEO friend says my website is not being indexed by Google, considering the keywords he has placed in the page and URL. What does that mean? We have added some text to the pages with keywords that are related to each page.
Technical SEO | | AlexisWithers0 -
My website's pages are not being indexed correctly
Hi, One of our websites, which is actually a price comparison engine, is facing an indexing problem on Google. When we check "site:mywebsite.com", there are lots of pages indexed which are not from mywebsite.com but from merchants' websites. The index result page also shows the merchant's page title. In some cases the title is from the merchant's site, but when the given link is accessed it points to mywebsite.com/index. Also, the cache displays the merchant's product page as the last indexed version rather than showing ours. mywebsite.com has quite a few merchants that send us their product feeds. Those products are listed on the comparison page with prices. The merchants' links on the comparison page are all no-follow links, but some (not all) of the merchants' product pages are indexed against mywebsite.com as mentioned above, instead of the product comparison page of mywebsite.com. How can we fix the issue? Thanks!
Technical SEO | | digitalMSB0 -
Question about breaking out content from one site onto many
We have a website and domain -- which is well-established (since 1998) -- that we are considering breaking apart for business reasons. This is a content site that hosts articles from a few of our brands in portal fashion. These brands are represented in print with their own magazines, so it's important to keep their presence separate. All of the content on the site is related to a general industry, with each brand covering a unique segment in the industry. For example, think of a toy industry site that hosts content from its brands covering stuffed animals, electronics and board games. The current thinking is to break out the content from a couple of brands to their own sites and domains. The business case for this is branding purposes. I'm of the opinion that this is a bad idea, as we would likely see a noticeable decline in search traffic across the board, which we rely on for impressions for our advertisers. If we take the appropriate steps to carefully redirect pages to the new domains, what kind of hit should we expect to take from this transition? Would it make much difference if we were to transition from 1 to 2 sites vs 1 to 4? Should this move be avoided altogether? Any advice would be appreciated.
Technical SEO | | accessintel0 -
Development Website Duplicate Content Issue
Hi, We launched a client's website around 7th January 2013 (http://rollerbannerscheap.co.uk). We originally constructed the website on a development domain (http://dev.rollerbannerscheap.co.uk) which was active for around 6-8 months (the dev site was unblocked from search engines for the first 3-4 months, but then blocked again) before we migrated dev --> live. In late Jan 2013 I changed the robots.txt file to allow search engines to index the website. A week later I accidentally logged into the DEV website and also changed the robots.txt file to allow the search engines to index it. This obviously caused a duplicate content issue as both sites were identical. I realised what I had done a couple of days later and blocked the dev site from the search engines with the robots.txt file. Most of the pages from the dev site had been de-indexed from Google apart from 3: the home page (dev.rollerbannerscheap.co.uk) and two blog pages. The live site has 184 pages indexed in Google. So I thought the last 3 dev pages would disappear after a few weeks. I checked back in late February and the 3 dev site pages were still indexed in Google. I decided to 301 redirect the dev site to the live site to tell Google to rank the live site and to ignore the dev site content. I also checked the robots.txt file on the dev site and this was blocking search engines too. But still the dev site is being found in Google wherever the live site should be found. When I do find the dev site in Google it displays this: Roller Banners Cheap » admin dev.rollerbannerscheap.co.uk/ A description for this result is not available because of this site's robots.txt – learn more. This is really affecting our client's SEO plan and we can't seem to remove the dev site or rank the live site in Google. In GWT I have tried to remove the sub domain. When I visit remove URLs, I enter dev.rollerbannerscheap.co.uk but then it displays the URL as http://www.rollerbannerscheap.co.uk/dev.rollerbannerscheap.co.uk.
I want to remove a sub domain not a page. Can anyone help please?
Technical SEO | | SO_UK0 -
Is it possible to change a sitelink title by off page SEO?
Hi all, I checked a website of my company: sitelinks in SERP are with the correct url, but one of the sitelinks’ title is completely irrelevant. Is it possible that it was changed from "outside"? Or maybe it's a bug? Thank you, Imre
Technical SEO | | DDL0