Sudden Increase In Number of Pages Indexed By Google Webmaster When No New Pages Added
-
Greetings MOZ Community:
On June 14th, Google Webmaster Tools indicated an increase in the number of indexed pages, from 676 to 851. No new pages had been added to the domain in the previous month. The number of pages blocked by robots also increased over that period, from 332 (June 1st) to 551 (June 22nd), yet the number of indexed pages still rose to 851.
The following changes occurred between June 5th and June 15th:
-A redesigned version of the site was launched on June 4th, with some links to social media and the blog removed on some pages, but with no new URLs added. The design platform was and is WordPress.
-Google Tag Manager (GTM) code was added to the site.
-Our hosting company made an exception to ModSecurity on our server (for iframes) to allow GTM to function.
In the last ten days my web traffic has declined about 15%. However, the quality of traffic has declined enormously, and the number of new inquiries we get is off by around 65%. Pages viewed per visit have declined from about 2.55 to about 2.
Obviously this is not a good situation.
My SEO provider, a reputable firm endorsed by MOZ, believes the extra 175 pages indexed by Google, pages that do not offer much content, may be causing the ranking decline.
My developer is examining the issue. They think there may be some tie-in with the installation of GTM. They are noticing an additional issue: the site's Contact Us form will not work if the GTM script is enabled. They find it curious that both issues occurred around the same time.
Our domain is www.nyc-officespace-leader.com. Does anyone have any idea why these extra pages are appearing and how they can be removed? Does anyone have experience with GTM causing issues like this?
Thanks everyone!!!
Alan -
Yes, and I appreciate it!
Alan -
I did what I asked you to do.
-
-
-
- in my first post and repeated frequently.
-
-
-
-
Hi Egol:
How did you locate this duplicate or re-published content?
Obviously what you have pointed out is a major source of concern, so I ran a Copyscape search this afternoon for duplicate content and did not locate any of the URLs you mention in the "this" and "this" links above. It appears you entered the URL of the blog post in Google's search bar. Would that work? This method would be pretty slow going with 600 URLs.
Thanks,
Alan -
Those are the 448 URLs from your website that have been filtered.
You should find garbage in them like the results shown below.
Have you done what I have suggested three times above? Do that if you want to identify the problem pages.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
-
www.nyc-officespace-leader.com/wp-content/plugins/...
A description for this result is not available because of this site's robots.txt – learn more.
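For what it's worth, the "description is not available" snippet generally means the URL is in the index even though robots.txt stops Google from crawling it. A Disallow rule blocks crawling, not indexing. Below is a minimal Python sketch for checking which URLs a set of rules blocks. The rules and the example.com URLs are assumptions for illustration, not this site's actual robots.txt or file names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; paste in the real robots.txt lines instead.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /wp-content/plugins/
""".splitlines()


def is_blocked(url, robots_lines, agent="Googlebot"):
    """Return True if the given robots.txt lines disallow crawling of url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    for url in (
        "http://www.example.com/wp-content/plugins/hypothetical/file.js",
        "http://www.example.com/contact/",
    ):
        print(url, "blocked" if is_blocked(url, EXAMPLE_RULES) else "crawlable")
```

If a blocked URL should drop out of the index, the usual approach is to let it be crawled and serve a noindex directive, since Google cannot see a noindex on a page it is not allowed to fetch.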
-
-
Hi Egol:
Thanks for the suggestion.
When I click on _repeat the search with the omitted results included_, I get 448 results, not the entire 859 results. That seems very strange. Some of these URLs have light content, but I don't believe they are dupes. I don't see any content outside our website when I click this.
Am I doing something wrong? I would think the total of 859 would appear, not 447 URLs.
Thanks!!
Alan -
I don't know. You should ask someone who knows a lot about canonicalization.
Did you drill down through all of those indexed pages to see if you can identify all of them?
I've suggested it twice.
-
Hi Egol:
In the context of launching an upgraded site, could the canonicalization have been implemented incorrectly? That could account for the 175 pages of sudden new content, as the thin content has been there for some time.
I am particularly suspicious of canonicalization because there was an issue involving multi-page URLs of property listings when the site was migrated from Drupal to WordPress last summer.
Thoughts?
Thanks, Alan
-
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to their site. I am not quite sure what to do about that.
You can have an attorney demand that they stop, or you can file DMCA complaints. Be careful.
**However, it does not explain the sudden appearance of the 175 pages in Google's index.**
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
Get a spreadsheet that has all of your URLs. Drill down through the SERPs, checking every one of them. Can you account for your pagination? You have a lot of it, and that type of page is usually rubbish in the index. Combine, canonicalize, or get rid of them.
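If checking every result by hand against the spreadsheet is too slow, the comparison can be scripted. This is only a rough Python sketch: it assumes you have exported your known URLs and copied the indexed URLs into two lists yourself, and the `normalize` and `unaccounted` helpers and the example.com URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # SERP listings often omit the scheme
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def unaccounted(known_urls, indexed_urls):
    """Return indexed URLs that do not appear in the list of known URLs."""
    known = {normalize(u) for u in known_urls}
    indexed = {normalize(u) for u in indexed_urls}
    return sorted(indexed - known)


if __name__ == "__main__":
    # Hypothetical data; replace with your spreadsheet export and SERP list.
    known = ["http://www.example.com/", "http://www.example.com/listings/"]
    indexed = ["www.example.com/listings", "http://example.com/tag/soho/page/2/"]
    for url in unaccounted(known, indexed):
        print("Not in spreadsheet:", url)
```

Whatever the script prints is the set of indexed pages you cannot account for, which is where the pagination and other rubbish tends to show up.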
-
-
Hi Egol:
Thanks so much for taking the time for your thorough response!!
Apparently infitter24.rssing.com/chan-13023009/all is poaching my content, taking my original content and adding it to their site. I am not quite sure what to do about that.
You have pointed out something very useful, and I appreciate it and will act upon it. However, it does not explain the sudden appearance of the 175 pages in Google's index, which were not there at the end of May and somehow coincided with the upload of the new version of our website in early June. Any ideas?
Thanks,
Alan -
-
Do this query: site:www.nyc-officespace-leader.com
-
Start drilling down the SERPs. One page at a time. Look for content that you didn't make. Look for duplicates.
-
When you drill down about 44 pages you will find this...
In order to show you the most relevant results, we have omitted some entries very similar to the 440 already displayed.
If you like, you can repeat the search with the omitted results included.
The bad stuff is usually behind that link. Google doesn't want to show that stuff to people. It could be thin, it could be duplicate, it could be spammy, they just might not like it.
- Find out what is in there.
Possible problems that I see....
I see dupe content like this and this. Either your guys are grabbin' somebody else's content or they are grabbin' yours. That can get you in trouble with Panda. You need original and unique. Anything that is not original and unique should be deleted, noindexed or rewritten.
A lot of these pages are really skimpy. Thin content can get you into trouble with Panda. Anything that is skimpy should be deleted, noindexed or beefed up.
I see multiple links to tags on lots of these posts. That can cause duplicate content problems.
The tag pages are paginated with just a few pages on each. These can generate extra pages that are low value, suck up your link juice or compound duplicate content problems.
You have archive pages and category pages, and more pagination problems.
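To get a feel for how much of the index is made up of these page types, something like the Python sketch below can bucket a list of indexed URLs. It assumes default WordPress URL bases (/tag/, /category/, /page/N/); if the site uses custom bases, the patterns need adjusting. The labels and example.com URLs are made up for illustration.

```python
import re
from collections import Counter

# Checked in order, so a paginated tag page is counted as pagination.
PATTERNS = [
    ("plugin file", re.compile(r"/wp-content/plugins/")),
    ("pagination", re.compile(r"/page/\d+/?$")),
    ("tag archive", re.compile(r"/tag/")),
    ("category archive", re.compile(r"/category/")),
]


def classify(url):
    """Return the first matching label for url, or 'other'."""
    for label, pattern in PATTERNS:
        if pattern.search(url):
            return label
    return "other"


def summarize(urls):
    """Count how many URLs fall into each bucket."""
    return Counter(classify(u) for u in urls)


if __name__ == "__main__":
    # Hypothetical URLs; replace with the list collected from the SERPs.
    sample = [
        "http://example.com/tag/soho/",
        "http://example.com/tag/soho/page/2/",
        "http://example.com/category/news/",
        "http://example.com/about/",
    ]
    for label, count in summarize(sample).most_common():
        print(label, count)
```

A large "pagination" or "tag archive" count relative to "other" would suggest those are the pages to combine, canonicalize, or noindex first.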