Is site: a reliable method for getting a full list of indexed pages?
-
The site:domain.com search seems to show fewer pages than it used to (in both Google and Bing).
This isn't specific to one site; I see it on all sites. For example, I will get "page 1 of about 3,000 results", but by the time I've paged through the results, the listing ends and the count changes to "page 24 of 201 results". In that example, GSC shows 1,932 pages indexed.
Should I now accept that the number of pages listed by site: is an unreliable metric?
-
Keep in mind that for a site:domain.com search, Google now includes pages from OTHER SITES that use the canonical tag to point to your site. So, even though it says there are 300 pages indexed, 30 of those pages might be on other sites whose canonical tags point to your site. The number of indexed pages you're looking at may not be entirely accurate because of this.
-
I haven't noticed the page count dropping, but I only use that operator for a general check. I have never paged through all the results. For that, I would use one of the crawler tools. It would be interesting to compare a download of the search results, GSC, and something like Screaming Frog to see what we get.
As soon as I wrote that, I checked our site and realized what you are saying. For Google we get "About 281 results," but when I go to the last page of results it changes to "page 13 of 126 results."
Then, out of curiosity, I tried Bing, and now I am scratching my head: "763 results." When I go to the last possible page I get "247-256 of 256 results." I think that means my 281 results from Google are mostly on Bing! (In case someone does not recognize my humor, that last statement can be read as either jest or sarcasm.)
So, when using site: I get 126 with Google, but Search Console has 428...
Certainly interesting. I will keep playing with it.
Best
-
Hi Robert,
Thanks for your input.
The reason for doing it is that it's part of an SEO site review process: we compare the pages indexed in Google against a site crawl from a tool like Screaming Frog and against the indexed pages reported in GSC.
As for the "page 24 of 201 results" example, I mean that when you first run site:domain.com, Google gives you an estimated number of results, e.g. 3,000, but as you click through the pages you find that the actual number of results is lower, sometimes significantly.
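To put rough numbers on that gap, here is a small sketch using only the figures quoted in this thread. The labels `site_a` (the original question) and `site_b` (the reply above) and the helper `share_of_gsc` are made up for illustration; it simply expresses each site: count as a percentage of the GSC indexed figure.

```python
# Figures quoted in this thread. "estimate" is the first number site: shows,
# "paged" is the number shown on the last results page, "gsc" is the indexed
# count reported by Google Search Console.
observations = {
    "site_a": {"estimate": 3000, "paged": 201, "gsc": 1932},
    "site_b": {"estimate": 281, "paged": 126, "gsc": 428},
}


def share_of_gsc(count: int, gsc: int) -> float:
    """Return count as a percentage of the GSC indexed figure (1 decimal)."""
    return round(100 * count / gsc, 1)


for label, o in observations.items():
    print(
        label,
        "estimate:", share_of_gsc(o["estimate"], o["gsc"]), "% of GSC,",
        "paged:", share_of_gsc(o["paged"], o["gsc"]), "% of GSC",
    )
```

In both cases the paged-through count is well below the GSC figure, while the initial estimate lands above it for one site and below it for the other, so neither site: number tracks GSC consistently.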
-
I am not sure I understand where you say, "...it will end and change to 'page 24 of 201 results.'" I have used the site: operator for a long time and I think it is reasonably accurate. One thing I notice is the occasional "some pages have been ... duplicate" message asking whether you want to see those. So, if you include all of those, what's the magic number?
Is there a reason you need an exact result? I am not aware of anything that would give you that. The question is what counts as "indexed" within the given search engine. If you crawl with Screaming Frog, etc., you may see pages that are not indexed, so the comparison is not apples to apples. I'm just curious what you want the exact number of indexed pages for.
Interesting question.
-
Typically, the site: command in Google is unreliable. There are lots of reasons why, one being that there may be indexed pages that aren't "good enough", for whatever reason, to show up in the search results. When we check a site's indexed pages, we typically run the site: command, click a few pages deep, and use the number shown there (not the first number it shows).
For SEO auditing purposes, we're looking to see if there is a significant difference between the number of pages indexed and the number of pages that we find when we crawl the website ourselves.
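If you can get both lists as plain URLs (for example, a crawler export and an indexed-pages export), the comparison can be sketched roughly like this. The URLs, the `normalize` rules, and the function names are all assumptions for illustration, not a standard; in particular, dropping the query string and trailing slash is a simplification that may not suit every site.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Lowercase the host and drop scheme, query, fragment, and trailing slash.

    These rules are an assumption; adjust them to how your site treats URLs.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return parts.netloc.lower() + path


def compare(crawled, indexed):
    """Compare two URL lists and report what each side is missing."""
    crawled_set = {normalize(u) for u in crawled}
    indexed_set = {normalize(u) for u in indexed}
    return {
        "crawled_not_indexed": sorted(crawled_set - indexed_set),
        "indexed_not_crawled": sorted(indexed_set - crawled_set),
        "overlap": len(crawled_set & indexed_set),
    }


# Hypothetical sample data standing in for the two exports.
crawled = [
    "https://example.com/",
    "https://example.com/about/",
    "https://example.com/contact",
]
indexed = [
    "https://EXAMPLE.com/about",
    "https://example.com/old-page",
]

report = compare(crawled, indexed)
for key, value in report.items():
    print(key, value)
```

Pages that are crawled but not indexed are the ones to check for noindex tags, canonicals, or thin content; pages that are indexed but not crawled are often orphaned or outdated URLs.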