How Google Carwler Cached Orphan pages and directory?
-
I have website www.test.com
I have made some changes in live website and upload it to "demo" directory (which is recently created) for client approval.
Now, my demo link will be www.test.com/demo/
I am not doing any type of link building or any activity which pass referral link to www.test.com/demo/
Then how Google crawler find it and cached some pages or entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). Didn't exclude the test site in robots because "we all know we won't link to it so it'll be ok". Site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which person didn't think would be indexed) and posted the URL to the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is google crawling our mails?
Is it possible?
-
Yup, correct.
I was certain I'd replied to this
Anyway, you ever notice how the ads in gmail are always relevant to the content of your emails? Google are totally reading them
-
The <conspiracy hat="">side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could have possibly pulled your link to the demo directory from that.</conspiracy>
-
Hi Barry,
Yes, We were used Gmail for reporting.
Is it make any sense??
-
<conspiracy-hat></conspiracy-hat>
Did either you or your client use gmail when you sent him the demo link?
Regardless, Dan's advice to noindex and block the directory from spiders is the future when doing development work.
-
Hi JoelHit,
NO, There is not any single refferal link to "Demo" directory from entire website and also from third party websites.
I am aware about Google Crawling and Indexing Systems.
Thanks.
-
Hi Thetjo,
I know about it.
My question is that how Google Crawl it without any referral link?
Thanks.
-
Hi Dan,
No, i am not exclude "demo" directory from robots.txt for any search engine.
I am not using wordpress its simple stattic HTML website (Not using any type of CMS).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure they don't get crawled. Also, are you using wordpress? If so, wordpress automatically pings search engines when you add a post and if you use the common sitemap plugin, when it creates the sitemap it submits it automatically to Google, so that's another way Google could have found it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Alternate page with proper canonical tag Status: Excluded in Google webmaster tools.
In Google Webmaster Tools, I have a coverage issue. I am getting this error message: Alternate page with proper canonical tag Status: Excluded. It gives the below blog post page as an example. Any idea how to resolve? At one time, I was using handl utm grabber, but the plugin is deactivated on my website. https://www.savacations.com/turrialba-costa-ricas-garden-city/?utm_source=deleted&utm_medium=deleted&utm_term=deleted&utm_content=deleted&utm_campaign=deleted&gclid=deleted5.
Intermediate & Advanced SEO | | Alancito0 -
URL structure - Page Path vs No Page Path
We are currently re building our URL structure for eccomerce websites. We have seen a lot of site removing the page path on product pages e.g. https://www.theiconic.co.nz/liberty-beach-blossom-shirt-680193.html versus what would normally be https://www.theiconic.co.nz/womens-clothing-tops/liberty-beach-blossom-shirt-680193.html Should we be removing the site page path for a product page to keep the url shorter or should we keep it? I can see that we would loose the hierarchy juice to a product page but not sure what is the right thing to do.
Intermediate & Advanced SEO | | Ashcastle0 -
Magento 1.9 SEO. I have product pages with identical On Page SEO score in the 90's. Some pull up Google page 1 some won't pull up at all. I am searching for the exact title on that page.
I have a website built on Magento 1.9. There are approximately 290,000 part numbers on the site. I am sampling Google SERP results. About 20% of the keywords show up on page 1 position 5 thru 10. 80% don't show up at all. When I do a MOZ page score I get high 80's to 90's. A page score of 89 on one part # may show up on page one, An identical page score on a different part # can't be found on Google. I am searching for the exact part # in the page title. Any thoughts on what may be going on? This seems to me like a Magento SEO issue.
Intermediate & Advanced SEO | | CTOPDS0 -
Google authorship page versus posrt
I have a wordpress.com blog and have recently set up google authorship (at least I think I have). If I add a new post what happens to the old post in terms of authorship? Is the solution opening a new page for each article? If so does the contribution link in google+ pick up all pages if you only have home link? many thanks
Intermediate & Advanced SEO | | harddaysgrind1 -
Street Address Not Appearing on Business Google+ Page
I run a local business in New York City, a commercial real estate brokerage. My firm has both a web site and Google+ accounts, one Google+ account for me personally and a Google+ account for my business. Under address my Google+ account is showing New York, NY. It is not showing a street address. Similiarly when my business name is entered in the Google search bar, my web site is the first result, but under address (directly to the right of a black dot with a grey circle around it) "New York, NY" with the phone number beneath it appears. No sign of my street address. My business is registered under Google Places and we have entered the correct street address. Any ideas on how I can get Google to display our street address? This is obviously very, very detrimental for local SEO. Thanks,
Intermediate & Advanced SEO | | Kingalan1
Alan0 -
Is Sitemap Issue Causing Duplicate Content & Unindexed Pages on Google?
On July 10th my site was migrated from Drupal to Google. The site contains approximately 400 pages. 301 permanent redirects were used. The site contains maybe 50 pages of new content. Many of the new pages have not been indexed and many pages show as duplicate content. Is it possible that there is a site map issue that is causing this problem? My developer believes the map is formatted correctly, but I am not convinced. The sitemap address is http://www.nyc-officespace-leader.com/page-sitemap.xml [^] I am completely non technical so if anyone could take a brief look I would appreciate it immensely. Thanks,
Intermediate & Advanced SEO | | Kingalan1
Alan | |0 -
Can use of the id attribute to anchor t text down a page cause page duplication issues?
I am producing a long glossary of terms and want to make it easier to jump down to various terms. I am using the<a id="anchor-text" ="" attribute="" so="" am="" appending="" #anchor-text="" to="" a="" url="" reach="" the="" correct="" spot<="" p=""></a> <a id="anchor-text" ="" attribute="" so="" am="" appending="" #anchor-text="" to="" a="" url="" reach="" the="" correct="" spot<="" p="">Does anyone know whether Google will pick this up as separate duplicate pages?</a> <a id="anchor-text" ="" attribute="" so="" am="" appending="" #anchor-text="" to="" a="" url="" reach="" the="" correct="" spot<="" p="">If so any ideas on what I can do? Apart from not do it to start with? I am thinking 301s won't work as I want the URL to work. And rel=canonical won't work as there is no actual page code to add it to. Many thanks for your help Wendy</a>
Intermediate & Advanced SEO | | Chammy0 -
Should the sitemap include just menu pages or all pages site wide?
I have a Drupal site that utilizes Solr, with 10 menu pages and about 4,000 pages of content. Redoing a few things and we'll need to revamp the sitemap. Typically I'd jam all pages into a single sitemap and that's it, but post-Panda, should I do anything different?
Intermediate & Advanced SEO | | EricPacifico0