How did Google's crawler find and cache orphan pages and a directory?
-
I have a website, www.test.com.
I made some changes to the live website and uploaded the result to a recently created "demo" directory for client approval.
So my demo link is www.test.com/demo/
I am not doing any link building or any other activity that would pass a referral link to www.test.com/demo/
So how did Google's crawler find it and cache some pages, or the entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). They didn't exclude the test site in robots.txt because "we all know we won't link to it so it'll be ok". The site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which that person didn't think would be indexed) and posted the URL of the test site.
I found this out just by putting the URL of the test site into Google, where I saw the post in the help forum. You might try the same to see if somehow there is a rogue link.
-
Is Google crawling our emails?
Is that possible?
-
Yup, correct.
I was certain I'd replied to this.
Anyway, have you ever noticed how the ads in Gmail are always relevant to the content of your emails? Google are totally reading them.
-
The "conspiracy hat" side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could possibly have pulled your link to the demo directory from that.
-
Hi Barry,
Yes, we used Gmail for reporting.
Does that make any sense?
-
(Conspiracy hat on.)
Did either you or your client use Gmail when you sent him the demo link?
Regardless, Dan's advice to noindex the directory and block it from spiders is the way to go in future when doing development work.
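For the noindex part, here is a minimal sketch (not anyone's actual setup) of how you could check that a demo page really carries a robots meta tag with "noindex". The sample HTML and the names `RobotsMetaParser`, `has_noindex` and `DEMO_PAGE` are made up for illustration; it only uses Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical demo page, assumed for this example only.
DEMO_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
</head><body>Demo for client approval</body></html>"""

if __name__ == "__main__":
    print(has_noindex(DEMO_PAGE))
```

Keep in mind that a crawler has to be able to fetch the page in order to see the tag at all.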
-
Hi JoelHit,
No, there is not a single referral link to the "demo" directory from anywhere on the website or from any third-party website.
I am aware of how Google's crawling and indexing systems work.
Thanks.
-
Hi Thetjo,
I know about it.
My question is: how did Google crawl it without any referral link?
Thanks.
-
Hi Dan,
No, I have not excluded the "demo" directory in robots.txt for any search engine.
I am not using WordPress; it's a simple static HTML website (not using any type of CMS).
-
Did this actually happen, or are we talking about a hypothetical situation here? Could there be a link to the demo directory that you've overlooked? Has the /demo folder perhaps been used in the past, with old links to it still around?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a password login (for example .htaccess with an .htpasswd file on Apache) to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure it doesn't get crawled. Also, are you using WordPress? If so, WordPress automatically pings search engines when you add a post, and if you use the common sitemap plugin, it submits the sitemap to Google automatically when it creates it, so that's another way Google could have found it.
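To make the robots.txt step concrete, here is a rough sketch of how you could test a rule before relying on it. The robots.txt content below is an assumption (it is what blocking /demo/ for all crawlers would typically look like), and `is_crawl_allowed` is just a helper name made up for this example; the parsing is done by Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for www.test.com: block every crawler from /demo/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /demo/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The demo directory is blocked, the rest of the site is not.
    print(is_crawl_allowed(ROBOTS_TXT, "http://www.test.com/demo/index.html"))  # False
    print(is_crawl_allowed(ROBOTS_TXT, "http://www.test.com/index.html"))       # True
```

This only tells you what a well-behaved crawler is allowed to fetch. It does not hide the directory from people who have the link.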