Getting rid of a site in Google
-
Hi,
I have two sites, let's call them Site A and Site B; both are subdomains of the same root domain. Because of a server config error, both got indexed by Google.
Google reports millions of inbound links from Site B to Site A.
I want to get rid of Site B, because it is duplicate content.
First I tried removing the site via Webmaster Tools and blocking all content in the robots.txt for Site B. This removed all content from the search results, but the links from Site B to Site A stayed in place and kept increasing (even after two months).
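For reference, blocking all content as described above comes down to a two-line robots.txt at the root of Site B (a sketch of what was presumably in place):

```
# robots.txt at the root of Site B: disallow all crawlers everywhere
User-agent: *
Disallow: /
```

Note that this blocks crawling, not indexing: Google keeps what it already knows about the blocked URLs, which matches the behavior described above.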
I also tried changing all the pages on Site B to return 404s, but this did not work either.
I then removed the blocks, cleaned up the robots.txt, and changed the server config on Site B so that everything redirects (301) to a landing page for Site B. But the links from Site B to Site A in Webmaster Tools are still increasing.
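As a sketch of that kind of server config, the redirect-everything rule might look like this in nginx (the hostname and landing path are placeholders, not the actual setup); the one subtlety is that the landing page itself must be served directly, or the redirect would loop:

```nginx
server {
    listen 80;
    server_name b.example.com;        # placeholder subdomain for Site B

    # Serve the landing page itself directly, so it does not redirect to itself
    location = /landing.html {
        root /var/www/site-b;
    }

    # Every other URL gets a permanent (301) redirect to the landing page
    location / {
        return 301 /landing.html;
    }
}
```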
What do you think is the best way to delete a site from Google, and to delete all the links it had to other sites, so that there is NO history of the site? It seems that when you block it with robots.txt, the links and link juice do not disappear; only the "blocked by robots.txt" count in the WMT report increases.
Any suggestions?
-
The sites are massive and we are talking massive numbers:
Google reports in WMT that Site B still has 259,157,970 links to Site A, although when you filter down into the report it only shows a few.
The current state is that nothing is blocked on Site B, and ALL pages redirect to the landing page of Site B.
In WMT for Site B, Google still shows data for all the reports (search queries, keywords, crawl errors that are very old and all fixed, and so on). The reports and data do not bother me as much as the 259,157,970 links it reports to Site A.
On the 11th of April, when I started the process of getting rid of these links, there were 554,066,716. This jumped up to 603,404,378 on the 28th of April, then started dropping, getting as low as 122,405,100 on the 17th of May, and then started growing again up to where it is now: 259,157,970.
I also noticed that while the pages were returning 404s, Google's crawl rate dropped to zero; now that everything redirects to the landing page, the crawl rate is back up to about 1,800 per day, which is still very low considering the numbers we are talking about.
The crawl rate on Site A is okay at 220,000 per day, but it was as high as 800,000 per day at one stage.
-
If you remove all history of a website, it may still appear in the Wayback Machine.
If you blocked robots first, then they won't see the 301s; they'll just keep the previously cached pages. Maybe remove the robots.txt block and let Google recrawl every page with its 301 to the landing page, then add the robots.txt block back once that's done. Have you tried submitting a new sitemap in Webmaster Tools pointing all the pages at the landing page?
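If you go the sitemap route, generating one for a known list of URLs is mechanical; a minimal sketch (the URLs are placeholders for Site B's real page list):

```python
# Minimal sketch: build a sitemap.xml for Site B so Google recrawls
# every listed URL and sees its 301 to the landing page.
# The URLs below are placeholders for Site B's real page list.
from xml.etree import ElementTree as ET

def build_sitemap(urls):
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    return ET.tostring(urlset, encoding="unicode")

pages = [
    "http://b.example.com/page-1",
    "http://b.example.com/page-2",
]
xml = build_sitemap(pages)
print(xml)
```

At the scale described in this thread you would split the output across multiple sitemap files (the protocol caps each file at 50,000 URLs) tied together with a sitemap index file.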
Roughly how many pages are on your website?
-
I failed to mention that both Site A and Site B had the exact same content, database, and URL structure, with the only difference being the subdomain.
Related Questions
-
Google Indexing our site
We have 700 city pages on our site. We submitted them to Google via https://www.samhillbands.com/sitemaps/locations.xml, but they have only indexed 15 so far. Yes, the content is similar on all of the pages... thoughts on getting them to index the remaining pages?
Intermediate & Advanced SEO | brianvest0
-
Getting too many links on Google search results, how do I fix it?
I'm a total newbie, so I apologize for what I am sure is a dumb question. I recently followed Moz suggestions for increasing visibility on my site for a specific keyword by including that keyword in more verbose page descriptions for multiple pages. This worked TOO well, as that keyword now brings up too many results in Google for these different pages on my site... is there a way to compile them into one result with the subpages underneath, like, for instance, the attached image of a search on Apple? Do I need to change something in my robots.txt file to direct these to my main page? Basically, I am a photographer, and a search for my name now brings up each of my different photo gallery pages in multiple results; it's a little over the top. Thanks for any and all help!
Intermediate & Advanced SEO | jason54540
-
How do I know which pages of my site are not indexed by Google?
Hi, in my Google Webmaster Tools, under Crawl -> Sitemaps, it shows 1,117 pages submitted but 619 indexed. Is there any way I can find which pages are not indexed, and why? It has been like this for a while. I also have a manual action (partial) message, "Unnatural links to your site--impacts links", and under "Affects" it says "Some incoming links". Is that the reason Google does not index some of my pages? Thank you, Sina
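Webmaster Tools does not name the missing pages, but if you can pull the URL list out of the sitemap and build a list of URLs you know are indexed (for example from Search Analytics landing pages or manual spot checks), finding the gap is a simple set difference; a sketch with placeholder URLs:

```python
# Sketch: find which sitemap URLs are missing from a known-indexed list.
# Both lists are placeholders; in practice you'd load them from files.
sitemap_urls = {
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
}
indexed_urls = {
    "http://example.com/a",
    "http://example.com/c",
}

# Set difference: submitted but not (known to be) indexed
not_indexed = sorted(sitemap_urls - indexed_urls)
print(not_indexed)
```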
Intermediate & Advanced SEO | SinaKashani0
-
Malicious site pointed its A record to my IP, Google indexed it
Hello All, I launched my site on May 1, and as it turns out, another domain was pointing its A record at my IP. That site is flagged as malicious, but worst of all, it is ranking on keywords for my business objectives with my content and metadata, so I'm losing traffic. I've had the domain host remove the incorrect A record, I've submitted numerous malware reports to Google, and I've attempted to request removal of the site from the index. I've resubmitted my sitemap, but it seems the offending domain is still being indexed more thoroughly than my legitimate domain. Can anyone offer any advice? Anything would be greatly appreciated! Best regards, Doug
Intermediate & Advanced SEO | FranGen0
-
We are ignored by Google - what should we do?
Hi, We believe that our website - https://en.greatfire.org - is being all but ignored by Google Search. The following two examples illustrate our case. 1. Searching for “China listening in on Skype - Microsoft assumes you approve”. This is the title of a blog post that we wrote which received some 50,000 visits. On Yahoo and Bing search, we rank first for this search. On Google, however, we rank 7th. Each of the six pages ranking higher than us are quoting and linking to our story. 2. Searching for “Online Censorship In China”. This is the title of our front page. Yahoo and Bing both rank us third for this search. On Google, however, we are not even among the first 300 results. Two of the pages among the first 10 results link to us. Our website has an average of around 1000 visits per day. We are quoted in and linked from virtually all Western mainstream media (see https://en.greatfire.org/press). Yet to this day we are receiving almost no traffic from Google Search. Our mission is to bring transparency to online censorship in China. If people could find us in Google, it would greatly help to spread awareness of the extent of Internet restrictions here. If you could indicate to us what the cause of our poor rankings could be, we would be very grateful. Thank you for your time and consideration.
Intermediate & Advanced SEO | GreatFire.org0
-
Site #2 beats site #1 in every aspect?
Hey guys, loving SEOmoz so far and will definitely continue my subscription after the free trial. I have a question, however, which really confuses me. When researching my primary keyword, I have found that the second-ranked site beats the top site in every single aspect apart from domain age, which is almost 6 years for the top one and 6 months for the second. When I say every single aspect, I mean everything: more authority for the page and domain, more links, more anchor text links, more authoritative links, more social signals, more relevant links, a better domain (although the second-ranked site is a .net), better MozRank, better MozTrust, etc. I have noticed, though, that in the UK SERPs those sites are switched, so #2 is actually #1. Could it be that the US SERPs just haven't updated yet, or am I missing something completely different?
Intermediate & Advanced SEO | darrenspeed1
-
One platform, multiple niche sites: Worth $60/mo so each site has different class C?
Howdy all, The short of it is that I currently run a very niche business directory/review website and am in the process of expanding the system to support running multiple sites out of the same database/codebase. In a normal setup I'd just run all the sites off the same server, with all of them sharing a single IP address, but thanks to the wonders of the cloud it would be fairly simple for me to run each site on its own server at a cost of about $60/mo/site, giving each site a unique IP on a unique C-block (in many cases even a unique A-block). The ultimate goal here is to leverage the authority I've built up for the one site I currently run to help grow the next site I launch, and repeat the process. The question is: is the SEO value that the sites can pass to each other worth the extra cost and management overhead? I've gotten conflicting answers on this topic from multiple people I consider pretty smart, so I'd love to know what other people say.
Intermediate & Advanced SEO | qurve0
-
Working out exactly how Google is crawling my site if I have loooots of pages
I am trying to work out exactly how Google is crawling my site, including entry points and the bot's path from there. The site has millions of pages and hundreds of thousands indexed. I have simple log files with a timestamp and the URL Googlebot was on. Unfortunately there are hundreds of thousands of entries even for one day, and as it is a massive site I am finding it hard to work out the spider's paths. Is there any way, using the log files and Excel or other tools, to work this out simply? Also, I was expecting the bot to go through each level almost instantaneously, e.g. main page -> category page -> subcategory page (expecting the same timestamp), but this does not appear to be the case. Does the bot follow a path right through to the deepest level it can reach (or is allowed to) for that crawl, and then return to the higher-level category pages at a later time? Any help would be appreciated. Cheers
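As a starting point for that log analysis, sorting the "timestamp URL" entries by time and tagging each hit with its URL depth shows at a glance whether the bot descends level by level or jumps around; a minimal Python sketch with made-up sample lines:

```python
# Sketch: sort "timestamp URL" Googlebot log lines by time and tag each
# hit with its URL depth (number of path segments). The sample lines
# below are invented; in practice you'd read them from the log files.
log_lines = [
    "2012-06-01T10:00:02 /category/shoes/red-sneaker",
    "2012-06-01T10:00:00 /",
    "2012-06-01T10:00:01 /category/shoes",
]

def depth(path):
    # "/" -> 0, "/category" -> 1, "/category/shoes" -> 2, ...
    return len([seg for seg in path.split("/") if seg])

hits = sorted(line.split() for line in log_lines)  # ISO timestamps sort lexically
path_trace = [(ts, depth(url), url) for ts, url in hits]
for ts, d, url in path_trace:
    print(ts, d, url)
```

Grouping the output by depth (or by hour) would then show whether the level-by-level, same-timestamp pattern ever actually occurs, and a script like this handles hundreds of thousands of lines comfortably where Excel would struggle.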
Intermediate & Advanced SEO | soeren.hofmayer0