How to de-index a page with a search string with the structure domain.com/?"spam"
-
The site in question was hacked years ago. All the security scans come up clean but the seo crawlers like semrush and ahrefs still show it as an indexed page. I can even click through on it and it takes me to the homepage with no 301. Where is the page and how to deindex it?
domain/com/?spam
There are multiple instances of this.
http://www.clipular.com/c/5579083284217856.png?k=Q173VG9pkRrxBl0b5prNqIozPZI
-
You are most welcome. I'm glad to hear your road to site recovery is coming along. I'm also glad to confirm that, to all of my knowledge, your understanding of the "*" operator and Disallow /?spam string is correct. One more thing:
Fetch as Google and Request Indexing
Apologies, I neglected to mention this step in my answer. It should be included. This is the best tool I'm aware of to ask Google, "hey, crawl me please." Do this after you upload your shiny new robots.txt.In GSC, under Crawl, select Fetch as Google. Then, select Fetch and Render. When status is partial or complete, click Request Indexing. There is no guarantee here, and my experience is Google does what it wants. Even so, I've seen results in less than 2 hours (full disclosure: the longest I've waited has been 3 days).
Penalty Free I agree. They cannot possibly be penalizing your site. At least, not purposefully. You have taken all recommended actions and then some to resolve site issues. Even if you do have a few bad back links floating around out there from some blackhat t3 site PBN, Penguin 4.0 should discredit that bad link juice. Your site doesn't even have the offending pages. It's just a matter of time before Google's index lines back up with your live site.
Good Work Sir,
Wipe the Index Clean,
CopyChrisSEO and the Vizergy Team -
Thanks very much for your explanation.
I have gone ahead and temporarily blocked the pages in GSC.
I am working on the robot.txt and see there are no instructions for the crawlers to skip over these urls in question.
I understand that I should use the "*" operator to alert all crawlers to disallow the pages in this format:
user-agent: *
Disallow: /?spam string
Finally, I will send the suggested edit to Google and see where that gets me. Honestly, at this point, they cannot possibly be penalized the site any worse so anything working towards cleaning up the index for the site will be a step in the right direction.
-
Hello Miamirealestatetrendsguy and fellow Mozers,
It sounds like you have had a crazy time handling this hack. Good news is, as far as I can tell from the given information, you are close to resolution. Googlebot should correct the indexed pages over time. I'm certain you would like to expedite that process. Here are three recommendations that come to mind: Remove URLs via GSC, block the offending URLs via robots.txt, and suggest edits in Google's SERPs.
Remove URLs via GSC
In GSC, under Google Index, select Remove URLs. This suppression is temporary however. Click on more information for more about that. My experience with it as been suppression for a few months. Don't worry about the time though. Our next step should take affect before your time is up.Block the Offending URLs via Robots.txt
Before you do this, be very certain what you are doing. After you are confident, list your offending URLs, edit the offending URLs as noindex nofollow in your robots.txt, and upload it. Hopefully, you can find commonalities to shorten this list and save your time.Note: I have purposefully avoided the details on how to this here because it is vital SEOs learn how to do it with full knowledge of potential risks as well as how to avoid those risks. Here are some resources:
• Google Support • Moz's Robots.txt Rundown
• Search Engine Land's Deeper LookSuggest Edits in Google's SERPs This one is iffy, and I really don't trust Google using this feedback. However, I have done it and it worked more than once. Find your offending results and send specific feedback.
Wipe that Index Clean,
CopyChrisSEO and the Vizergy Team
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Redundant categorization - "boys" and "girls" category. Any other suggestions than implementing filtering?
One of our clients (a children's clothing company) has split their categories (outwear, tops, shoes) between boys and girls - There's one category page for girls outwear, and one category for boys outwear. I am suspecting that this redundant categorisation is diluting link juice and rankings for the related search queries. Important points: The clothes themselves are rather gender-neutral, girl's sweaters don't differ that much from the boy's sweaters. Our keyword research indicates that norwegians' search queries are also pretty gender neutral - people are generally searching after "children's dresses", "shoes for kids", "snowsuits", etc. So these gender specific categories are not really reflective of people's search behavior. I acknowledge that implementing a filter for "boys" and "girls" would be the best way to solve this redundant categorization, but that would simply be to expensive for our client. I'm thinking that some sort of canonicalisation would be the best approach to solve this issue. Are there any other suggestions or comments to this?
Technical SEO | | Inevo0 -
Disavow links and domain of SPAM links
Hi, I have a big problem. For the past month, my company website has been scrape by hackers. This is how they do it: 1. Hack un-monitored and/or sites that are still using old version of wordpress or other out of the box CMS. 2. Created Spam pages with links to my pages plus plant trojan horse and script to automatically grab resources from my server. Some sites where directly uploaded with pages from my sites. 3. Pages created with title, keywords and description which consists of my company brand name. 4. Using http-referrer to redirect google search results to competitor sites. What I have done currently: 1. Block identified site's IP in my WAF. This prevented those hacked sites to grab resources from my site via scripts. 2. Reach out to webmasters and hosting companies to remove those affected sites. Currently it's not quite effective as many of the sites has no webmaster. Only a few hosting company respond promptly. Some don't even reply after a week. Problem now is: When I realized about this issue, there were already hundreds if not thousands of sites which has been used by the hacker. Literally tens of thousands of sites has been crawled by google and the hacked or scripted pages with my company brand title, keywords, description has already being index by google. Routinely everyday I am removing and disavowing. But it's just so much of them now indexed by Google. Question: 1. What is the best way now moving forward for me to resolve this? 2. Disavow links and domain. Does disavowing a domain = all the links from the same domain are disavow? 3. Can anyone recommend me SEO company which dealt with such issue before and successfully rectified similar issues? Note: SEAGM is company branded keyword 5CGkSYM.png
Technical SEO | | ahming7770 -
Why would GWT say 0 pages indexed ?
Hi Looking in GWT > Google Index > Index Status says 0 pages indexed Yes if i search manually on google for brand site is listed, and i see organic traffic from Google in analytics I take it this is likely an error in GWT and nothing to worry about ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
Www.xyz.com v/s xyz.com creating duplicate pages
I just put my site in moz analytics. The crawl results says I have duplicate content. When I look at the pages it is because one page is www.xyz.com and the duplicate is xyz.com. What causes this and how can it be fixed. I'm not a developer, so be kind and speak a language I can understand. Thanks for your help 🙂
Technical SEO | | Britewave0 -
Website Migration - Very Technical Google "Index" Question
This is my understanding of how Google's search works, and I am unsure about one thing in specifc: Google continuously crawls websites and stores each page it finds (let's call it "page directory") Google's "page directory" is a cache so it isn't the "live" version of the page Google has separate storage called "the index" which contains all the keywords searched. These keywords in "the index" point to the pages in the "page directory" that contain the same keywords. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory" These returned pages are given ranks based on the algorithm The one part I'm unsure of is how Google's "index" connects to the "page directory". I'm thinking each page has a url in the "page directory", and the entries in the "index" contain these urls. Since Google's "page directory" is a cache, would the urls be the same as the live website? For example if webpage is found at wwww.website.com/page1, would the "page directory" store this page under that url in Google's cache? The reason I ask is I am starting to work with a client who has a newly developed website. The old website domain and files were located on a GoDaddy account. The new websites files have completely changed location and are now hosted on a separate GoDaddy account, but the domain has remained in the same account. The client has setup domain forwarding/masking to access the files on the separate account. From what I've researched domain masking and SEO don't get along very well. Not only can you not link to specific pages, but if my above assumption is true wouldn't Google have a hard time crawling and storing each page in the cache?
Technical SEO | | reidsteven750 -
Do Collections in Shopify create Duplicate Pages according to Google/Bing/Yahoo?
I'm using the e-commerce platform Shopify to host an e-store. We've put our products into different collections. Shopify automatically creates different URL paths to a product in multiple collections. I'm worried that the same product listed in different collections is soon as different pages, and therefore duplicate content by Google/Bing/Yahoo. Would love to get your opinion on this concern! Thanks! Matthew
Technical SEO | | HappinessDigital0 -
Should I wory about spam domains linking to me?
A while ago my site had a pharmacy hack done to it and created a ton of spam links. I've since fixed the issues on my site but I'm still showing links from their sites. See screen shot: http://awesomescreenshot.com/0497cc147 I think they are links from the spam site to me and not my site "yakanger" linking to them correct? Do I need to worry about these? Can I get rid of them?
Technical SEO | | mr_w2 -
I have a ton of "duplicated content", "duplicated titles" in my website, solutions?
hi and thanks in advance, I have a Jomsocial site with 1000 users it is highly customized and as a result of the customization we did some of the pages have 5 or more different types of URLS pointing to the same page. Google has indexed 16.000 links already and the cowling report show a lot of duplicated content. this links are important for some of the functionality and are dynamically created and will continue growing, my developers offered my to create rules in robots file so a big part of this links don't get indexed but Google webmaster tools post says the following: "Google no longer recommends blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods. If search engines can't crawl pages with duplicate content, they can't automatically detect that these URLs point to the same content and will therefore effectively have to treat them as separate, unique pages. A better solution is to allow search engines to crawl these URLs, but mark them as duplicates by using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. In cases where duplicate content leads to us crawling too much of your website, you can also adjust the crawl rate setting in Webmaster Tools." here is an example of the links: | | http://anxietysocialnet.com/profile/edit-profile/salocharly http://anxietysocialnet.com/salocharly/profile http://anxietysocialnet.com/profile/preferences/salocharly http://anxietysocialnet.com/profile/salocharly http://anxietysocialnet.com/profile/privacy/salocharly http://anxietysocialnet.com/profile/edit-details/salocharly http://anxietysocialnet.com/profile/change-profile-picture/salocharly | | so the question is, is this really that bad?? what are my options? it is really a good solution to set rules in robots so big chunks of the site don't get indexed? is there any other way i can resolve this? Thanks again! Salo
Technical SEO | | Salocharly0