How to stop Search Bot from crawling through a submit button
-
On our website http://www.thefutureminders.com/, we have three form fields that have three pull downs for Month, Day, and year. This is creating duplicate pages while indexing. How do we tell the search Bot to index the page but not crawl through the submit button?
Thanks
Naren
-
Hi Dan
What is happening is this - since we have all the months [12], all the dates [31] and years[1921 through 2011] in the form fields, the robot seems to be taking these incrementally and then using the submit button. After the submit button, user is presented with a registration page. While we do want the search to index the rest of the page and the crawl through the rest of the page links we do not want it to crawl through that submit button. I hope I am making sense.
Naren
-
The advantage of blocking a page from being indexed via a meta tag is it is less likely to have unexpected consequences. I've often seen in the past cases where an incorrectly modified robots.txt file leads to a site being blocked by accident.
-
Hi
To my knowledge, you don't stop it from crawling through the button (like a nofollowed link), rather you block the robot at the page it ends up on after clicking submit.
Say the user hits submit and it takes them to mydomain.com/confirm.html On that page you'll want to add;
....if you want it to NOT index the page but follow the links on it.
or
...if you want it to NOT index and NOT follow the links on that page.
Its advised that its better to do this with the meta tag than in robots.txt.
Hopefully I've understood the question correctly!
-Dan
-
Block the pages/folders you do not wish to be indexed with robots.txt file:
User-agent: * Disallow: /folder1/ Disallow: /folder2/
OR you can add canonical tags to the other pages which are creating duplicate content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Moz Crawled My Site. Now What?
Hey everyone! So Moz crawled my site and I passed it over to my dev team who's curious about what they should prioritize. Curious what everyone's thoughts are. Here are the issue types: Duplicate Content - Missing Title - Duplicate Title Tag - Redirect Chain - Title too long - Description too short - Missing Description - Missing h1 - Thin Content - URL Too Long - Has meta noindex Would love any assistance! Thank you!
Technical SEO | | inksoft_mm0 -
Should I disallow crawl of my Job board?
MOZ crawler is telling me we have loads of duplicate content issues. We use a Job Board plugin on our Wordpress site and we have allot of duplicate or very similar jobs (usually just a different location), but the plugin doesn't allow us to add any rel canonical tags to the individual jobs. Should I disallow the /jobs/ url in the robots.txt file? This will solve the duplicate content issue but then Google wont be able to crawl any of the individual job listings Has anyone had any experience working with a job board plugin on Wordpress and had a similar issue, or can advise on how best to solve our duplicate content?? Thanks 🙂
Technical SEO | | O2C0 -
Google bot notification
Hi there! I've just made some changes in my website in order to optimize it but I don't know if there's a way to notify the googlebot that some aspects of the configuration (metas) have changed and must be "taken into account". The spider visited my site two days ago and obviously processed the sitemap file. I've heard that it's possible to do a ping to certain websites. Is this the way to proceed? I must say that there're not many updates in the site (just one way information) as the social media activity is still low. Thanks in advanced.
Technical SEO | | juanmiguelcr0 -
Summarize your question.Crawl Diagnostics Summary
Hi, Crawl Diagnostics Summary pointed on some mistakes I've done, I fixed them, but Crawl Diagnostics Summary still shows same errors, how often does ithe data refreshes?
Technical SEO | | AndreyStotsky0 -
How do you diagnose if on your site is only 50% crawled?
Good Morning from 7 degrees C, goodbye arctic conditions wetherby UK, If a site had 100 pages for example & that site was plugged into Webmaster Tools how could you diagnose if all the pages had been crawled? The thing is I want to learn how to diagnose crawl issues with sites, is their a known methodology for this? Thanks in advance, David
Technical SEO | | Nightwing0 -
How a google bot sees your site
So I have stumbled across various websites like this: http://www.smart-it-consulting.com/internet/google/googlebot-spoofer/ The concept here is to be able to view your site as a googlebot sees it. However, the results are a little puzzling. Google is reading the text on my page but not the title tags according to the results. Are websites like this accurate OR does Google not read title tags and H1 tags anymore? Also on a slighly related note. I noticed the results show the navigation bar is being read first by google, is this bad and should the navigation bar be optimized for keywords as well? If it did, it would read a bit funny and the "humans" would be confused.
Technical SEO | | StreetwiseReports0 -
Understanding the actions needed from a Crawl Report
I've just joined SEOMOZ last week and have not even received my first full-crawl yet, but as you know, I do get the re-crawl report. It shows I have 50 301's and 20 rel canonical's. I'm still very confused as to what I'm supposed to fix...And, all the rel canonical's are my sites main pages, so hence I am still equally confused as to what the canonical is doing and how do I properly setup my site. I'm a technical person and can grasp most things fairly quickly, but on this the light bulb is taking a little while longer to fire-up 🙂 If my question wasn't total jibberish and you can help shed some light, I would be forever grateful. Thank you.
Technical SEO | | apmgsmith0