How to effectively de-index pages in a Magento site?
-
We have thousands of Missing Description issues, but most of them are on account/login pages,
i.e. /customer/account/ etc.
We tried to de-index them through the Configuration using the instructions here - https://docs.magento.com/user-guide/marketing/search-engine-robots.html
But they're still appearing as issues in the Site Crawl.
Even without the site crawl issue, we don't really want these to appear in the SERPs.
Does anybody know how to properly de-index these login pages in Magento?
Thank you!
-
To de-index pages on a Magento site, these are the usual options:
Use the robots meta tag: add a meta tag with a "noindex" value to the <head> of the pages you want removed. This is the directive that actually tells search engines not to index a page.
Use the robots.txt file: a Disallow rule stops search engine crawlers from fetching certain pages on your site. Note that it blocks crawling, not indexing. A disallowed URL can still be indexed if other pages link to it, and a crawler that cannot fetch the page will never see its noindex tag.
Use the canonical URL tag: it specifies the preferred version of a web page, which helps prevent duplicate content from being indexed. It does not remove a page from the index on its own.
It's important to note that de-indexing pages should be done carefully, as it can impact your site's visibility in search engine results. If you have specific pages or sections in mind that you'd like to de-index, please let me know so I can provide more detailed guidance.
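To make the robots.txt point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows which URLs a Disallow rule would stop a crawler from fetching; the `www.example.com` URLs, the sample rules, and the `is_crawlable` helper are assumptions made up for illustration, not anything taken from the poster's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is the one mentioned in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /customer/account/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, used only to exercise the rules above.
    for url in (
        "https://www.example.com/customer/account/login/",
        "https://www.example.com/some-product.html",
    ):
        status = "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked"
        print(url, "->", status)
```

A URL reported as blocked here is one where a crawler would not get to read a noindex tag, which is why the two methods are usually not combined on the same page.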
-
@LASClients Hey LASClients,
I feel your pain with those pesky login pages showing up in the Site Crawl. Have you considered using the Disallow directive in the robots.txt file to prevent search engines from crawling these pages? It's a quick fix, but as always, test it out in a staging environment first. Cheers!
-
Certainly! To de-index login pages in Magento and address the Missing Description issues, follow these steps:
Robots Meta Tag:
Find the layout or template that renders the login pages, such as /customer/account/. These pages are normally customized through your theme files rather than through the Magento admin.
Add the following meta tag to the <head> section of the HTML:
<meta name="robots" content="noindex, nofollow">
This tag instructs search engines not to index the page and not to follow any links on it.
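Robots.txt File:
Edit your robots.txt file in the root of your Magento installation.
Add the following lines to disallow crawling of login pages:
User-agent: *
Disallow: /customer/account/
Replace /customer/account/ with the relevant path for your login pages. Keep in mind that a disallowed page is not crawled, so search engines will not see a noindex tag on it. If the pages are already indexed, let them be crawled with the noindex tag first, and add the Disallow rule only after they have dropped out of the index.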
XML Sitemap:
If you have an XML sitemap, ensure that the login pages are excluded from it.
Open your XML sitemap file and remove the entries related to login pages.
Submit Updated Sitemap to Search Engines:
After making these changes, resubmit your updated XML sitemap to search engines via Google Search Console or Bing Webmaster Tools.
Clear Cache and Reindex:
Clear your Magento cache so that the changes take effect. Magento's reindex rebuilds its internal catalog indexes, so it is generally not required for robots changes.
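For the sitemap step above, a rough sketch of stripping account URLs from a sitemap with Python's standard `xml.etree.ElementTree` could look like the following. The sample sitemap, the `www.example.com` URLs, and the `drop_account_urls` helper are all hypothetical; a Magento-generated sitemap may never list these pages in the first place, so treat this as a check rather than a required step.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical sitemap content, for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/some-product.html</loc></url>
  <url><loc>https://www.example.com/customer/account/login/</loc></url>
</urlset>"""


def drop_account_urls(sitemap_xml: str, blocked_prefix: str = "/customer/account/") -> str:
    """Return the sitemap XML without <url> entries whose path starts with blocked_prefix."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Copy the list first so removing elements does not disturb iteration.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if urlparse(loc).path.startswith(blocked_prefix):
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(drop_account_urls(SAMPLE_SITEMAP))
```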
By implementing these steps, you should be able to de-index the login pages in Magento. Keep in mind that changes may take some time to reflect in search engine results. If you encounter any challenges or need further assistance, consider consulting Magento support or your web development team. -
@LASClients Create a file app/design/frontend/[Vendor]/[theme]/Magento_Customer/layout/customer_account_login.xml with the following content:
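<?xml version="1.0"?>
<page xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:View/Layout/etc/page_configuration.xsd">
    <head>
        <meta name="robots" content="noindex,nofollow" />
    </head>
</page>
Then clear the cache:
php bin/magento cache:flush
And it should be fine. Note that this layout file only covers the login page. Other account pages, such as registration (customer_account_create.xml), would need the same change in their own layout files.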
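To confirm the tag actually shows up in the rendered page after the cache flush, something like the sketch below could be run against the page source. It uses only Python's standard `html.parser`; the `RobotsMetaParser` class, the `robots_directives` helper, and the sample HTML are hypothetical names and data for illustration, and fetching the real page is left out.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the lower-cased robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source, standing in for the rendered login page.
SAMPLE_HTML = (
    '<html><head><meta name="robots" content="NOINDEX,NOFOLLOW"/></head>'
    "<body></body></html>"
)

if __name__ == "__main__":
    found = robots_directives(SAMPLE_HTML)
    print("noindex present:", "noindex" in found)
```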
-
@LASClients you could try adding the meta tag below to the pages that you want to noindex. Apparently this will only work on the latest release of Magento.
<meta name="robots" content="NOINDEX,NOFOLLOW"/>