Can you see the 'indexing rules' that are in place for your own site?
-
By 'indexing rules' I mean the stipulations that determine whether or not a given page will be indexed.
If you can see them - how?
-
Unfortunately, that would be specific to your own platform and server-side code. When you look at the SEOmoz source code, you're either going to see a nofollow or you're not. The code that drives that is on our servers and is unique to our build (PHP/Cake, I think).
You'd have to dig into the source code generating the robots.txt file. The file has to be served at /robots.txt, so unless the server routes that URL to a script, there must be a piece of code that generates a new robots.txt file, probably on a timer. It could be called something similar, like robots.php, robots.aspx, etc. Just a guess.
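If you do have access to the code base, one rough way to start digging is to search it for the strings that usually mark an indexing rule. This is only a sketch: the function name, the file extensions, and the search patterns are my assumptions, not anything specific to your platform.

```python
import re
from pathlib import Path

# Assumed patterns that usually mark an indexing rule; extend for your code base.
PATTERN = re.compile(r"noindex|nofollow|Disallow", re.IGNORECASE)


def find_indexing_rules(root, extensions=(".php", ".aspx", ".html", ".txt")):
    """Yield (path, line_number, line) for each source line matching PATTERN."""
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and file types we are not interested in.
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                yield str(path), number, line.strip()
```

Each hit only tells you where to look; the condition wrapped around it (a points or file-count threshold, for example) is the actual rule.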
FYI, a dynamic robots.txt could be a little dicey - it might be better to do this with a META NOINDEX tag in the <head> of the user profile pages. That would also avoid the timer approach. The pages would dynamically NOINDEX themselves as they're created.
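As a minimal sketch of that META NOINDEX approach, assuming the 10-file threshold used as an example in this thread (the function and constant names are hypothetical, and your real rule lives in your own server-side code):

```python
# Hypothetical threshold, taken from the 10-file example in this thread.
MIN_FILES_FOR_INDEXING = 10


def robots_meta_tag(files_submitted, min_files=MIN_FILES_FOR_INDEXING):
    """Return the meta robots tag for a profile page, or '' if it may be indexed."""
    if files_submitted < min_files:
        # Below the threshold: tell search engines not to index this page.
        return '<meta name="robots" content="noindex">'
    # At or above the threshold: emit nothing, so the page is indexable.
    return ""
```

The profile template would print the returned string inside its <head> on every request, so a page drops its NOINDEX as soon as the user crosses the threshold, with no timer involved.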
-
To hopefully clarify what I'm talking about, I want to provide this example: SEOmoz will remove the "nofollow" attribute from the first link in your profile if you get 200 mozpoints.
This is a set rule which I believe is applied automatically once a user reaches the minimum. On my site, a similar rule exists where the meta noindex tag is removed from a user page once that user submits 10 'files'.
Other rules similar to this were created, and I need to know what they are. How can I find them?
-
On my site, a rule was created so that user profile pages are blocked from search engine robots unless the user has submitted a minimum number of 'files'. This was done to ensure that only quality user profile pages are indexed, and not spam or untouched profiles.
Other rules like this have been created, but I don't know what they are, and I'd like to find out.
-
Hi David,
Do you mean how robots.txt is configured and whether the robots file is blocking a certain page from being indexed? If so, yes. If the file is complex and you're not sure whether it's blocking a particular page, you can go into Google Webmaster Tools, which has a robots.txt utility: input a particular URL and it will tell you whether the robots.txt file you are using (or proposing) blocks that URL.
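If you'd rather check locally, Python's standard library has a robots.txt parser that can answer the same question. A minimal sketch, where the sample file contents, the URLs, and the is_blocked helper are all hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute your site's real (or proposed) file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /users/
"""


def is_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

With the sample above, is_blocked(SAMPLE_ROBOTS_TXT, "https://example.com/users/david") is True, while a URL under /about is not blocked. As far as I know this parser does plain prefix matching, so treat the Google utility as the final word for rules that use wildcards.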
If you mean whether the page is of high enough quality for a search engine to choose to index it, then no: that's part of the algorithm, and none of the major engines are that nice and open.