How to get a list of URLs blocked by robots.txt
-
This is my site.
It's in WordPress. I just want to know whether there is any way to get the list of URLs blocked by robots.txt.
Google Webmaster Tools doesn't show the list; it only gives the number of blocked URLs.
Is there any plugin or software to extract the list of blocked URLs?
-
If you use Bing Webmaster Tools, you can see a complete list of all URLs blocked by robots.txt. You can export the file and then filter it.
Just go to Reports & Data > Crawl Information within your Bing Webmaster account. I am not aware of this feature in Google Webmaster Tools. Hope this helps.
-
simon_realbuzz, if I use /classifieds/, it means I am blocking all URLs starting with it. I want to get a list of all blocked URLs on the site.
Example:
http://muslim-academy.com/classifieds/
How many URLs under this classifieds path are blocked by my robots.txt?
-
I'm sorry, I don't follow. If you go to that URL, you will see the list of blocked URLs, as I've pasted below.
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /forum/viewtopic.php?p=
Disallow: /forum/viewtopic.php?=&p=
Disallow: /forum/viewtopic.php?t=
Disallow: /forum/viewtopic.php?start=
Disallow: /forum/&view=previous
Disallow: /forum/&view=next
Disallow: /forum/&sid=
Disallow: /forum/&p=
Disallow: /forum/&sd=a
Disallow: /forum/&start=0
Disallow: /forum/memberlist.php
Disallow: /forum/posting.php
Disallow: /classifieds/
Disallow: /forum/index.php
Disallow: /forum/ucp
Disallow: /http://muslim-academy.com/الا�%A..
Disallow: /http://muslim-academy.com/особенн%D
Disallow: /http://muslim-academy.com/ислам-ка%
Disallow: /http://muslim-academy.com/classifieds/ads/
Disallow: /http://muslim-academy.com/значени%D..
Disallow: /.ifieds/
Disallow: /.ifieds/ads/
Disallow: /forum/alternatelogin/al_tw_connect.php?authentication=1
Disallow: /forum/search.php
-
simon_realbuzz, I need a list of the blocked URLs, not the path to the robots.txt file.
-
You can view your robots file simply by appending /robots.txt to your site URL. Just go to http://muslim-academy.com/robots.txt and you'll be able to view your robots file.
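The robots.txt file only lists rules, not the URLs they match, so getting an actual list of blocked URLs means checking a set of known URLs against those rules. Here is a minimal sketch of that idea using Python's standard urllib.robotparser. It assumes you already have a list of candidate URLs (for example from a sitemap or a crawl export); the helper name blocked_urls and the sample URLs marked hypothetical are made up for illustration, and only two rules from the file above are included to keep it short. The standard-library parser does plain prefix matching, so its results may not match Google's interpretation exactly.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_lines, urls, user_agent="*"):
    """Return the URLs from `urls` that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines,
    # so no network request is needed here.
    parser.parse(robots_lines)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Two rules taken from the robots.txt quoted in this thread.
robots_lines = [
    "User-agent: *",
    "Disallow: /wp-admin/",
    "Disallow: /classifieds/",
]

# Candidate URLs to test. In practice these would come from a sitemap
# or a crawl export; the ones marked hypothetical are invented examples.
candidate_urls = [
    "http://muslim-academy.com/",
    "http://muslim-academy.com/classifieds/",
    "http://muslim-academy.com/classifieds/ads/example",  # hypothetical
    "http://muslim-academy.com/wp-admin/options.php",     # hypothetical
    "http://muslim-academy.com/about/",                   # hypothetical
]

blocked = blocked_urls(robots_lines, candidate_urls)
for url in blocked:
    print(url)
```

Every URL whose path starts with /classifieds/ or /wp-admin/ ends up in the blocked list, which is the prefix behaviour described earlier in the thread. To check the live file instead of pasted lines, RobotFileParser also offers set_url() and read().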