How to get a list of URLs blocked by robots.txt
-
This is my site.
It's in WordPress. I just want to know if there is any way I can get the list of URLs blocked by robots.txt.
Google Webmaster Tools doesn't show the list. It only gives the number of blocked URLs.
Is there any plugin or software to extract the list of blocked URLs?
-
If you use Bing Webmaster Tools you can see a complete list of all URLs blocked by robots.txt. You can export the file and then filter it.
Just go to Reports & Data > Crawl Information within your Bing Webmaster account. I am not aware of this feature being in Google Webmaster Tools. Hope this helps.
-
simon_realbuzz, if I use /classifieds/ it means I am blocking every URL that starts with it. I want to get a list of all the blocked URLs on my site.
Example:
http://muslim-academy.com/classifieds/
How many URLs under this classifieds path are blocked by my robots.txt?
-
I'm sorry I don't follow. If you go to that URL you will see the list of blocked URLs as I've pasted below.
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /forum/viewtopic.php?p=
Disallow: /forum/viewtopic.php?=&p=
Disallow: /forum/viewtopic.php?t=
Disallow: /forum/viewtopic.php?start=
Disallow: /forum/&view=previous
Disallow: /forum/&view=next
Disallow: /forum/&sid=
Disallow: /forum/&p=
Disallow: /forum/&sd=a
Disallow: /forum/&start=0
Disallow: /forum/memberlist.php
Disallow: /forum/posting.php
Disallow: /classifieds/
Disallow: /forum/index.php
Disallow: /forum/ucp
Disallow: /http://muslim-academy.com/الا�%A..
Disallow: /http://muslim-academy.com/особенн%D
Disallow: /http://muslim-academy.com/ислам-ка%
Disallow: /http://muslim-academy.com/classifieds/ads/
Disallow: /http://muslim-academy.com/значени%D..
Disallow: /.ifieds/
Disallow: /.ifieds/ads/
Disallow: /forum/alternatelogin/al_tw_connect.php?authentication=1
Disallow: /forum/search.php
-
simon_realbuzz, I need a list of the blocked URLs, not the robots.txt file path.
-
You can view your robots file simply by appending /robots.txt to your site URL. Just open http://muslim-academy.com/robots.txt in your browser and you'll be able to view your robots file.
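If what you need is the list of blocked URLs rather than the rules themselves, one option is to check a list of your site's URLs against the robots file with a short script. The following is only a rough sketch using Python's standard `urllib.robotparser`. The rules are a few lines copied from the robots.txt quoted in this thread, while the `blocked_urls` helper and the `SITE_URLS` list are made up for illustration (in practice the URLs would come from your sitemap or a crawl export). Note that this parser uses plain prefix matching, so results may differ from a search engine's own handling of rules that contain wildcards.

```python
from urllib.robotparser import RobotFileParser

# A few rules copied from the robots.txt quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /classifieds/
Disallow: /forum/memberlist.php
"""


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs from `urls` that the given robots.txt text disallows.

    Hypothetical helper: not part of any plugin or webmaster tool.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Assumed sample URLs for illustration only; replace with your own list,
# e.g. the URLs from your sitemap or a crawl export.
SITE_URLS = [
    "http://muslim-academy.com/",
    "http://muslim-academy.com/classifieds/",
    "http://muslim-academy.com/classifieds/ads/example-ad/",
    "http://muslim-academy.com/wp-admin/options.php",
    "http://muslim-academy.com/some-article/",
]

if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, SITE_URLS):
        print(url)
```

With a rule such as `Disallow: /classifieds/`, every URL whose path starts with /classifieds/ ends up in the output, so counting the printed lines gives the number of blocked URLs for that section.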