Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Your browser does not seem to support JavaScript. As a result, your viewing experience will be diminished, and you have been placed in read-only mode.

Please download a browser that supports JavaScript, or enable it if it's disabled (i.e. NoScript).

Robots.txt Tester - syntax not understood

2

3

1562

Log in to reply

JamesHancocks1 last edited by

I've looked in the robots.txt Tester and I can see 3 warnings:

There is a 'syntax not understood' warning for each of these.

XML Sitemaps:
https://www.pkeducation.co.uk/post-sitemap.xml
https://www.pkeducation.co.uk/sitemap_index.xml

How do I fix or reformat these to remove the warnings?

Many thanks in advance.
Jim
1 Reply Last reply
Reply Quote 0
JamesHancocks1 @Martijn_Scheijbeler last edited by

I'm to give that a go Martijn.

The text "XML Sitemaps" is in there and flagas as an error. Does this need to be reformatted as well or deleted?

Kind regards,
James.
1 Reply Last reply
Reply Quote 0
Martijn_Scheijbeler last edited by

Hi James,

The right syntax is:

Sitemap: https://www.pkeducation.co.uk/post-sitemap.xml
Sitemap: https://www.pkeducation.co.uk/sitemap_index.xml

When you retry it should show up as working.

Martijn.
1 Reply Last reply
Reply Quote 2

Got a burning SEO question?

Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.

Start my free trial

Browse Questions

View

From

Sorted by

With category

Explore more categories

Related Questions

Disallow wildcard match in Robots.txt

This is in my robots.txt file, does anyone know what this is supposed to accomplish, it doesn't appear to be blocking URLs with question marks Disallow: /?crawler=1
Disallow: /?mobile=1 Thank you
Technical SEO | | AmandaBridge

0
Google is Still Blocking Pages Unblocked 1 Month ago in Robots

I manage a large site over 200K indexed pages. We recently added a new vertical to the site that was 20K pages. We initially blocked the pages using Robots.txt while we were developing/testing. We unblocked the pages 1 month ago. The pages are still not indexed at this point. 1 page will show up in the index with an omitted results link. Upon clicking the link you can see the remaining un-indexed pages. Looking for some suggestions. Thanks.
Technical SEO | | Tyler123

0
Log in, sign up, user registration and robots

Hi all, We have an accommodation site that asks users only to register when they want to book a room, in the last step. Though this is the ideal situation when you have tons of users, nowadays we are having around 1500 - 2000 per day and making tests we found out that if we ask for a registration (simple, 1 click FB) we mail them all and through a good customer service we are increasing our sales. That is why, we would like to ask users to register right after the home page ie Home/accommodation or and all the rest. I am not sure how can I make to make that content still visible to robots.
Will the authentication process block google crawling it? Maybe something we can do? We are not completely sure how to proceed so any tip would be appreciated. Thank you all for answering.
Technical SEO | | Eurasmus.com

3
How to solve the meta : A description for this result is not available because this site's robots.txt. ?

Hi, I have many URL for commercialization that redirects 301 to an actual page of my companies' site. My URL provider say that the load for those request by bots are too much, they put robots text on the redirection server ! Strange or not? Now I have a this META description on all my URL captains that redirect 301 : A description for this result is not available because this site's robots.txt. If you have the perfect solutions could you share it with me ? Thank You.
Technical SEO | | Vale7

0
Will an XML sitemap override a robots.txt

I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an xml sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedent over the xmls sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this? or any experience with a website that has had a clear Disallow: / for months , that somehow has pages in the index?
Technical SEO | | KCBackofen

0
Meta-robots Nofollow

I don't understand Meta-robots Nofollow. Wordpress has my homepage set to this according to SEOMoz tool. Is this really bad?
Technical SEO | | hopkinspat

1
Do you get credit for an external link that points to a page that's being blocked by robots.txt

Hi folks, No one, including me seems to actually know what happens!? To repeat: If site A links to /home.html on site B and site B blocks /home.html in Robots.txt, does site B get credit for that link? Does the link pass PageRank? Will Google still crawl through it? Does the domain get some juice, but not the page? I know there's other ways of doing this properly, but it is interesting no?
Technical SEO | | DaveSottimano

0
How does robots.txt affect aliased domains?

Several of my sites are aliased (hosted in subdirectories off the root domain on a single hosting account, but visible at www.theSubDirectorySite.com) Not ideal, I know, but that's a different issue. I want to block bots from viewing those files that are accessible in subdirectories on the main hosting account, www.RootDomain.com/SubDirectorySite/, and force the bots to look at www.SubDirectorySite.com instead. I utilized the canonical meta tag to point bots away from the sub directory site, but I am wondering what will happen if I use robots.txt to block those files from within the root domain. Will the bots, specifically Google bot, still index the site at its own URL, www.AnotherSite.com even if I've blocked that directory with Disallow: /AnotherSite/ ? THANK YOU!!!
Technical SEO | | michaelj_me

0