Disallow: /jobs/? is this stopping the SERPs from indexing job posts
-
Hi,
I was wondering what this would be used for as it's in the Robots.exe of a recruitment agency website that posts jobs. Should it be removed?Disallow: /jobs/?
Disallow: /jobs/page/*/Thanks in advance.
James -
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Since from the robots.txt the listing page pagination is blocked, the crawler can access only the first 15 job postings are available to crawl via a normal crawl.
I would say, you should remove the blocking from the robots.txt and focus on implementing a correct pagination. *which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title an anchor text for the job posting. (every single job is linked with "Find out more").
Also if possible, create a separate sitemap.xml for your job posts and submit it in Search Console, this way you can keep track of any anomaly with indexation.
Last, and not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry I've been away for a while. Thanks for all of your advice guys.
Here is the url if that helps?
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea is (which we both highlighted), that blocking your listing page from robots.txt is wrong, for pagination you have several methods to deal with (how you deal with it, it really depends on the technical possibilities that you have on the project).
Regarding James' original question, my feeling is, that he is somehow blocking their posting pages. Cutting the access to these pages makes it really hard for Google, or any other search engine to index it. But without a URL in front of us, we cannot really answer his question, we can only create theories that he can test
-
Ah yes when it's pointed out like that, it's a conflicting signal isn't It. Makes sense in theory, but if you're setting it to noindex and then passing that on via a canonical it's probably not the best is it.
They're was link out in that thread to a discussion of people who still do that with success, but after reading that I would just use noindex only as you said. (Still prefer the no index on the robots block though)
-
Sorry Richard, but using noindex with canonical link is not quite a good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully. And they may even treat it negatively as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warning in search console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? which basically blocks every single URL that contains /jobs/**? **
For example: domain.com**/jobs/?**sort-by=... will be blocked
If you want to disallow query parameters from URL, the correct implementation would be Disallow: /jobs/*? or even specify which query parameter you want to block. For example Disallow: /jobs/*?page=
My question to you, if these jobs are linked from any other page and/or sitemap? Or only from the listing page, which has it's pagination, sorting, etc. is blocked by robots.txt? If they are not linked, it could be a simple case of orphan pages, where basically the crawler cannot access the job posting pages, because there is no actual link to it. I know it is an old rule, but it is still true: Crawl > Index > Rank.
BTW. I don't know why you would block your pagination. There are other optimal implementations.
And there is always the scenario, that was already described by Matt. But I believe in that case you would have at least some of the pages indexed even if they are not going to get ranked well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content ( job description, title etc.) will just be a duplication of the content that can be found in many other locations. If a plugin is used, they sometimes automatically add a disallow into the robots.txt file as to not hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Homepage only has disappeared from SERPs
I have a website that has been ranking really well with the home page for our most important keywords. 2nd position. A few days ago I noticed the home page was nowhere to be found in the SERPs. Not really knowing what to do I republished my Rapidweaver site and got the message that there were two index files. An index.html and index.php file which could cause problems. The home page has always been a html page so I have no idea how this was created. I deleted the php file from the file directory and resubmitted the homepage in webmaster tools for indexing. Within about half an hour the page was appearing in the SERPs again in my original position for the important keywords. Two days later and it's gone again. I've tried everything I can think of and resubmitted the page to Google for indexing. I can see it is being indexed as it comes up when I type the URL into the search bar but it will not come up in the search results either for my two keywords or the actual name of my business. When I type in business name it brings up lots of other pages from the site but not the home page. I've spoken to 3 SEO companies and nobody knows what is causing it. Please help with any suggestions this will definitely impact our business if we can't figure it out. Has this happened to anyone else? My website is NSFW but is www.aprilnites.com.au
Intermediate & Advanced SEO | | GemmaApril0 -
Putting my content under domain.com/content, or under related categories: domain.com/bikes/content ?
Hello This questions plays on what Joe Hall talked about during this years' MozCon: Rethinking Information Architecture for SEO and Content Marketing. My Case:
Intermediate & Advanced SEO | | Inevo
So.. we're working out guidelines and templates for a costumer (sporting goods store) on how to publish content (articles, videos, guides) on their category pages, product pages, and other pages. At this moment I have 2 choices:
1. Use a url-structure/information architecture where all the content is placed in one subfolder, for example domain.com/content. Although it's placed here, there's gonna be extensive internal linking from /content to the related category pages, so the content about bikes (even if it's placed under domain.com/bikes) will be just as visible on the pages related to bikes. 2. Place the content about bikes on a subdirectory under the bike category, **for example domain.com/bikes/content. ** The UX/interface for these two scenarios will be identical, but the directories/folder-hierarchy/url structure will be different. According to Joe Hall, the latter scenario will build up more topical authority and relevance towards the category/topic, and should be the overall most ideal setup. Any thoughts on which of the two solutions is the most ideal? PS: There is one critical caveat her: my costumer uses many url-slugs subdirectories for their categories, for example domain.com/activity/summer/bikes/, which means the content in the first scenario will be 4 steps away from the home page. Is this gonna be a problem? Looking forward to your thoughts 🙂 Sigurd, INEVO0 -
To index or de-index internal search results pages?
Hi there. My client uses a CMS/E-Commerce platform that is automatically set up to index every single internal search results page on search engines. This was supposedly built as an "SEO Friendly" feature in the sense that it creates hundreds of new indexed pages to send to search engines that reflect various terminology used by existing visitors of the site. In many cases, these pages have proven to outperform our optimized static pages, but there are multiple issues with them: The CMS does not allow us to add any static content to these pages, including titles, headers, metas, or copy on the page The query typed in by the site visitor always becomes part of the Title tag / Meta description on Google. If the customer's internal search query contains any less than ideal terminology that we wouldn't want other users to see, their phrasing is out there for the whole world to see, causing lots and lots of ugly terminology floating around on Google that we can't affect. I am scared to do a blanket de-indexation of all /search/ results pages because we would lose the majority of our rankings and traffic in the short term, while trying to improve the ranks of our optimized static pages. The ideal is to really move up our static pages in Google's index, and when their performance is strong enough, to de-index all of the internal search results pages - but for some reason Google keeps choosing the internal search results page as the "better" page to rank for our targeted keywords. Can anyone advise? Has anyone been in a similar situation? Thanks!
Intermediate & Advanced SEO | | FPD_NYC0 -
Internal links to blog posts
I am linking manually to blog posts in my site from my Home page. Our site isn't set up with an auto "Recent Posts" that shows on Home. Should I use the exact blog post title as the anchor text or do I need to create something that is not an exact match to the title of the post?
Intermediate & Advanced SEO | | gfiedel0 -
Remove content that is indexed?
Hi guys, I want to delete a entire folder with content indexed, how i can explain to google that content no longer exists?
Intermediate & Advanced SEO | | Valarlf0 -
How do i redirect www.domain.com/ to www.domain.com/index.php
I keep getting in my analytics www.domain.com/ and www.domain.com/index.php how do i make it consistently redirect to one version and not to both. I know about htaccess redirect and am already using this so am puzzle to which is the best one to use. below is the example .htaccess file im using. Options +FollowSymlinks
Intermediate & Advanced SEO | | mattmillen
RewriteEngine on
rewritecond %{http_host} ^domain.co.uk [nc]
rewriterule ^(.*)$ http://www.domain.co.uk/index.php$1 [r=301,nc] which is better for SEO should i forward to www.domain.com/ or www.domain.com/index.php0 -
Http://blogsearch.google.com/ping
Is there any reason why a website would submit all their content (videos, photo galleries, articles) to this?
Intermediate & Advanced SEO | | MargaritaS0 -
Keyword/Content Consistency
My question is: If you have a keyword that is searched more when it's spelled wrong then when it's spelled right - what do you do? Do you do the misspelled word or keep true to the spelling and say oh well to SEO? Also - Along the same lines of that question: What if you have a keyword that has a - in the middle of it. For instance: website and web-site (this isn't the keyword just an example). and drupal website is searched more then drupal web-site but wordpress web-site is searched more then wordpress website. Technically website is the correct spelling and way to write it, but people put web-site (again not the case in reality - just an example).
Intermediate & Advanced SEO | | blackrino0