Disallow: /jobs/? is this stopping the SERPs from indexing job posts
-
Hi,
I was wondering what this would be used for, as it's in the robots.txt of a recruitment agency website that posts jobs. Should it be removed?
Disallow: /jobs/?
Disallow: /jobs/page/*/
Thanks in advance.
James
-
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Since the robots.txt blocks the listing page's pagination, only the first 15 job postings are available via a normal crawl.
I would say you should remove the block from the robots.txt and focus on implementing correct pagination. Which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title the anchor text for the job posting (currently every single job is linked with "Find out more").
Also, if possible, create a separate sitemap.xml for your job posts and submit it in Search Console; this way you can keep track of any indexation anomalies.
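As a rough illustration of the separate job sitemap, here is a minimal Python sketch that builds a sitemap.xml body from a list of job URLs. The function name build_sitemap and the example URL list are assumptions for illustration only; how you actually collect the job URLs depends on your CMS or job plugin.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical input: in practice this list would come from your job database.
job_urls = ["https://www.pkeducation.co.uk/job/post-name/"]
print(build_sitemap(job_urls))
```

You would save the output as, say, a jobs-only sitemap file and submit that file on its own in Search Console, so its indexation numbers are reported separately from the rest of the site.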
Last but not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry I've been away for a while. Thanks for all of your advice guys.
Here is the url if that helps?
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea (which we both highlighted) is that blocking your listing page in robots.txt is wrong. For pagination there are several methods to choose from; which one you use really depends on the technical possibilities you have on the project.
Regarding James' original question, my feeling is that he is somehow blocking the job posting pages. Cutting off access to these pages makes it really hard for Google, or any other search engine, to index them. But without a URL in front of us we cannot really answer his question; we can only offer theories that he can test.
-
Ah yes, when it's pointed out like that, it's a conflicting signal, isn't it? Makes sense in theory, but if you're setting it to noindex and then passing that on via a canonical, it's probably not the best, is it?
There was a link in that thread to a discussion of people who still do that with success, but after reading it I would just use noindex only, as you said. (I still prefer noindex over the robots.txt block, though.)
-
Sorry Richard, but using noindex together with a canonical link is not good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully. And they may even treat it negatively, as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warnings in Search Console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? blocks every URL whose path starts with /jobs/? (robots.txt rules are matched from the beginning of the path).
For example, domain.com/jobs/?sort-by=... will be blocked.
If you want to disallow query parameters anywhere under /jobs/, the correct implementation would be Disallow: /jobs/*? or you can even specify which query parameter you want to block, for example Disallow: /jobs/*?page=
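To see how these rules differ, here is a minimal Python sketch of the wildcard matching that major search engines document for robots.txt (match from the start of the path, `*` stands for any run of characters, a trailing `$` anchors the end). The function name rule_matches is made up for illustration, and this is a simplification, not a full robots.txt parser; as far as I know, Python's built-in urllib.robotparser does not expand `*` wildcards, which is why the sketch uses a regex instead.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path (incl. query)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the beginning, which gives the prefix behaviour.
    return re.match(regex, url_path) is not None


# The two rules from the original question:
print(rule_matches("/jobs/?", "/jobs/?sort-by=date"))      # True
print(rule_matches("/jobs/?", "/jobs/"))                    # False
print(rule_matches("/jobs/page/*/", "/jobs/page/2/"))       # True
print(rule_matches("/jobs/page/*/", "/job/post-name/"))     # False

# The wildcard variant also catches query strings deeper under /jobs/:
print(rule_matches("/jobs/?", "/jobs/teaching/?page=2"))    # False
print(rule_matches("/jobs/*?", "/jobs/teaching/?page=2"))   # True
```

So the existing rules leave /jobs/ itself and the /job/post-name/ pages crawlable, but block the sorted and paginated listing URLs, which is where most of the links to the individual job posts live. The sample paths such as /jobs/teaching/ are hypothetical.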
My question to you: are these jobs linked from any other page and/or sitemap? Or only from the listing page, whose pagination, sorting, etc. are blocked by robots.txt? If they are not linked elsewhere, it could be a simple case of orphan pages, where the crawler cannot reach the job posting pages because there is no crawlable link to them. I know it is an old rule, but it is still true: Crawl > Index > Rank.
BTW, I don't know why you would block your pagination. There are better implementations.
And there is always the scenario that Matt already described. But I believe in that case you would have at least some of the pages indexed, even if they don't rank well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content (job description, title, etc.) will just be a duplicate of content that can be found in many other locations. If a plugin is used, it sometimes automatically adds a disallow to the robots.txt file so as not to hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.