Getting Google to index our sitemap
-
Hi,
We have a sitemap on AWS that is retrievable via a url that looks like ours http://sitemap.shipindex.org/sitemap.xml. We have notified Google it exists and it found our 700k urls (we are a database of ship citations with unique urls). However, it will not index them. It has been weeks and nothing. The weird part is that it did do some of them before, it said so, about 26k. Then it said 0. Now that I have redone the sitemap, I can't get google to look at it and I have no idea why. This is really important to us, as we want not just general keywords to find our front page, but we also want specific ship names to show links to us in results. Does anyone have any clues as to how to get Google's attention and index our sitemap? Or even just crawl more of our site? It has done 35k pages crawling, but stopped.
-
Now I can see Sitemaps, loadings takes time ... a lot and they look weird, but maybe ok. But there is stuff in it, wich I wont like to have in Google-Index. Northeless - whats the message in GSC?
(opend in Chrome, Firefox and on my pixel as well - the first one is looking good, all linked once had the error, now they are differnet from each other (with Linebreaks or without, with space or without) but contain links at least)
Is the site on a subdomain for pages on a different domain? (didn't saw that) - that makes it way more tricky ...
-
I redid the sitemap and just made them xml, Andreas. It hasn't seemed to help. Still not getting indexed. I don't know where you were seeing that information in the sitemap files. Can you tell me how you opened them to see that? All I see is the normal content.
Shawn
-
Maybe I need to change them to plain xml files and update the index file?
-
Where are you seeing the error? I am opening them and see all the content required. I am confused. I don't think I have a key field in the sitemaps.
-
Hi,
I wrote a post about Google & Sitemaps think two month ago, (https://intenseo.de/seo-blog/google/google-sitemaps/) unfortunately in german. So I guess I have to translate the stuff:
- A Sitemap should have not more than 50,000 entrys (Google-News-Sitemaps only 1,000)
- and should not be bigger than 50MB
So you have to split it and you allready did.
Now your Main-Sitemap is pointing to other Sitemaps (zipped, but thats not a problem), ok. So whenever GSC is telling me, my Sitemap has errors or no entrys, I open it and check. I did, I just opened the first one, look what is in it:
NoSuchKey
<message>The specified key does not exist.</message><key>sitemaps/sitemap1.xml</key><requestid>97FFA90B9843EBCA</requestid>vBzVH8Lx9fLYpPgv5SKfSzlKb4lcGxX4+V9JBO4f/M7HiDXQJT/hoLd9b/IYWanl06M41M4oCN8=I opened all of your Sitemaps, no entries in it.At least, Google indexed >8,000 Pages, but not by Sitemap thats for sure.
You can just create sitemaps with Tools (link at the bottom) or with e.g. screaming frog and upload them to your server (zipped or not doesn't matter) sent to google and done. If your System is not working at the moment, easy workaround for short.After that - try to find the Bug in creating your sitemaps, solve it and sent these to Google. Before you sent them, open your sitemaps and check if they are working. Don't wait weeks, Google is fast.List of Sitemap Generators: https://code.google.com/archive/p/sitemap-generators/wikis/SitemapGenerators.wiki
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Indexing
Hi We have roughly 8500 pages in our website. Google had indexed almost 6000 of them, but now suddenly I see that the pages indexed has gone to 45. Any possible explanations why this might be happening and what can be done for it. Thanks, Priyam
Intermediate & Advanced SEO | | kh-priyam0 -
Google Indexing Of Pages As HTTPS vs HTTP
We recently updated our site to be mobile optimized. As part of the update, we had also planned on adding SSL security to the site. However, we use an iframe on a lot of our site pages from a third party vendor for real estate listings and that iframe was not SSL friendly and the vendor does not have that solution yet. So, those iframes weren't displaying the content. As a result, we had to shift gears and go back to just being http and not the new https that we were hoping for. However, google seems to have indexed a lot of our pages as https and gives a security error to any visitors. The new site was launched about a week ago and there was code in the htaccess file that was pushing to www and https. I have fixed the htaccess file to no longer have https. My questions is will google "reindex" the site once it recognizes the new htaccess commands in the next couple weeks?
Intermediate & Advanced SEO | | vikasnwu1 -
Moving a lot of pdfs to main site. Worth trying to get them indexed?
On my main site we link to pdfs that are located on another one of our domains. The only thing that is on this other domain is the pdfs. It was setup really poorly so I am going to redesign everything and probably move it. Is it worthwhile trying to add these pdfs to our sitemap and to try and get them indexed? They are all connected to a current item, but the content is original.
Intermediate & Advanced SEO | | EcommerceSite0 -
Google is indexing the wrong pages
I have been having problems with Google indexing my website since mid May. I haven't made any changes to my website which is wordpress. I have a page with the title 'Peterborough Cathedral wedding', I search Google for 'wedding Peteborough Cathedral', this is not a competitive search phrase and I'd expect to find my blog post on page one. Instead, half way down page 4 I find Google has indexed www.weddingphotojournalist.co.uk/blog with the title 'wedding photojournalist | Portfolio', what google has indexed is a link to the blog post and not the blog post itself. I repeated this for several other blog posts and keywords and found similar results, most of which don't make any sense at all - A search for 'Menorca wedding photography' used to bring up one of my posts at the top of page one. Now it brings up a post titled 'La Mare wedding photography Jersey" which happens to have a link to the Menorca post at the bottom of the page. A search for 'Broadoaks country house weddng photography' brings up 'weddingphotojournalist | portfolio' which has a link to the Broadoaks post. a search for 'Blake Hall wedding photography' does exactly the same. In this case Google is linking to www.weddingphotojournalist.blog again, this is a page of recent blog posts. Could this be a problem with my sitemap? Or the Yoast SEO plugin? or a problem with my wordpress theme? Or is Google just a bit confused?
Intermediate & Advanced SEO | | weddingphotojournalist0 -
301s being indexed
A client website was moved about six months ago to a new domain. At the time of the move, 301 redirects were setup from the pages on the old domain to point to the same page on the new domain. New pages were setup on the old domain for a different purpose. Now almost six months later when I do a query in google on the old domain like site:example.com 80% of the pages returned are 301 redirects to the new domain. I would have expected this to go away by now. I tried removing these URLs in webmaster tools but the removal requests expire and the URLs come back. Is this something we should be concerned with?
Intermediate & Advanced SEO | | IrvCo_Interactive0 -
Can submitting sitemap to Google webmaster improve SEO?
Can creating fresh sitemap and submitting to Google webmaster improve SEO?
Intermediate & Advanced SEO | | chanel270 -
Can Google index PDFs with flash?
Does anyone know if Google can index PDF with Flash embedded? I would assume that the regular flash recommendations are still valid, even when embedded in another document. I would assume there is a list of the filetype and version which Google can index with the search appliance, but was not able to find any. Does anyone have a link or a list?
Intermediate & Advanced SEO | | andreas.wpv0 -
Huge google index with un-relevant pages
Hi, i run a site about sport matches, every match has a page and the pages are generated automatically from the DB. pages are not duplicated, but over time some look a little bit similar. after a match finishes it has no internal links or sitemap entry, but it's reachable by direct URL and continues to be on google index. so over time we have more than 100,000 indexed pages. since past matches have no significance and they're not linked and a match can repeat and it may look like duplicate content....what you suggest us to do: when a match is finished - not linked, but appears on the index and SERP 301 redirect the match Page to the match Category which is a higher hierarchy and is always relevant? use rel=canonical to the match Category do nothing.... *301 redirect will shrink my index status, some say a high index status is good... *is it safe to 301 redirect 100,000 pages at once - wouldn't it look strange to google? *would canonical remove the past matches pages from the index? what do you think? Thanks, Assaf.
Intermediate & Advanced SEO | | stassaf0