Where are the crawled URLS in webmaster tools coming from?
-
When looking at the crawl errors in Webmaster Tools/Search Console, where is Google pulling these URLs from? Sitemap?
-
Just to make complete. Google search console will list errors for pages with links coming from 3 general location
-
Crawling links on your website. Starting from somewhere on your site and going link to link.
-
Crawling links in your sitemap.
-
Crawling URLs from your site that do not exist anymore on your site or sitemap. I have seen Google keep things in memory and come back to hit pages again that are no longer from option 1 or option 2. If you used to have a bunch of 301 directs in place for an old version of your website and then your developer changes something to delete all those 301s and they become 404s, you will find those pages showing up as errors again. This is really useful as it can help diagnose the issue and you can fix it.
-
Crawling links from other sites. Sometimes, this is how links get crawled for #3.
Here is what really sucks about Search Console and I mean sucks big bananas if you are trying to diagnose an issue. If you look at your Search Console error page. You can click on the URL in the report, it will pop up a box and then you can click the tab "Linked From" and see what pages are linking to the URL in question. That is good! If you then download the CSV, all of that info is lost. If you have more than 20 errors to deal with, you do not have a practical way to manage things and see if there is a trend etc. Otherwise you are left with clicking a lot of links in the report and taking lots of notes and going a little insane.
Good luck!
-
-
I agree 100% with Dirk!
Google is going to crawl your site as long as your as long as you have a domain's robots.txt file and Meta tag robots allow for the bot to crawl the site. By not telling Google anything you are saying welcome to my site please index
Google Webmaster tools are doing you a favor and saying look this is a problem that the bot has encountered while indexing your site look into it.
Submitting a XML sitemap to Google will definitely help show them where to look and you can request that they index using crawl is a Googlebot.
Some good advice on how to fix the issues found
**https://www.distilled.net/blog/seo/indexation-problems-diagnosis-using-google-webmaster-tools/ **
a great resource on indexing & robots.txt
https://www.distilled.net/blog/seo/advanced-seo-troubleshooting-why-isnt-this-page-indexed/
https://varvy.com/robottxt.html
I hope this helps,
Tom
-
These emors are the problems the googlebot encounters while crawling your site. A site map can help the googlebot to better crawl your site but isn't strictly necessary .
rgds
Dirk
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Url folder structure
I work for a travel site and we have pages for properties in destinations and am trying to decide how best to organize the URLs basically we have our main domain, resort pages and we'll also have articles about each resort so the URL structure will actually get longer:
Technical SEO | | Vacatia_SEO
A. domain.com/main-keyword/state/city-region/resort-name
_ domain.com/family-condo-for-rent/orlando-florida/liki-tiki-village_ _ domain.com/main-keyword-in-state-city/resort-name-feature _
_ domain.com/family-condo-for-rent/orlando-florida/liki-tiki-village/kid-friend-pool_ B. Another way to structure would be to remove the location and keyword folders and combine. Note that some of the resort names are long and spaces are being replaced dynamically with dashes.
ex. domain.com/main-keyword-in-state-city/resort-name
_ domain.com/family-condo-for-rent-in-orlando-florida/liki-tiki-village_ _ domain.com/main-keyword-in-state-city/resort-name-feature_
_ domain.com/family-condo-for-rent-in-orlando-florida/liki-tiki-village-kid-friend-pool_ Question: is that too many folders or should i combine or break up? What would you do with this? Trying to avoid too many dashes.0 -
Magento URL change
We have a Magento website parked at HostGator. The site is comprised of both a PC and a mobile version. We changed the URL to a new one ... We made the domain changes in the ‘core_config_data’ (phpMyAdmin) ... We flushed the cache in the ‘File Manager’ part of cPanel (regular and mobile version) Currently we can access the http://newsite.com (on a desktop) with no problem ... We can also access http://m.newsite.com BUT… only from a desktop PC. When we try http://newsite.com from a MOBILE device, it routes to: http://m.OLDsite.com (it keeps going to the old URL) Need some help please. Thanks in advance!
Technical SEO | | Prime850 -
URL Understanding -
Hello everyone! Can anyone help me understanding this url? Product.asp?PID=1236 cheers
Technical SEO | | PremioOscar0 -
Linklicious and Crawl rates
Can somebody please explain me what is 'crawl rate' and how does 'linklicious' help us with it? I mean I can always visit the website and know more about it, but I want to understand the concept. Please help.
Technical SEO | | KS__0 -
Google Webmaster tools error?
So I am trying to set the URL preference in google webmaster tools for my site. However when I try to save it it tells me to verify that I own the site. I have already done this so where can I go to verify I own the site exactly? Maybe I am wrong and I have not done this already but even on the homepage of webmaster tools I don't see an option to "verify".
Technical SEO | | ENSO0 -
Does Google pass link juice a page receives if the URL parameter specifies content and has the Crawl setting in Webmaster Tools set to NO?
The page in question receives a lot of quality traffic but is only relevant to a small percent of my users. I want to keep the link juice received from this page but I do not want it to appear in the SERPs.
Technical SEO | | surveygizmo0 -
Trailing Slashes In Url use Canonical Url or 301 Redirect?
I was thinking of using 301 redirects for trailing slahes to no trailing slashes for my urls. EG: www.url.com/page1/ 301 redirect to www.url.com/page1 Already got a redirect for non-www to www already. Just wondering in my case would it be best to continue using htacces for the trailing slash redirect or just go with Canonical URLs?
Technical SEO | | upick-1623910 -
How do i Organize an XML Sitemap for Google Webmaster Tools?
OK, so i used am xlm sitemap generator tool, xml-sitemaps.com, for Google Webmaster Tools submission. The problem is that the priorities are all out of wack. How on earth do i organize it with 1000's of pages?? Should i be spending hours organizing it?
Technical SEO | | schmeetz0