Huge number of crawl anomalies and 404s - non-existent URLs
-
Hi there,
Our site was redesigned at the end of January 2020. Since the new site was launched we have seen a big drop in impressions (50-60%) and also a big drop in total and organic traffic (again 50-60%) when compared to the old site.
I know in the current climate some businesses will see a drop in traffic. However, we are a tech business and some of our core search terms have increased in search volume as a result of remote working.
According to Search Console there are 82k URLs excluded from coverage. The majority of these are classed as 'crawl anomaly' and there are 250+ 404s. Almost all of the URLs are non-existent: they have our root domain with a string of random characters on the end. Here are a couple of examples:
root.domain.com/96jumblestorebb42a1c2320800306682
root.domain.com/01sportsplazac9a3c52miz-63jth601
root.domain.com/39autoparts-agency26be7ff420582220
root.domain.com/05open-kitchenaf69a7a29510363
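To get a sense of scale, one option is to split a Search Console export into spam-style paths and everything else. This is only a rough sketch: the pattern below is a guess inferred from the four examples above (two digits, a letter, then a long run of lowercase letters, digits, or hyphens), and the two "genuine" paths in the sample are hypothetical. It may well miss other spam shapes or catch a real page, so treat the result as a first pass.

```python
import re

# Heuristic only, inferred from the four example paths in this thread:
# two digits, one letter, then 20+ lowercase letters, digits or hyphens.
SPAM_PATH = re.compile(r"^/\d{2}[a-z][a-z0-9-]{20,}$")


def split_urls(paths):
    """Split URL paths into (spam_like, other) lists using the heuristic."""
    spam_like, other = [], []
    for path in paths:
        (spam_like if SPAM_PATH.match(path) else other).append(path)
    return spam_like, other


sample = [
    "/96jumblestorebb42a1c2320800306682",
    "/01sportsplazac9a3c52miz-63jth601",
    "/39autoparts-agency26be7ff420582220",
    "/05open-kitchenaf69a7a29510363",
    "/pricing",           # hypothetical genuine page
    "/blog/remote-work",  # hypothetical genuine page
]
spam_like, other = split_urls(sample)
```

Anything left in `other` is worth checking by hand, since those are more likely to be genuine pages lost in the redesign.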
Is this a cause for concern? I'm thinking that all of these random fake URLs could be preventing genuine pages from being indexed, or that they could be having an impact on our search visibility. Can somebody advise please?
Thanks!
-
Unlikely. As long as they're returning 404 errors you should be OK. Maybe update your disavow file and you should be good to go!
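If it helps, here is a minimal sketch for confirming what a URL actually returns. It uses only the Python standard library and deliberately does not follow redirects, so a 301 shows up as a 301 rather than as the status of its destination. The function names and the wording of the labels are my own, not from any SEO tool.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10.0):
    """Send a HEAD request without following redirects; None if unreachable."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def describe(status):
    """Turn a status code into a short label for a URL that should not exist."""
    if status is None:
        return "unreachable"
    if status in (404, 410):
        return "not found: expected for a fake URL"
    if 300 <= status < 400:
        return "redirect: check where it points"
    if 200 <= status < 300:
        return "live: a fake URL should not return this"
    return "other: investigate"
```

For example, `describe(fetch_status(url))` on a handful of the fake URLs would show whether they really end in a 404 or are being redirected somewhere.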
-
Thanks for your reply.
I’m new to the business and I’ve found that the old website had a spam attack. All of these fake URLs are from the old pages (as they have 301s).
There are 82,000 crawl anomalies from these fake/spam URLs and around 200 404s. None of the fake/spam URLs have been indexed. Could this be having a negative effect on search visibility, DA, or rankings?
Thanks!
-
It's tough to say without seeing the site. Overall it's unlikely if you don't use that string anywhere. We usually see it more for broken relative URLs. Maybe a third party site is using that string.
-
Thanks for your reply. Would broken URLs from the internal linking structure explain the random characters? e.g. root.domain.com/96jumblestorebb42a1c2320800306682
We've never had any page content/urls relating to 'jumblestore'.
Thanks!
-
From what I can tell, this probably isn't the reason for the drops. I'd go back and ensure that any URLs that changed are 301 redirecting to the correct destination URL. I'd also check that every page that was associated with high-volume keywords still exists.
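As a rough sketch of that redirect check: assuming you have a mapping of old paths to their intended new paths, plus the status and `Location` header observed for each old path (from a crawler export, say), something like this lists the mismatches. The paths and the `audit_redirects` name are hypothetical.

```python
def audit_redirects(expected, observed):
    """Compare intended redirects with what the site actually returns.

    expected: dict of old_path -> intended new_path
    observed: dict of old_path -> (status_code, location_header)
    Returns a list of (old_path, problem) tuples.
    """
    problems = []
    for old_path, target in expected.items():
        status, location = observed.get(old_path, (None, None))
        if status != 301:
            problems.append((old_path, f"expected 301, got {status}"))
        elif location != target:
            problems.append((old_path, f"redirects to {location}, expected {target}"))
    return problems


# Hypothetical data for illustration only.
expected = {
    "/old-pricing": "/pricing",
    "/old-features": "/features",
    "/old-contact": "/contact",
}
observed = {
    "/old-pricing": (301, "/pricing"),
    "/old-features": (301, "/"),
    "/old-contact": (404, None),
}
problems = audit_redirects(expected, observed)
```

Redirects that all land on the homepage, like `/old-features` here, are worth a closer look, since the old page's destination no longer matches its topic.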
For your issue, Google is likely finding some broken URLs, possibly from your internal linking structure. Perform a crawl of the site and see if you can find "Inlinks" to those broken pages. If so, you can work with dev to eliminate the issue.
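If your crawler can export its links as source and target pairs, a small script can pull out the inlinks for you. This is a sketch under that assumption. The edge list and page paths below are made up, and the export format will differ between tools.

```python
from collections import defaultdict


def inlinks_to(broken, edges):
    """Map each broken target URL to the sorted list of pages linking to it.

    broken: set of broken URL paths
    edges:  iterable of (source_path, target_path) pairs from a site crawl
    """
    found = defaultdict(set)
    for source, target in edges:
        if target in broken:
            found[target].add(source)
    return {target: sorted(sources) for target, sources in found.items()}


# Hypothetical crawl export for illustration only.
edges = [
    ("/", "/pricing"),
    ("/blog/post-1", "/96jumblestorebb42a1c2320800306682"),
    ("/footer-links", "/96jumblestorebb42a1c2320800306682"),
    ("/blog/post-1", "/features"),
]
broken = {"/96jumblestorebb42a1c2320800306682"}
report = inlinks_to(broken, edges)
```

If the report comes back empty for the spam-style URLs, the links are probably coming from outside the site rather than from your own templates.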