WEBMASTER console: increase in the number of URLs we were blocked from crawling due to authorization permission errors.
-
Hi guys, I received this warning in my webmaster console: "Google detected a significant increase in the number of URLs we were blocked from crawling due to authorization permission errors." So I went to the "Crawl Errors" section and found errors like these under the "Access denied" status:
?page_name=Cheap+Viagra+Gold+Online&id=471
?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
and many happy URLs like these. Does anybody know what this is and where it comes from?
Thanks in advance!
-
Thank you Tom!
-
Hi,
I am not telling you that I am 100% sure your site is infected, but you should remove any chance of infection.
You must be certain that the original infection was removed. If it was not, and you have links created by a third party rather than by yourself, you are better off getting the site completely cleaned.
Use Sucuri.net to rule out a hack.
Just type these into Google:
- ?page_name=Cheap+Viagra+Gold+Online&id=471
- ?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
http://www.pearsonified.com/2010/04/wordpress-pharma-hack.php
https://blog.sucuri.net/2010/07/understanding-and-cleaning-the-pharma-hack-on-wordpress.html
https://sitecheck.sucuri.net/results/www.davidandsonsjewelers.com/articles/author/carole/
I used deepcrawl.com to create the audit you referenced, and Screaming Frog SEO Spider to create the sitemap.
I hope that helps,
Tom
-
Hello Thomas,
I really appreciate your help! You said I can look at your site's structure. What is your site address?
Unfortunately, I still don't know what I need to do to remove that pharma hack from my site. If you know where to point me to get the answer, I'll be very grateful.
Also, what tool did you use to generate this report: http://crawl.blueprintmarketing.com/projects/reports/215533?ro=75ad0c6e4afacc428b553d449dfd281f82ec2ad6 ?
And what tool did you use to create the XML sitemap?
Thanks
-
I found no sitemap. I checked multiple common XML sitemap locations and came up with nothing, and no redirects either (for example, /sitemap_index.xml might exist separately or redirect to /sitemap.xml).
http://www.davidandsonsjewelers.com/sitemap.xml returns a 404.
Tools:
deepcrawl.com, https://varvy.com/mobile/ & https://varvy.com/tools/
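For anyone who wants to repeat the sitemap check by hand, here is a minimal sketch in Python. It is not what deepcrawl or Screaming Frog do internally; the list of candidate paths, the `sitemap-check/0.1` user agent and the function names are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Tuple
from urllib.parse import urljoin

# Common sitemap locations to try. This list is an assumption, not exhaustive.
CANDIDATE_PATHS = ("/sitemap.xml", "/sitemap_index.xml", "/sitemap.xml.gz")

# A fetcher takes a URL and returns (HTTP status, final URL after redirects).
Fetcher = Callable[[str], Tuple[int, str]]


def http_fetch(url: str) -> Tuple[int, str]:
    """Fetch a URL with urllib and return (status, final URL)."""
    request = urllib.request.Request(url, headers={"User-Agent": "sitemap-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; report their status code.
        return error.code, url


def probe_sitemaps(
    base_url: str,
    fetch: Fetcher = http_fetch,
    paths: Iterable[str] = CANDIDATE_PATHS,
) -> Dict[str, str]:
    """Report, for each candidate path, whether a sitemap was found there."""
    results = {}
    for path in paths:
        url = urljoin(base_url, path)
        status, final_url = fetch(url)
        if status == 200 and final_url != url:
            results[path] = f"redirects to {final_url}"
        elif status == 200:
            results[path] = "found"
        else:
            results[path] = f"missing (HTTP {status})"
    return results
```

Calling `probe_sitemaps("http://www.davidandsonsjewelers.com")` would make real HTTP requests; the `fetch` parameter is there so the logic can be tried with a stand-in function first.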
-
If I were you, I would look at the site structure and make sure it was built the way it is for the right reasons.
If your traffic is all right, you really do not want to change the site that much. If you do change the site, change it slowly.
(A great example of this is how FireHost.com is becoming Armor.com.)
The tool I primarily used to find out whether or not you had a sitemap was deepcrawl.com.
To detect mobile issues:
https://varvy.com/mobile/ & https://varvy.com/tools/
http://i.imgur.com/W7BDaq7.png
http://www.screamingfrog.co.uk/seo-spider/
http://i.imgur.com/LbCBmmW.png
I used Screaming Frog to create an XML sitemap for you here
I would definitely add an XML sitemap.
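If you would rather build a small sitemap yourself, the sketch below shows the general shape of the file using only the Python standard library. It is a rough illustration, not Screaming Frog's output: `build_sitemap` is a hypothetical helper, and it only writes the required `<loc>` element for each URL.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: Iterable[str]) -> str:
    """Return a minimal sitemap.xml document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # De-duplicate and sort so the output is stable between runs.
    for url in sorted(set(urls)):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

You would pass it the list of pages you actually want indexed, save the result as /sitemap.xml, and submit that URL in Webmaster Tools.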
Sincerely,
Thomas
-
Also, are you saying that the mobile site is blocked? And how do you see that the site doesn't have an XML sitemap? What tool shows you all this info?
Thanks
-
Hi Thomas,
I really appreciate your help! Can you advise me what I should do? I see all these reports, but I don't know how I need to clean the site.
Thank you!
-
The URLs you are showing are definitely pharma hack URLs. There are certain things Sucuri is unable to detect because it is a front-end tool, not the PHP tool that would be needed for the two-part WordPress and PHP version of your site.
Just type these into Google:
- ?page_name=Cheap+Viagra+Gold+Online&id=471
- ?page_name=Cheapest+Viagra+Us+Licensed+Pharmacies&id=1603
http://www.pearsonified.com/2010/04/wordpress-pharma-hack.php
https://blog.sucuri.net/2010/07/understanding-and-cleaning-the-pharma-hack-on-wordpress.html
https://sitecheck.sucuri.net/results/www.davidandsonsjewelers.com/articles/author/carole/
https://www.virustotal.com/en/ip-address/216.120.237.225/information/
http://dnsbl.inps.de/query.cgi?lang=en&ip=216.120.237.225&action=check&quick=0
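Since a front-end scanner can miss things, it may also help to go through your own crawl-error export or server log for URLs that match the spam pattern. This is a minimal sketch under assumptions: the keyword list is illustrative only, and `is_pharma_spam` is a hypothetical helper that looks at the `page_name` parameter seen in the URLs above.

```python
from urllib.parse import parse_qs, urlsplit

# Illustrative keyword list (an assumption); extend it for your own data.
SPAM_KEYWORDS = ("viagra", "cialis", "tadalis", "pharmac")


def is_pharma_spam(url: str, param: str = "page_name") -> bool:
    """Return True if the URL's `param` query value contains a spam keyword."""
    query = urlsplit(url).query
    # parse_qs decodes "+" and percent-escapes, so "Cheap+Viagra" becomes "Cheap Viagra".
    values = parse_qs(query).get(param, [])
    return any(keyword in value.lower() for value in values for keyword in SPAM_KEYWORDS)
```

Running every URL from the "Access denied" list through this gives a quick count of how many are pharma spam versus legitimate pages that are being blocked by mistake.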
-
and switch everything to WordPress
view-source:http://www.davidandsonsjewelers.com/
-
Some of your links are really not supposed to be there.
Here is your report. Please use the URL below to navigate the entire report.
Your URLs are relative for the most part; that should be fixed. You have a JavaScript redirect that definitely needs to be fixed.
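To see which links on a page are relative and what they resolve to, something like the following could be used. It is only a sketch with the Python standard library; `LinkCollector` and `resolve_links` are hypothetical names, and it looks at `<a href>` attributes only.

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def resolve_links(html: str, base_url: str) -> List[Tuple[str, str, bool]]:
    """Return (original href, absolute URL, was_relative) for each link."""
    collector = LinkCollector()
    collector.feed(html)
    report = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        # A link is relative when it carries neither a scheme nor a host.
        was_relative = not parts.scheme and not parts.netloc
        report.append((href, urljoin(base_url, href), was_relative))
    return report
```

The second value in each tuple is the absolute URL you would write into the page in place of the relative one.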
PDF & XML outline
- http://cl.ly/d6Sv/www.davidandsonsjewelers.com_http-www-davidandsonsjewelers-com-_13-09-2015_overview_215533.pdf
- http://cl.ly/d6S7/public-report_files-215533-www.davidandsonsjewelers.com_http-www-davidandsonsjewelers-com-_13-09-2015_overview_215533.xls
You have roughly 108 indexed URLs according to Google
https://marketing.grader.com/report/www.davidandsonsjewelers.com/overall
Unfortunately, you do not have an XML sitemap. I found that out in the first five minutes, but you can also find out these things using
https://mza.bundledseo.com/researchtools/crawl-test
Upon a quick check with another tool I found
http://i.imgur.com/Y60WnIc.png
I love deepcrawl; however, your site is not large, and you can learn a lot about it with
http://www.screamingfrog.co.uk/seo-spider/ (free)
I hope this helps. Without analytics and Webmaster Tools access, I obviously cannot give you a much better picture.
Tom
-
I will run the audit now. Sorry for the delay.
-
The best way to solve this problem is to use
Or the Screaming Frog SEO Spider: http://screamingfrog.co.uk
If you give me the URL, I will do a quick check for you.
-
Thank you Thomas,
My site is clean according to Sucuri, though. I spoke to the owner of this website, and they said that they were hacked in the past and blocked those pages themselves. So is Google now detecting those pages again? Or what exactly is happening? Does anybody know?
Thanks
-
Remember that not every URL is in Google's index. That does not mean that your backlink is not in
https://mza.bundledseo.com/researchtools/ose/
You should very quickly make sure that your website is not still completely full of malware, as it sounds like it is.
Use this tool to determine what has happened to your site and whether it is infected. It is free.
If it is hacked, as I believe it may be based on what you have described, I would then purchase the malware removal and web application firewall:
https://sucuri.net/website-antivirus/
If you would like a much more secure hosting environment, https://armor.com is the best.
Once you have removed your site from the blacklists and removed all the badware/malware, make sure to crawl it with Google in Webmaster Tools using Fetch as Google.
Your nightmare should be short-lived. Sorry to hear that your site was hacked; hopefully this will get you back on track quickly.
-
Hi Dirk,
In Webmaster Tools, if I click those links one by one, I can see the "Linked from" URLs. There are URLs like this:
http://schwagginwagon.com/?page_name=Buying+Tadalis+SX+Safely+No+Prescription+Tadalis+SX&id=1810
There is also one URL coming from my own domain. I'm not sure what that means.
I went through every single URL in the Google index, but all of them are normal URLs. Nothing related to spam. Any ideas?
Thanks
-
Try a search like viagra site:yourdomain.com and see if any pages of a suspicious nature are listed.
In the Crawl Errors section in Webmaster Tools, you can also check where these URLs are coming from (external or internal links).
If your site is hacked - you can find more info here http://www.google.com/webmasters/hacked/ on what to do next.
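Once you have the "Linked from" URLs out of Webmaster Tools, sorting them into internal and external sources can be sketched like this. The helper names are hypothetical, and treating `www.` as the same host while counting other subdomains as external is an assumption you may want to change.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def _host(url: str) -> str:
    """Lower-cased host name with a leading 'www.' removed."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def split_link_sources(site_url: str, linked_from: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split linking URLs into (internal, external) relative to site_url."""
    site_host = _host(site_url)
    internal, external = [], []
    for url in linked_from:
        (internal if _host(url) == site_host else external).append(url)
    return internal, external
```

Internal sources are the ones to worry about first, since they suggest the spam links are (or were) present on your own pages; external ones usually point to other hacked sites linking in.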
rgds,
Dirk
-
Hello Dirk,
Thank you for the fast reply! I thought so too right away. All of these URLs are forbidden when I try to access them. This is the message from Google Webmaster Tools: "Googlebot couldn't crawl your URL because your server either requires authentication to access the page, or it is blocking Googlebot from accessing your site."
Any ideas? Thanks
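One way to see what the server returns for these URLs is to request them with a Googlebot-style user agent and look at the status code. This is a minimal sketch under assumptions: grouping 401, 403 and 407 as "access denied" is my reading of the warning, not an official mapping, and sending this user agent does not make the request come from Google, so a server that verifies Googlebot by IP may answer differently.

```python
import urllib.error
import urllib.request

# Googlebot's published desktop user-agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough crawl-error category (assumed grouping)."""
    if status in (401, 403, 407):
        return "access denied"
    if status in (404, 410):
        return "not found"
    if 500 <= status < 600:
        return "server error"
    if 200 <= status < 400:
        return "ok"
    return "other"


def fetch_status(url: str, user_agent: str = GOOGLEBOT_UA) -> int:
    """Return the HTTP status code the server sends for this URL."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the code is still what we want to report.
        return error.code
```

For example, `classify_status(fetch_status(url))` on one of the spam URLs should come back as "access denied" if the owner's block is what Google is running into.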
-
Hi
At first sight, I would guess your site has been hacked. Do these URLs exist when you try them?
Dirk