Steps you can take to ensure your content is indexed and registered to your site before a scraper gets to it?
-
Hi,
A clients site has significant amounts of original content that has blatantly been copied and pasted in various other competitor and article sites.
I'm working with the client to rejig lots of this content and to publish new content.
What steps would you recommend to undertake when the new, updated site is launched to ensure Google clearly attributes the content to the clients site first?
One thing I will be doing is submitting a new xml + html sitemap.
Thankyou
-
There are no "best practices" established for the tags' usage at this point. On the one hand, it could technically be used for every page, and on the other, should only be used when it's an article, blog post, or other individual person's writing.
-
Thanks Alan.
Guess there's no magic trick that will give you 100% attribution.
Regarding this tag, do you recommend I add this to EVERY page of the clients website including the homepage? So even the usual about us/contact etc pages?
Cheers
Hash
-
Google continually tries to find new ways to encourage solutions for helping them understand intent, relevance, ownership and authority. It's why Schema.org finally hit this year. None of their previous attempts have been good enough, and each has served a specific individual purpose.
So with Schema, the theory is there's a new, unified framework that can grow and evolve, without having to come up with individual solutions.
The "original source" concept was supposed to address the scraper issue, and there's been some value in that, though it's far from perfect. A good scraper script can find it, strip it out or replace the contents.
rel="author" is yet one more thing that can be used in the overall mix, though Schema.org takes authorship and publisher identity to a whole new, complex, and so far confused level :-).
Since Schema.org is most likely not going to be widely adopted til at least early next year, Google's encouraging use of the rel="author" tag as the primary method for assigning authorship at this point, and will continue to support it even as Schema rolls out.
So if you're looking at a best practices solution, yes, rel="author" is advisable. Until it's not.
-
Thanks Alan... I am surprised to learn about this "original source" information. There must not have been a lot of talk about it when it was released or I would have seen it.
Google recently started encouraging people to use the rel="author" attribute. I am going to use that on my site... now I am wondering if I should be using "original source" too.
Are you recommending rel="author"?
Also, reading that full post there is a section added at the end recommending rel="canonical"
-
Always have a sitemap.xml file with all the URLs you want indexed included in it. Right after publishing, submit the sitemap.xml file (or files if there are tens of thousands of pages) through Google Webmaster Tools and Bing Webmaster Tools. Include the Meta "original-source" tag in your page headers.
Include a Copyright line at the bottom of each page with the site or company name, and have that link to the home page.
This does not guarantee with 100% certainty that you'll get proper attribution, however these are the best steps you can take in that regard.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can I change a URL on a site that has only a few back links?
I have a site that wants to change their URL, It's a very basic site with hardly any backlinks. http://www.cproofingandexteriors.com/ The only change they want to make is taking out the 'and'.. so it would be cproofingexteriors.com they already own the domain. What should I do?? Thanks
Intermediate & Advanced SEO | | MissThumann0 -
How bad is duplicate content for ecommerce sites?
We have multiple eCommerce sites which not only share products across domains but also across categories within a single domain. Examples: http://www.artisancraftedhome.com/sinks-tubs/kitchen-sinks/two-tone-sinks/medium-rounded-front-farmhouse-sink-two-tone-scroll http://www.coppersinksonline.com/copper-kitchen-and-farmhouse-sinks/two-tone-kitchen-farmhouse-sinks/medium-rounded-front-farmhouse-sink-two-tone-scroll http://www.coppersinksonline.com/copper-sinks-on-sale/medium-rounded-front-farmhouse-sink-two-tone-scroll We have selected canonical links for each domain but I need to know if this practice is having a negative impact on my SEO.
Intermediate & Advanced SEO | | ArtisanCrafted0 -
How can a website have multiple pages of duplicate content - still rank?
Can you have a website with multiple pages of the exact same copy, (being different locations of a franchise business), and still be able to rank for each individual franchise? Is that possible?
Intermediate & Advanced SEO | | OhYeahSteve0 -
How long does internationlisation take to be indexed correctly
Hi guys I have had a UK site that has been indexed in Google for some time. Recently we started targeting Ireland and so I created a folder to do this (domain.com/ireland/) As well as adding an /irland/ folder I created a hreflang sitemap and in Webmaster Tools I specified that .com/ireland/ targets Ireland and .com targets UK. However this was all two weeks ago and Im still not seeing the Irish pages start ranking in Google.ie and was hoping one of you guys would be able to help me out? How long should it take for these pages to start appearing in the relevant country specific search engine? Deepcrawl states that the Hreflang is correct as well so Im just a bit worried that Ive missed something glaringly obvious! Thanks
Intermediate & Advanced SEO | | AndrewAkesson0 -
Advice on Getting this site ranking?
Hi there I'm looking to optimise this site for SEO -> Gets about 3,000 visits per day but all from branded searches. Gets virtually no 'keyword searches' It's just a landing page at the moment. Would you recommend I integrate a blog with it, so we can start targeting more long tail keywords (free football game etc) Any thoughts/advice appreciated 🙂 Thanks Howard
Intermediate & Advanced SEO | | HowardK0 -
Can you be penalized by a development server with duplicate content?
I developed a site for another company late last year and after a few months of seo done by them they were getting good rankings for hundreds of keywords. When penguin hit they seemed to benefit and had many top 3 rankings. Then their rankings dropped one day early May. Site is still indexed and they still rank for their domain. After some digging they found the development server had a copy of the site (not 100% duplicate). We neglected to hide the site from the crawlers, although there were no links built and we hadn't done any optimization like meta descriptions etc. The company was justifiably upset. We contacted Google and let them know the site should not have been indexed, and asked they reconsider any penalties that may have been placed on the original site. We have not heard back from them as yet. I am wondering if this really was the cause of the penalty though. Here are a few more facts: Rankings built during late March / April on an aged domain with a site that went live in December. Between April 14-16 they lost about 250 links, mostly from one domain. They acquired those links about a month before. They went from 0 to 1130 links between Dec and April, then back to around 870 currently According to ahrefs.com they went from 5 ranked keywords in March to 200 in April to 800 in May, now down to 500 and dropping (I believe their data lags by at least a couple of weeks). So the bottom line is this site appeared to have suddenly ranked well for about a month then got hit with a penalty and are not in top 10 pages for most keywords anymore. I would love to hear any opinions on whether a duplicate site that had no links could be the cause of this penalty? I have read there is no such thing as a duplicate content penalty per se. I am of the (amateur) opinion that it may have had more to do with the quick sudden rise in the rankings triggering something. Thanks in advance.
Intermediate & Advanced SEO | | rmsmall0 -
One page wordpress site - what are the steps for SEO
Hello, I am launching 5 sites with keyword exact domains. I am developing the sites on wordpress as one page sales funnel sites. What do I need to do to optimize my sites? Really appreciate any bullet points or directions. Tks
Intermediate & Advanced SEO | | brianmaher0 -
Pages un-indexed in my site
My current website www.energyacuity.com has had most pages indexed for more than a year. However, I tried cache a few of the pages, and it looks the only one that is now indexed by Goggle is the homepage. Any thoughts on why this is happening?
Intermediate & Advanced SEO | | abernatj0