Steps you can take to ensure your content is indexed and registered to your site before a scraper gets to it?
-
Hi,
A client's site has a significant amount of original content that has been blatantly copied and pasted onto various competitor and article sites.
I'm working with the client to rejig lots of this content and to publish new content.
What steps would you recommend taking when the new, updated site is launched to ensure Google clearly attributes the content to the client's site first?
One thing I will be doing is submitting a new XML + HTML sitemap.
Thank you
-
There are no "best practices" established for the tag's usage at this point. On the one hand, it could technically be used on every page; on the other, it arguably should only be used when the page is an article, blog post, or other individual person's writing.
-
Thanks Alan.
Guess there's no magic trick that will give you 100% attribution.
Regarding this tag, do you recommend I add this to EVERY page of the client's website, including the homepage? So even the usual about us/contact etc. pages?
Cheers
Hash
-
Google continually tries to find new ways to encourage solutions for helping them understand intent, relevance, ownership and authority. It's why Schema.org finally hit this year. None of their previous attempts have been good enough, and each has served a specific individual purpose.
So with Schema, the theory is there's a new, unified framework that can grow and evolve, without having to come up with individual solutions.
The "original source" concept was supposed to address the scraper issue, and there's been some value in that, though it's far from perfect. A good scraper script can find it, strip it out or replace the contents.
rel="author" is yet one more thing that can be used in the overall mix, though Schema.org takes authorship and publisher identity to a whole new, complex, and so far confused level :-).
Since Schema.org is most likely not going to be widely adopted until at least early next year, Google's encouraging use of the rel="author" tag as the primary method for assigning authorship at this point, and will continue to support it even as Schema rolls out.
So if you're looking at a best practices solution, yes, rel="author" is advisable. Until it's not.
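For anyone wanting to see what the two tags discussed above might look like in a page header, here is a minimal sketch in Python that renders them for a given page. The function name `attribution_tags`, the example.com URLs, and the choice of a `<link>` element for rel="author" (rather than an anchor in the byline) are all assumptions for illustration, not something prescribed in this thread.

```python
from html import escape


def attribution_tags(page_url, author_url):
    """Render head tags for a page (hypothetical helper, illustration only).

    page_url   -- the URL where the content was first published
    author_url -- the author's profile or bio page (assumed to exist)
    """
    # Escape the URLs so characters like & and " cannot break the attributes.
    url = escape(page_url, quote=True)
    author = escape(author_url, quote=True)
    return "\n".join([
        f'<meta name="original-source" content="{url}">',
        f'<link rel="author" href="{author}">',
    ])


# Placeholder URLs, not taken from the thread.
tags = attribution_tags(
    "https://www.example.com/articles/new-post?ref=a&lang=en",
    "https://www.example.com/about/author",
)
print(tags)
```

As noted above, a scraper script can strip or rewrite these tags, so this is one signal in the mix rather than a guarantee.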
-
Thanks Alan... I am surprised to learn about this "original source" information. There must not have been a lot of talk about it when it was released or I would have seen it.
Google recently started encouraging people to use the rel="author" attribute. I am going to use that on my site... now I am wondering if I should be using "original source" too.
Are you recommending rel="author"?
Also, reading that full post, there is a section added at the end recommending rel="canonical".
-
Always have a sitemap.xml file with all the URLs you want indexed included in it. Right after publishing, submit the sitemap.xml file (or files if there are tens of thousands of pages) through Google Webmaster Tools and Bing Webmaster Tools. Include the Meta "original-source" tag in your page headers.
Include a Copyright line at the bottom of each page with the site or company name, and have that link to the home page.
This does not guarantee with 100% certainty that you'll get proper attribution; however, these are the best steps you can take in that regard.
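As a rough illustration of the sitemap step, the following Python sketch builds a sitemap.xml body from a list of URLs and splits a large list into several files. The helper names `build_sitemap` and `split_urls` and the example.com URLs are hypothetical; the 50,000-URL-per-file ceiling comes from the sitemaps.org protocol. Submitting the resulting file still happens through the webmaster tools mentioned above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit in the sitemaps.org protocol


def build_sitemap(urls):
    """Return sitemap XML as a string, one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the text for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def split_urls(urls, limit=MAX_URLS_PER_FILE):
    """Split a URL list into chunks small enough for one sitemap file each."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


# Placeholder URLs for the pages you want indexed.
pages = [
    "https://www.example.com/",
    "https://www.example.com/articles/new-post",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Regenerating the file as part of the publishing step keeps new URLs in the sitemap from the moment they go live, which is the point of submitting it right after publishing.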