It’s a sad truth that the Internet is something of a content wild west. Copyright law with regard to digital content is in flux, with very little legal precedent, and general knowledge of it is thin at best. Every day, webmasters run Google image searches and use unedited images they have no rights to, violating copyright and stealing content. It’s that simple. And that’s before mentioning written content, which unscrupulous webmasters often scrape and steal for thin affiliate sites and black hat sites. Content theft is a major problem.
Unfortunately, there’s not a lot you can do to stop it outright. However, you can take some preventative steps, and some steps to fix the problem when your content is stolen.
Sometimes, you can thwart basic thefts by displaying a copyright notice. This is why many sites have copyright notices in their footer. Legally, when you create content and publish it, you own it.
There are some exceptions and some variations to the rule. For example, when you hire a freelancer to write content for you, the ownership of that content depends on the contract between you and the freelancer. Make sure your contract gives you the rights to the content on purchase; otherwise, the freelancer can legally post the content elsewhere.
The basic copyright notification is simple: just add “Copyright <year> <Name>” to the footer of your page. For non-US countries, you should also add “All Rights Reserved.”
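If your footer is generated by a script or template, the notice can be built so the year never goes stale. The following is only a minimal sketch; the function name `copyright_notice` and its parameters are made up for illustration, and the exact wording and punctuation of the notice are up to you.

```python
from datetime import date
from typing import Optional


def copyright_notice(owner: str,
                     year: Optional[int] = None,
                     all_rights_reserved: bool = True) -> str:
    """Build a footer line in the form 'Copyright <year> <Name>'."""
    if year is None:
        # Default to the current year so the footer stays up to date.
        year = date.today().year
    notice = f"Copyright {year} {owner}"
    if all_rights_reserved:
        # Optional suffix, as suggested for non-US countries.
        notice += ". All Rights Reserved."
    return notice


if __name__ == "__main__":
    print(copyright_notice("Example Blog"))
```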
There are a lot of preventative measures you can take, but unfortunately many of them can harm the user experience, or hurt your SEO if implemented improperly.
Copyscape is one of the biggest and easiest tools for checking whether any of your content has been copied. Simply run your content through its scan and it will tell you whether it shows up, in part or in whole, on another site. Your site will pop up in the list if you’ve published the content already, but so will any site that has copied it. There are also Copyscape alternatives, like PlagSpotter and CopyGator.
For a manual solution, you can just run Google searches for snippets of your content. Choose something deeper than the intro paragraph, in case the copier changed the title and intro.
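If you check a lot of posts, the manual search can be scripted up to the point of opening the results page. The sketch below picks a run of words from deeper in the article and wraps it in quotes for an exact-phrase search. The helper names, the 150-word skip, and the 12-word snippet length are all assumptions for illustration, not recommended values.

```python
from urllib.parse import quote_plus


def pick_snippet(text: str, skip_words: int = 150, length: int = 12) -> str:
    """Return `length` consecutive words, starting past the intro if possible."""
    words = text.split()
    # For short articles, start as late as we can while still
    # returning a full-length snippet (or the whole text if it is tiny).
    start = min(skip_words, max(len(words) - length, 0))
    return " ".join(words[start:start + length])


def exact_phrase_search_url(snippet: str) -> str:
    """Quote the snippet so the search matches the exact phrase."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{snippet}"')


if __name__ == "__main__":
    article = "Your full article text goes here, well past the intro paragraph."
    print(exact_phrase_search_url(pick_snippet(article)))
```

Open the printed URL in a browser and review the results by hand; the script only builds the query.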
Before you decide to pursue removal of the content, make sure it’s actually stolen. Don’t set up an automatic bot to send removal notices, or else you’ll end up sending them to yourself when a URL changes, or to your RSS provider when you syndicate it, among other situations.
There are two situations where content may appear to be stolen when it isn’t. The first is when your content is syndicated on other sites through a partnership deal. This is less common than it used to be, but still happens occasionally. The other is when another blog quotes your post. Tools like Copyscape aren’t smart enough to identify a quote as fair use; they just tell you when the content appears elsewhere. A quote is not a removable offense, and you’ll look like a jerk trying to get it removed.
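One rough way to tell a short quote from a wholesale copy is to measure how much of your post actually appears on the other page. The sketch below compares overlapping runs of words (often called shingles) and reports the fraction of your runs found in the suspect text. This is an illustrative heuristic, not how Copyscape works and not a legal test for fair use; the function names and the five-word run size are assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping `size`-word runs in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's word runs that also appear in the suspect text."""
    own = shingles(original, size)
    if not own:
        # Original is shorter than one run; nothing to compare.
        return 0.0
    return len(own & shingles(suspect, size)) / len(own)


if __name__ == "__main__":
    mine = "one two three four five six seven eight nine ten"
    theirs = "they said three four five yesterday"
    print(copied_fraction(mine, theirs, size=3))
```

A value near 1.0 suggests the whole post was lifted, while a small value is more consistent with a quote. Either way, read the page yourself before sending anything.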
Note: before you begin fighting stolen content, make sure to gather the appropriate proof. Note the links and timestamps of your content and the stolen content, and take screenshots as proof. Make sure the URL of the content is visible in the screenshot as well.
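If you deal with more than a handful of cases, it helps to keep the proof in one consistent record. Here is a minimal sketch of such a record; the class name `TheftEvidence` and its fields are hypothetical, chosen to mirror the items listed above (both links, both timestamps, and the screenshot).

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class TheftEvidence:
    original_url: str        # link to your own post
    original_published: str  # timestamp shown on your post
    copy_url: str            # link to the copied page
    copy_published: str      # timestamp shown on the copy
    screenshot_path: str     # screenshot with the URL visible
    recorded_at: str = ""    # filled in automatically if left empty

    def __post_init__(self) -> None:
        if not self.recorded_at:
            # Record when the evidence was gathered, in UTC.
            self.recorded_at = datetime.now(timezone.utc).isoformat()


def to_json(evidence: TheftEvidence) -> str:
    """Serialize one evidence record so it can be saved or attached."""
    return json.dumps(asdict(evidence), indent=2)


if __name__ == "__main__":
    record = TheftEvidence(
        original_url="https://example.com/my-post",
        original_published="2015-01-01",
        copy_url="https://copy.example/stolen-post",
        copy_published="2015-01-05",
        screenshot_path="screenshots/stolen-post.png",
    )
    print(to_json(record))
```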
The first step after identifying stolen content is to locate contact information for the site. If they have a contact form, use it, and inform them that their content is stolen. Depending on how legitimate the site looks, you can tailor your message. For legitimate sites, I recommend a simple notice that the content is stolen and a request that it be removed. If the site looks like a thin affiliate or spam site, add in threats of complaints or legal action.
In some cases, the content was stolen by a writer looking to sell content they didn’t write. In these cases, the site owner will likely take down the content without need of a threat.
If the site has no contact information, or if they ignored your message, you can take it one step further. Use Whois to look up the legally-required contact information for the owner of the site, and use that information to send the same request to remove the content. It might work, it might not.
If they still ignore your request, or if the Whois information is incorrect, kick it up a notch. Go to Who Is Hosting This to look up the contact information for the web host. Most web hosts don’t want their IPs flagged as spam IPs, and most web hosts have rules in their terms of service that ban stolen content. Contact the web host, inform them that one of their sites is hosting stolen content – with specific links – and request its removal. Most hosts will remove the offending pages, while others will terminate the accounts of the offender entirely.
If this doesn’t work, you can also file a Digital Millennium Copyright Act complaint against the site. The DMCA notice can be filed through Google Webmaster Tools. This will inform Google of the theft, and that content will be removed from the search rankings. This might not get the copied content removed, but it will hurt the site that did the copying and will keep that copied content from hurting your site’s SEO.