As upstanding humans, we like to think the best of each other. We give each other the benefit of the doubt. We trust each other until that trust is betrayed. We’re innocent until proven guilty. Unfortunately, it’s a fact of life that betrayal and guilt do happen. In this particular instance, I’m talking about the issues that come up with digital copyright, content ownership, and plagiarism.
Plagiarism happens, and it happens a lot, but most of the time it doesn’t matter. It might be unintentional and easily dealt with, or it might be intentional but ineffectual.
I’ll go over different forms plagiarism takes online, what you need to know to understand it, and how you can go about solving it.
All online plagiarism can be categorized as either unintentional or intentional. Intentional plagiarism is pretty simple. People know what they’re doing, and what they’re doing is stealing your content with the intent of passing it off as their own. It’s theft, it’s illegal, and you have solid legal recourse should you decide to pursue it.
Unintentional plagiarism comes from people who are too new or uneducated about Internet copyright to know what they’re doing. It’s really tempting for a business to see a blog post they like and just take it. Sometimes they even attribute it to the source, but that’s still illegal copying. The same thing happens all the time to images and art online, and it’s even worse for artists than it is for writers.
There’s also the case where you as a blog owner might be responsible for unintentional plagiarism. What happens if you buy an article from a writer, only to find out months later that the piece was copied wholesale from someone else? So you can see how much of a problem it can cause.
There’s a lot of misinformation and mythology surrounding digital copyrights, and part of the problem is that there have rarely been rulings made about what is and isn’t a violation. Copyright law is a labyrinth, and many of the biggest companies try to keep things settled out of court just in case some judge with common sense rules against some abuse they’ve used.
First of all, if you’re worried about your content being stolen and the copied content causing you to lose your Google ranking position, don’t be.
Copied content penalties do exist from Google’s Panda algorithm, but they aren’t going to destroy you due to the actions of someone else. If that were the case, black hat scammers could just sell their services as scrapers. They could copy your content, put it on 10+ other sites, and tank your ranking. This doesn’t happen, so obviously there must be some protection in place.
The protection, in this case, is a mixture of factors. The primary factor is indexation date. Even if a blog that steals your content posts it with a backdated timestamp to make theirs look first, Google knows which version appeared on the Internet first and attributes primary creation to you.
Another factor is trust. Most of the time, the sites stealing content are very thin spammy sites. They break rules left and right, so Google doesn’t trust them. Your site follows the rules, has earned trust with the search engines, and is thus given the benefit of the doubt when there is any confusion.
The point is, 99% of the time, someone stealing your content isn’t going to hurt your search ranking. It might hurt you in the sense that your content – and possibly your name – is showing up on spam sites, but ideally those spam sites won’t show up in search rankings and very few people will ever see them.
Blog post scraping is really easy to do, very difficult to stop, and pretty much nothing to worry about. As I mentioned, in almost every case it isn’t going to hurt you. In the rare case it does hurt you, you have a lot of legal, ethical, and mechanical options you can take to fix the problem.
Another source of written plagiarism comes from other forms of site writing. Someone could steal your product descriptions, for example. Or, maybe, you took the product descriptions straight from the original manufacturer. This can hurt you, actually, as it is occasionally a Panda target. However, it’s easy to get new product descriptions if you’re at fault, and if you’re the original source, it’s not going to hurt you.
Now and then, someone will like your business model and decide to try to scam people by copying you wholesale. I’ve seen entire copies of sites ripped and posted with nothing but the name changed, including all product descriptions and blog posts. This, thankfully, is very easy to deal with.
Stealing images is easy, and many people seem to think anything they find in Google’s image search is free to use. This means you can often find the unique images you created for your posts used in other locations without permission. This is a copyright violation, and it’s one you can take action to solve.
Plagiarism happens, and it happens all the time. Don’t be surprised if you find your content posted on some other site. Take rational action and don’t knee-jerk your way into a larger problem.
That, of course, primarily applies if you just stumbled upon stolen copies of your content. If you would prefer to have a pristine Internet record, you can try to go out of your way to find copied content and have it removed. Here are some tools you can use to locate stolen content.
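If you already have a suspect page in hand and want a rough measure of how much of your post it contains, a small script can help. What follows is a minimal sketch of my own, not one of the tools themselves: it assumes you have already pulled the plain text out of both pages, and the function names and the five-word window are arbitrary choices for illustration.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word runs ("shingles") found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_ratio(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's shingles that also appear in the suspect text."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)


if __name__ == "__main__":
    mine = "Plagiarism happens and it happens a lot but most of the time it does not matter"
    theirs = "Welcome to our blog. " + mine + " Thanks for reading."
    print(f"{overlap_ratio(mine, theirs):.0%} of the original appears in the copy")
```

A ratio near 1.0 suggests a wholesale copy, while a low ratio is more likely a quote or a coincidence. Treat the number as a hint for a manual look, not as proof of theft.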
Once you have found locations where your content has been posted without your permission, you have to identify the thief. Or, rather, the owner of the site. It’s possible that the actual owner bought content from a writer who was a thief; you’re never going to find the actual thief in that scenario.
The first thing to check is whether or not the site has public contact information. If they have an email you can use to contact them, or a social profile you can use to message them, that’s the most direct and immediate way to resolve the situation. It also makes them less likely to be a spammer and more likely to be either the victim of a thief or an unintentional plagiarist.
If they have no contact information, or the information they post is clearly stolen or non-functional, you can go to the next step, which is looking up the information they used to register the site. The easiest way to do this is to go to Whois and look up the contact information for their domain. Sometimes this will be valid, though other times it will be an agency, a dummy corporation, or just hidden information.
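If you would rather script the lookup than use a web form, the WHOIS protocol itself is simple: open a TCP connection to port 43, send the domain name, and read the reply. The sketch below is a rough illustration under a few assumptions. The default server shown is the registry server for .com and .net domains, other extensions use different servers, and the sample record in the usage section is made up. Real records vary a lot in layout and are often hidden behind privacy services, so expect to read the raw text yourself.

```python
import re
import socket

EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def query_whois(domain: str, server: str = "whois.verisign-grs.com") -> str:
    """Send a plain WHOIS query (TCP port 43) and return the raw text reply.

    The default server is an assumption that only suits .com/.net domains.
    """
    with socket.create_connection((server, 43), timeout=10) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_emails(whois_text: str) -> list:
    """Return the unique email addresses found in a WHOIS reply, sorted."""
    return sorted(set(EMAIL_PATTERN.findall(whois_text)))


if __name__ == "__main__":
    # Made-up sample record, so the example runs without a network connection.
    sample = (
        "Domain Name: EXAMPLE.COM\n"
        "Registrar Abuse Contact Email: abuse@registrar.example\n"
        "Registrant Email: owner@example.com\n"
    )
    print(extract_emails(sample))
```

An abuse contact at the registrar is not the site owner, but it is still a useful address to keep if the owner never answers.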
If all of that fails, you can go directly to their web host. You can enter the domain name into Who Is Hosting This, and you will be given information about the web host. For reference, this tool is great for learning about hosting providers, as well as the hosting for popular sites, to see how you stack up. It has other uses than just as a tool for solving plagiarism.
This is one of three methods you can use to solve your plagiarism issues. It’s generally going to be your first method of choice, if possible, and it’s the ideal method for dealing with unintentional plagiarists. If you have contact information for the site owner – rather than the web host – you should contact them.
In the message you send them, tell them you found your content posted on their site without permission. Provide them with links to the content on both their site and on your site, to prove that the content is copied. Explain digital copyright – that you own the content and that their use of it without permission is a copyright violation – and request that they take the content down.
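If you expect to send more than one of these, a small template keeps the message consistent. This is only a sketch of the points listed above: the wording, the function name, and its parameters are my own placeholders, and it is not legal language.

```python
def build_takedown_request(owner_name: str, original_url: str,
                           copied_url: str, sender_name: str) -> str:
    """Build a polite removal request covering the points described above."""
    return (
        f"Hello {owner_name},\n\n"
        "I found content that I created posted on your site without my permission.\n\n"
        f"My original: {original_url}\n"
        f"The copy on your site: {copied_url}\n\n"
        "I own the copyright to this content, and publishing it without "
        "permission is a copyright violation. Please remove it from your site.\n\n"
        f"Thank you,\n{sender_name}\n"
    )


if __name__ == "__main__":
    # All names and URLs here are hypothetical.
    print(build_takedown_request(
        owner_name="site owner",
        original_url="https://example.com/my-post",
        copied_url="https://example.org/copied-post",
        sender_name="Jane Doe",
    ))
```

Keeping both links in every message matters, because the recipient needs to see the original and the copy side by side to act on it.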
In some cases, like with specific images, you might want to ask them to add attribution instead. This is entirely your choice. Some photographers release their work under Creative Commons licenses that require attribution, and are fine with others using the images as long as the credit is there. Others would prefer to have the content removed. For text, attribution isn’t really viable; just ask to have it removed.
Before you send such a letter, one thing you should do is make sure you actually own the content. If you bought it from a writer or from a marketplace like Constant Content, there’s a chance that the contract you used doesn’t give you exclusive rights. On Constant Content, if you buy something for the cheaper price, all you’re getting are Use Rights. Anyone else can come along and buy the same post, and their use of it is just as legal as yours. If you bought your content, make sure you own the full rights to use it exclusively. You don’t want to embarrass yourself by sending a takedown letter only to find the other party also has legal rights to use the content.
Ideally, the other party will admit their guilt and will comply, removing the content. If that happens, you’re done. You don’t need to proceed with the interaction.
Sometimes they will respond but try to argue with you. This is a case where you might want to find copyright law references online and use the evidence to explain why their theft is theft and not fair use. If you choose to argue, make sure you back up your statements. Alternatively, just proceed to the next method.
If the owner of the site doesn’t respond, if they argue and you don’t want to deal with it, if their contact information is false, or if they have no posted contact information, you can undercut the owner and go directly to the web host.
Web hosts almost always have terms in their terms of service that disallow stealing content and copyright violations. This protects them primarily from being attacked by the RIAA or MPAA, big organizations that will demolish a company if that company is caught sharing copyrighted files. It’s a boon to you that the same laws the MPAA uses are the laws you can use to get a stolen blog post removed.
Essentially, all you have to do is send a message to the support contact for the web host of the spammer. Inform them that site X has stolen your content, and provide links the same way you would provide them in the previous route. The vast majority of the time, the web host will deal with the problem, either by contacting the site owner and telling them to remove the offending content, or just by removing the content altogether.
If all of that fails, you have one more option: legal action. The first step towards taking legal action is sending a legal notice of copyright violation and requesting removal of the violating content. The way you do this is through a DMCA request. You can do this through Google’s Webmaster Tools. This will pretty much always result in the content being removed, if not from the web entirely, then at least from Google’s search index. If it doesn’t show up in search, it doesn’t really matter if it’s stolen, does it? Well, it does, but the chances that it has a negative effect on your business are nil.
Only in the rarest of instances will you find your content stolen and used on a site that ranks higher than you and has more trust than you. In these cases, you can follow up the DMCA request with a lawsuit. Suing people costs money, though, and it’s a long and difficult process. I’m also not a lawyer, so I’m not going to give you any more advice beyond “go talk to a real copyright lawyer.”
There are a few methods you can use to help prevent your content from being stolen. Google Authorship used to be one, but they killed that program.
First, add a clear copyright notice to the bottom of your site. This will stop many of the unintentional plagiarists out there, those who mistakenly believe that if there’s no notice, the content is public domain.
Second, consider using a script like Tynt. This script automatically adds a backlink to the clipboard whenever a user copies content from your site. It’s not a guaranteed method of prevention, but it does mean that whenever someone steals your content, they’re adding a link to your site. You can even customize this link with UTM parameters so the visits it generates are flagged in a specific section of your analytics, letting you see all copied links in one place. Don’t assume they’re all theft, though; a legitimate user pasting a sentence from your content as a tweet would also qualify here.
Third, there are scripts you can use to disable right-clicking entirely, or disable it on pictures, if you’re worried about those being stolen. Unfortunately, I don’t recommend them. They hurt the user experience, because right-clicking is essential to many people and their use of web browsers. They also don’t really stop theft, just hinder it; copying from your source code, or just disabling scripts via NoScript, gets around such scripts easily.
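To make the UTM idea concrete, here is a minimal sketch of how a tagged backlink might be built. It is not Tynt’s code, and the parameter values (copy-paste, clipboard, content-attribution) are labels I made up; pick whatever naming makes sense in your own analytics.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str = "copy-paste", medium: str = "clipboard",
            campaign: str = "content-attribution") -> str:
    """Return url with UTM parameters appended, replacing any existing utm_* values."""
    parts = urlsplit(url)
    # Keep the page's own query parameters, but drop old UTM tags.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if not key.startswith("utm_")]
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(add_utm("https://example.com/post?id=7"))
```

Filtering on those values in your analytics then shows every visit that arrived through a pasted link, whether the paste was a theft or a friendly quote.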
Fourth, just grow. The bigger you are, the better you are, the more trust you have with Google. Sure, you’ll become a bigger target, but it will also matter less that someone is stealing your content. If it doesn’t matter, then it’s essentially ignorable.