When the original Panda update hit, sites everywhere bemoaned the events that led to their penalties. Among the many faults that Google targeted, copied content was perhaps the most important. Copied content, even syndicated content or product entries with too much similarity, earned sites a hefty penalty. Some could remove a page and restore their position. Others required near-complete rewrites of their existing content. Some simply folded up and disappeared. It was the end of an era and the start of something new.
Copied content could show up anywhere, and sometimes you have little or no control over it.
With so many ways duplicate content could appear on your site, it’s easy to see how so many sites were penalized. The sites that recovered needed a tool to prevent future instances of duplicate content, and the tool most of them settled on is Copyscape.
Copyscape is the current industry standard plagiarism checker. It’s the most widely used out of many such tools, though nothing stops you from using more than one to be extra certain. In essence, Copyscape scans the Internet for instances of a piece of content. If the content you search for is present elsewhere, in whole or in part, Copyscape will let you know.
With Copyscape, you plug in a piece of content, small or large. It might be a 200-word product description. It might be a 5,000-page website. Whatever the content is, you submit it and Copyscape searches for copies of it anywhere online.
This provides a few benefits. First, it lets you know if any of your content has been copied or scraped elsewhere online. Second, it lets you know if a piece of content you purchased was copied from somewhere else. Third, you can compare two pieces of content to check for similarity, if you’re interested in seeing how different a spun article has to be before it is considered unique.
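To make the idea of comparing two pieces of content concrete, here is a minimal sketch of one common similarity measure: overlapping three-word phrases ("shingles") scored with Jaccard similarity. This is an illustration only, not Copyscape’s actual algorithm, which isn’t described here; the function names and the three-word window are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short for a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0  # nothing to compare
    return len(sa & sb) / len(sa | sb)


original = "the quick brown fox jumps"
spun = "the quick brown fox sleeps"
print(similarity(original, spun))  # shares 2 of 4 distinct shingles
```

A score near 1.0 means the two texts share almost all of their phrasing; a lightly spun article tends to stay high because swapping a few words leaves most shingles intact.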
Copyscape comes in two primary flavors: free and premium. The free version allows a few basic searches and comparisons, with a limited array of search results and a cap on the number of searches you can perform each month. It works for small businesses checking the occasional blog post, but if your site is larger or you have higher volumes of content, you’re going to want to find the budget for premium.
The premium Copyscape service offers a number of additional features.
If you publish content frequently, particularly if you purchase that content from a content mill or marketplace, you’ll benefit from using the API to search for matches.
This is the biggest part of using Copyscape. If you’re blindly rejecting any content with a match, you’re probably missing out on quality content. When you view the search results of a given page, consider what they mean.
Copyscape is a brilliant and valuable way to catch content scrapers and content thieves. It’s also an important tool for analyzing content you buy to make sure it’s unique.
When it comes time to actually use Copyscape, you have a few options. First, you can manually submit and scan any piece of content you’re about to purchase. If that content is copied, you can have it revised or reject it, depending on the processes of the source. You can also use the Premium features to scan your entire site for existing copies on the web. If you find such copies, you can determine how to tackle the issue of duplicate content, which itself can fill an entire blog post.
For larger sites or for a more automatic process, you can integrate the Copyscape API into your publication, submission or purchase process. The API is very robust and can be configured in a number of ways to suit your needs.
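As a rough illustration of what that integration might look like, the sketch below puts a plagiarism check in front of a publish step. It deliberately hides the real API call behind a `search` function you would write yourself from Copyscape’s API documentation. Every name here (`Match`, `review_submission`, `fake_search`, the 20-word threshold) is hypothetical and chosen for the example, not taken from Copyscape.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Match:
    """One search hit; both field names are hypothetical for this sketch."""
    url: str
    words_matched: int


# Hypothetical adapter type: takes the submitted text, returns the hits.
SearchFn = Callable[[str], Iterable[Match]]


def review_submission(
    text: str, search: SearchFn, max_words_matched: int = 20
) -> tuple[str, list[Match]]:
    """Return ("publish", []) or ("review", matches worth a human look)."""
    # Small overlaps (a quote, a product name) pass; large ones get flagged.
    flagged = [m for m in search(text) if m.words_matched > max_words_matched]
    flagged.sort(key=lambda m: m.words_matched, reverse=True)
    return ("review", flagged) if flagged else ("publish", [])


def fake_search(text: str) -> list[Match]:
    """Stand-in for the real API call so the sketch runs offline."""
    return [
        Match("https://example.com/quote", 8),
        Match("https://example.org/scraped-copy", 140),
    ]


decision, flagged = review_submission("draft article text", fake_search)
print(decision, [m.url for m in flagged])
```

Note that the sketch routes flagged content to review rather than rejecting it outright, since a match on its own doesn’t tell you whether the content was copied.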
You can also use Copysentry, Copyscape’s active protection service. For a monthly fee, Copysentry will monitor up to 500 pages on an ongoing basis; the base fee covers ten pages, with a per-page fee for each page beyond those ten. Scans take place either once per week or once per day, depending on the level of service you’re buying.
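If you want to budget for it, the pricing structure works out to simple arithmetic: a base fee that covers the first ten pages, plus a per-page fee for the rest, up to the 500-page cap. The sketch below assumes that structure; the fee amounts in the example are placeholders, not Copysentry’s real prices, so plug in the current figures from their pricing page.

```python
def monthly_cost(
    pages: int,
    base_fee: float,
    per_page_fee: float,
    included_pages: int = 10,
    max_pages: int = 500,
) -> float:
    """Estimate a monthly bill: base fee plus a fee per page beyond the included ones."""
    if not 1 <= pages <= max_pages:
        raise ValueError(f"pages must be between 1 and {max_pages}")
    extra_pages = max(0, pages - included_pages)
    return base_fee + extra_pages * per_page_fee


# Placeholder fees for illustration only.
print(monthly_cost(pages=50, base_fee=5.0, per_page_fee=0.25))
```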
The end result is, primarily, awareness. Copyscape is not a content removal tool. If you find your content has been scraped, or that the content you’re interested in buying has been previously published, it’s up to you to determine what to do. Still, it’s important to learn how to keep duplicate content off your site if you want to avoid future Google penalties.