How to Protect Your Website Content From Scrapers
There is little more galling than someone else taking credit for your work. Putting together a well-constructed, informative, entertaining website is an art in itself, so when time and care have been spent on it, it can be seriously deflating to see it replicated and used by someone else, usually with only the merest attempt at rebranding. Sadly, site scraping like this is not a rarity, but a very real problem that many site owners and businesses have to deal with.
Acquiring and developing quality content for your site is hard enough, even before you count the costs involved. Seeing someone else appropriate that content at no cost adds insult to injury.
If you want to offer quality content to visitors for free, but don’t want unscrupulous vendors stealing it for their own ends, read on and take a look at the advice below.
The first option is to do nothing. This may seem like an odd approach, but it can be a great option for some people. It means trusting in your site rather than spending time and concerted effort trying to stop it from being copied.
With the ‘do nothing’ approach, you weigh the effort of trying to stop scraping against the real work you have already put into your site. This may again seem strange, but the basic idea behind it is simple: websites can be copied; the hard work behind them can’t.
This approach rests on being confident in your own content and its layout, and on knowing that a scraped site can never truly compete with an original one. The hard work and endeavor you have put in cannot be replicated. If your site is well-optimized, well-planned and full of original (dated) content, it will always win out over a shallow clone. Even if the clone is reasonably well presented, it will always carry the burden of copied content, content that you can prove is yours if need be.
Under this approach, it is better to dedicate yourself to growing your site, creating and procuring great content, and leaving those who would copy to their own devices. And remember – if someone is copying you, it usually means that you are worth imitating.
Link Back to Your Own Site
This approach is clever, as it takes into account the fact that scraping is inevitable. Despite all of the counter-measures you may take, some scraping is still bound to occur.
By peppering your content with anchor text that points to more content on your own site, you do as much as you can to exploit those who might copy your work. If the scraping is done wholesale, without any changes whatsoever, the cloned site will still be hosting links that send click-through traffic to your site. You end up generating traffic off the backs of those who would steal from you, which gives you a little revenge as well as a real benefit from your hard work.
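One practical detail is that internal links only survive scraping if they are absolute. A relative link such as /guides/seo resolves against whatever domain serves the page, so on a cloned site it would point at the clone. The sketch below shows one way to rewrite relative links before publishing. The function name absolutize_links and the address https://www.example.com/ are placeholders invented for this illustration, and the regular expression is a simplification that assumes quoted href attributes; a real site might do this in its CMS or templates instead.

```python
import re
from urllib.parse import urljoin

# Hypothetical site address; replace with your own domain.
SITE_URL = "https://www.example.com/"

# Matches href="..." or href='...' (quoted attribute values only).
HREF_PATTERN = re.compile(r'''(href\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_links(html, base_url=SITE_URL):
    """Rewrite relative href values as absolute URLs based on base_url."""

    def replace(match):
        prefix, quote, target = match.groups()
        # Leave in-page anchors and non-HTTP schemes untouched.
        if target.startswith(("#", "mailto:", "tel:", "javascript:")):
            return match.group(0)
        # urljoin leaves already-absolute URLs unchanged.
        return f"{prefix}{quote}{urljoin(base_url, target)}{quote}"

    return HREF_PATTERN.sub(replace, html)


if __name__ == "__main__":
    article = '<p>See our <a href="/guides/seo">SEO guide</a> for more.</p>'
    print(absolutize_links(article))
```

Run on the sample paragraph, the link target becomes https://www.example.com/guides/seo, so a wholesale copy of the page would still send readers to the original site.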
If you would rather take formal action, the law offers a route. The Digital Millennium Copyright Act (DMCA), passed in 1998, is designed in part to combat copyright infringement that occurs on the internet. You can file a DMCA complaint with Google, which will guide you through the process.
You will need to collate information about the material you believe has been copied, the site or sites using it, and so on. It is best to have as much information and proof as possible at hand before you make the DMCA complaint, so that you are fully prepared. As this is a legal process, consider everything carefully before you proceed, as making a false claim could land you in hot water. Conversely, if you ever find yourself in the strange position of being issued a DMCA notice for content that was stolen from you in the first place, there is recourse: dmlp.org/legal-guide/responding-dmca-takedown-notice-targeting-your-content
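For the evidence-gathering step described above, a small script can help keep your records consistent. The following is only a sketch under stated assumptions: it expects that you have already saved the text of your original page and of the suspected copy, and the function name build_evidence_record, the field names and the example URLs are all invented for illustration. The similarity score is a rough indicator for your own notes, not legal proof, and nothing here is part of any official DMCA form.

```python
import hashlib
from datetime import datetime, timezone
from difflib import SequenceMatcher


def build_evidence_record(original_url, original_text, copy_url, copy_text):
    """Summarize how closely a suspected copy matches the original text."""
    # Ratio is 1.0 for identical texts and falls toward 0.0 as they diverge.
    similarity = SequenceMatcher(None, original_text, copy_text).ratio()
    return {
        "original_url": original_url,
        "suspected_copy_url": copy_url,
        # When the comparison was made, in UTC.
        "checked_at_utc": datetime.now(timezone.utc).isoformat(),
        # Fingerprint of the copied text exactly as you saved it.
        "copy_sha256": hashlib.sha256(copy_text.encode("utf-8")).hexdigest(),
        "similarity": round(similarity, 3),
    }


if __name__ == "__main__":
    record = build_evidence_record(
        "https://www.example.com/articles/protect-content",  # hypothetical
        "Websites can be copied; the hard work behind them can't.",
        "https://copycat.example.net/protect-content",  # hypothetical
        "Websites can be copied; the hard work behind them can't.",
    )
    for field, value in record.items():
        print(f"{field}: {value}")
```

Keeping one such record per copied page, alongside dated screenshots, gives you a tidy list of URLs and dates to draw on when you fill in the complaint.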
There are many weapons against site scraping, and the tips above are only a few of them. More in-depth solutions can be attempted, with varying degrees of success. As it is unlikely that scraping will ever be fully eliminated, it falls on you to decide how to proceed. Bear in mind that nothing will be a substitute for your hard work and great content, so remember the qualities that made your site so desirable to steal from in the first place.