This article introduces a new “Web Checker” feature, which allows you to quickly identify if a website has been copied or has similar content. The tool compares the provided webpage with known URLs and presents the top 5 most similar results and a quick AI report.
Intro
Introducing the new ‘Webpage Similarity Scanner’. This feature allows you to determine if a website has been copied.
You can scan the majority of websites and identify similar sites or copies in an instant. Rick swiftly compares the provided webpage with all known URLs and presents the top five most relevant results, the time since each was first detected, a short AI report, and the hyperlinks found on the page.
[!new] This tool works excellently for identifying copied or stolen GitHub repositories!
Usage
This feature is available with either of the following commands:
/web https://...
.web https://...
Not all domains are supported. Websites like Twitter/𝕏 are banned from the scanner, as they add no value to the dataset.
[!note] Keep in mind that the tool simply helps you identify similar websites; you should always verify the results. The AI summary and emoji flags can help if old websites are offline.
Scoring
Rick will present you with the 5 websites that have the highest scores according to our algorithm. This does not necessarily mean the website content is copied. As a general guideline, longer pages tend to have higher scores.
[!tip] There is a slight learning curve, as the dataset is constantly evolving, causing the scores to adapt accordingly. The more you utilize this command, the more comfortable you will become interpreting the scores.
Flags
To help you during this decision-making process, Rick may raise several flags:
- = high likelihood of copied text
- = possibly a similar template/website builder
- = AI match, high likelihood of very similar or exact content
If one or more flags are raised, expect a significantly higher score. Please note that the scoring algorithm and flags are experimental and are continuously being refined as the dataset evolves.
FAQ
Is there a simple way to tell if a website is copied?
Yes. (Almost) exact copies will likely show up with two or more flags. Without these flags, it’s unlikely to be a full copy, but certain parts might be copied, contributing to a higher score.
Examples
The returned sites all have a similar score: probably not a copy
The top result's score is significantly higher than the rest of the results: very likely to be a copy. Additionally, some flags may be raised.
Sometimes sites are not exactly the same but have small differences. This usually comes with slightly higher scores on every result.
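The patterns above can be sketched as a small rule of thumb. This is only an illustration, not Rick's actual scoring logic: the `Result` class, the `interpret` function, and the `gap_ratio` threshold of 1.5 are all hypothetical, since the article does not define how much higher "significantly higher" is. The two-flag rule comes from the FAQ.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One row of a scan result (hypothetical structure)."""
    url: str
    score: float
    flags: int = 0  # number of flags raised for this result


def interpret(results: list[Result], gap_ratio: float = 1.5) -> str:
    """Give a rough reading of a result list.

    gap_ratio is an assumed threshold, not a value used by the tool.
    """
    if not results:
        return "no results"

    ranked = sorted(results, key=lambda r: r.score, reverse=True)
    top, rest = ranked[0], ranked[1:]

    # FAQ: (almost) exact copies tend to show up with two or more flags.
    if top.flags >= 2:
        return "likely copy"

    # Examples: a top score far above the other results suggests a copy.
    if rest:
        mean_rest = sum(r.score for r in rest) / len(rest)
        if top.score > gap_ratio * mean_rest:
            return "likely copy"

    # All results score about the same: probably not a copy.
    return "probably not a copy"
```

For instance, scores of 90, 30 and 28 would be read as a likely copy under this assumed threshold, while 40, 38 and 36 would not. Whatever the reading, the results should still be verified by hand.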
Keywords: copy web, copyscape, web checker