<%!
    from flask import request
    from copyvios.attribution import get_attribution_info
    from copyvios.checker import T_POSSIBLE, T_SUSPECT
    from copyvios.cookies import get_cookies
    from copyvios.misc import cache
%>\
<%
    titleparts = []
    if query.page:
        titleparts.append(query.page.title)
    titleparts.append("Earwig's Copyvio Detector")
    title = " | ".join(titleparts)
    cookies = get_cookies()
%>\
<%include file="/support/header.mako" args="title=title, splash=not result"/>
<%namespace module="copyvios.highlighter" import="highlight_delta"/>\
<%namespace module="copyvios.misc" import="httpsfix, urlstrip"/>\
% if notice:
${notice}
% endif
% if query.submitted:
% if query.error:

% if query.error == "bad action":
Unknown action: ${query.action | h}.
% elif query.error == "no search method":
No copyvio search methods were selected. A check can only be made using the search engine, links present in the page, Turnitin, or some combination of these.
% elif query.error == "bad oldid":
The revision ID ${query.oldid | h} is invalid. It should be an integer.
% elif query.error == "no URL":
Compare mode requires a URL to be entered. Enter one in the text box below, or choose copyvio search mode to look for content similar to the article elsewhere on the web.
% elif query.error == "bad URI":
Unsupported URI scheme: ${query.url | h}.
% elif query.error == "no data":
Couldn't find any text in ${query.url | h}. Note: only HTML documents, plain text pages, and PDFs are supported, and content generated by JavaScript or found inside iframes is ignored.
% elif query.error == "timeout":
The URL ${query.url | h} timed out before any data could be retrieved.
% elif query.error == "search error":
An error occurred while using the search engine (${query.error.__cause__}). Note: there is a daily limit on the number of search queries the tool is allowed to make. You may repeat the check without using the search engine.
% else:
An unknown error occurred.
% endif

% elif not query.site:

The given site (project=${query.project | h}, language=${query.lang | h}) doesn't seem to exist. It may also be closed or private. Confirm its URL.

% elif query.oldid and not result:

The revision ID couldn't be found: ${query.oldid | h}.

% elif query.title and not result:

The page couldn't be found: ${query.page.title | h}.

% endif
% endif

This tool attempts to detect copyright violations in articles. In search mode, it will check for similar content elsewhere on the web using Google, external links present in the text of the page, or Turnitin (via EranBot), depending on which options are selected. In compare mode, the tool will compare the article to a specific webpage without making additional searches, like the Duplication Detector.

Running a full check can take up to a minute if other websites are slow or if the tool is under heavy use. Please be patient. If you get a timeout, wait a moment and refresh the page.

Be aware that other websites can copy from Wikipedia, so check the results carefully, especially for older or well-developed articles. Specific websites can be skipped by adding them to the excluded URL list.

% if query.nocache or (result and result.cached):
% endif
% if result:
Results
% if result.cached:
cached
To save time (and money), this tool will retain the results of checks for up to 72 hours. This includes the URLs of the checked sources, but neither their content nor the content of the article. Future checks on the same page (assuming it remains unchanged) will not involve additional search queries, but a fresh comparison against the source URL will be made. If the page is modified, a new check will be run.
from ${result.cache_age} ago. Originally
% endif
generated in ${round(result.time, 3)}
% if query.action == "search":
seconds using ${result.queries} quer${"y" if result.queries == 1 else "ies"}.
% else:
seconds.
% endif
Permalink.
${query.page.title | h}
% if query.oldid:
@${query.oldid | h}
% endif
% if query.redirected_from:
Redirected from ${query.redirected_from.title | h}. Check original.
% endif
% if result.confidence >= T_SUSPECT:
Violation suspected
% elif result.confidence >= T_POSSIBLE:
Violation possible
% elif result.sources:
Violation unlikely
% else:
No violation
% endif
${round(result.confidence * 100, 1)}%
similarity
% if result.url:
${result.url | urlstrip, h}
% else:
No matches found.
% endif
<% attrib = get_attribution_info(query.site, query.page) %>
% if attrib:
This article contains an attribution template: {{${attrib[0] | h}}}. Please verify that any potential copyvios are not from properly attributed sources.
% endif
% if query.turnitin_result:
Turnitin Results
% if query.turnitin_result.reports:
% for report in query.turnitin_result.reports:
Report ${report.reportid} for text added at ${report.time_posted.strftime("%H:%M, %d %B %Y (UTC)")}:
    % for source in report.sources:
  • ${source['percent']}% of revision text (${source['words']} words) found at ${source['url'] | h}
    % endfor
% endfor
% else:
No matching sources found.
% endif
% endif
% if query.action == "search":
<% skips = False %>
Checked Sources
URL Similarity Compare
% if result.sources:
% for i, source in enumerate(result.sources):
${source.url | h}
    % if source.excluded:
Excluded
    % elif source.skipped:
<% skips = True %>
Skipped
    % else:
${round(source.confidence * 100, 1)}%
    % endif
Compare
% endfor
% else:
% endif
% if len(result.sources) > 10:
% endif
% if skips or result.possible_miss:
% endif
% endif
Article:

${highlight_delta(result.article_chain, result.best.chains[1] if result.best else None)}

Source:

${highlight_delta(result.best.chains[0], result.best.chains[1]) if result.best else ""}

% endif
<%include file="/support/footer.mako"/>