<%!
    from flask import g, request
    from copyvios.checker import T_POSSIBLE, T_SUSPECT
    from copyvios.misc import cache
%>\
<%include file="/support/header.mako" args="title='Earwig\'s Copyvio Detector'"/>
<%namespace module="copyvios.highlighter" import="highlight_delta"/>\
<%namespace module="copyvios.misc" import="httpsfix, urlstrip"/>\
% if notice:
${notice}
% endif
% if query.submitted:
% if query.error:

% if query.error == "bad action":
Unknown action: ${query.action | h}.
% elif query.error == "no search method":
No copyvio search methods were selected. A check can only be made using a search engine, links present in the page, or both.
% elif query.error == "no URL":
URL comparison mode requires a URL to be entered. Enter one in the text box below, or choose copyvio search mode to look for content similar to the article elsewhere on the web.
% elif query.error == "bad URI":
Unsupported URI scheme: ${query.url | h}.
% elif query.error == "no data":
Couldn't find any text in ${query.url | h}. Note: only HTML documents, plain text pages, and PDFs are supported, and content generated by JavaScript or found inside iframes is ignored.
% elif query.error == "timeout":
The URL ${query.url | h} timed out before any data could be retrieved.
% elif query.error == "search error":
An error occurred while using the search engine (${query.exception}). Try reloading the page. If the error persists, repeat the check without using the search engine.
% else:
An unknown error occurred.
% endif

% elif not query.site:

The given site (project=${query.project | h}, language=${query.lang | h}) doesn't seem to exist. It may also be closed or private. Confirm its URL.

% elif query.oldid and not result:

The given revision ID doesn't seem to exist: ${query.oldid | h}.

% elif query.title and not result:

The given page doesn't seem to exist: ${query.page.title | h}.

% endif
% endif

This tool attempts to detect copyright violations in articles. In search mode, it will check for similar content elsewhere on the web using Yahoo! BOSS and/or external links present in the text of the page, depending on which options are selected. In comparison mode, the tool will skip the searching step and display a report comparing the article to the given webpage, like the Duplication Detector.

Running a full check can take up to 45 seconds if other websites are slow. Please be patient. If you get a timeout, wait a moment and refresh the page.

Specific websites can be skipped (for example, if their content is in the public domain) by being added to the excluded URL list.

% if query.nocache or (result and result.cached):
% endif
Site: https:// . .org
Page title:
% if query.title:
% else:
% endif
or revision ID:
% if query.oldid:
% else:
% endif
Action:
% if result:
Results
% if result.cached:
cached
To save time (and money), this tool will retain the results of checks for up to 72 hours. This includes the URLs of the checked sources, but neither their content nor the content of the article. Future checks on the same page (assuming it remains unchanged) will not involve additional search queries, but a fresh comparison against the source URL will be made. If the page is modified, a new check will be run.
from ${result.cache_age} ago. Originally
% endif
generated in ${round(result.time, 3)}
% if query.action == "search":
seconds using ${result.queries} quer${"y" if result.queries == 1 else "ies"}.
% else:
seconds.
% endif
Permalink.
% if query.turnitin:
Turnitin Results
% if query.turnitin_result.reports:

Turnitin (through EranBot) found revisions that may have been plagiarized. Please review them.

## TODO: make this prettier/tabular
% for report in query.turnitin_result.reports:
Turnitin report ${report.reportid} for text added in revision ${loop.index}
## TODO: Rework this to something like: [Turnitin report](link) for [revision at timestamp](diff link).
## Requires API-result-parsing/TurnitinReport changes. Shouldn't be too bad.
## Reason: needs to make it clear that Turnitin is looking at individual revisions; current report does not.
% for source in report.sources:
  • ${source['percent']}% of revision text (${source['words']} words) found at ${source['url']}
% endfor
% endfor
% else:

Turnitin (through EranBot) found no matching sources.

% endif
% endif
${query.page.title | h}
% if query.oldid:
@${query.oldid | h}
% endif
% if query.redirected_from:
Redirected from ${query.redirected_from.title | h}. Check original.
% endif
% if result.confidence >= T_SUSPECT:
Violation Suspected
% elif result.confidence >= T_POSSIBLE:
Violation Possible
% elif result.sources:
Violation Unlikely
% else:
No Violation
% endif
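<%doc>
The verdict chain above can be sketched in plain Python, which may help when testing the thresholds outside the template. This is a minimal sketch, not the tool's actual code: the `verdict` and `percent` helpers are hypothetical names, and the threshold values used in the usage note are made up for illustration (the real `T_POSSIBLE` and `T_SUSPECT` live in `copyvios.checker` and are not shown here).

```python
def verdict(confidence, has_sources, t_possible, t_suspect):
    """Return the heading text chosen by the template's if/elif chain.

    Hypothetical helper: thresholds are passed in explicitly because
    their real values are not visible in this template.
    """
    if confidence >= t_suspect:
        return "Violation Suspected"
    if confidence >= t_possible:
        return "Violation Possible"
    if has_sources:
        return "Violation Unlikely"
    return "No Violation"


def percent(confidence):
    """Format a 0..1 confidence the way the template does: one decimal place."""
    return round(confidence * 100, 1)
```

For example, with illustrative thresholds of 0.4 and 0.75, `verdict(0.5, True, 0.4, 0.75)` selects "Violation Possible". A low-confidence result is reported as "Violation Unlikely" only when at least one source was checked; with no sources at all it falls through to "No Violation".
</%doc>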
${round(result.confidence * 100, 1)}%
confidence
% if result.url:
${result.url | urlstrip, h}
% else:
No matches found.
% endif
% if query.action == "search":
<% skips = False %>
Checked Sources
% if result.sources:
URL Confidence Compare
% for i, source in enumerate(result.sources):
= 10 else 'id="source-row-selected"' if i == 0 else ""}>
${source.url | h}
% if source.excluded:
Excluded
% elif source.skipped:
<% skips = True %>
Skipped
% else:
= T_SUSPECT else "source-possible" if source.confidence >= T_POSSIBLE else "source-novio"}">${round(source.confidence * 100, 1)}%
% endif
% if i == 0:
Compare
% else:
Compare
% endif
% endfor
% else:
% endif
% if len(result.sources) > 10:
% endif
% if skips or result.possible_miss:
% endif
% endif
Article:

${highlight_delta(result.article_chain, result.best.chains[1] if result.best else None)}

Source:

${highlight_delta(result.best.chains[0], result.best.chains[1]) if result.best else ""}

% endif
<%include file="/support/footer.mako"/>