<%def name="walk_json(obj)"> ${obj | h} API - Earwig's Copyvio Detector % if help:

Copyvio Detector API

This is the first version of the API for Earwig's Copyvio Detector. It works, but some bugs may remain, so please report any that you find.

Requests

The API responds to GET requests made to https://tools.wmflabs.org/copyvios/api.json. Parameters are described in the tables below:

Always
Parameter | Values | Required? | Description
action | compare, search, sites | Yes | The API will do URL comparisons in compare mode, run full copyvio checks in search mode, and list all known site languages and projects in sites mode.
format | json, jsonfm | No (default: json) | The default output format is JSON. jsonfm mode produces the same output, but renders it as a formatted HTML document for debugging.
version | integer | No (default: 1) | Currently, the API only has one version. You can skip this parameter, but it is recommended to include it for forward compatibility.
compare Mode
Parameter | Values | Required? | Description
project | string | Yes | The project code of the site the page lives on. Examples are wikipedia and wiktionary. A list of acceptable values can be retrieved using action=sites.
lang | string | Yes | The language code of the site the page lives on. Examples are en and de. A list of acceptable values can be retrieved using action=sites.
title | string | Yes (either title or oldid) | The title of the page or article to make a comparison against. Namespace must be included if the page isn't in the mainspace.
oldid | integer | Yes (either title or oldid) | The revision ID (also called oldid) of the page revision to make a comparison against. If both a title and oldid are given, the oldid will be used.
url | string | Yes | The URL of the suspected violation source that will be compared to the page.
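
As an illustration of the compare-mode parameters, here is a minimal Python sketch that assembles a request URL. The helper name build_compare_url is hypothetical and not part of the API; it only enforces the "either title or oldid" rule from the table and relies on the standard library for query-string encoding.

```python
from urllib.parse import urlencode

API_URL = "https://tools.wmflabs.org/copyvios/api.json"


def build_compare_url(project, lang, url, title=None, oldid=None):
    """Return a GET URL for action=compare (hypothetical helper)."""
    if title is None and oldid is None:
        raise ValueError("either title or oldid is required")
    params = {
        "version": 1,
        "action": "compare",
        "project": project,
        "lang": lang,
    }
    # The API uses oldid when both are given, so send only one of them.
    if oldid is not None:
        params["oldid"] = oldid
    else:
        params["title"] = title
    params["url"] = url
    return API_URL + "?" + urlencode(params)
```

The resulting string can be passed to any HTTP client that issues GET requests.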
search Mode
Parameter | Values | Required? | Description
project | string | Yes | The project code of the site the page lives on. Examples are wikipedia and wiktionary. A list of acceptable values can be retrieved using action=sites.
lang | string | Yes | The language code of the site the page lives on. Examples are en and de. A list of acceptable values can be retrieved using action=sites.
title | string | Yes (either title or oldid) | The title of the page or article to make a check against. Namespace must be included if the page isn't in the mainspace.
oldid | integer | Yes (either title or oldid) | The revision ID (also called oldid) of the page revision to make a check against. If both a title and oldid are given, the oldid will be used.
use_engine | boolean | No (default: true) | Whether to use a search engine (Yahoo! BOSS) as a source of URLs to compare against the page.
use_links | boolean | No (default: true) | Whether to compare the page against external links found in its wikitext.
nocache | boolean | No (default: false) | Whether to bypass search results cached from previous checks. It is recommended that you don't pass this option unless a user specifically asks for it.
noredirect | boolean | No (default: false) | Whether to avoid following redirects if the given page is a redirect.
noskip | boolean | No (default: false) | If a suspected source is found during a check to have a sufficiently high confidence value, the check will end prematurely, and other pending URLs will be skipped. Passing this option will prevent this behavior, resulting in complete (but more time-consuming) checks.
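
The optional search-mode flags can be handled in a similar way. The sketch below is only an illustration: build_search_url and SEARCH_DEFAULTS are hypothetical names, and sending booleans as the strings "true" and "false" is an assumption, since the table does not specify how boolean values are encoded. Options equal to their documented default are simply left out of the query string.

```python
from urllib.parse import urlencode

API_URL = "https://tools.wmflabs.org/copyvios/api.json"

# Defaults as listed in the search-mode table.
SEARCH_DEFAULTS = {
    "use_engine": True,
    "use_links": True,
    "nocache": False,
    "noredirect": False,
    "noskip": False,
}


def build_search_url(project, lang, title, **flags):
    """Return a GET URL for action=search (hypothetical helper)."""
    unknown = set(flags) - set(SEARCH_DEFAULTS)
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
    params = {
        "version": 1,
        "action": "search",
        "project": project,
        "lang": lang,
        "title": title,
    }
    for name, value in flags.items():
        # Only send options that differ from the documented default.
        if bool(value) != SEARCH_DEFAULTS[name]:
            # ASSUMPTION: booleans are encoded as "true"/"false".
            params[name] = "true" if value else "false"
    # Leave ":" and "/" unescaped so page titles stay readable.
    return API_URL + "?" + urlencode(params, safe=":/")
```

For example, build_search_url("wikipedia", "en", "User:The_Earwig/Sandbox/CopyvioExample", noskip=True) would request a complete check of that page without early termination, provided the boolean encoding assumed here is accepted.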

Responses

The JSON response object always contains a status key, whose value is either ok or error. If an error has occurred, the response will look like this:

{
    "status": "error",
    "error": {
        "code": (string) error code,
        "info": (string) human-readable description of error
    }
}
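
A client might check the status key before touching any other part of the response. The following Python sketch shows one way to do that; the exception class CopyvioAPIError and the function parse_response are hypothetical names chosen for this example, not something the API provides.

```python
import json


class CopyvioAPIError(Exception):
    """Raised when the API reports status=error (hypothetical class name)."""

    def __init__(self, code, info):
        super().__init__(f"{code}: {info}")
        self.code = code
        self.info = info


def parse_response(body):
    """Decode a JSON response body and raise if it reports an error."""
    data = json.loads(body)
    if data.get("status") == "error":
        err = data.get("error", {})
        raise CopyvioAPIError(err.get("code"), err.get("info"))
    return data
```

Callers can then catch CopyvioAPIError and inspect its code and info attributes, which mirror the two fields of the error object.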

Valid responses for action=compare and action=search are formatted like this:

{
    "status": "ok",
    "meta": {
        "time":       (float) time to generate results, in seconds,
        "queries":    (int) number of search engine queries made,
        "cached":     (boolean) whether or not these results are cached from an earlier search (always false in the case of action=compare),
        (only if cached=true) "cache_time": (string) human-readable time of the original search that the results are cached from,
        "redirected": (boolean) whether or not a redirect was followed
    },
    "page": {
        "title": (string) the normalized title of the page checked,
        "url":   (string) the full URL of the page checked
    },
    (only if redirected=true) "original_page": {
        "title": (string) the normalized title of the original page whose redirect was followed,
        "url":   (string) the full URL of the original page whose redirect was followed
    },
    "best": {
        "url":        (string) the URL of the best match found, or null if no matches were found,
        "confidence": (float) the confidence of a violation in the best match, or 0.0 if no matches were found,
        "violation":  (string) one of "suspected", "possible", or "none"
    },
    "sources": [
        {
            "url":        (string) the URL of the source,
            "confidence": (float) the confidence of a violation in the source,
            "violation":  (string) one of "suspected", "possible", or "none",
            "skipped":    (boolean) whether or not the source was skipped due to the check finishing early (see note about noskip above)
        },
        ...
    ]
}

In the case of action=search, sources will contain one entry for each source checked (or skipped if the check ends early), sorted in descending order of confidence, with skipped sources at the bottom.

In the case of action=compare, best will always contain information about the URL that was given, so response["best"]["url"] will never be null. Also, sources will always contain one entry, with the same data as best, since only one source is checked in comparison mode.
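
To show how the best and sources keys might be consumed, here is a small Python sketch that separates checked sources from skipped ones and collects the URLs whose violation value is not "none". The function name summarize_sources and the shape of its return value are assumptions made for this example.

```python
def summarize_sources(response):
    """Summarize a successful compare/search response (hypothetical helper)."""
    sources = response.get("sources", [])
    # Skipped sources were never compared, so their confidence is not meaningful.
    checked = [s for s in sources if not s["skipped"]]
    skipped = [s for s in sources if s["skipped"]]
    # Keep "suspected" and "possible" matches, preserving the API's ordering.
    flagged = [s["url"] for s in checked if s["violation"] != "none"]
    return {
        "best_url": response["best"]["url"],
        "checked": len(checked),
        "skipped": len(skipped),
        "flagged": flagged,
    }
```

Because best["url"] may be null in search mode, callers should be prepared for best_url to be None when no matches were found.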

Valid responses for action=sites are formatted like this:

{
    "status": "ok",
    "langs": [
        [
            (string) language code,
            (string) human-readable language name
        ],
        ...
    ],
    "projects": [
        [
            (string) project code,
            (string) human-readable project name
        ],
        ...
    ]
}
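
Since langs and projects are lists of [code, name] pairs, a client may find it convenient to turn them into dictionaries, for instance to validate the project and lang parameters before making a request. The Python sketch below does this; index_sites and is_known_site are hypothetical helper names.

```python
def index_sites(response):
    """Convert the [code, name] pairs of a sites response into two dicts."""
    langs = {code: name for code, name in response["langs"]}
    projects = {code: name for code, name in response["projects"]}
    return langs, projects


def is_known_site(response, project, lang):
    """Return True if both codes appear in the sites response."""
    langs, projects = index_sites(response)
    return project in projects and lang in langs
```

Whether every combination of a listed language and a listed project corresponds to an existing site is not stated above, so a positive result here should be treated as a basic sanity check only.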

Example

GET https://tools.wmflabs.org/copyvios/api.json?version=1&action=search&project=wikipedia&lang=en&title=User:The_Earwig/Sandbox/CopyvioExample

{
    "status": "ok",
    "meta": {
        "time": 2.2474379539489746,
        "queries": 1,
        "cached": false,
        "redirected": false
    },
    "page": {
        "title": "User:The Earwig/Sandbox/CopyvioExample",
        "url": "https://en.wikipedia.org/wiki/User:The_Earwig/Sandbox/CopyvioExample"
    },
    "best": {
        "url": "http://www.whitehouse.gov/administration/president-obama/",
        "confidence": 0.9886608511242603,
        "violation": "suspected"
    },
    "sources": [
        {
            "url": "http://www.whitehouse.gov/administration/president-obama/",
            "confidence": 0.9886608511242603,
            "violation": "suspected",
            "skipped": false
        },
        {
            "url": "http://maige2009.blogspot.com/2013/07/barack-h-obama-is-44th-president-of.html",
            "confidence": 0.9864798816568047,
            "violation": "suspected",
            "skipped": false
        },
        {
            "url": "http://jeuxdemonstre-apkdownload.rhcloud.com/luo-people-of-kenya-and-tanzania---wikipedia--the-free",
            "confidence": 0.0,
            "violation": "none",
            "skipped": false
        },
        {
            "url": "http://www.whitehouse.gov/about/presidents/barackobama",
            "confidence": 0.0,
            "violation": "none",
            "skipped": true
        },
        {
            "url": "http://jeuxdemonstre-apkdownload.rhcloud.com/president-barack-obama---the-white-house",
            "confidence": 0.0,
            "violation": "none",
            "skipped": true
        }
    ]
}
% endif % if result:

You are using jsonfm output mode, which renders JSON data as a formatted HTML document. This is intended for testing and debugging only.

${walk_json(result)}
% endif