A copyright violation detector running on Wikimedia Cloud Services https://tools.wmflabs.org/copyvios/

<%def name="walk_json(obj)">
    <!-- TODO -->
    ${obj | h}
</%def>
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>API - Earwig's Copyvio Detector</title>
    <link rel="stylesheet" href="${request.script_root}/static/api.min.css" type="text/css" />
</head>
<body>
% if help:
    <div id="help">
        <h1>Copyvio Detector API</h1>
        <p>This is the first version of the <a href="//en.wikipedia.org/wiki/Application_programming_interface">API</a> for <a href="${request.script_root}">Earwig's Copyvio Detector</a>. It works, but some bugs may remain, so please <a href="https://github.com/earwig/copyvios/issues">report any</a> that you find.</p>
        <h2>Requests</h2>
        <p>The API responds to GET requests made to <span class="code">https://tools.wmflabs.org/copyvios/api.json</span>. Parameters are described in the tables below:</p>
        <table class="parameters">
            <tr>
                <th colspan="4">Always</th>
            </tr>
            <tr>
                <th>Parameter</th>
                <th>Values</th>
                <th>Required?</th>
                <th>Description</th>
            </tr>
            <tr>
                <td>action</td>
                <td><span class="code">compare</span>, <span class="code">search</span>, <span class="code">sites</span></td>
                <td>Yes</td>
                <td>The API will do URL comparisons in <span class="code">compare</span> mode, run full copyvio checks in <span class="code">search</span> mode, and list all known site languages and projects in <span class="code">sites</span> mode.</td>
            </tr>
            <tr>
                <td>format</td>
                <td><span class="code">json</span>, <span class="code">jsonfm</span></td>
                <td>No&nbsp;(default:&nbsp;<span class="code">json</span>)</td>
                <td>The default output format is <a href="http://json.org/">JSON</a>. <span class="code">jsonfm</span> mode produces the same output, but renders it as a formatted HTML document for debugging.</td>
            </tr>
            <tr>
                <td>version</td>
                <td>integer</td>
                <td>No (default: <span class="code">1</span>)</td>
                <td>Currently, the API only has one version. You can skip this parameter, but including it is recommended for forward compatibility.</td>
            </tr>
        </table>
        <table class="parameters">
            <tr>
                <th colspan="4"><span class="code">compare</span> Mode</th>
            </tr>
            <tr>
                <th>Parameter</th>
                <th>Values</th>
                <th>Required?</th>
                <th>Description</th>
            </tr>
            <tr>
                <td>project</td>
                <td>string</td>
                <td>Yes</td>
                <td>The project code of the site the page lives on. Examples are <span class="code">wikipedia</span> and <span class="code">wiktionary</span>. A list of acceptable values can be retrieved using <span class="code">action=sites</span>.</td>
            </tr>
            <tr>
                <td>lang</td>
                <td>string</td>
                <td>Yes</td>
                <td>The language code of the site the page lives on. Examples are <span class="code">en</span> and <span class="code">de</span>. A list of acceptable values can be retrieved using <span class="code">action=sites</span>.</td>
            </tr>
            <tr>
                <td>title</td>
                <td>string</td>
                <td>Yes&nbsp;(either&nbsp;<span class="code">title</span>&nbsp;or&nbsp;<span class="code">oldid</span>)</td>
                <td>The title of the page or article to make a comparison against. Namespace must be included if the page isn't in the mainspace.</td>
            </tr>
            <tr>
                <td>oldid</td>
                <td>integer</td>
                <td>Yes (either <span class="code">title</span> or <span class="code">oldid</span>)</td>
                <td>The revision ID (also called oldid) of the page revision to make a comparison against. If both a title and oldid are given, the oldid will be used.</td>
            </tr>
            <tr>
                <td>url</td>
                <td>string</td>
                <td>Yes</td>
                <td>The URL of the suspected violation source that will be compared to the page.</td>
            </tr>
        </table>
        <table class="parameters">
            <tr>
                <th colspan="4"><span class="code">search</span> Mode</th>
            </tr>
            <tr>
                <th>Parameter</th>
                <th>Values</th>
                <th>Required?</th>
                <th>Description</th>
            </tr>
            <tr>
                <td>project</td>
                <td>string</td>
                <td>Yes</td>
                <td>The project code of the site the page lives on. Examples are <span class="code">wikipedia</span> and <span class="code">wiktionary</span>. A list of acceptable values can be retrieved using <span class="code">action=sites</span>.</td>
            </tr>
            <tr>
                <td>lang</td>
                <td>string</td>
                <td>Yes</td>
                <td>The language code of the site the page lives on. Examples are <span class="code">en</span> and <span class="code">de</span>. A list of acceptable values can be retrieved using <span class="code">action=sites</span>.</td>
            </tr>
            <tr>
                <td>title</td>
                <td>string</td>
                <td>Yes&nbsp;(either&nbsp;<span class="code">title</span>&nbsp;or&nbsp;<span class="code">oldid</span>)</td>
                <td>The title of the page or article to make a check against. Namespace must be included if the page isn't in the mainspace.</td>
            </tr>
            <tr>
                <td>oldid</td>
                <td>integer</td>
                <td>Yes (either <span class="code">title</span> or <span class="code">oldid</span>)</td>
                <td>The revision ID (also called oldid) of the page revision to make a check against. If both a title and oldid are given, the oldid will be used.</td>
            </tr>
            <tr>
                <td>use_engine</td>
                <td>boolean</td>
                <td>No (default: <span class="code">true</span>)</td>
                <td>Whether to use a search engine (<a href="//developer.yahoo.com/boss/search/">Yahoo! BOSS</a>) as a source of URLs to compare against the page.</td>
            </tr>
            <tr>
                <td>use_links</td>
                <td>boolean</td>
                <td>No (default: <span class="code">true</span>)</td>
                <td>Whether to compare the page against external links found in its wikitext.</td>
            </tr>
            <tr>
                <td>nocache</td>
                <td>boolean</td>
                <td>No (default: <span class="code">false</span>)</td>
                <td>Whether to bypass search results cached from previous checks. It is recommended that you don't pass this option unless a user specifically asks for it.</td>
            </tr>
            <tr>
                <td>noredirect</td>
                <td>boolean</td>
                <td>No (default: <span class="code">false</span>)</td>
                <td>Whether to avoid following redirects if the given page is a redirect.</td>
            </tr>
            <tr>
                <td>noskip</td>
                <td>boolean</td>
                <td>No (default: <span class="code">false</span>)</td>
                <td>If a suspected source is found during a check to have a sufficiently high confidence value, the check will end early, and other pending URLs will be skipped. Passing this option will prevent this behavior, resulting in complete (but more time-consuming) checks.</td>
            </tr>
        </table>
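        <p>As a rough illustration of the parameter tables above, a client could assemble a request URL along the lines of the following Python sketch. The names <span class="code">API_ENDPOINT</span> and <span class="code">build_request_url</span> are hypothetical and chosen only for this example; the endpoint and the parameter names are the ones documented on this page.</p>
        <pre>from urllib.parse import urlencode

# Endpoint as documented on this page.
API_ENDPOINT = "https://tools.wmflabs.org/copyvios/api.json"


def build_request_url(action, **params):
    """Return a GET URL for the given action (hypothetical helper)."""
    query = {"version": 1, "action": action}
    # Drop parameters left as None so that the API defaults apply.
    query.update({k: v for k, v in params.items() if v is not None})
    return API_ENDPOINT + "?" + urlencode(query)


url = build_request_url("search", project="wikipedia", lang="en",
                        title="User:The_Earwig/Sandbox/CopyvioExample")
print(url)</pre>
        <p>Note that <span class="code">urlencode</span> percent-encodes reserved characters in values, such as the colon and slashes in the title, so the printed URL will not be character-for-character identical to a hand-written one.</p>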
        <h2>Responses</h2>
        <p>The JSON response object always contains a <span class="code">status</span> key, whose value is either <span class="code">ok</span> or <span class="code">error</span>. If an error has occurred, the response will look like this:</p>
        <pre>{
    "status": "error",
    "error": {
        "code": (string) error code,
        "info": (string) human-readable description of error
    }
}</pre>
        <p>Valid responses for <span class="code">action=compare</span> and <span class="code">action=search</span> are formatted like this:</p>
        <pre>{
    "status": "ok",
    "meta": {
        "time": (float) time to generate results, in seconds,
        "queries": (int) number of search engine queries made,
        "cached": (boolean) whether or not these results are cached from an earlier search (always false in the case of action=compare),
        (only if cached=true) "cache_time": (string) human-readable time of the original search that the results are cached from,
        "redirected": (boolean) whether or not a redirect was followed
    },
    "page": {
        "title": (string) the normalized title of the page checked,
        "url": (string) the full URL of the page checked
    },
    (only if redirected=true) "original_page": {
        "title": (string) the normalized title of the original page whose redirect was followed,
        "url": (string) the full URL of the original page whose redirect was followed
    },
    "best": {
        "url": (string) the URL of the best match found, or null if no matches were found,
        "confidence": (float) the confidence of a violation in the best match, or 0.0 if no matches were found,
        "violation": (string) one of "suspected", "possible", or "none"
    },
    "sources": [
        {
            "url": (string) the URL of the source,
            "confidence": (float) the confidence of a violation in the source,
            "violation": (string) one of "suspected", "possible", or "none",
            "skipped": (boolean) whether or not the source was skipped due to the check finishing early (see note about noskip above)
        },
        ...
    ]
}</pre>
        <p>In the case of <span class="code">action=search</span>, <span class="code">sources</span> will contain one entry for each source checked (or skipped if the check ends early), sorted by descending confidence, with skipped sources at the bottom.</p>
        <p>In the case of <span class="code">action=compare</span>, <span class="code">best</span> will always contain information about the URL that was given, so <span class="code">response["best"]["url"]</span> will never be <span class="code">null</span>. Also, <span class="code">sources</span> will always contain one entry, with the same data as <span class="code">best</span>, since only one source is checked in comparison mode.</p>
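        <p>A client might handle these responses roughly as in the following Python sketch, which assumes the response has already been decoded into a dictionary. The function name <span class="code">flagged_sources</span> is hypothetical, and treating every non-skipped source whose <span class="code">violation</span> is not <span class="code">none</span> as worth reporting is a design choice of this example rather than something the API prescribes.</p>
        <pre>def flagged_sources(response):
    """Return URLs of checked sources marked "suspected" or "possible"."""
    if response["status"] != "ok":
        # Error responses carry a code and a human-readable description.
        error = response["error"]
        raise RuntimeError(error["code"] + ": " + error["info"])
    return [
        source["url"]
        for source in response["sources"]
        if not source["skipped"] and source["violation"] != "none"
    ]


# A trimmed-down response in the documented shape.
sample = {
    "status": "ok",
    "sources": [
        {"url": "http://www.whitehouse.gov/administration/president-obama/",
         "confidence": 0.9886608511242603,
         "violation": "suspected", "skipped": False},
        {"url": "http://www.whitehouse.gov/about/presidents/barackobama",
         "confidence": 0.0, "violation": "none", "skipped": True},
    ],
}
print(flagged_sources(sample))</pre>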
        <p>Valid responses for <span class="code">action=sites</span> are formatted like this:</p>
        <pre>{
    "status": "ok",
    "langs": [
        [
            (string) language code,
            (string) human-readable language name
        ],
        ...
    ],
    "projects": [
        [
            (string) project code,
            (string) human-readable project name
        ],
        ...
    ]
}</pre>
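        <p>Because <span class="code">langs</span> and <span class="code">projects</span> are lists of two-element lists, a client may find it convenient to turn them into lookup tables. The Python sketch below shows one way to do that; the function name <span class="code">site_tables</span> is hypothetical, and the sample names are illustrative values, not actual API output.</p>
        <pre>def site_tables(response):
    """Convert the langs and projects pair lists into dictionaries."""
    langs = {code: name for code, name in response["langs"]}
    projects = {code: name for code, name in response["projects"]}
    return langs, projects


# Illustrative values only; real entries come from action=sites.
sample_sites = {
    "status": "ok",
    "langs": [["en", "English"], ["de", "Deutsch"]],
    "projects": [["wikipedia", "Wikipedia"], ["wiktionary", "Wiktionary"]],
}
langs, projects = site_tables(sample_sites)
print(langs["de"], projects["wiktionary"])</pre>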
        <h2>Example</h2>
        <p>GET https://tools.wmflabs.org/copyvios/api.json?version=1&amp;action=search&amp;project=wikipedia&amp;lang=en&amp;title=User:The_Earwig/Sandbox/CopyvioExample</p>
        <pre>
{
    "status": "ok",
    "meta": {
        "time": 2.2474379539489746,
        "queries": 1,
        "cached": false,
        "redirected": false
    },
    "page": {
        "title": "User:The Earwig/Sandbox/CopyvioExample",
        "url": "https://en.wikipedia.org/wiki/User:The_Earwig/Sandbox/CopyvioExample"
    },
    "best": {
        "url": "http://www.whitehouse.gov/administration/president-obama/",
        "confidence": 0.9886608511242603,
        "violation": "suspected"
    },
    "sources": [
        {
            "url": "http://www.whitehouse.gov/administration/president-obama/",
            "confidence": 0.9886608511242603,
            "violation": "suspected",
            "skipped": false
        },
        {
            "url": "http://maige2009.blogspot.com/2013/07/barack-h-obama-is-44th-president-of.html",
            "confidence": 0.9864798816568047,
            "violation": "suspected",
            "skipped": false
        },
        {
            "url": "http://jeuxdemonstre-apkdownload.rhcloud.com/luo-people-of-kenya-and-tanzania---wikipedia--the-free",
            "confidence": 0.0,
            "violation": "none",
            "skipped": false
        },
        {
            "url": "http://www.whitehouse.gov/about/presidents/barackobama",
            "confidence": 0.0,
            "violation": "none",
            "skipped": true
        },
        {
            "url": "http://jeuxdemonstre-apkdownload.rhcloud.com/president-barack-obama---the-white-house",
            "confidence": 0.0,
            "violation": "none",
            "skipped": true
        }
    ]
}
</pre>
    </div>
% endif
% if result:
    <div id="result">
        <p>You are using <span class="code">jsonfm</span> output mode, which renders JSON data as a formatted HTML document. This is intended for testing and debugging only.</p>
        ${walk_json(result)}
    </div>
% endif
</body>
</html>