A Python parser for MediaWiki wikicode https://mwparserfromhell.readthedocs.io/

name: basic
label: basic external link
input: "http://example.com/"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose()]
---
name: basic_brackets
label: basic external link in brackets
input: "[http://example.com/]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkClose()]
---
name: brackets_space
label: basic external link in brackets, with a space after
input: "[http://example.com/ ]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), ExternalLinkClose()]
---
name: brackets_title
label: basic external link in brackets, with a title
input: "[http://example.com/ Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: brackets_multiword_title
label: basic external link in brackets, with a multi-word title
input: "[http://example.com/ Example Web Page]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), Text(text="Example Web Page"), ExternalLinkClose()]
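The bracketed cases above suggest that a token list maps back onto its input text in a simple way. The following is a minimal sketch of that mapping, not the parser's own code: the token classes are stand-ins defined here for illustration, `render` is a hypothetical helper, and the separator is assumed to be a single space, as it is in the cases above.

```python
from dataclasses import dataclass


# Stand-in token types for illustration only; the real parser defines its own.
@dataclass
class Text:
    text: str


@dataclass
class ExternalLinkOpen:
    brackets: bool


class ExternalLinkSeparator:
    pass


class ExternalLinkClose:
    pass


def render(tokens):
    """Rebuild source text from a flat token list (assumes a one-space separator)."""
    out = []
    bracket_stack = []  # remembers whether each open link used brackets
    for tok in tokens:
        if isinstance(tok, ExternalLinkOpen):
            bracket_stack.append(tok.brackets)
            if tok.brackets:
                out.append("[")
        elif isinstance(tok, ExternalLinkSeparator):
            out.append(" ")
        elif isinstance(tok, ExternalLinkClose):
            if bracket_stack.pop():
                out.append("]")
        elif isinstance(tok, Text):
            out.append(tok.text)
    return "".join(out)


tokens = [
    ExternalLinkOpen(brackets=True),
    Text(text="http://example.com/"),
    ExternalLinkSeparator(),
    Text(text="Example Web Page"),
    ExternalLinkClose(),
]
print(render(tokens))
```

Under these assumptions, the brackets_multiword_title token list renders back to its input string.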
---
name: brackets_adjacent
label: three adjacent bracket-enclosed external links
input: "[http://foo.com/ Foo][http://bar.com/ Bar]\n[http://baz.com/ Baz]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://foo.com/"), ExternalLinkSeparator(), Text(text="Foo"), ExternalLinkClose(), ExternalLinkOpen(brackets=True), Text(text="http://bar.com/"), ExternalLinkSeparator(), Text(text="Bar"), ExternalLinkClose(), Text(text="\n"), ExternalLinkOpen(brackets=True), Text(text="http://baz.com/"), ExternalLinkSeparator(), Text(text="Baz"), ExternalLinkClose()]
---
name: brackets_newline_before
label: bracket-enclosed link with a newline before the title
input: "[http://example.com/ \nExample]"
output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" \nExample]")]
---
name: brackets_newline_inside
label: bracket-enclosed link with a newline in the title
input: "[http://example.com/ Example \nWeb Page]"
output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" Example \nWeb Page]")]
---
name: brackets_newline_after
label: bracket-enclosed link with a newline after the title
input: "[http://example.com/ Example\n]"
output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" Example\n]")]
---
name: brackets_space_before
label: bracket-enclosed link with a space before the URL
input: "[ http://example.com Example]"
output: [Text(text="[ "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" Example]")]
---
name: brackets_title_like_url
label: bracket-enclosed link with a title that looks like a URL
input: "[http://example.com http://example.com]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com"), ExternalLinkSeparator(), Text(text="http://example.com"), ExternalLinkClose()]
---
name: brackets_recursive
label: bracket-enclosed link with a bracket-enclosed link as the title
input: "[http://example.com [http://example.com]]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com"), ExternalLinkSeparator(), Text(text="[http://example.com"), ExternalLinkClose(), Text(text="]")]
---
name: brackets_recursive_2
label: bracket-enclosed link with a double bracket-enclosed link as the title
input: "[http://example.com [[http://example.com]]]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com"), ExternalLinkSeparator(), Text(text="[[http://example.com"), ExternalLinkClose(), Text(text="]]")]
---
name: period_after
label: a period after a free link that is excluded
input: "http://example.com."
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=".")]
---
name: colons_after
label: colons after a free link that are excluded
input: "http://example.com/foo:bar.:;baz!?,"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/foo:bar.:;baz"), ExternalLinkClose(), Text(text="!?,")]
---
name: close_paren_after_excluded
label: a closing parenthesis after a free link that is excluded
input: "http://example.)com)"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.)com"), ExternalLinkClose(), Text(text=")")]
---
name: close_paren_after_included
label: a closing parenthesis after a free link that is included because of an opening parenthesis in the URL
input: "http://example.(com)"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.(com)"), ExternalLinkClose()]
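The four free-link cases above (period_after through close_paren_after_included) are consistent with a simple trailing-punctuation rule. The sketch below is one way to express it and is inferred from those cases only; `split_trailing_punctuation` and the exact punctuation set are assumptions, not the tokenizer's actual implementation.

```python
# Assumed punctuation set, taken from the characters seen in the cases above.
TRAILING_PUNCTUATION = ".,:;!?"


def split_trailing_punctuation(url):
    """Split a free link into (link, tail), where tail is excluded punctuation."""
    punct = TRAILING_PUNCTUATION
    # A closing parenthesis is only excluded when the URL has no opening one.
    if "(" not in url:
        punct += ")"
    link = url.rstrip(punct)
    return link, url[len(link):]


for example in (
    "http://example.com.",
    "http://example.com/foo:bar.:;baz!?,",
    "http://example.)com)",
    "http://example.(com)",
):
    print(split_trailing_punctuation(example))
```

Punctuation in the middle of the URL is untouched because only the trailing run is stripped.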
---
name: open_bracket_inside
label: an open bracket inside a free link that causes it to be ended abruptly
input: "http://foobar[baz.com"
output: [ExternalLinkOpen(brackets=False), Text(text="http://foobar"), ExternalLinkClose(), Text(text="[baz.com")]
---
name: brackets_period_after
label: a period after a bracket-enclosed link that is included
input: "[http://example.com. Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com."), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: brackets_punct_after
label: punctuation after a bracket-enclosed link that is included
input: "[http://example.com/foo:bar.:;baz!?, Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/foo:bar.:;baz!?,"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: brackets_close_paren_after_included
label: a closing parenthesis after a bracket-enclosed link that is included
input: "[http://example.)com) Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.)com)"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: brackets_close_paren_after_included_2
label: a closing parenthesis after a bracket-enclosed link that is also included
input: "[http://example.(com) Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.(com)"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: brackets_open_bracket_inside
label: an open bracket inside a bracket-enclosed link that is also included
input: "[http://foobar[baz.com Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://foobar[baz.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: adjacent_space
label: two free links separated by a space
input: "http://example.com http://example.com"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]
---
name: adjacent_newline
label: two free links separated by a newline
input: "http://example.com\nhttp://example.com"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="\n"), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]
---
name: adjacent_close_bracket
label: two free links separated by a close bracket
input: "http://example.com]http://example.com"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="]"), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]
---
name: html_entity_in_url
label: an HTML entity parsed correctly inside a free link
input: "http://exa&nbsp;mple.com/"
output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="mple.com/"), ExternalLinkClose()]
---
name: template_in_url
label: a template parsed correctly inside a free link
input: "http://exa{{template}}mple.com/"
output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), TemplateOpen(), Text(text="template"), TemplateClose(), Text(text="mple.com/"), ExternalLinkClose()]
---
name: argument_in_url
label: an argument parsed correctly inside a free link
input: "http://exa{{{argument}}}mple.com/"
output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ArgumentOpen(), Text(text="argument"), ArgumentClose(), Text(text="mple.com/"), ExternalLinkClose()]
---
name: wikilink_in_url
label: a wikilink that destroys a free link
input: "http://exa[[wikilink]]mple.com/"
output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ExternalLinkClose(), WikilinkOpen(), Text(text="wikilink"), WikilinkClose(), Text(text="mple.com/")]
---
name: external_link_in_url
label: a bracketed link that destroys a free link
input: "http://exa[http://example.com/]mple.com/"
output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ExternalLinkClose(), ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkClose(), Text(text="mple.com/")]
---
name: spaces_padding
label: spaces padding a free link
input: " http://example.com "
output: [Text(text=" "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" ")]
---
name: text_and_spaces_padding
label: text and spaces padding a free link
input: "x http://example.com x"
output: [Text(text="x "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" x")]
---
name: template_before
label: a template before a free link
input: "{{foo}}http://example.com"
output: [TemplateOpen(), Text(text="foo"), TemplateClose(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]
---
name: spaces_padding_no_slashes
label: spaces padding a free link with no slashes after the colon
input: " mailto:example@example.com "
output: [Text(text=" "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text=" ")]
---
name: text_and_spaces_padding_no_slashes
label: text and spaces padding a free link with no slashes after the colon
input: "x mailto:example@example.com x"
output: [Text(text="x "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text=" x")]
---
name: template_before_no_slashes
label: a template before a free link with no slashes after the colon
input: "{{foo}}mailto:example@example.com"
output: [TemplateOpen(), Text(text="foo"), TemplateClose(), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose()]
---
name: no_slashes
label: a free link with no slashes after the colon
input: "mailto:example@example.com"
output: [ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose()]
---
name: slashes_optional
label: a free link using a scheme that doesn't need slashes, but has them anyway
input: "mailto://example@example.com"
output: [ExternalLinkOpen(brackets=False), Text(text="mailto://example@example.com"), ExternalLinkClose()]
---
name: short
label: a very short free link
input: "mailto://abc"
output: [ExternalLinkOpen(brackets=False), Text(text="mailto://abc"), ExternalLinkClose()]
---
name: slashes_missing
label: slashes missing from a free link with a scheme that requires them
input: "http:example@example.com"
output: [Text(text="http:example@example.com")]
---
name: no_scheme_but_slashes
label: no scheme in a free link, but slashes (protocol-relative free links are not supported)
input: "//example.com"
output: [Text(text="//example.com")]
---
name: no_scheme_but_colon
label: no scheme in a free link, but a colon
input: " :example.com"
output: [Text(text=" :example.com")]
---
name: no_scheme_but_colon_and_slashes
label: no scheme in a free link, but a colon and slashes
input: " ://example.com"
output: [Text(text=" ://example.com")]
---
name: fake_scheme_no_slashes
label: a nonexistent scheme in a free link, without slashes
input: "fake:example.com"
output: [Text(text="fake:example.com")]
---
name: fake_scheme_slashes
label: a nonexistent scheme in a free link, with slashes
input: "fake://example.com"
output: [Text(text="fake://example.com")]
---
name: fake_scheme_brackets_no_slashes
label: a nonexistent scheme in a bracketed link, without slashes
input: "[fake:example.com]"
output: [Text(text="[fake:example.com]")]
---
name: fake_scheme_brackets_slashes
label: a nonexistent scheme in a bracketed link, with slashes
input: "[fake://example.com]"
output: [Text(text="[fake://example.com]")]
---
name: interrupted_scheme
label: an otherwise valid scheme with something in the middle of it, in a free link
input: "ht?tp://example.com"
output: [Text(text="ht?tp://example.com")]
---
name: interrupted_scheme_brackets
label: an otherwise valid scheme with something in the middle of it, in a bracketed link
input: "[ht?tp://example.com]"
output: [Text(text="[ht?tp://example.com]")]
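The scheme cases above (no_slashes through interrupted_scheme_brackets) can be summarised as a lookup: a scheme must be known, and some schemes additionally require `//` after the colon. The sketch below illustrates that reading. The table `SLASHES_REQUIRED` and the function `has_valid_scheme` are hypothetical and list only the two schemes exercised here; the sketch checks the scheme alone, so it does not cover protocol-relative links or the requirement that a non-empty URL follows.

```python
# Hypothetical scheme table: True means "//" is required after the colon.
# Only the two schemes that appear in the cases above are listed.
SLASHES_REQUIRED = {"http": True, "mailto": False}


def has_valid_scheme(url):
    """Return True if url starts with a known scheme in an accepted form."""
    scheme, colon, rest = url.partition(":")
    if not colon or scheme not in SLASHES_REQUIRED:
        # No colon at all, an unknown scheme, or a scheme with stray characters.
        return False
    has_slashes = rest.startswith("//")
    return has_slashes or not SLASHES_REQUIRED[scheme]


for example in (
    "mailto:example@example.com",
    "mailto://abc",
    "http:example@example.com",
    "fake://example.com",
    "ht?tp://example.com",
):
    print(example, has_valid_scheme(example))
```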
---
name: no_slashes_brackets
label: no slashes after the colon in a bracketed link
input: "[mailto:example@example.com Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="mailto:example@example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: space_before_no_slashes_brackets
label: a space before a bracketed link with no slashes after the colon
input: "[ mailto:example@example.com Example]"
output: [Text(text="[ "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text=" Example]")]
---
name: slashes_optional_brackets
label: a bracketed link using a scheme that doesn't need slashes, but has them anyway
input: "[mailto://example@example.com Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="mailto://example@example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: short_brackets
label: a very short link in brackets
input: "[mailto://abc Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="mailto://abc"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: slashes_missing_brackets
label: slashes missing from a scheme that requires them in a bracketed link
input: "[http:example@example.com Example]"
output: [Text(text="[http:example@example.com Example]")]
---
name: protcol_relative
label: a protocol-relative link (in brackets)
input: "[//example.com Example]"
output: [ExternalLinkOpen(brackets=True), Text(text="//example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]
---
name: scheme_missing_but_colon_brackets
label: scheme missing from a bracketed link, but with a colon
input: "[:example.com Example]"
output: [Text(text="[:example.com Example]")]
---
name: scheme_missing_but_colon_slashes_brackets
label: scheme missing from a bracketed link, but with a colon and slashes
input: "[://example.com Example]"
output: [Text(text="[://example.com Example]")]
---
name: unclosed_protocol_relative
label: an unclosed protocol-relative bracketed link
input: "[//example.com"
output: [Text(text="[//example.com")]
---
name: space_before_protcol_relative
label: a space before a protocol-relative bracketed link
input: "[ //example.com]"
output: [Text(text="[ //example.com]")]
---
name: unclosed_just_scheme
label: an unclosed bracketed link, ending after the scheme
input: "[http"
output: [Text(text="[http")]
---
name: unclosed_scheme_colon
label: an unclosed bracketed link, ending after the colon
input: "[http:"
output: [Text(text="[http:")]
---
name: unclosed_scheme_colon_slashes
label: an unclosed bracketed link, ending after the slashes
input: "[http://"
output: [Text(text="[http://")]
---
name: incomplete_bracket
label: just an open bracket
input: "["
output: [Text(text="[")]
---
name: incomplete_scheme_colon
label: a free link with just a scheme and a colon
input: "http:"
output: [Text(text="http:")]
---
name: incomplete_scheme_colon_slashes
label: a free link with just a scheme, colon, and slashes
input: "http://"
output: [Text(text="http://")]
---
name: brackets_scheme_but_no_url
label: brackets around a scheme and a colon
input: "[mailto:]"
output: [Text(text="[mailto:]")]
---
name: brackets_scheme_slashes_but_no_url
label: brackets around a scheme, colon, and slashes
input: "[http://]"
output: [Text(text="[http://]")]
---
name: brackets_scheme_title_but_no_url
label: brackets around a scheme, colon, and slashes, with a title
input: "[http:// Example]"
output: [Text(text="[http:// Example]")]
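Every case in this file follows the same four-field layout, with cases separated by a line of three dashes. The following is a minimal sketch of a reader for that layout, assuming the four field names seen above and that inputs use Python-style backslash escapes such as `\n`. `parse_cases` is a hypothetical helper, not the project's own test loader, and it keeps the expected token list as an unevaluated string.

```python
FIELDS = ("name", "label", "input", "output")


def parse_cases(text):
    """Parse test cases from text into a list of dicts keyed by field name."""
    cases = []
    for block in text.split("\n---\n"):
        case = {}
        for line in block.strip().splitlines():
            key, colon, value = line.partition(":")
            if colon and key in FIELDS:
                case[key] = value.strip()
        if "input" in case:
            raw = case["input"]
            # Drop the surrounding double quotes, then expand escapes like \n.
            if len(raw) >= 2 and raw[0] == raw[-1] == '"':
                raw = raw[1:-1]
            case["input"] = raw.encode("raw_unicode_escape").decode("unicode_escape")
        if case:
            cases.append(case)
    return cases


# A two-case sample assembled from entries in this file.
SAMPLE = "\n".join([
    "name: adjacent_newline",
    "label: two free links separated by a newline",
    r'input: "http://example.com\nhttp://example.com"',
    'output: [Text(text="placeholder")]',
    "---",
    "name: incomplete_bracket",
    "label: just an open bracket",
    'input: "["',
    'output: [Text(text="[")]',
])

cases = parse_cases(SAMPLE)
print([case["name"] for case in cases])
```

Lines that do not begin with one of the four field names are skipped by this sketch, so it tolerates blank lines between fields.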