A Python parser for MediaWiki wikicode https://mwparserfromhell.readthedocs.io/
  1. name: basic
  2. label: a basic tag with an open and close
  3. input: "<ref></ref>"
  4. output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  5. ---
  6. name: basic_selfclosing
  7. label: a basic self-closing tag
  8. input: "<ref/>"
  9. output: [TagOpenOpen(), Text(text="ref"), TagCloseSelfclose(padding="")]
  10. ---
  11. name: content
  12. label: a tag with some content in the middle
  13. input: "<ref>this is a reference</ref>"
  14. output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), Text(text="this is a reference"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  15. ---
  16. name: padded_open
  17. label: a tag with some padding in the open tag
  18. input: "<ref ></ref>"
  19. output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=" "), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  20. ---
  21. name: padded_close
  22. label: a tag with some padding in the close tag
  23. input: "<ref></ref >"
  24. output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref "), TagCloseClose()]
  25. ---
  26. name: padded_selfclosing
  27. label: a self-closing tag with padding
  28. input: "<ref />"
  29. output: [TagOpenOpen(), Text(text="ref"), TagCloseSelfclose(padding=" ")]
  30. ---
  31. name: attribute
  32. label: a tag with a single attribute
  33. input: "<ref name></ref>"
  34. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  35. ---
  36. name: attribute_value
  37. label: a tag with a single attribute with a value
  38. input: "<ref name=foo></ref>"
  39. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  40. ---
  41. name: attribute_quoted
  42. label: a tag with a single quoted attribute
  43. input: "<ref name=\"foo bar\"></ref>"
  44. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="foo bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  45. ---
  46. name: attribute_single_quoted
  47. label: a tag with a single singly-quoted attribute
  48. input: "<ref name='foo bar'></ref>"
  49. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(char="'"), Text(text="foo bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  50. ---
  51. name: attribute_hyphen
  52. label: a tag with a single attribute, containing a hyphen
  53. input: "<ref name=foo-bar></ref>"
  54. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo-bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  55. ---
  56. name: attribute_quoted_hyphen
  57. label: a tag with a single quoted attribute, containing a hyphen
  58. input: "<ref name=\"foo-bar\"></ref>"
  59. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="foo-bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  60. ---
  61. name: attribute_selfclosing
  62. label: a self-closing tag with a single attribute
  63. input: "<ref name/>"
  64. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagCloseSelfclose(padding="")]
  65. ---
  66. name: attribute_selfclosing_value
  67. label: a self-closing tag with a single attribute with a value
  68. input: "<ref name=foo/>"
  69. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo"), TagCloseSelfclose(padding="")]
  70. ---
  71. name: attribute_selfclosing_value_quoted
  72. label: a self-closing tag with a single quoted attribute
  73. input: "<ref name=\"foo\"/>"
  74. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="foo"), TagCloseSelfclose(padding="")]
  75. ---
  76. name: nested_tag
  77. label: a tag nested within the attributes of another
  78. input: "<ref name=<span style=\"color: red;\">foo</span>>citation</ref>"
  79. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="color: red;"), TagCloseOpen(padding=""), Text(text="foo"), TagOpenClose(), Text(text="span"), TagCloseClose(), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  80. ---
  81. name: nested_tag_quoted
  82. label: a tag nested within the attributes of another, quoted
  83. input: "<ref name=\"<span style=\"color: red;\">foo</span>\">citation</ref>"
  84. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(char="\""), TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="color: red;"), TagCloseOpen(padding=""), Text(text="foo"), TagOpenClose(), Text(text="span"), TagCloseClose(), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  85. ---
  86. name: nested_troll_tag
  87. label: a bogus tag that appears to be nested within the attributes of another
  88. input: "<ref name=</ ><//>>citation</ref>"
  89. output: [Text(text="<ref name=</ ><//>>citation</ref>")]
  90. ---
  91. name: nested_troll_tag_quoted
  92. label: a bogus tag that appears to be nested within the attributes of another, quoted
  93. input: "<ref name=\"</ ><//>\">citation</ref>"
  94. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="</ ><//>"), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  95. ---
  96. name: nested_tag_selfclosing
  97. label: a tag nested within the attributes of another; outer tag implicitly self-closing
  98. input: "<li <b></b></li>"
  99. output: [TagOpenOpen(), Text(text="li"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), TagOpenOpen(), Text(text="b"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="</li"), TagCloseSelfclose(padding="", implicit=True)]
  100. ---
  101. name: invalid_space_begin_open
  102. label: invalid tag: a space at the beginning of the open tag
  103. input: "< ref>test</ref>"
  104. output: [Text(text="< ref>test</ref>")]
  105. ---
  106. name: invalid_space_begin_close
  107. label: invalid tag: a space at the beginning of the close tag
  108. input: "<ref>test</ ref>"
  109. output: [Text(text="<ref>test</ ref>")]
  110. ---
  111. name: valid_space_end
  112. label: valid tag: spaces at the ends of both the open and close tags
  113. input: "<ref >test</ref >"
  114. output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=" "), Text(text="test"), TagOpenClose(), Text(text="ref "), TagCloseClose()]
  115. ---
  116. name: invalid_template_ends
  117. label: invalid tag: a template at the ends of both the open and close tags
  118. input: "<ref {{foo}}>test</ref {{foo}}>"
  119. output: [Text(text="<ref "), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">test</ref "), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">")]
  120. ---
  121. name: invalid_template_ends_nospace
  122. label: invalid tag: a template at the ends of both the open and close tags, without spacing
  123. input: "<ref {{foo}}>test</ref{{foo}}>"
  124. output: [Text(text="<ref "), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">test</ref"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">")]
  125. ---
  126. name: valid_template_end_open
  127. label: valid tag: a template at the end of the open tag
  128. input: "<ref {{foo}}>test</ref>"
  129. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), TemplateOpen(), Text(text="foo"), TemplateClose(), TagCloseOpen(padding=""), Text(text="test"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
  130. ---
  131. name: valid_template_end_open_space_end_close
  132. label: valid tag: a template at the end of the open tag; whitespace at the end of the close tag
  133. input: "<ref {{foo}}>test</ref\n>"
  134. output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), TemplateOpen(), Text(text="foo"), TemplateClose(), TagCloseOpen(padding=""), Text(text="test"), TagOpenClose(), Text(text="ref\n"), TagCloseClose()]
  135. ---
  136. name: invalid_template_end_open_nospace
  137. label: invalid tag: a template at the end of the open tag, without spacing
  138. input: "<ref{{foo}}>test</ref>"
  139. output: [Text(text="<ref"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">test</ref>")]
  140. ---
  141. name: invalid_template_start_close
  142. label: invalid tag: a template at the beginning of the close tag
  143. input: "<ref>test</{{foo}}ref>"
  144. output: [Text(text="<ref>test</"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="ref>")]
  145. ---
  146. name: invalid_template_start_open
  147. label: invalid tag: a template at the beginning of the open tag
  148. input: "<{{foo}}ref>test</ref>"
  149. output: [Text(text="<"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="ref>test</ref>")]
  150. ---
  151. name: unclosed_quote
  152. label: a quoted attribute that is never closed
  153. input: "<span style=\"foobar>stuff</span>"
  154. output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foobar"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
  155. ---
  156. name: fake_quote
  157. label: a fake quoted attribute
  158. input: "<span style=\"foo\"bar>stuff</span>"
  159. output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foo\"bar"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
  160. ---
  161. name: fake_quote_complex
  162. label: a fake quoted attribute, with spaces and templates and links
  163. input: "<span style=\"foo {{bar}}\n[[baz]]\"buzz >stuff</span>"
  164. output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foo"), TagAttrStart(pad_first=" ", pad_before_eq="\n", pad_after_eq=""), TemplateOpen(), Text(text="bar"), TemplateClose(), TagAttrStart(pad_first="", pad_before_eq=" ", pad_after_eq=""), WikilinkOpen(), Text(text="baz"), WikilinkClose(), Text(text="\"buzz"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
  165. ---
  166. name: quotes_in_quotes
  167. label: singly-quoted text inside a doubly-quoted attribute
  168. input: "<span foo=\"bar 'baz buzz' biz\">stuff</span>"
  169. output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="foo"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="bar 'baz buzz' biz"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
  170. ---
  171. name: quotes_in_quotes_2
  172. label: doubly-quoted text inside a singly-quoted attribute
  173. input: "<span foo='bar \"baz buzz\" biz'>stuff</span>"
  174. output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="foo"), TagAttrEquals(), TagAttrQuote(char="'"), Text(text="bar \"baz buzz\" biz"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
  175. ---
  176. name: quotes_in_quotes_3
  177. label: doubly-quoted text inside a singly-quoted attribute, with backslashes
  178. input: "<span foo='bar \"baz buzz\\\" biz'>stuff</span>"
  179. output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="foo"), TagAttrEquals(), TagAttrQuote(char="'"), Text(text="bar \"baz buzz\\\" biz"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
  180. ---
  181. name: incomplete_lbracket
  182. label: incomplete tags: just a left bracket
  183. input: "<"
  184. output: [Text(text="<")]
  185. ---
  186. name: incomplete_lbracket_junk
  187. label: incomplete tags: just a left bracket, surrounded by stuff
  188. input: "foo<bar"
  189. output: [Text(text="foo<bar")]
  190. ---
  191. name: incomplete_unclosed_open
  192. label: incomplete tags: an unclosed open tag
  193. input: "junk <ref"
  194. output: [Text(text="junk <ref")]
  195. ---
  196. name: incomplete_unclosed_open_space
  197. label: incomplete tags: an unclosed open tag, space
  198. input: "junk <ref "
  199. output: [Text(text="junk <ref ")]
  200. ---
  201. name: incomplete_unclosed_open_unnamed_attr
  202. label: incomplete tags: an unclosed open tag, unnamed attribute
  203. input: "junk <ref name"
  204. output: [Text(text="junk <ref name")]
  205. ---
  206. name: incomplete_unclosed_open_attr_equals
  207. label: incomplete tags: an unclosed open tag, attribute, equal sign
  208. input: "junk <ref name="
  209. output: [Text(text="junk <ref name=")]
  210. ---
  211. name: incomplete_unclosed_open_attr_equals_quoted
  212. label: incomplete tags: an unclosed open tag, attribute, equal sign, quote
  213. input: "junk <ref name=\""
  214. output: [Text(text="junk <ref name=\"")]
  215. ---
  216. name: incomplete_unclosed_open_attr
  217. label: incomplete tags: an unclosed open tag, attribute with a key/value
  218. input: "junk <ref name=foo"
  219. output: [Text(text="junk <ref name=foo")]
  220. ---
  221. name: incomplete_unclosed_open_attr_quoted
  222. label: incomplete tags: an unclosed open tag, attribute with a key/value, quoted
  223. input: "junk <ref name=\"foo\""
  224. output: [Text(text="junk <ref name=\"foo\"")]
  225. ---
  226. name: incomplete_open
  227. label: incomplete tags: an open tag
  228. input: "junk <ref>"
  229. output: [Text(text="junk <ref>")]
  230. ---
  231. name: incomplete_open_unnamed_attr
  232. label: incomplete tags: an open tag, unnamed attribute
  233. input: "junk <ref name>"
  234. output: [Text(text="junk <ref name>")]
  235. ---
  236. name: incomplete_open_attr_equals
  237. label: incomplete tags: an open tag, attribute, equal sign
  238. input: "junk <ref name=>"
  239. output: [Text(text="junk <ref name=>")]
  240. ---
  241. name: incomplete_open_attr
  242. label: incomplete tags: an open tag, attribute with a key/value
  243. input: "junk <ref name=foo>"
  244. output: [Text(text="junk <ref name=foo>")]
  245. ---
  246. name: incomplete_open_attr_quoted
  247. label: incomplete tags: an open tag, attribute with a key/value, quoted
  248. input: "junk <ref name=\"foo\">"
  249. output: [Text(text="junk <ref name=\"foo\">")]
  250. ---
  251. name: incomplete_open_text
  252. label: incomplete tags: an open tag, text
  253. input: "junk <ref>foo"
  254. output: [Text(text="junk <ref>foo")]
  255. ---
  256. name: incomplete_open_attr_text
  257. label: incomplete tags: an open tag, attribute with a key/value, text
  258. input: "junk <ref name=foo>bar"
  259. output: [Text(text="junk <ref name=foo>bar")]
  260. ---
  261. name: incomplete_open_text_lbracket
  262. label: incomplete tags: an open tag, text, left open bracket
  263. input: "junk <ref>bar<"
  264. output: [Text(text="junk <ref>bar<")]
  265. ---
  266. name: incomplete_open_text_lbracket_slash
  267. label: incomplete tags: an open tag, text, left bracket, slash
  268. input: "junk <ref>bar</"
  269. output: [Text(text="junk <ref>bar</")]
  270. ---
  271. name: incomplete_open_text_unclosed_close
  272. label: incomplete tags: an open tag, text, unclosed close
  273. input: "junk <ref>bar</ref"
  274. output: [Text(text="junk <ref>bar</ref")]
  275. ---
  276. name: incomplete_open_text_wrong_close
  277. label: incomplete tags: an open tag, text, wrong close
  278. input: "junk <ref>bar</span>"
  279. output: [Text(text="junk <ref>bar</span>")]
  280. ---
  281. name: incomplete_unclosed_close
  282. label: incomplete tags: an unclosed close tag
  283. input: "junk </"
  284. output: [Text(text="junk </")]
  285. ---
  286. name: incomplete_unclosed_close_text
  287. label: incomplete tags: an unclosed close tag, with text
  288. input: "junk </br"
  289. output: [Text(text="junk </br")]
  290. ---
  291. name: incomplete_close
  292. label: incomplete tags: a close tag
  293. input: "junk </ref>"
  294. output: [Text(text="junk </ref>")]
  295. ---
  296. name: incomplete_no_tag_name_open
  297. label: incomplete tags: no tag name within brackets; just an open
  298. input: "junk <>"
  299. output: [Text(text="junk <>")]
  300. ---
  301. name: incomplete_no_tag_name_selfclosing
  302. label: incomplete tags: no tag name within brackets; self-closing
  303. input: "junk < />"
  304. output: [Text(text="junk < />")]
  305. ---
  306. name: incomplete_no_tag_name_open_close
  307. label: incomplete tags: no tag name within brackets; open and close
  308. input: "junk <></>"
  309. output: [Text(text="junk <></>")]
  310. ---
  311. name: backslash_premature_before
  312. label: a backslash before a quote before a space
  313. input: "<foo attribute=\"this is\\\" quoted\">blah</foo>"
  314. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="this is\\\" quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  315. ---
  316. name: backslash_premature_after
  317. label: a backslash before a quote after a space
  318. input: "<foo attribute=\"this is \\\"quoted\">blah</foo>"
  319. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="this is \\\"quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  320. ---
  321. name: backslash_premature_middle
  322. label: a backslash before a quote in the middle of a word
  323. input: "<foo attribute=\"this i\\\"s quoted\">blah</foo>"
  324. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="this i\\\"s quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  325. ---
  326. name: backslash_adjacent
  327. label: escaped quotes next to unescaped quotes
  328. input: "<foo attribute=\"\\\"this is quoted\\\"\">blah</foo>"
  329. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="\\\"this is quoted\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  330. ---
  331. name: backslash_endquote
  332. label: backslashes before the end quote, causing the attribute to become unquoted
  333. input: "<foo attribute=\"this_is quoted\\\">blah</foo>"
  334. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), Text(text="\"this_is"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  335. ---
  336. name: backslash_double
  337. label: two adjacent backslashes, which do *not* affect the quote
  338. input: "<foo attribute=\"this is\\\\\" quoted\">blah</foo>"
  339. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="this is\\\\"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  340. ---
  341. name: backslash_triple
  342. label: three adjacent backslashes, which do *not* affect the quote
  343. input: "<foo attribute=\"this is\\\\\\\" quoted\">blah</foo>"
  344. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="this is\\\\\\"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  345. ---
  346. name: backslash_unaffecting
  347. label: backslashes near quotes, but not immediately adjacent, thus having no effect
  348. input: "<foo attribute=\"\\quote\\d\" also=\"quote\\d\\\">blah</foo>"
  349. output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="\\quote\\d"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="also"), TagAttrEquals(), Text(text="\"quote\\d\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]
  350. ---
  351. name: unparsable
  352. label: a tag that should not be put through the normal parser
  353. input: "{{t1}}<nowiki>{{t2}}</nowiki>{{t3}}"
  354. output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="{{t2}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]
  355. ---
  356. name: unparsable_complex
  357. label: a tag that should not be put through the normal parser; lots of stuff inside
  358. input: "{{t1}}<pre>{{t2}}\n==Heading==\nThis is some text with a [[page|link]].</pre>{{t3}}"
  359. output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="pre"), TagCloseOpen(padding=""), Text(text="{{t2}}\n==Heading==\nThis is some text with a [[page|link]]."), TagOpenClose(), Text(text="pre"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]
  360. ---
  361. name: unparsable_attributed
  362. label: a tag that should not be put through the normal parser; parsed attributes
  363. input: "{{t1}}<nowiki attr=val attr2=\"{{val2}}\">{{t2}}</nowiki>{{t3}}"
  364. output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attr"), TagAttrEquals(), Text(text="val"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attr2"), TagAttrEquals(), TagAttrQuote(char="\""), TemplateOpen(), Text(text="val2"), TemplateClose(), TagCloseOpen(padding=""), Text(text="{{t2}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]
  365. ---
  366. name: unparsable_incomplete
  367. label: a tag that should not be put through the normal parser; incomplete
  368. input: "{{t1}}<nowiki>{{t2}}{{t3}}"
  369. output: [TemplateOpen(), Text(text="t1"), TemplateClose(), Text(text="<nowiki>"), TemplateOpen(), Text(text="t2"), TemplateClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]
  370. ---
  371. name: unparsable_entity
  372. label: an HTML entity inside unparsable text is still parsed
  373. input: "{{t1}}<nowiki>{{t2}}&nbsp;{{t3}}</nowiki>{{t4}}"
  374. output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="{{t2}}"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="{{t3}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t4"), TemplateClose()]
  375. ---
  376. name: unparsable_entity_incomplete
  377. label: an incomplete HTML entity inside unparsable text
  378. input: "<nowiki>&</nowiki>"
  379. output: [TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="&"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()]
  380. ---
  381. name: unparsable_entity_incomplete_2
  382. label: an incomplete HTML entity inside unparsable text, with the tag left unclosed
  383. input: "<nowiki>&"
  384. output: [Text(text="<nowiki>&")]
  385. ---
  386. name: single_open_close
  387. label: a tag that supports being single; both an open and a close tag
  388. input: "foo<li>bar{{baz}}</li>"
  389. output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseOpen(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose(), TagOpenClose(), Text(text="li"), TagCloseClose()]
  390. ---
  391. name: single_open
  392. label: a tag that supports being single; just an open tag
  393. input: "foo<li>bar{{baz}}"
  394. output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
  395. ---
  396. name: single_selfclose
  397. label: a tag that supports being single; a self-closing tag
  398. input: "foo<li/>bar{{baz}}"
  399. output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
  400. ---
  401. name: single_close
  402. label: a tag that supports being single; just a close tag
  403. input: "foo</li>bar{{baz}}"
  404. output: [Text(text="foo</li>bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
  405. ---
  406. name: single_only_open_close
  407. label: a tag that can only be single; both an open and a close tag
  408. input: "foo<br>bar{{baz}}</br>"
  409. output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose(), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding="", implicit=True)]
  410. ---
  411. name: single_only_open
  412. label: a tag that can only be single; just an open tag
  413. input: "foo<br>bar{{baz}}"
  414. output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
  415. ---
  416. name: single_only_selfclose
  417. label: a tag that can only be single; a self-closing tag
  418. input: "foo<br/>bar{{baz}}"
  419. output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
  420. ---
  421. name: single_only_close
  422. label: a tag that can only be single; just a close tag
  423. input: "foo</br>bar{{baz}}"
  424. output: [Text(text="foo"), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
  425. ---
  426. name: single_only_double
  427. label: a tag that can only be single; a tag with slashes at the beginning and end
  428. input: "foo</br/>bar{{baz}}"
  429. output: [Text(text="foo"), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
  430. ---
  431. name: single_only_close_attribute
  432. label: a tag that can only be single; presented as a close tag with an attribute
  433. input: "</br id=\"break\">"
  434. output: [TagOpenOpen(invalid=True), Text(text="br"), TagAttrStart(pad_first=" ", pad_after_eq="", pad_before_eq=""), Text(text="id"), TagAttrEquals(), TagAttrQuote(char="\""), Text(text="break"), TagCloseSelfclose(padding="", implicit=True)]
  435. ---
  436. name: capitalization
  437. label: caps should be ignored within tag names
  438. input: "<NoWiKi>{{test}}</nOwIkI>"
  439. output: [TagOpenOpen(), Text(text="NoWiKi"), TagCloseOpen(padding=""), Text(text="{{test}}"), TagOpenClose(), Text(text="nOwIkI"), TagCloseClose()]
  440. ---
  441. name: unparsable_incomplete_close
  442. label: an unparsable tag with an incomplete close afterwards
  443. input: "<nowiki>foo</nowiki"
  444. output: [Text(text="<nowiki>foo</nowiki")]
  445. ---
  446. name: unparsable_with_intermediates
  447. label: an unparsable tag with intermediate tags inside of it
  448. input: "<nowiki><ref></ref></nowiki>"
  449. output: [TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="<ref></ref>"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()]
  450. ---
  451. name: unparsable_with_intermediates_normalize
  452. label: an unparsable tag with intermediate tags inside of it, requiring normalization
  453. input: "<nowiki><ref></ref></nowIKI >"
  454. output: [TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="<ref></ref>"), TagOpenClose(), Text(text="nowIKI "), TagCloseClose()]
  455. ---
  456. name: non_ascii_open
  457. label: an open tag containing non-ASCII characters
  458. input: "<éxamplé>"
  459. output: [Text(text="<éxamplé>")]
  460. ---
  461. name: non_ascii_full
  462. label: an open/close tag pair containing non-ASCII characters
  463. input: "<éxamplé></éxamplé>"
  464. output: [TagOpenOpen(), Text(text="éxamplé"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="éxamplé"), TagCloseClose()]
  465. ---
  466. name: single_nested_selfclosing
  467. label: a single (unpaired) tag with a self-closing tag in the middle (see issue #147)
  468. input: "<li a <br/> c>foobar"
  469. output: [TagOpenOpen(), Text(text="li"), TagAttrStart(pad_first=" ", pad_after_eq="", pad_before_eq=" "), Text(text="a"), TagAttrStart(pad_first="", pad_after_eq="", pad_before_eq=" "), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding=""), TagAttrStart(pad_first="", pad_after_eq="", pad_before_eq=""), Text(text="c"), TagCloseSelfclose(padding="", implicit=True), Text(text="foobar")]