A Python parser for MediaWiki wikicode https://mwparserfromhell.readthedocs.io/

name: empty
label: sanity check that parsing an empty string yields nothing
input: ""
output: []
---
name: template_argument_mix
label: an ambiguous mix of templates and arguments
input: "{{{{{{{{foo}}}}}}}}{{{{{{{bar}}baz}}}buz}}"
output: [TemplateOpen(), ArgumentOpen(), ArgumentOpen(), Text(text="foo"), ArgumentClose(), ArgumentClose(), TemplateClose(), TemplateOpen(), ArgumentOpen(), TemplateOpen(), Text(text="bar"), TemplateClose(), Text(text="baz"), ArgumentClose(), Text(text="buz"), TemplateClose()]
---
name: link_in_template_name
label: a wikilink inside a template name, which breaks the template
input: "{{foo[[bar]]}}"
output: [Text(text="{{foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="}}")]
---
name: rich_heading
label: a heading with templates/wikilinks in it
input: "== Head{{ing}} [[with]] {{{funky|{{stuf}}}}} =="
output: [HeadingStart(level=2), Text(text=" Head"), TemplateOpen(), Text(text="ing"), TemplateClose(), Text(text=" "), WikilinkOpen(), Text(text="with"), WikilinkClose(), Text(text=" "), ArgumentOpen(), Text(text="funky"), ArgumentSeparator(), TemplateOpen(), Text(text="stuf"), TemplateClose(), ArgumentClose(), Text(text=" "), HeadingEnd()]
---
name: html_entity_with_template
label: an HTML entity with a template embedded inside
input: "&n{{bs}}p;"
output: [Text(text="&n"), TemplateOpen(), Text(text="bs"), TemplateClose(), Text(text="p;")]
---
name: html_entity_with_comment
label: an HTML entity with a comment embedded inside
input: "&n<!--foo-->bsp;"
output: [Text(text="&n"), CommentStart(), Text(text="foo"), CommentEnd(), Text(text="bsp;")]
---
name: rich_tags
label: an HTML tag with tons of other things in it
input: "{{dubious claim}}<ref name={{abc}} foo="bar {{baz}}" abc={{de}}f ghi=j{{k}}{{l}} \n mno = "{{p}} [[q]] {{r}}">[[Source]]</ref>"
output: [TemplateOpen(), Text(text="dubious claim"), TemplateClose(), TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TemplateOpen(), Text(text="abc"), TemplateClose(), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="foo"), TagAttrEquals(), TagAttrQuote(), Text(text="bar "), TemplateOpen(), Text(text="baz"), TemplateClose(), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="abc"), TagAttrEquals(), TemplateOpen(), Text(text="de"), TemplateClose(), Text(text="f"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="ghi"), TagAttrEquals(), Text(text="j"), TemplateOpen(), Text(text="k"), TemplateClose(), TemplateOpen(), Text(text="l"), TemplateClose(), TagAttrStart(pad_first=" \n ", pad_before_eq=" ", pad_after_eq=" "), Text(text="mno"), TagAttrEquals(), TagAttrQuote(), TemplateOpen(), Text(text="p"), TemplateClose(), Text(text=" "), WikilinkOpen(), Text(text="q"), WikilinkClose(), Text(text=" "), TemplateOpen(), Text(text="r"), TemplateClose(), TagCloseOpen(padding=""), WikilinkOpen(), Text(text="Source"), WikilinkClose(), TagOpenClose(), Text(text="ref"), TagCloseClose()]
---
name: wildcard
label: a wildcard assortment of various things
input: "{{{{{{{{foo}}bar|baz=biz}}buzz}}usr|{{bin}}}}"
output: [TemplateOpen(), TemplateOpen(), TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateParamSeparator(), Text(text="baz"), TemplateParamEquals(), Text(text="biz"), TemplateClose(), Text(text="buzz"), TemplateClose(), Text(text="usr"), TemplateParamSeparator(), TemplateOpen(), Text(text="bin"), TemplateClose(), TemplateClose()]
---
name: wildcard_redux
label: an even wilder assortment of various things
input: "{{a|b|{{c|[[d]]{{{e}}}}}}}[[f|{{{g}}}<!--h-->]]{{i|j=&nbsp;}}"
output: [TemplateOpen(), Text(text="a"), TemplateParamSeparator(), Text(text="b"), TemplateParamSeparator(), TemplateOpen(), Text(text="c"), TemplateParamSeparator(), WikilinkOpen(), Text(text="d"), WikilinkClose(), ArgumentOpen(), Text(text="e"), ArgumentClose(), TemplateClose(), TemplateClose(), WikilinkOpen(), Text(text="f"), WikilinkSeparator(), ArgumentOpen(), Text(text="g"), ArgumentClose(), CommentStart(), Text(text="h"), CommentEnd(), WikilinkClose(), TemplateOpen(), Text(text="i"), TemplateParamSeparator(), Text(text="j"), TemplateParamEquals(), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), TemplateClose()]
---
name: link_inside_dl
label: an external link inside a def list, such that the external link is parsed
input: ";;;mailto:example"
output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), ExternalLinkOpen(brackets=False), Text(text="mailto:example"), ExternalLinkClose()]
---
name: link_inside_dl_2
label: an external link inside a def list, such that the external link is not parsed
input: ";;;malito:example"
output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="malito"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="example")]
---
name: link_inside_template
label: an external link nested inside a template, before the end
input: "{{URL|http://example.com}}"
output: [TemplateOpen(), Text(text="URL"), TemplateParamSeparator(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), TemplateClose()]
---
name: link_inside_template_2
label: an external link nested inside a template, before a separator
input: "{{URL|http://example.com|foobar}}"
output: [TemplateOpen(), Text(text="URL"), TemplateParamSeparator(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), TemplateParamSeparator(), Text(text="foobar"), TemplateClose()]
---
name: link_inside_template_3
label: an external link nested inside a template, before an equal sign
input: "{{URL|http://example.com=foobar}}"
output: [TemplateOpen(), Text(text="URL"), TemplateParamSeparator(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), TemplateParamEquals(), Text(text="foobar"), TemplateClose()]
---
name: link_inside_argument
label: an external link nested inside an argument
input: "{{{URL|http://example.com}}}"
output: [ArgumentOpen(), Text(text="URL"), ArgumentSeparator(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), ArgumentClose()]
---
name: link_inside_heading
label: an external link nested inside a heading
input: "==http://example.com=="
output: [HeadingStart(level=2), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), HeadingEnd()]
---
name: link_inside_tag_body
label: an external link nested inside the body of a tag
input: "<ref>http://example.com</ref>"
output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), TagOpenClose(), Text(text="ref"), TagCloseClose()]
---
name: link_inside_tag_style
label: an external link nested inside style tags
input: "''http://example.com''"
output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), TagOpenClose(), Text(text="i"), TagCloseClose()]
---
name: style_tag_inside_link
label: style tags disrupting an external link
input: "http://example.com/foo''bar''"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/foo"), ExternalLinkClose(), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar"), TagOpenClose(), Text(text="i"), TagCloseClose()]
---
name: comment_inside_link
label: an HTML comment inside an external link
input: "http://example.com/foo<!--comment-->bar"
output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/foo"), CommentStart(), Text(text="comment"), CommentEnd(), Text(text="bar"), ExternalLinkClose()]
---
name: bracketed_link_inside_template
label: a bracketed external link nested inside a template, before the end
input: "{{URL|[http://example.com}}]"
output: [Text(text="{{URL|"), ExternalLinkOpen(brackets=True), Text(text="http://example.com}}"), ExternalLinkClose()]
---
name: comment_inside_bracketed_link
label: an HTML comment inside a bracketed external link
input: "[http://example.com/foo<!--comment-->bar]"
output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/foo"), CommentStart(), Text(text="comment"), CommentEnd(), Text(text="bar"), ExternalLinkClose()]