A Python parser for MediaWiki wikicode https://mwparserfromhell.readthedocs.io/
Usage
=====

Normal usage is rather straightforward (where ``text`` is page text)::

    >>> import mwparserfromhell
    >>> wikicode = mwparserfromhell.parse(text)

``wikicode`` is a :class:`mwparserfromhell.Wikicode <.Wikicode>` object, which
acts like an ordinary ``str`` object (or ``unicode`` in Python 2) with some
extra methods. For example::

    >>> text = "I has a template! {{foo|bar|baz|eggs=spam}} See it?"
    >>> wikicode = mwparserfromhell.parse(text)
    >>> print(wikicode)
    I has a template! {{foo|bar|baz|eggs=spam}} See it?
    >>> templates = wikicode.filter_templates()
    >>> print(templates)
    ['{{foo|bar|baz|eggs=spam}}']
    >>> template = templates[0]
    >>> print(template.name)
    foo
    >>> print(template.params)
    ['bar', 'baz', 'eggs=spam']
    >>> print(template.get(1).value)
    bar
    >>> print(template.get("eggs").value)
    spam

Since nodes can contain other nodes, getting nested templates is trivial::

    >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
    >>> mwparserfromhell.parse(text).filter_templates()
    ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']

You can also pass *recursive=False* to :meth:`.filter_templates` and explore
templates manually. This is possible because nodes can contain additional
:class:`.Wikicode` objects::

    >>> code = mwparserfromhell.parse("{{foo|this {{includes a|template}}}}")
    >>> print(code.filter_templates(recursive=False))
    ['{{foo|this {{includes a|template}}}}']
    >>> foo = code.filter_templates(recursive=False)[0]
    >>> print(foo.get(1).value)
    this {{includes a|template}}
    >>> print(foo.get(1).value.filter_templates()[0])
    {{includes a|template}}
    >>> print(foo.get(1).value.filter_templates()[0].get(1).value)
    template

Templates can be easily modified to add, remove, or alter params.
:class:`.Wikicode` objects can be treated like lists, with
:meth:`~.Wikicode.append`, :meth:`~.Wikicode.insert`,
:meth:`~.Wikicode.remove`, :meth:`~.Wikicode.replace`, and more. They also have
a :meth:`~.Wikicode.matches` method for comparing page or template names, which
takes care of capitalization and whitespace::

    >>> text = "{{cleanup}} '''Foo''' is a [[bar]]. {{uncategorized}}"
    >>> code = mwparserfromhell.parse(text)
    >>> for template in code.filter_templates():
    ...     if template.name.matches("Cleanup") and not template.has("date"):
    ...         template.add("date", "July 2012")
    ...
    >>> print(code)
    {{cleanup|date=July 2012}} '''Foo''' is a [[bar]]. {{uncategorized}}
    >>> code.replace("{{uncategorized}}", "{{bar-stub}}")
    >>> print(code)
    {{cleanup|date=July 2012}} '''Foo''' is a [[bar]]. {{bar-stub}}
    >>> print(code.filter_templates())
    ['{{cleanup|date=July 2012}}', '{{bar-stub}}']

You can then convert ``code`` back into a regular :class:`str` object (for
saving the page!) by calling :func:`str` on it::

    >>> text = str(code)
    >>> print(text)
    {{cleanup|date=July 2012}} '''Foo''' is a [[bar]]. {{bar-stub}}
    >>> text == code
    True

(Likewise, use :func:`unicode(code) <unicode>` in Python 2.)

For more tips, check out :class:`Wikicode's full method list <.Wikicode>` and
the :mod:`list of Nodes <.nodes>`.