Merge branch 'develop'

11 years ago · d53ca7837a
--- a/+ 26
+++ b/+ 26
@@ -1,4 +1,24 @@
 v0.1.1 (19da4d2144) to v0.2:
 v0.3 (released August 24, 2013):

 - Added complete support for HTML Tags, including forms like <ref>foo</ref>,
  <ref name="bar"/>, and wiki-markup tags like bold ('''), italics (''), and
  lists (*, #, ; and :).
 - Added support for ExternalLinks (http://example.com/ and
  [http://example.com/ Example]).
 - Wikicode's filter methods are now passed 'recursive=True' by default instead
  of False. This is a breaking change if you rely on any filter() methods being
  non-recursive by default.
 - Added a matches() method to Wikicode for page/template name comparisons.
 - The 'obj' param of Wikicode.insert_before(), insert_after(), replace(), and
  remove() now accepts other Wikicode objects and strings representing parts of
  wikitext, instead of just nodes. These methods also make all possible
  substitutions instead of just one.
 - Renamed Template.has_param() to has() for consistency with Template's other
  methods; has_param() is now an alias.
 - The C tokenizer extension now works on Python 3 in addition to Python 2.7.
 - Various bugfixes, internal changes, and cleanup.

 v0.2 (released June 20, 2013):

 - The parser now fully supports Python 3 in addition to Python 2.7.
 - Added a C tokenizer extension that is significantly faster than its Python
@@ -24,10 +44,14 @@ v0.1.1 (19da4d2144) to v0.2:
 - Fixed some broken example code in the README; other copyedits.
 - Other bugfixes and code cleanup.

 v0.1 (ba94938fe8) to v0.1.1 (19da4d2144):
 v0.1.1 (released September 21, 2012):

 - Added support for Comments (<!-- foo -->) and Wikilinks ([[foo]]).
 - Added corresponding ifilter_links() and filter_links() methods to Wikicode.
 - Fixed a bug when parsing incomplete templates.
 - Fixed strip_code() to affect the contents of headings.
 - Various copyedits in documentation and comments.

 v0.1 (released August 23, 2012):

 - Initial release.
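The v0.3 entry above flags the new ``recursive=True`` default as a breaking change. A minimal sketch of the difference follows, using a toy ``Template`` class and a stand-alone ``filter_templates`` function. Both are hypothetical stand-ins written for illustration and are not the library's own node classes or methods.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Toy stand-in for a parsed template (hypothetical, not the real node)."""
    name: str
    children: list = field(default_factory=list)


def filter_templates(nodes, recursive=True):
    """Collect templates in document order, descending only if recursive."""
    found = []
    for node in nodes:
        found.append(node)
        if recursive:
            found.extend(filter_templates(node.children, recursive=True))
    return found


# Mirrors the nesting of {{foo|{{bar}}={{baz|{{spam}}}}}} from the README.
tree = [Template("foo", [Template("bar"), Template("baz", [Template("spam")])])]

all_names = [t.name for t in filter_templates(tree)]
top_names = [t.name for t in filter_templates(tree, recursive=False)]
```

Callers that relied on the old behavior would need to pass ``recursive=False`` explicitly to keep receiving only the top-level templates.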
--- a/README.rst
+++ b/README.rst
@@ -9,7 +9,8 @@ mwparserfromhell
 that provides an easy-to-use and outrageously powerful parser for MediaWiki_
 wikicode. It supports Python 2 and Python 3.

 Developed by Earwig_ with help from `Σ`_.
 Developed by Earwig_ with help from `Σ`_. Full documentation is available on
 ReadTheDocs_.

 Installation
 ------------
@@ -18,7 +19,7 @@ The easiest way to install the parser is through the `Python Package Index`_,
 so you can install the latest release with ``pip install mwparserfromhell``
 (`get pip`_). Alternatively, get the latest development version::

    git clone git://github.com/earwig/mwparserfromhell.git
    git clone https://github.com/earwig/mwparserfromhell.git
    cd mwparserfromhell
    python setup.py install

@@ -59,13 +60,20 @@ For example::
    >>> print template.get("eggs").value
    spam

 Since every node you reach is also a ``Wikicode`` object, it's trivial to get
 nested templates::
 Since nodes can contain other nodes, getting nested templates is trivial::

    >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
    >>> mwparserfromhell.parse(text).filter_templates()
    ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']

 You can also pass ``recursive=False`` to ``filter_templates()`` and explore
 templates manually. This is possible because nodes can contain additional
 ``Wikicode`` objects::

    >>> code = mwparserfromhell.parse("{{foo|this {{includes a|template}}}}")
    >>> print code.filter_templates()
    >>> print code.filter_templates(recursive=False)
    ['{{foo|this {{includes a|template}}}}']
    >>> foo = code.filter_templates()[0]
    >>> foo = code.filter_templates(recursive=False)[0]
    >>> print foo.get(1).value
    this {{includes a|template}}
    >>> print foo.get(1).value.filter_templates()[0]
@@ -73,21 +81,16 @@ nested templates::
    >>> print foo.get(1).value.filter_templates()[0].get(1).value
    template

 Additionally, you can include nested templates in ``filter_templates()`` by
 passing ``recursive=True``::

    >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
    >>> mwparserfromhell.parse(text).filter_templates(recursive=True)
    ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']

 Templates can be easily modified to add, remove, or alter params. ``Wikicode``
 can also be treated like a list with ``append()``, ``insert()``, ``remove()``,
 ``replace()``, and more::
 objects can be treated like lists, with ``append()``, ``insert()``,
 ``remove()``, ``replace()``, and more. They also have a ``matches()`` method
 for comparing page or template names, which takes care of capitalization and
 whitespace::

    >>> text = "{{cleanup}} '''Foo''' is a [[bar]]. {{uncategorized}}"
    >>> code = mwparserfromhell.parse(text)
    >>> for template in code.filter_templates():
    ...     if template.name == "cleanup" and not template.has_param("date"):
    ...     if template.name.matches("Cleanup") and not template.has("date"):
    ...         template.add("date", "July 2012")
    ...
    >>> print code
@@ -142,6 +145,7 @@ following code (via the API_)::
        return mwparserfromhell.parse(text)

 .. _MediaWiki:              http://mediawiki.org
 .. _ReadTheDocs:            http://mwparserfromhell.readthedocs.org
 .. _Earwig:                 http://en.wikipedia.org/wiki/User:The_Earwig
 .. _Σ:                      http://en.wikipedia.org/wiki/User:%CE%A3
 .. _Python Package Index:   http://pypi.python.org
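The README hunk above says ``matches()`` "takes care of capitalization and whitespace" when comparing page or template names. One way such a comparison could work is sketched below. The function name ``names_match`` is hypothetical, and the rule that only the first character is case-insensitive is an assumption based on how MediaWiki treats titles; the diff does not show the actual implementation.

```python
def names_match(this, other):
    """Compare two page/template names.

    Assumed rule: surrounding whitespace is ignored and only the first
    character is compared case-insensitively.
    """
    this, other = this.strip(), other.strip()
    if not this or not other:
        # Avoid indexing into an empty string; empty only matches empty.
        return this == other
    return this[0].upper() + this[1:] == other[0].upper() + other[1:]
```

Under this rule ``"cleanup"`` and ``" Cleanup "`` compare equal, while ``"cleanup"`` and ``"CleanUp"`` do not.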
--- a/docs/api/mwparserfromhell.nodes.rst
+++ b/docs/api/mwparserfromhell.nodes.rst
@@ -25,6 +25,14 @@ nodes Package
    :undoc-members:
    :show-inheritance:

 :mod:`external_link` Module
 ---------------------------

 .. automodule:: mwparserfromhell.nodes.external_link
    :members:
    :undoc-members:
    :show-inheritance:

 :mod:`heading` Module
 ---------------------

@@ -46,6 +54,7 @@ nodes Package

 .. automodule:: mwparserfromhell.nodes.tag
    :members:
    :undoc-members:
    :show-inheritance:

 :mod:`template` Module
--- a/docs/api/mwparserfromhell.rst
+++ b/docs/api/mwparserfromhell.rst
@@ -30,6 +30,12 @@ mwparserfromhell Package
    :members:
    :undoc-members:

 :mod:`definitions` Module
 -------------------------

 .. automodule:: mwparserfromhell.definitions
    :members:

 :mod:`utils` Module
 -------------------

--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -1,10 +1,38 @@
 Changelog
 =========

 v0.3
 ----

 `Released August 24, 2013 <https://github.com/earwig/mwparserfromhell/tree/v0.3>`_
 (`changes <https://github.com/earwig/mwparserfromhell/compare/v0.2...v0.3>`__):

 - Added complete support for HTML :py:class:`Tags <.Tag>`, including forms like
  ``<ref>foo</ref>``, ``<ref name="bar"/>``, and wiki-markup tags like bold
  (``'''``), italics (``''``), and lists (``*``, ``#``, ``;`` and ``:``).
 - Added support for :py:class:`.ExternalLink`\ s (``http://example.com/`` and
  ``[http://example.com/ Example]``).
 - :py:class:`Wikicode's <.Wikicode>` :py:meth:`.filter` methods are now passed
  *recursive=True* by default instead of *False*. **This is a breaking change
  if you rely on any filter() methods being non-recursive by default.**
 - Added a :py:meth:`.matches` method to :py:class:`~.Wikicode` for
  page/template name comparisons.
 - The *obj* param of :py:meth:`Wikicode.insert_before() <.insert_before>`,
  :py:meth:`~.insert_after`, :py:meth:`~.Wikicode.replace`, and
  :py:meth:`~.Wikicode.remove` now accepts :py:class:`~.Wikicode` objects and
  strings representing parts of wikitext, instead of just nodes. These methods
  also make all possible substitutions instead of just one.
 - Renamed :py:meth:`Template.has_param() <.has_param>` to
  :py:meth:`~.Template.has` for consistency with :py:class:`~.Template`\ 's
  other methods; :py:meth:`~.has_param` is now an alias.
 - The C tokenizer extension now works on Python 3 in addition to Python 2.7.
 - Various bugfixes, internal changes, and cleanup.

 v0.2
 ----

 19da4d2144_ to master_ (released June 20, 2013)
 `Released June 20, 2013 <https://github.com/earwig/mwparserfromhell/tree/v0.2>`_
 (`changes <https://github.com/earwig/mwparserfromhell/compare/v0.1.1...v0.2>`__):

 - The parser now fully supports Python 3 in addition to Python 2.7.
 - Added a C tokenizer extension that is significantly faster than its Python
@@ -38,7 +66,8 @@ v0.2
 v0.1.1
 ------

 ba94938fe8_ to 19da4d2144_ (released September 21, 2012)
 `Released September 21, 2012 <https://github.com/earwig/mwparserfromhell/tree/v0.1.1>`_
 (`changes <https://github.com/earwig/mwparserfromhell/compare/v0.1...v0.1.1>`__):

 - Added support for :py:class:`Comments <.Comment>` (``<!-- foo -->``) and
  :py:class:`Wikilinks <.Wikilink>` (``[[foo]]``).
@@ -51,8 +80,6 @@ ba94938fe8_ to 19da4d2144_ (released September 21, 2012)
 v0.1
 ----

 ba94938fe8_ (released August 23, 2012)
 `Released August 23, 2012 <https://github.com/earwig/mwparserfromhell/tree/v0.1>`_:

 .. _master:     https://github.com/earwig/mwparserfromhell/tree/v0.2
 .. _19da4d2144: https://github.com/earwig/mwparserfromhell/tree/v0.1.1
 .. _ba94938fe8: https://github.com/earwig/mwparserfromhell/tree/v0.1
 - Initial release.
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,15 +1,18 @@
 MWParserFromHell v0.2 Documentation
 ===================================
 MWParserFromHell v\ |version| Documentation
 ===========================================

 :py:mod:`mwparserfromhell` (the *MediaWiki Parser from Hell*) is a Python
 package that provides an easy-to-use and outrageously powerful parser for
 MediaWiki_ wikicode. It supports Python 2 and Python 3.

 Developed by Earwig_ with help from `Σ`_.
 Developed by Earwig_ with contributions from `Σ`_, Legoktm_, and others.
 Development occurs on GitHub_.

 .. _MediaWiki:            http://mediawiki.org
 .. _Earwig:               http://en.wikipedia.org/wiki/User:The_Earwig
 .. _Σ:                    http://en.wikipedia.org/wiki/User:%CE%A3
 .. _Legoktm:              http://en.wikipedia.org/wiki/User:Legoktm
 .. _GitHub:               https://github.com/earwig/mwparserfromhell

 Installation
 ------------
@@ -18,7 +21,7 @@ The easiest way to install the parser is through the `Python Package Index`_,
 so you can install the latest release with ``pip install mwparserfromhell``
 (`get pip`_). Alternatively, get the latest development version::

    git clone git://github.com/earwig/mwparserfromhell.git
    git clone https://github.com/earwig/mwparserfromhell.git
    cd mwparserfromhell
    python setup.py install

--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -27,13 +27,20 @@ some extra methods. For example::
    >>> print template.get("eggs").value
    spam

 Since every node you reach is also a :py:class:`~.Wikicode` object, it's
 trivial to get nested templates::
 Since nodes can contain other nodes, getting nested templates is trivial::

    >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
    >>> mwparserfromhell.parse(text).filter_templates()
    ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']

 You can also pass *recursive=False* to :py:meth:`~.filter_templates` and
 explore templates manually. This is possible because nodes can contain
 additional :py:class:`~.Wikicode` objects::

    >>> code = mwparserfromhell.parse("{{foo|this {{includes a|template}}}}")
    >>> print code.filter_templates()
    >>> print code.filter_templates(recursive=False)
    ['{{foo|this {{includes a|template}}}}']
    >>> foo = code.filter_templates()[0]
    >>> foo = code.filter_templates(recursive=False)[0]
    >>> print foo.get(1).value
    this {{includes a|template}}
    >>> print foo.get(1).value.filter_templates()[0]
@@ -41,22 +48,17 @@ trivial to get nested templates::
    >>> print foo.get(1).value.filter_templates()[0].get(1).value
    template

 Additionally, you can include nested templates in :py:meth:`~.filter_templates`
 by passing *recursive=True*::

    >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
    >>> mwparserfromhell.parse(text).filter_templates(recursive=True)
    ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']

 Templates can be easily modified to add, remove, or alter params.
 :py:class:`~.Wikicode` can also be treated like a list with
 :py:class:`~.Wikicode` objects can be treated like lists, with
 :py:meth:`~.Wikicode.append`, :py:meth:`~.Wikicode.insert`,
 :py:meth:`~.Wikicode.remove`, :py:meth:`~.Wikicode.replace`, and more::
 :py:meth:`~.Wikicode.remove`, :py:meth:`~.Wikicode.replace`, and more. They
 also have a :py:meth:`~.Wikicode.matches` method for comparing page or template
 names, which takes care of capitalization and whitespace::

    >>> text = "{{cleanup}} '''Foo''' is a [[bar]]. {{uncategorized}}"
    >>> code = mwparserfromhell.parse(text)
    >>> for template in code.filter_templates():
    ...     if template.name == "cleanup" and not template.has_param("date"):
    ...     if template.name.matches("Cleanup") and not template.has("date"):
    ...         template.add("date", "July 2012")
    ...
    >>> print code
--- a/mwparserfromhell/__init__.py
+++ b/mwparserfromhell/__init__.py
@@ -31,9 +31,10 @@ from __future__ import unicode_literals
 __author__ = "Ben Kurtovic"
 __copyright__ = "Copyright (C) 2012, 2013 Ben Kurtovic"
 __license__ = "MIT License"
 __version__ = "0.2"
 __version__ = "0.3"
 __email__ = "ben.kurtovic@verizon.net"

 from . import compat, nodes, parser, smart_list, string_mixin, utils, wikicode
 from . import (compat, definitions, nodes, parser, smart_list, string_mixin,
               utils, wikicode)

 parse = utils.parse_anything
--- a/mwparserfromhell/compat.py
+++ b/mwparserfromhell/compat.py
@@ -15,14 +15,12 @@ py3k = sys.version_info[0] == 3
 if py3k:
    bytes = bytes
    str = str
    basestring = str
    maxsize = sys.maxsize
    import html.entities as htmlentities

 else:
    bytes = str
    str = unicode
    basestring = basestring
    maxsize = sys.maxint
    import htmlentitydefs as htmlentities

--- a/mwparserfromhell/definitions.py
+++ b/mwparserfromhell/definitions.py
@@ -0,0 +1,91 @@
 # -*- coding: utf-8  -*-
 #
 # Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
 # in the Software without restriction, including without limitation the rights
 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the Software is
 # furnished to do so, subject to the following conditions:
 #
 # The above copyright notice and this permission notice shall be included in
 # all copies or substantial portions of the Software.
 #
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.

 """Contains data about certain markup, like HTML tags and external links."""

 from __future__ import unicode_literals

 __all__ = ["get_html_tag", "is_parsable", "is_visible", "is_single",
           "is_single_only", "is_scheme"]

 URI_SCHEMES = {
    # [mediawiki/core.git]/includes/DefaultSettings.php @ 374a0ad943
    "http": True, "https": True, "ftp": True, "ftps": True, "ssh": True,
    "sftp": True, "irc": True, "ircs": True, "xmpp": False, "sip": False,
    "sips": False, "gopher": True, "telnet": True, "nntp": True,
    "worldwind": True, "mailto": False, "tel": False, "sms": False,
    "news": False, "svn": True, "git": True, "mms": True, "bitcoin": False,
    "magnet": False, "urn": False, "geo": False
 }

 PARSER_BLACKLIST = [
    # enwiki extensions @ 2013-06-28
    "categorytree", "gallery", "hiero", "imagemap", "inputbox", "math",
    "nowiki", "pre", "score", "section", "source", "syntaxhighlight",
    "templatedata", "timeline"
 ]

 INVISIBLE_TAGS = [
    # enwiki extensions @ 2013-06-28
    "categorytree", "gallery", "imagemap", "inputbox", "math", "score",
    "section", "templatedata", "timeline"
 ]

 # [mediawiki/core.git]/includes/Sanitizer.php @ 87a0aef762
 SINGLE_ONLY = ["br", "hr", "meta", "link", "img"]
 SINGLE = SINGLE_ONLY + ["li", "dt", "dd"]

 MARKUP_TO_HTML = {
    "#": "li",
    "*": "li",
    ";": "dt",
    ":": "dd"
 }

 def get_html_tag(markup):
    """Return the HTML tag associated with the given wiki-markup."""
    return MARKUP_TO_HTML[markup]

 def is_parsable(tag):
    """Return if the given *tag*'s contents should be passed to the parser."""
    return tag.lower() not in PARSER_BLACKLIST

 def is_visible(tag):
    """Return whether or not the given *tag* contains visible text."""
    return tag.lower() not in INVISIBLE_TAGS

 def is_single(tag):
    """Return whether or not the given *tag* can exist without a close tag."""
    return tag.lower() in SINGLE

 def is_single_only(tag):
    """Return whether or not the given *tag* must exist without a close tag."""
    return tag.lower() in SINGLE_ONLY

 def is_scheme(scheme, slashes=True, reverse=False):
    """Return whether *scheme* is valid for external links."""
    if reverse:  # Convenience for C
        scheme = scheme[::-1]
    scheme = scheme.lower()
    if slashes:
        return scheme in URI_SCHEMES
    return scheme in URI_SCHEMES and not URI_SCHEMES[scheme]
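The behavior of ``is_scheme()`` is easiest to see with a few calls. The sketch below copies the function from the hunk above but uses only a four-entry subset of ``URI_SCHEMES``. Reading the boolean as "this scheme requires ``//`` after the colon" is an interpretation of the code, not something the diff states.

```python
# Subset of the table in definitions.py. Assumed meaning of the value:
# True if the scheme is only valid when followed by "//".
URI_SCHEMES = {"http": True, "https": True, "mailto": False, "magnet": False}


def is_scheme(scheme, slashes=True, reverse=False):
    """Return whether *scheme* is valid for external links."""
    if reverse:  # The C tokenizer passes the scheme backwards
        scheme = scheme[::-1]
    scheme = scheme.lower()
    if slashes:
        # With "//" present, any known scheme is accepted.
        return scheme in URI_SCHEMES
    # Without "//", only schemes that do not require slashes are accepted.
    return scheme in URI_SCHEMES and not URI_SCHEMES[scheme]


http_with_slashes = is_scheme("HTTP")                     # known scheme
http_without_slashes = is_scheme("http", slashes=False)   # needs "//"
mailto_without_slashes = is_scheme("mailto", slashes=False)
reversed_http = is_scheme("ptth", reverse=True)
```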
--- a/mwparserfromhell/nodes/__init__.py
+++ b/mwparserfromhell/nodes/__init__.py
@@ -69,6 +69,7 @@ from . import extras
 from .text import Text
 from .argument import Argument
 from .comment import Comment
 from .external_link import ExternalLink
 from .heading import Heading
 from .html_entity import HTMLEntity
 from .tag import Tag
--- a/mwparserfromhell/nodes/external_link.py
+++ b/mwparserfromhell/nodes/external_link.py
@@ -0,0 +1,97 @@
 # -*- coding: utf-8  -*-
 #
 # Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
 # in the Software without restriction, including without limitation the rights
 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the Software is
 # furnished to do so, subject to the following conditions:
 #
 # The above copyright notice and this permission notice shall be included in
 # all copies or substantial portions of the Software.
 #
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.

 from __future__ import unicode_literals

 from . import Node
 from ..compat import str
 from ..utils import parse_anything

 __all__ = ["ExternalLink"]

 class ExternalLink(Node):
    """Represents an external link, like ``[http://example.com/ Example]``."""

    def __init__(self, url, title=None, brackets=True):
        super(ExternalLink, self).__init__()
        self._url = url
        self._title = title
        self._brackets = brackets

    def __unicode__(self):
        if self.brackets:
            if self.title is not None:
                return "[" + str(self.url) + " " + str(self.title) + "]"
            return "[" + str(self.url) + "]"
        return str(self.url)

    def __iternodes__(self, getter):
        yield None, self
        for child in getter(self.url):
            yield self.url, child
        if self.title is not None:
            for child in getter(self.title):
                yield self.title, child

    def __strip__(self, normalize, collapse):
        if self.brackets:
            if self.title:
                return self.title.strip_code(normalize, collapse)
            return None
        return self.url.strip_code(normalize, collapse)

    def __showtree__(self, write, get, mark):
        if self.brackets:
            write("[")
        get(self.url)
        if self.title is not None:
            get(self.title)
        if self.brackets:
            write("]")

    @property
    def url(self):
        """The URL of the link target, as a :py:class:`~.Wikicode` object."""
        return self._url

    @property
    def title(self):
        """The link title (if given), as a :py:class:`~.Wikicode` object."""
        return self._title

    @property
    def brackets(self):
        """Whether to enclose the URL in brackets or display it straight."""
        return self._brackets

    @url.setter
    def url(self, value):
        from ..parser import contexts
        self._url = parse_anything(value, contexts.EXT_LINK_URI)

    @title.setter
    def title(self, value):
        self._title = None if value is None else parse_anything(value)

    @brackets.setter
    def brackets(self, value):
        self._brackets = bool(value)
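The three output forms of ``ExternalLink.__unicode__`` and the matching ``__strip__`` results can be summarized with plain strings. This is a sketch that mirrors the branching in the class above; ``render_external_link`` and ``strip_external_link`` are hypothetical helper names, and plain ``str`` values stand in for the ``Wikicode`` objects the real node holds.

```python
def render_external_link(url, title=None, brackets=True):
    """Mirror the branching of ExternalLink.__unicode__ on plain strings."""
    if brackets:
        if title is not None:
            return "[" + url + " " + title + "]"
        return "[" + url + "]"
    return url


def strip_external_link(url, title=None, brackets=True):
    """Mirror ExternalLink.__strip__: the text left once markup is removed."""
    if brackets:
        # A bracketed link shows its title, or nothing if it has none.
        return title if title else None
    # A bare URL is itself the visible text.
    return url


titled = render_external_link("http://example.com/", "Example")
untitled = render_external_link("http://example.com/")
bare = render_external_link("http://example.com/", brackets=False)
```

Note that when ``brackets`` is false, a title is ignored in both functions, as in the class.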
--- a/mwparserfromhell/nodes/extras/attribute.py
+++ b/mwparserfromhell/nodes/extras/attribute.py
@@ -36,18 +36,34 @@ class Attribute(StringMixIn):
    whose value is ``"foo"``.
    """

    def __init__(self, name, value=None, quoted=True):
    def __init__(self, name, value=None, quoted=True, pad_first=" ",
                 pad_before_eq="", pad_after_eq=""):
        super(Attribute, self).__init__()
        self._name = name
        self._value = value
        self._quoted = quoted
        self._pad_first = pad_first
        self._pad_before_eq = pad_before_eq
        self._pad_after_eq = pad_after_eq

    def __unicode__(self):
        if self.value:
        result = self.pad_first + str(self.name) + self.pad_before_eq
        if self.value is not None:
            result += "=" + self.pad_after_eq
            if self.quoted:
                return str(self.name) + '="' + str(self.value) + '"'
            return str(self.name) + "=" + str(self.value)
        return str(self.name)
                return result + '"' + str(self.value) + '"'
            return result + str(self.value)
        return result

    def _set_padding(self, attr, value):
        """Setter for the value of a padding attribute."""
        if not value:
            setattr(self, attr, "")
        else:
            value = str(value)
            if not value.isspace():
                raise ValueError("padding must be entirely whitespace")
            setattr(self, attr, value)

    @property
    def name(self):
@@ -64,14 +80,41 @@ class Attribute(StringMixIn):
        """Whether the attribute's value is quoted with double quotes."""
        return self._quoted

    @property
    def pad_first(self):
        """Spacing to insert right before the attribute."""
        return self._pad_first

    @property
    def pad_before_eq(self):
        """Spacing to insert right before the equal sign."""
        return self._pad_before_eq

    @property
    def pad_after_eq(self):
        """Spacing to insert right after the equal sign."""
        return self._pad_after_eq

    @name.setter
    def name(self, newval):
        self._name = parse_anything(newval)
    def name(self, value):
        self._name = parse_anything(value)

    @value.setter
    def value(self, newval):
        self._value = parse_anything(newval)
        self._value = None if newval is None else parse_anything(newval)

    @quoted.setter
    def quoted(self, newval):
        self._quoted = bool(newval)
    def quoted(self, value):
        self._quoted = bool(value)

    @pad_first.setter
    def pad_first(self, value):
        self._set_padding("_pad_first", value)

    @pad_before_eq.setter
    def pad_before_eq(self, value):
        self._set_padding("_pad_before_eq", value)

    @pad_after_eq.setter
    def pad_after_eq(self, value):
        self._set_padding("_pad_after_eq", value)
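The new padding fields let an attribute round-trip spacing such as ``name = "bar"``. The sketch below reproduces the string assembly from ``Attribute.__unicode__`` and the whitespace check from ``_set_padding`` as free functions. ``check_padding`` and ``render_attribute`` are hypothetical names, and validating at render time is a simplification: in the diff, only the property setters validate.

```python
def check_padding(value):
    """Return *value* as a padding string, as _set_padding would store it."""
    if not value:
        return ""
    value = str(value)
    if not value.isspace():
        raise ValueError("padding must be entirely whitespace")
    return value


def render_attribute(name, value=None, quoted=True, pad_first=" ",
                     pad_before_eq="", pad_after_eq=""):
    """Assemble an attribute the way Attribute.__unicode__ does."""
    result = check_padding(pad_first) + name + check_padding(pad_before_eq)
    if value is not None:
        # An empty string still counts as a value, unlike the old check.
        result += "=" + check_padding(pad_after_eq)
        if quoted:
            return result + '"' + value + '"'
        return result + value
    return result


default_form = render_attribute("name", "bar")
spaced_form = render_attribute("name", "bar", pad_before_eq=" ", pad_after_eq=" ")
valueless_form = render_attribute("name")
```

Because ``pad_first`` defaults to a single space, ``Tag.__unicode__`` can join attributes with ``"".join(...)`` instead of inserting separators itself.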
--- a/mwparserfromhell/nodes/tag.py
+++ b/mwparserfromhell/nodes/tag.py
@@ -22,8 +22,10 @@

 from __future__ import unicode_literals

 from . import Node, Text
 from . import Node
 from .extras import Attribute
 from ..compat import str
 from ..definitions import is_visible
 from ..utils import parse_anything

 __all__ = ["Tag"]
@@ -31,146 +33,85 @@ __all__ = ["Tag"]
 class Tag(Node):
    """Represents an HTML-style tag in wikicode, like ``<ref>``."""

    TAG_UNKNOWN = 0

    # Basic HTML:
    TAG_ITALIC = 1
    TAG_BOLD = 2
    TAG_UNDERLINE = 3
    TAG_STRIKETHROUGH = 4
    TAG_UNORDERED_LIST = 5
    TAG_ORDERED_LIST = 6
    TAG_DEF_TERM = 7
    TAG_DEF_ITEM = 8
    TAG_BLOCKQUOTE = 9
    TAG_RULE = 10
    TAG_BREAK = 11
    TAG_ABBR = 12
    TAG_PRE = 13
    TAG_MONOSPACE = 14
    TAG_CODE = 15
    TAG_SPAN = 16
    TAG_DIV = 17
    TAG_FONT = 18
    TAG_SMALL = 19
    TAG_BIG = 20
    TAG_CENTER = 21

    # MediaWiki parser hooks:
    TAG_REF = 101
    TAG_GALLERY = 102
    TAG_MATH = 103
    TAG_NOWIKI = 104
    TAG_NOINCLUDE = 105
    TAG_INCLUDEONLY = 106
    TAG_ONLYINCLUDE = 107

    # Additional parser hooks:
    TAG_SYNTAXHIGHLIGHT = 201
    TAG_POEM = 202

    # Lists of tags:
    TAGS_INVISIBLE = set((TAG_REF, TAG_GALLERY, TAG_MATH, TAG_NOINCLUDE))
    TAGS_VISIBLE = set(range(300)) - TAGS_INVISIBLE

    def __init__(self, type_, tag, contents=None, attrs=None, showtag=True,
                 self_closing=False, open_padding=0, close_padding=0):
    def __init__(self, tag, contents=None, attrs=None, wiki_markup=None,
                 self_closing=False, invalid=False, implicit=False, padding="",
                 closing_tag=None):
        super(Tag, self).__init__()
        self._type = type_
        self._tag = tag
        self._contents = contents
        if attrs:
            self._attrs = attrs
        if contents is None and not self_closing:
            self._contents = parse_anything("")
        else:
            self._attrs = []
        self._showtag = showtag
            self._contents = contents
        self._attrs = attrs if attrs else []
        self._wiki_markup = wiki_markup
        self._self_closing = self_closing
        self._open_padding = open_padding
        self._close_padding = close_padding
        self._invalid = invalid
        self._implicit = implicit
        self._padding = padding
        if closing_tag:
            self._closing_tag = closing_tag
        else:
            self._closing_tag = tag

    def __unicode__(self):
        if not self.showtag:
            open_, close = self._translate()
        if self.wiki_markup:
            if self.self_closing:
                return open_
                return self.wiki_markup
            else:
                return open_ + str(self.contents) + close
                return self.wiki_markup + str(self.contents) + self.wiki_markup

        result = "<" + str(self.tag)
        if self.attrs:
            result += " " + " ".join([str(attr) for attr in self.attrs])
        result = ("</" if self.invalid else "<") + str(self.tag)
        if self.attributes:
            result += "".join([str(attr) for attr in self.attributes])
        if self.self_closing:
            result += " " * self.open_padding + "/>"
            result += self.padding + (">" if self.implicit else "/>")
        else:
            result += " " * self.open_padding + ">" + str(self.contents)
            result += "</" + str(self.tag) + " " * self.close_padding + ">"
            result += self.padding + ">" + str(self.contents)
            result += "</" + str(self.closing_tag) + ">"
        return result

    def __iternodes__(self, getter):
        yield None, self
        if self.showtag:
        if not self.wiki_markup:
            for child in getter(self.tag):
                yield self.tag, child
            for attr in self.attrs:
            for attr in self.attributes:
                for child in getter(attr.name):
                    yield attr.name, child
                if attr.value:
                    for child in getter(attr.value):
                        yield attr.value, child
        for child in getter(self.contents):
            yield self.contents, child
        if self.contents:
            for child in getter(self.contents):
                yield self.contents, child
        if not self.self_closing and not self.wiki_markup and self.closing_tag:
            for child in getter(self.closing_tag):
                yield self.closing_tag, child

    def __strip__(self, normalize, collapse):
        if self.type in self.TAGS_VISIBLE:
        if self.contents and is_visible(self.tag):
            return self.contents.strip_code(normalize, collapse)
        return None

    def __showtree__(self, write, get, mark):
        tagnodes = self.tag.nodes
        if (not self.attrs and len(tagnodes) == 1 and isinstance(tagnodes[0], Text)):
            write("<" + str(tagnodes[0]) + ">")
        write("</" if self.invalid else "<")
        get(self.tag)
        for attr in self.attributes:
            get(attr.name)
            if not attr.value:
                continue
            write("    = ")
            mark()
            get(attr.value)
        if self.self_closing:
            write(">" if self.implicit else "/>")
        else:
            write("<")
            get(self.tag)
            for attr in self.attrs:
                get(attr.name)
                if not attr.value:
                    continue
                write("    = ")
                mark()
                get(attr.value)
            write(">")
        get(self.contents)
        if len(tagnodes) == 1 and isinstance(tagnodes[0], Text):
            write("</" + str(tagnodes[0]) + ">")
        else:
            get(self.contents)
            write("</")
            get(self.tag)
            get(self.closing_tag)
            write(">")

    def _translate(self):
        """If the HTML-style tag has a wikicode representation, return that.

        For example, ``<b>Foo</b>`` can be represented as ``'''Foo'''``. This
        returns a tuple of the character starting the sequence and the
        character ending it.
        """
        translations = {
            self.TAG_ITALIC: ("''", "''"),
            self.TAG_BOLD: ("'''", "'''"),
            self.TAG_UNORDERED_LIST: ("*", ""),
            self.TAG_ORDERED_LIST: ("#", ""),
            self.TAG_DEF_TERM: (";", ""),
            self.TAG_DEF_ITEM: (":", ""),
            self.TAG_RULE: ("----", ""),
        }
        return translations[self.type]

    @property
    def type(self):
        """The tag type."""
        return self._type

    @property
    def tag(self):
        """The tag itself, as a :py:class:`~.Wikicode` object."""
@@ -182,7 +123,7 @@ class Tag(Node):
        return self._contents

    @property
    def attrs(self):
    def attributes(self):
        """The list of attributes affecting the tag.

        Each attribute is an instance of :py:class:`~.Attribute`.
@@ -190,52 +131,142 @@ class Tag(Node):
        return self._attrs

    @property
    def showtag(self):
        """Whether to show the tag itself instead of a wikicode version."""
        return self._showtag
    def wiki_markup(self):
        """The wikified version of a tag to show instead of HTML.

        If set to a value, this will be displayed instead of the brackets.
        For example, set to ``''`` to replace ``<i>`` or ``----`` to replace
        ``<hr>``.
        """
        return self._wiki_markup

    @property
    def self_closing(self):
        """Whether the tag is self-closing with no content."""
        """Whether the tag is self-closing with no content (like ``<br/>``)."""
        return self._self_closing

    @property
    def open_padding(self):
        """How much spacing to insert before the first closing >."""
        return self._open_padding
    def invalid(self):
        """Whether the tag starts with a slash after the opening bracket.

        This makes the tag look like a lone close tag. It is technically
        invalid and is only parsable Wikicode when the tag itself is
        single-only, like ``<br>`` and ``<img>``. See
        :py:func:`.definitions.is_single_only`.
        """
        return self._invalid

    @property
    def close_padding(self):
        """How much spacing to insert before the last closing >."""
        return self._close_padding
    def implicit(self):
        """Whether the tag is implicitly self-closing, with no ending slash.

    @type.setter
    def type(self, value):
        value = int(value)
        if value not in self.TAGS_INVISIBLE | self.TAGS_VISIBLE:
            raise ValueError(value)
        self._type = value
        This is only possible for specific "single" tags like ``<br>`` and
        ``<li>``. See :py:func:`.definitions.is_single`. This field only has an
        effect if :py:attr:`self_closing` is also ``True``.
        """
        return self._implicit

    @property
    def padding(self):
        """Spacing to insert before the first closing ``>``."""
        return self._padding

    @property
    def closing_tag(self):
        """The closing tag, as a :py:class:`~.Wikicode` object.

        This will usually equal :py:attr:`tag`, unless there is additional
        spacing, comments, or the like.
        """
        return self._closing_tag

    @tag.setter
    def tag(self, value):
        self._tag = parse_anything(value)
        self._tag = self._closing_tag = parse_anything(value)

    @contents.setter
    def contents(self, value):
        self._contents = parse_anything(value)

    @showtag.setter
    def showtag(self, value):
        self._showtag = bool(value)
    @wiki_markup.setter
    def wiki_markup(self, value):
        self._wiki_markup = str(value) if value else None

    @self_closing.setter
    def self_closing(self, value):
        self._self_closing = bool(value)

    @open_padding.setter
    def open_padding(self, value):
        self._open_padding = int(value)
    @invalid.setter
    def invalid(self, value):
        self._invalid = bool(value)

    @implicit.setter
    def implicit(self, value):
        self._implicit = bool(value)

    @close_padding.setter
    def close_padding(self, value):
        self._close_padding = int(value)
    @padding.setter
    def padding(self, value):
        if not value:
            self._padding = ""
        else:
            value = str(value)
            if not value.isspace():
                raise ValueError("padding must be entirely whitespace")
            self._padding = value

    @closing_tag.setter
    def closing_tag(self, value):
        self._closing_tag = parse_anything(value)

    def has(self, name):
        """Return whether any attribute in the tag has the given *name*.

        Note that a tag may have multiple attributes with the same name, but
        only the last one is read by the MediaWiki parser.
        """
        for attr in self.attributes:
            if attr.name == name.strip():
                return True
        return False

    def get(self, name):
        """Get the attribute with the given *name*.

        The returned object is a :py:class:`~.Attribute` instance. Raises
        :py:exc:`ValueError` if no attribute has this name. Because multiple
        attributes can have the same name, the last match is returned: all
        but the last are ignored by the MediaWiki parser.
        """
        for attr in reversed(self.attributes):
            if attr.name == name.strip():
                return attr
        raise ValueError(name)

    def add(self, name, value=None, quoted=True, pad_first=" ",
            pad_before_eq="", pad_after_eq=""):
        """Add an attribute with the given *name* and *value*.

        *name* and *value* can be anything parsable by
        :py:func:`.utils.parse_anything`; *value* can be omitted if the
        attribute is valueless. *quoted* is a bool telling whether to wrap the
        *value* in double quotes (this is recommended). *pad_first*,
        *pad_before_eq*, and *pad_after_eq* are whitespace used as padding
        before the name, before the equal sign (or after the name if no value),
        and after the equal sign (ignored if no value), respectively.
        """
        if value is not None:
            value = parse_anything(value)
        attr = Attribute(parse_anything(name), value, quoted)
        attr.pad_first = pad_first
        attr.pad_before_eq = pad_before_eq
        attr.pad_after_eq = pad_after_eq
        self.attributes.append(attr)
        return attr

    def remove(self, name):
        """Remove all attributes with the given *name*.

        Raises :py:exc:`ValueError` if no attribute has this name.
        """
        attrs = [attr for attr in self.attributes if attr.name == name.strip()]
        if not attrs:
            raise ValueError(name)
        for attr in attrs:
            self.attributes.remove(attr)
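
The attribute helpers added to ``Tag`` in this file follow a last-match-wins rule for duplicate names. The following is a minimal sketch of that lookup logic only. ``SketchAttribute`` and ``SketchTag`` are hypothetical stand-ins, not the library's ``Attribute`` and ``Tag`` classes, and names are plain strings rather than ``Wikicode`` objects.

```python
class SketchAttribute:
    """Hypothetical stand-in for Attribute: just a name and a value."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value


class SketchTag:
    """Hypothetical stand-in for Tag, holding only an attribute list."""

    def __init__(self):
        self.attributes = []

    def has(self, name):
        # True if any attribute carries this name (surrounding space ignored).
        return any(attr.name == name.strip() for attr in self.attributes)

    def get(self, name):
        # Search from the end: only the last duplicate is read by MediaWiki.
        for attr in reversed(self.attributes):
            if attr.name == name.strip():
                return attr
        raise ValueError(name)

    def add(self, name, value=None):
        attr = SketchAttribute(name, value)
        self.attributes.append(attr)
        return attr

    def remove(self, name):
        # Remove every attribute with this name; complain if there were none.
        matches = [a for a in self.attributes if a.name == name.strip()]
        if not matches:
            raise ValueError(name)
        for attr in matches:
            self.attributes.remove(attr)


tag = SketchTag()
tag.add("name", "first")
tag.add("name", "second")
last_value = tag.get(" name ").value  # the later duplicate wins
tag.remove("name")                    # drops both duplicates
still_there = tag.has("name")
```

Searching in reverse keeps ``get()`` consistent with what a reader of the rendered page would see, while ``remove()`` deliberately clears every duplicate so that no earlier value resurfaces.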
--- a/mwparserfromhell/nodes/template.py
+++ b/mwparserfromhell/nodes/template.py
@@ -26,7 +26,7 @@ import re

 from . import HTMLEntity, Node, Text
 from .extras import Parameter
 from ..compat import basestring, str
 from ..compat import str
 from ..utils import parse_anything

 __all__ = ["Template"]
@@ -84,7 +84,7 @@ class Template(Node):
        replacement = str(HTMLEntity(value=ord(char)))
        for node in code.filter_text(recursive=False):
            if char in node:
                code.replace(node, node.replace(char, replacement))
                code.replace(node, node.replace(char, replacement), False)

    def _blank_param_value(self, value):
        """Remove the content from *value* while keeping its whitespace.
@@ -164,15 +164,15 @@ class Template(Node):
    def name(self, value):
        self._name = parse_anything(value)

    def has_param(self, name, ignore_empty=True):
    def has(self, name, ignore_empty=True):
        """Return ``True`` if any parameter in the template is named *name*.

        With *ignore_empty*, ``False`` will be returned even if the template
        contains a parameter with the name *name*, if the parameter's value
        is empty. Note that a template may have multiple parameters with the
        same name.
        same name, but only the last one is read by the MediaWiki parser.
        """
        name = name.strip() if isinstance(name, basestring) else str(name)
        name = str(name).strip()
        for param in self.params:
            if param.name.strip() == name:
                if ignore_empty and not param.value.strip():
@@ -180,6 +180,9 @@ class Template(Node):
                return True
        return False

    has_param = lambda self, *args, **kwargs: self.has(*args, **kwargs)
    has_param.__doc__ = "Alias for :py:meth:`has`."

    def get(self, name):
        """Get the parameter whose name is *name*.

@@ -188,7 +191,7 @@ class Template(Node):
        parameters can have the same name, we'll return the last match, since
        the last parameter is the only one read by the MediaWiki parser.
        """
        name = name.strip() if isinstance(name, basestring) else str(name)
        name = str(name).strip()
        for param in reversed(self.params):
            if param.name.strip() == name:
                return param
@@ -226,7 +229,7 @@ class Template(Node):
        name, value = parse_anything(name), parse_anything(value)
        self._surface_escape(value, "|")

        if self.has_param(name):
        if self.has(name):
            self.remove(name, keep_field=True)
            existing = self.get(name)
            if showkey is not None:
@@ -291,7 +294,7 @@ class Template(Node):
        the first instance if none have dependents, otherwise the one with
        dependents will be kept).
        """
        name = name.strip() if isinstance(name, basestring) else str(name)
        name = str(name).strip()
        removed = False
        to_remove = []
        for i, param in enumerate(self.params):
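
The renamed ``has()`` method and its ``has_param`` alias can be illustrated with a small self-contained sketch. ``SketchParam`` and ``SketchTemplate`` are hypothetical stand-ins for the library's classes, and the ``continue`` after the empty-value check is an assumption, since that line falls outside the hunk shown here.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's Parameter objects.
SketchParam = namedtuple("SketchParam", ["name", "value"])


class SketchTemplate:
    def __init__(self, params):
        self.params = list(params)

    def has(self, name, ignore_empty=True):
        # Any object is accepted as a name: convert first, then strip.
        name = str(name).strip()
        for param in self.params:
            if param.name.strip() == name:
                if ignore_empty and not param.value.strip():
                    continue  # assumed: skip empty values and keep looking
                return True
        return False

    # Backwards-compatible alias that forwards to has(), as in the diff.
    has_param = lambda self, *args, **kwargs: self.has(*args, **kwargs)

    def get(self, name):
        name = str(name).strip()
        # The last parameter with a given name is the one MediaWiki reads.
        for param in reversed(self.params):
            if param.name.strip() == name:
                return param
        raise ValueError(name)


template = SketchTemplate([
    SketchParam(" 1 ", "foo"),
    SketchParam("empty", "  "),
    SketchParam("1", "bar"),
])
```

Because the name is normalised with ``str(name).strip()``, an integer such as ``1`` matches a positional parameter without a separate string check.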
--- a/mwparserfromhell/parser/init.py
+++ b/mwparserfromhell/parser/init.py
@@ -46,16 +46,15 @@ class Parser(object):
    :py:class:`~.Node`\ s by the :py:class:`~.Builder`.
    """

    def __init__(self, text):
        self.text = text
    def __init__(self):
        if use_c and CTokenizer:
            self._tokenizer = CTokenizer()
        else:
            self._tokenizer = Tokenizer()
        self._builder = Builder()

    def parse(self):
        """Return a string as a parsed :py:class:`~.Wikicode` object tree."""
        tokens = self._tokenizer.tokenize(self.text)
    def parse(self, text, context=0):
        """Parse *text*, returning a :py:class:`~.Wikicode` object tree."""
        tokens = self._tokenizer.tokenize(text, context)
        code = self._builder.build(tokens)
        return code
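
With this change the parser no longer stores the text at construction time, so one instance can parse many strings. The following sketch shows only that calling pattern. ``SketchTokenizer``, ``SketchBuilder``, and ``SketchParser`` are hypothetical stand-ins that do no real wikitext parsing.

```python
class SketchTokenizer:
    """Hypothetical tokenizer: splits on whitespace, records the context."""

    def tokenize(self, text, context=0):
        return [(context, word) for word in text.split()]


class SketchBuilder:
    """Hypothetical builder: joins the token payloads back together."""

    def build(self, tokens):
        return " ".join(word for _, word in tokens)


class SketchParser:
    def __init__(self):
        # Helpers are created once, not once per input string.
        self._tokenizer = SketchTokenizer()
        self._builder = SketchBuilder()

    def parse(self, text, context=0):
        # The text and the starting context are now per-call arguments.
        tokens = self._tokenizer.tokenize(text, context)
        return self._builder.build(tokens)


parser = SketchParser()
first = parser.parse("{{foo}}  bar")
second = parser.parse("[[baz]]", context=1)  # same instance, new input
```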
--- a/mwparserfromhell/parser/builder.py
+++ b/mwparserfromhell/parser/builder.py
@@ -24,8 +24,8 @@ from __future__ import unicode_literals

 from . import tokens
 from ..compat import str
 from ..nodes import (Argument, Comment, Heading, HTMLEntity, Tag, Template,
                     Text, Wikilink)
 from ..nodes import (Argument, Comment, ExternalLink, Heading, HTMLEntity, Tag,
                     Template, Text, Wikilink)
 from ..nodes.extras import Attribute, Parameter
 from ..smart_list import SmartList
 from ..wikicode import Wikicode
@@ -83,7 +83,7 @@ class Builder(object):
                                    tokens.TemplateClose)):
                self._tokens.append(token)
                value = self._pop()
                if not key:
                if key is None:
                    key = self._wrap([Text(str(default))])
                return Parameter(key, value, showkey)
            else:
@@ -142,6 +142,22 @@ class Builder(object):
            else:
                self._write(self._handle_token(token))

    def _handle_external_link(self, token):
        """Handle when an external link is at the head of the tokens."""
        brackets, url = token.brackets, None
        self._push()
        while self._tokens:
            token = self._tokens.pop()
            if isinstance(token, tokens.ExternalLinkSeparator):
                url = self._pop()
                self._push()
            elif isinstance(token, tokens.ExternalLinkClose):
                if url is not None:
                    return ExternalLink(url, self._pop(), brackets)
                return ExternalLink(self._pop(), brackets=brackets)
            else:
                self._write(self._handle_token(token))

    def _handle_entity(self):
        """Handle a case where an HTML entity is at the head of the tokens."""
        token = self._tokens.pop()
@@ -170,7 +186,7 @@ class Builder(object):
                self._write(self._handle_token(token))

    def _handle_comment(self):
        """Handle a case where a hidden comment is at the head of the tokens."""
        """Handle a case where an HTML comment is at the head of the tokens."""
        self._push()
        while self._tokens:
            token = self._tokens.pop()
@@ -180,7 +196,7 @@ class Builder(object):
            else:
                self._write(self._handle_token(token))

    def _handle_attribute(self):
    def _handle_attribute(self, start):
        """Handle a case where a tag attribute is at the head of the tokens."""
        name, quoted = None, False
        self._push()
@@ -191,37 +207,46 @@ class Builder(object):
                self._push()
            elif isinstance(token, tokens.TagAttrQuote):
                quoted = True
            elif isinstance(token, (tokens.TagAttrStart,
                                    tokens.TagCloseOpen)):
            elif isinstance(token, (tokens.TagAttrStart, tokens.TagCloseOpen,
                                    tokens.TagCloseSelfclose)):
                self._tokens.append(token)
                if name is not None:
                    return Attribute(name, self._pop(), quoted)
                return Attribute(self._pop(), quoted=quoted)
                if name:
                    value = self._pop()
                else:
                    name, value = self._pop(), None
                return Attribute(name, value, quoted, start.pad_first,
                                 start.pad_before_eq, start.pad_after_eq)
            else:
                self._write(self._handle_token(token))

    def _handle_tag(self, token):
        """Handle a case where a tag is at the head of the tokens."""
        type_, showtag = token.type, token.showtag
        attrs = []
        close_tokens = (tokens.TagCloseSelfclose, tokens.TagCloseClose)
        implicit, attrs, contents, closing_tag = False, [], None, None
        wiki_markup, invalid = token.wiki_markup, token.invalid or False
        self._push()
        while self._tokens:
            token = self._tokens.pop()
            if isinstance(token, tokens.TagAttrStart):
                attrs.append(self._handle_attribute())
                attrs.append(self._handle_attribute(token))
            elif isinstance(token, tokens.TagCloseOpen):
                open_pad = token.padding
                padding = token.padding or ""
                tag = self._pop()
                self._push()
            elif isinstance(token, tokens.TagCloseSelfclose):
                tag = self._pop()
                return Tag(type_, tag, attrs=attrs, showtag=showtag,
                           self_closing=True, open_padding=token.padding)
            elif isinstance(token, tokens.TagOpenClose):
                contents = self._pop()
            elif isinstance(token, tokens.TagCloseClose):
                return Tag(type_, tag, contents, attrs, showtag, False,
                           open_pad, token.padding)
                self._push()
            elif isinstance(token, close_tokens):
                if isinstance(token, tokens.TagCloseSelfclose):
                    tag = self._pop()
                    self_closing = True
                    padding = token.padding or ""
                    implicit = token.implicit or False
                else:
                    self_closing = False
                    closing_tag = self._pop()
                return Tag(tag, contents, attrs, wiki_markup, self_closing,
                           invalid, implicit, padding, closing_tag)
            else:
                self._write(self._handle_token(token))

@@ -235,6 +260,8 @@ class Builder(object):
            return self._handle_argument()
        elif isinstance(token, tokens.WikilinkOpen):
            return self._handle_wikilink()
        elif isinstance(token, tokens.ExternalLinkOpen):
            return self._handle_external_link(token)
        elif isinstance(token, tokens.HTMLEntityStart):
            return self._handle_entity()
        elif isinstance(token, tokens.HeadingStart):
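
The control flow of ``_handle_external_link()`` can be sketched without the rest of the builder. This is a simplified illustration under stated assumptions: the token classes ``Open``, ``Separator``, and ``Close`` and the function ``build_external_link`` are hypothetical, text is represented by plain strings, and the input list is reversed first because the ``self._tokens.pop()`` calls suggest the builder consumes tokens from the end of a reversed list.

```python
class Open:
    """Hypothetical open token; brackets is False for a bare URL."""

    def __init__(self, brackets):
        self.brackets = brackets


class Separator:
    """Hypothetical token marking the space between URL and title."""


class Close:
    """Hypothetical token ending the link."""


def build_external_link(tokens):
    """Return (url, title, brackets) for one link's tokens."""
    stack = list(reversed(tokens))  # pop() then yields tokens in order
    brackets, url = stack.pop().brackets, None
    current = []                    # text collected since the last split
    while stack:
        token = stack.pop()
        if isinstance(token, Separator):
            url, current = current, []
        elif isinstance(token, Close):
            if url is not None:
                return "".join(url), "".join(current), brackets
            return "".join(current), None, brackets
        else:
            current.append(token)
    # Sketch-only choice: the method in the diff has no explicit fallback.
    raise ValueError("missing close token")


titled = build_external_link(
    [Open(True), "http://example.com/", Separator(), "Example", Close()])
bare = build_external_link([Open(False), "http://example.com/", Close()])
```

The separator is what distinguishes the two forms: without it, everything collected so far is the URL and the title stays ``None``.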
--- a/mwparserfromhell/parser/contexts.py
+++ b/mwparserfromhell/parser/contexts.py
@@ -51,6 +51,12 @@ Local (stack-specific) contexts:
    * :py:const:`WIKILINK_TITLE`
    * :py:const:`WIKILINK_TEXT`

 * :py:const:`EXT_LINK`

    * :py:const:`EXT_LINK_URI`
    * :py:const:`EXT_LINK_TITLE`
    * :py:const:`EXT_LINK_BRACKETS`

 * :py:const:`HEADING`

    * :py:const:`HEADING_LEVEL_1`
@@ -60,7 +66,21 @@ Local (stack-specific) contexts:
    * :py:const:`HEADING_LEVEL_5`
    * :py:const:`HEADING_LEVEL_6`

 * :py:const:`COMMENT`
 * :py:const:`TAG`

    * :py:const:`TAG_OPEN`
    * :py:const:`TAG_ATTR`
    * :py:const:`TAG_BODY`
    * :py:const:`TAG_CLOSE`

 * :py:const:`STYLE`

    * :py:const:`STYLE_ITALICS`
    * :py:const:`STYLE_BOLD`
    * :py:const:`STYLE_PASS_AGAIN`
    * :py:const:`STYLE_SECOND_PASS`

 * :py:const:`DL_TERM`

 * :py:const:`SAFETY_CHECK`

@@ -74,41 +94,76 @@ Local (stack-specific) contexts:
 Global contexts:

 * :py:const:`GL_HEADING`

 Aggregate contexts:

 * :py:const:`FAIL`
 * :py:const:`UNSAFE`
 * :py:const:`DOUBLE`
 * :py:const:`INVALID_LINK`

 """

 # Local contexts:

 TEMPLATE =              0b00000000000000000111
 TEMPLATE_NAME =         0b00000000000000000001
 TEMPLATE_PARAM_KEY =    0b00000000000000000010
 TEMPLATE_PARAM_VALUE =  0b00000000000000000100

 ARGUMENT =              0b00000000000000011000
 ARGUMENT_NAME =         0b00000000000000001000
 ARGUMENT_DEFAULT =      0b00000000000000010000

 WIKILINK =              0b00000000000001100000
 WIKILINK_TITLE =        0b00000000000000100000
 WIKILINK_TEXT =         0b00000000000001000000

 HEADING =               0b00000001111110000000
 HEADING_LEVEL_1 =       0b00000000000010000000
 HEADING_LEVEL_2 =       0b00000000000100000000
 HEADING_LEVEL_3 =       0b00000000001000000000
 HEADING_LEVEL_4 =       0b00000000010000000000
 HEADING_LEVEL_5 =       0b00000000100000000000
 HEADING_LEVEL_6 =       0b00000001000000000000

 COMMENT =               0b00000010000000000000

 SAFETY_CHECK =          0b11111100000000000000
 HAS_TEXT =              0b00000100000000000000
 FAIL_ON_TEXT =          0b00001000000000000000
 FAIL_NEXT  =            0b00010000000000000000
 FAIL_ON_LBRACE =        0b00100000000000000000
 FAIL_ON_RBRACE =        0b01000000000000000000
 FAIL_ON_EQUALS =        0b10000000000000000000
 TEMPLATE_NAME =        1 << 0
 TEMPLATE_PARAM_KEY =   1 << 1
 TEMPLATE_PARAM_VALUE = 1 << 2
 TEMPLATE = TEMPLATE_NAME + TEMPLATE_PARAM_KEY + TEMPLATE_PARAM_VALUE

 ARGUMENT_NAME =    1 << 3
 ARGUMENT_DEFAULT = 1 << 4
 ARGUMENT = ARGUMENT_NAME + ARGUMENT_DEFAULT

 WIKILINK_TITLE = 1 << 5
 WIKILINK_TEXT =  1 << 6
 WIKILINK = WIKILINK_TITLE + WIKILINK_TEXT

 EXT_LINK_URI      = 1 << 7
 EXT_LINK_TITLE    = 1 << 8
 EXT_LINK_BRACKETS = 1 << 9
 EXT_LINK = EXT_LINK_URI + EXT_LINK_TITLE + EXT_LINK_BRACKETS

 HEADING_LEVEL_1 = 1 << 10
 HEADING_LEVEL_2 = 1 << 11
 HEADING_LEVEL_3 = 1 << 12
 HEADING_LEVEL_4 = 1 << 13
 HEADING_LEVEL_5 = 1 << 14
 HEADING_LEVEL_6 = 1 << 15
 HEADING = (HEADING_LEVEL_1 + HEADING_LEVEL_2 + HEADING_LEVEL_3 +
           HEADING_LEVEL_4 + HEADING_LEVEL_5 + HEADING_LEVEL_6)

 TAG_OPEN =  1 << 16
 TAG_ATTR =  1 << 17
 TAG_BODY =  1 << 18
 TAG_CLOSE = 1 << 19
 TAG = TAG_OPEN + TAG_ATTR + TAG_BODY + TAG_CLOSE

 STYLE_ITALICS =      1 << 20
 STYLE_BOLD =         1 << 21
 STYLE_PASS_AGAIN =   1 << 22
 STYLE_SECOND_PASS =  1 << 23
 STYLE = STYLE_ITALICS + STYLE_BOLD + STYLE_PASS_AGAIN + STYLE_SECOND_PASS

 DL_TERM = 1 << 24

 HAS_TEXT =       1 << 25
 FAIL_ON_TEXT =   1 << 26
 FAIL_NEXT  =     1 << 27
 FAIL_ON_LBRACE = 1 << 28
 FAIL_ON_RBRACE = 1 << 29
 FAIL_ON_EQUALS = 1 << 30
 SAFETY_CHECK = (HAS_TEXT + FAIL_ON_TEXT + FAIL_NEXT + FAIL_ON_LBRACE +
                FAIL_ON_RBRACE + FAIL_ON_EQUALS)

 # Global contexts:

 GL_HEADING = 0b1
 GL_HEADING = 1 << 0

 # Aggregate contexts:

 FAIL = TEMPLATE + ARGUMENT + WIKILINK + EXT_LINK_TITLE + HEADING + TAG + STYLE
 UNSAFE = (TEMPLATE_NAME + WIKILINK + EXT_LINK_TITLE + TEMPLATE_PARAM_KEY +
          ARGUMENT_NAME + TAG_CLOSE)
 DOUBLE = TEMPLATE_PARAM_KEY + TAG_CLOSE
 INVALID_LINK = TEMPLATE_NAME + ARGUMENT_NAME + WIKILINK + EXT_LINK
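
The shift-based definitions make each context a single bit, so aggregate contexts can be built by addition and tested with a bitwise AND. A short sketch using a subset of the values defined in this file; the ``describe`` helper is hypothetical and exists only for illustration.

```python
# A subset of the constants defined in contexts.py in this diff.
TEMPLATE_NAME = 1 << 0
TEMPLATE_PARAM_KEY = 1 << 1
TEMPLATE_PARAM_VALUE = 1 << 2
TEMPLATE = TEMPLATE_NAME + TEMPLATE_PARAM_KEY + TEMPLATE_PARAM_VALUE

WIKILINK_TITLE = 1 << 5
WIKILINK_TEXT = 1 << 6
WIKILINK = WIKILINK_TITLE + WIKILINK_TEXT

TAG_CLOSE = 1 << 19
DOUBLE = TEMPLATE_PARAM_KEY + TAG_CLOSE


def describe(context):
    """Hypothetical helper: name the groups a context value belongs to."""
    groups = []
    if context & TEMPLATE:
        groups.append("template")
    if context & WIKILINK:
        groups.append("wikilink")
    if context & DOUBLE:
        groups.append("double")
    return groups


context = TEMPLATE_PARAM_KEY
context |= WIKILINK_TEXT    # set an additional flag
both = describe(context)
context &= ~WIKILINK_TEXT   # clear it again
only_template = describe(context)
```

Addition is equivalent to bitwise OR here only because every flag occupies its own bit; adding the same flag twice would carry into a neighbouring bit.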
--- a/mwparserfromhell/parser/tokenizer.c
+++ b/mwparserfromhell/parser/tokenizer.c
--- a/mwparserfromhell/parser/tokenizer.h
+++ b/mwparserfromhell/parser/tokenizer.h
@@ -28,6 +28,7 @@ SOFTWARE.
 #include <Python.h>
 #include <math.h>
 #include <structmember.h>
 #include <bytesobject.h>

 #if PY_MAJOR_VERSION >= 3
 #define IS_PY3K
@@ -41,8 +42,8 @@ SOFTWARE.
 #define ALPHANUM  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

 static const char* MARKERS[] = {
    "{",  "}", "[", "]", "<", ">", "|", "=", "&", "#", "*", ";", ":", "/", "-",
    "!", "\n", ""};
    "{", "}", "[", "]", "<", ">", "|", "=", "&", "'", "#", "*", ";", ":", "/",
    "-", "\n", ""};

 #define NUM_MARKERS 18
 #define TEXTBUFFER_BLOCKSIZE 1024
@@ -51,19 +52,20 @@ static const char* MARKERS[] = {
 #define MAX_BRACES 255
 #define MAX_ENTITY_SIZE 8

 static int route_state = 0;
 #define BAD_ROUTE     (route_state)
 #define FAIL_ROUTE()  (route_state = 1)
 #define RESET_ROUTE() (route_state = 0)
 static int route_state = 0, route_context = 0;
 #define BAD_ROUTE            route_state
 #define BAD_ROUTE_CONTEXT    route_context
 #define FAIL_ROUTE(context)  do { route_state = 1; route_context = (context); } while (0)
 #define RESET_ROUTE()        route_state = 0

 static char** entitydefs;

 static PyObject* EMPTY;
 static PyObject* NOARGS;
 static PyObject* tokens;
 static PyObject* definitions;


 /* Tokens */
 /* Tokens: */

 static PyObject* Text;

@@ -80,6 +82,10 @@ static PyObject* WikilinkOpen;
 static PyObject* WikilinkSeparator;
 static PyObject* WikilinkClose;

 static PyObject* ExternalLinkOpen;
 static PyObject* ExternalLinkSeparator;
 static PyObject* ExternalLinkClose;

 static PyObject* HTMLEntityStart;
 static PyObject* HTMLEntityNumeric;
 static PyObject* HTMLEntityHex;
@@ -102,47 +108,83 @@ static PyObject* TagCloseClose;

 /* Local contexts: */

 #define LC_TEMPLATE             0x00007
 #define LC_TEMPLATE_NAME        0x00001
 #define LC_TEMPLATE_PARAM_KEY   0x00002
 #define LC_TEMPLATE_PARAM_VALUE 0x00004

 #define LC_ARGUMENT             0x00018
 #define LC_ARGUMENT_NAME        0x00008
 #define LC_ARGUMENT_DEFAULT     0x00010

 #define LC_WIKILINK             0x00060
 #define LC_WIKILINK_TITLE       0x00020
 #define LC_WIKILINK_TEXT        0x00040

 #define LC_HEADING              0x01F80
 #define LC_HEADING_LEVEL_1      0x00080
 #define LC_HEADING_LEVEL_2      0x00100
 #define LC_HEADING_LEVEL_3      0x00200
 #define LC_HEADING_LEVEL_4      0x00400
 #define LC_HEADING_LEVEL_5      0x00800
 #define LC_HEADING_LEVEL_6      0x01000

 #define LC_COMMENT              0x02000

 #define LC_SAFETY_CHECK         0xFC000
 #define LC_HAS_TEXT             0x04000
 #define LC_FAIL_ON_TEXT         0x08000
 #define LC_FAIL_NEXT            0x10000
 #define LC_FAIL_ON_LBRACE       0x20000
 #define LC_FAIL_ON_RBRACE       0x40000
 #define LC_FAIL_ON_EQUALS       0x80000
 #define LC_TEMPLATE             0x00000007
 #define LC_TEMPLATE_NAME        0x00000001
 #define LC_TEMPLATE_PARAM_KEY   0x00000002
 #define LC_TEMPLATE_PARAM_VALUE 0x00000004

 #define LC_ARGUMENT             0x00000018
 #define LC_ARGUMENT_NAME        0x00000008
 #define LC_ARGUMENT_DEFAULT     0x00000010

 #define LC_WIKILINK             0x00000060
 #define LC_WIKILINK_TITLE       0x00000020
 #define LC_WIKILINK_TEXT        0x00000040

 #define LC_EXT_LINK             0x00000380
 #define LC_EXT_LINK_URI         0x00000080
 #define LC_EXT_LINK_TITLE       0x00000100
 #define LC_EXT_LINK_BRACKETS    0x00000200

 #define LC_HEADING              0x0000FC00
 #define LC_HEADING_LEVEL_1      0x00000400
 #define LC_HEADING_LEVEL_2      0x00000800
 #define LC_HEADING_LEVEL_3      0x00001000
 #define LC_HEADING_LEVEL_4      0x00002000
 #define LC_HEADING_LEVEL_5      0x00004000
 #define LC_HEADING_LEVEL_6      0x00008000

 #define LC_TAG                  0x000F0000
 #define LC_TAG_OPEN             0x00010000
 #define LC_TAG_ATTR             0x00020000
 #define LC_TAG_BODY             0x00040000
 #define LC_TAG_CLOSE            0x00080000

 #define LC_STYLE                0x00F00000
 #define LC_STYLE_ITALICS        0x00100000
 #define LC_STYLE_BOLD           0x00200000
 #define LC_STYLE_PASS_AGAIN     0x00400000
 #define LC_STYLE_SECOND_PASS    0x00800000

 #define LC_DLTERM               0x01000000

 #define LC_SAFETY_CHECK         0x7E000000
 #define LC_HAS_TEXT             0x02000000
 #define LC_FAIL_ON_TEXT         0x04000000
 #define LC_FAIL_NEXT            0x08000000
 #define LC_FAIL_ON_LBRACE       0x10000000
 #define LC_FAIL_ON_RBRACE       0x20000000
 #define LC_FAIL_ON_EQUALS       0x40000000

 /* Global contexts: */

 #define GL_HEADING 0x1

 /* Aggregate contexts: */

 #define AGG_FAIL         (LC_TEMPLATE | LC_ARGUMENT | LC_WIKILINK | LC_EXT_LINK_TITLE | LC_HEADING | LC_TAG | LC_STYLE)
 #define AGG_UNSAFE       (LC_TEMPLATE_NAME | LC_WIKILINK | LC_EXT_LINK_TITLE | LC_TEMPLATE_PARAM_KEY | LC_ARGUMENT_NAME)
 #define AGG_DOUBLE       (LC_TEMPLATE_PARAM_KEY | LC_TAG_CLOSE)
 #define AGG_INVALID_LINK (LC_TEMPLATE_NAME | LC_ARGUMENT_NAME | LC_WIKILINK | LC_EXT_LINK)

 /* Tag contexts: */

 #define TAG_NAME        0x01
 #define TAG_ATTR_READY  0x02
 #define TAG_ATTR_NAME   0x04
 #define TAG_ATTR_VALUE  0x08
 #define TAG_QUOTED      0x10
 #define TAG_NOTE_SPACE  0x20
 #define TAG_NOTE_EQUALS 0x40
 #define TAG_NOTE_QUOTE  0x80


 /* Miscellaneous structs: */

 struct Textbuffer {
    Py_ssize_t size;
    Py_UNICODE* data;
    struct Textbuffer* prev;
    struct Textbuffer* next;
 };

@@ -158,13 +200,24 @@ typedef struct {
    int level;
 } HeadingData;

 typedef struct {
    int context;
    struct Textbuffer* pad_first;
    struct Textbuffer* pad_before_eq;
    struct Textbuffer* pad_after_eq;
    Py_ssize_t reset;
 } TagData;

 typedef struct Textbuffer Textbuffer;
 typedef struct Stack Stack;


 /* Tokenizer object definition: */

 typedef struct {
    PyObject_HEAD
    PyObject* text;         /* text to tokenize */
    struct Stack* topstack; /* topmost stack */
    Stack* topstack;        /* topmost stack */
    Py_ssize_t head;        /* current position in text */
    Py_ssize_t length;      /* length of text */
    int global;             /* global context */
@@ -173,78 +226,80 @@ typedef struct {
 } Tokenizer;


 /* Macros for accessing Tokenizer data: */
 /* Macros related to Tokenizer functions: */

 #define Tokenizer_READ(self, delta) (*PyUnicode_AS_UNICODE(Tokenizer_read(self, delta)))
 #define Tokenizer_READ_BACKWARDS(self, delta) \
                (*PyUnicode_AS_UNICODE(Tokenizer_read_backwards(self, delta)))
 #define Tokenizer_CAN_RECURSE(self) (self->depth < MAX_DEPTH && self->cycles < MAX_CYCLES)

 #define Tokenizer_emit(self, token) Tokenizer_emit_token(self, token, 0)
 #define Tokenizer_emit_first(self, token) Tokenizer_emit_token(self, token, 1)
 #define Tokenizer_emit_kwargs(self, token, kwargs) Tokenizer_emit_token_kwargs(self, token, kwargs, 0)
 #define Tokenizer_emit_first_kwargs(self, token, kwargs) Tokenizer_emit_token_kwargs(self, token, kwargs, 1)


 /* Macros for accessing definitions: */

 #define GET_HTML_TAG(markup) (markup == *":" ? "dd" : markup == *";" ? "dt" : "li")
 #define IS_PARSABLE(tag) (call_def_func("is_parsable", tag, NULL, NULL))
 #define IS_SINGLE(tag) (call_def_func("is_single", tag, NULL, NULL))
 #define IS_SINGLE_ONLY(tag) (call_def_func("is_single_only", tag, NULL, NULL))
 #define IS_SCHEME(scheme, slashes, reverse) \
    (call_def_func("is_scheme", scheme, slashes ? Py_True : Py_False, reverse ? Py_True : Py_False))


 /* Function prototypes: */

 static int heading_level_from_context(int);
 static Textbuffer* Textbuffer_new(void);
 static void Textbuffer_dealloc(Textbuffer*);

 static TagData* TagData_new(void);
 static void TagData_dealloc(TagData*);

 static PyObject* Tokenizer_new(PyTypeObject*, PyObject*, PyObject*);
 static struct Textbuffer* Textbuffer_new(void);
 static void Tokenizer_dealloc(Tokenizer*);
 static void Textbuffer_dealloc(struct Textbuffer*);
 static int Tokenizer_init(Tokenizer*, PyObject*, PyObject*);
 static int Tokenizer_push(Tokenizer*, int);
 static PyObject* Textbuffer_render(struct Textbuffer*);
 static int Tokenizer_push_textbuffer(Tokenizer*);
 static void Tokenizer_delete_top_of_stack(Tokenizer*);
 static PyObject* Tokenizer_pop(Tokenizer*);
 static PyObject* Tokenizer_pop_keeping_context(Tokenizer*);
 static void* Tokenizer_fail_route(Tokenizer*);
 static int Tokenizer_write(Tokenizer*, PyObject*);
 static int Tokenizer_write_first(Tokenizer*, PyObject*);
 static int Tokenizer_write_text(Tokenizer*, Py_UNICODE);
 static int Tokenizer_write_all(Tokenizer*, PyObject*);
 static int Tokenizer_write_text_then_stack(Tokenizer*, const char*);
 static PyObject* Tokenizer_read(Tokenizer*, Py_ssize_t);
 static PyObject* Tokenizer_read_backwards(Tokenizer*, Py_ssize_t);
 static int Tokenizer_parse_template_or_argument(Tokenizer*);
 static int Tokenizer_parse_template(Tokenizer*);
 static int Tokenizer_parse_argument(Tokenizer*);
 static int Tokenizer_handle_template_param(Tokenizer*);
 static int Tokenizer_handle_template_param_value(Tokenizer*);
 static PyObject* Tokenizer_handle_template_end(Tokenizer*);
 static int Tokenizer_handle_argument_separator(Tokenizer*);
 static PyObject* Tokenizer_handle_argument_end(Tokenizer*);
 static int Tokenizer_parse_wikilink(Tokenizer*);
 static int Tokenizer_handle_wikilink_separator(Tokenizer*);
 static PyObject* Tokenizer_handle_wikilink_end(Tokenizer*);
 static int Tokenizer_parse_heading(Tokenizer*);
 static HeadingData* Tokenizer_handle_heading_end(Tokenizer*);
 static int Tokenizer_really_parse_entity(Tokenizer*);
 static int Tokenizer_parse_entity(Tokenizer*);
 static int Tokenizer_parse_comment(Tokenizer*);
 static int Tokenizer_verify_safe(Tokenizer*, int, Py_UNICODE);
 static PyObject* Tokenizer_parse(Tokenizer*, int);
 static int Tokenizer_handle_dl_term(Tokenizer*);
 static int Tokenizer_parse_tag(Tokenizer*);
 static PyObject* Tokenizer_parse(Tokenizer*, int, int);
 static PyObject* Tokenizer_tokenize(Tokenizer*, PyObject*);


 /* Macros for Python 2/3 compatibility: */

 #ifdef IS_PY3K
    #define NEW_INT_FUNC      PyLong_FromSsize_t
    #define IMPORT_NAME_FUNC  PyUnicode_FromString
    #define CREATE_MODULE     PyModule_Create(&module_def);
    #define ENTITYDEFS_MODULE "html.entities"
    #define INIT_FUNC_NAME    PyInit__tokenizer
    #define INIT_ERROR        return NULL
 #else
    #define NEW_INT_FUNC      PyInt_FromSsize_t
    #define IMPORT_NAME_FUNC  PyBytes_FromString
    #define CREATE_MODULE     Py_InitModule("_tokenizer", NULL);
    #define ENTITYDEFS_MODULE "htmlentitydefs"
    #define INIT_FUNC_NAME    init_tokenizer
    #define INIT_ERROR        return
 #endif


 /* More structs for creating the Tokenizer type: */

 static PyMethodDef
 Tokenizer_methods[] = {
 static PyMethodDef Tokenizer_methods[] = {
    {"tokenize", (PyCFunction) Tokenizer_tokenize, METH_VARARGS,
    "Build a list of tokens from a string of wikicode and return it."},
    {NULL}
 };

 static PyMemberDef
 Tokenizer_members[] = {
 static PyMemberDef Tokenizer_members[] = {
    {NULL}
 };

 static PyMethodDef
 module_methods[] = {
    {NULL}
 };

 static PyTypeObject
 TokenizerType = {
    PyObject_HEAD_INIT(NULL)
    0,                                                      /* ob_size */
 static PyTypeObject TokenizerType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    "_tokenizer.CTokenizer",                                /* tp_name */
    sizeof(Tokenizer),                                      /* tp_basicsize */
    0,                                                      /* tp_itemsize */
@@ -283,3 +338,12 @@ TokenizerType = {
    0,                                                      /* tp_alloc */
    Tokenizer_new,                                          /* tp_new */
 };

 #ifdef IS_PY3K
 static PyModuleDef module_def = {
    PyModuleDef_HEAD_INIT,
    "_tokenizer",
    "Creates a list of tokens from a string of wikicode.",
    -1, NULL, NULL, NULL, NULL, NULL
 };
 #endif
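
The hexadecimal ``LC_*`` values in this header have to stay in step with the shift-based constants in ``contexts.py``. A quick consistency check can be sketched in Python; the ``bits`` helper and the ``PAIRS`` table are hypothetical, and the numbers are copied from the two files in this diff.

```python
def bits(first, last):
    """Sum of the single-bit flags numbered first..last inclusive."""
    return sum(1 << n for n in range(first, last + 1))


# (group name, hex value from tokenizer.h, value implied by contexts.py)
PAIRS = [
    ("TEMPLATE", 0x00000007, bits(0, 2)),
    ("ARGUMENT", 0x00000018, bits(3, 4)),
    ("WIKILINK", 0x00000060, bits(5, 6)),
    ("EXT_LINK", 0x00000380, bits(7, 9)),
    ("HEADING", 0x0000FC00, bits(10, 15)),
    ("TAG", 0x000F0000, bits(16, 19)),
    ("STYLE", 0x00F00000, bits(20, 23)),
    ("DL_TERM", 0x01000000, bits(24, 24)),
    ("SAFETY_CHECK", 0x7E000000, bits(25, 30)),
]

mismatches = [name for name, c_value, py_value in PAIRS
              if c_value != py_value]

# The highest flag is bit 30, so every value fits in a signed 32-bit int.
fits_in_int32 = all(c_value < 2 ** 31 for _, c_value, _ in PAIRS)
```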
--- a/mwparserfromhell/parser/tokenizer.py
+++ b/mwparserfromhell/parser/tokenizer.py
--- a/mwparserfromhell/parser/tokens.py
+++ b/mwparserfromhell/parser/tokens.py
@@ -30,7 +30,7 @@ into the :py:class`~.Wikicode` tree by the :py:class:`~.Builder`.

 from __future__ import unicode_literals

 from ..compat import basestring, py3k
 from ..compat import py3k, str

 __all__ = ["Token"]

@@ -43,7 +43,7 @@ class Token(object):
    def __repr__(self):
        args = []
        for key, value in self._kwargs.items():
            if isinstance(value, basestring) and len(value) > 100:
            if isinstance(value, str) and len(value) > 100:
                args.append(key + "=" + repr(value[:97] + "..."))
            else:
                args.append(key + "=" + repr(value))
@@ -55,7 +55,7 @@ class Token(object):
        return False

    def __getattr__(self, key):
        return self._kwargs[key]
        return self._kwargs.get(key)

    def __setattr__(self, key, value):
        self._kwargs[key] = value
@@ -84,6 +84,10 @@ WikilinkOpen = make("WikilinkOpen")                                 # [[
 WikilinkSeparator = make("WikilinkSeparator")                       # |
 WikilinkClose = make("WikilinkClose")                               # ]]

 ExternalLinkOpen = make("ExternalLinkOpen")                         # [
 ExternalLinkSeparator = make("ExternalLinkSeparator")               #
 ExternalLinkClose = make("ExternalLinkClose")                       # ]

 HTMLEntityStart = make("HTMLEntityStart")                           # &
 HTMLEntityNumeric = make("HTMLEntityNumeric")                       # #
 HTMLEntityHex = make("HTMLEntityHex")                               # x
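The ``__getattr__`` change in the hunk above (``self._kwargs[key]`` becoming ``self._kwargs.get(key)``) can be illustrated with a small stand-alone sketch. ``SketchToken`` is a hypothetical stand-in written for this note, not the library's ``Token`` class; it only assumes that attributes live in a ``_kwargs`` dict, as the diff shows.

```python
class SketchToken(object):
    """Hypothetical stand-in for a token whose attributes live in a dict."""

    def __init__(self, **kwargs):
        # Bypass our own __setattr__ so that _kwargs is stored as a real
        # instance attribute rather than as an entry inside itself.
        super(SketchToken, self).__setattr__("_kwargs", kwargs)

    def __getattr__(self, key):
        # Only invoked when normal attribute lookup fails. With dict.get(),
        # an unknown key yields None instead of raising KeyError.
        return self._kwargs.get(key)

    def __setattr__(self, key, value):
        # Every assignment is routed into the dict.
        self._kwargs[key] = value


token = SketchToken(brackets=True)
token.padding = " "
```

With this shape, reading an attribute that was never set (``token.missing``) gives ``None``. A side effect worth keeping in mind is that ``hasattr()`` reports ``True`` for any name, because ``__getattr__`` never raises.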
--- a/mwparserfromhell/utils.py
+++ b/mwparserfromhell/utils.py
@@ -31,7 +31,9 @@ from .compat import bytes, str
 from .nodes import Node
 from .smart_list import SmartList

 def parse_anything(value):
 __all__ = ["parse_anything"]

 def parse_anything(value, context=0):
    """Return a :py:class:`~.Wikicode` for *value*, allowing multiple types.

    This differs from :py:meth:`.Parser.parse` in that we accept more than just
@@ -42,6 +44,12 @@ def parse_anything(value):
    on-the-fly by various methods of :py:class:`~.Wikicode` and others like
    :py:class:`~.Template`, such as :py:meth:`wikicode.insert()
    <.Wikicode.insert>` or setting :py:meth:`template.name <.Template.name>`.

    If given, *context* will be passed as a starting context to the parser.
    This is helpful when this function is used inside node attribute setters.
    For example, :py:class:`~.ExternalLink`\ 's :py:attr:`~.ExternalLink.url`
    setter sets *context* to :py:mod:`contexts.EXT_LINK_URI <.contexts>` to
    prevent the URL itself from becoming an :py:class:`~.ExternalLink`.
    """
    from .parser import Parser
    from .wikicode import Wikicode
@@ -51,17 +59,17 @@ def parse_anything(value):
    elif isinstance(value, Node):
        return Wikicode(SmartList([value]))
    elif isinstance(value, str):
        return Parser(value).parse()
        return Parser().parse(value, context)
    elif isinstance(value, bytes):
        return Parser(value.decode("utf8")).parse()
        return Parser().parse(value.decode("utf8"), context)
    elif isinstance(value, int):
        return Parser(str(value)).parse()
        return Parser().parse(str(value), context)
    elif value is None:
        return Wikicode(SmartList())
    try:
        nodelist = SmartList()
        for item in value:
            nodelist += parse_anything(item).nodes
            nodelist += parse_anything(item, context).nodes
    except TypeError:
        error = "Needs string, Node, Wikicode, int, None, or iterable of these, but got {0}: {1}"
        raise ValueError(error.format(type(value).__name__, value))
--- a/mwparserfromhell/wikicode.py
+++ b/mwparserfromhell/wikicode.py
@@ -24,8 +24,8 @@ from __future__ import unicode_literals
 import re

 from .compat import maxsize, py3k, str
 from .nodes import (Argument, Comment, Heading, HTMLEntity, Node, Tag,
                    Template, Text, Wikilink)
 from .nodes import (Argument, Comment, ExternalLink, Heading, HTMLEntity,
                    Node, Tag, Template, Text, Wikilink)
 from .string_mixin import StringMixIn
 from .utils import parse_anything

@@ -60,19 +60,6 @@ class Wikicode(StringMixIn):
        for context, child in node.__iternodes__(self._get_all_nodes):
            yield child

    def _get_context(self, node, obj):
        """Return a ``Wikicode`` that contains *obj* in its descendants.

        The closest (shortest distance from *node*) suitable ``Wikicode`` will
        be returned, or ``None`` if the *obj* is the *node* itself.

        Raises ``ValueError`` if *obj* is not within *node*.
        """
        for context, child in node.__iternodes__(self._get_all_nodes):
            if self._is_equivalent(obj, child):
                return context
        raise ValueError(obj)

    def _get_all_nodes(self, code):
        """Iterate over all of our descendant nodes.

@@ -105,26 +92,56 @@ class Wikicode(StringMixIn):
            return False
        return obj in nodes

    def _do_search(self, obj, recursive, callback, context, *args, **kwargs):
        """Look within *context* for *obj*, executing *callback* if found.
    def _do_search(self, obj, recursive, context=None, literal=None):
        """Return some info about the location of *obj* within *context*.

        If *recursive* is ``True``, we'll look within context and its
        descendants, otherwise we'll just execute callback. We raise
        :py:exc:`ValueError` if *obj* isn't in our node list or context. If
        found, *callback* is passed the context, the index of the node within
        the context, and whatever were passed as ``*args`` and ``**kwargs``.
        If *recursive* is ``True``, we'll look within *context* (``self`` by
        default) and its descendants, otherwise just *context*. We raise
        :py:exc:`ValueError` if *obj* isn't found. The return data is a list of
        3-tuples (*type*, *context*, *data*) where *type* is *obj*\ 's best
        type resolution (either ``Node``, ``Wikicode``, or ``str``), *context*
        is the closest ``Wikicode`` encompassing it, and *data* is either a
        ``Node``, a list of ``Node``\ s, or ``None`` depending on *type*.
        """
        if recursive:
            for i, node in enumerate(context.nodes):
                if self._is_equivalent(obj, node):
                    return callback(context, i, *args, **kwargs)
                if self._contains(self._get_children(node), obj):
                    context = self._get_context(node, obj)
                    return self._do_search(obj, recursive, callback, context,
                                           *args, **kwargs)
            raise ValueError(obj)
        if not context:
            context = self
            literal = isinstance(obj, (Node, Wikicode))
            obj = parse_anything(obj)
            if not obj or obj not in self:
                raise ValueError(obj)
            if len(obj.nodes) == 1:
                obj = obj.get(0)

        compare = lambda a, b: (a is b) if literal else (a == b)
        results = []
        i = 0
        while i < len(context.nodes):
            node = context.get(i)
            if isinstance(obj, Node) and compare(obj, node):
                results.append((Node, context, node))
            elif isinstance(obj, Wikicode) and compare(obj.get(0), node):
                for j in range(1, len(obj.nodes)):
                    if not compare(obj.get(j), context.get(i + j)):
                        break
                else:
                    nodes = list(context.nodes[i:i + len(obj.nodes)])
                    results.append((Wikicode, context, nodes))
                    i += len(obj.nodes) - 1
            elif recursive:
                contexts = node.__iternodes__(self._get_all_nodes)
                processed = []
                for code in (ctx for ctx, child in contexts):
                    if code and code not in processed and obj in code:
                        search = self._do_search(obj, recursive, code, literal)
                        results.extend(search)
                        processed.append(code)
            i += 1

        callback(context, self.index(obj, recursive=False), *args, **kwargs)
        if not results and not literal and recursive:
            results.append((str, context, None))
        if not results and context is self:
            raise ValueError(obj)
        return results

    def _get_tree(self, code, lines, marker, indent):
        """Build a tree to illustrate the way the Wikicode object was parsed.
@@ -253,41 +270,64 @@ class Wikicode(StringMixIn):
    def insert_before(self, obj, value, recursive=True):
        """Insert *value* immediately before *obj* in the list of nodes.

        *obj* can be either a string or a :py:class:`~.Node`. *value* can be
        anything parsable by :py:func:`.parse_anything`. If *recursive* is
        ``True``, we will try to find *obj* within our child nodes even if it
        is not a direct descendant of this :py:class:`~.Wikicode` object. If
        *obj* is not in the node list, :py:exc:`ValueError` is raised.
        *obj* can be a string, a :py:class:`~.Node`, or another
        :py:class:`~.Wikicode` object (as created by :py:meth:`get_sections`,
        for example). *value* can be anything parsable by
        :py:func:`.parse_anything`. If *recursive* is ``True``, we will try to
        find *obj* within our child nodes even if it is not a direct descendant
        of this :py:class:`~.Wikicode` object. If *obj* is not found,
        :py:exc:`ValueError` is raised.
        """
        callback = lambda self, i, value: self.insert(i, value)
        self._do_search(obj, recursive, callback, self, value)
        for restype, context, data in self._do_search(obj, recursive):
            if restype in (Node, Wikicode):
                i = context.index(data if restype is Node else data[0], False)
                context.insert(i, value)
            else:
                obj = str(obj)
                context.nodes = str(context).replace(obj, str(value) + obj)

    def insert_after(self, obj, value, recursive=True):
        """Insert *value* immediately after *obj* in the list of nodes.

        *obj* can be either a string or a :py:class:`~.Node`. *value* can be
        anything parsable by :py:func:`.parse_anything`. If *recursive* is
        ``True``, we will try to find *obj* within our child nodes even if it
        is not a direct descendant of this :py:class:`~.Wikicode` object. If
        *obj* is not in the node list, :py:exc:`ValueError` is raised.
        *obj* can be a string, a :py:class:`~.Node`, or another
        :py:class:`~.Wikicode` object (as created by :py:meth:`get_sections`,
        for example). *value* can be anything parsable by
        :py:func:`.parse_anything`. If *recursive* is ``True``, we will try to
        find *obj* within our child nodes even if it is not a direct descendant
        of this :py:class:`~.Wikicode` object. If *obj* is not found,
        :py:exc:`ValueError` is raised.
        """
        callback = lambda self, i, value: self.insert(i + 1, value)
        self._do_search(obj, recursive, callback, self, value)
        for restype, context, data in self._do_search(obj, recursive):
            if restype in (Node, Wikicode):
                i = context.index(data if restype is Node else data[-1], False)
                context.insert(i + 1, value)
            else:
                obj = str(obj)
                context.nodes = str(context).replace(obj, obj + str(value))

    def replace(self, obj, value, recursive=True):
        """Replace *obj* with *value* in the list of nodes.

        *obj* can be either a string or a :py:class:`~.Node`. *value* can be
        anything parsable by :py:func:`.parse_anything`. If *recursive* is
        ``True``, we will try to find *obj* within our child nodes even if it
        is not a direct descendant of this :py:class:`~.Wikicode` object. If
        *obj* is not in the node list, :py:exc:`ValueError` is raised.
        *obj* can be a string, a :py:class:`~.Node`, or another
        :py:class:`~.Wikicode` object (as created by :py:meth:`get_sections`,
        for example). *value* can be anything parsable by
        :py:func:`.parse_anything`. If *recursive* is ``True``, we will try to
        find *obj* within our child nodes even if it is not a direct descendant
        of this :py:class:`~.Wikicode` object. If *obj* is not found,
        :py:exc:`ValueError` is raised.
        """
        def callback(self, i, value):
            self.nodes.pop(i)
            self.insert(i, value)

        self._do_search(obj, recursive, callback, self, value)
        for restype, context, data in self._do_search(obj, recursive):
            if restype is Node:
                i = context.index(data, False)
                context.nodes.pop(i)
                context.insert(i, value)
            elif restype is Wikicode:
                i = context.index(data[0], False)
                for _ in data:
                    context.nodes.pop(i)
                context.insert(i, value)
            else:
                context.nodes = str(context).replace(str(obj), str(value))

    def append(self, value):
        """Insert *value* at the end of the list of nodes.
@@ -301,15 +341,39 @@ class Wikicode(StringMixIn):
    def remove(self, obj, recursive=True):
        """Remove *obj* from the list of nodes.

        *obj* can be either a string or a :py:class:`~.Node`. If *recursive* is
        ``True``, we will try to find *obj* within our child nodes even if it
        is not a direct descendant of this :py:class:`~.Wikicode` object. If
        *obj* is not in the node list, :py:exc:`ValueError` is raised.
        *obj* can be a string, a :py:class:`~.Node`, or another
        :py:class:`~.Wikicode` object (as created by :py:meth:`get_sections`,
        for example). If *recursive* is ``True``, we will try to find *obj*
        within our child nodes even if it is not a direct descendant of this
        :py:class:`~.Wikicode` object. If *obj* is not found,
        :py:exc:`ValueError` is raised.
        """
        for restype, context, data in self._do_search(obj, recursive):
            if restype is Node:
                context.nodes.pop(context.index(data, False))
            elif restype is Wikicode:
                i = context.index(data[0], False)
                for _ in data:
                    context.nodes.pop(i)
            else:
                context.nodes = str(context).replace(str(obj), "")

    def matches(self, other):
        """Do a loose equivalency test suitable for comparing page names.

        *other* can be any string-like object, including
        :py:class:`~.Wikicode`. This operation is symmetric; both sides are
        adjusted. Specifically, whitespace and markup are stripped and the first
        letter's case is normalized. Typical usage is
        ``if template.name.matches("stub"): ...``.
        """
        callback = lambda self, i: self.nodes.pop(i)
        self._do_search(obj, recursive, callback, self)
        this = self.strip_code().strip()
        that = parse_anything(other).strip_code().strip()
        if not this or not that:
            return this == that
        return this[0].upper() + this[1:] == that[0].upper() + that[1:]

    def ifilter(self, recursive=False, matches=None, flags=FLAGS,
    def ifilter(self, recursive=True, matches=None, flags=FLAGS,
                forcetype=None):
        """Iterate over nodes in our list matching certain conditions.

@@ -327,7 +391,7 @@ class Wikicode(StringMixIn):
                if not matches or re.search(matches, str(node), flags):
                    yield node

    def filter(self, recursive=False, matches=None, flags=FLAGS,
    def filter(self, recursive=True, matches=None, flags=FLAGS,
               forcetype=None):
        """Return a list of nodes within our list matching certain conditions.

@@ -360,9 +424,8 @@ class Wikicode(StringMixIn):
        """
        if matches:
            matches = r"^(=+?)\s*" + matches + r"\s*\1$"
        headings = self.filter_headings(recursive=True)
        filtered = self.filter_headings(recursive=True, matches=matches,
                                        flags=flags)
        headings = self.filter_headings()
        filtered = self.filter_headings(matches=matches, flags=flags)
        if levels:
            filtered = [head for head in filtered if head.level in levels]

@@ -446,6 +509,6 @@ class Wikicode(StringMixIn):
        return "\n".join(self._get_tree(self, [], marker, 0))

 Wikicode._build_filter_methods(
    arguments=Argument, comments=Comment, headings=Heading,
    html_entities=HTMLEntity, tags=Tag, templates=Template, text=Text,
    wikilinks=Wikilink)
    arguments=Argument, comments=Comment, external_links=ExternalLink,
    headings=Heading, html_entities=HTMLEntity, tags=Tag, templates=Template,
    text=Text, wikilinks=Wikilink)
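The loose comparison that ``Wikicode.matches()`` performs in the diff above can be sketched on plain strings. ``loose_match`` is a hypothetical helper written for this note; it assumes both inputs are already plain text, so the ``strip_code()`` step from the real method is left out and only the whitespace and first-letter handling is shown.

```python
def loose_match(this, that):
    """Compare two page-like names, ignoring outer whitespace and the case
    of the first letter only (a simplified sketch, not the library method).
    """
    this = this.strip()
    that = that.strip()
    # If either side is empty there is no first letter to normalize, so
    # fall back to a direct comparison (two empty names match each other).
    if not this or not that:
        return this == that
    # Uppercase only the first character; the rest stays case-sensitive.
    return this[0].upper() + this[1:] == that[0].upper() + that[1:]
```

Under these assumptions ``loose_match("stub", " Stub ")`` holds, while ``loose_match("Stub", "STUB")`` does not, since only the first letter's case is normalized.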
--- a/setup.py
+++ b/setup.py
@@ -29,16 +29,13 @@ from mwparserfromhell.compat import py3k
 with open("README.rst") as fp:
    long_docs = fp.read()

 # builder = Extension("mwparserfromhell.parser._builder",
 #                     sources = ["mwparserfromhell/parser/builder.c"])

 tokenizer = Extension("mwparserfromhell.parser._tokenizer",
                      sources = ["mwparserfromhell/parser/tokenizer.c"])

 setup(
    name = "mwparserfromhell",
    packages = find_packages(exclude=("tests",)),
    ext_modules = [] if py3k else [tokenizer],
    ext_modules = [tokenizer],
    test_suite = "tests",
    version = __version__,
    author = "Ben Kurtovic",
@@ -50,13 +47,13 @@ setup(
    keywords = "earwig mwparserfromhell wikipedia wiki mediawiki wikicode template parsing",
    license = "MIT License",
    classifiers = [
        "Development Status :: 3 - Alpha",
        "Development Status :: 4 - Beta",
        "Environment :: Console",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
        "Programming Language :: Python :: 2.7",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.3",
        "Topic :: Text Processing :: Markup"
    ],
 )
--- a/tests/_test_tree_equality.py
+++ b/tests/_test_tree_equality.py
@@ -91,7 +91,27 @@ class TreeEqualityTestCase(TestCase):

    def assertTagNodeEqual(self, expected, actual):
        """Assert that two Tag nodes have the same data."""
        self.fail("Holding this until feature/html_tags is ready.")
        self.assertWikicodeEqual(expected.tag, actual.tag)
        if expected.contents is not None:
            self.assertWikicodeEqual(expected.contents, actual.contents)
        length = len(expected.attributes)
        self.assertEqual(length, len(actual.attributes))
        for i in range(length):
            exp_attr = expected.attributes[i]
            act_attr = actual.attributes[i]
            self.assertWikicodeEqual(exp_attr.name, act_attr.name)
            if exp_attr.value is not None:
                self.assertWikicodeEqual(exp_attr.value, act_attr.value)
                self.assertIs(exp_attr.quoted, act_attr.quoted)
            self.assertEqual(exp_attr.pad_first, act_attr.pad_first)
            self.assertEqual(exp_attr.pad_before_eq, act_attr.pad_before_eq)
            self.assertEqual(exp_attr.pad_after_eq, act_attr.pad_after_eq)
        self.assertIs(expected.wiki_markup, actual.wiki_markup)
        self.assertIs(expected.self_closing, actual.self_closing)
        self.assertIs(expected.invalid, actual.invalid)
        self.assertIs(expected.implicit, actual.implicit)
        self.assertEqual(expected.padding, actual.padding)
        self.assertWikicodeEqual(expected.closing_tag, actual.closing_tag)

    def assertTemplateNodeEqual(self, expected, actual):
        """Assert that two Template nodes have the same data."""
--- a/tests/test_attribute.py
+++ b/tests/test_attribute.py
@@ -0,0 +1,89 @@
 # -*- coding: utf-8  -*-
 #
 # Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
 # in the Software without restriction, including without limitation the rights
 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the Software is
 # furnished to do so, subject to the following conditions:
 #
 # The above copyright notice and this permission notice shall be included in
 # all copies or substantial portions of the Software.
 #
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.

 from __future__ import unicode_literals
 import unittest

 from mwparserfromhell.compat import str
 from mwparserfromhell.nodes import Template
 from mwparserfromhell.nodes.extras import Attribute

 from ._test_tree_equality import TreeEqualityTestCase, wrap, wraptext

 class TestAttribute(TreeEqualityTestCase):
    """Test cases for the Attribute node extra."""

    def test_unicode(self):
        """test Attribute.__unicode__()"""
        node = Attribute(wraptext("foo"))
        self.assertEqual(" foo", str(node))
        node2 = Attribute(wraptext("foo"), wraptext("bar"))
        self.assertEqual(' foo="bar"', str(node2))
        node3 = Attribute(wraptext("a"), wraptext("b"), True, "", " ", "   ")
        self.assertEqual('a =   "b"', str(node3))
        node3 = Attribute(wraptext("a"), wraptext("b"), False, "", " ", "   ")
        self.assertEqual("a =   b", str(node3))
        node4 = Attribute(wraptext("a"), wrap([]), False, " ", "", " ")
        self.assertEqual(" a= ", str(node4))

    def test_name(self):
        """test getter/setter for the name attribute"""
        name = wraptext("id")
        node = Attribute(name, wraptext("bar"))
        self.assertIs(name, node.name)
        node.name = "{{id}}"
        self.assertWikicodeEqual(wrap([Template(wraptext("id"))]), node.name)

    def test_value(self):
        """test getter/setter for the value attribute"""
        value = wraptext("foo")
        node = Attribute(wraptext("id"), value)
        self.assertIs(value, node.value)
        node.value = "{{bar}}"
        self.assertWikicodeEqual(wrap([Template(wraptext("bar"))]), node.value)
        node.value = None
        self.assertIs(None, node.value)

    def test_quoted(self):
        """test getter/setter for the quoted attribute"""
        node1 = Attribute(wraptext("id"), wraptext("foo"), False)
        node2 = Attribute(wraptext("id"), wraptext("bar"))
        self.assertFalse(node1.quoted)
        self.assertTrue(node2.quoted)
        node1.quoted = True
        node2.quoted = ""
        self.assertTrue(node1.quoted)
        self.assertFalse(node2.quoted)

    def test_padding(self):
        """test getter/setter for the padding attributes"""
        for pad in ["pad_first", "pad_before_eq", "pad_after_eq"]:
            node = Attribute(wraptext("id"), wraptext("foo"), **{pad: "\n"})
            self.assertEqual("\n", getattr(node, pad))
            setattr(node, pad, " ")
            self.assertEqual(" ", getattr(node, pad))
            setattr(node, pad, None)
            self.assertEqual("", getattr(node, pad))
            self.assertRaises(ValueError, setattr, node, pad, True)

 if __name__ == "__main__":
    unittest.main(verbosity=2)
--- a/tests/test_builder.py
+++ b/tests/test_builder.py
@@ -23,8 +23,8 @@
 from __future__ import unicode_literals
 import unittest

 from mwparserfromhell.nodes import (Argument, Comment, Heading, HTMLEntity,
                                    Tag, Template, Text, Wikilink)
 from mwparserfromhell.nodes import (Argument, Comment, ExternalLink, Heading,
                                    HTMLEntity, Tag, Template, Text, Wikilink)
 from mwparserfromhell.nodes.extras import Attribute, Parameter
 from mwparserfromhell.parser import tokens
 from mwparserfromhell.parser.builder import Builder
@@ -72,6 +72,14 @@ class TestBuilder(TreeEqualityTestCase):
             wrap([Template(wraptext("foo"), params=[
                 Parameter(wraptext("bar"), wraptext("baz"))])])),

            ([tokens.TemplateOpen(), tokens.TemplateParamSeparator(),
              tokens.TemplateParamSeparator(), tokens.TemplateParamEquals(),
              tokens.TemplateParamSeparator(), tokens.TemplateClose()],
             wrap([Template(wrap([]), params=[
                 Parameter(wraptext("1"), wrap([]), showkey=False),
                 Parameter(wrap([]), wrap([]), showkey=True),
                 Parameter(wraptext("2"), wrap([]), showkey=False)])])),

            ([tokens.TemplateOpen(), tokens.Text(text="foo"),
              tokens.TemplateParamSeparator(), tokens.Text(text="bar"),
              tokens.TemplateParamEquals(), tokens.Text(text="baz"),
@@ -142,6 +150,48 @@ class TestBuilder(TreeEqualityTestCase):
        for test, valid in tests:
            self.assertWikicodeEqual(valid, self.builder.build(test))

    def test_external_link(self):
        """tests for building ExternalLink nodes"""
        tests = [
            ([tokens.ExternalLinkOpen(brackets=False),
              tokens.Text(text="http://example.com/"),
              tokens.ExternalLinkClose()],
             wrap([ExternalLink(wraptext("http://example.com/"),
                                brackets=False)])),

            ([tokens.ExternalLinkOpen(brackets=True),
              tokens.Text(text="http://example.com/"),
              tokens.ExternalLinkClose()],
             wrap([ExternalLink(wraptext("http://example.com/"))])),

            ([tokens.ExternalLinkOpen(brackets=True),
              tokens.Text(text="http://example.com/"),
              tokens.ExternalLinkSeparator(), tokens.ExternalLinkClose()],
             wrap([ExternalLink(wraptext("http://example.com/"), wrap([]))])),

            ([tokens.ExternalLinkOpen(brackets=True),
              tokens.Text(text="http://example.com/"),
              tokens.ExternalLinkSeparator(), tokens.Text(text="Example"),
              tokens.ExternalLinkClose()],
             wrap([ExternalLink(wraptext("http://example.com/"),
                                wraptext("Example"))])),

            ([tokens.ExternalLinkOpen(brackets=False),
              tokens.Text(text="http://example"), tokens.Text(text=".com/foo"),
              tokens.ExternalLinkClose()],
             wrap([ExternalLink(wraptext("http://example", ".com/foo"),
                                brackets=False)])),

            ([tokens.ExternalLinkOpen(brackets=True),
              tokens.Text(text="http://example"), tokens.Text(text=".com/foo"),
              tokens.ExternalLinkSeparator(), tokens.Text(text="Example"),
              tokens.Text(text=" Web Page"), tokens.ExternalLinkClose()],
             wrap([ExternalLink(wraptext("http://example", ".com/foo"),
                                wraptext("Example", " Web Page"))])),
        ]
        for test, valid in tests:
            self.assertWikicodeEqual(valid, self.builder.build(test))

    def test_html_entity(self):
        """tests for building HTMLEntity nodes"""
        tests = [
@@ -190,6 +240,129 @@ class TestBuilder(TreeEqualityTestCase):
        for test, valid in tests:
            self.assertWikicodeEqual(valid, self.builder.build(test))

    def test_tag(self):
        """tests for building Tag nodes"""
        tests = [
            # <ref></ref>
            ([tokens.TagOpenOpen(), tokens.Text(text="ref"),
              tokens.TagCloseOpen(padding=""), tokens.TagOpenClose(),
              tokens.Text(text="ref"), tokens.TagCloseClose()],
             wrap([Tag(wraptext("ref"), wrap([]),
                       closing_tag=wraptext("ref"))])),

            # <ref name></ref>
            ([tokens.TagOpenOpen(), tokens.Text(text="ref"),
              tokens.TagAttrStart(pad_first=" ", pad_before_eq="",
                                  pad_after_eq=""),
              tokens.Text(text="name"), tokens.TagCloseOpen(padding=""),
              tokens.TagOpenClose(), tokens.Text(text="ref"),
              tokens.TagCloseClose()],
             wrap([Tag(wraptext("ref"), wrap([]),
                      attrs=[Attribute(wraptext("name"))])])),

            # <ref name="abc" />
            ([tokens.TagOpenOpen(), tokens.Text(text="ref"),
              tokens.TagAttrStart(pad_first=" ", pad_before_eq="",
                                  pad_after_eq=""),
              tokens.Text(text="name"), tokens.TagAttrEquals(),
              tokens.TagAttrQuote(), tokens.Text(text="abc"),
              tokens.TagCloseSelfclose(padding=" ")],
             wrap([Tag(wraptext("ref"),
                       attrs=[Attribute(wraptext("name"), wraptext("abc"))],
                       self_closing=True, padding=" ")])),

            # <br/>
            ([tokens.TagOpenOpen(), tokens.Text(text="br"),
              tokens.TagCloseSelfclose(padding="")],
             wrap([Tag(wraptext("br"), self_closing=True)])),

            # <li>
            ([tokens.TagOpenOpen(), tokens.Text(text="li"),
              tokens.TagCloseSelfclose(padding="", implicit=True)],
             wrap([Tag(wraptext("li"), self_closing=True, implicit=True)])),

            # </br>
            ([tokens.TagOpenOpen(invalid=True), tokens.Text(text="br"),
              tokens.TagCloseSelfclose(padding="", implicit=True)],
             wrap([Tag(wraptext("br"), self_closing=True, invalid=True,
                       implicit=True)])),

            # </br/>
            ([tokens.TagOpenOpen(invalid=True), tokens.Text(text="br"),
              tokens.TagCloseSelfclose(padding="")],
             wrap([Tag(wraptext("br"), self_closing=True, invalid=True)])),

            # <ref name={{abc}}   foo="bar {{baz}}" abc={{de}}f ghi=j{{k}}{{l}}
            #      mno =  "{{p}} [[q]] {{r}}">[[Source]]</ref>
            ([tokens.TagOpenOpen(), tokens.Text(text="ref"),
              tokens.TagAttrStart(pad_first=" ", pad_before_eq="",
                                  pad_after_eq=""),
              tokens.Text(text="name"), tokens.TagAttrEquals(),
              tokens.TemplateOpen(), tokens.Text(text="abc"),
              tokens.TemplateClose(),
              tokens.TagAttrStart(pad_first="   ", pad_before_eq="",
                                  pad_after_eq=""),
              tokens.Text(text="foo"), tokens.TagAttrEquals(),
              tokens.TagAttrQuote(), tokens.Text(text="bar "),
              tokens.TemplateOpen(), tokens.Text(text="baz"),
              tokens.TemplateClose(),
              tokens.TagAttrStart(pad_first=" ", pad_before_eq="",
                                  pad_after_eq=""),
              tokens.Text(text="abc"), tokens.TagAttrEquals(),
              tokens.TemplateOpen(), tokens.Text(text="de"),
              tokens.TemplateClose(), tokens.Text(text="f"),
              tokens.TagAttrStart(pad_first=" ", pad_before_eq="",
                                  pad_after_eq=""),
              tokens.Text(text="ghi"), tokens.TagAttrEquals(),
              tokens.Text(text="j"), tokens.TemplateOpen(),
              tokens.Text(text="k"), tokens.TemplateClose(),
              tokens.TemplateOpen(), tokens.Text(text="l"),
              tokens.TemplateClose(),
              tokens.TagAttrStart(pad_first=" \n ", pad_before_eq=" ",
                                  pad_after_eq="  "),
              tokens.Text(text="mno"), tokens.TagAttrEquals(),
              tokens.TagAttrQuote(), tokens.TemplateOpen(),
              tokens.Text(text="p"), tokens.TemplateClose(),
              tokens.Text(text=" "), tokens.WikilinkOpen(),
              tokens.Text(text="q"), tokens.WikilinkClose(),
              tokens.Text(text=" "), tokens.TemplateOpen(),
              tokens.Text(text="r"), tokens.TemplateClose(),
              tokens.TagCloseOpen(padding=""), tokens.WikilinkOpen(),
              tokens.Text(text="Source"), tokens.WikilinkClose(),
              tokens.TagOpenClose(), tokens.Text(text="ref"),
              tokens.TagCloseClose()],
             wrap([Tag(wraptext("ref"), wrap([Wikilink(wraptext("Source"))]), [
                    Attribute(wraptext("name"),
                              wrap([Template(wraptext("abc"))]), False),
                    Attribute(wraptext("foo"), wrap([Text("bar "),
                              Template(wraptext("baz"))]), pad_first="   "),
                    Attribute(wraptext("abc"), wrap([Template(wraptext("de")),
                              Text("f")]), False),
                    Attribute(wraptext("ghi"), wrap([Text("j"),
                              Template(wraptext("k")),
                              Template(wraptext("l"))]), False),
                    Attribute(wraptext("mno"), wrap([Template(wraptext("p")),
                              Text(" "), Wikilink(wraptext("q")), Text(" "),
                              Template(wraptext("r"))]), True, " \n ", " ",
                              "  ")])])),

            # "''italic text''"
            ([tokens.TagOpenOpen(wiki_markup="''"), tokens.Text(text="i"),
              tokens.TagCloseOpen(), tokens.Text(text="italic text"),
              tokens.TagOpenClose(), tokens.Text(text="i"),
              tokens.TagCloseClose()],
             wrap([Tag(wraptext("i"), wraptext("italic text"),
                       wiki_markup="''")])),

            # * bullet
            ([tokens.TagOpenOpen(wiki_markup="*"), tokens.Text(text="li"),
              tokens.TagCloseSelfclose(), tokens.Text(text=" bullet")],
             wrap([Tag(wraptext("li"), wiki_markup="*", self_closing=True),
                   Text(" bullet")])),
        ]
        for test, valid in tests:
            self.assertWikicodeEqual(valid, self.builder.build(test))

    def test_integration(self):
        """a test for building a combination of templates together"""
        # {{{{{{{{foo}}bar|baz=biz}}buzz}}usr|{{bin}}}}
--- a/tests/test_docs.py
+++ b/tests/test_docs.py
@@ -61,36 +61,36 @@ class TestDocs(unittest.TestCase):

    def test_readme_2(self):
        """test a block of example code in the README"""
        text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
        temps = mwparserfromhell.parse(text).filter_templates()
        if py3k:
            res = "['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']"
        else:
            res = "[u'{{foo|{{bar}}={{baz|{{spam}}}}}}', u'{{bar}}', u'{{baz|{{spam}}}}', u'{{spam}}']"
        self.assertPrint(temps, res)

    def test_readme_3(self):
        """test a block of example code in the README"""
        code = mwparserfromhell.parse("{{foo|this {{includes a|template}}}}")
        if py3k:
            self.assertPrint(code.filter_templates(),
            self.assertPrint(code.filter_templates(recursive=False),
                             "['{{foo|this {{includes a|template}}}}']")
        else:
            self.assertPrint(code.filter_templates(),
            self.assertPrint(code.filter_templates(recursive=False),
                             "[u'{{foo|this {{includes a|template}}}}']")
        foo = code.filter_templates()[0]
        foo = code.filter_templates(recursive=False)[0]
        self.assertPrint(foo.get(1).value, "this {{includes a|template}}")
        self.assertPrint(foo.get(1).value.filter_templates()[0],
                         "{{includes a|template}}")
        self.assertPrint(foo.get(1).value.filter_templates()[0].get(1).value,
                         "template")

    def test_readme_3(self):
        """test a block of example code in the README"""
        text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
        temps = mwparserfromhell.parse(text).filter_templates(recursive=True)
        if py3k:
            res = "['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']"
        else:
            res = "[u'{{foo|{{bar}}={{baz|{{spam}}}}}}', u'{{bar}}', u'{{baz|{{spam}}}}', u'{{spam}}']"
        self.assertPrint(temps, res)

    def test_readme_4(self):
        """test a block of example code in the README"""
        text = "{{cleanup}} '''Foo''' is a [[bar]]. {{uncategorized}}"
        code = mwparserfromhell.parse(text)
        for template in code.filter_templates():
            if template.name == "cleanup" and not template.has_param("date"):
            if template.name.matches("Cleanup") and not template.has("date"):
                template.add("date", "July 2012")
        res = "{{cleanup|date=July 2012}} '''Foo''' is a [[bar]]. {{uncategorized}}"
        self.assertPrint(code, res)
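
The switch from `template.name == "cleanup"` to `template.name.matches("Cleanup")` in test_readme_4 relies on a name comparison that tolerates a different case in the first letter. The sketch below illustrates that idea in plain Python. It is not the library's implementation: `names_match` is a hypothetical helper, and the rule it applies (strip surrounding whitespace, ignore case of the first character only) is an assumption based on how MediaWiki treats page titles.

```python
def names_match(this, that):
    """Hypothetical stand-in for a page/template name comparison.

    Assumption: surrounding whitespace is ignored and only the first
    character is compared case-insensitively.
    """
    this, that = this.strip(), that.strip()
    if not this or not that:
        # Empty names only match other empty names.
        return this == that
    return this[0].upper() + this[1:] == that[0].upper() + that[1:]


# "cleanup" and "Cleanup" name the same template under this rule,
# while a case difference later in the name still counts as a mismatch.
same = names_match("cleanup", "Cleanup")
different = names_match("cleanup", "CleanUp")
```

Under this assumption the plain `==` check in the old test only worked because the sample text happened to use the lowercase spelling.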
--- a/tests/test_external_link.py
+++ b/tests/test_external_link.py
@@ -0,0 +1,130 @@
 # -*- coding: utf-8  -*-
 #
 # Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
 # in the Software without restriction, including without limitation the rights
 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the Software is
 # furnished to do so, subject to the following conditions:
 #
 # The above copyright notice and this permission notice shall be included in
 # all copies or substantial portions of the Software.
 #
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.

 from __future__ import unicode_literals
 import unittest

 from mwparserfromhell.compat import str
 from mwparserfromhell.nodes import ExternalLink, Text

 from ._test_tree_equality import TreeEqualityTestCase, getnodes, wrap, wraptext

 class TestExternalLink(TreeEqualityTestCase):
    """Test cases for the ExternalLink node."""

    def test_unicode(self):
        """test ExternalLink.__unicode__()"""
        node = ExternalLink(wraptext("http://example.com/"), brackets=False)
        self.assertEqual("http://example.com/", str(node))
        node2 = ExternalLink(wraptext("http://example.com/"))
        self.assertEqual("[http://example.com/]", str(node2))
        node3 = ExternalLink(wraptext("http://example.com/"), wrap([]))
        self.assertEqual("[http://example.com/ ]", str(node3))
        node4 = ExternalLink(wraptext("http://example.com/"),
                             wraptext("Example Web Page"))
        self.assertEqual("[http://example.com/ Example Web Page]", str(node4))

    def test_iternodes(self):
        """test ExternalLink.__iternodes__()"""
        node1n1 = Text("http://example.com/")
        node2n1 = Text("http://example.com/")
        node2n2, node2n3 = Text("Example"), Text("Page")
        node1 = ExternalLink(wrap([node1n1]), brackets=False)
        node2 = ExternalLink(wrap([node2n1]), wrap([node2n2, node2n3]))
        gen1 = node1.__iternodes__(getnodes)
        gen2 = node2.__iternodes__(getnodes)
        self.assertEqual((None, node1), next(gen1))
        self.assertEqual((None, node2), next(gen2))
        self.assertEqual((node1.url, node1n1), next(gen1))
        self.assertEqual((node2.url, node2n1), next(gen2))
        self.assertEqual((node2.title, node2n2), next(gen2))
        self.assertEqual((node2.title, node2n3), next(gen2))
        self.assertRaises(StopIteration, next, gen1)
        self.assertRaises(StopIteration, next, gen2)

    def test_strip(self):
        """test ExternalLink.__strip__()"""
        node1 = ExternalLink(wraptext("http://example.com"), brackets=False)
        node2 = ExternalLink(wraptext("http://example.com"))
        node3 = ExternalLink(wraptext("http://example.com"), wrap([]))
        node4 = ExternalLink(wraptext("http://example.com"), wraptext("Link"))
        for a in (True, False):
            for b in (True, False):
                self.assertEqual("http://example.com", node1.__strip__(a, b))
                self.assertEqual(None, node2.__strip__(a, b))
                self.assertEqual(None, node3.__strip__(a, b))
                self.assertEqual("Link", node4.__strip__(a, b))

    def test_showtree(self):
        """test ExternalLink.__showtree__()"""
        output = []
        getter, marker = object(), object()
        get = lambda code: output.append((getter, code))
        mark = lambda: output.append(marker)
        node1 = ExternalLink(wraptext("http://example.com"), brackets=False)
        node2 = ExternalLink(wraptext("http://example.com"), wraptext("Link"))
        node1.__showtree__(output.append, get, mark)
        node2.__showtree__(output.append, get, mark)
        valid = [
            (getter, node1.url), "[", (getter, node2.url),
            (getter, node2.title), "]"]
        self.assertEqual(valid, output)

    def test_url(self):
        """test getter/setter for the url attribute"""
        url = wraptext("http://example.com/")
        node1 = ExternalLink(url, brackets=False)
        node2 = ExternalLink(url, wraptext("Example"))
        self.assertIs(url, node1.url)
        self.assertIs(url, node2.url)
        node1.url = "mailto:héhehé@spam.com"
        node2.url = "mailto:héhehé@spam.com"
        self.assertWikicodeEqual(wraptext("mailto:héhehé@spam.com"), node1.url)
        self.assertWikicodeEqual(wraptext("mailto:héhehé@spam.com"), node2.url)

    def test_title(self):
        """test getter/setter for the title attribute"""
        title = wraptext("Example!")
        node1 = ExternalLink(wraptext("http://example.com/"), brackets=False)
        node2 = ExternalLink(wraptext("http://example.com/"), title)
        self.assertIs(None, node1.title)
        self.assertIs(title, node2.title)
        node2.title = None
        self.assertIs(None, node2.title)
        node2.title = "My Website"
        self.assertWikicodeEqual(wraptext("My Website"), node2.title)

    def test_brackets(self):
        """test getter/setter for the brackets attribute"""
        node1 = ExternalLink(wraptext("http://example.com/"), brackets=False)
        node2 = ExternalLink(wraptext("http://example.com/"), wraptext("Link"))
        self.assertFalse(node1.brackets)
        self.assertTrue(node2.brackets)
        node1.brackets = True
        node2.brackets = False
        self.assertTrue(node1.brackets)
        self.assertFalse(node2.brackets)
        self.assertEqual("[http://example.com/]", str(node1))
        self.assertEqual("http://example.com/", str(node2))

 if __name__ == "__main__":
    unittest.main(verbosity=2)
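
The string forms asserted in test_unicode, test_strip, and test_brackets above can be summarised in a few lines of plain Python. This is a sketch only: `render_external_link` and `strip_external_link` are hypothetical helpers that work on plain strings, not the ExternalLink node itself, and they reproduce just the outcomes the tests check.

```python
def render_external_link(url, title=None, brackets=True):
    """Hypothetical helper: build the wikitext form of an external link."""
    if not brackets:
        # A bare URL is emitted as-is; any title is not rendered.
        return url
    if title is None:
        return "[" + url + "]"
    # An empty title still produces the separating space: "[url ]".
    return "[" + url + " " + title + "]"


def strip_external_link(url, title=None, brackets=True):
    """Hypothetical helper: the text left after stripping markup."""
    if brackets:
        # Bracketed links display their title; with no title (or an
        # empty one) nothing is left.
        return title if title else None
    return url
```

Note the asymmetry the tests pin down: a bracketed link without a title renders as `[url]` but strips to nothing, while a bare URL survives stripping unchanged.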
--- a/tests/test_parser.py
+++ b/tests/test_parser.py
@@ -36,9 +36,9 @@ class TestParser(TreeEqualityTestCase):
    def test_use_c(self):
        """make sure the correct tokenizer is used"""
        if parser.use_c:
            self.assertTrue(parser.Parser(None)._tokenizer.USES_C)
            self.assertTrue(parser.Parser()._tokenizer.USES_C)
            parser.use_c = False
        self.assertFalse(parser.Parser(None)._tokenizer.USES_C)
        self.assertFalse(parser.Parser()._tokenizer.USES_C)

    def test_parsing(self):
        """integration test for parsing overall"""
@@ -59,7 +59,7 @@ class TestParser(TreeEqualityTestCase):
                ]))
            ])
        ])
        actual = parser.Parser(text).parse()
        actual = parser.Parser().parse(text)
        self.assertWikicodeEqual(expected, actual)

 if __name__ == "__main__":
--- a/tests/test_tag.py
+++ b/tests/test_tag.py
@@ -0,0 +1,315 @@
 # -*- coding: utf-8  -*-
 #
 # Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
 # in the Software without restriction, including without limitation the rights
 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the Software is
 # furnished to do so, subject to the following conditions:
 #
 # The above copyright notice and this permission notice shall be included in
 # all copies or substantial portions of the Software.
 #
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.

 from __future__ import unicode_literals
 import unittest

 from mwparserfromhell.compat import str
 from mwparserfromhell.nodes import Tag, Template, Text
 from mwparserfromhell.nodes.extras import Attribute
 from ._test_tree_equality import TreeEqualityTestCase, getnodes, wrap, wraptext

 agen = lambda name, value: Attribute(wraptext(name), wraptext(value))
 agennq = lambda name, value: Attribute(wraptext(name), wraptext(value), False)
 agenp = lambda name, v, a, b, c: Attribute(wraptext(name), v, True, a, b, c)
 agenpnv = lambda name, a, b, c: Attribute(wraptext(name), None, True, a, b, c)

 class TestTag(TreeEqualityTestCase):
    """Test cases for the Tag node."""

    def test_unicode(self):
        """test Tag.__unicode__()"""
        node1 = Tag(wraptext("ref"))
        node2 = Tag(wraptext("span"), wraptext("foo"),
                    [agen("style", "color: red;")])
        node3 = Tag(wraptext("ref"),
                    attrs=[agennq("name", "foo"),
                           agenpnv("some_attr", "   ", "", "")],
                    self_closing=True)
        node4 = Tag(wraptext("br"), self_closing=True, padding=" ")
        node5 = Tag(wraptext("br"), self_closing=True, implicit=True)
        node6 = Tag(wraptext("br"), self_closing=True, invalid=True,
                    implicit=True)
        node7 = Tag(wraptext("br"), self_closing=True, invalid=True,
                    padding=" ")
        node8 = Tag(wraptext("hr"), wiki_markup="----", self_closing=True)
        node9 = Tag(wraptext("i"), wraptext("italics!"), wiki_markup="''")

        self.assertEqual("<ref></ref>", str(node1))
        self.assertEqual('<span style="color: red;">foo</span>', str(node2))
        self.assertEqual("<ref name=foo   some_attr/>", str(node3))
        self.assertEqual("<br />", str(node4))
        self.assertEqual("<br>", str(node5))
        self.assertEqual("</br>", str(node6))
        self.assertEqual("</br />", str(node7))
        self.assertEqual("----", str(node8))
        self.assertEqual("''italics!''", str(node9))

    def test_iternodes(self):
        """test Tag.__iternodes__()"""
        node1n1, node1n2 = Text("ref"), Text("foobar")
        node2n1, node3n1, node3n2 = Text("bold text"), Text("img"), Text("id")
        node3n3, node3n4, node3n5 = Text("foo"), Text("class"), Text("bar")

        # <ref>foobar</ref>
        node1 = Tag(wrap([node1n1]), wrap([node1n2]))
        # '''bold text'''
        node2 = Tag(wraptext("b"), wrap([node2n1]), wiki_markup="'''")
        # <img id="foo" class="bar" />
        node3 = Tag(wrap([node3n1]),
                    attrs=[Attribute(wrap([node3n2]), wrap([node3n3])),
                           Attribute(wrap([node3n4]), wrap([node3n5]))],
                    self_closing=True, padding=" ")

        gen1 = node1.__iternodes__(getnodes)
        gen2 = node2.__iternodes__(getnodes)
        gen3 = node3.__iternodes__(getnodes)
        self.assertEqual((None, node1), next(gen1))
        self.assertEqual((None, node2), next(gen2))
        self.assertEqual((None, node3), next(gen3))
        self.assertEqual((node1.tag, node1n1), next(gen1))
        self.assertEqual((node3.tag, node3n1), next(gen3))
        self.assertEqual((node3.attributes[0].name, node3n2), next(gen3))
        self.assertEqual((node3.attributes[0].value, node3n3), next(gen3))
        self.assertEqual((node3.attributes[1].name, node3n4), next(gen3))
        self.assertEqual((node3.attributes[1].value, node3n5), next(gen3))
        self.assertEqual((node1.contents, node1n2), next(gen1))
        self.assertEqual((node2.contents, node2n1), next(gen2))
        self.assertEqual((node1.closing_tag, node1n1), next(gen1))
        self.assertRaises(StopIteration, next, gen1)
        self.assertRaises(StopIteration, next, gen2)
        self.assertRaises(StopIteration, next, gen3)

    def test_strip(self):
        """test Tag.__strip__()"""
        node1 = Tag(wraptext("i"), wraptext("foobar"))
        node2 = Tag(wraptext("math"), wraptext("foobar"))
        node3 = Tag(wraptext("br"), self_closing=True)
        for a in (True, False):
            for b in (True, False):
                self.assertEqual("foobar", node1.__strip__(a, b))
                self.assertEqual(None, node2.__strip__(a, b))
                self.assertEqual(None, node3.__strip__(a, b))

    def test_showtree(self):
        """test Tag.__showtree__()"""
        output = []
        getter, marker = object(), object()
        get = lambda code: output.append((getter, code))
        mark = lambda: output.append(marker)
        node1 = Tag(wraptext("ref"), wraptext("text"), [agen("name", "foo")])
        node2 = Tag(wraptext("br"), self_closing=True, padding=" ")
        node3 = Tag(wraptext("br"), self_closing=True, invalid=True,
                    implicit=True, padding=" ")
        node1.__showtree__(output.append, get, mark)
        node2.__showtree__(output.append, get, mark)
        node3.__showtree__(output.append, get, mark)
        valid = [
            "<", (getter, node1.tag), (getter, node1.attributes[0].name),
            "    = ", marker, (getter, node1.attributes[0].value), ">",
            (getter, node1.contents), "</", (getter, node1.closing_tag), ">",
            "<", (getter, node2.tag), "/>", "</", (getter, node3.tag), ">"]
        self.assertEqual(valid, output)

    def test_tag(self):
        """test getter/setter for the tag attribute"""
        tag = wraptext("ref")
        node = Tag(tag, wraptext("text"))
        self.assertIs(tag, node.tag)
        self.assertIs(tag, node.closing_tag)
        node.tag = "span"
        self.assertWikicodeEqual(wraptext("span"), node.tag)
        self.assertWikicodeEqual(wraptext("span"), node.closing_tag)
        self.assertEqual("<span>text</span>", node)

    def test_contents(self):
        """test getter/setter for the contents attribute"""
        contents = wraptext("text")
        node = Tag(wraptext("ref"), contents)
        self.assertIs(contents, node.contents)
        node.contents = "text and a {{template}}"
        parsed = wrap([Text("text and a "), Template(wraptext("template"))])
        self.assertWikicodeEqual(parsed, node.contents)
        self.assertEqual("<ref>text and a {{template}}</ref>", node)

    def test_attributes(self):
        """test getter for the attributes attribute"""
        attrs = [agen("name", "bar")]
        node1 = Tag(wraptext("ref"), wraptext("foo"))
        node2 = Tag(wraptext("ref"), wraptext("foo"), attrs)
        self.assertEqual([], node1.attributes)
        self.assertIs(attrs, node2.attributes)

    def test_wiki_markup(self):
        """test getter/setter for the wiki_markup attribute"""
        node = Tag(wraptext("i"), wraptext("italic text"))
        self.assertIs(None, node.wiki_markup)
        node.wiki_markup = "''"
        self.assertEqual("''", node.wiki_markup)
        self.assertEqual("''italic text''", node)
        node.wiki_markup = False
        self.assertFalse(node.wiki_markup)
        self.assertEqual("<i>italic text</i>", node)

    def test_self_closing(self):
        """test getter/setter for the self_closing attribute"""
        node = Tag(wraptext("ref"), wraptext("foobar"))
        self.assertFalse(node.self_closing)
        node.self_closing = True
        self.assertTrue(node.self_closing)
        self.assertEqual("<ref/>", node)
        node.self_closing = 0
        self.assertFalse(node.self_closing)
        self.assertEqual("<ref>foobar</ref>", node)

    def test_invalid(self):
        """test getter/setter for the invalid attribute"""
        node = Tag(wraptext("br"), self_closing=True, implicit=True)
        self.assertFalse(node.invalid)
        node.invalid = True
        self.assertTrue(node.invalid)
        self.assertEqual("</br>", node)
        node.invalid = 0
        self.assertFalse(node.invalid)
        self.assertEqual("<br>", node)

    def test_implicit(self):
        """test getter/setter for the implicit attribute"""
        node = Tag(wraptext("br"), self_closing=True)
        self.assertFalse(node.implicit)
        node.implicit = True
        self.assertTrue(node.implicit)
        self.assertEqual("<br>", node)
        node.implicit = 0
        self.assertFalse(node.implicit)
        self.assertEqual("<br/>", node)

    def test_padding(self):
        """test getter/setter for the padding attribute"""
        node = Tag(wraptext("ref"), wraptext("foobar"))
        self.assertEqual("", node.padding)
        node.padding = "  "
        self.assertEqual("  ", node.padding)
        self.assertEqual("<ref  >foobar</ref>", node)
        node.padding = None
        self.assertEqual("", node.padding)
        self.assertEqual("<ref>foobar</ref>", node)
        self.assertRaises(ValueError, setattr, node, "padding", True)

    def test_closing_tag(self):
        """test getter/setter for the closing_tag attribute"""
        tag = wraptext("ref")
        node = Tag(tag, wraptext("foobar"))
        self.assertIs(tag, node.closing_tag)
        node.closing_tag = "ref {{ignore me}}"
        parsed = wrap([Text("ref "), Template(wraptext("ignore me"))])
        self.assertWikicodeEqual(parsed, node.closing_tag)
        self.assertEqual("<ref>foobar</ref {{ignore me}}>", node)

    def test_has(self):
        """test Tag.has()"""
        node = Tag(wraptext("ref"), wraptext("cite"), [agen("name", "foo")])
        self.assertTrue(node.has("name"))
        self.assertTrue(node.has("  name  "))
        self.assertTrue(node.has(wraptext("name")))
        self.assertFalse(node.has("Name"))
        self.assertFalse(node.has("foo"))

        attrs = [agen("id", "foo"), agenp("class", "bar", "  ", "\n", "\n"),
                 agen("foo", "bar"), agenpnv("foo", " ", "  \n ", " \t")]
        node2 = Tag(wraptext("div"), attrs=attrs, self_closing=True)
        self.assertTrue(node2.has("id"))
        self.assertTrue(node2.has("class"))
        self.assertTrue(node2.has(attrs[1].pad_first + str(attrs[1].name) +
                                  attrs[1].pad_before_eq))
        self.assertTrue(node2.has(attrs[3]))
        self.assertTrue(node2.has(str(attrs[3])))
        self.assertFalse(node2.has("idclass"))
        self.assertFalse(node2.has("id class"))
        self.assertFalse(node2.has("id=foo"))

    def test_get(self):
        """test Tag.get()"""
        attrs = [agen("name", "foo")]
        node = Tag(wraptext("ref"), wraptext("cite"), attrs)
        self.assertIs(attrs[0], node.get("name"))
        self.assertIs(attrs[0], node.get("  name  "))
        self.assertIs(attrs[0], node.get(wraptext("name")))
        self.assertRaises(ValueError, node.get, "Name")
        self.assertRaises(ValueError, node.get, "foo")

        attrs = [agen("id", "foo"), agenp("class", "bar", "  ", "\n", "\n"),
                 agen("foo", "bar"), agenpnv("foo", " ", "  \n ", " \t")]
        node2 = Tag(wraptext("div"), attrs=attrs, self_closing=True)
        self.assertIs(attrs[0], node2.get("id"))
        self.assertIs(attrs[1], node2.get("class"))
        self.assertIs(attrs[1], node2.get(
            attrs[1].pad_first + str(attrs[1].name) + attrs[1].pad_before_eq))
        self.assertIs(attrs[3], node2.get(attrs[3]))
        self.assertIs(attrs[3], node2.get(str(attrs[3])))
        self.assertIs(attrs[3], node2.get(" foo"))
        self.assertRaises(ValueError, node2.get, "idclass")
        self.assertRaises(ValueError, node2.get, "id class")
        self.assertRaises(ValueError, node2.get, "id=foo")

    def test_add(self):
        """test Tag.add()"""
        node = Tag(wraptext("ref"), wraptext("cite"))
        node.add("name", "value")
        node.add("name", "value", quoted=False)
        node.add("name")
        node.add(1, False)
        node.add("style", "{{foobar}}")
        node.add("name", "value", True, "\n", " ", "   ")
        attr1 = ' name="value"'
        attr2 = " name=value"
        attr3 = " name"
        attr4 = ' 1="False"'
        attr5 = ' style="{{foobar}}"'
        attr6 = '\nname =   "value"'
        self.assertEqual(attr1, node.attributes[0])
        self.assertEqual(attr2, node.attributes[1])
        self.assertEqual(attr3, node.attributes[2])
        self.assertEqual(attr4, node.attributes[3])
        self.assertEqual(attr5, node.attributes[4])
        self.assertEqual(attr6, node.attributes[5])
        self.assertEqual(attr6, node.get("name"))
        self.assertWikicodeEqual(wrap([Template(wraptext("foobar"))]),
                                 node.attributes[4].value)
        self.assertEqual("".join(("<ref", attr1, attr2, attr3, attr4, attr5,
                                  attr6, ">cite</ref>")), node)

    def test_remove(self):
        """test Tag.remove()"""
        attrs = [agen("id", "foo"), agenp("class", "bar", "  ", "\n", "\n"),
                 agen("foo", "bar"), agenpnv("foo", " ", "  \n ", " \t")]
        node = Tag(wraptext("div"), attrs=attrs, self_closing=True)
        node.remove("class")
        self.assertEqual('<div id="foo" foo="bar" foo  \n />', node)
        node.remove("foo")
        self.assertEqual('<div id="foo"/>', node)
        self.assertRaises(ValueError, node.remove, "foo")
        node.remove("id")
        self.assertEqual('<div/>', node)

 if __name__ == "__main__":
    unittest.main(verbosity=2)
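
test_has, test_get, and test_remove above fix three lookup rules for tag attributes: the requested name is stripped of surrounding whitespace, the comparison is case-sensitive, and when several attributes share a name the last one is returned while removal drops all of them. The sketch below models those rules on a list of `(name, value)` tuples. `find_attribute` and `remove_attribute` are hypothetical names, and the tuple representation is an assumption made for illustration; the real Tag node stores Attribute objects.

```python
def find_attribute(attributes, name):
    """Return the last (name, value) pair matching the stripped *name*."""
    wanted = name.strip()
    for attr_name, value in reversed(attributes):
        if attr_name == wanted:  # case-sensitive on purpose
            return (attr_name, value)
    raise ValueError(name)


def remove_attribute(attributes, name):
    """Return a new list without any attribute called *name*."""
    wanted = name.strip()
    kept = [pair for pair in attributes if pair[0] != wanted]
    if len(kept) == len(attributes):
        # Nothing matched, mirroring the ValueError in test_remove.
        raise ValueError(name)
    return kept


attrs = [("id", "foo"), ("class", "bar"), ("foo", "bar"), ("foo", None)]
last_foo = find_attribute(attrs, " foo")
without_foo = remove_attribute(attrs, "foo")
```

Unlike Tag.remove(), which the tests show modifying the node in place, this sketch returns a new list so the original data stays available for comparison.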
--- a/tests/test_template.py
+++ b/tests/test_template.py
@@ -115,23 +115,23 @@ class TestTemplate(TreeEqualityTestCase):
        self.assertEqual([], node1.params)
        self.assertIs(plist, node2.params)

    def test_has_param(self):
        """test Template.has_param()"""
    def test_has(self):
        """test Template.has()"""
        node1 = Template(wraptext("foobar"))
        node2 = Template(wraptext("foo"),
                         [pgenh("1", "bar"), pgens("\nabc ", "def")])
        node3 = Template(wraptext("foo"),
                         [pgenh("1", "a"), pgens("b", "c"), pgens("1", "d")])
        node4 = Template(wraptext("foo"), [pgenh("1", "a"), pgens("b", " ")])
        self.assertFalse(node1.has_param("foobar"))
        self.assertTrue(node2.has_param(1))
        self.assertTrue(node2.has_param("abc"))
        self.assertFalse(node2.has_param("def"))
        self.assertTrue(node3.has_param("1"))
        self.assertTrue(node3.has_param(" b "))
        self.assertFalse(node4.has_param("b"))
        self.assertTrue(node3.has_param("b", False))
        self.assertTrue(node4.has_param("b", False))
        self.assertFalse(node1.has("foobar"))
        self.assertTrue(node2.has(1))
        self.assertTrue(node2.has("abc"))
        self.assertFalse(node2.has("def"))
        self.assertTrue(node3.has("1"))
        self.assertTrue(node3.has(" b "))
        self.assertFalse(node4.has("b"))
        self.assertTrue(node3.has("b", False))
        self.assertTrue(node4.has("b", False))

    def test_get(self):
        """test Template.get()"""
--- a/tests/test_tokens.py
+++ b/tests/test_tokens.py
@@ -44,8 +44,8 @@ class TestTokens(unittest.TestCase):

        self.assertEqual("bar", token2.foo)
        self.assertEqual(123, token2.baz)
        self.assertRaises(KeyError, lambda: token1.foo)
        self.assertRaises(KeyError, lambda: token2.bar)
        self.assertFalse(token1.foo)
        self.assertFalse(token2.bar)

        token1.spam = "eggs"
        token2.foo = "ham"
@@ -53,7 +53,7 @@ class TestTokens(unittest.TestCase):

        self.assertEqual("eggs", token1.spam)
        self.assertEqual("ham", token2.foo)
        self.assertRaises(KeyError, lambda: token2.baz)
        self.assertFalse(token2.baz)
        self.assertRaises(KeyError, delattr, token2, "baz")

    def test_repr(self):
--- a/tests/test_wikicode.py
+++ b/tests/test_wikicode.py
@@ -21,6 +21,7 @@
 # SOFTWARE.

 from __future__ import unicode_literals
 from functools import partial
 import re
 from types import GeneratorType
 import unittest
@@ -122,66 +123,99 @@ class TestWikicode(TreeEqualityTestCase):
        code3.insert(-1000, "derp")
        self.assertEqual("derp{{foo}}bar[[baz]]", code3)

    def _test_search(self, meth, expected):
        """Base test for insert_before(), insert_after(), and replace()."""
        code = parse("{{a}}{{b}}{{c}}{{d}}{{e}}")
        func = partial(meth, code)
        func("{{b}}", "x", recursive=True)
        func("{{d}}", "[[y]]", recursive=False)
        func(code.get(2), "z")
        self.assertEqual(expected[0], code)
        self.assertRaises(ValueError, func, "{{r}}", "n", recursive=True)
        self.assertRaises(ValueError, func, "{{r}}", "n", recursive=False)
        fake = parse("{{a}}").get(0)
        self.assertRaises(ValueError, func, fake, "n", recursive=True)
        self.assertRaises(ValueError, func, fake, "n", recursive=False)

        code2 = parse("{{a}}{{a}}{{a}}{{b}}{{b}}{{b}}")
        func = partial(meth, code2)
        func(code2.get(1), "c", recursive=False)
        func("{{a}}", "d", recursive=False)
        func(code2.get(-1), "e", recursive=True)
        func("{{b}}", "f", recursive=True)
        self.assertEqual(expected[1], code2)

        code3 = parse("{{a|{{b}}|{{c|d={{f}}}}}}")
        func = partial(meth, code3)
        obj = code3.get(0).params[0].value.get(0)
        self.assertRaises(ValueError, func, obj, "x", recursive=False)
        func(obj, "x", recursive=True)
        self.assertRaises(ValueError, func, "{{f}}", "y", recursive=False)
        func("{{f}}", "y", recursive=True)
        self.assertEqual(expected[2], code3)

        code4 = parse("{{a}}{{b}}{{c}}{{d}}{{e}}{{f}}{{g}}{{h}}{{i}}{{j}}")
        func = partial(meth, code4)
        fake = parse("{{b}}{{c}}")
        self.assertRaises(ValueError, func, fake, "q", recursive=False)
        self.assertRaises(ValueError, func, fake, "q", recursive=True)
        func("{{b}}{{c}}", "w", recursive=False)
        func("{{d}}{{e}}", "x", recursive=True)
        func(wrap(code4.nodes[-2:]), "y", recursive=False)
        func(wrap(code4.nodes[-2:]), "z", recursive=True)
        self.assertEqual(expected[3], code4)
        self.assertRaises(ValueError, func, "{{c}}{{d}}", "q", recursive=False)
        self.assertRaises(ValueError, func, "{{c}}{{d}}", "q", recursive=True)

        code5 = parse("{{a|{{b}}{{c}}|{{f|{{g}}={{h}}{{i}}}}}}")
        func = partial(meth, code5)
        self.assertRaises(ValueError, func, "{{b}}{{c}}", "x", recursive=False)
        func("{{b}}{{c}}", "x", recursive=True)
        obj = code5.get(0).params[1].value.get(0).params[0].value
        self.assertRaises(ValueError, func, obj, "y", recursive=False)
        func(obj, "y", recursive=True)
        self.assertEqual(expected[4], code5)

        code6 = parse("here is {{some text and a {{template}}}}")
        func = partial(meth, code6)
        self.assertRaises(ValueError, func, "text and", "ab", recursive=False)
        func("text and", "ab", recursive=True)
        self.assertRaises(ValueError, func, "is {{some", "cd", recursive=False)
        func("is {{some", "cd", recursive=True)
        self.assertEqual(expected[5], code6)

    def test_insert_before(self):
        """test Wikicode.insert_before()"""
        code = parse("{{a}}{{b}}{{c}}{{d}}")
        code.insert_before("{{b}}", "x", recursive=True)
        code.insert_before("{{d}}", "[[y]]", recursive=False)
        self.assertEqual("{{a}}x{{b}}{{c}}[[y]]{{d}}", code)
        code.insert_before(code.get(2), "z")
        self.assertEqual("{{a}}xz{{b}}{{c}}[[y]]{{d}}", code)
        self.assertRaises(ValueError, code.insert_before, "{{r}}", "n",
                          recursive=True)
        self.assertRaises(ValueError, code.insert_before, "{{r}}", "n",
                          recursive=False)

        code2 = parse("{{a|{{b}}|{{c|d={{f}}}}}}")
        code2.insert_before(code2.get(0).params[0].value.get(0), "x",
                            recursive=True)
        code2.insert_before("{{f}}", "y", recursive=True)
        self.assertEqual("{{a|x{{b}}|{{c|d=y{{f}}}}}}", code2)
        self.assertRaises(ValueError, code2.insert_before, "{{f}}", "y",
                          recursive=False)
        meth = lambda code, *args, **kw: code.insert_before(*args, **kw)
        expected = [
            "{{a}}xz{{b}}{{c}}[[y]]{{d}}{{e}}",
            "d{{a}}cd{{a}}d{{a}}f{{b}}f{{b}}ef{{b}}",
            "{{a|x{{b}}|{{c|d=y{{f}}}}}}",
            "{{a}}w{{b}}{{c}}x{{d}}{{e}}{{f}}{{g}}{{h}}yz{{i}}{{j}}",
            "{{a|x{{b}}{{c}}|{{f|{{g}}=y{{h}}{{i}}}}}}",
            "here cdis {{some abtext and a {{template}}}}"]
        self._test_search(meth, expected)

    def test_insert_after(self):
        """test Wikicode.insert_after()"""
        code = parse("{{a}}{{b}}{{c}}{{d}}")
        code.insert_after("{{b}}", "x", recursive=True)
        code.insert_after("{{d}}", "[[y]]", recursive=False)
        self.assertEqual("{{a}}{{b}}x{{c}}{{d}}[[y]]", code)
        code.insert_after(code.get(2), "z")
        self.assertEqual("{{a}}{{b}}xz{{c}}{{d}}[[y]]", code)
        self.assertRaises(ValueError, code.insert_after, "{{r}}", "n",
                          recursive=True)
        self.assertRaises(ValueError, code.insert_after, "{{r}}", "n",
                          recursive=False)

        code2 = parse("{{a|{{b}}|{{c|d={{f}}}}}}")
        code2.insert_after(code2.get(0).params[0].value.get(0), "x",
                           recursive=True)
        code2.insert_after("{{f}}", "y", recursive=True)
        self.assertEqual("{{a|{{b}}x|{{c|d={{f}}y}}}}", code2)
        self.assertRaises(ValueError, code2.insert_after, "{{f}}", "y",
                          recursive=False)
        meth = lambda code, *args, **kw: code.insert_after(*args, **kw)
        expected = [
            "{{a}}{{b}}xz{{c}}{{d}}[[y]]{{e}}",
            "{{a}}d{{a}}dc{{a}}d{{b}}f{{b}}f{{b}}fe",
            "{{a|{{b}}x|{{c|d={{f}}y}}}}",
            "{{a}}{{b}}{{c}}w{{d}}{{e}}x{{f}}{{g}}{{h}}{{i}}{{j}}yz",
            "{{a|{{b}}{{c}}x|{{f|{{g}}={{h}}{{i}}y}}}}",
            "here is {{somecd text andab a {{template}}}}"]
        self._test_search(meth, expected)

    def test_replace(self):
        """test Wikicode.replace()"""
        code = parse("{{a}}{{b}}{{c}}{{d}}")
        code.replace("{{b}}", "x", recursive=True)
        code.replace("{{d}}", "[[y]]", recursive=False)
        self.assertEqual("{{a}}x{{c}}[[y]]", code)
        code.replace(code.get(1), "z")
        self.assertEqual("{{a}}z{{c}}[[y]]", code)
        self.assertRaises(ValueError, code.replace, "{{r}}", "n",
                          recursive=True)
        self.assertRaises(ValueError, code.replace, "{{r}}", "n",
                          recursive=False)

        code2 = parse("{{a|{{b}}|{{c|d={{f}}}}}}")
        code2.replace(code2.get(0).params[0].value.get(0), "x", recursive=True)
        code2.replace("{{f}}", "y", recursive=True)
        self.assertEqual("{{a|x|{{c|d=y}}}}", code2)
        self.assertRaises(ValueError, code2.replace, "y", "z", recursive=False)
        meth = lambda code, *args, **kw: code.replace(*args, **kw)
        expected = [
            "{{a}}xz[[y]]{{e}}", "dcdffe", "{{a|x|{{c|d=y}}}}",
            "{{a}}wx{{f}}{{g}}z", "{{a|x|{{f|{{g}}=y}}}}",
            "here cd ab a {{template}}}}"]
        self._test_search(meth, expected)

    def test_append(self):
        """test Wikicode.append()"""
@@ -197,18 +231,25 @@ class TestWikicode(TreeEqualityTestCase):

    def test_remove(self):
        """test Wikicode.remove()"""
        code = parse("{{a}}{{b}}{{c}}{{d}}")
        code.remove("{{b}}", recursive=True)
        code.remove(code.get(1), recursive=True)
        self.assertEqual("{{a}}{{d}}", code)
        self.assertRaises(ValueError, code.remove, "{{r}}", recursive=True)
        self.assertRaises(ValueError, code.remove, "{{r}}", recursive=False)

        code2 = parse("{{a|{{b}}|{{c|d={{f}}{{h}}}}}}")
        code2.remove(code2.get(0).params[0].value.get(0), recursive=True)
        code2.remove("{{f}}", recursive=True)
        self.assertEqual("{{a||{{c|d={{h}}}}}}", code2)
        self.assertRaises(ValueError, code2.remove, "{{h}}", recursive=False)
        meth = lambda code, obj, value, **kw: code.remove(obj, **kw)
        expected = [
            "{{a}}{{c}}", "", "{{a||{{c|d=}}}}", "{{a}}{{f}}",
            "{{a||{{f|{{g}}=}}}}", "here   a {{template}}}}"
        ]
        self._test_search(meth, expected)

    def test_matches(self):
        """test Wikicode.matches()"""
        code1 = parse("Cleanup")
        code2 = parse("\nstub<!-- TODO: make more specific -->")
        self.assertTrue(code1.matches("Cleanup"))
        self.assertTrue(code1.matches("cleanup"))
        self.assertTrue(code1.matches("  cleanup\n"))
        self.assertFalse(code1.matches("CLEANup"))
        self.assertFalse(code1.matches("Blah"))
        self.assertTrue(code2.matches("stub"))
        self.assertTrue(code2.matches("Stub<!-- no, it's fine! -->"))
        self.assertFalse(code2.matches("StuB"))

    def test_filter_family(self):
        """test the Wikicode.i?filter() family of functions"""
@@ -219,11 +260,11 @@ class TestWikicode(TreeEqualityTestCase):

        code = parse("a{{b}}c[[d]]{{{e}}}{{f}}[[g]]")
        for func in (code.filter, ifilter(code)):
            self.assertEqual(["a", "{{b}}", "c", "[[d]]", "{{{e}}}", "{{f}}",
                              "[[g]]"], func())
            self.assertEqual(["a", "{{b}}", "b", "c", "[[d]]", "d", "{{{e}}}",
                              "e", "{{f}}", "f", "[[g]]", "g"], func())
            self.assertEqual(["{{{e}}}"], func(forcetype=Argument))
            self.assertIs(code.get(4), func(forcetype=Argument)[0])
            self.assertEqual(["a", "c"], func(forcetype=Text))
            self.assertEqual(list("abcdefg"), func(forcetype=Text))
            self.assertEqual([], func(forcetype=Heading))
            self.assertRaises(TypeError, func, forcetype=True)

@@ -235,11 +276,12 @@ class TestWikicode(TreeEqualityTestCase):
            self.assertEqual(["{{{e}}}"], get_filter("arguments"))
            self.assertIs(code.get(4), get_filter("arguments")[0])
            self.assertEqual([], get_filter("comments"))
            self.assertEqual([], get_filter("external_links"))
            self.assertEqual([], get_filter("headings"))
            self.assertEqual([], get_filter("html_entities"))
            self.assertEqual([], get_filter("tags"))
            self.assertEqual(["{{b}}", "{{f}}"], get_filter("templates"))
            self.assertEqual(["a", "c"], get_filter("text"))
            self.assertEqual(list("abcdefg"), get_filter("text"))
            self.assertEqual(["[[d]]", "[[g]]"], get_filter("wikilinks"))

        code2 = parse("{{a|{{b}}|{{c|d={{f}}{{h}}}}}}")
@@ -252,13 +294,13 @@ class TestWikicode(TreeEqualityTestCase):

        code3 = parse("{{foobar}}{{FOO}}{{baz}}{{bz}}")
        for func in (code3.filter, ifilter(code3)):
            self.assertEqual(["{{foobar}}", "{{FOO}}"], func(matches=r"foo"))
            self.assertEqual(["{{foobar}}", "{{FOO}}"], func(recursive=False, matches=r"foo"))
            self.assertEqual(["{{foobar}}", "{{FOO}}"],
                             func(matches=r"^{{foo.*?}}"))
                             func(recursive=False, matches=r"^{{foo.*?}}"))
            self.assertEqual(["{{foobar}}"],
                             func(matches=r"^{{foo.*?}}", flags=re.UNICODE))
            self.assertEqual(["{{baz}}", "{{bz}}"], func(matches=r"^{{b.*?z"))
            self.assertEqual(["{{baz}}"], func(matches=r"^{{b.+?z}}"))
                             func(recursive=False, matches=r"^{{foo.*?}}", flags=re.UNICODE))
            self.assertEqual(["{{baz}}", "{{bz}}"], func(recursive=False, matches=r"^{{b.*?z"))
            self.assertEqual(["{{baz}}"], func(recursive=False, matches=r"^{{b.+?z}}"))

        self.assertEqual(["{{a|{{b}}|{{c|d={{f}}{{h}}}}}}"],
                         code2.filter_templates(recursive=False))
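
Note on test_matches in the hunk above: its assertions suggest that Wikicode.matches() compares names after dropping comments, stripping surrounding whitespace, and ignoring the case of the first character only. The block below is a minimal standalone sketch of that rule as inferred from the assertions, not the library's implementation. The helper names normalize_name and names_match are hypothetical.

```python
import re

# Matches an HTML-style comment, including ones that span several lines.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def normalize_name(name):
    """Drop comments, strip whitespace, and upper-case only the first letter."""
    text = COMMENT.sub("", name).strip()
    return text[:1].upper() + text[1:]


def names_match(first, second):
    """Return True if both names are equal after normalization."""
    return normalize_name(first) == normalize_name(second)
```

Under this sketch "Cleanup" and "  cleanup\n" compare equal, while "Cleanup" and "CLEANup" do not, which mirrors the assertTrue and assertFalse pairs in the test.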
--- a/tests/tokenizer/external_links.mwtest
+++ b/tests/tokenizer/external_links.mwtest
@@ -0,0 +1,473 @@
 name:   basic
 label:  basic external link
 input:  "http://example.com/"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose()]

 ---

 name:   basic_brackets
 label:  basic external link in brackets
 input:  "[http://example.com/]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkClose()]

 ---

 name:   brackets_space
 label:  basic external link in brackets, with a space after
 input:  "[http://example.com/ ]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), ExternalLinkClose()]

 ---

 name:   brackets_title
 label:  basic external link in brackets, with a title
 input:  "[http://example.com/ Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   brackets_multiword_title
 label:  basic external link in brackets, with a multi-word title
 input:  "[http://example.com/ Example Web Page]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), Text(text="Example Web Page"), ExternalLinkClose()]

 ---

 name:   brackets_adjacent
 label:  three adjacent bracket-enclosed external links
 input:  "[http://foo.com/ Foo][http://bar.com/ Bar]\n[http://baz.com/ Baz]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://foo.com/"), ExternalLinkSeparator(), Text(text="Foo"), ExternalLinkClose(), ExternalLinkOpen(brackets=True), Text(text="http://bar.com/"), ExternalLinkSeparator(), Text(text="Bar"), ExternalLinkClose(), Text(text="\n"), ExternalLinkOpen(brackets=True), Text(text="http://baz.com/"), ExternalLinkSeparator(), Text(text="Baz"), ExternalLinkClose()]

 ---

 name:   brackets_newline_before
 label:  bracket-enclosed link with a newline before the title
 input:  "[http://example.com/ \nExample]"
 output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" \nExample]")]

 ---

 name:   brackets_newline_inside
 label:  bracket-enclosed link with a newline in the title
 input:  "[http://example.com/ Example \nWeb Page]"
 output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" Example \nWeb Page]")]

 ---

 name:   brackets_newline_after
 label:  bracket-enclosed link with a newline after the title
 input:  "[http://example.com/ Example\n]"
 output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" Example\n]")]

 ---

 name:   brackets_space_before
 label:  bracket-enclosed link with a space before the URL
 input:  "[ http://example.com Example]"
 output: [Text(text="[ "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" Example]")]

 ---

 name:   brackets_title_like_url
 label:  bracket-enclosed link with a title that looks like a URL
 input:  "[http://example.com http://example.com]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com"), ExternalLinkSeparator(), Text(text="http://example.com"), ExternalLinkClose()]

 ---

 name:   brackets_recursive
 label:  bracket-enclosed link with a bracket-enclosed link as the title
 input:  "[http://example.com [http://example.com]]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com"), ExternalLinkSeparator(), Text(text="[http://example.com"), ExternalLinkClose(), Text(text="]")]

 ---

 name:   period_after
 label:  a period after a free link that is excluded
 input:  "http://example.com."
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=".")]

 ---

 name:   colons_after
 label:  colons after a free link that are excluded
 input:  "http://example.com/foo:bar.:;baz!?,"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/foo:bar.:;baz"), ExternalLinkClose(), Text(text="!?,")]

 ---

 name:   close_paren_after_excluded
 label:  a closing parenthesis after a free link that is excluded
 input:  "http://example.)com)"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.)com"), ExternalLinkClose(), Text(text=")")]

 ---

 name:   close_paren_after_included
 label:  a closing parenthesis after a free link that is included because of an opening parenthesis in the URL
 input:  "http://example.(com)"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.(com)"), ExternalLinkClose()]

 ---

 name:   open_bracket_inside
 label:  an open bracket inside a free link that causes it to be ended abruptly
 input:  "http://foobar[baz.com"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://foobar"), ExternalLinkClose(), Text(text="[baz.com")]

 ---

 name:   brackets_period_after
 label:  a period after a bracket-enclosed link that is included
 input:  "[http://example.com. Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com."), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   brackets_colons_after
 label:  colons after a bracket-enclosed link that are included
 input:  "[http://example.com/foo:bar.:;baz!?, Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/foo:bar.:;baz!?,"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   brackets_close_paren_after_included
 label:  a closing parenthesis after a bracket-enclosed link that is included
 input:  "[http://example.)com) Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.)com)"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   brackets_close_paren_after_included_2
 label:  a closing parenthesis after a bracket-enclosed link that is also included
 input:  "[http://example.(com) Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://example.(com)"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   brackets_open_bracket_inside
 label:  an open bracket inside a bracket-enclosed link that is also included
 input:  "[http://foobar[baz.com Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="http://foobar[baz.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   adjacent_space
 label:  two free links separated by a space
 input:  "http://example.com http://example.com"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]

 ---

 name:   adjacent_newline
 label:  two free links separated by a newline
 input:  "http://example.com\nhttp://example.com"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="\n"), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]

 ---

 name:   adjacent_close_bracket
 label:  two free links separated by a close bracket
 input:  "http://example.com]http://example.com"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="]"), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]

 ---

 name:   html_entity_in_url
 label:  an HTML entity parsed correctly inside a free link
 input:  "http://exa&nbsp;mple.com/"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="mple.com/"), ExternalLinkClose()]

 ---

 name:   template_in_url
 label:  a template parsed correctly inside a free link
 input:  "http://exa{{template}}mple.com/"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), TemplateOpen(), Text(text="template"), TemplateClose(), Text(text="mple.com/"), ExternalLinkClose()]

 ---

 name:   argument_in_url
 label:  an argument parsed correctly inside a free link
 input:  "http://exa{{{argument}}}mple.com/"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ArgumentOpen(), Text(text="argument"), ArgumentClose(), Text(text="mple.com/"), ExternalLinkClose()]

 ---

 name:   wikilink_in_url
 label:  a wikilink that destroys a free link
 input:  "http://exa[[wikilink]]mple.com/"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ExternalLinkClose(), WikilinkOpen(), Text(text="wikilink"), WikilinkClose(), Text(text="mple.com/")]

 ---

 name:   external_link_in_url
 label:  a bracketed link that destroys a free link
 input:  "http://exa[http://example.com/]mple.com/"
 output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ExternalLinkClose(), ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkClose(), Text(text="mple.com/")]

 ---

 name:   spaces_padding
 label:  spaces padding a free link
 input:  "   http://example.com   "
 output: [Text(text="   "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="   ")]

 ---

 name:   text_and_spaces_padding
 label:  text and spaces padding a free link
 input:  "x   http://example.com   x"
 output: [Text(text="x   "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="   x")]

 ---

 name:   template_before
 label:  a template before a free link
 input:  "{{foo}}http://example.com"
 output: [TemplateOpen(), Text(text="foo"), TemplateClose(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()]

 ---

 name:   spaces_padding_no_slashes
 label:  spaces padding a free link with no slashes after the colon
 input:  "   mailto:example@example.com   "
 output: [Text(text="   "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text="   ")]

 ---

 name:   text_and_spaces_padding_no_slashes
 label:  text and spaces padding a free link with no slashes after the colon
 input:  "x   mailto:example@example.com   x"
 output: [Text(text="x   "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text="   x")]

 ---

 name:   template_before_no_slashes
 label:  a template before a free link with no slashes after the colon
 input:  "{{foo}}mailto:example@example.com"
 output: [TemplateOpen(), Text(text="foo"), TemplateClose(), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose()]

 ---

 name:   no_slashes
 label:  a free link with no slashes after the colon
 input:  "mailto:example@example.com"
 output: [ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose()]

 ---

 name:   slashes_optional
 label:  a free link using a scheme that doesn't need slashes, but has them anyway
 input:  "mailto://example@example.com"
 output: [ExternalLinkOpen(brackets=False), Text(text="mailto://example@example.com"), ExternalLinkClose()]

 ---

 name:   short
 label:  a very short free link
 input:  "mailto://abc"
 output: [ExternalLinkOpen(brackets=False), Text(text="mailto://abc"), ExternalLinkClose()]

 ---

 name:   slashes_missing
 label:  slashes missing from a free link with a scheme that requires them
 input:  "http:example@example.com"
 output: [Text(text="http:example@example.com")]

 ---

 name:   no_scheme_but_slashes
 label:  no scheme in a free link, but slashes (protocol-relative free links are not supported)
 input:  "//example.com"
 output: [Text(text="//example.com")]

 ---

 name:   no_scheme_but_colon
 label:  no scheme in a free link, but a colon
 input:  " :example.com"
 output: [Text(text=" :example.com")]

 ---

 name:   no_scheme_but_colon_and_slashes
 label:  no scheme in a free link, but a colon and slashes
 input:  " ://example.com"
 output: [Text(text=" ://example.com")]

 ---

 name:   fake_scheme_no_slashes
 label:  a nonexistent scheme in a free link, without slashes
 input:  "fake:example.com"
 output: [Text(text="fake:example.com")]

 ---

 name:   fake_scheme_slashes
 label:  a nonexistent scheme in a free link, with slashes
 input:  "fake://example.com"
 output: [Text(text="fake://example.com")]

 ---

 name:   fake_scheme_brackets_no_slashes
 label:  a nonexistent scheme in a bracketed link, without slashes
 input:  "[fake:example.com]"
 output: [Text(text="[fake:example.com]")]

 ---

 name:   fake_scheme_brackets_slashes
 label:  a nonexistent scheme in a bracketed link, with slashes
 input:  "[fake://example.com]"
 output: [Text(text="[fake://example.com]")]

 ---

 name:   interrupted_scheme
 label:  an otherwise valid scheme with something in the middle of it, in a free link
 input:  "ht?tp://example.com"
 output: [Text(text="ht?tp://example.com")]

 ---

 name:   interrupted_scheme_brackets
 label:  an otherwise valid scheme with something in the middle of it, in a bracketed link
 input:  "[ht?tp://example.com]"
 output: [Text(text="[ht?tp://example.com]")]

 ---

 name:   no_slashes_brackets
 label:  no slashes after the colon in a bracketed link
 input:  "[mailto:example@example.com Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="mailto:example@example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   space_before_no_slashes_brackets
 label:  a space before a bracketed link with no slashes after the colon
 input:  "[ mailto:example@example.com Example]"
 output: [Text(text="[ "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text=" Example]")]

 ---

 name:   slashes_optional_brackets
 label:  a bracketed link using a scheme that doesn't need slashes, but has them anyway
 input:  "[mailto://example@example.com Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="mailto://example@example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   short_brackets
 label:  a very short link in brackets
 input:  "[mailto://abc Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="mailto://abc"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   slashes_missing_brackets
 label:  slashes missing from a scheme that requires them in a bracketed link
 input:  "[http:example@example.com Example]"
 output: [Text(text="[http:example@example.com Example]")]

 ---

 name:   protcol_relative
 label:  a protocol-relative link (in brackets)
 input:  "[//example.com Example]"
 output: [ExternalLinkOpen(brackets=True), Text(text="//example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()]

 ---

 name:   scheme_missing_but_colon_brackets
 label:  scheme missing from a bracketed link, but with a colon
 input:  "[:example.com Example]"
 output: [Text(text="[:example.com Example]")]

 ---

 name:   scheme_missing_but_colon_slashes_brackets
 label:  scheme missing from a bracketed link, but with a colon and slashes
 input:  "[://example.com Example]"
 output: [Text(text="[://example.com Example]")]

 ---

 name:   unclosed_protocol_relative
 label:  an unclosed protocol-relative bracketed link
 input:  "[//example.com"
 output: [Text(text="[//example.com")]

 ---

 name:   space_before_protcol_relative
 label:  a space before a protocol-relative bracketed link
 input:  "[ //example.com]"
 output: [Text(text="[ //example.com]")]

 ---

 name:   unclosed_just_scheme
 label:  an unclosed bracketed link, ending after the scheme
 input:  "[http"
 output: [Text(text="[http")]

 ---

 name:   unclosed_scheme_colon
 label:  an unclosed bracketed link, ending after the colon
 input:  "[http:"
 output: [Text(text="[http:")]

 ---

 name:   unclosed_scheme_colon_slashes
 label:  an unclosed bracketed link, ending after the slashes
 input:  "[http://"
 output: [Text(text="[http://")]

 ---

 name:   incomplete_bracket
 label:  just an open bracket
 input:  "["
 output: [Text(text="[")]

 ---

 name:   incomplete_scheme_colon
 label:  a free link with just a scheme and a colon
 input:  "http:"
 output: [Text(text="http:")]

 ---

 name:   incomplete_scheme_colon_slashes
 label:  a free link with just a scheme, colon, and slashes
 input:  "http://"
 output: [Text(text="http://")]

 ---

 name:   brackets_scheme_but_no_url
 label:  brackets around a scheme and a colon
 input:  "[mailto:]"
 output: [Text(text="[mailto:]")]

 ---

 name:   brackets_scheme_slashes_but_no_url
 label:  brackets around a scheme, colon, and slashes
 input:  "[http://]"
 output: [Text(text="[http://]")]

 ---

 name:   brackets_scheme_title_but_no_url
 label:  brackets around a scheme, colon, and slashes, with a title
 input:  "[http:// Example]"
 output: [Text(text="[http:// Example]")]
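
Note on the period_after, colons_after, close_paren_after_excluded and close_paren_after_included cases above: together they imply that trailing punctuation is split off a free (unbracketed) link, and that a trailing ")" is kept only when the URL also contains "(". The block below is a minimal sketch of that rule, assuming the punctuation set is exactly the characters these cases exercise. The function name split_free_link_tail is hypothetical and this is not the tokenizer's actual code.

```python
def split_free_link_tail(url):
    """Split a free link into (link, trailing_text).

    Assumption: only the punctuation seen in the test cases is trimmed.
    """
    trailing = ",;.:!?"
    # A closing parenthesis counts as punctuation unless the URL opens one.
    if "(" not in url:
        trailing += ")"
    link = url.rstrip(trailing)
    return link, url[len(link):]
```

Bracketed links are not covered by this sketch: the brackets_period_after and brackets_colons_after cases show the punctuation staying inside the URL there.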
--- a/tests/tokenizer/html_entities.mwtest
+++ b/tests/tokenizer/html_entities.mwtest
@@ -117,6 +117,20 @@ output: [Text(text="&;")]

 ---

 name:   invalid_partial_amp_pound
 label:  invalid entities: just an ampersand, pound sign
 input:  "&#"
 output: [Text(text="&#")]

 ---

 name:   invalid_partial_amp_pound_x
 label:  invalid entities: just an ampersand, pound sign, x
 input:  "&#x"
 output: [Text(text="&#x")]

 ---

 name:   invalid_partial_amp_pound_semicolon
 label:  invalid entities: an ampersand, pound sign, and semicolon
 input:  "&#;"
--- a/tests/tokenizer/integration.mwtest
+++ b/tests/tokenizer/integration.mwtest
@@ -12,6 +12,13 @@ output: [TemplateOpen(), ArgumentOpen(), ArgumentOpen(), Text(text="foo"), Argum

 ---

 name:   link_in_template_name
 label:  a wikilink inside a template name, which breaks the template
 input:  "{{foo[[bar]]}}"
 output: [Text(text="{{foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="}}")]

 ---

 name:   rich_heading
 label:  a heading with templates/wikilinks in it
 input:  "== Head{{ing}} [[with]] {{{funky|{{stuf}}}}} =="
@@ -33,6 +40,13 @@ output: [Text(text="&n"), CommentStart(), Text(text="foo"), CommentEnd(), Text(t

 ---

 name:   rich_tags
 label:  an HTML tag with tons of other things in it
 input:  "{{dubious claim}}<ref name={{abc}}   foo="bar {{baz}}" abc={{de}}f ghi=j{{k}}{{l}} \n mno =  "{{p}} [[q]] {{r}}">[[Source]]</ref>"
 output: [TemplateOpen(), Text(text="dubious claim"), TemplateClose(), TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TemplateOpen(), Text(text="abc"), TemplateClose(), TagAttrStart(pad_first="   ", pad_before_eq="", pad_after_eq=""), Text(text="foo"), TagAttrEquals(), TagAttrQuote(), Text(text="bar "), TemplateOpen(), Text(text="baz"), TemplateClose(), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="abc"), TagAttrEquals(), TemplateOpen(), Text(text="de"), TemplateClose(), Text(text="f"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="ghi"), TagAttrEquals(), Text(text="j"), TemplateOpen(), Text(text="k"), TemplateClose(), TemplateOpen(), Text(text="l"), TemplateClose(), TagAttrStart(pad_first=" \n ", pad_before_eq=" ", pad_after_eq="  "), Text(text="mno"), TagAttrEquals(), TagAttrQuote(), TemplateOpen(), Text(text="p"), TemplateClose(), Text(text=" "), WikilinkOpen(), Text(text="q"), WikilinkClose(), Text(text=" "), TemplateOpen(), Text(text="r"), TemplateClose(), TagCloseOpen(padding=""), WikilinkOpen(), Text(text="Source"), WikilinkClose(), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   wildcard
 label:  a wildcard assortment of various things
 input:  "{{{{{{{{foo}}bar|baz=biz}}buzz}}usr|{{bin}}}}"
@@ -44,3 +58,17 @@ name:   wildcard_redux
 label:  an even wilder assortment of various things
 input:  "{{a|b|{{c|[[d]]{{{e}}}}}}}[[f|{{{g}}}<!--h-->]]{{i|j=&nbsp;}}"
 output: [TemplateOpen(), Text(text="a"), TemplateParamSeparator(), Text(text="b"), TemplateParamSeparator(), TemplateOpen(), Text(text="c"), TemplateParamSeparator(), WikilinkOpen(), Text(text="d"), WikilinkClose(), ArgumentOpen(), Text(text="e"), ArgumentClose(), TemplateClose(), TemplateClose(), WikilinkOpen(), Text(text="f"), WikilinkSeparator(), ArgumentOpen(), Text(text="g"), ArgumentClose(), CommentStart(), Text(text="h"), CommentEnd(), WikilinkClose(), TemplateOpen(), Text(text="i"), TemplateParamSeparator(), Text(text="j"), TemplateParamEquals(), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), TemplateClose()]

 ---

 name:   link_inside_dl
 label:  an external link inside a def list, such that the external link is parsed
 input:  ";;;mailto:example"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), ExternalLinkOpen(brackets=False), Text(text="mailto:example"), ExternalLinkClose()]

 ---

 name:   link_inside_dl_2
 label:  an external link inside a def list, such that the external link is not parsed
 input:  ";;;malito:example"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="malito"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="example")]
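
Note on link_inside_dl and link_inside_dl_2 above: the only difference between the two inputs is the scheme ("mailto" versus the misspelled "malito"), so whether a free link is recognized appears to depend on a table of known schemes. Combined with the slashes_missing and no_slashes cases in external_links.mwtest, some schemes seem to require "//" and others do not. The block below is a hedged sketch of such a check. The two-entry table and the name looks_like_free_link are assumptions for illustration, and the real tokenizer presumably knows many more schemes.

```python
import re

# Hypothetical table: scheme -> whether "//" is required after the colon.
# Only the two schemes exercised by the test cases are listed here.
SLASHES_REQUIRED = {"http": True, "mailto": False}

# Scheme, colon, optional "//", then the rest of the link.
LINK_START = re.compile(r"([a-z][a-z0-9+.\-]*):(//)?(.*)", re.DOTALL)


def looks_like_free_link(text):
    """Return True if text starts like a free link under the table above."""
    match = LINK_START.match(text)
    if match is None:
        return False
    scheme, slashes, rest = match.groups()
    if scheme not in SLASHES_REQUIRED:
        return False
    if SLASHES_REQUIRED[scheme] and not slashes:
        return False
    # A scheme with nothing after it ("http://", "mailto:") is not a link.
    return bool(rest)
```

Protocol-relative links such as "//example.com" are rejected by this sketch, which matches the free-link case no_scheme_but_slashes. The bracketed form in protcol_relative would need separate handling.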
--- a/tests/tokenizer/tags.mwtest
+++ b/tests/tokenizer/tags.mwtest
@@ -0,0 +1,578 @@
 name:   basic
 label:  a basic tag with an open and close
 input:  "<ref></ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   basic_selfclosing
 label:  a basic self-closing tag
 input:  "<ref/>"
 output: [TagOpenOpen(), Text(text="ref"), TagCloseSelfclose(padding="")]

 ---

 name:   content
 label:  a tag with some content in the middle
 input:  "<ref>this is a reference</ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), Text(text="this is a reference"), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   padded_open
 label:  a tag with some padding in the open tag
 input:  "<ref ></ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=" "), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   padded_close
 label:  a tag with some padding in the close tag
 input:  "<ref></ref >"
 output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref "), TagCloseClose()]

 ---

 name:   padded_selfclosing
 label:  a self-closing tag with padding
 input:  "<ref />"
 output: [TagOpenOpen(), Text(text="ref"), TagCloseSelfclose(padding=" ")]

 ---

 name:   attribute
 label:  a tag with a single attribute
 input:  "<ref name></ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   attribute_value
 label:  a tag with a single attribute with a value
 input:  "<ref name=foo></ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   attribute_quoted
 label:  a tag with a single quoted attribute
 input:  "<ref name="foo bar"></ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="foo bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   attribute_hyphen
 label:  a tag with a single attribute, containing a hyphen
 input:  "<ref name=foo-bar></ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo-bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   attribute_quoted_hyphen
 label:  a tag with a single quoted attribute, containing a hyphen
 input:  "<ref name="foo-bar"></ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="foo-bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   attribute_selfclosing
 label:  a self-closing tag with a single attribute
 input:  "<ref name/>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagCloseSelfclose(padding="")]

 ---

 name:   attribute_selfclosing_value
 label:  a self-closing tag with a single attribute with a value
 input:  "<ref name=foo/>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo"), TagCloseSelfclose(padding="")]

 ---

 name:   attribute_selfclosing_value_quoted
 label:  a self-closing tag with a single quoted attribute
 input:  "<ref name="foo"/>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="foo"), TagCloseSelfclose(padding="")]

 ---

 name:   nested_tag
 label:  a tag nested within the attributes of another
 input:  "<ref name=<span style="color: red;">foo</span>>citation</ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), TagAttrQuote(), Text(text="color: red;"), TagCloseOpen(padding=""), Text(text="foo"), TagOpenClose(), Text(text="span"), TagCloseClose(), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   nested_tag_quoted
 label:  a tag nested within the attributes of another, quoted
 input:  "<ref name="<span style="color: red;">foo</span>">citation</ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), TagAttrQuote(), Text(text="color: red;"), TagCloseOpen(padding=""), Text(text="foo"), TagOpenClose(), Text(text="span"), TagCloseClose(), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   nested_troll_tag
 label:  a bogus tag that appears to be nested within the attributes of another
 input:  "<ref name=</ ><//>>citation</ref>"
 output: [Text(text="<ref name=</ ><//>>citation</ref>")]

 ---

 name:   nested_troll_tag_quoted
 label:  a bogus tag that appears to be nested within the attributes of another, quoted
 input:  "<ref name="</ ><//>">citation</ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="</ ><//>"), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   invalid_space_begin_open
 label:  invalid tag: a space at the beginning of the open tag
 input:  "< ref>test</ref>"
 output: [Text(text="< ref>test</ref>")]

 ---

 name:   invalid_space_begin_close
 label:  invalid tag: a space at the beginning of the close tag
 input:  "<ref>test</ ref>"
 output: [Text(text="<ref>test</ ref>")]

 ---

 name:   valid_space_end
 label:  valid tag: spaces at the ends of both the open and close tags
 input:  "<ref >test</ref >"
 output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=" "), Text(text="test"), TagOpenClose(), Text(text="ref "), TagCloseClose()]

 ---

 name:   invalid_template_ends
 label:  invalid tag: a template at the ends of both the open and close tags
 input:  "<ref {{foo}}>test</ref {{foo}}>"
 output: [Text(text="<ref "), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">test</ref "), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">")]

 ---

 name:   invalid_template_ends_nospace
 label:  invalid tag: a template at the ends of both the open and close tags, without spacing
 input:  "<ref {{foo}}>test</ref{{foo}}>"
 output: [Text(text="<ref "), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">test</ref"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">")]

 ---

 name:   valid_template_end_open
 label:  valid tag: a template at the end of the open tag
 input:  "<ref {{foo}}>test</ref>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), TemplateOpen(), Text(text="foo"), TemplateClose(), TagCloseOpen(padding=""), Text(text="test"), TagOpenClose(), Text(text="ref"), TagCloseClose()]

 ---

 name:   valid_template_end_open_space_end_close
 label:  valid tag: a template at the end of the open tag; whitespace at the end of the close tag
 input:  "<ref {{foo}}>test</ref\n>"
 output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), TemplateOpen(), Text(text="foo"), TemplateClose(), TagCloseOpen(padding=""), Text(text="test"), TagOpenClose(), Text(text="ref\n"), TagCloseClose()]

 ---

 name:   invalid_template_end_open_nospace
 label:  invalid tag: a template at the end of the open tag, without spacing
 input:  "<ref{{foo}}>test</ref>"
 output: [Text(text="<ref"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">test</ref>")]

 ---

 name:   invalid_template_start_close
 label:  invalid tag: a template at the beginning of the close tag
 input:  "<ref>test</{{foo}}ref>"
 output: [Text(text="<ref>test</"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="ref>")]

 ---

 name:   invalid_template_start_open
 label:  invalid tag: a template at the beginning of the open tag
 input:  "<{{foo}}ref>test</ref>"
 output: [Text(text="<"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="ref>test</ref>")]

 ---

 name:   unclosed_quote
 label:  a quoted attribute that is never closed
 input:  "<span style="foobar>stuff</span>"
 output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foobar"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]

 ---

 name:   fake_quote
 label:  a fake quoted attribute
 input:  "<span style="foo"bar>stuff</span>"
 output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foo\"bar"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]

 ---

 name:   fake_quote_complex
 label:  a fake quoted attribute, with spaces and templates and links
 input:  "<span style="foo {{bar}}\n[[baz]]"buzz >stuff</span>"
 output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foo"), TagAttrStart(pad_first=" ", pad_before_eq="\n", pad_after_eq=""), TemplateOpen(), Text(text="bar"), TemplateClose(), TagAttrStart(pad_first="", pad_before_eq=" ", pad_after_eq=""), WikilinkOpen(), Text(text="baz"), WikilinkClose(), Text(text="\"buzz"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]

 ---

 name:   incomplete_lbracket
 label:  incomplete tags: just a left bracket
 input:  "<"
 output: [Text(text="<")]

 ---

 name:   incomplete_lbracket_junk
 label:  incomplete tags: just a left bracket, surrounded by stuff
 input:  "foo<bar"
 output: [Text(text="foo<bar")]

 ---

 name:   incomplete_unclosed_open
 label:  incomplete tags: an unclosed open tag
 input:  "junk <ref"
 output: [Text(text="junk <ref")]

 ---

 name:   incomplete_unclosed_open_space
 label:  incomplete tags: an unclosed open tag, space
 input:  "junk <ref "
 output: [Text(text="junk <ref ")]

 ---

 name:   incomplete_unclosed_open_unnamed_attr
 label:  incomplete tags: an unclosed open tag, unnamed attribute
 input:  "junk <ref name"
 output: [Text(text="junk <ref name")]

 ---

 name:   incomplete_unclosed_open_attr_equals
 label:  incomplete tags: an unclosed open tag, attribute, equal sign
 input:  "junk <ref name="
 output: [Text(text="junk <ref name=")]

 ---

 name:   incomplete_unclosed_open_attr_equals_quoted
 label:  incomplete tags: an unclosed open tag, attribute, equal sign, quote
 input:  "junk <ref name=""
 output: [Text(text="junk <ref name=\"")]

 ---

 name:   incomplete_unclosed_open_attr
 label:  incomplete tags: an unclosed open tag, attribute with a key/value
 input:  "junk <ref name=foo"
 output: [Text(text="junk <ref name=foo")]

 ---

 name:   incomplete_unclosed_open_attr_quoted
 label:  incomplete tags: an unclosed open tag, attribute with a key/value, quoted
 input:  "junk <ref name="foo""
 output: [Text(text="junk <ref name=\"foo\"")]

 ---

 name:   incomplete_open
 label:  incomplete tags: an open tag
 input:  "junk <ref>"
 output: [Text(text="junk <ref>")]

 ---

 name:   incomplete_open_unnamed_attr
 label:  incomplete tags: an open tag, unnamed attribute
 input:  "junk <ref name>"
 output: [Text(text="junk <ref name>")]

 ---

 name:   incomplete_open_attr_equals
 label:  incomplete tags: an open tag, attribute, equal sign
 input:  "junk <ref name=>"
 output: [Text(text="junk <ref name=>")]

 ---

 name:   incomplete_open_attr
 label:  incomplete tags: an open tag, attribute with a key/value
 input:  "junk <ref name=foo>"
 output: [Text(text="junk <ref name=foo>")]

 ---

 name:   incomplete_open_attr_quoted
 label:  incomplete tags: an open tag, attribute with a key/value, quoted
 input:  "junk <ref name="foo">"
 output: [Text(text="junk <ref name=\"foo\">")]

 ---

 name:   incomplete_open_text
 label:  incomplete tags: an open tag, text
 input:  "junk <ref>foo"
 output: [Text(text="junk <ref>foo")]

 ---

 name:   incomplete_open_attr_text
 label:  incomplete tags: an open tag, attribute with a key/value, text
 input:  "junk <ref name=foo>bar"
 output: [Text(text="junk <ref name=foo>bar")]

 ---

 name:   incomplete_open_text_lbracket
 label:  incomplete tags: an open tag, text, left open bracket
 input:  "junk <ref>bar<"
 output: [Text(text="junk <ref>bar<")]

 ---

 name:   incomplete_open_text_lbracket_slash
 label:  incomplete tags: an open tag, text, left bracket, slash
 input:  "junk <ref>bar</"
 output: [Text(text="junk <ref>bar</")]

 ---

 name:   incomplete_open_text_unclosed_close
 label:  incomplete tags: an open tag, text, unclosed close
 input:  "junk <ref>bar</ref"
 output: [Text(text="junk <ref>bar</ref")]

 ---

 name:   incomplete_open_text_wrong_close
 label:  incomplete tags: an open tag, text, wrong close
 input:  "junk <ref>bar</span>"
 output: [Text(text="junk <ref>bar</span>")]

 ---

 name:   incomplete_unclosed_close
 label:  incomplete tags: an unclosed close tag
 input:  "junk </"
 output: [Text(text="junk </")]

 ---

 name:   incomplete_unclosed_close_text
 label:  incomplete tags: an unclosed close tag, with text
 input:  "junk </br"
 output: [Text(text="junk </br")]

 ---

 name:   incomplete_close
 label:  incomplete tags: a close tag
 input:  "junk </ref>"
 output: [Text(text="junk </ref>")]

 ---

 name:   incomplete_no_tag_name_open
 label:  incomplete tags: no tag name within brackets; just an open
 input:  "junk <>"
 output: [Text(text="junk <>")]

 ---

 name:   incomplete_no_tag_name_selfclosing
 label:  incomplete tags: no tag name within brackets; self-closing
 input:  "junk < />"
 output: [Text(text="junk < />")]

 ---

 name:   incomplete_no_tag_name_open_close
 label:  incomplete tags: no tag name within brackets; open and close
 input:  "junk <></>"
 output: [Text(text="junk <></>")]

 ---

 name:   backslash_premature_before
 label:  a backslash before a quote before a space
 input:  "<foo attribute="this is\\" quoted">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is\\\" quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   backslash_premature_after
 label:  a backslash before a quote after a space
 input:  "<foo attribute="this is \\"quoted">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is \\\"quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   backslash_premature_middle
 label:  a backslash before a quote in the middle of a word
 input:  "<foo attribute="this i\\"s quoted">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this i\\\"s quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   backslash_adjacent
 label:  escaped quotes next to unescaped quotes
 input:  "<foo attribute="\\"this is quoted\\"">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="\\\"this is quoted\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   backslash_endquote
 label:  backslashes before the end quote, causing the attribute to become unquoted
 input:  "<foo attribute="this_is quoted\\">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), Text(text="\"this_is"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   backslash_double
 label:  two adjacent backslashes, which do *not* affect the quote
 input:  "<foo attribute="this is\\\\" quoted">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is\\\\"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   backslash_triple
 label:  three adjacent backslashes, which do *not* affect the quote
 input:  "<foo attribute="this is\\\\\\" quoted">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is\\\\\\"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   backslash_unaffecting
 label:  backslashes near quotes, but not immediately adjacent, thus having no effect
 input:  "<foo attribute="\\quote\\d" also="quote\\d\\">blah</foo>"
 output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="\\quote\\d"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="also"), TagAttrEquals(), Text(text="\"quote\\d\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()]

 ---

 name:   unparsable
 label:  a tag that should not be put through the normal parser
 input:  "{{t1}}<nowiki>{{t2}}</nowiki>{{t3}}"
 output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="{{t2}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]

 ---

 name:   unparsable_complex
 label:  a tag that should not be put through the normal parser; lots of stuff inside
 input:  "{{t1}}<pre>{{t2}}\n==Heading==\nThis is some text with a [[page|link]].</pre>{{t3}}"
 output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="pre"), TagCloseOpen(padding=""), Text(text="{{t2}}\n==Heading==\nThis is some text with a [[page|link]]."), TagOpenClose(), Text(text="pre"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]

 ---

 name:   unparsable_attributed
 label:  a tag that should not be put through the normal parser; parsed attributes
 input:  "{{t1}}<nowiki attr=val attr2="{{val2}}">{{t2}}</nowiki>{{t3}}"
 output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attr"), TagAttrEquals(), Text(text="val"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attr2"), TagAttrEquals(), TagAttrQuote(), TemplateOpen(), Text(text="val2"), TemplateClose(), TagCloseOpen(padding=""), Text(text="{{t2}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]

 ---

 name:   unparsable_incomplete
 label:  a tag that should not be put through the normal parser; incomplete
 input:  "{{t1}}<nowiki>{{t2}}{{t3}}"
 output: [TemplateOpen(), Text(text="t1"), TemplateClose(), Text(text="<nowiki>"), TemplateOpen(), Text(text="t2"), TemplateClose(), TemplateOpen(), Text(text="t3"), TemplateClose()]

 ---

 name:   unparsable_entity
 label:  an HTML entity inside unparsable text is still parsed
 input:  "{{t1}}<nowiki>{{t2}}&nbsp;{{t3}}</nowiki>{{t4}}"
 output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="{{t2}}"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="{{t3}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t4"), TemplateClose()]

 ---

 name:   unparsable_entity_incomplete
 label:  an incomplete HTML entity inside unparsable text
 input:  "<nowiki>&</nowiki>"
 output: [TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="&"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()]

 ---

 name:   unparsable_entity_incomplete_2
 label:  an incomplete HTML entity inside unparsable text
 input:  "<nowiki>&"
 output: [Text(text="<nowiki>&")]

 ---

 name:   single_open_close
 label:  a tag that supports being single; both an open and a close tag
 input:  "foo<li>bar{{baz}}</li>"
 output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseOpen(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose(), TagOpenClose(), Text(text="li"), TagCloseClose()]

 ---

 name:   single_open
 label:  a tag that supports being single; just an open tag
 input:  "foo<li>bar{{baz}}"
 output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]

 ---

 name:   single_selfclose
 label:  a tag that supports being single; a self-closing tag
 input:  "foo<li/>bar{{baz}}"
 output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]

 ---

 name:   single_close
 label:  a tag that supports being single; just a close tag
 input:  "foo</li>bar{{baz}}"
 output: [Text(text="foo</li>bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]

 ---

 name:   single_only_open_close
 label:  a tag that can only be single; both an open and a close tag
 input:  "foo<br>bar{{baz}}</br>"
 output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose(), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding="", implicit=True)]

 ---

 name:   single_only_open
 label:  a tag that can only be single; just an open tag
 input:  "foo<br>bar{{baz}}"
 output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]

 ---

 name:   single_only_selfclose
 label:  a tag that can only be single; a self-closing tag
 input:  "foo<br/>bar{{baz}}"
 output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]

 ---

 name:   single_only_close
 label:  a tag that can only be single; just a close tag
 input:  "foo</br>bar{{baz}}"
 output: [Text(text="foo"), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]

 ---

 name:   single_only_double
 label:  a tag that can only be single; a tag with slashes at the beginning and end
 input:  "foo</br/>bar{{baz}}"
 output: [Text(text="foo"), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]

 ---

 name:   single_only_close_attribute
 label:  a tag that can only be single; presented as a close tag with an attribute
 input:  "</br id="break">"
 output: [TagOpenOpen(invalid=True), Text(text="br"), TagAttrStart(pad_first=" ", pad_after_eq="", pad_before_eq=""), Text(text="id"), TagAttrEquals(), TagAttrQuote(), Text(text="break"), TagCloseSelfclose(padding="", implicit=True)]

 ---

 name:   capitalization
 label:  caps should be ignored within tag names
 input:  "<NoWiKi>{{test}}</nOwIkI>"
 output: [TagOpenOpen(), Text(text="NoWiKi"), TagCloseOpen(padding=""), Text(text="{{test}}"), TagOpenClose(), Text(text="nOwIkI"), TagCloseClose()]
--- a/tests/tokenizer/tags_wikimarkup.mwtest
+++ b/tests/tokenizer/tags_wikimarkup.mwtest
@@ -0,0 +1,523 @@
 name:   basic_italics
 label:  basic italic text
 input:  "''text''"
 output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="text"), TagOpenClose(), Text(text="i"), TagCloseClose()]

 ---

 name:   basic_bold
 label:  basic bold text
 input:  "'''text'''"
 output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="text"), TagOpenClose(), Text(text="b"), TagCloseClose()]

 ---

 name:   basic_ul
 label:  basic unordered list
 input:  "*text"
 output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="text")]

 ---

 name:   basic_ol
 label:  basic ordered list
 input:  "#text"
 output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="text")]

 ---

 name:   basic_dt
 label:  basic description term
 input:  ";text"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="text")]

 ---

 name:   basic_dd
 label:  basic description item
 input:  ":text"
 output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="text")]

 ---

 name:   basic_hr
 label:  basic horizontal rule
 input:  "----"
 output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose()]

 ---

 name:   complex_italics
 label:  italics with a lot in them
 input:  "''this is a&nbsp;test of [[Italic text|italics]] with {{plenty|of|stuff}}''"
 output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of "), WikilinkOpen(), Text(text="Italic text"), WikilinkSeparator(), Text(text="italics"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose(), TagOpenClose(), Text(text="i"), TagCloseClose()]

 ---

 name:   multiline_italics
 label:  italics spanning multiple lines
 input:  "foo\nbar''testing\ntext\nspanning\n\n\n\n\nmultiple\nlines''foo\n\nbar"
 output: [Text(text="foo\nbar"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="testing\ntext\nspanning\n\n\n\n\nmultiple\nlines"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="foo\n\nbar")]

 ---

 name:   unending_italics
 label:  italics without an ending tag
 input:  "''unending formatting!"
 output: [Text(text="''unending formatting!")]

 ---

 name:   misleading_italics_end
 label:  italics with something that looks like an end but isn't
 input:  "''this is 'not' the en'd'<nowiki>''</nowiki>"
 output: [Text(text="''this is 'not' the en'd'"), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="''"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()]
 ]

 ---

 name:   italics_start_outside_end_inside
 label:  italics that start outside a link and end inside it
 input:  "''foo[[bar|baz'']]spam"
 output: [Text(text="''foo"), WikilinkOpen(), Text(text="bar"), WikilinkSeparator(), Text(text="baz''"), WikilinkClose(), Text(text="spam")]

 ---

 name:   italics_start_inside_end_outside
 label:  italics that start inside a link and end outside it
 input:  "[[foo|''bar]]baz''spam"
 output: [Text(text="[[foo|"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar]]baz"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="spam")]

 ---

 name:   complex_bold
 label:  bold with a lot in it
 input:  "'''this is a&nbsp;test of [[Bold text|bold]] with {{plenty|of|stuff}}'''"
 output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of "), WikilinkOpen(), Text(text="Bold text"), WikilinkSeparator(), Text(text="bold"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose(), TagOpenClose(), Text(text="b"), TagCloseClose()]

 ---

 name:   multiline_bold
 label:  bold spanning multiple lines
 input:  "foo\nbar'''testing\ntext\nspanning\n\n\n\n\nmultiple\nlines'''foo\n\nbar"
 output: [Text(text="foo\nbar"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="testing\ntext\nspanning\n\n\n\n\nmultiple\nlines"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="foo\n\nbar")]

 ---

 name:   unending_bold
 label:  bold without an ending tag
 input:  "'''unending formatting!"
 output: [Text(text="'''unending formatting!")]

 ---

 name:   misleading_bold_end
 label:  bold with something that looks like an end but isn't
 input:  "'''this is 'not' the en''d'<nowiki>'''</nowiki>"
 output: [Text(text="'"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="this is 'not' the en"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="d'"), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="'''"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()]

 ---

 name:   bold_start_outside_end_inside
 label:  bold that starts outside a link and ends inside it
 input:  "'''foo[[bar|baz''']]spam"
 output: [Text(text="'''foo"), WikilinkOpen(), Text(text="bar"), WikilinkSeparator(), Text(text="baz'''"), WikilinkClose(), Text(text="spam")]

 ---

 name:   bold_start_inside_end_outside
 label:  bold that starts inside a link and ends outside it
 input:  "[[foo|'''bar]]baz'''spam"
 output: [Text(text="[[foo|"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bar]]baz"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="spam")]

 ---

 name:   bold_and_italics
 label:  bold and italics together
 input:  "this is '''''bold and italic text'''''!"
 output: [Text(text="this is "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold and italic text"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="!")]

 ---

 name:   both_then_bold
 label:  text that starts bold/italic, then is just bold
 input:  "'''''both''bold'''"
 output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="bold"), TagOpenClose(), Text(text="b"), TagCloseClose()]

 ---

 name:   both_then_italics
 label:  text that starts bold/italic, then is just italic
 input:  "'''''both'''italics''"
 output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="italics"), TagOpenClose(), Text(text="i"), TagCloseClose()]

 ---

 name:   bold_then_both
 label:  text that starts just bold, then is bold/italic
 input:  "'''bold''both'''''"
 output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="i"), TagCloseClose(), TagOpenClose(), Text(text="b"), TagCloseClose()]

 ---

 name:   italics_then_both
 label:  text that starts just italic, then is bold/italic
 input:  "''italics'''both'''''"
 output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="italics"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenClose(), Text(text="i"), TagCloseClose()]

 ---

 name:   italics_then_bold
 label:  text that starts italic, then is bold
 input:  "none''italics'''''bold'''none"
 output: [Text(text="none"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="italics"), TagOpenClose(), Text(text="i"), TagCloseClose(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="none")]

 ---

 name:   bold_then_italics
 label:  text that starts bold, then is italic
 input:  "none'''bold'''''italics''none"
 output: [Text(text="none"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="italics"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="none")]

 ---

 name:   five_three
 label:  five ticks to open, three to close (bold)
 input:  "'''''foobar'''"
 output: [Text(text="''"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="foobar"), TagOpenClose(), Text(text="b"), TagCloseClose()]

 ---

 name:   five_two
 label:  five ticks to open, two to close (italic)
 input:  "'''''foobar''"
 output: [Text(text="'''"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="foobar"), TagOpenClose(), Text(text="i"), TagCloseClose()]

 ---

 name:   four
 label:  four ticks
 input:  "foo ''''bar'''' baz"
 output: [Text(text="foo '"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bar'"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text=" baz")]

 ---

 name:   four_two
 label:  four ticks to open, two to close
 input:  "foo ''''bar'' baz"
 output: [Text(text="foo ''"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text=" baz")]

 ---

 name:   two_three
 label:  two ticks to open, three to close
 input:  "foo ''bar''' baz"
 output: [Text(text="foo "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar'"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text=" baz")]

 ---

 name:   two_four
 label:  two ticks to open, four to close
 input:  "foo ''bar'''' baz"
 output: [Text(text="foo "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar''"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text=" baz")]

 ---

 name:   two_three_two
 label:  two ticks to open, three to close, two afterwards
 input:  "foo ''bar''' baz''"
 output: [Text(text="foo "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar''' baz"), TagOpenClose(), Text(text="i"), TagCloseClose()]

 ---

 name:   two_four_four
 label:  two ticks to open, four to close, four afterwards
 input:  "foo ''bar'''' baz''''"
 output: [Text(text="foo ''bar'"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text=" baz'"), TagOpenClose(), Text(text="b"), TagCloseClose()]

 ---

 name:   seven
 label:  seven ticks
 input:  "'''''''seven'''''''"
 output: [Text(text="''"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="seven''"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenClose(), Text(text="i"), TagCloseClose()]

 ---

 name:   complex_ul
 label:  ul with a lot in it
 input:  "* this is a&nbsp;test of an [[Unordered list|ul]] with {{plenty|of|stuff}}"
 output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="Unordered list"), WikilinkSeparator(), Text(text="ul"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()]

 ---

 name:   ul_multiline_template
 label:  ul with a template that spans multiple lines
 input:  "* this has a template with a {{line|\nbreak}}\nthis is not part of the list"
 output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")]

 ---

 name:   ul_adjacent
 label:  multiple adjacent uls
 input:  "a\n*b\n*c\nd\n*e\nf"
 output: [Text(text="a\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf")]

 ---

 name:   ul_depths
 label:  multiple adjacent uls, with differing depths
 input:  "*a\n**b\n***c\n********d\n**e\nf\n***g"
 output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="g")]

 ---

 name:   ul_space_before
 label:  uls with space before them
 input:  "foo    *bar\n *baz\n*buzz"
 output: [Text(text="foo    *bar\n *baz\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="buzz")]

 ---

 name:   ul_interruption
 label:  high-depth ul with something blocking it
 input:  "**f*oobar"
 output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="f*oobar")]

 ---

 name:   complex_ol
 label:  ol with a lot in it
 input:  "# this is a&nbsp;test of an [[Ordered list|ol]] with {{plenty|of|stuff}}"
 output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="Ordered list"), WikilinkSeparator(), Text(text="ol"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()]

 ---

 name:   ol_multiline_template
 label:  ol with a template that spans multiple lines
 input:  "# this has a template with a {{line|\nbreak}}\nthis is not part of the list"
 output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")]

 ---

 name:   ol_adjacent
 label:  multiple adjacent ols
 input:  "a\n#b\n#c\nd\n#e\nf"
 output: [Text(text="a\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf")]

 ---

 name:   ol_depths
 label:  multiple adjacent ols, with differing depths
 input:  "#a\n##b\n###c\n########d\n##e\nf\n###g"
 output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="g")]

 ---

 name:   ol_space_before
 label:  ols with space before them
 input:  "foo    #bar\n #baz\n#buzz"
 output: [Text(text="foo    #bar\n #baz\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="buzz")]

 ---

 name:   ol_interruption
 label:  high-depth ol with something blocking it
 input:  "##f#oobar"
 output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="f#oobar")]

 ---

 name:   ul_ol_mix
 label:  a mix of adjacent uls and ols
 input:  "*a\n*#b\n*##c\n*##*#*#*d\n*#e\nf\n##*g"
 output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="g")]

 ---

 name:   complex_dt
 label:  dt with a lot in it
 input:  "; this is a&nbsp;test of an [[description term|dt]] with {{plenty|of|stuff}}"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="description term"), WikilinkSeparator(), Text(text="dt"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()]

 ---

 name:   dt_multiline_template
 label:  dt with a template that spans multiple lines
 input:  "; this has a template with a {{line|\nbreak}}\nthis is not part of the list"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")]

 ---

 name:   dt_adjacent
 label:  multiple adjacent dts
 input:  "a\n;b\n;c\nd\n;e\nf"
 output: [Text(text="a\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="e\nf")]

 ---

 name:   dt_depths
 label:  multiple adjacent dts, with differing depths
 input:  ";a\n;;b\n;;;c\n;;;;;;;;d\n;;e\nf\n;;;g"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="g")]

 ---

 name:   dt_space_before
 label:  dts with space before them
 input:  "foo    ;bar\n ;baz\n;buzz"
 output: [Text(text="foo    ;bar\n ;baz\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="buzz")]

 ---

 name:   dt_interruption
 label:  high-depth dt with something blocking it
 input:  ";;f;oobar"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="f;oobar")]

 ---

 name:   complex_dd
 label:  dd with a lot in it
 input:  ": this is a&nbsp;test of an [[description item|dd]] with {{plenty|of|stuff}}"
 output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="description item"), WikilinkSeparator(), Text(text="dd"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()]

 ---

 name:   dd_multiline_template
 label:  dd with a template that spans multiple lines
 input:  ": this has a template with a {{line|\nbreak}}\nthis is not part of the list"
 output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")]

 ---

 name:   dd_adjacent
 label:  multiple adjacent dds
 input:  "a\n:b\n:c\nd\n:e\nf"
 output: [Text(text="a\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="e\nf")]

 ---

 name:   dd_depths
 label:  multiple adjacent dds, with differing depths
 input:  ":a\n::b\n:::c\n::::::::d\n::e\nf\n:::g"
 output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="g")]

 ---

 name:   dd_space_before
 label:  dds with space before them
 input:  "foo    :bar\n :baz\n:buzz"
 output: [Text(text="foo    :bar\n :baz\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="buzz")]

 ---

 name:   dd_interruption
 label:  high-depth dd with something blocking it
 input:  "::f:oobar"
 output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="f:oobar")]

 ---

 name:   dt_dd_mix
 label:  a mix of adjacent dts and dds
 input:  ";a\n;:b\n;::c\n;::;:;:;d\n;:e\nf\n::;g"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="g")]

 ---

 name:   dt_dd_mix2
 label:  the correct usage of a dt/dd unit, as in a dl
 input:  ";foo:bar:baz"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="foo"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="bar:baz")]

 ---

 name:   dt_dd_mix3
 label:  another example of correct (but strange) dt/dd usage
 input:  ":;;::foo:bar:baz"
 output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="foo"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="bar:baz")]

 ---

 name:   ul_ol_dt_dd_mix
 label:  an assortment of uls, ols, dds, and dts
 input:  ";:#*foo\n:#*;foo\n#*;:foo\n*;:#foo"
 output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="foo\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="foo\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="foo\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="foo")]

 ---

 name:   hr_text_before
 label:  text before an otherwise-valid hr
 input:  "foo----"
 output: [Text(text="foo----")]

 ---

 name:   hr_text_after
 label:  text after a valid hr
 input:  "----bar"
 output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="bar")]

 ---

 name:   hr_text_before_after
 label:  text at both ends of an otherwise-valid hr
 input:  "foo----bar"
 output: [Text(text="foo----bar")]

 ---

 name:   hr_newlines
 label:  newlines surrounding a valid hr
 input:  "foo\n----\nbar"
 output: [Text(text="foo\n"), TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="\nbar")]

 ---

 name:   hr_adjacent
 label:  two adjacent hrs
 input:  "----\n----"
 output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="\n"), TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose()]

 ---

 name:   hr_adjacent_space
 label:  two adjacent hrs, with a space before the second one, making it invalid
 input:  "----\n ----"
 output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="\n ----")]

 ---

 name:   hr_short
 label:  an invalid three-hyphen-long hr
 input:  "---"
 output: [Text(text="---")]

 ---

 name:   hr_long
 label:  a very long, valid hr
 input:  "------------------------------------------"
 output: [TagOpenOpen(wiki_markup="------------------------------------------"), Text(text="hr"), TagCloseSelfclose()]

 ---

 name:   hr_interruption_short
 label:  an hr that is interrupted, making it invalid
 input:  "---x-"
 output: [Text(text="---x-")]

 ---

 name:   hr_interruption_long
 label:  an hr that is interrupted, but the first part remains valid because it is long enough
 input:  "----x--"
 output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="x--")]

 ---

 name:   nowiki_cancel
 label:  a nowiki tag before a list causes it to not be parsed
 input:  "<nowiki />* Unordered list"
 output: [TagOpenOpen(), Text(text="nowiki"), TagCloseSelfclose(padding=" "), Text(text="* Unordered list")]
--- a/tests/tokenizer/text.mwtest
+++ b/tests/tokenizer/text.mwtest
@@ -23,3 +23,10 @@ name:   unicode2
 label:  additional unicode check for non-BMP codepoints
 input:  "𐌲𐌿𐍄𐌰𐍂𐌰𐌶𐌳𐌰"
 output: [Text(text="𐌲𐌿𐍄𐌰𐍂𐌰𐌶𐌳𐌰")]

 ---

 name:   large
 label:  a lot of text, requiring multiple textbuffer blocks in the C tokenizer
 input:  "ZWfsZYcZyhGbkDYJiguJuuhsNyHGFkFhnjkbLJyXIygTHqcXdhsDkEOTSIKYlBiohLIkiXxvyebUyCGvvBcYqFdtcftGmaAanKXEIyYSEKlTfEEbdGhdePVwVImOyKiHSzAEuGyEVRIKPZaNjQsYqpqARIQfvAklFtQyTJVGlLwjJIxYkiqmHBmdOvTyNqJRbMvouoqXRyOhYDwowtkcZGSOcyzVxibQdnzhDYbrgbatUrlOMRvFSzmLWHRihtXnddwYadPgFWUOxAzAgddJVDXHerawdkrRuWaEXfuwQSkQUmLEJUmrgXDVlXCpciaisfuOUjBldElygamkkXbewzLucKRnAEBimIIotXeslRRhnqQjrypnLQvvdCsKFWPVTZaHvzJMFEahDHWcCbyXgxFvknWjhVfiLSDuFhGoFxqSvhjnnRZLmCMhmWeOgSoanDEInKTWHnbpKyUlabLppITDFFxyWKAnUYJQIcmYnrvMmzmtYvsbCYbebgAhMFVVFAKUSvlkLFYluDpbpBaNFWyfXTaOdSBrfiHDTWGBTUCXMqVvRCIMrEjWpQaGsABkioGnveQWqBTDdRQlxQiUipwfyqAocMddXqdvTHhEwjEzMkOSWVPjJvDtClhYwpvRztPmRKCSpGIpXQqrYtTLmShFdpKtOxGtGOZYIdyUGPjdmyvhJTQMtgYJWUUZnecRjBfQXsyWQWikyONySLzLEqRFqcJYdRNFcGwWZtfZasfFWcvdsHRXoqKlKYihRAOJdrPBDdxksXFwKceQVncmFXfUfBsNgjKzoObVExSnRnjegeEhqxXzPmFcuiasViAFeaXrAxXhSfSyCILkKYpjxNeKynUmdcGAbwRwRnlAFbOSCafmzXddiNpLCFTHBELvArdXFpKUGpSHRekhrMedMRNkQzmSyFKjVwiWwCvbNWjgxJRzYeRxHiCCRMXktmKBxbxGZvOpvZIJOwvGIxcBLzsMFlDqAMLtScdsJtrbIUAvKfcdChXGnBzIxGxXMgxJhayrziaCswdpjJJJhkaYnGhHXqZwOzHFdhhUIEtfjERdLaSPRTDDMHpQtonNaIgXUYhjdbnnKppfMBxgNSOOXJAPtFjfAKnrRDrumZBpNhxMstqjTGBViRkDqbTdXYUirsedifGYzZpQkvdNhtFTOPgsYXYCwZHLcSLSfwfpQKtWfZuRUUryHJsbVsAOQcIJdSKKlOvCeEjUQNRPHKXuBJUjPuaAJJxcDMqyaufqfVwUmHLdjeYZzSiiGLHOTCInpVAalbXXTMLugLiwFiyPSuSFiyJUKVrWjbZAHaJtZnQmnvorRrxdPKThqXzNgTjszQiCoMczRnwGYJMERUWGXFyrSbAqsHmLwLlnJOJoXNsjVehQjVOpQOQJAZWwFZBlgyVIplzLTlFwumPgBLYrUIAJAcmvHPGfHfWQguCjfTYzxYfbohaLFAPwxFRrNuCdCzLlEbuhyYjCmuDBTJDMCdLpNRVqEALjnPSaBPsKWRCKNGwEMFpiEWbYZRwaMopjoUuBUvMpvyLfsPKDrfQLiFOQIWPtLIMoijUEUYfhykHrSKbTtrvjwIzHdWZDVwLIpNkloCqpzIsErxxKAFuFEjikWNYChqYqVslXMtoSWzNhbMuxYbzLfJIcPGoUeGPkGyPQNhDyrjgdKekzftFrRPTuyLYqCArkDcWHTrjPQHfoThBNnTQyMwLEWxEnBXLtzJmFVLGEPrdbEwlXpgYfnVnWoNXgPQKKyiXifpvrmJATzQOzYwFhliiYxlbnsEPKbHYUfJLrwYPfSUwTIHiEvBFMrEtVmqJobfcwsiiEudTIiAnrtuywgKLOiMYbEIOAOJdOXqroPjWnQQcTNxFvkIEIsuHLyhSqSphuSmlvknzydQEnebOreeZwOouXYKlObAkaWHhOdTFLoMCHOWrVKeXjcniaxtgCziKEqWOZUWHJQpcDJzYnnduDZrmxgjZroBRwoP
BUTJMYipsgJwbTSlvMyXXdAmiEWGMiQxhGvHGPLOKeTxNaLnFVbWpiYIVyqN"
 output: [Text(text="ZWfsZYcZyhGbkDYJiguJuuhsNyHGFkFhnjkbLJyXIygTHqcXdhsDkEOTSIKYlBiohLIkiXxvyebUyCGvvBcYqFdtcftGmaAanKXEIyYSEKlTfEEbdGhdePVwVImOyKiHSzAEuGyEVRIKPZaNjQsYqpqARIQfvAklFtQyTJVGlLwjJIxYkiqmHBmdOvTyNqJRbMvouoqXRyOhYDwowtkcZGSOcyzVxibQdnzhDYbrgbatUrlOMRvFSzmLWHRihtXnddwYadPgFWUOxAzAgddJVDXHerawdkrRuWaEXfuwQSkQUmLEJUmrgXDVlXCpciaisfuOUjBldElygamkkXbewzLucKRnAEBimIIotXeslRRhnqQjrypnLQvvdCsKFWPVTZaHvzJMFEahDHWcCbyXgxFvknWjhVfiLSDuFhGoFxqSvhjnnRZLmCMhmWeOgSoanDEInKTWHnbpKyUlabLppITDFFxyWKAnUYJQIcmYnrvMmzmtYvsbCYbebgAhMFVVFAKUSvlkLFYluDpbpBaNFWyfXTaOdSBrfiHDTWGBTUCXMqVvRCIMrEjWpQaGsABkioGnveQWqBTDdRQlxQiUipwfyqAocMddXqdvTHhEwjEzMkOSWVPjJvDtClhYwpvRztPmRKCSpGIpXQqrYtTLmShFdpKtOxGtGOZYIdyUGPjdmyvhJTQMtgYJWUUZnecRjBfQXsyWQWikyONySLzLEqRFqcJYdRNFcGwWZtfZasfFWcvdsHRXoqKlKYihRAOJdrPBDdxksXFwKceQVncmFXfUfBsNgjKzoObVExSnRnjegeEhqxXzPmFcuiasViAFeaXrAxXhSfSyCILkKYpjxNeKynUmdcGAbwRwRnlAFbOSCafmzXddiNpLCFTHBELvArdXFpKUGpSHRekhrMedMRNkQzmSyFKjVwiWwCvbNWjgxJRzYeRxHiCCRMXktmKBxbxGZvOpvZIJOwvGIxcBLzsMFlDqAMLtScdsJtrbIUAvKfcdChXGnBzIxGxXMgxJhayrziaCswdpjJJJhkaYnGhHXqZwOzHFdhhUIEtfjERdLaSPRTDDMHpQtonNaIgXUYhjdbnnKppfMBxgNSOOXJAPtFjfAKnrRDrumZBpNhxMstqjTGBViRkDqbTdXYUirsedifGYzZpQkvdNhtFTOPgsYXYCwZHLcSLSfwfpQKtWfZuRUUryHJsbVsAOQcIJdSKKlOvCeEjUQNRPHKXuBJUjPuaAJJxcDMqyaufqfVwUmHLdjeYZzSiiGLHOTCInpVAalbXXTMLugLiwFiyPSuSFiyJUKVrWjbZAHaJtZnQmnvorRrxdPKThqXzNgTjszQiCoMczRnwGYJMERUWGXFyrSbAqsHmLwLlnJOJoXNsjVehQjVOpQOQJAZWwFZBlgyVIplzLTlFwumPgBLYrUIAJAcmvHPGfHfWQguCjfTYzxYfbohaLFAPwxFRrNuCdCzLlEbuhyYjCmuDBTJDMCdLpNRVqEALjnPSaBPsKWRCKNGwEMFpiEWbYZRwaMopjoUuBUvMpvyLfsPKDrfQLiFOQIWPtLIMoijUEUYfhykHrSKbTtrvjwIzHdWZDVwLIpNkloCqpzIsErxxKAFuFEjikWNYChqYqVslXMtoSWzNhbMuxYbzLfJIcPGoUeGPkGyPQNhDyrjgdKekzftFrRPTuyLYqCArkDcWHTrjPQHfoThBNnTQyMwLEWxEnBXLtzJmFVLGEPrdbEwlXpgYfnVnWoNXgPQKKyiXifpvrmJATzQOzYwFhliiYxlbnsEPKbHYUfJLrwYPfSUwTIHiEvBFMrEtVmqJobfcwsiiEudTIiAnrtuywgKLOiMYbEIOAOJdOXqroPjWnQQcTNxFvkIEIsuHLyhSqSphuSmlvknzydQEnebOreeZwOouXYKlObAkaWHhOdTFLoMCHOWrVKeXjcniaxtgCziKEqWOZUWHJQpcDJzYnnduDZrm
xgjZroBRwoPBUTJMYipsgJwbTSlvMyXXdAmiEWGMiQxhGvHGPLOKeTxNaLnFVbWpiYIVyqN")]
--- a/tests/tokenizer/wikilinks.mwtest
+++ b/tests/tokenizer/wikilinks.mwtest
@@ -40,17 +40,17 @@ output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar|b

 ---

 name:   nested
 label:  a wikilink nested within the value of another
 input:  "[[foo|[[bar]]]]"
 output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), WikilinkOpen(), Text(text="bar"), WikilinkClose(), WikilinkClose()]
 name:   newline_text
 label:  a newline in the middle of the text
 input:  "[[foo|foo\nbar]]"
 output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="foo\nbar"), WikilinkClose()]

 ---

 name:   nested_with_text
 label:  a wikilink nested within the value of another, separated by other data
 input:  "[[foo|a[[b]]c]]"
 output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c"), WikilinkClose()]
 name:   bracket_text
 label:  a left bracket in the middle of the text
 input:  "[[foo|bar[baz]]"
 output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar[baz"), WikilinkClose()]

 ---

@@ -96,13 +96,34 @@ output: [Text(text="[[foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(),

 ---

 name:   invalid_nested_text
 name:   invalid_nested_padding
 label:  invalid wikilink: trying to nest in the wrong context, with a text param
 input:  "[[foo[[bar]]|baz]]"
 output: [Text(text="[[foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="|baz]]")]

 ---

 name:   invalid_nested_text
 label:  invalid wikilink: a wikilink nested within the value of another
 input:  "[[foo|[[bar]]"
 output: [Text(text="[[foo|"), WikilinkOpen(), Text(text="bar"), WikilinkClose()]

 ---

 name:   invalid_nested_text_2
 label:  invalid wikilink: a wikilink nested within the value of another, two pairs of closing brackets
 input:  "[[foo|[[bar]]]]"
 output: [Text(text="[[foo|"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="]]")]

 ---

 name:   invalid_nested_text_padding
 label:  invalid wikilink: a wikilink nested within the value of another, separated by other data
 input:  "[[foo|a[[b]]c]]"
 output: [Text(text="[[foo|a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c]]")]

 ---

 name:   incomplete_open_only
 label:  incomplete wikilinks: just an open
 input:  "[["