diff --git a/CHANGELOG b/CHANGELOG
index 9772f8b..67214fa 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,4 +1,24 @@
-v0.1.1 (19da4d2144) to v0.2:
+v0.3 (released August 24, 2013):
+
+- Added complete support for HTML Tags, including forms like <ref>foo</ref>,
+  <ref name="bar"/>, and wiki-markup tags like bold ('''), italics (''), and
+  lists (*, #, ; and :).
+- Added support for ExternalLinks (http://example.com/ and
+  [http://example.com/ Example]).
+- Wikicode's filter methods are now passed 'recursive=True' by default instead
+  of False. This is a breaking change if you rely on any filter() methods being
+  non-recursive by default.
+- Added a matches() method to Wikicode for page/template name comparisons.
+- The 'obj' param of Wikicode.insert_before(), insert_after(), replace(), and
+  remove() now accepts other Wikicode objects and strings representing parts of
+  wikitext, instead of just nodes. These methods also make all possible
+  substitutions instead of just one.
+- Renamed Template.has_param() to has() for consistency with Template's other
+  methods; has_param() is now an alias.
+- The C tokenizer extension now works on Python 3 in addition to Python 2.7.
+- Various bugfixes, internal changes, and cleanup.
+
+v0.2 (released June 20, 2013):
 
 - The parser now fully supports Python 3 in addition to Python 2.7.
 - Added a C tokenizer extension that is significantly faster than its Python
@@ -24,10 +44,14 @@ v0.1.1 (19da4d2144) to v0.2:
 - Fixed some broken example code in the README; other copyedits.
 - Other bugfixes and code cleanup.
 
-v0.1 (ba94938fe8) to v0.1.1 (19da4d2144):
+v0.1.1 (released September 21, 2012):
 
 - Added support for Comments (<!-- foo -->) and Wikilinks ([[foo]]).
 - Added corresponding ifilter_links() and filter_links() methods to Wikicode.
 - Fixed a bug when parsing incomplete templates.
 - Fixed strip_code() to affect the contents of headings.
 - Various copyedits in documentation and comments.
+
+v0.1 (released August 23, 2012):
+
+- Initial release.
diff --git a/README.rst b/README.rst
index 77c01eb..b5fd912 100644
--- a/README.rst
+++ b/README.rst
@@ -9,7 +9,8 @@ mwparserfromhell
 that provides an easy-to-use and outrageously powerful parser for MediaWiki_
 wikicode. It supports Python 2 and Python 3.
 
-Developed by Earwig_ with help from `Σ`_.
+Developed by Earwig_ with help from `Σ`_. Full documentation is available on
+ReadTheDocs_.
 
 Installation
 ------------
@@ -18,7 +19,7 @@ The easiest way to install the parser is through the `Python Package Index`_,
 so you can install the latest release with ``pip install mwparserfromhell``
 (`get pip`_). Alternatively, get the latest development version::
 
-    git clone git://github.com/earwig/mwparserfromhell.git
+    git clone https://github.com/earwig/mwparserfromhell.git
     cd mwparserfromhell
     python setup.py install
 
@@ -59,13 +60,20 @@ For example::
     >>> print template.get("eggs").value
     spam
 
-Since every node you reach is also a ``Wikicode`` object, it's trivial to get
-nested templates::
+Since nodes can contain other nodes, getting nested templates is trivial::
+
+    >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
+    >>> mwparserfromhell.parse(text).filter_templates()
+    ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']
+
+You can also pass ``recursive=False`` to ``filter_templates()`` and explore
+templates manually. This is possible because nodes can contain additional
+``Wikicode`` objects::
 
     >>> code = mwparserfromhell.parse("{{foo|this {{includes a|template}}}}")
-    >>> print code.filter_templates()
+    >>> print code.filter_templates(recursive=False)
     ['{{foo|this {{includes a|template}}}}']
-    >>> foo = code.filter_templates()[0]
+    >>> foo = code.filter_templates(recursive=False)[0]
     >>> print foo.get(1).value
     this {{includes a|template}}
     >>> print foo.get(1).value.filter_templates()[0]
@@ -73,21 +81,16 @@ nested templates::
     >>> print foo.get(1).value.filter_templates()[0].get(1).value
     template
 
-Additionally, you can include nested templates in ``filter_templates()`` by
-passing ``recursive=True``::
-
-    >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}"
-    >>> mwparserfromhell.parse(text).filter_templates(recursive=True)
-    ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']
-
 Templates can be easily modified to add, remove, or alter params. ``Wikicode``
-can also be treated like a list with ``append()``, ``insert()``, ``remove()``,
-``replace()``, and more::
+objects can be treated like lists, with ``append()``, ``insert()``,
+``remove()``, ``replace()``, and more. They also have a ``matches()`` method
+for comparing page or template names, which takes care of capitalization and
+whitespace::
 
     >>> text = "{{cleanup}} '''Foo''' is a [[bar]]. {{uncategorized}}"
     >>> code = mwparserfromhell.parse(text)
     >>> for template in code.filter_templates():
-    ...     if template.name == "cleanup" and not template.has_param("date"):
+    ...     if template.name.matches("Cleanup") and not template.has("date"):
    ...         template.add("date", "July 2012")
     ...
     >>> print code
@@ -142,6 +145,7 @@ following code (via the API_)::
     return mwparserfromhell.parse(text)
 
 .. _MediaWiki: http://mediawiki.org
+.. _ReadTheDocs: http://mwparserfromhell.readthedocs.org
 .. _Earwig: http://en.wikipedia.org/wiki/User:The_Earwig
 .. _Σ: http://en.wikipedia.org/wiki/User:%CE%A3
 .. _Python Package Index: http://pypi.python.org
diff --git a/docs/api/mwparserfromhell.nodes.rst b/docs/api/mwparserfromhell.nodes.rst
index d1016f9..7043070 100644
--- a/docs/api/mwparserfromhell.nodes.rst
+++ b/docs/api/mwparserfromhell.nodes.rst
@@ -25,6 +25,14 @@ nodes Package
     :undoc-members:
     :show-inheritance:
 
+:mod:`external_link` Module
+---------------------------
+
+.. automodule:: mwparserfromhell.nodes.external_link
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
 :mod:`heading` Module
 ---------------------
 
@@ -46,6 +54,7 @@ nodes Package
 
 .. automodule:: mwparserfromhell.nodes.tag
     :members:
+    :undoc-members:
     :show-inheritance:
 
 :mod:`template` Module
diff --git a/docs/api/mwparserfromhell.rst b/docs/api/mwparserfromhell.rst
index 3ca09c9..0da522e 100644
--- a/docs/api/mwparserfromhell.rst
+++ b/docs/api/mwparserfromhell.rst
@@ -30,6 +30,12 @@ mwparserfromhell Package
     :members:
     :undoc-members:
 
+:mod:`definitions` Module
+-------------------------
+
+.. automodule:: mwparserfromhell.definitions
+    :members:
+
 :mod:`utils` Module
 -------------------
 
diff --git a/docs/changelog.rst b/docs/changelog.rst
index 0e8bbef..b6db9d9 100644
--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -1,10 +1,38 @@
 Changelog
 =========
 
+v0.3
+----
+
+`Released August 24, 2013 <https://github.com/earwig/mwparserfromhell/tree/v0.3>`_
+(`changes <https://github.com/earwig/mwparserfromhell/compare/v0.2...v0.3>`__):
+
+- Added complete support for HTML :py:class:`Tags <.Tag>`, including forms like
+  ``<ref>foo</ref>``, ``<ref name="bar"/>``, and wiki-markup tags like bold
+  (``'''``), italics (``''``), and lists (``*``, ``#``, ``;`` and ``:``).
+- Added support for :py:class:`.ExternalLink`\ s (``http://example.com/`` and
+  ``[http://example.com/ Example]``).
+- :py:class:`Wikicode's <.Wikicode>` :py:meth:`.filter` methods are now passed
+  *recursive=True* by default instead of *False*. **This is a breaking change
+  if you rely on any filter() methods being non-recursive by default.**
+- Added a :py:meth:`.matches` method to :py:class:`~.Wikicode` for
+  page/template name comparisons.
+- The *obj* param of :py:meth:`Wikicode.insert_before() <.insert_before>`,
+  :py:meth:`~.insert_after`, :py:meth:`~.Wikicode.replace`, and
+  :py:meth:`~.Wikicode.remove` now accepts :py:class:`~.Wikicode` objects and
+  strings representing parts of wikitext, instead of just nodes. These methods
+  also make all possible substitutions instead of just one.
+- Renamed :py:meth:`Template.has_param() <.has_param>` to
+  :py:meth:`~.Template.has` for consistency with :py:class:`~.Template`\ 's
+  other methods; :py:meth:`~.has_param` is now an alias.
+- The C tokenizer extension now works on Python 3 in addition to Python 2.7.
+- Various bugfixes, internal changes, and cleanup.
+
 v0.2
 ----
 
-19da4d2144_ to master_ (released June 20, 2013)
+`Released June 20, 2013 <https://github.com/earwig/mwparserfromhell/tree/v0.2>`_
+(`changes <https://github.com/earwig/mwparserfromhell/compare/v0.1.1...v0.2>`__):
 
 - The parser now fully supports Python 3 in addition to Python 2.7.
 - Added a C tokenizer extension that is significantly faster than its Python
@@ -38,7 +66,8 @@ v0.2
 v0.1.1
 ------
 
-ba94938fe8_ to 19da4d2144_ (released September 21, 2012)
+`Released September 21, 2012 <https://github.com/earwig/mwparserfromhell/tree/v0.1.1>`_
+(`changes <https://github.com/earwig/mwparserfromhell/compare/v0.1...v0.1.1>`__):
 
 - Added support for :py:class:`Comments <.Comment>` (``<!-- foo -->``) and
   :py:class:`Wikilinks <.Wikilink>` (``[[foo]]``).
@@ -51,8 +80,6 @@ ba94938fe8_ to 19da4d2144_ (released September 21, 2012)
 v0.1
 ----
 
-ba94938fe8_ (released August 23, 2012)
+`Released August 23, 2012 <https://github.com/earwig/mwparserfromhell/tree/v0.1>`_:
 
-.. _master: https://github.com/earwig/mwparserfromhell/tree/v0.2
-.. _19da4d2144: https://github.com/earwig/mwparserfromhell/tree/v0.1.1
-.. _ba94938fe8: https://github.com/earwig/mwparserfromhell/tree/v0.1
+- Initial release.
diff --git a/docs/index.rst b/docs/index.rst
index 4355b61..a6d2df3 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,15 +1,18 @@
-MWParserFromHell v0.2 Documentation
-===================================
+MWParserFromHell v\ |version| Documentation
+===========================================
 
 :py:mod:`mwparserfromhell` (the *MediaWiki Parser from Hell*) is a Python
 package that provides an easy-to-use and outrageously powerful parser for
 MediaWiki_ wikicode. It supports Python 2 and Python 3.
 
-Developed by Earwig_ with help from `Σ`_.
+Developed by Earwig_ with contributions from `Σ`_, Legoktm_, and others.
+Development occurs on GitHub_.
 
 .. _MediaWiki: http://mediawiki.org
 .. _Earwig: http://en.wikipedia.org/wiki/User:The_Earwig
 .. _Σ: http://en.wikipedia.org/wiki/User:%CE%A3
+.. _Legoktm: http://en.wikipedia.org/wiki/User:Legoktm
+.. _GitHub: https://github.com/earwig/mwparserfromhell
 
 Installation
 ------------
@@ -18,7 +21,7 @@ The easiest way to install the parser is through the `Python Package Index`_,
 so you can install the latest release with ``pip install mwparserfromhell``
 (`get pip`_). Alternatively, get the latest development version::
 
-    git clone git://github.com/earwig/mwparserfromhell.git
+    git clone https://github.com/earwig/mwparserfromhell.git
     cd mwparserfromhell
     python setup.py install
 
diff --git a/docs/usage.rst b/docs/usage.rst
index 2fd19af..974c670 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -27,13 +27,20 @@ some extra methods.
For example:: >>> print template.get("eggs").value spam -Since every node you reach is also a :py:class:`~.Wikicode` object, it's -trivial to get nested templates:: +Since nodes can contain other nodes, getting nested templates is trivial:: + + >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}" + >>> mwparserfromhell.parse(text).filter_templates() + ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}'] + +You can also pass *recursive=False* to :py:meth:`~.filter_templates` and +explore templates manually. This is possible because nodes can contain +additional :py:class:`~.Wikicode` objects:: >>> code = mwparserfromhell.parse("{{foo|this {{includes a|template}}}}") - >>> print code.filter_templates() + >>> print code.filter_templates(recursive=False) ['{{foo|this {{includes a|template}}}}'] - >>> foo = code.filter_templates()[0] + >>> foo = code.filter_templates(recursive=False)[0] >>> print foo.get(1).value this {{includes a|template}} >>> print foo.get(1).value.filter_templates()[0] @@ -41,22 +48,17 @@ trivial to get nested templates:: >>> print foo.get(1).value.filter_templates()[0].get(1).value template -Additionally, you can include nested templates in :py:meth:`~.filter_templates` -by passing *recursive=True*:: - - >>> text = "{{foo|{{bar}}={{baz|{{spam}}}}}}" - >>> mwparserfromhell.parse(text).filter_templates(recursive=True) - ['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}'] - Templates can be easily modified to add, remove, or alter params. -:py:class:`~.Wikicode` can also be treated like a list with +:py:class:`~.Wikicode` objects can be treated like lists, with :py:meth:`~.Wikicode.append`, :py:meth:`~.Wikicode.insert`, -:py:meth:`~.Wikicode.remove`, :py:meth:`~.Wikicode.replace`, and more:: +:py:meth:`~.Wikicode.remove`, :py:meth:`~.Wikicode.replace`, and more. They +also have a :py:meth:`~.Wikicode.matches` method for comparing page or template +names, which takes care of capitalization and whitespace:: >>> text = "{{cleanup}} '''Foo''' is a [[bar]]. {{uncategorized}}" >>> code = mwparserfromhell.parse(text) >>> for template in code.filter_templates(): - ... if template.name == "cleanup" and not template.has_param("date"): + ... if template.name.matches("Cleanup") and not template.has("date"): ... template.add("date", "July 2012") ... >>> print code diff --git a/mwparserfromhell/__init__.py b/mwparserfromhell/__init__.py index 5db2d4c..6a45a11 100644 --- a/mwparserfromhell/__init__.py +++ b/mwparserfromhell/__init__.py @@ -31,9 +31,10 @@ from __future__ import unicode_literals __author__ = "Ben Kurtovic" __copyright__ = "Copyright (C) 2012, 2013 Ben Kurtovic" __license__ = "MIT License" -__version__ = "0.2" +__version__ = "0.3" __email__ = "ben.kurtovic@verizon.net" -from . import compat, nodes, parser, smart_list, string_mixin, utils, wikicode +from . 
import (compat, definitions, nodes, parser, smart_list, string_mixin, + utils, wikicode) parse = utils.parse_anything diff --git a/mwparserfromhell/compat.py b/mwparserfromhell/compat.py old mode 100755 new mode 100644 index bb81513..864605c --- a/mwparserfromhell/compat.py +++ b/mwparserfromhell/compat.py @@ -15,14 +15,12 @@ py3k = sys.version_info[0] == 3 if py3k: bytes = bytes str = str - basestring = str maxsize = sys.maxsize import html.entities as htmlentities else: bytes = str str = unicode - basestring = basestring maxsize = sys.maxint import htmlentitydefs as htmlentities diff --git a/mwparserfromhell/definitions.py b/mwparserfromhell/definitions.py new file mode 100644 index 0000000..9449bcb --- /dev/null +++ b/mwparserfromhell/definitions.py @@ -0,0 +1,91 @@ +# -*- coding: utf-8 -*- +# +# Copyright (C) 2012-2013 Ben Kurtovic +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+ +"""Contains data about certain markup, like HTML tags and external links.""" + +from __future__ import unicode_literals + +__all__ = ["get_html_tag", "is_parsable", "is_visible", "is_single", + "is_single_only", "is_scheme"] + +URI_SCHEMES = { + # [mediawiki/core.git]/includes/DefaultSettings.php @ 374a0ad943 + "http": True, "https": True, "ftp": True, "ftps": True, "ssh": True, + "sftp": True, "irc": True, "ircs": True, "xmpp": False, "sip": False, + "sips": False, "gopher": True, "telnet": True, "nntp": True, + "worldwind": True, "mailto": False, "tel": False, "sms": False, + "news": False, "svn": True, "git": True, "mms": True, "bitcoin": False, + "magnet": False, "urn": False, "geo": False +} + +PARSER_BLACKLIST = [ + # enwiki extensions @ 2013-06-28 + "categorytree", "gallery", "hiero", "imagemap", "inputbox", "math", + "nowiki", "pre", "score", "section", "source", "syntaxhighlight", + "templatedata", "timeline" +] + +INVISIBLE_TAGS = [ + # enwiki extensions @ 2013-06-28 + "categorytree", "gallery", "imagemap", "inputbox", "math", "score", + "section", "templatedata", "timeline" +] + +# [mediawiki/core.git]/includes/Sanitizer.php @ 87a0aef762 +SINGLE_ONLY = ["br", "hr", "meta", "link", "img"] +SINGLE = SINGLE_ONLY + ["li", "dt", "dd"] + +MARKUP_TO_HTML = { + "#": "li", + "*": "li", + ";": "dt", + ":": "dd" +} + +def get_html_tag(markup): + """Return the HTML tag associated with the given wiki-markup.""" + return MARKUP_TO_HTML[markup] + +def is_parsable(tag): + """Return if the given *tag*'s contents should be passed to the parser.""" + return tag.lower() not in PARSER_BLACKLIST + +def is_visible(tag): + """Return whether or not the given *tag* contains visible text.""" + return tag.lower() not in INVISIBLE_TAGS + +def is_single(tag): + """Return whether or not the given *tag* can exist without a close tag.""" + return tag.lower() in SINGLE + +def is_single_only(tag): + """Return whether or not the given *tag* must exist without a close tag.""" + return tag.lower() in SINGLE_ONLY + +def is_scheme(scheme, slashes=True, reverse=False): + """Return whether *scheme* is valid for external links.""" + if reverse: # Convenience for C + scheme = scheme[::-1] + scheme = scheme.lower() + if slashes: + return scheme in URI_SCHEMES + return scheme in URI_SCHEMES and not URI_SCHEMES[scheme] diff --git a/mwparserfromhell/nodes/__init__.py b/mwparserfromhell/nodes/__init__.py index faaa0b2..ba97b3f 100644 --- a/mwparserfromhell/nodes/__init__.py +++ b/mwparserfromhell/nodes/__init__.py @@ -69,6 +69,7 @@ from . 
import extras from .text import Text from .argument import Argument from .comment import Comment +from .external_link import ExternalLink from .heading import Heading from .html_entity import HTMLEntity from .tag import Tag diff --git a/mwparserfromhell/nodes/external_link.py b/mwparserfromhell/nodes/external_link.py new file mode 100644 index 0000000..d74f6b3 --- /dev/null +++ b/mwparserfromhell/nodes/external_link.py @@ -0,0 +1,97 @@ +# -*- coding: utf-8 -*- +# +# Copyright (C) 2012-2013 Ben Kurtovic +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. + +from __future__ import unicode_literals + +from . import Node +from ..compat import str +from ..utils import parse_anything + +__all__ = ["ExternalLink"] + +class ExternalLink(Node): + """Represents an external link, like ``[http://example.com/ Example]``.""" + + def __init__(self, url, title=None, brackets=True): + super(ExternalLink, self).__init__() + self._url = url + self._title = title + self._brackets = brackets + + def __unicode__(self): + if self.brackets: + if self.title is not None: + return "[" + str(self.url) + " " + str(self.title) + "]" + return "[" + str(self.url) + "]" + return str(self.url) + + def __iternodes__(self, getter): + yield None, self + for child in getter(self.url): + yield self.url, child + if self.title is not None: + for child in getter(self.title): + yield self.title, child + + def __strip__(self, normalize, collapse): + if self.brackets: + if self.title: + return self.title.strip_code(normalize, collapse) + return None + return self.url.strip_code(normalize, collapse) + + def __showtree__(self, write, get, mark): + if self.brackets: + write("[") + get(self.url) + if self.title is not None: + get(self.title) + if self.brackets: + write("]") + + @property + def url(self): + """The URL of the link target, as a :py:class:`~.Wikicode` object.""" + return self._url + + @property + def title(self): + """The link title (if given), as a :py:class:`~.Wikicode` object.""" + return self._title + + @property + def brackets(self): + """Whether to enclose the URL in brackets or display it straight.""" + return self._brackets + + @url.setter + def url(self, value): + from ..parser import contexts + self._url = parse_anything(value, contexts.EXT_LINK_URI) + + @title.setter + def title(self, value): + self._title = None if value is None else parse_anything(value) + + @brackets.setter + def brackets(self, value): + self._brackets = bool(value) diff --git a/mwparserfromhell/nodes/extras/attribute.py 
b/mwparserfromhell/nodes/extras/attribute.py index ebb65ab..8f7f453 100644 --- a/mwparserfromhell/nodes/extras/attribute.py +++ b/mwparserfromhell/nodes/extras/attribute.py @@ -36,18 +36,34 @@ class Attribute(StringMixIn): whose value is ``"foo"``. """ - def __init__(self, name, value=None, quoted=True): + def __init__(self, name, value=None, quoted=True, pad_first=" ", + pad_before_eq="", pad_after_eq=""): super(Attribute, self).__init__() self._name = name self._value = value self._quoted = quoted + self._pad_first = pad_first + self._pad_before_eq = pad_before_eq + self._pad_after_eq = pad_after_eq def __unicode__(self): - if self.value: + result = self.pad_first + str(self.name) + self.pad_before_eq + if self.value is not None: + result += "=" + self.pad_after_eq if self.quoted: - return str(self.name) + '="' + str(self.value) + '"' - return str(self.name) + "=" + str(self.value) - return str(self.name) + return result + '"' + str(self.value) + '"' + return result + str(self.value) + return result + + def _set_padding(self, attr, value): + """Setter for the value of a padding attribute.""" + if not value: + setattr(self, attr, "") + else: + value = str(value) + if not value.isspace(): + raise ValueError("padding must be entirely whitespace") + setattr(self, attr, value) @property def name(self): @@ -64,14 +80,41 @@ class Attribute(StringMixIn): """Whether the attribute's value is quoted with double quotes.""" return self._quoted + @property + def pad_first(self): + """Spacing to insert right before the attribute.""" + return self._pad_first + + @property + def pad_before_eq(self): + """Spacing to insert right before the equal sign.""" + return self._pad_before_eq + + @property + def pad_after_eq(self): + """Spacing to insert right after the equal sign.""" + return self._pad_after_eq + @name.setter - def name(self, newval): - self._name = parse_anything(newval) + def name(self, value): + self._name = parse_anything(value) @value.setter def value(self, newval): - self._value = parse_anything(newval) + self._value = None if newval is None else parse_anything(newval) @quoted.setter - def quoted(self, newval): - self._quoted = bool(newval) + def quoted(self, value): + self._quoted = bool(value) + + @pad_first.setter + def pad_first(self, value): + self._set_padding("_pad_first", value) + + @pad_before_eq.setter + def pad_before_eq(self, value): + self._set_padding("_pad_before_eq", value) + + @pad_after_eq.setter + def pad_after_eq(self, value): + self._set_padding("_pad_after_eq", value) diff --git a/mwparserfromhell/nodes/tag.py b/mwparserfromhell/nodes/tag.py index eaf2b6e..06f43d0 100644 --- a/mwparserfromhell/nodes/tag.py +++ b/mwparserfromhell/nodes/tag.py @@ -22,8 +22,10 @@ from __future__ import unicode_literals -from . import Node, Text +from . 
import Node
+from .extras import Attribute
 from ..compat import str
+from ..definitions import is_visible
 from ..utils import parse_anything
 
 __all__ = ["Tag"]
@@ -31,146 +33,85 @@ __all__ = ["Tag"]
 class Tag(Node):
     """Represents an HTML-style tag in wikicode, like ``<ref>``."""
 
-    TAG_UNKNOWN = 0
-
-    # Basic HTML:
-    TAG_ITALIC = 1
-    TAG_BOLD = 2
-    TAG_UNDERLINE = 3
-    TAG_STRIKETHROUGH = 4
-    TAG_UNORDERED_LIST = 5
-    TAG_ORDERED_LIST = 6
-    TAG_DEF_TERM = 7
-    TAG_DEF_ITEM = 8
-    TAG_BLOCKQUOTE = 9
-    TAG_RULE = 10
-    TAG_BREAK = 11
-    TAG_ABBR = 12
-    TAG_PRE = 13
-    TAG_MONOSPACE = 14
-    TAG_CODE = 15
-    TAG_SPAN = 16
-    TAG_DIV = 17
-    TAG_FONT = 18
-    TAG_SMALL = 19
-    TAG_BIG = 20
-    TAG_CENTER = 21
-
-    # MediaWiki parser hooks:
-    TAG_REF = 101
-    TAG_GALLERY = 102
-    TAG_MATH = 103
-    TAG_NOWIKI = 104
-    TAG_NOINCLUDE = 105
-    TAG_INCLUDEONLY = 106
-    TAG_ONLYINCLUDE = 107
-
-    # Additional parser hooks:
-    TAG_SYNTAXHIGHLIGHT = 201
-    TAG_POEM = 202
-
-    # Lists of tags:
-    TAGS_INVISIBLE = set((TAG_REF, TAG_GALLERY, TAG_MATH, TAG_NOINCLUDE))
-    TAGS_VISIBLE = set(range(300)) - TAGS_INVISIBLE
-
-    def __init__(self, type_, tag, contents=None, attrs=None, showtag=True,
-                 self_closing=False, open_padding=0, close_padding=0):
+    def __init__(self, tag, contents=None, attrs=None, wiki_markup=None,
+                 self_closing=False, invalid=False, implicit=False, padding="",
+                 closing_tag=None):
         super(Tag, self).__init__()
-        self._type = type_
         self._tag = tag
-        self._contents = contents
-        if attrs:
-            self._attrs = attrs
+        if contents is None and not self_closing:
+            self._contents = parse_anything("")
         else:
-            self._attrs = []
-        self._showtag = showtag
+            self._contents = contents
+        self._attrs = attrs if attrs else []
+        self._wiki_markup = wiki_markup
         self._self_closing = self_closing
-        self._open_padding = open_padding
-        self._close_padding = close_padding
+        self._invalid = invalid
+        self._implicit = implicit
+        self._padding = padding
+        if closing_tag:
+            self._closing_tag = closing_tag
+        else:
+            self._closing_tag = tag
 
     def __unicode__(self):
-        if not self.showtag:
-            open_, close = self._translate()
+        if self.wiki_markup:
             if self.self_closing:
-                return open_
+                return self.wiki_markup
             else:
-                return open_ + str(self.contents) + close
+                return self.wiki_markup + str(self.contents) + self.wiki_markup
 
-        result = "<" + str(self.tag)
-        if self.attrs:
-            result += " " + " ".join([str(attr) for attr in self.attrs])
-        if self.self_closing:
-            result += " " * self.open_padding + "/>"
-        else:
-            result += " " * self.open_padding + ">" + str(self.contents)
-            result += "</" + str(self.tag) + " " * self.close_padding + ">"
+        result = ("</" if self.invalid else "<") + str(self.tag)
+        if self.attributes:
+            result += "".join([str(attr) for attr in self.attributes])
+        if self.self_closing:
+            result += self.padding + (">" if self.implicit else "/>")
+        else:
+            result += self.padding + ">" + str(self.contents)
+            result += "</" + str(self.closing_tag) + ">"
         return result
 
     def __iternodes__(self, getter):
         yield None, self
-        if self.showtag:
+        if not self.wiki_markup:
             for child in getter(self.tag):
                 yield self.tag, child
-        for attr in self.attrs:
+        for attr in self.attributes:
             for child in getter(attr.name):
                 yield attr.name, child
             if attr.value:
                 for child in getter(attr.value):
                     yield attr.value, child
-        for child in getter(self.contents):
-            yield self.contents, child
+        if self.contents:
+            for child in getter(self.contents):
+                yield self.contents, child
+        if not self.self_closing and not self.wiki_markup and self.closing_tag:
+            for child in getter(self.closing_tag):
+                yield self.closing_tag, child
 
     def __strip__(self, normalize, collapse):
-        if self.type in self.TAGS_VISIBLE:
+        if self.contents and is_visible(self.tag):
             return self.contents.strip_code(normalize, collapse)
         return None
 
     def __showtree__(self, write, get, mark):
-        tagnodes = self.tag.nodes
-        if (not self.attrs and len(tagnodes) == 1 and
-                isinstance(tagnodes[0], Text)):
-            write("<" + str(tagnodes[0]) + ">")
-        else:
-            write("<")
-            get(self.tag)
-            for attr in self.attrs:
-                get(attr.name)
-                if not attr.value:
-                    continue
-                write("    = ")
-                mark()
-                get(attr.value)
+        write("</" if self.invalid else "<")
+        get(self.tag)
+        for attr in self.attributes:
+            get(attr.name)
+            if not attr.value:
+                continue
+            write("    = ")
+            mark()
+            get(attr.value)
+        if self.self_closing:
+            write(">" if self.implicit else "/>")
+        else:
             write(">")
-        get(self.contents)
-        if len(tagnodes) == 1 and isinstance(tagnodes[0], Text):
-            write("</" + str(tagnodes[0]) + ">")
-        else:
-            write("</")
-            get(self.tag)
+            get(self.contents)
+            write("</")
+            get(self.closing_tag)
             write(">")
 
-    def _translate(self):
-        """If the HTML-style tag has a wikicode representation, return that.
-
-        For example, ``<b>Foo</b>`` can be represented as ``'''Foo'''``. This
-        returns a tuple of the character starting the sequence and the
-        character ending it.
-        """
-        translations = {
-            self.TAG_ITALIC: ("''", "''"),
-            self.TAG_BOLD: ("'''", "'''"),
-            self.TAG_UNORDERED_LIST: ("*", ""),
-            self.TAG_ORDERED_LIST: ("#", ""),
-            self.TAG_DEF_TERM: (";", ""),
-            self.TAG_DEF_ITEM: (":", ""),
-            self.TAG_RULE: ("----", ""),
-        }
-        return translations[self.type]
-
-    @property
-    def type(self):
-        """The tag type."""
-        return self._type
-
     @property
     def tag(self):
         """The tag itself, as a :py:class:`~.Wikicode` object."""
@@ -182,7 +123,7 @@ class Tag(Node):
         return self._contents
 
     @property
-    def attrs(self):
+    def attributes(self):
         """The list of attributes affecting the tag.
 
         Each attribute is an instance of :py:class:`~.Attribute`.
@@ -190,52 +131,142 @@ class Tag(Node):
         return self._attrs
 
     @property
-    def showtag(self):
-        """Whether to show the tag itself instead of a wikicode version."""
-        return self._showtag
+    def wiki_markup(self):
+        """The wikified version of a tag to show instead of HTML.
+
+        If set to a value, this will be displayed instead of the brackets.
+        For example, set to ``''`` to replace ``<i>`` or ``----`` to replace
+        ``<hr>``.
+        """
+        return self._wiki_markup
 
     @property
     def self_closing(self):
-        """Whether the tag is self-closing with no content."""
+        """Whether the tag is self-closing with no content (like ``<br/>``)."""
         return self._self_closing
 
     @property
-    def open_padding(self):
-        """How much spacing to insert before the first closing >."""
-        return self._open_padding
+    def invalid(self):
+        """Whether the tag starts with a slash after the opening bracket.
+
+        This makes the tag look like a lone close tag. It is technically
+        invalid and is only parsable Wikicode when the tag itself is
+        single-only, like ``<br>`` and ``<img>``. See
+        :py:func:`.definitions.is_single_only`.
+        """
+        return self._invalid
 
     @property
-    def close_padding(self):
-        """How much spacing to insert before the last closing >."""
-        return self._close_padding
+    def implicit(self):
+        """Whether the tag is implicitly self-closing, with no ending slash.
 
-    @type.setter
-    def type(self, value):
-        value = int(value)
-        if value not in self.TAGS_INVISIBLE | self.TAGS_VISIBLE:
-            raise ValueError(value)
-        self._type = value
+        This is only possible for specific "single" tags like ``<br>`` and
+        ``<li>``. See :py:func:`.definitions.is_single`. This field only has an
+        effect if :py:attr:`self_closing` is also ``True``.
+        """
+        return self._implicit
+
+    @property
+    def padding(self):
+        """Spacing to insert before the first closing ``>``."""
+        return self._padding
+
+    @property
+    def closing_tag(self):
+        """The closing tag, as a :py:class:`~.Wikicode` object.
+
+        This will usually equal :py:attr:`tag`, unless there is additional
+        spacing, comments, or the like.
+        """
+        return self._closing_tag
 
     @tag.setter
     def tag(self, value):
-        self._tag = parse_anything(value)
+        self._tag = self._closing_tag = parse_anything(value)
 
     @contents.setter
     def contents(self, value):
         self._contents = parse_anything(value)
 
-    @showtag.setter
-    def showtag(self, value):
-        self._showtag = bool(value)
+    @wiki_markup.setter
+    def wiki_markup(self, value):
+        self._wiki_markup = str(value) if value else None
 
     @self_closing.setter
     def self_closing(self, value):
         self._self_closing = bool(value)
 
-    @open_padding.setter
-    def open_padding(self, value):
-        self._open_padding = int(value)
+    @invalid.setter
+    def invalid(self, value):
+        self._invalid = bool(value)
+
+    @implicit.setter
+    def implicit(self, value):
+        self._implicit = bool(value)
 
-    @close_padding.setter
-    def close_padding(self, value):
-        self._close_padding = int(value)
+    @padding.setter
+    def padding(self, value):
+        if not value:
+            self._padding = ""
+        else:
+            value = str(value)
+            if not value.isspace():
+                raise ValueError("padding must be entirely whitespace")
+            self._padding = value
+
+    @closing_tag.setter
+    def closing_tag(self, value):
+        self._closing_tag = parse_anything(value)
+
+    def has(self, name):
+        """Return whether any attribute in the tag has the given *name*.
+
+        Note that a tag may have multiple attributes with the same name, but
+        only the last one is read by the MediaWiki parser.
+        """
+        for attr in self.attributes:
+            if attr.name == name.strip():
+                return True
+        return False
+
+    def get(self, name):
+        """Get the attribute with the given *name*.
+
+        The returned object is a :py:class:`~.Attribute` instance. Raises
+        :py:exc:`ValueError` if no attribute has this name. Since multiple
+        attributes can have the same name, we'll return the last match, since
+        all but the last are ignored by the MediaWiki parser.
+        """
+        for attr in reversed(self.attributes):
+            if attr.name == name.strip():
+                return attr
+        raise ValueError(name)
+
+    def add(self, name, value=None, quoted=True, pad_first=" ",
+            pad_before_eq="", pad_after_eq=""):
+        """Add an attribute with the given *name* and *value*.
+
+        *name* and *value* can be anything parsable by
+        :py:func:`.utils.parse_anything`; *value* can be omitted if the
+        attribute is valueless. *quoted* is a bool telling whether to wrap the
+        *value* in double quotes (this is recommended). *pad_first*,
+        *pad_before_eq*, and *pad_after_eq* are whitespace used as padding
+        before the name, before the equal sign (or after the name if no value),
+        and after the equal sign (ignored if no value), respectively.
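+
+        For example, ``add("class", "foo")`` with the default padding renders
+        as `` class="foo"`` inside the tag.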
+ """ + if value is not None: + value = parse_anything(value) + attr = Attribute(parse_anything(name), value, quoted) + attr.pad_first = pad_first + attr.pad_before_eq = pad_before_eq + attr.pad_after_eq = pad_after_eq + self.attributes.append(attr) + return attr + + def remove(self, name): + """Remove all attributes with the given *name*.""" + attrs = [attr for attr in self.attributes if attr.name == name.strip()] + if not attrs: + raise ValueError(name) + for attr in attrs: + self.attributes.remove(attr) diff --git a/mwparserfromhell/nodes/template.py b/mwparserfromhell/nodes/template.py index 6dfc4f0..a6b1665 100644 --- a/mwparserfromhell/nodes/template.py +++ b/mwparserfromhell/nodes/template.py @@ -26,7 +26,7 @@ import re from . import HTMLEntity, Node, Text from .extras import Parameter -from ..compat import basestring, str +from ..compat import str from ..utils import parse_anything __all__ = ["Template"] @@ -84,7 +84,7 @@ class Template(Node): replacement = str(HTMLEntity(value=ord(char))) for node in code.filter_text(recursive=False): if char in node: - code.replace(node, node.replace(char, replacement)) + code.replace(node, node.replace(char, replacement), False) def _blank_param_value(self, value): """Remove the content from *value* while keeping its whitespace. @@ -164,15 +164,15 @@ class Template(Node): def name(self, value): self._name = parse_anything(value) - def has_param(self, name, ignore_empty=True): + def has(self, name, ignore_empty=True): """Return ``True`` if any parameter in the template is named *name*. With *ignore_empty*, ``False`` will be returned even if the template contains a parameter with the name *name*, if the parameter's value is empty. Note that a template may have multiple parameters with the - same name. + same name, but only the last one is read by the MediaWiki parser. """ - name = name.strip() if isinstance(name, basestring) else str(name) + name = str(name).strip() for param in self.params: if param.name.strip() == name: if ignore_empty and not param.value.strip(): @@ -180,6 +180,9 @@ class Template(Node): return True return False + has_param = lambda self, *args, **kwargs: self.has(*args, **kwargs) + has_param.__doc__ = "Alias for :py:meth:`has`." + def get(self, name): """Get the parameter whose name is *name*. @@ -188,7 +191,7 @@ class Template(Node): parameters can have the same name, we'll return the last match, since the last parameter is the only one read by the MediaWiki parser. """ - name = name.strip() if isinstance(name, basestring) else str(name) + name = str(name).strip() for param in reversed(self.params): if param.name.strip() == name: return param @@ -226,7 +229,7 @@ class Template(Node): name, value = parse_anything(name), parse_anything(value) self._surface_escape(value, "|") - if self.has_param(name): + if self.has(name): self.remove(name, keep_field=True) existing = self.get(name) if showkey is not None: @@ -291,7 +294,7 @@ class Template(Node): the first instance if none have dependents, otherwise the one with dependents will be kept). """ - name = name.strip() if isinstance(name, basestring) else str(name) + name = str(name).strip() removed = False to_remove = [] for i, param in enumerate(self.params): diff --git a/mwparserfromhell/parser/__init__.py b/mwparserfromhell/parser/__init__.py index 1fb95b5..22c3dc2 100644 --- a/mwparserfromhell/parser/__init__.py +++ b/mwparserfromhell/parser/__init__.py @@ -46,16 +46,15 @@ class Parser(object): :py:class:`~.Node`\ s by the :py:class:`~.Builder`. 
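+
+    For example, ``Parser().parse(text)`` tokenizes *text* and returns the
+    resulting :py:class:`~.Wikicode` tree.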
""" - def __init__(self, text): - self.text = text + def __init__(self): if use_c and CTokenizer: self._tokenizer = CTokenizer() else: self._tokenizer = Tokenizer() self._builder = Builder() - def parse(self): - """Return a string as a parsed :py:class:`~.Wikicode` object tree.""" - tokens = self._tokenizer.tokenize(self.text) + def parse(self, text, context=0): + """Parse *text*, returning a :py:class:`~.Wikicode` object tree.""" + tokens = self._tokenizer.tokenize(text, context) code = self._builder.build(tokens) return code diff --git a/mwparserfromhell/parser/builder.py b/mwparserfromhell/parser/builder.py index 2cd7831..d31f450 100644 --- a/mwparserfromhell/parser/builder.py +++ b/mwparserfromhell/parser/builder.py @@ -24,8 +24,8 @@ from __future__ import unicode_literals from . import tokens from ..compat import str -from ..nodes import (Argument, Comment, Heading, HTMLEntity, Tag, Template, - Text, Wikilink) +from ..nodes import (Argument, Comment, ExternalLink, Heading, HTMLEntity, Tag, + Template, Text, Wikilink) from ..nodes.extras import Attribute, Parameter from ..smart_list import SmartList from ..wikicode import Wikicode @@ -83,7 +83,7 @@ class Builder(object): tokens.TemplateClose)): self._tokens.append(token) value = self._pop() - if not key: + if key is None: key = self._wrap([Text(str(default))]) return Parameter(key, value, showkey) else: @@ -142,6 +142,22 @@ class Builder(object): else: self._write(self._handle_token(token)) + def _handle_external_link(self, token): + """Handle when an external link is at the head of the tokens.""" + brackets, url = token.brackets, None + self._push() + while self._tokens: + token = self._tokens.pop() + if isinstance(token, tokens.ExternalLinkSeparator): + url = self._pop() + self._push() + elif isinstance(token, tokens.ExternalLinkClose): + if url is not None: + return ExternalLink(url, self._pop(), brackets) + return ExternalLink(self._pop(), brackets=brackets) + else: + self._write(self._handle_token(token)) + def _handle_entity(self): """Handle a case where an HTML entity is at the head of the tokens.""" token = self._tokens.pop() @@ -170,7 +186,7 @@ class Builder(object): self._write(self._handle_token(token)) def _handle_comment(self): - """Handle a case where a hidden comment is at the head of the tokens.""" + """Handle a case where an HTML comment is at the head of the tokens.""" self._push() while self._tokens: token = self._tokens.pop() @@ -180,7 +196,7 @@ class Builder(object): else: self._write(self._handle_token(token)) - def _handle_attribute(self): + def _handle_attribute(self, start): """Handle a case where a tag attribute is at the head of the tokens.""" name, quoted = None, False self._push() @@ -191,37 +207,46 @@ class Builder(object): self._push() elif isinstance(token, tokens.TagAttrQuote): quoted = True - elif isinstance(token, (tokens.TagAttrStart, - tokens.TagCloseOpen)): + elif isinstance(token, (tokens.TagAttrStart, tokens.TagCloseOpen, + tokens.TagCloseSelfclose)): self._tokens.append(token) - if name is not None: - return Attribute(name, self._pop(), quoted) - return Attribute(self._pop(), quoted=quoted) + if name: + value = self._pop() + else: + name, value = self._pop(), None + return Attribute(name, value, quoted, start.pad_first, + start.pad_before_eq, start.pad_after_eq) else: self._write(self._handle_token(token)) def _handle_tag(self, token): """Handle a case where a tag is at the head of the tokens.""" - type_, showtag = token.type, token.showtag - attrs = [] + close_tokens = 
(tokens.TagCloseSelfclose, tokens.TagCloseClose) + implicit, attrs, contents, closing_tag = False, [], None, None + wiki_markup, invalid = token.wiki_markup, token.invalid or False self._push() while self._tokens: token = self._tokens.pop() if isinstance(token, tokens.TagAttrStart): - attrs.append(self._handle_attribute()) + attrs.append(self._handle_attribute(token)) elif isinstance(token, tokens.TagCloseOpen): - open_pad = token.padding + padding = token.padding or "" tag = self._pop() self._push() - elif isinstance(token, tokens.TagCloseSelfclose): - tag = self._pop() - return Tag(type_, tag, attrs=attrs, showtag=showtag, - self_closing=True, open_padding=token.padding) elif isinstance(token, tokens.TagOpenClose): contents = self._pop() - elif isinstance(token, tokens.TagCloseClose): - return Tag(type_, tag, contents, attrs, showtag, False, - open_pad, token.padding) + self._push() + elif isinstance(token, close_tokens): + if isinstance(token, tokens.TagCloseSelfclose): + tag = self._pop() + self_closing = True + padding = token.padding or "" + implicit = token.implicit or False + else: + self_closing = False + closing_tag = self._pop() + return Tag(tag, contents, attrs, wiki_markup, self_closing, + invalid, implicit, padding, closing_tag) else: self._write(self._handle_token(token)) @@ -235,6 +260,8 @@ class Builder(object): return self._handle_argument() elif isinstance(token, tokens.WikilinkOpen): return self._handle_wikilink() + elif isinstance(token, tokens.ExternalLinkOpen): + return self._handle_external_link(token) elif isinstance(token, tokens.HTMLEntityStart): return self._handle_entity() elif isinstance(token, tokens.HeadingStart): diff --git a/mwparserfromhell/parser/contexts.py b/mwparserfromhell/parser/contexts.py index 896d137..33da8f7 100644 --- a/mwparserfromhell/parser/contexts.py +++ b/mwparserfromhell/parser/contexts.py @@ -51,6 +51,12 @@ Local (stack-specific) contexts: * :py:const:`WIKILINK_TITLE` * :py:const:`WIKILINK_TEXT` +* :py:const:`EXT_LINK` + + * :py:const:`EXT_LINK_URI` + * :py:const:`EXT_LINK_TITLE` + * :py:const:`EXT_LINK_BRACKETS` + * :py:const:`HEADING` * :py:const:`HEADING_LEVEL_1` @@ -60,7 +66,21 @@ Local (stack-specific) contexts: * :py:const:`HEADING_LEVEL_5` * :py:const:`HEADING_LEVEL_6` -* :py:const:`COMMENT` +* :py:const:`TAG` + + * :py:const:`TAG_OPEN` + * :py:const:`TAG_ATTR` + * :py:const:`TAG_BODY` + * :py:const:`TAG_CLOSE` + +* :py:const:`STYLE` + + * :py:const:`STYLE_ITALICS` + * :py:const:`STYLE_BOLD` + * :py:const:`STYLE_PASS_AGAIN` + * :py:const:`STYLE_SECOND_PASS` + +* :py:const:`DL_TERM` * :py:const:`SAFETY_CHECK` @@ -74,41 +94,76 @@ Local (stack-specific) contexts: Global contexts: * :py:const:`GL_HEADING` + +Aggregate contexts: + +* :py:const:`FAIL` +* :py:const:`UNSAFE` +* :py:const:`DOUBLE` +* :py:const:`INVALID_LINK` + """ # Local contexts: -TEMPLATE = 0b00000000000000000111 -TEMPLATE_NAME = 0b00000000000000000001 -TEMPLATE_PARAM_KEY = 0b00000000000000000010 -TEMPLATE_PARAM_VALUE = 0b00000000000000000100 - -ARGUMENT = 0b00000000000000011000 -ARGUMENT_NAME = 0b00000000000000001000 -ARGUMENT_DEFAULT = 0b00000000000000010000 - -WIKILINK = 0b00000000000001100000 -WIKILINK_TITLE = 0b00000000000000100000 -WIKILINK_TEXT = 0b00000000000001000000 - -HEADING = 0b00000001111110000000 -HEADING_LEVEL_1 = 0b00000000000010000000 -HEADING_LEVEL_2 = 0b00000000000100000000 -HEADING_LEVEL_3 = 0b00000000001000000000 -HEADING_LEVEL_4 = 0b00000000010000000000 -HEADING_LEVEL_5 = 0b00000000100000000000 -HEADING_LEVEL_6 = 0b00000001000000000000 - 
-COMMENT = 0b00000010000000000000 - -SAFETY_CHECK = 0b11111100000000000000 -HAS_TEXT = 0b00000100000000000000 -FAIL_ON_TEXT = 0b00001000000000000000 -FAIL_NEXT = 0b00010000000000000000 -FAIL_ON_LBRACE = 0b00100000000000000000 -FAIL_ON_RBRACE = 0b01000000000000000000 -FAIL_ON_EQUALS = 0b10000000000000000000 +TEMPLATE_NAME = 1 << 0 +TEMPLATE_PARAM_KEY = 1 << 1 +TEMPLATE_PARAM_VALUE = 1 << 2 +TEMPLATE = TEMPLATE_NAME + TEMPLATE_PARAM_KEY + TEMPLATE_PARAM_VALUE + +ARGUMENT_NAME = 1 << 3 +ARGUMENT_DEFAULT = 1 << 4 +ARGUMENT = ARGUMENT_NAME + ARGUMENT_DEFAULT + +WIKILINK_TITLE = 1 << 5 +WIKILINK_TEXT = 1 << 6 +WIKILINK = WIKILINK_TITLE + WIKILINK_TEXT + +EXT_LINK_URI = 1 << 7 +EXT_LINK_TITLE = 1 << 8 +EXT_LINK_BRACKETS = 1 << 9 +EXT_LINK = EXT_LINK_URI + EXT_LINK_TITLE + EXT_LINK_BRACKETS + +HEADING_LEVEL_1 = 1 << 10 +HEADING_LEVEL_2 = 1 << 11 +HEADING_LEVEL_3 = 1 << 12 +HEADING_LEVEL_4 = 1 << 13 +HEADING_LEVEL_5 = 1 << 14 +HEADING_LEVEL_6 = 1 << 15 +HEADING = (HEADING_LEVEL_1 + HEADING_LEVEL_2 + HEADING_LEVEL_3 + + HEADING_LEVEL_4 + HEADING_LEVEL_5 + HEADING_LEVEL_6) + +TAG_OPEN = 1 << 16 +TAG_ATTR = 1 << 17 +TAG_BODY = 1 << 18 +TAG_CLOSE = 1 << 19 +TAG = TAG_OPEN + TAG_ATTR + TAG_BODY + TAG_CLOSE + +STYLE_ITALICS = 1 << 20 +STYLE_BOLD = 1 << 21 +STYLE_PASS_AGAIN = 1 << 22 +STYLE_SECOND_PASS = 1 << 23 +STYLE = STYLE_ITALICS + STYLE_BOLD + STYLE_PASS_AGAIN + STYLE_SECOND_PASS + +DL_TERM = 1 << 24 + +HAS_TEXT = 1 << 25 +FAIL_ON_TEXT = 1 << 26 +FAIL_NEXT = 1 << 27 +FAIL_ON_LBRACE = 1 << 28 +FAIL_ON_RBRACE = 1 << 29 +FAIL_ON_EQUALS = 1 << 30 +SAFETY_CHECK = (HAS_TEXT + FAIL_ON_TEXT + FAIL_NEXT + FAIL_ON_LBRACE + + FAIL_ON_RBRACE + FAIL_ON_EQUALS) # Global contexts: -GL_HEADING = 0b1 +GL_HEADING = 1 << 0 + +# Aggregate contexts: + +FAIL = TEMPLATE + ARGUMENT + WIKILINK + EXT_LINK_TITLE + HEADING + TAG + STYLE +UNSAFE = (TEMPLATE_NAME + WIKILINK + EXT_LINK_TITLE + TEMPLATE_PARAM_KEY + + ARGUMENT_NAME + TAG_CLOSE) +DOUBLE = TEMPLATE_PARAM_KEY + TAG_CLOSE +INVALID_LINK = TEMPLATE_NAME + ARGUMENT_NAME + WIKILINK + EXT_LINK diff --git a/mwparserfromhell/parser/tokenizer.c b/mwparserfromhell/parser/tokenizer.c index df65d0e..c9527ab 100644 --- a/mwparserfromhell/parser/tokenizer.c +++ b/mwparserfromhell/parser/tokenizer.c @@ -24,28 +24,71 @@ SOFTWARE. #include "tokenizer.h" /* + Determine whether the given Py_UNICODE is a marker. +*/ +static int is_marker(Py_UNICODE this) +{ + int i; + + for (i = 0; i < NUM_MARKERS; i++) { + if (*MARKERS[i] == this) + return 1; + } + return 0; +} + +/* Given a context, return the heading level encoded within it. */ static int heading_level_from_context(int n) { int level; + n /= LC_HEADING_LEVEL_1; for (level = 1; n > 1; n >>= 1) level++; return level; } -static PyObject* -Tokenizer_new(PyTypeObject* type, PyObject* args, PyObject* kwds) +/* + Call the given function in definitions.py, using 'in1', 'in2', and 'in3' as + parameters, and return its output as a bool. +*/ +static int call_def_func(const char* funcname, PyObject* in1, PyObject* in2, + PyObject* in3) { - Tokenizer* self = (Tokenizer*) type->tp_alloc(type, 0); - return (PyObject*) self; + PyObject* func = PyObject_GetAttrString(definitions, funcname); + PyObject* result = PyObject_CallFunctionObjArgs(func, in1, in2, in3, NULL); + int ans = (result == Py_True) ? 1 : 0; + + Py_DECREF(func); + Py_DECREF(result); + return ans; +} + +/* + Sanitize the name of a tag so it can be compared with others for equality. 
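+    (In practice, this lowercases the tag's text and strips trailing
+    whitespace, matching the lowercase names listed in definitions.py.)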
+*/ +static PyObject* strip_tag_name(PyObject* token) +{ + PyObject *text, *rstripped, *lowered; + + text = PyObject_GetAttrString(token, "text"); + if (!text) + return NULL; + rstripped = PyObject_CallMethod(text, "rstrip", NULL); + Py_DECREF(text); + if (!rstripped) + return NULL; + lowered = PyObject_CallMethod(rstripped, "lower", NULL); + Py_DECREF(rstripped); + return lowered; } -static struct Textbuffer* -Textbuffer_new(void) +static Textbuffer* Textbuffer_new(void) { - struct Textbuffer* buffer = malloc(sizeof(struct Textbuffer)); + Textbuffer* buffer = malloc(sizeof(Textbuffer)); + if (!buffer) { PyErr_NoMemory(); return NULL; @@ -57,60 +100,151 @@ Textbuffer_new(void) PyErr_NoMemory(); return NULL; } - buffer->next = NULL; + buffer->prev = buffer->next = NULL; return buffer; } -static void -Tokenizer_dealloc(Tokenizer* self) +static void Textbuffer_dealloc(Textbuffer* self) { - struct Stack *this = self->topstack, *next; - Py_XDECREF(self->text); + Textbuffer* next; - while (this) { - Py_DECREF(this->stack); - Textbuffer_dealloc(this->textbuffer); - next = this->next; - free(this); - this = next; + while (self) { + free(self->data); + next = self->next; + free(self); + self = next; + } +} + +/* + Write a Unicode codepoint to the given textbuffer. +*/ +static int Textbuffer_write(Textbuffer** this, Py_UNICODE code) +{ + Textbuffer* self = *this; + + if (self->size == TEXTBUFFER_BLOCKSIZE) { + Textbuffer* new = Textbuffer_new(); + if (!new) + return -1; + new->next = self; + self->prev = new; + *this = self = new; + } + self->data[self->size++] = code; + return 0; +} + +/* + Return the contents of the textbuffer as a Python Unicode object. +*/ +static PyObject* Textbuffer_render(Textbuffer* self) +{ + PyObject *result = PyUnicode_FromUnicode(self->data, self->size); + PyObject *left, *concat; + + while (self->next) { + self = self->next; + left = PyUnicode_FromUnicode(self->data, self->size); + concat = PyUnicode_Concat(left, result); + Py_DECREF(left); + Py_DECREF(result); + result = concat; + } + return result; +} + +static TagData* TagData_new(void) +{ + TagData *self = malloc(sizeof(TagData)); + + #define ALLOC_BUFFER(name) \ + name = Textbuffer_new(); \ + if (!name) { \ + TagData_dealloc(self); \ + return NULL; \ + } + + if (!self) { + PyErr_NoMemory(); + return NULL; } - self->ob_type->tp_free((PyObject*) self); + self->context = TAG_NAME; + ALLOC_BUFFER(self->pad_first) + ALLOC_BUFFER(self->pad_before_eq) + ALLOC_BUFFER(self->pad_after_eq) + self->reset = 0; + return self; +} + +static void TagData_dealloc(TagData* self) +{ + #define DEALLOC_BUFFER(name) \ + if (name) \ + Textbuffer_dealloc(name); + + DEALLOC_BUFFER(self->pad_first); + DEALLOC_BUFFER(self->pad_before_eq); + DEALLOC_BUFFER(self->pad_after_eq); + free(self); +} + +static int TagData_reset_buffers(TagData* self) +{ + #define RESET_BUFFER(name) \ + Textbuffer_dealloc(name); \ + name = Textbuffer_new(); \ + if (!name) \ + return -1; + + RESET_BUFFER(self->pad_first) + RESET_BUFFER(self->pad_before_eq) + RESET_BUFFER(self->pad_after_eq) + return 0; +} + +static PyObject* +Tokenizer_new(PyTypeObject* type, PyObject* args, PyObject* kwds) +{ + Tokenizer* self = (Tokenizer*) type->tp_alloc(type, 0); + return (PyObject*) self; } -static void -Textbuffer_dealloc(struct Textbuffer* this) +static void Tokenizer_dealloc(Tokenizer* self) { - struct Textbuffer* next; + Stack *this = self->topstack, *next; + Py_XDECREF(self->text); + while (this) { - free(this->data); + Py_DECREF(this->stack); + 
Textbuffer_dealloc(this->textbuffer);
         next = this->next;
         free(this);
         this = next;
     }
+    Py_TYPE(self)->tp_free((PyObject*) self);
 }
 
-static int
-Tokenizer_init(Tokenizer* self, PyObject* args, PyObject* kwds)
+static int Tokenizer_init(Tokenizer* self, PyObject* args, PyObject* kwds)
 {
     static char* kwlist[] = {NULL};
+
     if (!PyArg_ParseTupleAndKeywords(args, kwds, "", kwlist))
         return -1;
     self->text = Py_None;
     Py_INCREF(Py_None);
     self->topstack = NULL;
-    self->head = 0;
-    self->length = 0;
-    self->global = 0;
+    self->head = self->length = self->global = self->depth = self->cycles = 0;
     return 0;
 }
 
 /*
     Add a new token stack, context, and textbuffer to the list.
 */
-static int
-Tokenizer_push(Tokenizer* self, int context)
+static int Tokenizer_push(Tokenizer* self, int context)
 {
-    struct Stack* top = malloc(sizeof(struct Stack));
+    Stack* top = malloc(sizeof(Stack));
+
     if (!top) {
         PyErr_NoMemory();
         return -1;
@@ -128,32 +262,13 @@ Tokenizer_push(Tokenizer* self, int context)
 }
 
 /*
-    Return the contents of the textbuffer as a Python Unicode object.
-*/
-static PyObject*
-Textbuffer_render(struct Textbuffer* self)
-{
-    PyObject *result = PyUnicode_FromUnicode(self->data, self->size);
-    PyObject *left, *concat;
-    while (self->next) {
-        self = self->next;
-        left = PyUnicode_FromUnicode(self->data, self->size);
-        concat = PyUnicode_Concat(left, result);
-        Py_DECREF(left);
-        Py_DECREF(result);
-        result = concat;
-    }
-    return result;
-}
-
-/*
     Push the textbuffer onto the stack as a Text node and clear it.
 */
-static int
-Tokenizer_push_textbuffer(Tokenizer* self)
+static int Tokenizer_push_textbuffer(Tokenizer* self)
 {
     PyObject *text, *kwargs, *token;
-    struct Textbuffer* buffer = self->topstack->textbuffer;
+    Textbuffer* buffer = self->topstack->textbuffer;
+
     if (buffer->size == 0 && !buffer->next)
         return 0;
     text = Textbuffer_render(buffer);
@@ -185,10 +300,10 @@ Tokenizer_push_textbuffer(Tokenizer* self)
 /*
     Pop and deallocate the top token stack/context/textbuffer.
 */
-static void
-Tokenizer_delete_top_of_stack(Tokenizer* self)
+static void Tokenizer_delete_top_of_stack(Tokenizer* self)
 {
-    struct Stack* top = self->topstack;
+    Stack* top = self->topstack;
+
     Py_DECREF(top->stack);
     Textbuffer_dealloc(top->textbuffer);
     self->topstack = top->next;
@@ -199,10 +314,10 @@ Tokenizer_delete_top_of_stack(Tokenizer* self)
 /*
     Pop the current stack/context/textbuffer, returning the stack.
 */
-static PyObject*
-Tokenizer_pop(Tokenizer* self)
+static PyObject* Tokenizer_pop(Tokenizer* self)
 {
     PyObject* stack;
+
     if (Tokenizer_push_textbuffer(self))
         return NULL;
     stack = self->topstack->stack;
@@ -215,11 +330,11 @@ Tokenizer_pop(Tokenizer* self)
     Pop the current stack/context/textbuffer, returning the stack. We will also
     replace the underlying stack's context with the current stack's.
 */
-static PyObject*
-Tokenizer_pop_keeping_context(Tokenizer* self)
+static PyObject* Tokenizer_pop_keeping_context(Tokenizer* self)
 {
     PyObject* stack;
     int context;
+
     if (Tokenizer_push_textbuffer(self))
         return NULL;
     stack = self->topstack->stack;
@@ -234,70 +349,133 @@ Tokenizer_pop_keeping_context(Tokenizer* self)
     Fail the current tokenization route. Discards the current
     stack/context/textbuffer and raises a BadRoute exception.
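+    Callers typically save self->head before attempting a route; on BAD_ROUTE
+    they restore it and call RESET_ROUTE() before trying another
+    interpretation of the markup.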
*/ -static void* -Tokenizer_fail_route(Tokenizer* self) +static void* Tokenizer_fail_route(Tokenizer* self) { + int context = self->topstack->context; PyObject* stack = Tokenizer_pop(self); + Py_XDECREF(stack); - FAIL_ROUTE(); + FAIL_ROUTE(context); return NULL; } /* - Write a token to the end of the current token stack. + Write a token to the current token stack. */ -static int -Tokenizer_write(Tokenizer* self, PyObject* token) +static int Tokenizer_emit_token(Tokenizer* self, PyObject* token, int first) { + PyObject* instance; + if (Tokenizer_push_textbuffer(self)) return -1; - if (PyList_Append(self->topstack->stack, token)) + instance = PyObject_CallObject(token, NULL); + if (!instance) return -1; + if (first ? PyList_Insert(self->topstack->stack, 0, instance) : + PyList_Append(self->topstack->stack, instance)) { + Py_DECREF(instance); + return -1; + } + Py_DECREF(instance); return 0; } /* - Write a token to the beginning of the current token stack. + Write a token to the current token stack, with kwargs. Steals a reference + to kwargs. */ -static int -Tokenizer_write_first(Tokenizer* self, PyObject* token) +static int Tokenizer_emit_token_kwargs(Tokenizer* self, PyObject* token, + PyObject* kwargs, int first) { - if (Tokenizer_push_textbuffer(self)) + PyObject* instance; + + if (Tokenizer_push_textbuffer(self)) { + Py_DECREF(kwargs); + return -1; + } + instance = PyObject_Call(token, NOARGS, kwargs); + if (!instance) { + Py_DECREF(kwargs); return -1; - if (PyList_Insert(self->topstack->stack, 0, token)) + } + if (first ? PyList_Insert(self->topstack->stack, 0, instance): + PyList_Append(self->topstack->stack, instance)) { + Py_DECREF(instance); + Py_DECREF(kwargs); return -1; + } + Py_DECREF(instance); + Py_DECREF(kwargs); return 0; } /* - Write text to the current textbuffer. + Write a Unicode codepoint to the current textbuffer. */ -static int -Tokenizer_write_text(Tokenizer* self, Py_UNICODE text) +static int Tokenizer_emit_char(Tokenizer* self, Py_UNICODE code) { - struct Textbuffer* buf = self->topstack->textbuffer; - if (buf->size == TEXTBUFFER_BLOCKSIZE) { - struct Textbuffer* new = Textbuffer_new(); - if (!new) + return Textbuffer_write(&(self->topstack->textbuffer), code); +} + +/* + Write a string of text to the current textbuffer. +*/ +static int Tokenizer_emit_text(Tokenizer* self, const char* text) +{ + int i = 0; + + while (text[i]) { + if (Tokenizer_emit_char(self, text[i])) return -1; - new->next = buf; - self->topstack->textbuffer = new; - buf = new; + i++; } - buf->data[buf->size] = text; - buf->size++; return 0; } /* - Write a series of tokens to the current stack at once. + Write the contents of another textbuffer to the current textbuffer, + deallocating it in the process. */ static int -Tokenizer_write_all(Tokenizer* self, PyObject* tokenlist) +Tokenizer_emit_textbuffer(Tokenizer* self, Textbuffer* buffer, int reverse) +{ + Textbuffer *original = buffer; + int i; + + if (reverse) { + do { + for (i = buffer->size - 1; i >= 0; i--) { + if (Tokenizer_emit_char(self, buffer->data[i])) { + Textbuffer_dealloc(original); + return -1; + } + } + } while ((buffer = buffer->next)); + } + else { + while (buffer->next) + buffer = buffer->next; + do { + for (i = 0; i < buffer->size; i++) { + if (Tokenizer_emit_char(self, buffer->data[i])) { + Textbuffer_dealloc(original); + return -1; + } + } + } while ((buffer = buffer->prev)); + } + Textbuffer_dealloc(original); + return 0; +} + +/* + Write a series of tokens to the current stack at once. 
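+    If the list begins with a Text token, it is merged with any buffered
+    text; otherwise the textbuffer is flushed first to preserve ordering.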
+*/ +static int Tokenizer_emit_all(Tokenizer* self, PyObject* tokenlist) { int pushed = 0; PyObject *stack, *token, *left, *right, *text; - struct Textbuffer* buffer; + Textbuffer* buffer; Py_ssize_t size; if (PyList_GET_SIZE(tokenlist) > 0) { @@ -351,23 +529,17 @@ Tokenizer_write_all(Tokenizer* self, PyObject* tokenlist) Pop the current stack, write text, and then write the stack. 'text' is a NULL-terminated array of chars. */ -static int -Tokenizer_write_text_then_stack(Tokenizer* self, const char* text) +static int Tokenizer_emit_text_then_stack(Tokenizer* self, const char* text) { PyObject* stack = Tokenizer_pop(self); - int i = 0; - while (1) { - if (!text[i]) - break; - if (Tokenizer_write_text(self, (Py_UNICODE) text[i])) { - Py_XDECREF(stack); - return -1; - } - i++; + + if (Tokenizer_emit_text(self, text)) { + Py_DECREF(stack); + return -1; } if (stack) { if (PyList_GET_SIZE(stack) > 0) { - if (Tokenizer_write_all(self, stack)) { + if (Tokenizer_emit_all(self, stack)) { Py_DECREF(stack); return -1; } @@ -381,10 +553,10 @@ Tokenizer_write_text_then_stack(Tokenizer* self, const char* text) /* Read the value at a relative point in the wikicode, forwards. */ -static PyObject* -Tokenizer_read(Tokenizer* self, Py_ssize_t delta) +static PyObject* Tokenizer_read(Tokenizer* self, Py_ssize_t delta) { Py_ssize_t index = self->head + delta; + if (index >= self->length) return EMPTY; return PyList_GET_ITEM(self->text, index); @@ -393,10 +565,10 @@ Tokenizer_read(Tokenizer* self, Py_ssize_t delta) /* Read the value at a relative point in the wikicode, backwards. */ -static PyObject* -Tokenizer_read_backwards(Tokenizer* self, Py_ssize_t delta) +static PyObject* Tokenizer_read_backwards(Tokenizer* self, Py_ssize_t delta) { Py_ssize_t index; + if (delta > self->head) return EMPTY; index = self->head - delta; @@ -404,10 +576,67 @@ Tokenizer_read_backwards(Tokenizer* self, Py_ssize_t delta) } /* + Parse a template at the head of the wikicode string. +*/ +static int Tokenizer_parse_template(Tokenizer* self) +{ + PyObject *template; + Py_ssize_t reset = self->head; + + template = Tokenizer_parse(self, LC_TEMPLATE_NAME, 1); + if (BAD_ROUTE) { + self->head = reset; + return 0; + } + if (!template) + return -1; + if (Tokenizer_emit_first(self, TemplateOpen)) { + Py_DECREF(template); + return -1; + } + if (Tokenizer_emit_all(self, template)) { + Py_DECREF(template); + return -1; + } + Py_DECREF(template); + if (Tokenizer_emit(self, TemplateClose)) + return -1; + return 0; +} + +/* + Parse an argument at the head of the wikicode string. +*/ +static int Tokenizer_parse_argument(Tokenizer* self) +{ + PyObject *argument; + Py_ssize_t reset = self->head; + + argument = Tokenizer_parse(self, LC_ARGUMENT_NAME, 1); + if (BAD_ROUTE) { + self->head = reset; + return 0; + } + if (!argument) + return -1; + if (Tokenizer_emit_first(self, ArgumentOpen)) { + Py_DECREF(argument); + return -1; + } + if (Tokenizer_emit_all(self, argument)) { + Py_DECREF(argument); + return -1; + } + Py_DECREF(argument); + if (Tokenizer_emit(self, ArgumentClose)) + return -1; + return 0; +} + +/* Parse a template or argument at the head of the wikicode string. 
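/*
   Editorial aside -- not part of the patch. The Tokenizer_emit(),
   Tokenizer_emit_first(), and Tokenizer_emit_kwargs() calls appearing from
   here on are presumably thin wrappers over Tokenizer_emit_token() and
   Tokenizer_emit_token_kwargs() defined above; a sketch, assuming they are
   simple macros:
*/
#define Tokenizer_emit(self, token) \
    Tokenizer_emit_token(self, token, 0)
#define Tokenizer_emit_first(self, token) \
    Tokenizer_emit_token(self, token, 1)
#define Tokenizer_emit_kwargs(self, token, kwargs) \
    Tokenizer_emit_token_kwargs(self, token, kwargs, 0)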
*/ -static int -Tokenizer_parse_template_or_argument(Tokenizer* self) +static int Tokenizer_parse_template_or_argument(Tokenizer* self) { unsigned int braces = 2, i; PyObject *tokenlist; @@ -421,17 +650,16 @@ Tokenizer_parse_template_or_argument(Tokenizer* self) return -1; while (braces) { if (braces == 1) { - if (Tokenizer_write_text_then_stack(self, "{")) + if (Tokenizer_emit_text_then_stack(self, "{")) return -1; return 0; } if (braces == 2) { if (Tokenizer_parse_template(self)) return -1; - if (BAD_ROUTE) { RESET_ROUTE(); - if (Tokenizer_write_text_then_stack(self, "{{")) + if (Tokenizer_emit_text_then_stack(self, "{{")) return -1; return 0; } @@ -448,7 +676,7 @@ Tokenizer_parse_template_or_argument(Tokenizer* self) RESET_ROUTE(); for (i = 0; i < braces; i++) text[i] = *"{"; text[braces] = *""; - if (Tokenizer_write_text_then_stack(self, text)) { + if (Tokenizer_emit_text_then_stack(self, text)) { Py_XDECREF(text); return -1; } @@ -466,134 +694,42 @@ Tokenizer_parse_template_or_argument(Tokenizer* self) tokenlist = Tokenizer_pop(self); if (!tokenlist) return -1; - if (Tokenizer_write_all(self, tokenlist)) { + if (Tokenizer_emit_all(self, tokenlist)) { Py_DECREF(tokenlist); return -1; } Py_DECREF(tokenlist); + if (self->topstack->context & LC_FAIL_NEXT) + self->topstack->context ^= LC_FAIL_NEXT; return 0; } /* - Parse a template at the head of the wikicode string. + Handle a template parameter at the head of the string. */ -static int -Tokenizer_parse_template(Tokenizer* self) +static int Tokenizer_handle_template_param(Tokenizer* self) { - PyObject *template, *token; - Py_ssize_t reset = self->head; + PyObject *stack; - template = Tokenizer_parse(self, LC_TEMPLATE_NAME); - if (BAD_ROUTE) { - self->head = reset; - return 0; + if (self->topstack->context & LC_TEMPLATE_NAME) + self->topstack->context ^= LC_TEMPLATE_NAME; + else if (self->topstack->context & LC_TEMPLATE_PARAM_VALUE) + self->topstack->context ^= LC_TEMPLATE_PARAM_VALUE; + if (self->topstack->context & LC_TEMPLATE_PARAM_KEY) { + stack = Tokenizer_pop_keeping_context(self); + if (!stack) + return -1; + if (Tokenizer_emit_all(self, stack)) { + Py_DECREF(stack); + return -1; + } + Py_DECREF(stack); } - if (!template) - return -1; - token = PyObject_CallObject(TemplateOpen, NULL); - if (!token) { - Py_DECREF(template); + else + self->topstack->context |= LC_TEMPLATE_PARAM_KEY; + if (Tokenizer_emit(self, TemplateParamSeparator)) return -1; - } - if (Tokenizer_write_first(self, token)) { - Py_DECREF(token); - Py_DECREF(template); - return -1; - } - Py_DECREF(token); - if (Tokenizer_write_all(self, template)) { - Py_DECREF(template); - return -1; - } - Py_DECREF(template); - token = PyObject_CallObject(TemplateClose, NULL); - if (!token) - return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); - return -1; - } - Py_DECREF(token); - return 0; -} - -/* - Parse an argument at the head of the wikicode string. 
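/*
   Editorial trace -- not part of the patch. In
   Tokenizer_parse_template_or_argument(), `braces` counts the run of "{"
   at the head: "{{foo}}" gives braces == 2 and is tried as a template,
   "{{{foo}}}" gives 3 and is tried as an argument, and any braces that
   end up matching neither are replayed as literal "{" text through
   Tokenizer_emit_text_then_stack().
*/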
-*/ -static int -Tokenizer_parse_argument(Tokenizer* self) -{ - PyObject *argument, *token; - Py_ssize_t reset = self->head; - - argument = Tokenizer_parse(self, LC_ARGUMENT_NAME); - if (BAD_ROUTE) { - self->head = reset; - return 0; - } - if (!argument) - return -1; - token = PyObject_CallObject(ArgumentOpen, NULL); - if (!token) { - Py_DECREF(argument); - return -1; - } - if (Tokenizer_write_first(self, token)) { - Py_DECREF(token); - Py_DECREF(argument); - return -1; - } - Py_DECREF(token); - if (Tokenizer_write_all(self, argument)) { - Py_DECREF(argument); - return -1; - } - Py_DECREF(argument); - token = PyObject_CallObject(ArgumentClose, NULL); - if (!token) - return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); - return -1; - } - Py_DECREF(token); - return 0; -} - -/* - Handle a template parameter at the head of the string. -*/ -static int -Tokenizer_handle_template_param(Tokenizer* self) -{ - PyObject *stack, *token; - - if (self->topstack->context & LC_TEMPLATE_NAME) - self->topstack->context ^= LC_TEMPLATE_NAME; - else if (self->topstack->context & LC_TEMPLATE_PARAM_VALUE) - self->topstack->context ^= LC_TEMPLATE_PARAM_VALUE; - if (self->topstack->context & LC_TEMPLATE_PARAM_KEY) { - stack = Tokenizer_pop_keeping_context(self); - if (!stack) - return -1; - if (Tokenizer_write_all(self, stack)) { - Py_DECREF(stack); - return -1; - } - Py_DECREF(stack); - } - else - self->topstack->context |= LC_TEMPLATE_PARAM_KEY; - - token = PyObject_CallObject(TemplateParamSeparator, NULL); - if (!token) - return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); - return -1; - } - Py_DECREF(token); - if (Tokenizer_push(self, self->topstack->context)) + if (Tokenizer_push(self, self->topstack->context)) return -1; return 0; } @@ -601,37 +737,29 @@ Tokenizer_handle_template_param(Tokenizer* self) /* Handle a template parameter's value at the head of the string. */ -static int -Tokenizer_handle_template_param_value(Tokenizer* self) +static int Tokenizer_handle_template_param_value(Tokenizer* self) { - PyObject *stack, *token; + PyObject *stack; stack = Tokenizer_pop_keeping_context(self); if (!stack) return -1; - if (Tokenizer_write_all(self, stack)) { + if (Tokenizer_emit_all(self, stack)) { Py_DECREF(stack); return -1; } Py_DECREF(stack); self->topstack->context ^= LC_TEMPLATE_PARAM_KEY; self->topstack->context |= LC_TEMPLATE_PARAM_VALUE; - token = PyObject_CallObject(TemplateParamEquals, NULL); - if (!token) + if (Tokenizer_emit(self, TemplateParamEquals)) return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); - return -1; - } - Py_DECREF(token); return 0; } /* Handle the end of a template at the head of the string. */ -static PyObject* -Tokenizer_handle_template_end(Tokenizer* self) +static PyObject* Tokenizer_handle_template_end(Tokenizer* self) { PyObject* stack; @@ -639,7 +767,7 @@ Tokenizer_handle_template_end(Tokenizer* self) stack = Tokenizer_pop_keeping_context(self); if (!stack) return NULL; - if (Tokenizer_write_all(self, stack)) { + if (Tokenizer_emit_all(self, stack)) { Py_DECREF(stack); return NULL; } @@ -653,30 +781,22 @@ Tokenizer_handle_template_end(Tokenizer* self) /* Handle the separator between an argument's name and default. 
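/*
   Editorial trace -- not part of the patch. For "{{foo|bar=baz}}" the
   handlers above cooperate as follows: parsing starts in LC_TEMPLATE_NAME;
   at "|", Tokenizer_handle_template_param() clears that flag, sets
   LC_TEMPLATE_PARAM_KEY, emits a TemplateParamSeparator token, and pushes
   a fresh stack for the key; at "=", Tokenizer_handle_template_param_value()
   merges the key stack back, swaps LC_TEMPLATE_PARAM_KEY for
   LC_TEMPLATE_PARAM_VALUE, and emits TemplateParamEquals; at "}}",
   Tokenizer_handle_template_end() pops the finished template stack.
*/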
*/ -static int -Tokenizer_handle_argument_separator(Tokenizer* self) +static int Tokenizer_handle_argument_separator(Tokenizer* self) { - PyObject* token; self->topstack->context ^= LC_ARGUMENT_NAME; self->topstack->context |= LC_ARGUMENT_DEFAULT; - token = PyObject_CallObject(ArgumentSeparator, NULL); - if (!token) - return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); + if (Tokenizer_emit(self, ArgumentSeparator)) return -1; - } - Py_DECREF(token); return 0; } /* Handle the end of an argument at the head of the string. */ -static PyObject* -Tokenizer_handle_argument_end(Tokenizer* self) +static PyObject* Tokenizer_handle_argument_end(Tokenizer* self) { PyObject* stack = Tokenizer_pop(self); + self->head += 2; return stack; } @@ -684,79 +804,55 @@ Tokenizer_handle_argument_end(Tokenizer* self) /* Parse an internal wikilink at the head of the wikicode string. */ -static int -Tokenizer_parse_wikilink(Tokenizer* self) +static int Tokenizer_parse_wikilink(Tokenizer* self) { Py_ssize_t reset; - PyObject *wikilink, *token; - int i; + PyObject *wikilink; self->head += 2; reset = self->head - 1; - wikilink = Tokenizer_parse(self, LC_WIKILINK_TITLE); + wikilink = Tokenizer_parse(self, LC_WIKILINK_TITLE, 1); if (BAD_ROUTE) { RESET_ROUTE(); self->head = reset; - for (i = 0; i < 2; i++) { - if (Tokenizer_write_text(self, *"[")) - return -1; - } + if (Tokenizer_emit_text(self, "[[")) + return -1; return 0; } if (!wikilink) return -1; - token = PyObject_CallObject(WikilinkOpen, NULL); - if (!token) { - Py_DECREF(wikilink); - return -1; - } - if (Tokenizer_write(self, token)) { - Py_DECREF(token); + if (Tokenizer_emit(self, WikilinkOpen)) { Py_DECREF(wikilink); return -1; } - Py_DECREF(token); - if (Tokenizer_write_all(self, wikilink)) { + if (Tokenizer_emit_all(self, wikilink)) { Py_DECREF(wikilink); return -1; } Py_DECREF(wikilink); - token = PyObject_CallObject(WikilinkClose, NULL); - if (!token) + if (Tokenizer_emit(self, WikilinkClose)) return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); - return -1; - } - Py_DECREF(token); + if (self->topstack->context & LC_FAIL_NEXT) + self->topstack->context ^= LC_FAIL_NEXT; return 0; } /* Handle the separator between a wikilink's title and its text. */ -static int -Tokenizer_handle_wikilink_separator(Tokenizer* self) +static int Tokenizer_handle_wikilink_separator(Tokenizer* self) { - PyObject* token; self->topstack->context ^= LC_WIKILINK_TITLE; self->topstack->context |= LC_WIKILINK_TEXT; - token = PyObject_CallObject(WikilinkSeparator, NULL); - if (!token) - return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); + if (Tokenizer_emit(self, WikilinkSeparator)) return -1; - } - Py_DECREF(token); return 0; } /* Handle the end of a wikilink at the head of the string. */ -static PyObject* -Tokenizer_handle_wikilink_end(Tokenizer* self) +static PyObject* Tokenizer_handle_wikilink_end(Tokenizer* self) { PyObject* stack = Tokenizer_pop(self); self->head += 1; @@ -764,139 +860,468 @@ Tokenizer_handle_wikilink_end(Tokenizer* self) } /* - Parse a section heading at the head of the wikicode string. + Parse the URI scheme of a bracket-enclosed external link. 
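/*
   Editorial aside -- not part of the patch. Tokenizer_parse_wikilink()
   above shows the backtracking idiom used by every recursive parse in this
   file: save the head, attempt the route, and on BAD_ROUTE rewind and emit
   the consumed markup as plain text. A condensed sketch (hypothetical
   function, same calls as the real code):
*/
static int Tokenizer_parse_something_sketch(Tokenizer* self)
{
    Py_ssize_t reset = self->head;    /* remember where we started */
    PyObject* result = Tokenizer_parse(self, LC_WIKILINK_TITLE, 1);

    if (BAD_ROUTE) {                  /* route failed: not a fatal error */
        RESET_ROUTE();
        self->head = reset;           /* rewind the head... */
        return Tokenizer_emit_text(self, "[[");  /* ...keep markup as text */
    }
    if (!result)                      /* NULL without BAD_ROUTE: real error */
        return -1;
    /* on success, emit an open token, `result`, and a close token (elided) */
    Py_DECREF(result);
    return 0;
}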
*/ -static int -Tokenizer_parse_heading(Tokenizer* self) +static int Tokenizer_parse_bracketed_uri_scheme(Tokenizer* self) { - Py_ssize_t reset = self->head; - int best = 1, i, context, diff; - HeadingData *heading; - PyObject *level, *kwargs, *token; + static const char* valid = "abcdefghijklmnopqrstuvwxyz0123456789+.-"; + Textbuffer* buffer; + PyObject* scheme; + Py_UNICODE this; + int slashes, i; - self->global |= GL_HEADING; - self->head += 1; - while (Tokenizer_READ(self, 0) == *"=") { - best++; - self->head++; + if (Tokenizer_push(self, LC_EXT_LINK_URI)) + return -1; + if (Tokenizer_READ(self, 0) == *"/" && Tokenizer_READ(self, 1) == *"/") { + if (Tokenizer_emit_text(self, "//")) + return -1; + self->head += 2; } - context = LC_HEADING_LEVEL_1 << (best > 5 ? 5 : best - 1); - heading = (HeadingData*) Tokenizer_parse(self, context); - if (BAD_ROUTE) { - RESET_ROUTE(); - self->head = reset + best - 1; - for (i = 0; i < best; i++) { - if (Tokenizer_write_text(self, *"=")) + else { + buffer = Textbuffer_new(); + if (!buffer) + return -1; + while ((this = Tokenizer_READ(self, 0)) != *"") { + i = 0; + while (1) { + if (!valid[i]) + goto end_of_loop; + if (this == valid[i]) + break; + i++; + } + Textbuffer_write(&buffer, this); + if (Tokenizer_emit_char(self, this)) { + Textbuffer_dealloc(buffer); return -1; + } + self->head++; } - self->global ^= GL_HEADING; - return 0; + end_of_loop: + if (this != *":") { + Textbuffer_dealloc(buffer); + Tokenizer_fail_route(self); + return 0; + } + if (Tokenizer_emit_char(self, *":")) { + Textbuffer_dealloc(buffer); + return -1; + } + self->head++; + slashes = (Tokenizer_READ(self, 0) == *"/" && + Tokenizer_READ(self, 1) == *"/"); + if (slashes) { + if (Tokenizer_emit_text(self, "//")) { + Textbuffer_dealloc(buffer); + return -1; + } + self->head += 2; + } + scheme = Textbuffer_render(buffer); + Textbuffer_dealloc(buffer); + if (!scheme) + return -1; + if (!IS_SCHEME(scheme, slashes, 0)) { + Py_DECREF(scheme); + Tokenizer_fail_route(self); + return 0; + } + Py_DECREF(scheme); } + return 0; +} - level = PyInt_FromSsize_t(heading->level); - if (!level) { - Py_DECREF(heading->title); - free(heading); - return -1; - } - kwargs = PyDict_New(); - if (!kwargs) { - Py_DECREF(level); - Py_DECREF(heading->title); - free(heading); - return -1; - } - PyDict_SetItemString(kwargs, "level", level); - Py_DECREF(level); - token = PyObject_Call(HeadingStart, NOARGS, kwargs); - Py_DECREF(kwargs); - if (!token) { - Py_DECREF(heading->title); - free(heading); - return -1; +/* + Parse the URI scheme of a free (no brackets) external link. 
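/*
   Editorial aside -- not part of the patch. The inner loop above that
   steps through `valid` is an open-coded character-set test; an equivalent
   helper using strchr() (a hypothetical refactoring, assuming ASCII
   schemes):
*/
#include <string.h>

static int is_scheme_char(Py_UNICODE this)
{
    static const char* valid = "abcdefghijklmnopqrstuvwxyz0123456789+.-";
    /* reject NUL explicitly: strchr(valid, 0) matches the terminator */
    return this != 0 && this < 128 && strchr(valid, (char) this) != NULL;
}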
+*/ +static int Tokenizer_parse_free_uri_scheme(Tokenizer* self) +{ + static const char* valid = "abcdefghijklmnopqrstuvwxyz0123456789+.-"; + Textbuffer *scheme_buffer = Textbuffer_new(), *temp_buffer; + PyObject *scheme; + Py_UNICODE chunk; + int slashes, i, j; + + if (!scheme_buffer) + return -1; + // We have to backtrack through the textbuffer looking for our scheme since + // it was just parsed as text: + temp_buffer = self->topstack->textbuffer; + while (temp_buffer) { + for (i = temp_buffer->size - 1; i >= 0; i--) { + chunk = temp_buffer->data[i]; + if (Py_UNICODE_ISSPACE(chunk) || is_marker(chunk)) + goto end_of_loop; + j = 0; + while (1) { + if (!valid[j]) { + Textbuffer_dealloc(scheme_buffer); + FAIL_ROUTE(0); + return 0; + } + if (chunk == valid[j]) + break; + j++; + } + Textbuffer_write(&scheme_buffer, chunk); + } + temp_buffer = temp_buffer->next; } - if (Tokenizer_write(self, token)) { - Py_DECREF(token); - Py_DECREF(heading->title); - free(heading); + end_of_loop: + scheme = Textbuffer_render(scheme_buffer); + if (!scheme) { + Textbuffer_dealloc(scheme_buffer); return -1; } - Py_DECREF(token); - if (heading->level < best) { - diff = best - heading->level; - for (i = 0; i < diff; i++) { - if (Tokenizer_write_text(self, *"=")) { - Py_DECREF(heading->title); - free(heading); - return -1; - } - } + slashes = (Tokenizer_READ(self, 0) == *"/" && + Tokenizer_READ(self, 1) == *"/"); + if (!IS_SCHEME(scheme, slashes, 1)) { + Py_DECREF(scheme); + Textbuffer_dealloc(scheme_buffer); + FAIL_ROUTE(0); + return 0; } - if (Tokenizer_write_all(self, heading->title)) { - Py_DECREF(heading->title); - free(heading); + Py_DECREF(scheme); + if (Tokenizer_push(self, LC_EXT_LINK_URI)) { + Textbuffer_dealloc(scheme_buffer); return -1; } - Py_DECREF(heading->title); - free(heading); - token = PyObject_CallObject(HeadingEnd, NULL); - if (!token) + if (Tokenizer_emit_textbuffer(self, scheme_buffer, 1)) return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); + if (Tokenizer_emit_char(self, *":")) return -1; + if (slashes) { + if (Tokenizer_emit_text(self, "//")) + return -1; + self->head += 2; } - Py_DECREF(token); - self->global ^= GL_HEADING; return 0; } /* - Handle the end of a section heading at the head of the string. + Handle text in a free external link, including trailing punctuation. */ -static HeadingData* -Tokenizer_handle_heading_end(Tokenizer* self) +static int +Tokenizer_handle_free_link_text(Tokenizer* self, int* parens, + Textbuffer** tail, Py_UNICODE this) { - Py_ssize_t reset = self->head, best; - int i, current, level, diff; - HeadingData *after, *heading; - PyObject *stack; + #define PUSH_TAIL_BUFFER(tail, error) \ + if ((tail)->size || (tail)->next) { \ + if (Tokenizer_emit_textbuffer(self, tail, 0)) \ + return error; \ + tail = Textbuffer_new(); \ + if (!(tail)) \ + return error; \ + } - self->head += 1; - best = 1; - while (Tokenizer_READ(self, 0) == *"=") { - best++; - self->head++; + if (this == *"(" && !(*parens)) { + *parens = 1; + PUSH_TAIL_BUFFER(*tail, -1) } - current = heading_level_from_context(self->topstack->context); - level = current > best ? (best > 6 ? 6 : best) : - (current > 6 ? 6 : current); - after = (HeadingData*) Tokenizer_parse(self, self->topstack->context); - if (BAD_ROUTE) { - RESET_ROUTE(); - if (level < best) { - diff = best - level; - for (i = 0; i < diff; i++) { - if (Tokenizer_write_text(self, *"=")) - return NULL; - } + else if (this == *"," || this == *";" || this == *"\\" || this == *"." || + this == *":" || this == *"!" || this == *"?" 
|| + (!(*parens) && this == *")")) + return Textbuffer_write(tail, this); + else + PUSH_TAIL_BUFFER(*tail, -1) + return Tokenizer_emit_char(self, this); +} + +/* + Really parse an external link. +*/ +static PyObject* +Tokenizer_really_parse_external_link(Tokenizer* self, int brackets, + Textbuffer** extra) +{ + Py_UNICODE this, next; + int parens = 0; + + if (brackets ? Tokenizer_parse_bracketed_uri_scheme(self) : + Tokenizer_parse_free_uri_scheme(self)) + return NULL; + if (BAD_ROUTE) + return NULL; + this = Tokenizer_READ(self, 0); + if (this == *"" || this == *"\n" || this == *" " || this == *"]") + return Tokenizer_fail_route(self); + if (!brackets && this == *"[") + return Tokenizer_fail_route(self); + while (1) { + this = Tokenizer_READ(self, 0); + next = Tokenizer_READ(self, 1); + if (this == *"" || this == *"\n") { + if (brackets) + return Tokenizer_fail_route(self); + self->head--; + return Tokenizer_pop(self); } - self->head = reset + best - 1; - } - else { - for (i = 0; i < best; i++) { - if (Tokenizer_write_text(self, *"=")) { - Py_DECREF(after->title); - free(after); + if (this == *"{" && next == *"{" && Tokenizer_CAN_RECURSE(self)) { + PUSH_TAIL_BUFFER(*extra, NULL) + if (Tokenizer_parse_template_or_argument(self)) return NULL; + } + else if (this == *"[") { + if (!brackets) { + self->head--; + return Tokenizer_pop(self); } + if (Tokenizer_emit_char(self, *"[")) + return NULL; } - if (Tokenizer_write_all(self, after->title)) { - Py_DECREF(after->title); - free(after); - return NULL; + else if (this == *"]") { + if (!brackets) + self->head--; + return Tokenizer_pop(self); } - Py_DECREF(after->title); + else if (this == *"&") { + PUSH_TAIL_BUFFER(*extra, NULL) + if (Tokenizer_parse_entity(self)) + return NULL; + } + else if (this == *" ") { + if (brackets) { + if (Tokenizer_emit(self, ExternalLinkSeparator)) + return NULL; + self->topstack->context ^= LC_EXT_LINK_URI; + self->topstack->context |= LC_EXT_LINK_TITLE; + self->head++; + return Tokenizer_parse(self, 0, 0); + } + if (Textbuffer_write(extra, *" ")) + return NULL; + return Tokenizer_pop(self); + } + else if (!brackets) { + if (Tokenizer_handle_free_link_text(self, &parens, extra, this)) + return NULL; + } + else { + if (Tokenizer_emit_char(self, this)) + return NULL; + } + self->head++; + } +} + +/* + Remove the URI scheme of a new external link from the textbuffer. +*/ +static int +Tokenizer_remove_uri_scheme_from_textbuffer(Tokenizer* self, PyObject* link) +{ + PyObject *text = PyObject_GetAttrString(PyList_GET_ITEM(link, 0), "text"), + *split, *scheme; + Py_ssize_t length; + Textbuffer* temp; + + if (!text) + return -1; + split = PyObject_CallMethod(text, "split", "si", ":", 1); + Py_DECREF(text); + if (!split) + return -1; + scheme = PyList_GET_ITEM(split, 0); + length = PyUnicode_GET_SIZE(scheme); + while (length) { + temp = self->topstack->textbuffer; + if (length <= temp->size) { + temp->size -= length; + break; + } + length -= temp->size; + self->topstack->textbuffer = temp->next; + free(temp->data); + free(temp); + } + Py_DECREF(split); + return 0; +} + +/* + Parse an external link at the head of the wikicode string. 
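/*
   Editorial trace -- not part of the patch. For input "x http://example.com",
   the characters "http" are consumed as ordinary text before any link is
   suspected; Tokenizer_parse_free_uri_scheme() therefore scans backwards
   through the pending textbuffer to recover the scheme, and once the link
   parses, Tokenizer_remove_uri_scheme_from_textbuffer() trims those four
   characters from the buffer so they are not emitted twice.
*/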
+*/ +static int Tokenizer_parse_external_link(Tokenizer* self, int brackets) +{ + #define INVALID_CONTEXT self->topstack->context & AGG_INVALID_LINK + #define NOT_A_LINK \ + if (!brackets && self->topstack->context & LC_DLTERM) \ + return Tokenizer_handle_dl_term(self); \ + return Tokenizer_emit_char(self, Tokenizer_READ(self, 0)) + + Py_ssize_t reset = self->head; + PyObject *link, *kwargs; + Textbuffer *extra = 0; + + if (INVALID_CONTEXT || !(Tokenizer_CAN_RECURSE(self))) { + NOT_A_LINK; + } + extra = Textbuffer_new(); + if (!extra) + return -1; + self->head++; + link = Tokenizer_really_parse_external_link(self, brackets, &extra); + if (BAD_ROUTE) { + RESET_ROUTE(); + self->head = reset; + Textbuffer_dealloc(extra); + NOT_A_LINK; + } + if (!link) { + Textbuffer_dealloc(extra); + return -1; + } + if (!brackets) { + if (Tokenizer_remove_uri_scheme_from_textbuffer(self, link)) { + Textbuffer_dealloc(extra); + Py_DECREF(link); + return -1; + } + } + kwargs = PyDict_New(); + if (!kwargs) { + Textbuffer_dealloc(extra); + Py_DECREF(link); + return -1; + } + PyDict_SetItemString(kwargs, "brackets", brackets ? Py_True : Py_False); + if (Tokenizer_emit_kwargs(self, ExternalLinkOpen, kwargs)) { + Textbuffer_dealloc(extra); + Py_DECREF(link); + return -1; + } + if (Tokenizer_emit_all(self, link)) { + Textbuffer_dealloc(extra); + Py_DECREF(link); + return -1; + } + Py_DECREF(link); + if (Tokenizer_emit(self, ExternalLinkClose)) { + Textbuffer_dealloc(extra); + return -1; + } + if (extra->size || extra->next) + return Tokenizer_emit_textbuffer(self, extra, 0); + Textbuffer_dealloc(extra); + return 0; +} + +/* + Parse a section heading at the head of the wikicode string. +*/ +static int Tokenizer_parse_heading(Tokenizer* self) +{ + Py_ssize_t reset = self->head; + int best = 1, i, context, diff; + HeadingData *heading; + PyObject *level, *kwargs; + + self->global |= GL_HEADING; + self->head += 1; + while (Tokenizer_READ(self, 0) == *"=") { + best++; + self->head++; + } + context = LC_HEADING_LEVEL_1 << (best > 5 ? 5 : best - 1); + heading = (HeadingData*) Tokenizer_parse(self, context, 1); + if (BAD_ROUTE) { + RESET_ROUTE(); + self->head = reset + best - 1; + for (i = 0; i < best; i++) { + if (Tokenizer_emit_char(self, *"=")) + return -1; + } + self->global ^= GL_HEADING; + return 0; + } + level = NEW_INT_FUNC(heading->level); + if (!level) { + Py_DECREF(heading->title); + free(heading); + return -1; + } + kwargs = PyDict_New(); + if (!kwargs) { + Py_DECREF(level); + Py_DECREF(heading->title); + free(heading); + return -1; + } + PyDict_SetItemString(kwargs, "level", level); + Py_DECREF(level); + if (Tokenizer_emit_kwargs(self, HeadingStart, kwargs)) { + Py_DECREF(heading->title); + free(heading); + return -1; + } + if (heading->level < best) { + diff = best - heading->level; + for (i = 0; i < diff; i++) { + if (Tokenizer_emit_char(self, *"=")) { + Py_DECREF(heading->title); + free(heading); + return -1; + } + } + } + if (Tokenizer_emit_all(self, heading->title)) { + Py_DECREF(heading->title); + free(heading); + return -1; + } + Py_DECREF(heading->title); + free(heading); + if (Tokenizer_emit(self, HeadingEnd)) + return -1; + self->global ^= GL_HEADING; + return 0; +} + +/* + Handle the end of a section heading at the head of the string. 
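/*
   Editorial aside -- not part of the patch. The level arithmetic in
   Tokenizer_parse_heading() above and Tokenizer_handle_heading_end() below
   reduces to "take the smaller of the opening and closing runs of '=',
   capped at six"; surplus '=' marks are re-emitted as literal text. An
   equivalent helper (hypothetical):
*/
static int clamp_heading_level(int current, int best)
{
    int level = (current > best) ? best : current;   /* smaller run wins */
    return (level > 6) ? 6 : level;                  /* MediaWiki max: h6 */
}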
+*/ +static HeadingData* Tokenizer_handle_heading_end(Tokenizer* self) +{ + Py_ssize_t reset = self->head, best; + int i, current, level, diff; + HeadingData *after, *heading; + PyObject *stack; + + self->head += 1; + best = 1; + while (Tokenizer_READ(self, 0) == *"=") { + best++; + self->head++; + } + current = heading_level_from_context(self->topstack->context); + level = current > best ? (best > 6 ? 6 : best) : + (current > 6 ? 6 : current); + after = (HeadingData*) Tokenizer_parse(self, self->topstack->context, 1); + if (BAD_ROUTE) { + RESET_ROUTE(); + if (level < best) { + diff = best - level; + for (i = 0; i < diff; i++) { + if (Tokenizer_emit_char(self, *"=")) + return NULL; + } + } + self->head = reset + best - 1; + } + else { + for (i = 0; i < best; i++) { + if (Tokenizer_emit_char(self, *"=")) { + Py_DECREF(after->title); + free(after); + return NULL; + } + } + if (Tokenizer_emit_all(self, after->title)) { + Py_DECREF(after->title); + free(after); + return NULL; + } + Py_DECREF(after->title); level = after->level; free(after); } @@ -916,10 +1341,9 @@ Tokenizer_handle_heading_end(Tokenizer* self) /* Actually parse an HTML entity and ensure that it is valid. */ -static int -Tokenizer_really_parse_entity(Tokenizer* self) +static int Tokenizer_really_parse_entity(Tokenizer* self) { - PyObject *token, *kwargs, *textobj; + PyObject *kwargs, *textobj; Py_UNICODE this; int numeric, hexadecimal, i, j, zeroes, test; char *valid, *text, *buffer, *def; @@ -930,14 +1354,8 @@ Tokenizer_really_parse_entity(Tokenizer* self) return 0; \ } - token = PyObject_CallObject(HTMLEntityStart, NULL); - if (!token) - return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); + if (Tokenizer_emit(self, HTMLEntityStart)) return -1; - } - Py_DECREF(token); self->head++; this = Tokenizer_READ(self, 0); if (this == *"") { @@ -946,14 +1364,8 @@ Tokenizer_really_parse_entity(Tokenizer* self) } if (this == *"#") { numeric = 1; - token = PyObject_CallObject(HTMLEntityNumeric, NULL); - if (!token) + if (Tokenizer_emit(self, HTMLEntityNumeric)) return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); - return -1; - } - Py_DECREF(token); self->head++; this = Tokenizer_READ(self, 0); if (this == *"") { @@ -966,15 +1378,8 @@ Tokenizer_really_parse_entity(Tokenizer* self) if (!kwargs) return -1; PyDict_SetItemString(kwargs, "char", Tokenizer_read(self, 0)); - token = PyObject_Call(HTMLEntityHex, NOARGS, kwargs); - Py_DECREF(kwargs); - if (!token) - return -1; - if (Tokenizer_write(self, token)) { - Py_DECREF(token); + if (Tokenizer_emit_kwargs(self, HTMLEntityHex, kwargs)) return -1; - } - Py_DECREF(token); self->head++; } else @@ -1007,7 +1412,7 @@ Tokenizer_really_parse_entity(Tokenizer* self) self->head++; continue; } - if (i >= 8) + if (i >= MAX_ENTITY_SIZE) FAIL_ROUTE_AND_EXIT() for (j = 0; j < NUM_MARKERS; j++) { if (this == *MARKERS[j]) @@ -1021,178 +1426,1020 @@ Tokenizer_really_parse_entity(Tokenizer* self) break; j++; } - text[i] = this; + text[i] = (char) this; + self->head++; + i++; + } + if (numeric) { + sscanf(text, (hexadecimal ? 
"%x" : "%d"), &test); + if (test < 1 || test > 0x10FFFF) + FAIL_ROUTE_AND_EXIT() + } + else { + i = 0; + while (1) { + def = entitydefs[i]; + if (!def) // We've reached the end of the defs without finding it + FAIL_ROUTE_AND_EXIT() + if (strcmp(text, def) == 0) + break; + i++; + } + } + if (zeroes) { + buffer = calloc(strlen(text) + zeroes + 1, sizeof(char)); + if (!buffer) { + free(text); + PyErr_NoMemory(); + return -1; + } + for (i = 0; i < zeroes; i++) + strcat(buffer, "0"); + strcat(buffer, text); + free(text); + text = buffer; + } + textobj = PyUnicode_FromString(text); + if (!textobj) { + free(text); + return -1; + } + free(text); + kwargs = PyDict_New(); + if (!kwargs) { + Py_DECREF(textobj); + return -1; + } + PyDict_SetItemString(kwargs, "text", textobj); + Py_DECREF(textobj); + if (Tokenizer_emit_kwargs(self, Text, kwargs)) + return -1; + if (Tokenizer_emit(self, HTMLEntityEnd)) + return -1; + return 0; +} + +/* + Parse an HTML entity at the head of the wikicode string. +*/ +static int Tokenizer_parse_entity(Tokenizer* self) +{ + Py_ssize_t reset = self->head; + PyObject *tokenlist; + + if (Tokenizer_push(self, 0)) + return -1; + if (Tokenizer_really_parse_entity(self)) + return -1; + if (BAD_ROUTE) { + RESET_ROUTE(); + self->head = reset; + if (Tokenizer_emit_char(self, *"&")) + return -1; + return 0; + } + tokenlist = Tokenizer_pop(self); + if (!tokenlist) + return -1; + if (Tokenizer_emit_all(self, tokenlist)) { + Py_DECREF(tokenlist); + return -1; + } + Py_DECREF(tokenlist); + return 0; +} + +/* + Parse an HTML comment at the head of the wikicode string. +*/ +static int Tokenizer_parse_comment(Tokenizer* self) +{ + Py_ssize_t reset = self->head + 3; + PyObject *comment; + Py_UNICODE this; + + self->head += 4; + if (Tokenizer_push(self, 0)) + return -1; + while (1) { + this = Tokenizer_READ(self, 0); + if (this == *"") { + comment = Tokenizer_pop(self); + Py_XDECREF(comment); + self->head = reset; + return Tokenizer_emit_text(self, "") + self.assertTrue(code1.matches("Cleanup")) + self.assertTrue(code1.matches("cleanup")) + self.assertTrue(code1.matches(" cleanup\n")) + self.assertFalse(code1.matches("CLEANup")) + self.assertFalse(code1.matches("Blah")) + self.assertTrue(code2.matches("stub")) + self.assertTrue(code2.matches("Stub")) + self.assertFalse(code2.matches("StuB")) def test_filter_family(self): """test the Wikicode.i?filter() family of functions""" @@ -219,11 +260,11 @@ class TestWikicode(TreeEqualityTestCase): code = parse("a{{b}}c[[d]]{{{e}}}{{f}}[[g]]") for func in (code.filter, ifilter(code)): - self.assertEqual(["a", "{{b}}", "c", "[[d]]", "{{{e}}}", "{{f}}", - "[[g]]"], func()) + self.assertEqual(["a", "{{b}}", "b", "c", "[[d]]", "d", "{{{e}}}", + "e", "{{f}}", "f", "[[g]]", "g"], func()) self.assertEqual(["{{{e}}}"], func(forcetype=Argument)) self.assertIs(code.get(4), func(forcetype=Argument)[0]) - self.assertEqual(["a", "c"], func(forcetype=Text)) + self.assertEqual(list("abcdefg"), func(forcetype=Text)) self.assertEqual([], func(forcetype=Heading)) self.assertRaises(TypeError, func, forcetype=True) @@ -235,11 +276,12 @@ class TestWikicode(TreeEqualityTestCase): self.assertEqual(["{{{e}}}"], get_filter("arguments")) self.assertIs(code.get(4), get_filter("arguments")[0]) self.assertEqual([], get_filter("comments")) + self.assertEqual([], get_filter("external_links")) self.assertEqual([], get_filter("headings")) self.assertEqual([], get_filter("html_entities")) self.assertEqual([], get_filter("tags")) self.assertEqual(["{{b}}", "{{f}}"], 
get_filter("templates")) - self.assertEqual(["a", "c"], get_filter("text")) + self.assertEqual(list("abcdefg"), get_filter("text")) self.assertEqual(["[[d]]", "[[g]]"], get_filter("wikilinks")) code2 = parse("{{a|{{b}}|{{c|d={{f}}{{h}}}}}}") @@ -252,13 +294,13 @@ class TestWikicode(TreeEqualityTestCase): code3 = parse("{{foobar}}{{FOO}}{{baz}}{{bz}}") for func in (code3.filter, ifilter(code3)): - self.assertEqual(["{{foobar}}", "{{FOO}}"], func(matches=r"foo")) + self.assertEqual(["{{foobar}}", "{{FOO}}"], func(recursive=False, matches=r"foo")) self.assertEqual(["{{foobar}}", "{{FOO}}"], - func(matches=r"^{{foo.*?}}")) + func(recursive=False, matches=r"^{{foo.*?}}")) self.assertEqual(["{{foobar}}"], - func(matches=r"^{{foo.*?}}", flags=re.UNICODE)) - self.assertEqual(["{{baz}}", "{{bz}}"], func(matches=r"^{{b.*?z")) - self.assertEqual(["{{baz}}"], func(matches=r"^{{b.+?z}}")) + func(recursive=False, matches=r"^{{foo.*?}}", flags=re.UNICODE)) + self.assertEqual(["{{baz}}", "{{bz}}"], func(recursive=False, matches=r"^{{b.*?z")) + self.assertEqual(["{{baz}}"], func(recursive=False, matches=r"^{{b.+?z}}")) self.assertEqual(["{{a|{{b}}|{{c|d={{f}}{{h}}}}}}"], code2.filter_templates(recursive=False)) diff --git a/tests/tokenizer/external_links.mwtest b/tests/tokenizer/external_links.mwtest new file mode 100644 index 0000000..af7a570 --- /dev/null +++ b/tests/tokenizer/external_links.mwtest @@ -0,0 +1,473 @@ +name: basic +label: basic external link +input: "http://example.com/" +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose()] + +--- + +name: basic_brackets +label: basic external link in brackets +input: "[http://example.com/]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkClose()] + +--- + +name: brackets_space +label: basic external link in brackets, with a space after +input: "[http://example.com/ ]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), ExternalLinkClose()] + +--- + +name: brackets_title +label: basic external link in brackets, with a title +input: "[http://example.com/ Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: brackets_multiword_title +label: basic external link in brackets, with a multi-word title +input: "[http://example.com/ Example Web Page]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkSeparator(), Text(text="Example Web Page"), ExternalLinkClose()] + +--- + +name: brackets_adjacent +label: three adjacent bracket-enclosed external links +input: "[http://foo.com/ Foo][http://bar.com/ Bar]\n[http://baz.com/ Baz]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://foo.com/"), ExternalLinkSeparator(), Text(text="Foo"), ExternalLinkClose(), ExternalLinkOpen(brackets=True), Text(text="http://bar.com/"), ExternalLinkSeparator(), Text(text="Bar"), ExternalLinkClose(), Text(text="\n"), ExternalLinkOpen(brackets=True), Text(text="http://baz.com/"), ExternalLinkSeparator(), Text(text="Baz"), ExternalLinkClose()] + +--- + +name: brackets_newline_before +label: bracket-enclosed link with a newline before the title +input: "[http://example.com/ \nExample]" +output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" \nExample]")] + +--- + +name: brackets_newline_inside +label: bracket-enclosed link with 
a newline in the title +input: "[http://example.com/ Example \nWeb Page]" +output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" Example \nWeb Page]")] + +--- + +name: brackets_newline_after +label: bracket-enclosed link with a newline after the title +input: "[http://example.com/ Example\n]" +output: [Text(text="["), ExternalLinkOpen(brackets=False), Text(text="http://example.com/"), ExternalLinkClose(), Text(text=" Example\n]")] + +--- + +name: brackets_space_before +label: bracket-enclosed link with a space before the URL +input: "[ http://example.com Example]" +output: [Text(text="[ "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" Example]")] + +--- + +name: brackets_title_like_url +label: bracket-enclosed link with a title that looks like a URL +input: "[http://example.com http://example.com]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com"), ExternalLinkSeparator(), Text(text="http://example.com"), ExternalLinkClose()] + +--- + +name: brackets_recursive +label: bracket-enclosed link with a bracket-enclosed link as the title +input: "[http://example.com [http://example.com]]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com"), ExternalLinkSeparator(), Text(text="[http://example.com"), ExternalLinkClose(), Text(text="]")] + +--- + +name: period_after +label: a period after a free link that is excluded +input: "http://example.com." +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=".")] + +--- + +name: colons_after +label: colons after a free link that are excluded +input: "http://example.com/foo:bar.:;baz!?," +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com/foo:bar.:;baz"), ExternalLinkClose(), Text(text="!?,")] + +--- + +name: close_paren_after_excluded +label: a closing parenthesis after a free link that is excluded +input: "http://example.)com)" +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.)com"), ExternalLinkClose(), Text(text=")")] + +--- + +name: close_paren_after_included +label: a closing parenthesis after a free link that is included because of an opening parenthesis in the URL +input: "http://example.(com)" +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.(com)"), ExternalLinkClose()] + +--- + +name: open_bracket_inside +label: an open bracket inside a free link that causes it to be ended abruptly +input: "http://foobar[baz.com" +output: [ExternalLinkOpen(brackets=False), Text(text="http://foobar"), ExternalLinkClose(), Text(text="[baz.com")] + +--- + +name: brackets_period_after +label: a period after a bracket-enclosed link that is included +input: "[http://example.com. 
Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com."), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: brackets_colons_after +label: colons after a bracket-enclosed link that are included +input: "[http://example.com/foo:bar.:;baz!?, Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.com/foo:bar.:;baz!?,"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: brackets_close_paren_after_included +label: a closing parenthesis after a bracket-enclosed link that is included +input: "[http://example.)com) Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.)com)"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: brackets_close_paren_after_included_2 +label: a closing parenthesis after a bracket-enclosed link that is also included +input: "[http://example.(com) Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://example.(com)"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: brackets_open_bracket_inside +label: an open bracket inside a bracket-enclosed link that is also included +input: "[http://foobar[baz.com Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="http://foobar[baz.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: adjacent_space +label: two free links separated by a space +input: "http://example.com http://example.com" +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()] + +--- + +name: adjacent_newline +label: two free links separated by a newline +input: "http://example.com\nhttp://example.com" +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="\n"), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()] + +--- + +name: adjacent_close_bracket +label: two free links separated by a close bracket +input: "http://example.com]http://example.com" +output: [ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text="]"), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()] + +--- + +name: html_entity_in_url +label: a HTML entity parsed correctly inside a free link +input: "http://exa mple.com/" +output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="mple.com/"), ExternalLinkClose()] + +--- + +name: template_in_url +label: a template parsed correctly inside a free link +input: "http://exa{{template}}mple.com/" +output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), TemplateOpen(), Text(text="template"), TemplateClose(), Text(text="mple.com/"), ExternalLinkClose()] + +--- + +name: argument_in_url +label: an argument parsed correctly inside a free link +input: "http://exa{{{argument}}}mple.com/" +output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ArgumentOpen(), Text(text="argument"), ArgumentClose(), Text(text="mple.com/"), ExternalLinkClose()] + +--- + +name: wikilink_in_url +label: a wikilink that destroys a free link +input: "http://exa[[wikilink]]mple.com/" +output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ExternalLinkClose(), WikilinkOpen(), 
Text(text="wikilink"), WikilinkClose(), Text(text="mple.com/")] + +--- + +name: external_link_in_url +label: a bracketed link that destroys a free link +input: "http://exa[http://example.com/]mple.com/" +output: [ExternalLinkOpen(brackets=False), Text(text="http://exa"), ExternalLinkClose(), ExternalLinkOpen(brackets=True), Text(text="http://example.com/"), ExternalLinkClose(), Text(text="mple.com/")] + +--- + +name: spaces_padding +label: spaces padding a free link +input: " http://example.com " +output: [Text(text=" "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" ")] + +--- + +name: text_and_spaces_padding +label: text and spaces padding a free link +input: "x http://example.com x" +output: [Text(text="x "), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose(), Text(text=" x")] + +--- + +name: template_before +label: a template before a free link +input: "{{foo}}http://example.com" +output: [TemplateOpen(), Text(text="foo"), TemplateClose(), ExternalLinkOpen(brackets=False), Text(text="http://example.com"), ExternalLinkClose()] + +--- + +name: spaces_padding_no_slashes +label: spaces padding a free link with no slashes after the colon +input: " mailto:example@example.com " +output: [Text(text=" "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text=" ")] + +--- + +name: text_and_spaces_padding_no_slashes +label: text and spaces padding a free link with no slashes after the colon +input: "x mailto:example@example.com x" +output: [Text(text="x "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text=" x")] + +--- + +name: template_before_no_slashes +label: a template before a free link with no slashes after the colon +input: "{{foo}}mailto:example@example.com" +output: [TemplateOpen(), Text(text="foo"), TemplateClose(), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose()] + +--- + +name: no_slashes +label: a free link with no slashes after the colon +input: "mailto:example@example.com" +output: [ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose()] + +--- + +name: slashes_optional +label: a free link using a scheme that doesn't need slashes, but has them anyway +input: "mailto://example@example.com" +output: [ExternalLinkOpen(brackets=False), Text(text="mailto://example@example.com"), ExternalLinkClose()] + +--- + +name: short +label: a very short free link +input: "mailto://abc" +output: [ExternalLinkOpen(brackets=False), Text(text="mailto://abc"), ExternalLinkClose()] + +--- + +name: slashes_missing +label: slashes missing from a free link with a scheme that requires them +input: "http:example@example.com" +output: [Text(text="http:example@example.com")] + +--- + +name: no_scheme_but_slashes +label: no scheme in a free link, but slashes (protocol-relative free links are not supported) +input: "//example.com" +output: [Text(text="//example.com")] + +--- + +name: no_scheme_but_colon +label: no scheme in a free link, but a colon +input: " :example.com" +output: [Text(text=" :example.com")] + +--- + +name: no_scheme_but_colon_and_slashes +label: no scheme in a free link, but a colon and slashes +input: " ://example.com" +output: [Text(text=" ://example.com")] + +--- + +name: fake_scheme_no_slashes +label: a nonexistent scheme in a free link, without slashes +input: "fake:example.com" +output: 
[Text(text="fake:example.com")] + +--- + +name: fake_scheme_slashes +label: a nonexistent scheme in a free link, with slashes +input: "fake://example.com" +output: [Text(text="fake://example.com")] + +--- + +name: fake_scheme_brackets_no_slashes +label: a nonexistent scheme in a bracketed link, without slashes +input: "[fake:example.com]" +output: [Text(text="[fake:example.com]")] + +--- + +name: fake_scheme_brackets_slashes +label: #=a nonexistent scheme in a bracketed link, with slashes +input: "[fake://example.com]" +output: [Text(text="[fake://example.com]")] + +--- + +name: interrupted_scheme +label: an otherwise valid scheme with something in the middle of it, in a free link +input: "ht?tp://example.com" +output: [Text(text="ht?tp://example.com")] + +--- + +name: interrupted_scheme_brackets +label: an otherwise valid scheme with something in the middle of it, in a bracketed link +input: "[ht?tp://example.com]" +output: [Text(text="[ht?tp://example.com]")] + +--- + +name: no_slashes_brackets +label: no slashes after the colon in a bracketed link +input: "[mailto:example@example.com Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="mailto:example@example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: space_before_no_slashes_brackets +label: a space before a bracketed link with no slashes after the colon +input: "[ mailto:example@example.com Example]" +output: [Text(text="[ "), ExternalLinkOpen(brackets=False), Text(text="mailto:example@example.com"), ExternalLinkClose(), Text(text=" Example]")] + +--- + +name: slashes_optional_brackets +label: a bracketed link using a scheme that doesn't need slashes, but has them anyway +input: "[mailto://example@example.com Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="mailto://example@example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: short_brackets +label: a very short link in brackets +input: "[mailto://abc Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="mailto://abc"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: slashes_missing_brackets +label: slashes missing from a scheme that requires them in a bracketed link +input: "[http:example@example.com Example]" +output: [Text(text="[http:example@example.com Example]")] + +--- + +name: protcol_relative +label: a protocol-relative link (in brackets) +input: "[//example.com Example]" +output: [ExternalLinkOpen(brackets=True), Text(text="//example.com"), ExternalLinkSeparator(), Text(text="Example"), ExternalLinkClose()] + +--- + +name: scheme_missing_but_colon_brackets +label: scheme missing from a bracketed link, but with a colon +input: "[:example.com Example]" +output: [Text(text="[:example.com Example]")] + +--- + +name: scheme_missing_but_colon_slashes_brackets +label: scheme missing from a bracketed link, but with a colon and slashes +input: "[://example.com Example]" +output: [Text(text="[://example.com Example]")] + +--- + +name: unclosed_protocol_relative +label: an unclosed protocol-relative bracketed link +input: "[//example.com" +output: [Text(text="[//example.com")] + +--- + +name: space_before_protcol_relative +label: a space before a protocol-relative bracketed link +input: "[ //example.com]" +output: [Text(text="[ //example.com]")] + +--- + +name: unclosed_just_scheme +label: an unclosed bracketed link, ending after the scheme +input: "[http" +output: [Text(text="[http")] + +--- + +name: unclosed_scheme_colon 
+label: an unclosed bracketed link, ending after the colon +input: "[http:" +output: [Text(text="[http:")] + +--- + +name: unclosed_scheme_colon_slashes +label: an unclosed bracketed link, ending after the slashes +input: "[http://" +output: [Text(text="[http://")] + +--- + +name: incomplete_bracket +label: just an open bracket +input: "[" +output: [Text(text="[")] + +--- + +name: incomplete_scheme_colon +label: a free link with just a scheme and a colon +input: "http:" +output: [Text(text="http:")] + +--- + +name: incomplete_scheme_colon_slashes +label: a free link with just a scheme, colon, and slashes +input: "http://" +output: [Text(text="http://")] + +--- + +name: brackets_scheme_but_no_url +label: brackets around a scheme and a colon +input: "[mailto:]" +output: [Text(text="[mailto:]")] + +--- + +name: brackets_scheme_slashes_but_no_url +label: brackets around a scheme, colon, and slashes +input: "[http://]" +output: [Text(text="[http://]")] + +--- + +name: brackets_scheme_title_but_no_url +label: brackets around a scheme, colon, and slashes, with a title +input: "[http:// Example]" +output: [Text(text="[http:// Example]")] diff --git a/tests/tokenizer/html_entities.mwtest b/tests/tokenizer/html_entities.mwtest index 625dd60..53bedbd 100644 --- a/tests/tokenizer/html_entities.mwtest +++ b/tests/tokenizer/html_entities.mwtest @@ -117,6 +117,20 @@ output: [Text(text="&;")] --- +name: invalid_partial_amp_pound +label: invalid entities: just an ampersand, pound sign +input: "&#" +output: [Text(text="&#")] + +--- + +name: invalid_partial_amp_pound_x +label: invalid entities: just an ampersand, pound sign, x +input: "&#x" +output: [Text(text="&#x")] + +--- + name: invalid_partial_amp_pound_semicolon label: invalid entities: an ampersand, pound sign, and semicolon input: "&#;" diff --git a/tests/tokenizer/integration.mwtest b/tests/tokenizer/integration.mwtest index d3cb419..083b12c 100644 --- a/tests/tokenizer/integration.mwtest +++ b/tests/tokenizer/integration.mwtest @@ -12,6 +12,13 @@ output: [TemplateOpen(), ArgumentOpen(), ArgumentOpen(), Text(text="foo"), Argum --- +name: link_in_template_name +label: a wikilink inside a template name, which breaks the template +input: "{{foo[[bar]]}}" +output: [Text(text="{{foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="}}")] + +--- + name: rich_heading label: a heading with templates/wikilinks in it input: "== Head{{ing}} [[with]] {{{funky|{{stuf}}}}} ==" @@ -33,6 +40,13 @@ output: [Text(text="&n"), CommentStart(), Text(text="foo"), CommentEnd(), Text(t --- +name: rich_tags +label: a HTML tag with tons of other things in it +input: "{{dubious claim}}[[Source]]" +output: [TemplateOpen(), Text(text="dubious claim"), TemplateClose(), TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TemplateOpen(), Text(text="abc"), TemplateClose(), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="foo"), TagAttrEquals(), TagAttrQuote(), Text(text="bar "), TemplateOpen(), Text(text="baz"), TemplateClose(), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="abc"), TagAttrEquals(), TemplateOpen(), Text(text="de"), TemplateClose(), Text(text="f"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="ghi"), TagAttrEquals(), Text(text="j"), TemplateOpen(), Text(text="k"), TemplateClose(), TemplateOpen(), Text(text="l"), TemplateClose(), TagAttrStart(pad_first=" \n ", pad_before_eq=" ", pad_after_eq=" 
"), Text(text="mno"), TagAttrEquals(), TagAttrQuote(), TemplateOpen(), Text(text="p"), TemplateClose(), Text(text=" "), WikilinkOpen(), Text(text="q"), WikilinkClose(), Text(text=" "), TemplateOpen(), Text(text="r"), TemplateClose(), TagCloseOpen(padding=""), WikilinkOpen(), Text(text="Source"), WikilinkClose(), TagOpenClose(), Text(text="ref"), TagCloseClose()] + +--- + name: wildcard label: a wildcard assortment of various things input: "{{{{{{{{foo}}bar|baz=biz}}buzz}}usr|{{bin}}}}" @@ -44,3 +58,17 @@ name: wildcard_redux label: an even wilder assortment of various things input: "{{a|b|{{c|[[d]]{{{e}}}}}}}[[f|{{{g}}}]]{{i|j= }}" output: [TemplateOpen(), Text(text="a"), TemplateParamSeparator(), Text(text="b"), TemplateParamSeparator(), TemplateOpen(), Text(text="c"), TemplateParamSeparator(), WikilinkOpen(), Text(text="d"), WikilinkClose(), ArgumentOpen(), Text(text="e"), ArgumentClose(), TemplateClose(), TemplateClose(), WikilinkOpen(), Text(text="f"), WikilinkSeparator(), ArgumentOpen(), Text(text="g"), ArgumentClose(), CommentStart(), Text(text="h"), CommentEnd(), WikilinkClose(), TemplateOpen(), Text(text="i"), TemplateParamSeparator(), Text(text="j"), TemplateParamEquals(), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), TemplateClose()] + +--- + +name: link_inside_dl +label: an external link inside a def list, such that the external link is parsed +input: ";;;mailto:example" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), ExternalLinkOpen(brackets=False), Text(text="mailto:example"), ExternalLinkClose()] + +--- + +name: link_inside_dl_2 +label: an external link inside a def list, such that the external link is not parsed +input: ";;;malito:example" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="malito"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="example")] diff --git a/tests/tokenizer/tags.mwtest b/tests/tokenizer/tags.mwtest new file mode 100644 index 0000000..a0d7f18 --- /dev/null +++ b/tests/tokenizer/tags.mwtest @@ -0,0 +1,578 @@ +name: basic +label: a basic tag with an open and close +input: "" +output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()] + +--- + +name: basic_selfclosing +label: a basic self-closing tag +input: "" +output: [TagOpenOpen(), Text(text="ref"), TagCloseSelfclose(padding="")] + +--- + +name: content +label: a tag with some content in the middle +input: "this is a reference" +output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), Text(text="this is a reference"), TagOpenClose(), Text(text="ref"), TagCloseClose()] + +--- + +name: padded_open +label: a tag with some padding in the open tag +input: "" +output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=" "), TagOpenClose(), Text(text="ref"), TagCloseClose()] + +--- + +name: padded_close +label: a tag with some padding in the close tag +input: "" +output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref "), TagCloseClose()] + +--- + +name: padded_selfclosing +label: a self-closing tag with padding +input: "" +output: [TagOpenOpen(), Text(text="ref"), TagCloseSelfclose(padding=" ")] + +--- + +name: 
attribute
+label: a tag with a single attribute
+input: "<ref name></ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: attribute_value
+label: a tag with a single attribute with a value
+input: "<ref name=foo></ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: attribute_quoted
+label: a tag with a single quoted attribute
+input: "<ref name="foo bar"></ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="foo bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: attribute_hyphen
+label: a tag with a single attribute, containing a hyphen
+input: "<ref name=foo-bar></ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo-bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: attribute_quoted_hyphen
+label: a tag with a single quoted attribute, containing a hyphen
+input: "<ref name="foo-bar"></ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="foo-bar"), TagCloseOpen(padding=""), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: attribute_selfclosing
+label: a self-closing tag with a single attribute
+input: "<ref name/>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagCloseSelfclose(padding="")]
+
+---
+
+name: attribute_selfclosing_value
+label: a self-closing tag with a single attribute with a value
+input: "<ref name=foo/>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), Text(text="foo"), TagCloseSelfclose(padding="")]
+
+---
+
+name: attribute_selfclosing_value_quoted
+label: a self-closing tag with a single quoted attribute
+input: "<ref name="foo"/>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="foo"), TagCloseSelfclose(padding="")]
+
+---
+
+name: nested_tag
+label: a tag nested within the attributes of another
+input: "<ref name=<span style="color: red;">foo</span>>citation</ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), TagAttrQuote(), Text(text="color: red;"), TagCloseOpen(padding=""), Text(text="foo"), TagOpenClose(), Text(text="span"), TagCloseClose(), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: nested_tag_quoted
+label: a tag nested within the attributes of another, quoted
+input: "<ref name="<span style="color: red;">foo</span>">citation</ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), TagAttrQuote(), Text(text="color: red;"), TagCloseOpen(padding=""), Text(text="foo"), TagOpenClose(), Text(text="span"), TagCloseClose(), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: nested_troll_tag
+label: a bogus tag that appears to be nested within the attributes of another
+input: "<ref <ref>>citation</ref>"
+output: [Text(text="<ref <ref>>citation</ref>")]
+
+---
+
+name: nested_troll_tag_quoted
+label: a bogus tag that appears to be nested within the attributes of another, quoted
+input: "<ref name="<ref>">citation</ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="name"), TagAttrEquals(), TagAttrQuote(), Text(text="<ref>"), TagCloseOpen(padding=""), Text(text="citation"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: invalid_space_begin_open
+label: invalid tag: a space at the beginning of the open tag
+input: "< ref>test</ref>"
+output: [Text(text="< ref>test</ref>")]
+
+---
+
+name: invalid_space_begin_close
+label: invalid tag: a space at the beginning of the close tag
+input: "<ref>test</ ref>"
+output: [Text(text="<ref>test</ ref>")]
+
+---
+
+name: valid_space_end
+label: valid tag: spaces at the ends of both the open and close tags
+input: "<ref >test</ref >"
+output: [TagOpenOpen(), Text(text="ref"), TagCloseOpen(padding=" "), Text(text="test"), TagOpenClose(), Text(text="ref "), TagCloseClose()]
+
+---
+
+name: invalid_template_ends
+label: invalid tag: a template at the ends of both the open and close tags
+input: "<ref {{foo}}>test</ref {{foo}}>"
+output: [Text(text="<ref {{foo}}>test</ref {{foo}}>")]
+
+---
+
+name: invalid_template_end_close
+label: invalid tag: a template at the end of the close tag
+input: "<ref>test</ref {{foo}}>"
+output: [Text(text="<ref>test</ref {{foo}}>")]
+
+---
+
+name: valid_template_end_open
+label: valid tag: a template at the end of the open tag
+input: "<ref {{foo}}>test</ref>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), TemplateOpen(), Text(text="foo"), TemplateClose(), TagCloseOpen(padding=""), Text(text="test"), TagOpenClose(), Text(text="ref"), TagCloseClose()]
+
+---
+
+name: valid_template_end_open_space_end_close
+label: valid tag: a template at the end of the open tag; whitespace at the end of the close tag
+input: "<ref {{foo}}>test</ref\n>"
+output: [TagOpenOpen(), Text(text="ref"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), TemplateOpen(), Text(text="foo"), TemplateClose(), TagCloseOpen(padding=""), Text(text="test"), TagOpenClose(), Text(text="ref\n"), TagCloseClose()]
+
+---
+
+name: invalid_template_end_open_nospace
+label: invalid tag: a template at the end of the open tag, without spacing
+input: "<ref{{foo}}>test</ref>"
+output: [Text(text="<ref"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text=">test</ref>")]
+
+---
+
+name: invalid_template_start_close
+label: invalid tag: a template at the beginning of the close tag
+input: "<ref>test</{{foo}}ref>"
+output: [Text(text="<ref>test</"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="ref>")]
+
+---
+
+name: invalid_template_start_open
+label: invalid tag: a template at the beginning of the open tag
+input: "<{{foo}}ref>test</ref>"
+output: [Text(text="<"), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="ref>test</ref>")]
+
+---
+
+name: unclosed_quote
+label: a quoted attribute that is never closed
+input: "<span style="foo"bar>stuff</span>"
+output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foo\"bar"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
+
+---
+
+name: fake_quote_complex
+label: a fake quoted attribute, with spaces and templates and links
+input: "<span style="foo {{bar}}\n[[baz]]"buzz >stuff</span>"
+output: [TagOpenOpen(), Text(text="span"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="style"), TagAttrEquals(), Text(text="\"foo"), TagAttrStart(pad_first=" ", pad_before_eq="\n", pad_after_eq=""), TemplateOpen(), Text(text="bar"), TemplateClose(), TagAttrStart(pad_first="", pad_before_eq=" ", pad_after_eq=""), WikilinkOpen(), Text(text="baz"), WikilinkClose(), Text(text="\"buzz"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()]
WikilinkOpen(), Text(text="baz"), WikilinkClose(), Text(text="\"buzz"), TagCloseOpen(padding=""), Text(text="stuff"), TagOpenClose(), Text(text="span"), TagCloseClose()] + +--- + +name: incomplete_lbracket +label: incomplete tags: just a left bracket +input: "<" +output: [Text(text="<")] + +--- + +name: incomplete_lbracket_junk +label: incomplete tags: just a left bracket, surrounded by stuff +input: "foo" +output: [Text(text="junk ")] + +--- + +name: incomplete_open_unnamed_attr +label: incomplete tags: an open tag, unnamed attribute +input: "junk " +output: [Text(text="junk ")] + +--- + +name: incomplete_open_attr_equals +label: incomplete tags: an open tag, attribute, equal sign +input: "junk " +output: [Text(text="junk ")] + +--- + +name: incomplete_open_attr +label: incomplete tags: an open tag, attribute with a key/value +input: "junk " +output: [Text(text="junk ")] + +--- + +name: incomplete_open_attr_quoted +label: incomplete tags: an open tag, attribute with a key/value, quoted +input: "junk " +output: [Text(text="junk ")] + +--- + +name: incomplete_open_text +label: incomplete tags: an open tag, text +input: "junk foo" +output: [Text(text="junk foo")] + +--- + +name: incomplete_open_attr_text +label: incomplete tags: an open tag, attribute with a key/value, text +input: "junk bar" +output: [Text(text="junk bar")] + +--- + +name: incomplete_open_text_lbracket +label: incomplete tags: an open tag, text, left open bracket +input: "junk bar<" +output: [Text(text="junk bar<")] + +--- + +name: incomplete_open_text_lbracket_slash +label: incomplete tags: an open tag, text, left bracket, slash +input: "junk barbarbarbar" +output: [Text(text="junk bar")] + +--- + +name: incomplete_unclosed_close +label: incomplete tags: an unclosed close tag +input: "junk " +output: [Text(text="junk ")] + +--- + +name: incomplete_no_tag_name_open +label: incomplete tags: no tag name within brackets; just an open +input: "junk <>" +output: [Text(text="junk <>")] + +--- + +name: incomplete_no_tag_name_selfclosing +label: incomplete tags: no tag name within brackets; self-closing +input: "junk < />" +output: [Text(text="junk < />")] + +--- + +name: incomplete_no_tag_name_open_close +label: incomplete tags: no tag name within brackets; open and close +input: "junk <>" +output: [Text(text="junk <>")] + +--- + +name: backslash_premature_before +label: a backslash before a quote before a space +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is\\\" quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: backslash_premature_after +label: a backslash before a quote after a space +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is \\\"quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: backslash_premature_middle +label: a backslash before a quote in the middle of a word +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this i\\\"s quoted"), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: 
backslash_adjacent +label: escaped quotes next to unescaped quotes +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="\\\"this is quoted\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: backslash_endquote +label: backslashes before the end quote, causing the attribute to become unquoted +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), Text(text="\"this_is"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: backslash_double +label: two adjacent backslashes, which do *not* affect the quote +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is\\\\"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: backslash_triple +label: three adjacent backslashes, which do *not* affect the quote +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="this is\\\\\\"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="quoted\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: backslash_unaffecting +label: backslashes near quotes, but not immediately adjacent, thus having no effect +input: "blah" +output: [TagOpenOpen(), Text(text="foo"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attribute"), TagAttrEquals(), TagAttrQuote(), Text(text="\\quote\\d"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="also"), TagAttrEquals(), Text(text="\"quote\\d\\\""), TagCloseOpen(padding=""), Text(text="blah"), TagOpenClose(), Text(text="foo"), TagCloseClose()] + +--- + +name: unparsable +label: a tag that should not be put through the normal parser +input: "{{t1}}{{t2}}{{t3}}" +output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="{{t2}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()] + +--- + +name: unparsable_complex +label: a tag that should not be put through the normal parser; lots of stuff inside +input: "{{t1}}
    {{t2}}\n==Heading==\nThis is some text with a [[page|link]].
    {{t3}}" +output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="pre"), TagCloseOpen(padding=""), Text(text="{{t2}}\n==Heading==\nThis is some text with a [[page|link]]."), TagOpenClose(), Text(text="pre"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()] + +--- + +name: unparsable_attributed +label: a tag that should not be put through the normal parser; parsed attributes +input: "{{t1}}{{t2}}{{t3}}" +output: [TemplateOpen(), Text(text=u't1'), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attr"), TagAttrEquals(), Text(text="val"), TagAttrStart(pad_first=" ", pad_before_eq="", pad_after_eq=""), Text(text="attr2"), TagAttrEquals(), TagAttrQuote(), TemplateOpen(), Text(text="val2"), TemplateClose(), TagCloseOpen(padding=""), Text(text="{{t2}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t3"), TemplateClose()] + +--- + +name: unparsable_incomplete +label: a tag that should not be put through the normal parser; incomplete +input: "{{t1}}{{t2}}{{t3}}" +output: [TemplateOpen(), Text(text="t1"), TemplateClose(), Text(text=""), TemplateOpen(), Text(text="t2"), TemplateClose(), TemplateOpen(), Text(text="t3"), TemplateClose()] + +--- + +name: unparsable_entity +label: a HTML entity inside unparsable text is still parsed +input: "{{t1}}{{t2}} {{t3}}{{t4}}" +output: [TemplateOpen(), Text(text="t1"), TemplateClose(), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="{{t2}}"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="{{t3}}"), TagOpenClose(), Text(text="nowiki"), TagCloseClose(), TemplateOpen(), Text(text="t4"), TemplateClose()] + +--- + +name: unparsable_entity_incomplete +label: an incomplete HTML entity inside unparsable text +input: "&" +output: [TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="&"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()] + +--- + +name: unparsable_entity_incomplete_2 +label: an incomplete HTML entity inside unparsable text +input: "&" +output: [Text(text="&")] + +--- + +name: single_open_close +label: a tag that supports being single; both an open and a close tag +input: "foo
+output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseOpen(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose(), TagOpenClose(), Text(text="li"), TagCloseClose()]
+
+---
+
+name: single_open
+label: a tag that supports being single; just an open tag
+input: "foo<li>bar{{baz}}"
+output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
+
+---
+
+name: single_selfclose
+label: a tag that supports being single; a self-closing tag
+input: "foo<li/>bar{{baz}}"
+output: [Text(text="foo"), TagOpenOpen(), Text(text="li"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
+
+---
+
+name: single_close
+label: a tag that supports being single; just a close tag
+input: "foo</li>bar{{baz}}"
+output: [Text(text="foo</li>bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
+
+---
+
+name: single_only_open_close
+label: a tag that can only be single; both an open and a close tag
+input: "foo<br>bar{{baz}}</br>"
+output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose(), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding="", implicit=True)]
+
+---
+
+name: single_only_open
+label: a tag that can only be single; just an open tag
+input: "foo<br>bar{{baz}}"
+output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
+
+---
+
+name: single_only_selfclose
+label: a tag that can only be single; a self-closing tag
+input: "foo<br/>bar{{baz}}"
+output: [Text(text="foo"), TagOpenOpen(), Text(text="br"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
+
+---
+
+name: single_only_close
+label: a tag that can only be single; just a close tag
+input: "foo</br>bar{{baz}}"
+output: [Text(text="foo"), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding="", implicit=True), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
+
+---
+
+name: single_only_double
+label: a tag that can only be single; a tag with slashes at the beginning and end
+input: "foo</br/>bar{{baz}}"
+output: [Text(text="foo"), TagOpenOpen(invalid=True), Text(text="br"), TagCloseSelfclose(padding=""), Text(text="bar"), TemplateOpen(), Text(text="baz"), TemplateClose()]
+
+---
+
+name: single_only_close_attribute
+label: a tag that can only be single; presented as a close tag with an attribute
+input: "</br id="break">"
+output: [TagOpenOpen(invalid=True), Text(text="br"), TagAttrStart(pad_first=" ", pad_after_eq="", pad_before_eq=""), Text(text="id"), TagAttrEquals(), TagAttrQuote(), Text(text="break"), TagCloseSelfclose(padding="", implicit=True)]
+
+---
+
+name: capitalization
+label: caps should be ignored within tag names
+input: "<NoWiKi>{{test}}</nOwIkI>"
+output: [TagOpenOpen(), Text(text="NoWiKi"), TagCloseOpen(padding=""), Text(text="{{test}}"), TagOpenClose(), Text(text="nOwIkI"), TagCloseClose()]
diff --git a/tests/tokenizer/tags_wikimarkup.mwtest b/tests/tokenizer/tags_wikimarkup.mwtest
new file mode 100644
index 0000000..feff9c5
--- /dev/null
+++ b/tests/tokenizer/tags_wikimarkup.mwtest
@@ -0,0 +1,523 @@
+name: basic_italics
+label: basic italic text
+input: "''text''"
+output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="text"), TagOpenClose(), Text(text="i"), TagCloseClose()]
+
+---
+
+name: basic_bold
+label: basic bold text
+input: "'''text'''"
+output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="text"), TagOpenClose(), Text(text="b"), TagCloseClose()]
+
+---
+
+name: basic_ul
+label: basic unordered list
+input: "*text"
+output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="text")]
+
+---
+
+name: basic_ol
+label: basic ordered list
+input: "#text"
+output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="text")]
+
+---
+
+name: basic_dt
+label: basic description term
+input: ";text"
+output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="text")]
+
+---
+
+name: basic_dd
+label: basic description item
+input: ":text"
+output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="text")]
+
+---
+
+name: basic_hr
+label: basic horizontal rule
+input: "----"
+output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose()]
+
+---
+
+name: complex_italics
+label: italics with a lot in them
+input: "''this is a&nbsp;test of [[Italic text|italics]] with {{plenty|of|stuff}}''"
+output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of "), WikilinkOpen(), Text(text="Italic text"), WikilinkSeparator(), Text(text="italics"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose(), TagOpenClose(), Text(text="i"), TagCloseClose()]
+
+---
+
+name: multiline_italics
+label: italics spanning multiple lines
+input: "foo\nbar''testing\ntext\nspanning\n\n\n\n\nmultiple\nlines''foo\n\nbar"
+output: [Text(text="foo\nbar"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="testing\ntext\nspanning\n\n\n\n\nmultiple\nlines"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="foo\n\nbar")]
+
+---
+
+name: unending_italics
+label: italics without an ending tag
+input: "''unending formatting!"
+output: [Text(text="''unending formatting!")] + +--- + +name: misleading_italics_end +label: italics with something that looks like an end but isn't +input: "''this is 'not' the en'd'''" +output: [Text(text="''this is 'not' the en'd'"), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="''"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()] +] + +--- + +name: italics_start_outside_end_inside +label: italics that start outside a link and end inside it +input: "''foo[[bar|baz'']]spam" +output: [Text(text="''foo"), WikilinkOpen(), Text(text="bar"), WikilinkSeparator(), Text(text="baz''"), WikilinkClose(), Text(text="spam")] + +--- + +name: italics_start_inside_end_outside +label: italics that start inside a link and end outside it +input: "[[foo|''bar]]baz''spam" +output: [Text(text="[[foo|"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar]]baz"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="spam")] + +--- + +name: complex_bold +label: bold with a lot in it +input: "'''this is a test of [[Bold text|bold]] with {{plenty|of|stuff}}'''" +output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of "), WikilinkOpen(), Text(text="Bold text"), WikilinkSeparator(), Text(text="bold"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose(), TagOpenClose(), Text(text="b"), TagCloseClose()] + +--- + +name: multiline_bold +label: bold spanning mulitple lines +input: "foo\nbar'''testing\ntext\nspanning\n\n\n\n\nmultiple\nlines'''foo\n\nbar" +output: [Text(text="foo\nbar"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="testing\ntext\nspanning\n\n\n\n\nmultiple\nlines"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="foo\n\nbar")] + +--- + +name: unending_bold +label: bold without an ending tag +input: "'''unending formatting!" +output: [Text(text="'''unending formatting!")] + +--- + +name: misleading_bold_end +label: bold with something that looks like an end but isn't +input: "'''this is 'not' the en''d''''" +output: [Text(text="'"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="this is 'not' the en"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="d'"), TagOpenOpen(), Text(text="nowiki"), TagCloseOpen(padding=""), Text(text="'''"), TagOpenClose(), Text(text="nowiki"), TagCloseClose()] + +--- + +name: bold_start_outside_end_inside +label: bold that start outside a link and end inside it +input: "'''foo[[bar|baz''']]spam" +output: [Text(text="'''foo"), WikilinkOpen(), Text(text="bar"), WikilinkSeparator(), Text(text="baz'''"), WikilinkClose(), Text(text="spam")] + +--- + +name: bold_start_inside_end_outside +label: bold that start inside a link and end outside it +input: "[[foo|'''bar]]baz'''spam" +output: [Text(text="[[foo|"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bar]]baz"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="spam")] + +--- + +name: bold_and_italics +label: bold and italics together +input: "this is '''''bold and italic text'''''!" 
+output: [Text(text="this is "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold and italic text"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="!")] + +--- + +name: both_then_bold +label: text that starts bold/italic, then is just bold +input: "'''''both''bold'''" +output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="bold"), TagOpenClose(), Text(text="b"), TagCloseClose()] + +--- + +name: both_then_italics +label: text that starts bold/italic, then is just italic +input: "'''''both'''italics''" +output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="italics"), TagOpenClose(), Text(text="i"), TagCloseClose()] + +--- + +name: bold_then_both +label: text that starts just bold, then is bold/italic +input: "'''bold''both'''''" +output: [TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="i"), TagCloseClose(), TagOpenClose(), Text(text="b"), TagCloseClose()] + +--- + +name: italics_then_both +label: text that starts just italic, then is bold/italic +input: "''italics'''both'''''" +output: [TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="italics"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="both"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenClose(), Text(text="i"), TagCloseClose()] + +--- + +name: italics_then_bold +label: text that starts italic, then is bold +input: "none''italics'''''bold'''none" +output: [Text(text="none"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="italics"), TagOpenClose(), Text(text="i"), TagCloseClose(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text="none")] + +--- + +name: bold_then_italics +label: text that starts bold, then is italic +input: "none'''bold'''''italics''none" +output: [Text(text="none"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bold"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="italics"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text="none")] + +--- + +name: five_three +label: five ticks to open, three to close (bold) +input: "'''''foobar'''" +output: [Text(text="''"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="foobar"), TagOpenClose(), Text(text="b"), TagCloseClose()] + +--- + +name: five_two +label: five ticks to open, two to close (bold) +input: "'''''foobar''" +output: [Text(text="'''"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="foobar"), TagOpenClose(), Text(text="i"), TagCloseClose()] + +--- + +name: four +label: four ticks +input: "foo ''''bar'''' baz" +output: [Text(text="foo '"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="bar'"), TagOpenClose(), Text(text="b"), TagCloseClose(), Text(text=" baz")] + +--- + +name: four_two 
+label: four ticks to open, two to close
+input: "foo ''''bar'' baz"
+output: [Text(text="foo ''"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text=" baz")]
+
+---
+
+name: two_three
+label: two ticks to open, three to close
+input: "foo ''bar''' baz"
+output: [Text(text="foo "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar'"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text=" baz")]
+
+---
+
+name: two_four
+label: two ticks to open, four to close
+input: "foo ''bar'''' baz"
+output: [Text(text="foo "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar''"), TagOpenClose(), Text(text="i"), TagCloseClose(), Text(text=" baz")]
+
+---
+
+name: two_three_two
+label: two ticks to open, three to close, two afterwards
+input: "foo ''bar''' baz''"
+output: [Text(text="foo "), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), Text(text="bar''' baz"), TagOpenClose(), Text(text="i"), TagCloseClose()]
+
+---
+
+name: two_four_four
+label: two ticks to open, four to close, four afterwards
+input: "foo ''bar'''' baz''''"
+output: [Text(text="foo ''bar'"), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text=" baz'"), TagOpenClose(), Text(text="b"), TagCloseClose()]
+
+---
+
+name: seven
+label: seven ticks
+input: "'''''''seven'''''''"
+output: [Text(text="''"), TagOpenOpen(wiki_markup="''"), Text(text="i"), TagCloseOpen(), TagOpenOpen(wiki_markup="'''"), Text(text="b"), TagCloseOpen(), Text(text="seven''"), TagOpenClose(), Text(text="b"), TagCloseClose(), TagOpenClose(), Text(text="i"), TagCloseClose()]
+
+---
+
+name: complex_ul
+label: ul with a lot in it
+input: "* this is a&nbsp;test of an [[Unordered list|ul]] with {{plenty|of|stuff}}"
+output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="Unordered list"), WikilinkSeparator(), Text(text="ul"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()]
+
+---
+
+name: ul_multiline_template
+label: ul with a template that spans multiple lines
+input: "* this has a template with a {{line|\nbreak}}\nthis is not part of the list"
+output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")]
+
+---
+
+name: ul_adjacent
+label: multiple adjacent uls
+input: "a\n*b\n*c\nd\n*e\nf"
+output: [Text(text="a\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf")]
+
+---
+
+name: ul_depths
+label: multiple adjacent uls, with differing depths
+input: "*a\n**b\n***c\n********d\n**e\nf\n***g"
+output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="g")]
TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="g")] + +--- + +name: ul_space_before +label: uls with space before them +input: "foo *bar\n *baz\n*buzz" +output: [Text(text="foo *bar\n *baz\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="buzz")] + +--- + +name: ul_interruption +label: high-depth ul with something blocking it +input: "**f*oobar" +output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="f*oobar")] + +--- + +name: complex_ol +label: ol with a lot in it +input: "# this is a test of an [[Ordered list|ol]] with {{plenty|of|stuff}}" +output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="Ordered list"), WikilinkSeparator(), Text(text="ol"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()] + +--- + +name: ol_multiline_template +label: ol with a template that spans moltiple lines +input: "# this has a template with a {{line|\nbreak}}\nthis is not part of the list" +output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")] + +--- + +name: ol_adjacent +label: moltiple adjacent ols +input: "a\n#b\n#c\nd\n#e\nf" +output: [Text(text="a\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf")] + +--- + +name: ol_depths +label: moltiple adjacent ols, with differing depths +input: "#a\n##b\n###c\n########d\n##e\nf\n###g" +output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), 
TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="g")] + +--- + +name: ol_space_before +label: ols with space before them +input: "foo #bar\n #baz\n#buzz" +output: [Text(text="foo #bar\n #baz\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="buzz")] + +--- + +name: ol_interruption +label: high-depth ol with something blocking it +input: "##f#oobar" +output: [TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="f#oobar")] + +--- + +name: ul_ol_mix +label: a mix of adjacent uls and ols +input: "*a\n*#b\n*##c\n*##*#*#*d\n*#e\nf\n##*g" +output: [TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="g")] + +--- + +name: complex_dt +label: dt with a lot in it +input: "; this is a test of an [[description term|dt]] with {{plenty|of|stuff}}" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="description term"), WikilinkSeparator(), Text(text="dt"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), 
Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()] + +--- + +name: dt_multiline_template +label: dt with a template that spans mdttiple lines +input: "; this has a template with a {{line|\nbreak}}\nthis is not part of the list" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")] + +--- + +name: dt_adjacent +label: mdttiple adjacent dts +input: "a\n;b\n;c\nd\n;e\nf" +output: [Text(text="a\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="e\nf")] + +--- + +name: dt_depths +label: mdttiple adjacent dts, with differing depths +input: ";a\n;;b\n;;;c\n;;;;;;;;d\n;;e\nf\n;;;g" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="g")] + +--- + +name: dt_space_before +label: dts with space before them +input: "foo ;bar\n ;baz\n;buzz" +output: [Text(text="foo ;bar\n ;baz\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="buzz")] + +--- + +name: dt_interruption +label: high-depth dt with something blocking it +input: ";;f;oobar" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="f;oobar")] + +--- + +name: complex_dd +label: dd with a lot in it +input: ": this is a test of an [[description item|dd]] with {{plenty|of|stuff}}" +output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text=" this is a"), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), Text(text="test of an "), WikilinkOpen(), Text(text="description item"), WikilinkSeparator(), Text(text="dd"), WikilinkClose(), Text(text=" with "), TemplateOpen(), Text(text="plenty"), TemplateParamSeparator(), Text(text="of"), TemplateParamSeparator(), Text(text="stuff"), TemplateClose()] + +--- + +name: 
+
+---
+
+name: dd_multiline_template
+label: dd with a template that spans multiple lines
+input: ": this has a template with a {{line|\nbreak}}\nthis is not part of the list"
+output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text=" this has a template with a "), TemplateOpen(), Text(text="line"), TemplateParamSeparator(), Text(text="\nbreak"), TemplateClose(), Text(text="\nthis is not part of the list")]
+
+---
+
+name: dd_adjacent
+label: multiple adjacent dds
+input: "a\n:b\n:c\nd\n:e\nf"
+output: [Text(text="a\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="c\nd\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="e\nf")]
+
+---
+
+name: dd_depths
+label: multiple adjacent dds, with differing depths
+input: ":a\n::b\n:::c\n::::::::d\n::e\nf\n:::g"
+output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="g")]
+
+---
+
+name: dd_space_before
+label: dds with space before them
+input: "foo :bar\n :baz\n:buzz"
+output: [Text(text="foo :bar\n :baz\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="buzz")]
+
+---
+
+name: dd_interruption
+label: high-depth dd with something blocking it
+input: "::f:oobar"
+output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="f:oobar")]
+
+---
+
+name: dt_dd_mix
+label: a mix of adjacent dts and dds
+input: ";a\n;:b\n;::c\n;::;:;:;d\n;:e\nf\n::;g"
+output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="a\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="b\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="c\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"),
TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="d\n"), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="e\nf\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="g")] + +--- + +name: dt_dd_mix2 +label: the correct usage of a dt/dd unit, as in a dl +input: ";foo:bar:baz" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="foo"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="bar:baz")] + +--- + +name: dt_dd_mix3 +label: another example of correct (but strange) dt/dd usage +input: ":;;::foo:bar:baz" +output: [TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="foo"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="bar:baz")] + +--- + +name: ul_ol_dt_dd_mix +label: an assortment of uls, ols, dds, and dts +input: ";:#*foo\n:#*;foo\n#*;:foo\n*;:#foo" +output: [TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), Text(text="foo\n"), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), Text(text="foo\n"), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), Text(text="foo\n"), TagOpenOpen(wiki_markup="*"), Text(text="li"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=";"), Text(text="dt"), TagCloseSelfclose(), TagOpenOpen(wiki_markup=":"), Text(text="dd"), TagCloseSelfclose(), TagOpenOpen(wiki_markup="#"), Text(text="li"), TagCloseSelfclose(), Text(text="foo")] + +--- + +name: hr_text_before +label: text before an otherwise-valid hr +input: "foo----" +output: [Text(text="foo----")] + +--- + +name: hr_text_after +label: text after a valid hr +input: "----bar" +output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="bar")] + +--- + +name: hr_text_before_after +label: text at both ends of an otherwise-valid hr +input: "foo----bar" +output: [Text(text="foo----bar")] + +--- + +name: hr_newlines +label: newlines surrounding a valid hr +input: "foo\n----\nbar" +output: [Text(text="foo\n"), 
TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="\nbar")] + +--- + +name: hr_adjacent +label: two adjacent hrs +input: "----\n----" +output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="\n"), TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose()] + +--- + +name: hr_adjacent_space +label: two adjacent hrs, with a space before the second one, making it invalid +input: "----\n ----" +output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="\n ----")] + +--- + +name: hr_short +label: an invalid three-hyphen-long hr +input: "---" +output: [Text(text="---")] + +--- + +name: hr_long +label: a very long, valid hr +input: "------------------------------------------" +output: [TagOpenOpen(wiki_markup="------------------------------------------"), Text(text="hr"), TagCloseSelfclose()] + +--- + +name: hr_interruption_short +label: a hr that is interrupted, making it invalid +input: "---x-" +output: [Text(text="---x-")] + +--- + +name: hr_interruption_long +label: a hr that is interrupted, but the first part remains valid because it is long enough +input: "----x--" +output: [TagOpenOpen(wiki_markup="----"), Text(text="hr"), TagCloseSelfclose(), Text(text="x--")] + +--- + +name: nowiki_cancel +label: a nowiki tag before a list causes it to not be parsed +input: "* Unordered list" +output: [TagOpenOpen(), Text(text="nowiki"), TagCloseSelfclose(padding=" "), Text(text="* Unordered list")] diff --git a/tests/tokenizer/text.mwtest b/tests/tokenizer/text.mwtest index 77d5f50..040c677 100644 --- a/tests/tokenizer/text.mwtest +++ b/tests/tokenizer/text.mwtest @@ -23,3 +23,10 @@ name: unicode2 label: additional unicode check for non-BMP codepoints input: "𐌲𐌿𐍄𐌰𐍂𐌰𐌶𐌳𐌰" output: [Text(text="𐌲𐌿𐍄𐌰𐍂𐌰𐌶𐌳𐌰")] + +--- + +name: large +label: a lot of text, requiring multiple textbuffer blocks in the C tokenizer +input: 
"ZWfsZYcZyhGbkDYJiguJuuhsNyHGFkFhnjkbLJyXIygTHqcXdhsDkEOTSIKYlBiohLIkiXxvyebUyCGvvBcYqFdtcftGmaAanKXEIyYSEKlTfEEbdGhdePVwVImOyKiHSzAEuGyEVRIKPZaNjQsYqpqARIQfvAklFtQyTJVGlLwjJIxYkiqmHBmdOvTyNqJRbMvouoqXRyOhYDwowtkcZGSOcyzVxibQdnzhDYbrgbatUrlOMRvFSzmLWHRihtXnddwYadPgFWUOxAzAgddJVDXHerawdkrRuWaEXfuwQSkQUmLEJUmrgXDVlXCpciaisfuOUjBldElygamkkXbewzLucKRnAEBimIIotXeslRRhnqQjrypnLQvvdCsKFWPVTZaHvzJMFEahDHWcCbyXgxFvknWjhVfiLSDuFhGoFxqSvhjnnRZLmCMhmWeOgSoanDEInKTWHnbpKyUlabLppITDFFxyWKAnUYJQIcmYnrvMmzmtYvsbCYbebgAhMFVVFAKUSvlkLFYluDpbpBaNFWyfXTaOdSBrfiHDTWGBTUCXMqVvRCIMrEjWpQaGsABkioGnveQWqBTDdRQlxQiUipwfyqAocMddXqdvTHhEwjEzMkOSWVPjJvDtClhYwpvRztPmRKCSpGIpXQqrYtTLmShFdpKtOxGtGOZYIdyUGPjdmyvhJTQMtgYJWUUZnecRjBfQXsyWQWikyONySLzLEqRFqcJYdRNFcGwWZtfZasfFWcvdsHRXoqKlKYihRAOJdrPBDdxksXFwKceQVncmFXfUfBsNgjKzoObVExSnRnjegeEhqxXzPmFcuiasViAFeaXrAxXhSfSyCILkKYpjxNeKynUmdcGAbwRwRnlAFbOSCafmzXddiNpLCFTHBELvArdXFpKUGpSHRekhrMedMRNkQzmSyFKjVwiWwCvbNWjgxJRzYeRxHiCCRMXktmKBxbxGZvOpvZIJOwvGIxcBLzsMFlDqAMLtScdsJtrbIUAvKfcdChXGnBzIxGxXMgxJhayrziaCswdpjJJJhkaYnGhHXqZwOzHFdhhUIEtfjERdLaSPRTDDMHpQtonNaIgXUYhjdbnnKppfMBxgNSOOXJAPtFjfAKnrRDrumZBpNhxMstqjTGBViRkDqbTdXYUirsedifGYzZpQkvdNhtFTOPgsYXYCwZHLcSLSfwfpQKtWfZuRUUryHJsbVsAOQcIJdSKKlOvCeEjUQNRPHKXuBJUjPuaAJJxcDMqyaufqfVwUmHLdjeYZzSiiGLHOTCInpVAalbXXTMLugLiwFiyPSuSFiyJUKVrWjbZAHaJtZnQmnvorRrxdPKThqXzNgTjszQiCoMczRnwGYJMERUWGXFyrSbAqsHmLwLlnJOJoXNsjVehQjVOpQOQJAZWwFZBlgyVIplzLTlFwumPgBLYrUIAJAcmvHPGfHfWQguCjfTYzxYfbohaLFAPwxFRrNuCdCzLlEbuhyYjCmuDBTJDMCdLpNRVqEALjnPSaBPsKWRCKNGwEMFpiEWbYZRwaMopjoUuBUvMpvyLfsPKDrfQLiFOQIWPtLIMoijUEUYfhykHrSKbTtrvjwIzHdWZDVwLIpNkloCqpzIsErxxKAFuFEjikWNYChqYqVslXMtoSWzNhbMuxYbzLfJIcPGoUeGPkGyPQNhDyrjgdKekzftFrRPTuyLYqCArkDcWHTrjPQHfoThBNnTQyMwLEWxEnBXLtzJmFVLGEPrdbEwlXpgYfnVnWoNXgPQKKyiXifpvrmJATzQOzYwFhliiYxlbnsEPKbHYUfJLrwYPfSUwTIHiEvBFMrEtVmqJobfcwsiiEudTIiAnrtuywgKLOiMYbEIOAOJdOXqroPjWnQQcTNxFvkIEIsuHLyhSqSphuSmlvknzydQEnebOreeZwOouXYKlObAkaWHhOdTFLoMCHOWrVKeXjcniaxtgCziKEqWOZUWHJQpcDJzYnnduDZrmxgjZroBRwoPBUTJMYipsgJwbTSlvMyXXdAmiEWGMiQxhGvHGPLOKeTxNaLnFVbWpiYIVyqN" +output: 
[Text(text="ZWfsZYcZyhGbkDYJiguJuuhsNyHGFkFhnjkbLJyXIygTHqcXdhsDkEOTSIKYlBiohLIkiXxvyebUyCGvvBcYqFdtcftGmaAanKXEIyYSEKlTfEEbdGhdePVwVImOyKiHSzAEuGyEVRIKPZaNjQsYqpqARIQfvAklFtQyTJVGlLwjJIxYkiqmHBmdOvTyNqJRbMvouoqXRyOhYDwowtkcZGSOcyzVxibQdnzhDYbrgbatUrlOMRvFSzmLWHRihtXnddwYadPgFWUOxAzAgddJVDXHerawdkrRuWaEXfuwQSkQUmLEJUmrgXDVlXCpciaisfuOUjBldElygamkkXbewzLucKRnAEBimIIotXeslRRhnqQjrypnLQvvdCsKFWPVTZaHvzJMFEahDHWcCbyXgxFvknWjhVfiLSDuFhGoFxqSvhjnnRZLmCMhmWeOgSoanDEInKTWHnbpKyUlabLppITDFFxyWKAnUYJQIcmYnrvMmzmtYvsbCYbebgAhMFVVFAKUSvlkLFYluDpbpBaNFWyfXTaOdSBrfiHDTWGBTUCXMqVvRCIMrEjWpQaGsABkioGnveQWqBTDdRQlxQiUipwfyqAocMddXqdvTHhEwjEzMkOSWVPjJvDtClhYwpvRztPmRKCSpGIpXQqrYtTLmShFdpKtOxGtGOZYIdyUGPjdmyvhJTQMtgYJWUUZnecRjBfQXsyWQWikyONySLzLEqRFqcJYdRNFcGwWZtfZasfFWcvdsHRXoqKlKYihRAOJdrPBDdxksXFwKceQVncmFXfUfBsNgjKzoObVExSnRnjegeEhqxXzPmFcuiasViAFeaXrAxXhSfSyCILkKYpjxNeKynUmdcGAbwRwRnlAFbOSCafmzXddiNpLCFTHBELvArdXFpKUGpSHRekhrMedMRNkQzmSyFKjVwiWwCvbNWjgxJRzYeRxHiCCRMXktmKBxbxGZvOpvZIJOwvGIxcBLzsMFlDqAMLtScdsJtrbIUAvKfcdChXGnBzIxGxXMgxJhayrziaCswdpjJJJhkaYnGhHXqZwOzHFdhhUIEtfjERdLaSPRTDDMHpQtonNaIgXUYhjdbnnKppfMBxgNSOOXJAPtFjfAKnrRDrumZBpNhxMstqjTGBViRkDqbTdXYUirsedifGYzZpQkvdNhtFTOPgsYXYCwZHLcSLSfwfpQKtWfZuRUUryHJsbVsAOQcIJdSKKlOvCeEjUQNRPHKXuBJUjPuaAJJxcDMqyaufqfVwUmHLdjeYZzSiiGLHOTCInpVAalbXXTMLugLiwFiyPSuSFiyJUKVrWjbZAHaJtZnQmnvorRrxdPKThqXzNgTjszQiCoMczRnwGYJMERUWGXFyrSbAqsHmLwLlnJOJoXNsjVehQjVOpQOQJAZWwFZBlgyVIplzLTlFwumPgBLYrUIAJAcmvHPGfHfWQguCjfTYzxYfbohaLFAPwxFRrNuCdCzLlEbuhyYjCmuDBTJDMCdLpNRVqEALjnPSaBPsKWRCKNGwEMFpiEWbYZRwaMopjoUuBUvMpvyLfsPKDrfQLiFOQIWPtLIMoijUEUYfhykHrSKbTtrvjwIzHdWZDVwLIpNkloCqpzIsErxxKAFuFEjikWNYChqYqVslXMtoSWzNhbMuxYbzLfJIcPGoUeGPkGyPQNhDyrjgdKekzftFrRPTuyLYqCArkDcWHTrjPQHfoThBNnTQyMwLEWxEnBXLtzJmFVLGEPrdbEwlXpgYfnVnWoNXgPQKKyiXifpvrmJATzQOzYwFhliiYxlbnsEPKbHYUfJLrwYPfSUwTIHiEvBFMrEtVmqJobfcwsiiEudTIiAnrtuywgKLOiMYbEIOAOJdOXqroPjWnQQcTNxFvkIEIsuHLyhSqSphuSmlvknzydQEnebOreeZwOouXYKlObAkaWHhOdTFLoMCHOWrVKeXjcniaxtgCziKEqWOZUWHJQpcDJzYnnduDZrmxgjZroBRwoPBUTJMYipsgJwbTSlvMyXXdAmiEWGMiQxhGvHGPLOKeTxNaLnFVbWpiYIVyqN")] diff --git a/tests/tokenizer/wikilinks.mwtest b/tests/tokenizer/wikilinks.mwtest index 0682ef1..8eb381a 100644 --- a/tests/tokenizer/wikilinks.mwtest +++ b/tests/tokenizer/wikilinks.mwtest @@ -40,17 +40,17 @@ output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar|b --- -name: nested -label: a wikilink nested within the value of another -input: "[[foo|[[bar]]]]" -output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), WikilinkOpen(), Text(text="bar"), WikilinkClose(), WikilinkClose()] +name: newline_text +label: a newline in the middle of the text +input: "[[foo|foo\nbar]]" +output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="foo\nbar"), WikilinkClose()] --- -name: nested_with_text -label: a wikilink nested within the value of another, separated by other data -input: "[[foo|a[[b]]c]]" -output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c"), WikilinkClose()] +name: bracket_text +label: a left bracket in the middle of the text +input: "[[foo|bar[baz]]" +output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar[baz"), WikilinkClose()] --- @@ -96,13 +96,34 @@ output: [Text(text="[[foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), --- -name: invalid_nested_text +name: invalid_nested_padding label: invalid wikilink: trying to nest in the wrong context, with a text param input: "[[foo[[bar]]|baz]]" 
output: [Text(text="[[foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="|baz]]")] --- +name: invalid_nested_text +label: invalid wikilink: a wikilink nested within the value of another +input: "[[foo|[[bar]]" +output: [Text(text="[[foo|"), WikilinkOpen(), Text(text="bar"), WikilinkClose()] + +--- + +name: invalid_nested_text_2 +label: invalid wikilink: a wikilink nested within the value of another, two pairs of closing brackets +input: "[[foo|[[bar]]]]" +output: [Text(text="[[foo|"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="]]")] + +--- + +name: invalid_nested_text_padding +label: invalid wikilink: a wikilink nested within the value of another, separated by other data +input: "[[foo|a[[b]]c]]" +output: [Text(text="[[foo|a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c]]")] + +--- + name: incomplete_open_only label: incomplete wikilinks: just an open input: "[["