@@ -1,4 +1,6 @@ | |||
*.pyc | |||
*.so | |||
*.dll | |||
*.egg | |||
*.egg-info | |||
.DS_Store | |||
@@ -0,0 +1,6 @@ | |||
language: python | |||
python: | |||
- "2.7" | |||
- "3.3" | |||
install: python setup.py build | |||
script: python setup.py test -q |
@@ -0,0 +1,33 @@ | |||
v0.1.1 (19da4d2144) to v0.2: | |||
- The parser now fully supports Python 3 in addition to Python 2.7. | |||
- Added a C tokenizer extension that is significantly faster than its Python | |||
equivalent. It is enabled by default (if available) and can be toggled by | |||
setting `mwparserfromhell.parser.use_c` to a boolean value. | |||
- Added a complete set of unit tests covering parsing and wikicode | |||
manipulation. | |||
- Renamed Wikicode.filter_links() to filter_wikilinks() (applies to ifilter as | |||
well). | |||
- Added filter methods for Arguments, Comments, Headings, and HTMLEntities. | |||
- Added 'before' param to Template.add(); renamed 'force_nonconformity' to | |||
'preserve_spacing'. | |||
- Added 'include_lead' param to Wikicode.get_sections(). | |||
- Removed 'flat' param from Wikicode.get_sections(). | |||
- Removed 'force_no_field' param from Template.remove(). | |||
- Added support for Travis CI. | |||
- Added note about Windows build issue in the README. | |||
- The tokenizer will limit itself to a realistic recursion depth to prevent | |||
errors and unreasonably long parse times. | |||
- Fixed how some nodes' attribute setters handle input. | |||
- Fixed multiple bugs in the tokenizer's handling of invalid markup. | |||
- Fixed bugs in the implementation of SmartList and StringMixIn. | |||
- Fixed some broken example code in the README; other copyedits. | |||
- Other bugfixes and code cleanup. | |||
v0.1 (ba94938fe8) to v0.1.1 (19da4d2144): | |||
- Added support for Comments (<!-- foo -->) and Wikilinks ([[foo]]). | |||
- Added corresponding ifilter_links() and filter_links() methods to Wikicode. | |||
- Fixed a bug when parsing incomplete templates. | |||
- Fixed strip_code() to affect the contents of headings. | |||
- Various copyedits in documentation and comments. |
@@ -1,4 +1,4 @@ | |||
Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
Permission is hereby granted, free of charge, to any person obtaining a copy | |||
of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,10 @@ | |||
mwparserfromhell | |||
================ | |||
.. image:: https://travis-ci.org/earwig/mwparserfromhell.png?branch=develop | |||
:alt: Build Status | |||
:target: http://travis-ci.org/earwig/mwparserfromhell | |||
**mwparserfromhell** (the *MediaWiki Parser from Hell*) is a Python package | |||
that provides an easy-to-use and outrageously powerful parser for MediaWiki_ | |||
wikicode. It supports Python 2 and Python 3. | |||
@@ -18,7 +22,13 @@ so you can install the latest release with ``pip install mwparserfromhell`` | |||
cd mwparserfromhell | |||
python setup.py install | |||
You can run the comprehensive unit testing suite with ``python setup.py test``. | |||
If you get ``error: Unable to find vcvarsall.bat`` while installing, this is | |||
because Windows can't find the compiler for C extensions. Consult this | |||
`StackOverflow question`_ for help. You can also set ``ext_modules`` in | |||
``setup.py`` to an empty list to prevent the extension from building. | |||
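As a rough sketch of that workaround (the real ``setup.py`` may pass other
arguments; the extension name and source path here are assumptions)::

    # setup.py -- hypothetical excerpt, not the project's actual file
    from setuptools import setup, Extension

    # tokenizer = Extension("mwparserfromhell.parser._tokenizer",
    #                       sources=["mwparserfromhell/parser/tokenizer.c"])

    setup(
        name="mwparserfromhell",
        packages=["mwparserfromhell"],
        ext_modules=[],  # empty list: skip building the C tokenizer entirely
    )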
You can run the comprehensive unit testing suite with | |||
``python setup.py test -q``. | |||
Usage | |||
----- | |||
@@ -106,12 +116,12 @@ Integration | |||
``Page`` objects have a ``parse`` method that essentially calls | |||
``mwparserfromhell.parse()`` on ``page.get()``. | |||
If you're using PyWikipedia_, your code might look like this:: | |||
If you're using Pywikipedia_, your code might look like this:: | |||
import mwparserfromhell | |||
import wikipedia as pywikibot | |||
def parse(title): | |||
site = pywikibot.get_site() | |||
site = pywikibot.getSite() | |||
page = pywikibot.Page(site, title) | |||
text = page.get() | |||
return mwparserfromhell.parse(text) | |||
@@ -124,16 +134,19 @@ following code (via the API_):: | |||
import mwparserfromhell | |||
API_URL = "http://en.wikipedia.org/w/api.php" | |||
def parse(title): | |||
raw = urllib.urlopen(API_URL, data).read() | |||
data = {"action": "query", "prop": "revisions", "rvlimit": 1, | |||
"rvprop": "content", "format": "json", "titles": title} | |||
raw = urllib.urlopen(API_URL, urllib.urlencode(data)).read() | |||
res = json.loads(raw) | |||
text = res["query"]["pages"].values()[0]["revisions"][0]["*"] | |||
return mwparserfromhell.parse(text) | |||
.. _MediaWiki: http://mediawiki.org | |||
.. _Earwig: http://en.wikipedia.org/wiki/User:The_Earwig | |||
.. _Σ: http://en.wikipedia.org/wiki/User:%CE%A3 | |||
.. _Python Package Index: http://pypi.python.org | |||
.. _get pip: http://pypi.python.org/pypi/pip | |||
.. _EarwigBot: https://github.com/earwig/earwigbot | |||
.. _PyWikipedia: http://pywikipediabot.sourceforge.net/ | |||
.. _API: http://mediawiki.org/wiki/API | |||
.. _MediaWiki: http://mediawiki.org | |||
.. _Earwig: http://en.wikipedia.org/wiki/User:The_Earwig | |||
.. _Σ: http://en.wikipedia.org/wiki/User:%CE%A3 | |||
.. _Python Package Index: http://pypi.python.org | |||
.. _StackOverflow question: http://stackoverflow.com/questions/2817869/error-unable-to-find-vcvarsall-bat | |||
.. _get pip: http://pypi.python.org/pypi/pip | |||
.. _EarwigBot: https://github.com/earwig/earwigbot | |||
.. _Pywikipedia: https://www.mediawiki.org/wiki/Manual:Pywikipediabot | |||
.. _API: http://mediawiki.org/wiki/API |
@@ -0,0 +1,58 @@ | |||
Changelog | |||
========= | |||
v0.2 | |||
---- | |||
19da4d2144_ to master_ (released June 20, 2013) | |||
- The parser now fully supports Python 3 in addition to Python 2.7. | |||
- Added a C tokenizer extension that is significantly faster than its Python | |||
equivalent. It is enabled by default (if available) and can be toggled by | |||
setting :py:attr:`mwparserfromhell.parser.use_c` to a boolean value. | |||
- Added a complete set of unit tests covering parsing and wikicode | |||
manipulation. | |||
- Renamed :py:meth:`.filter_links` to :py:meth:`.filter_wikilinks` (applies to | |||
:py:meth:`.ifilter` as well). | |||
- Added filter methods for :py:class:`Arguments <.Argument>`, | |||
:py:class:`Comments <.Comment>`, :py:class:`Headings <.Heading>`, and | |||
:py:class:`HTMLEntities <.HTMLEntity>`. | |||
- Added *before* param to :py:meth:`Template.add() <.Template.add>`; renamed | |||
*force_nonconformity* to *preserve_spacing*. | |||
- Added *include_lead* param to :py:meth:`Wikicode.get_sections() | |||
<.get_sections>`. | |||
- Removed *flat* param from :py:meth:`.get_sections`. | |||
- Removed *force_no_field* param from :py:meth:`Template.remove() | |||
<.Template.remove>`. | |||
- Added support for Travis CI. | |||
- Added note about Windows build issue in the README. | |||
- The tokenizer will limit itself to a realistic recursion depth to prevent | |||
errors and unreasonably long parse times. | |||
- Fixed how some nodes' attribute setters handle input. | |||
- Fixed multiple bugs in the tokenizer's handling of invalid markup. | |||
- Fixed bugs in the implementation of :py:class:`.SmartList` and | |||
:py:class:`.StringMixIn`. | |||
- Fixed some broken example code in the README; other copyedits. | |||
- Other bugfixes and code cleanup. | |||
v0.1.1 | |||
------ | |||
ba94938fe8_ to 19da4d2144_ (released September 21, 2012) | |||
- Added support for :py:class:`Comments <.Comment>` (``<!-- foo -->``) and | |||
:py:class:`Wikilinks <.Wikilink>` (``[[foo]]``). | |||
- Added corresponding :py:meth:`.ifilter_links` and :py:meth:`.filter_links` | |||
methods to :py:class:`.Wikicode`. | |||
- Fixed a bug when parsing incomplete templates. | |||
- Fixed :py:meth:`.strip_code` to affect the contents of headings. | |||
- Various copyedits in documentation and comments. | |||
v0.1 | |||
---- | |||
ba94938fe8_ (released August 23, 2012) | |||
.. _master: https://github.com/earwig/mwparserfromhell/tree/v0.2 | |||
.. _19da4d2144: https://github.com/earwig/mwparserfromhell/tree/v0.1.1 | |||
.. _ba94938fe8: https://github.com/earwig/mwparserfromhell/tree/v0.1 |
@@ -17,6 +17,7 @@ import sys, os | |||
# add these directories to sys.path here. If the directory is relative to the | |||
# documentation root, use os.path.abspath to make it absolute, like shown here. | |||
sys.path.insert(0, os.path.abspath('..')) | |||
import mwparserfromhell | |||
# -- General configuration ----------------------------------------------------- | |||
@@ -41,16 +42,16 @@ master_doc = 'index' | |||
# General information about the project. | |||
project = u'mwparserfromhell' | |||
copyright = u'2012 Ben Kurtovic' | |||
copyright = u'2012, 2013 Ben Kurtovic' | |||
# The version info for the project you're documenting, acts as replacement for | |||
# |version| and |release|, also used in various other places throughout the | |||
# built documents. | |||
# | |||
# The short X.Y version. | |||
version = '0.1' | |||
version = ".".join(mwparserfromhell.__version__.split(".", 2)[:2]) | |||
# The full version, including alpha/beta/rc tags. | |||
release = '0.1.1' | |||
release = mwparserfromhell.__version__ | |||
# The language for content autogenerated by Sphinx. Refer to documentation | |||
# for a list of supported languages. | |||
@@ -1,4 +1,4 @@ | |||
MWParserFromHell v0.1 Documentation | |||
MWParserFromHell v0.2 Documentation | |||
=================================== | |||
:py:mod:`mwparserfromhell` (the *MediaWiki Parser from Hell*) is a Python | |||
@@ -22,10 +22,16 @@ so you can install the latest release with ``pip install mwparserfromhell`` | |||
cd mwparserfromhell | |||
python setup.py install | |||
If you get ``error: Unable to find vcvarsall.bat`` while installing, this is | |||
because Windows can't find the compiler for C extensions. Consult this | |||
`StackOverflow question`_ for help. You can also set ``ext_modules`` in | |||
``setup.py`` to an empty list to prevent the extension from building. | |||
You can run the comprehensive unit testing suite with ``python setup.py test``. | |||
.. _Python Package Index: http://pypi.python.org | |||
.. _get pip: http://pypi.python.org/pypi/pip | |||
.. _Python Package Index: http://pypi.python.org | |||
.. _get pip: http://pypi.python.org/pypi/pip | |||
.. _StackOverflow question: http://stackoverflow.com/questions/2817869/error-unable-to-find-vcvarsall-bat | |||
Contents | |||
-------- | |||
@@ -35,6 +41,7 @@ Contents | |||
usage | |||
integration | |||
changelog | |||
API Reference <api/modules> | |||
@@ -7,12 +7,12 @@ Integration | |||
:py:func:`mwparserfromhell.parse() <mwparserfromhell.__init__.parse>` on | |||
:py:meth:`~earwigbot.wiki.page.Page.get`. | |||
If you're using PyWikipedia_, your code might look like this:: | |||
If you're using Pywikipedia_, your code might look like this:: | |||
import mwparserfromhell | |||
import wikipedia as pywikibot | |||
def parse(title): | |||
site = pywikibot.get_site() | |||
site = pywikibot.getSite() | |||
page = pywikibot.Page(site, title) | |||
text = page.get() | |||
return mwparserfromhell.parse(text) | |||
@@ -31,5 +31,5 @@ following code (via the API_):: | |||
return mwparserfromhell.parse(text) | |||
.. _EarwigBot: https://github.com/earwig/earwigbot | |||
.. _PyWikipedia: http://pywikipediabot.sourceforge.net/ | |||
.. _Pywikipedia: https://www.mediawiki.org/wiki/Manual:Pywikipediabot | |||
.. _API: http://mediawiki.org/wiki/API |
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -29,12 +29,11 @@ outrageously powerful parser for `MediaWiki <http://mediawiki.org>`_ wikicode. | |||
from __future__ import unicode_literals | |||
__author__ = "Ben Kurtovic" | |||
__copyright__ = "Copyright (C) 2012 Ben Kurtovic" | |||
__copyright__ = "Copyright (C) 2012, 2013 Ben Kurtovic" | |||
__license__ = "MIT License" | |||
__version__ = "0.1.1" | |||
__version__ = "0.2" | |||
__email__ = "ben.kurtovic@verizon.net" | |||
from . import nodes, parser, smart_list, string_mixin, wikicode | |||
from . import compat, nodes, parser, smart_list, string_mixin, utils, wikicode | |||
parse = lambda text: parser.Parser(text).parse() | |||
parse.__doc__ = "Short for :py:meth:`.Parser.parse`." | |||
parse = utils.parse_anything |
@@ -1,29 +1,29 @@ | |||
# -*- coding: utf-8 -*- | |||
""" | |||
Implements support for both Python 2 and Python 3 by defining common types in | |||
terms of their Python 2/3 variants. For example, :py:class:`str` is set to | |||
:py:class:`unicode` on Python 2 but :py:class:`str` on Python 3; likewise, | |||
:py:class:`bytes` is :py:class:`str` on 2 but :py:class:`bytes` on 3. These | |||
types are meant to be imported directly from within the parser's modules. | |||
""" | |||
import sys | |||
py3k = sys.version_info.major == 3 | |||
if py3k: | |||
bytes = bytes | |||
str = str | |||
basestring = str | |||
maxsize = sys.maxsize | |||
import html.entities as htmlentities | |||
else: | |||
bytes = str | |||
str = unicode | |||
basestring = basestring | |||
maxsize = sys.maxint | |||
import htmlentitydefs as htmlentities | |||
del sys | |||
# -*- coding: utf-8 -*- | |||
""" | |||
Implements support for both Python 2 and Python 3 by defining common types in | |||
terms of their Python 2/3 variants. For example, :py:class:`str` is set to | |||
:py:class:`unicode` on Python 2 but :py:class:`str` on Python 3; likewise, | |||
:py:class:`bytes` is :py:class:`str` on 2 but :py:class:`bytes` on 3. These | |||
types are meant to be imported directly from within the parser's modules. | |||
""" | |||
import sys | |||
py3k = sys.version_info[0] == 3 | |||
if py3k: | |||
bytes = bytes | |||
str = str | |||
basestring = str | |||
maxsize = sys.maxsize | |||
import html.entities as htmlentities | |||
else: | |||
bytes = str | |||
str = unicode | |||
basestring = basestring | |||
maxsize = sys.maxint | |||
import htmlentitydefs as htmlentities | |||
del sys |
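Modules inside the package are then expected to pull these names straight from
``compat``; a minimal sketch of typical usage (illustrative only)::

    from mwparserfromhell.compat import bytes, py3k, str

    text = str("{{foo}}")      # unicode on Python 2, str on Python 3
    raw = text.encode("utf8")  # always this interpreter's bytes type
    assert isinstance(raw, bytes)
    if not py3k:
        assert isinstance(text, unicode)  # unicode only exists on Python 2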
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -30,6 +30,7 @@ __all__ = ["Argument"] | |||
class Argument(Node): | |||
"""Represents a template argument substitution, like ``{{{foo}}}``.""" | |||
def __init__(self, name, default=None): | |||
super(Argument, self).__init__() | |||
self._name = name | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -29,6 +29,7 @@ __all__ = ["Comment"] | |||
class Comment(Node): | |||
"""Represents a hidden HTML comment, like ``<!-- foobar -->``.""" | |||
def __init__(self, contents): | |||
super(Comment, self).__init__() | |||
self._contents = contents | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -23,7 +23,7 @@ | |||
from __future__ import unicode_literals | |||
from . import Node | |||
from ..compat import htmlentities, str | |||
from ..compat import htmlentities, py3k, str | |||
__all__ = ["HTMLEntity"] | |||
@@ -63,28 +63,31 @@ class HTMLEntity(Node): | |||
return self.normalize() | |||
return self | |||
def _unichr(self, value): | |||
"""Implement the builtin unichr() with support for non-BMP code points. | |||
if not py3k: | |||
@staticmethod | |||
def _unichr(value): | |||
"""Implement builtin unichr() with support for non-BMP code points. | |||
On wide Python builds, this functions like the normal unichr(). On | |||
narrow builds, this returns the value's corresponding surrogate pair. | |||
""" | |||
try: | |||
return unichr(value) | |||
except ValueError: | |||
# Test whether we're on the wide or narrow Python build. Check the | |||
# length of a non-BMP code point (U+1F64A, SPEAK-NO-EVIL MONKEY): | |||
if len("\U0001F64A") == 2: | |||
# Ensure this is within the range we can encode: | |||
if value > 0x10FFFF: | |||
raise ValueError("unichr() arg not in range(0x110000)") | |||
code = value - 0x10000 | |||
if value < 0: # Invalid code point | |||
raise | |||
lead = 0xD800 + (code >> 10) | |||
trail = 0xDC00 + (code % (1 << 10)) | |||
return unichr(lead) + unichr(trail) | |||
raise | |||
On wide Python builds, this functions like the normal unichr(). On | |||
narrow builds, this returns the value's encoded surrogate pair. | |||
""" | |||
try: | |||
return unichr(value) | |||
except ValueError: | |||
# Test whether we're on the wide or narrow Python build. Check | |||
# the length of a non-BMP code point | |||
# (U+1F64A, SPEAK-NO-EVIL MONKEY): | |||
if len("\U0001F64A") == 2: | |||
# Ensure this is within the range we can encode: | |||
if value > 0x10FFFF: | |||
raise ValueError("unichr() arg not in range(0x110000)") | |||
code = value - 0x10000 | |||
if value < 0: # Invalid code point | |||
raise | |||
lead = 0xD800 + (code >> 10) | |||
trail = 0xDC00 + (code % (1 << 10)) | |||
return unichr(lead) + unichr(trail) | |||
raise | |||
@property | |||
def value(self): | |||
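The surrogate-pair arithmetic above can be checked by hand for the code point
mentioned in the comment; a standalone sketch, independent of the class::

    value = 0x1F64A                      # SPEAK-NO-EVIL MONKEY, outside the BMP
    code = value - 0x10000               # 0xF64A
    lead = 0xD800 + (code >> 10)         # 0xD83D (high surrogate)
    trail = 0xDC00 + (code % (1 << 10))  # 0xDE4A (low surrogate)
    # On a narrow build, u"\U0001F64A" == u"\ud83d\ude4a" and has length 2.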
@@ -119,28 +122,60 @@ class HTMLEntity(Node): | |||
@value.setter | |||
def value(self, newval): | |||
newval = str(newval) | |||
if newval not in htmlentities.entitydefs: | |||
test = int(self.value, 16) | |||
if test < 0 or (test > 0x10FFFF and int(self.value) > 0x10FFFF): | |||
raise ValueError(newval) | |||
try: | |||
int(newval) | |||
except ValueError: | |||
try: | |||
int(newval, 16) | |||
except ValueError: | |||
if newval not in htmlentities.entitydefs: | |||
raise ValueError("entity value is not a valid name") | |||
self._named = True | |||
self._hexadecimal = False | |||
else: | |||
if int(newval, 16) < 0 or int(newval, 16) > 0x10FFFF: | |||
raise ValueError("entity value is not in range(0x110000)") | |||
self._named = False | |||
self._hexadecimal = True | |||
else: | |||
test = int(newval, 16 if self.hexadecimal else 10) | |||
if test < 0 or test > 0x10FFFF: | |||
raise ValueError("entity value is not in range(0x110000)") | |||
self._named = False | |||
self._value = newval | |||
@named.setter | |||
def named(self, newval): | |||
self._named = bool(newval) | |||
newval = bool(newval) | |||
if newval and self.value not in htmlentities.entitydefs: | |||
raise ValueError("entity value is not a valid name") | |||
if not newval: | |||
try: | |||
int(self.value, 16) | |||
except ValueError: | |||
err = "current entity value is not a valid Unicode codepoint" | |||
raise ValueError(err) | |||
self._named = newval | |||
@hexadecimal.setter | |||
def hexadecimal(self, newval): | |||
self._hexadecimal = bool(newval) | |||
newval = bool(newval) | |||
if newval and self.named: | |||
raise ValueError("a named entity cannot be hexadecimal") | |||
self._hexadecimal = newval | |||
@hex_char.setter | |||
def hex_char(self, newval): | |||
self._hex_char = bool(newval) | |||
newval = str(newval) | |||
if newval not in ("x", "X"): | |||
raise ValueError(newval) | |||
self._hex_char = newval | |||
def normalize(self): | |||
"""Return the unicode character represented by the HTML entity.""" | |||
chrfunc = chr if py3k else HTMLEntity._unichr | |||
if self.named: | |||
return unichr(htmlentities.name2codepoint[self.value]) | |||
return chrfunc(htmlentities.name2codepoint[self.value]) | |||
if self.hexadecimal: | |||
return self._unichr(int(self.value, 16)) | |||
return self._unichr(int(self.value)) | |||
return chrfunc(int(self.value, 16)) | |||
return chrfunc(int(self.value)) |
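The stricter setters can be exercised directly; a short sketch of the intended
behaviour (hypothetical values, with the constructor keywords assumed from the
class's attributes; not taken from the test suite)::

    from mwparserfromhell.nodes import HTMLEntity

    entity = HTMLEntity(value="nbsp", named=True, hexadecimal=False)
    entity.normalize()       # u"\xa0", looked up via name2codepoint

    entity.value = "1f64a"   # only parses as hex: named=False, hexadecimal=True
    try:
        entity.named = True  # rejected: "1f64a" is not a recognised entity name
    except ValueError:
        pass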
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -81,7 +81,7 @@ class Template(Node): | |||
in parameter names or values so they are not mistaken for new | |||
parameters. | |||
""" | |||
replacement = HTMLEntity(value=ord(char)) | |||
replacement = str(HTMLEntity(value=ord(char))) | |||
for node in code.filter_text(recursive=False): | |||
if char in node: | |||
code.replace(node, node.replace(char, replacement)) | |||
@@ -107,7 +107,7 @@ class Template(Node): | |||
values = tuple(theories.values()) | |||
best = max(values) | |||
confidence = float(best) / sum(values) | |||
if confidence > 0.75: | |||
if confidence >= 0.75: | |||
return tuple(theories.keys())[values.index(best)] | |||
def _get_spacing_conventions(self, use_names): | |||
@@ -142,9 +142,9 @@ class Template(Node): | |||
return False | |||
return True | |||
def _remove_without_field(self, param, i, force_no_field): | |||
def _remove_without_field(self, param, i): | |||
"""Return False if a parameter name should be kept, otherwise True.""" | |||
if not param.showkey and not force_no_field: | |||
if not param.showkey: | |||
dependents = [not after.showkey for after in self.params[i+1:]] | |||
if any(dependents): | |||
return False | |||
@@ -183,11 +183,10 @@ class Template(Node): | |||
def get(self, name): | |||
"""Get the parameter whose name is *name*. | |||
The returned object is a | |||
:py:class:`~.Parameter` instance. Raises :py:exc:`ValueError` if no | |||
parameter has this name. Since multiple parameters can have the same | |||
name, we'll return the last match, since the last parameter is the only | |||
one read by the MediaWiki parser. | |||
The returned object is a :py:class:`~.Parameter` instance. Raises | |||
:py:exc:`ValueError` if no parameter has this name. Since multiple | |||
parameters can have the same name, we'll return the last match, since | |||
the last parameter is the only one read by the MediaWiki parser. | |||
""" | |||
name = name.strip() if isinstance(name, basestring) else str(name) | |||
for param in reversed(self.params): | |||
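Since the last duplicate wins, the behaviour sketches out as follows
(hypothetical wikitext; expected results in comments)::

    import mwparserfromhell

    code = mwparserfromhell.parse("{{foo|bar=1|bar=2}}")
    template = code.filter_templates()[0]
    template.get("bar").value   # the value of the *last* "bar", i.e. "2"
    template.get("baz")         # no such parameter: raises ValueError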
@@ -195,20 +194,34 @@ class Template(Node): | |||
return param | |||
raise ValueError(name) | |||
def add(self, name, value, showkey=None, force_nonconformity=False): | |||
def add(self, name, value, showkey=None, before=None, | |||
preserve_spacing=True): | |||
"""Add a parameter to the template with a given *name* and *value*. | |||
*name* and *value* can be anything parsable by
:py:func:`.utils.parse_anything`; pipes (and equal signs, if | |||
appropriate) are automatically escaped from *value* where applicable. | |||
:py:func:`.utils.parse_anything`; pipes and equal signs are | |||
automatically escaped from *value* when appropriate. | |||
If *showkey* is given, this will determine whether or not to show the | |||
parameter's name (e.g., ``{{foo|bar}}``'s parameter has a name of | |||
``"1"`` but it is hidden); otherwise, we'll make a safe and intelligent | |||
guess. If *name* is already a parameter, we'll replace its value while | |||
keeping the same spacing rules unless *force_nonconformity* is | |||
``True``. We will also try to guess the dominant spacing convention | |||
when adding a new parameter using :py:meth:`_get_spacing_conventions` | |||
unless *force_nonconformity* is ``True``. | |||
guess. | |||
If *name* is already a parameter in the template, we'll replace its | |||
value while keeping the same whitespace around it. We will also try to | |||
guess the dominant spacing convention when adding a new parameter using | |||
:py:meth:`_get_spacing_conventions`. | |||
If *before* is given (either a :py:class:`~.Parameter` object or a | |||
name), then we will place the parameter immediately before this one. | |||
Otherwise, it will be added at the end. If *before* is a name and | |||
exists multiple times in the template, we will place it before the last | |||
occurrence. If *before* is not in the template, :py:exc:`ValueError` is
raised. The argument is ignored if the new parameter already exists. | |||
If *preserve_spacing* is ``False``, we will avoid preserving spacing | |||
conventions when changing the value of an existing parameter or when | |||
adding a new one. | |||
""" | |||
name, value = parse_anything(name), parse_anything(value) | |||
self._surface_escape(value, "|") | |||
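A quick sketch of the new keyword arguments described above (expected output
shown in comments; not taken from the test suite)::

    import mwparserfromhell

    code = mwparserfromhell.parse("{{foo|bar=1|baz=2}}")
    template = code.filter_templates()[0]
    template.add("qux", "3", before="baz")
    str(template)                # "{{foo|bar=1|qux=3|baz=2}}"
    template.add("bar", "7")     # replaces the existing value, keeping spacing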
@@ -217,14 +230,17 @@ class Template(Node): | |||
self.remove(name, keep_field=True) | |||
existing = self.get(name) | |||
if showkey is not None: | |||
if not showkey: | |||
self._surface_escape(value, "=") | |||
existing.showkey = showkey | |||
if not existing.showkey: | |||
self._surface_escape(value, "=") | |||
nodes = existing.value.nodes | |||
if force_nonconformity: | |||
existing.value = value | |||
else: | |||
if preserve_spacing: | |||
for i in range(2): # Ignore empty text nodes | |||
if not nodes[i]: | |||
nodes[i] = None | |||
existing.value = parse_anything([nodes[0], value, nodes[1]]) | |||
else: | |||
existing.value = value | |||
return existing | |||
if showkey is None: | |||
@@ -246,43 +262,38 @@ class Template(Node): | |||
if not showkey: | |||
self._surface_escape(value, "=") | |||
if not force_nonconformity: | |||
if preserve_spacing: | |||
before_n, after_n = self._get_spacing_conventions(use_names=True) | |||
if before_n and after_n: | |||
name = parse_anything([before_n, value, after_n]) | |||
elif before_n: | |||
name = parse_anything([before_n, value]) | |||
elif after_n: | |||
name = parse_anything([value, after_n]) | |||
before_v, after_v = self._get_spacing_conventions(use_names=False) | |||
if before_v and after_v: | |||
value = parse_anything([before_v, value, after_v]) | |||
elif before_v: | |||
value = parse_anything([before_v, value]) | |||
elif after_v: | |||
value = parse_anything([value, after_v]) | |||
name = parse_anything([before_n, name, after_n]) | |||
value = parse_anything([before_v, value, after_v]) | |||
param = Parameter(name, value, showkey) | |||
self.params.append(param) | |||
if before: | |||
if not isinstance(before, Parameter): | |||
before = self.get(before) | |||
self.params.insert(self.params.index(before), param) | |||
else: | |||
self.params.append(param) | |||
return param | |||
def remove(self, name, keep_field=False, force_no_field=False): | |||
def remove(self, name, keep_field=False): | |||
"""Remove a parameter from the template whose name is *name*. | |||
If *keep_field* is ``True``, we will keep the parameter's name, but | |||
blank its value. Otherwise, we will remove the parameter completely | |||
*unless* other parameters are dependent on it (e.g. removing ``bar`` | |||
from ``{{foo|bar|baz}}`` is unsafe because ``{{foo|baz}}`` is not what | |||
we expected, so ``{{foo||baz}}`` will be produced instead), unless | |||
*force_no_field* is also ``True``. If the parameter shows up multiple | |||
times in the template, we will remove all instances of it (and keep | |||
one if *keep_field* is ``True`` - that being the first instance if | |||
none of the instances have dependents, otherwise that instance will be | |||
kept). | |||
we expected, so ``{{foo||baz}}`` will be produced instead). | |||
If the parameter shows up multiple times in the template, we will | |||
remove all instances of it (and keep one if *keep_field* is ``True`` - | |||
the first instance if none have dependents, otherwise the one with | |||
dependents will be kept). | |||
""" | |||
name = name.strip() if isinstance(name, basestring) else str(name) | |||
removed = False | |||
to_remove = [] | |||
for i, param in enumerate(self.params): | |||
if param.name.strip() == name: | |||
if keep_field: | |||
@@ -290,13 +301,15 @@ class Template(Node): | |||
self._blank_param_value(param.value) | |||
keep_field = False | |||
else: | |||
self.params.remove(param) | |||
to_remove.append(param) | |||
else: | |||
if self._remove_without_field(param, i, force_no_field): | |||
self.params.remove(param) | |||
if self._remove_without_field(param, i): | |||
to_remove.append(param) | |||
else: | |||
self._blank_param_value(param.value) | |||
if not removed: | |||
removed = True | |||
if not removed: | |||
raise ValueError(name) | |||
for param in to_remove: | |||
self.params.remove(param) |
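The dependent-parameter rule above can be illustrated with the docstring's own
example (expected output shown in comments)::

    import mwparserfromhell

    code = mwparserfromhell.parse("{{foo|bar|baz}}")
    template = code.filter_templates()[0]
    template.remove("1")   # "bar" is parameter "1"; "baz" ("2") depends on it
    str(template)          # "{{foo||baz}}" -- the field is blanked, not dropped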
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -29,6 +29,7 @@ __all__ = ["Text"] | |||
class Text(Node): | |||
"""Represents ordinary, unformatted text with no special properties.""" | |||
def __init__(self, value): | |||
super(Text, self).__init__() | |||
self._value = value | |||
@@ -39,6 +40,9 @@ class Text(Node): | |||
def __strip__(self, normalize, collapse): | |||
return self | |||
def __showtree__(self, write, get, mark): | |||
write(str(self).encode("unicode_escape").decode("utf8")) | |||
@property | |||
def value(self): | |||
"""The actual text itself.""" | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -30,6 +30,7 @@ __all__ = ["Wikilink"] | |||
class Wikilink(Node): | |||
"""Represents an internal wikilink, like ``[[Foo|Bar]]``.""" | |||
def __init__(self, title, text=None): | |||
super(Wikilink, self).__init__() | |||
self._title = title | |||
@@ -78,4 +79,7 @@ class Wikilink(Node): | |||
@text.setter | |||
def text(self, value): | |||
self._text = parse_anything(value) | |||
if value is None: | |||
self._text = None | |||
else: | |||
self._text = parse_anything(value) |
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -26,14 +26,16 @@ modules: the :py:mod:`~.tokenizer` and the :py:mod:`~.builder`. This module | |||
joins them together under one interface. | |||
""" | |||
from .builder import Builder | |||
from .tokenizer import Tokenizer | |||
try: | |||
from ._builder import CBuilder as Builder | |||
from ._tokenizer import CTokenizer as Tokenizer | |||
from ._tokenizer import CTokenizer | |||
use_c = True | |||
except ImportError: | |||
from .builder import Builder | |||
from .tokenizer import Tokenizer | |||
CTokenizer = None | |||
use_c = False | |||
__all__ = ["Parser"] | |||
__all__ = ["use_c", "Parser"] | |||
class Parser(object): | |||
"""Represents a parser for wikicode. | |||
@@ -46,7 +48,10 @@ class Parser(object): | |||
def __init__(self, text): | |||
self.text = text | |||
self._tokenizer = Tokenizer() | |||
if use_c and CTokenizer: | |||
self._tokenizer = CTokenizer() | |||
else: | |||
self._tokenizer = Tokenizer() | |||
self._builder = Builder() | |||
def parse(self): | |||
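Given the fallback import above, ``use_c`` can also be flipped at runtime to
force the pure-Python tokenizer even when the extension built; a small sketch::

    import mwparserfromhell
    from mwparserfromhell import parser

    print(parser.use_c)    # True only if the C tokenizer imported cleanly

    parser.use_c = False   # later Parser objects use the Python tokenizer
    code = mwparserfromhell.parse("{{foo|bar}}")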
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -62,6 +62,15 @@ Local (stack-specific) contexts: | |||
* :py:const:`COMMENT` | |||
* :py:const:`SAFETY_CHECK` | |||
* :py:const:`HAS_TEXT` | |||
* :py:const:`FAIL_ON_TEXT` | |||
* :py:const:`FAIL_NEXT` | |||
* :py:const:`FAIL_ON_LBRACE` | |||
* :py:const:`FAIL_ON_RBRACE` | |||
* :py:const:`FAIL_ON_EQUALS` | |||
Global contexts: | |||
* :py:const:`GL_HEADING` | |||
@@ -69,29 +78,36 @@ Global contexts: | |||
# Local contexts: | |||
TEMPLATE = 0b00000000000111 | |||
TEMPLATE_NAME = 0b00000000000001 | |||
TEMPLATE_PARAM_KEY = 0b00000000000010 | |||
TEMPLATE_PARAM_VALUE = 0b00000000000100 | |||
ARGUMENT = 0b00000000011000 | |||
ARGUMENT_NAME = 0b00000000001000 | |||
ARGUMENT_DEFAULT = 0b00000000010000 | |||
WIKILINK = 0b00000001100000 | |||
WIKILINK_TITLE = 0b00000000100000 | |||
WIKILINK_TEXT = 0b00000001000000 | |||
HEADING = 0b01111110000000 | |||
HEADING_LEVEL_1 = 0b00000010000000 | |||
HEADING_LEVEL_2 = 0b00000100000000 | |||
HEADING_LEVEL_3 = 0b00001000000000 | |||
HEADING_LEVEL_4 = 0b00010000000000 | |||
HEADING_LEVEL_5 = 0b00100000000000 | |||
HEADING_LEVEL_6 = 0b01000000000000 | |||
COMMENT = 0b10000000000000 | |||
TEMPLATE = 0b00000000000000000111 | |||
TEMPLATE_NAME = 0b00000000000000000001 | |||
TEMPLATE_PARAM_KEY = 0b00000000000000000010 | |||
TEMPLATE_PARAM_VALUE = 0b00000000000000000100 | |||
ARGUMENT = 0b00000000000000011000 | |||
ARGUMENT_NAME = 0b00000000000000001000 | |||
ARGUMENT_DEFAULT = 0b00000000000000010000 | |||
WIKILINK = 0b00000000000001100000 | |||
WIKILINK_TITLE = 0b00000000000000100000 | |||
WIKILINK_TEXT = 0b00000000000001000000 | |||
HEADING = 0b00000001111110000000 | |||
HEADING_LEVEL_1 = 0b00000000000010000000 | |||
HEADING_LEVEL_2 = 0b00000000000100000000 | |||
HEADING_LEVEL_3 = 0b00000000001000000000 | |||
HEADING_LEVEL_4 = 0b00000000010000000000 | |||
HEADING_LEVEL_5 = 0b00000000100000000000 | |||
HEADING_LEVEL_6 = 0b00000001000000000000 | |||
COMMENT = 0b00000010000000000000 | |||
SAFETY_CHECK = 0b11111100000000000000 | |||
HAS_TEXT = 0b00000100000000000000 | |||
FAIL_ON_TEXT = 0b00001000000000000000 | |||
FAIL_NEXT = 0b00010000000000000000 | |||
FAIL_ON_LBRACE = 0b00100000000000000000 | |||
FAIL_ON_RBRACE = 0b01000000000000000000 | |||
FAIL_ON_EQUALS = 0b10000000000000000000 | |||
# Global contexts: | |||
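These values are plain bit fields, so the aggregate constants can be tested
with bitwise AND; a small sketch::

    from mwparserfromhell.parser import contexts

    ctx = contexts.TEMPLATE_PARAM_KEY | contexts.HAS_TEXT
    bool(ctx & contexts.TEMPLATE)       # True: PARAM_KEY is part of TEMPLATE
    bool(ctx & contexts.SAFETY_CHECK)   # True: HAS_TEXT is a safety-check flag
    bool(ctx & contexts.WIKILINK)       # False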
@@ -0,0 +1,285 @@ | |||
/* | |||
Tokenizer Header File for MWParserFromHell | |||
Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
Permission is hereby granted, free of charge, to any person obtaining a copy of | |||
this software and associated documentation files (the "Software"), to deal in | |||
the Software without restriction, including without limitation the rights to | |||
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies | |||
of the Software, and to permit persons to whom the Software is furnished to do | |||
so, subject to the following conditions: | |||
The above copyright notice and this permission notice shall be included in all | |||
copies or substantial portions of the Software. | |||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
SOFTWARE. | |||
*/ | |||
#ifndef PY_SSIZE_T_CLEAN | |||
#define PY_SSIZE_T_CLEAN | |||
#endif | |||
#include <Python.h> | |||
#include <math.h> | |||
#include <structmember.h> | |||
#if PY_MAJOR_VERSION >= 3 | |||
#define IS_PY3K | |||
#endif | |||
#define malloc PyObject_Malloc | |||
#define free PyObject_Free | |||
#define DIGITS "0123456789" | |||
#define HEXDIGITS "0123456789abcdefABCDEF" | |||
#define ALPHANUM "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" | |||
static const char* MARKERS[] = { | |||
"{", "}", "[", "]", "<", ">", "|", "=", "&", "#", "*", ";", ":", "/", "-", | |||
"!", "\n", ""}; | |||
#define NUM_MARKERS 18 | |||
#define TEXTBUFFER_BLOCKSIZE 1024 | |||
#define MAX_DEPTH 40 | |||
#define MAX_CYCLES 100000 | |||
#define MAX_BRACES 255 | |||
#define MAX_ENTITY_SIZE 8 | |||
static int route_state = 0; | |||
#define BAD_ROUTE (route_state) | |||
#define FAIL_ROUTE() (route_state = 1) | |||
#define RESET_ROUTE() (route_state = 0) | |||
static char** entitydefs; | |||
static PyObject* EMPTY; | |||
static PyObject* NOARGS; | |||
static PyObject* tokens; | |||
/* Tokens */ | |||
static PyObject* Text; | |||
static PyObject* TemplateOpen; | |||
static PyObject* TemplateParamSeparator; | |||
static PyObject* TemplateParamEquals; | |||
static PyObject* TemplateClose; | |||
static PyObject* ArgumentOpen; | |||
static PyObject* ArgumentSeparator; | |||
static PyObject* ArgumentClose; | |||
static PyObject* WikilinkOpen; | |||
static PyObject* WikilinkSeparator; | |||
static PyObject* WikilinkClose; | |||
static PyObject* HTMLEntityStart; | |||
static PyObject* HTMLEntityNumeric; | |||
static PyObject* HTMLEntityHex; | |||
static PyObject* HTMLEntityEnd; | |||
static PyObject* HeadingStart; | |||
static PyObject* HeadingEnd; | |||
static PyObject* CommentStart; | |||
static PyObject* CommentEnd; | |||
static PyObject* TagOpenOpen; | |||
static PyObject* TagAttrStart; | |||
static PyObject* TagAttrEquals; | |||
static PyObject* TagAttrQuote; | |||
static PyObject* TagCloseOpen; | |||
static PyObject* TagCloseSelfclose; | |||
static PyObject* TagOpenClose; | |||
static PyObject* TagCloseClose; | |||
/* Local contexts: */ | |||
#define LC_TEMPLATE 0x00007 | |||
#define LC_TEMPLATE_NAME 0x00001 | |||
#define LC_TEMPLATE_PARAM_KEY 0x00002 | |||
#define LC_TEMPLATE_PARAM_VALUE 0x00004 | |||
#define LC_ARGUMENT 0x00018 | |||
#define LC_ARGUMENT_NAME 0x00008 | |||
#define LC_ARGUMENT_DEFAULT 0x00010 | |||
#define LC_WIKILINK 0x00060 | |||
#define LC_WIKILINK_TITLE 0x00020 | |||
#define LC_WIKILINK_TEXT 0x00040 | |||
#define LC_HEADING 0x01F80 | |||
#define LC_HEADING_LEVEL_1 0x00080 | |||
#define LC_HEADING_LEVEL_2 0x00100 | |||
#define LC_HEADING_LEVEL_3 0x00200 | |||
#define LC_HEADING_LEVEL_4 0x00400 | |||
#define LC_HEADING_LEVEL_5 0x00800 | |||
#define LC_HEADING_LEVEL_6 0x01000 | |||
#define LC_COMMENT 0x02000 | |||
#define LC_SAFETY_CHECK 0xFC000 | |||
#define LC_HAS_TEXT 0x04000 | |||
#define LC_FAIL_ON_TEXT 0x08000 | |||
#define LC_FAIL_NEXT 0x10000 | |||
#define LC_FAIL_ON_LBRACE 0x20000 | |||
#define LC_FAIL_ON_RBRACE 0x40000 | |||
#define LC_FAIL_ON_EQUALS 0x80000 | |||
/* Global contexts: */ | |||
#define GL_HEADING 0x1 | |||
/* Miscellaneous structs: */ | |||
struct Textbuffer { | |||
Py_ssize_t size; | |||
Py_UNICODE* data; | |||
struct Textbuffer* next; | |||
}; | |||
struct Stack { | |||
PyObject* stack; | |||
int context; | |||
struct Textbuffer* textbuffer; | |||
struct Stack* next; | |||
}; | |||
typedef struct { | |||
PyObject* title; | |||
int level; | |||
} HeadingData; | |||
/* Tokenizer object definition: */ | |||
typedef struct { | |||
PyObject_HEAD | |||
PyObject* text; /* text to tokenize */ | |||
struct Stack* topstack; /* topmost stack */ | |||
Py_ssize_t head; /* current position in text */ | |||
Py_ssize_t length; /* length of text */ | |||
int global; /* global context */ | |||
int depth; /* stack recursion depth */ | |||
int cycles; /* total number of stack recursions */ | |||
} Tokenizer; | |||
/* Macros for accessing Tokenizer data: */ | |||
#define Tokenizer_READ(self, delta) (*PyUnicode_AS_UNICODE(Tokenizer_read(self, delta))) | |||
#define Tokenizer_CAN_RECURSE(self) (self->depth < MAX_DEPTH && self->cycles < MAX_CYCLES) | |||
/* Function prototypes: */ | |||
static int heading_level_from_context(int); | |||
static PyObject* Tokenizer_new(PyTypeObject*, PyObject*, PyObject*); | |||
static struct Textbuffer* Textbuffer_new(void); | |||
static void Tokenizer_dealloc(Tokenizer*); | |||
static void Textbuffer_dealloc(struct Textbuffer*); | |||
static int Tokenizer_init(Tokenizer*, PyObject*, PyObject*); | |||
static int Tokenizer_push(Tokenizer*, int); | |||
static PyObject* Textbuffer_render(struct Textbuffer*); | |||
static int Tokenizer_push_textbuffer(Tokenizer*); | |||
static void Tokenizer_delete_top_of_stack(Tokenizer*); | |||
static PyObject* Tokenizer_pop(Tokenizer*); | |||
static PyObject* Tokenizer_pop_keeping_context(Tokenizer*); | |||
static void* Tokenizer_fail_route(Tokenizer*); | |||
static int Tokenizer_write(Tokenizer*, PyObject*); | |||
static int Tokenizer_write_first(Tokenizer*, PyObject*); | |||
static int Tokenizer_write_text(Tokenizer*, Py_UNICODE); | |||
static int Tokenizer_write_all(Tokenizer*, PyObject*); | |||
static int Tokenizer_write_text_then_stack(Tokenizer*, const char*); | |||
static PyObject* Tokenizer_read(Tokenizer*, Py_ssize_t); | |||
static PyObject* Tokenizer_read_backwards(Tokenizer*, Py_ssize_t); | |||
static int Tokenizer_parse_template_or_argument(Tokenizer*); | |||
static int Tokenizer_parse_template(Tokenizer*); | |||
static int Tokenizer_parse_argument(Tokenizer*); | |||
static int Tokenizer_handle_template_param(Tokenizer*); | |||
static int Tokenizer_handle_template_param_value(Tokenizer*); | |||
static PyObject* Tokenizer_handle_template_end(Tokenizer*); | |||
static int Tokenizer_handle_argument_separator(Tokenizer*); | |||
static PyObject* Tokenizer_handle_argument_end(Tokenizer*); | |||
static int Tokenizer_parse_wikilink(Tokenizer*); | |||
static int Tokenizer_handle_wikilink_separator(Tokenizer*); | |||
static PyObject* Tokenizer_handle_wikilink_end(Tokenizer*); | |||
static int Tokenizer_parse_heading(Tokenizer*); | |||
static HeadingData* Tokenizer_handle_heading_end(Tokenizer*); | |||
static int Tokenizer_really_parse_entity(Tokenizer*); | |||
static int Tokenizer_parse_entity(Tokenizer*); | |||
static int Tokenizer_parse_comment(Tokenizer*); | |||
static int Tokenizer_verify_safe(Tokenizer*, int, Py_UNICODE); | |||
static PyObject* Tokenizer_parse(Tokenizer*, int); | |||
static PyObject* Tokenizer_tokenize(Tokenizer*, PyObject*); | |||
/* More structs for creating the Tokenizer type: */ | |||
static PyMethodDef | |||
Tokenizer_methods[] = { | |||
{"tokenize", (PyCFunction) Tokenizer_tokenize, METH_VARARGS, | |||
"Build a list of tokens from a string of wikicode and return it."}, | |||
{NULL} | |||
}; | |||
static PyMemberDef | |||
Tokenizer_members[] = { | |||
{NULL} | |||
}; | |||
static PyMethodDef | |||
module_methods[] = { | |||
{NULL} | |||
}; | |||
static PyTypeObject | |||
TokenizerType = { | |||
PyObject_HEAD_INIT(NULL) | |||
0, /* ob_size */ | |||
"_tokenizer.CTokenizer", /* tp_name */ | |||
sizeof(Tokenizer), /* tp_basicsize */ | |||
0, /* tp_itemsize */ | |||
(destructor) Tokenizer_dealloc, /* tp_dealloc */ | |||
0, /* tp_print */ | |||
0, /* tp_getattr */ | |||
0, /* tp_setattr */ | |||
0, /* tp_compare */ | |||
0, /* tp_repr */ | |||
0, /* tp_as_number */ | |||
0, /* tp_as_sequence */ | |||
0, /* tp_as_mapping */ | |||
0, /* tp_hash */ | |||
0, /* tp_call */ | |||
0, /* tp_str */ | |||
0, /* tp_getattro */ | |||
0, /* tp_setattro */ | |||
0, /* tp_as_buffer */ | |||
Py_TPFLAGS_DEFAULT, /* tp_flags */ | |||
"Creates a list of tokens from a string of wikicode.", /* tp_doc */ | |||
0, /* tp_traverse */ | |||
0, /* tp_clear */ | |||
0, /* tp_richcompare */ | |||
0, /* tp_weaklistoffset */ | |||
0, /* tp_iter */ | |||
0, /* tp_iternext */ | |||
Tokenizer_methods, /* tp_methods */ | |||
Tokenizer_members, /* tp_members */ | |||
0, /* tp_getset */ | |||
0, /* tp_base */ | |||
0, /* tp_dict */ | |||
0, /* tp_descr_get */ | |||
0, /* tp_descr_set */ | |||
0, /* tp_dictoffset */ | |||
(initproc) Tokenizer_init, /* tp_init */ | |||
0, /* tp_alloc */ | |||
Tokenizer_new, /* tp_new */ | |||
}; |
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -23,7 +23,6 @@ | |||
from __future__ import unicode_literals | |||
from math import log | |||
import re | |||
import string | |||
from . import contexts | |||
from . import tokens | |||
@@ -38,10 +37,13 @@ class BadRoute(Exception): | |||
class Tokenizer(object): | |||
"""Creates a list of tokens from a string of wikicode.""" | |||
USES_C = False | |||
START = object() | |||
END = object() | |||
MARKERS = ["{", "}", "[", "]", "<", ">", "|", "=", "&", "#", "*", ";", ":", | |||
"/", "-", "!", "\n", END] | |||
MAX_DEPTH = 40 | |||
MAX_CYCLES = 100000 | |||
regex = re.compile(r"([{}\[\]<>|=&#*;:/\-!\n])", flags=re.IGNORECASE) | |||
def __init__(self): | |||
@@ -49,6 +51,8 @@ class Tokenizer(object): | |||
self._head = 0 | |||
self._stacks = [] | |||
self._global = 0 | |||
self._depth = 0 | |||
self._cycles = 0 | |||
@property | |||
def _stack(self): | |||
@@ -76,6 +80,8 @@ class Tokenizer(object): | |||
def _push(self, context=0): | |||
"""Add a new token stack, context, and textbuffer to the list.""" | |||
self._stacks.append([[], context, []]) | |||
self._depth += 1 | |||
self._cycles += 1 | |||
def _push_textbuffer(self): | |||
"""Push the textbuffer onto the stack as a Text node and clear it.""" | |||
@@ -86,10 +92,11 @@ class Tokenizer(object): | |||
def _pop(self, keep_context=False): | |||
"""Pop the current stack/context/textbuffer, returing the stack. | |||
If *keep_context is ``True``, then we will replace the underlying | |||
If *keep_context* is ``True``, then we will replace the underlying | |||
stack's context with the current stack's. | |||
""" | |||
self._push_textbuffer() | |||
self._depth -= 1 | |||
if keep_context: | |||
context = self._context | |||
stack = self._stacks.pop()[0] | |||
@@ -97,6 +104,10 @@ class Tokenizer(object): | |||
return stack | |||
return self._stacks.pop()[0] | |||
def _can_recurse(self): | |||
"""Return whether or not our max recursion depth has been exceeded.""" | |||
return self._depth < self.MAX_DEPTH and self._cycles < self.MAX_CYCLES | |||
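In practice this means that once parsing is 40 levels deep (or has pushed
100,000 stacks in total), further ``{{`` and ``[[`` openings are written out as
literal text instead of being parsed; roughly::

    import mwparserfromhell

    deep = "{{" * 60 + "x" + "}}" * 60
    code = mwparserfromhell.parse(deep)  # completes quickly; braces beyond the
    # depth cap stay as literal text rather than becoming Template nodes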
def _fail_route(self): | |||
"""Fail the current tokenization route. | |||
@@ -162,8 +173,8 @@ class Tokenizer(object): | |||
self._head += 2 | |||
braces = 2 | |||
while self._read() == "{": | |||
braces += 1 | |||
self._head += 1 | |||
braces += 1 | |||
self._push() | |||
while braces: | |||
@@ -197,10 +208,9 @@ class Tokenizer(object): | |||
except BadRoute: | |||
self._head = reset | |||
raise | |||
else: | |||
self._write_first(tokens.TemplateOpen()) | |||
self._write_all(template) | |||
self._write(tokens.TemplateClose()) | |||
self._write_first(tokens.TemplateOpen()) | |||
self._write_all(template) | |||
self._write(tokens.TemplateClose()) | |||
def _parse_argument(self): | |||
"""Parse an argument at the head of the wikicode string.""" | |||
@@ -210,29 +220,13 @@ class Tokenizer(object): | |||
except BadRoute: | |||
self._head = reset | |||
raise | |||
else: | |||
self._write_first(tokens.ArgumentOpen()) | |||
self._write_all(argument) | |||
self._write(tokens.ArgumentClose()) | |||
def _verify_safe(self, unsafes): | |||
"""Verify that there are no unsafe characters in the current stack. | |||
The route will be failed if the name contains any element of *unsafes* | |||
in it (not merely at the beginning or end). This is used when parsing a | |||
template name or parameter key, which cannot contain newlines. | |||
""" | |||
self._push_textbuffer() | |||
if self._stack: | |||
text = [tok for tok in self._stack if isinstance(tok, tokens.Text)] | |||
text = "".join([token.text for token in text]).strip() | |||
if text and any([unsafe in text for unsafe in unsafes]): | |||
self._fail_route() | |||
self._write_first(tokens.ArgumentOpen()) | |||
self._write_all(argument) | |||
self._write(tokens.ArgumentClose()) | |||
def _handle_template_param(self): | |||
"""Handle a template parameter at the head of the string.""" | |||
if self._context & contexts.TEMPLATE_NAME: | |||
self._verify_safe(["\n", "{", "}", "[", "]"]) | |||
self._context ^= contexts.TEMPLATE_NAME | |||
elif self._context & contexts.TEMPLATE_PARAM_VALUE: | |||
self._context ^= contexts.TEMPLATE_PARAM_VALUE | |||
@@ -244,37 +238,26 @@ class Tokenizer(object): | |||
def _handle_template_param_value(self): | |||
"""Handle a template parameter's value at the head of the string.""" | |||
try: | |||
self._verify_safe(["\n", "{{", "}}"]) | |||
except BadRoute: | |||
self._pop() | |||
raise | |||
else: | |||
self._write_all(self._pop(keep_context=True)) | |||
self._write_all(self._pop(keep_context=True)) | |||
self._context ^= contexts.TEMPLATE_PARAM_KEY | |||
self._context |= contexts.TEMPLATE_PARAM_VALUE | |||
self._write(tokens.TemplateParamEquals()) | |||
def _handle_template_end(self): | |||
"""Handle the end of a template at the head of the string.""" | |||
if self._context & contexts.TEMPLATE_NAME: | |||
self._verify_safe(["\n", "{", "}", "[", "]"]) | |||
elif self._context & contexts.TEMPLATE_PARAM_KEY: | |||
if self._context & contexts.TEMPLATE_PARAM_KEY: | |||
self._write_all(self._pop(keep_context=True)) | |||
self._head += 1 | |||
return self._pop() | |||
def _handle_argument_separator(self): | |||
"""Handle the separator between an argument's name and default.""" | |||
self._verify_safe(["\n", "{{", "}}"]) | |||
self._context ^= contexts.ARGUMENT_NAME | |||
self._context |= contexts.ARGUMENT_DEFAULT | |||
self._write(tokens.ArgumentSeparator()) | |||
def _handle_argument_end(self): | |||
"""Handle the end of an argument at the head of the string.""" | |||
if self._context & contexts.ARGUMENT_NAME: | |||
self._verify_safe(["\n", "{{", "}}"]) | |||
self._head += 2 | |||
return self._pop() | |||
@@ -294,15 +277,12 @@ class Tokenizer(object): | |||
def _handle_wikilink_separator(self): | |||
"""Handle the separator between a wikilink's title and its text.""" | |||
self._verify_safe(["\n", "{", "}", "[", "]"]) | |||
self._context ^= contexts.WIKILINK_TITLE | |||
self._context |= contexts.WIKILINK_TEXT | |||
self._write(tokens.WikilinkSeparator()) | |||
def _handle_wikilink_end(self): | |||
"""Handle the end of a wikilink at the head of the string.""" | |||
if self._context & contexts.WIKILINK_TITLE: | |||
self._verify_safe(["\n", "{", "}", "[", "]"]) | |||
self._head += 1 | |||
return self._pop() | |||
@@ -342,14 +322,14 @@ class Tokenizer(object): | |||
current = int(log(self._context / contexts.HEADING_LEVEL_1, 2)) + 1 | |||
level = min(current, min(best, 6)) | |||
try: | |||
try: # Try to check for a heading closure after this one | |||
after, after_level = self._parse(self._context) | |||
except BadRoute: | |||
if level < best: | |||
self._write_text("=" * (best - level)) | |||
self._head = reset + best - 1 | |||
return self._pop(), level | |||
else: | |||
else: # Found another closure | |||
self._write_text("=" * best) | |||
self._write_all(after) | |||
return self._pop(), after_level | |||
@@ -376,9 +356,9 @@ class Tokenizer(object): | |||
else: | |||
numeric = hexadecimal = False | |||
valid = string.hexdigits if hexadecimal else string.digits | |||
valid = "0123456789abcdefABCDEF" if hexadecimal else "0123456789" | |||
if not numeric and not hexadecimal: | |||
valid += string.ascii_letters | |||
valid += "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" | |||
if not all([char in valid for char in this]): | |||
self._fail_route() | |||
@@ -423,18 +403,83 @@ class Tokenizer(object): | |||
self._write(tokens.CommentEnd()) | |||
self._head += 2 | |||
def _verify_safe(self, this): | |||
"""Make sure we are not trying to write an invalid character.""" | |||
context = self._context | |||
if context & contexts.FAIL_NEXT: | |||
return False | |||
if context & contexts.WIKILINK_TITLE: | |||
if this == "]" or this == "{": | |||
self._context |= contexts.FAIL_NEXT | |||
elif this == "\n" or this == "[" or this == "}": | |||
return False | |||
return True | |||
if context & contexts.TEMPLATE_NAME: | |||
if this == "{" or this == "}" or this == "[": | |||
self._context |= contexts.FAIL_NEXT | |||
return True | |||
if this == "]": | |||
return False | |||
if this == "|": | |||
return True | |||
if context & contexts.HAS_TEXT: | |||
if context & contexts.FAIL_ON_TEXT: | |||
if this is self.END or not this.isspace(): | |||
return False | |||
else: | |||
if this == "\n": | |||
self._context |= contexts.FAIL_ON_TEXT | |||
elif this is self.END or not this.isspace(): | |||
self._context |= contexts.HAS_TEXT | |||
return True | |||
else: | |||
if context & contexts.FAIL_ON_EQUALS: | |||
if this == "=": | |||
return False | |||
elif context & contexts.FAIL_ON_LBRACE: | |||
if this == "{" or (self._read(-1) == self._read(-2) == "{"): | |||
if context & contexts.TEMPLATE: | |||
self._context |= contexts.FAIL_ON_EQUALS | |||
else: | |||
self._context |= contexts.FAIL_NEXT | |||
return True | |||
self._context ^= contexts.FAIL_ON_LBRACE | |||
elif context & contexts.FAIL_ON_RBRACE: | |||
if this == "}": | |||
if context & contexts.TEMPLATE: | |||
self._context |= contexts.FAIL_ON_EQUALS | |||
else: | |||
self._context |= contexts.FAIL_NEXT | |||
return True | |||
self._context ^= contexts.FAIL_ON_RBRACE | |||
elif this == "{": | |||
self._context |= contexts.FAIL_ON_LBRACE | |||
elif this == "}": | |||
self._context |= contexts.FAIL_ON_RBRACE | |||
return True | |||
def _parse(self, context=0): | |||
"""Parse the wikicode string, using *context* for when to stop.""" | |||
self._push(context) | |||
while True: | |||
this = self._read() | |||
unsafe = (contexts.TEMPLATE_NAME | contexts.WIKILINK_TITLE | | |||
contexts.TEMPLATE_PARAM_KEY | contexts.ARGUMENT_NAME) | |||
if self._context & unsafe: | |||
if not self._verify_safe(this): | |||
if self._context & contexts.TEMPLATE_PARAM_KEY: | |||
self._pop() | |||
self._fail_route() | |||
if this not in self.MARKERS: | |||
self._write_text(this) | |||
self._head += 1 | |||
continue | |||
if this is self.END: | |||
fail = (contexts.TEMPLATE | contexts.ARGUMENT | | |||
contexts.HEADING | contexts.COMMENT) | |||
contexts.WIKILINK | contexts.HEADING | | |||
contexts.COMMENT) | |||
if self._context & contexts.TEMPLATE_PARAM_KEY: | |||
self._pop() | |||
if self._context & fail: | |||
self._fail_route() | |||
return self._pop() | |||
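# A small sketch of the token stream this parse loop emits for a trivial
# template (illustrative; uses the pure-Python tokenizer from this module):
from mwparserfromhell.parser.tokenizer import Tokenizer
from mwparserfromhell.parser import tokens

result = Tokenizer().tokenize("{{foo}}")
assert result == [tokens.TemplateOpen(), tokens.Text(text="foo"),
                  tokens.TemplateClose()]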
@@ -445,7 +490,12 @@ class Tokenizer(object): | |||
else: | |||
self._write_text(this) | |||
elif this == next == "{": | |||
self._parse_template_or_argument() | |||
if self._can_recurse(): | |||
self._parse_template_or_argument() | |||
if self._context & contexts.FAIL_NEXT: | |||
self._context ^= contexts.FAIL_NEXT | |||
else: | |||
self._write_text("{") | |||
elif this == "|" and self._context & contexts.TEMPLATE: | |||
self._handle_template_param() | |||
elif this == "=" and self._context & contexts.TEMPLATE_PARAM_KEY: | |||
@@ -460,8 +510,10 @@ class Tokenizer(object): | |||
else: | |||
self._write_text("}") | |||
elif this == next == "[": | |||
if not self._context & contexts.WIKILINK_TITLE: | |||
if not self._context & contexts.WIKILINK_TITLE and self._can_recurse(): | |||
self._parse_wikilink() | |||
if self._context & contexts.FAIL_NEXT: | |||
self._context ^= contexts.FAIL_NEXT | |||
else: | |||
self._write_text("[") | |||
elif this == "|" and self._context & contexts.WIKILINK_TITLE: | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -41,8 +41,23 @@ def inheritdoc(method): | |||
method.__doc__ = getattr(list, method.__name__).__doc__ | |||
return method | |||
class _SliceNormalizerMixIn(object): | |||
"""MixIn that provides a private method to normalize slices.""" | |||
class SmartList(list): | |||
def _normalize_slice(self, key): | |||
"""Return a slice equivalent to the input *key*, standardized.""" | |||
if key.start is not None: | |||
start = (len(self) + key.start) if key.start < 0 else key.start | |||
else: | |||
start = 0 | |||
if key.stop is not None: | |||
stop = (len(self) + key.stop) if key.stop < 0 else key.stop | |||
else: | |||
stop = maxsize | |||
return slice(start, stop, key.step or 1) | |||
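# An illustrative sketch of what the normalizer returns (it is a private
# helper, called here only for demonstration): negative and open-ended bounds
# are resolved against the current length, with maxsize marking "to the end".
from sys import maxsize
from mwparserfromhell.smart_list import SmartList

data = SmartList([0, 1, 2, 3, 4])
assert data._normalize_slice(slice(1, -1, None)) == slice(1, 4, 1)
assert data._normalize_slice(slice(2, None, None)) == slice(2, maxsize, 1)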
class SmartList(_SliceNormalizerMixIn, list): | |||
"""Implements the ``list`` interface with special handling of sublists. | |||
When a sublist is created (by ``list[i:j]``), any changes made to this | |||
@@ -76,7 +91,8 @@ class SmartList(list): | |||
def __getitem__(self, key): | |||
if not isinstance(key, slice): | |||
return super(SmartList, self).__getitem__(key) | |||
sliceinfo = [key.start, key.stop, 1 if not key.step else key.step] | |||
key = self._normalize_slice(key) | |||
sliceinfo = [key.start, key.stop, key.step] | |||
child = _ListProxy(self, sliceinfo) | |||
self._children[id(child)] = (child, sliceinfo) | |||
return child | |||
@@ -86,25 +102,28 @@ class SmartList(list): | |||
return super(SmartList, self).__setitem__(key, item) | |||
item = list(item) | |||
super(SmartList, self).__setitem__(key, item) | |||
diff = len(item) - key.stop + key.start | |||
key = self._normalize_slice(key) | |||
diff = len(item) + (key.start - key.stop) // key.step | |||
values = self._children.values if py3k else self._children.itervalues | |||
if diff: | |||
for child, (start, stop, step) in values(): | |||
if start >= key.stop: | |||
if start > key.stop: | |||
self._children[id(child)][1][0] += diff | |||
if stop >= key.stop and stop != maxsize: | |||
self._children[id(child)][1][1] += diff | |||
def __delitem__(self, key): | |||
super(SmartList, self).__delitem__(key) | |||
if not isinstance(key, slice): | |||
key = slice(key, key + 1) | |||
diff = key.stop - key.start | |||
if isinstance(key, slice): | |||
key = self._normalize_slice(key) | |||
else: | |||
key = slice(key, key + 1, 1) | |||
diff = (key.stop - key.start) // key.step | |||
values = self._children.values if py3k else self._children.itervalues | |||
for child, (start, stop, step) in values(): | |||
if start > key.start: | |||
self._children[id(child)][1][0] -= diff | |||
if stop >= key.stop: | |||
if stop >= key.stop and stop != maxsize: | |||
self._children[id(child)][1][1] -= diff | |||
if not py3k: | |||
@@ -160,24 +179,35 @@ class SmartList(list): | |||
child._parent = copy | |||
super(SmartList, self).reverse() | |||
@inheritdoc | |||
def sort(self, cmp=None, key=None, reverse=None): | |||
copy = list(self) | |||
for child in self._children: | |||
child._parent = copy | |||
if cmp is not None: | |||
if py3k: | |||
@inheritdoc | |||
def sort(self, key=None, reverse=None): | |||
copy = list(self) | |||
for child in self._children: | |||
child._parent = copy | |||
kwargs = {} | |||
if key is not None: | |||
if reverse is not None: | |||
super(SmartList, self).sort(cmp, key, reverse) | |||
else: | |||
super(SmartList, self).sort(cmp, key) | |||
else: | |||
super(SmartList, self).sort(cmp) | |||
else: | |||
super(SmartList, self).sort() | |||
kwargs["key"] = key | |||
if reverse is not None: | |||
kwargs["reverse"] = reverse | |||
super(SmartList, self).sort(**kwargs) | |||
else: | |||
@inheritdoc | |||
def sort(self, cmp=None, key=None, reverse=None): | |||
copy = list(self) | |||
for child in self._children: | |||
child._parent = copy | |||
kwargs = {} | |||
if cmp is not None: | |||
kwargs["cmp"] = cmp | |||
if key is not None: | |||
kwargs["key"] = key | |||
if reverse is not None: | |||
kwargs["reverse"] = reverse | |||
super(SmartList, self).sort(**kwargs) | |||
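# A short usage sketch of the sublist behaviour SmartList provides (mirrors the
# example in its docstring): a slice returns a live view whose mutations are
# written through to the parent list.
from mwparserfromhell.smart_list import SmartList

parent = SmartList([0, 1, 2, 3])
child = parent[2:]                  # a _ListProxy sharing storage with parent
child.append(4)
assert parent == [0, 1, 2, 3, 4]
assert list(child) == [2, 3, 4]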
class _ListProxy(list): | |||
class _ListProxy(_SliceNormalizerMixIn, list): | |||
"""Implement the ``list`` interface by getting elements from a parent. | |||
This is created by a :py:class:`~.SmartList` object when slicing. It does | |||
@@ -231,25 +261,52 @@ class _ListProxy(list): | |||
return bool(self._render()) | |||
def __len__(self): | |||
return (self._stop - self._start) / self._step | |||
return (self._stop - self._start) // self._step | |||
def __getitem__(self, key): | |||
return self._render()[key] | |||
if isinstance(key, slice): | |||
key = self._normalize_slice(key) | |||
if key.stop == maxsize: | |||
keystop = self._stop | |||
else: | |||
keystop = key.stop + self._start | |||
adjusted = slice(key.start + self._start, keystop, key.step) | |||
return self._parent[adjusted] | |||
else: | |||
return self._render()[key] | |||
def __setitem__(self, key, item): | |||
if isinstance(key, slice): | |||
adjusted = slice(key.start + self._start, key.stop + self._stop, | |||
key.step) | |||
key = self._normalize_slice(key) | |||
if key.stop == maxsize: | |||
keystop = self._stop | |||
else: | |||
keystop = key.stop + self._start | |||
adjusted = slice(key.start + self._start, keystop, key.step) | |||
self._parent[adjusted] = item | |||
else: | |||
length = len(self) | |||
if key < 0: | |||
key = length + key | |||
if key < 0 or key >= length: | |||
raise IndexError("list assignment index out of range") | |||
self._parent[self._start + key] = item | |||
def __delitem__(self, key): | |||
if isinstance(key, slice): | |||
adjusted = slice(key.start + self._start, key.stop + self._stop, | |||
key.step) | |||
key = self._normalize_slice(key) | |||
if key.stop == maxsize: | |||
keystop = self._stop | |||
else: | |||
keystop = key.stop + self._start | |||
adjusted = slice(key.start + self._start, keystop, key.step) | |||
del self._parent[adjusted] | |||
else: | |||
length = len(self) | |||
if key < 0: | |||
key = length + key | |||
if key < 0 or key >= length: | |||
raise IndexError("list assignment index out of range") | |||
del self._parent[self._start + key] | |||
def __iter__(self): | |||
@@ -287,6 +344,16 @@ class _ListProxy(list): | |||
self.extend(other) | |||
return self | |||
def __mul__(self, other): | |||
return SmartList(list(self) * other) | |||
def __rmul__(self, other): | |||
return SmartList(other * list(self)) | |||
def __imul__(self, other): | |||
self.extend(list(self) * (other - 1)) | |||
return self | |||
@property | |||
def _start(self): | |||
"""The starting index of this list, inclusive.""" | |||
@@ -295,6 +362,8 @@ class _ListProxy(list): | |||
@property | |||
def _stop(self): | |||
"""The ending index of this list, exclusive.""" | |||
if self._sliceinfo[1] == maxsize: | |||
return len(self._parent) | |||
return self._sliceinfo[1] | |||
@property | |||
@@ -328,18 +397,25 @@ class _ListProxy(list): | |||
@inheritdoc | |||
def insert(self, index, item): | |||
if index < 0: | |||
index = len(self) + index | |||
self._parent.insert(self._start + index, item) | |||
@inheritdoc | |||
def pop(self, index=None): | |||
length = len(self) | |||
if index is None: | |||
index = len(self) - 1 | |||
index = length - 1 | |||
elif index < 0: | |||
index = length + index | |||
if index < 0 or index >= length: | |||
raise IndexError("pop index out of range") | |||
return self._parent.pop(self._start + index) | |||
@inheritdoc | |||
def remove(self, item): | |||
index = self.index(item) | |||
del self._parent[index] | |||
del self._parent[self._start + index] | |||
@inheritdoc | |||
def reverse(self): | |||
@@ -347,17 +423,30 @@ class _ListProxy(list): | |||
item.reverse() | |||
self._parent[self._start:self._stop:self._step] = item | |||
@inheritdoc | |||
def sort(self, cmp=None, key=None, reverse=None): | |||
item = self._render() | |||
if cmp is not None: | |||
if py3k: | |||
@inheritdoc | |||
def sort(self, key=None, reverse=None): | |||
item = self._render() | |||
kwargs = {} | |||
if key is not None: | |||
if reverse is not None: | |||
item.sort(cmp, key, reverse) | |||
else: | |||
item.sort(cmp, key) | |||
else: | |||
item.sort(cmp) | |||
else: | |||
item.sort() | |||
self._parent[self._start:self._stop:self._step] = item | |||
kwargs["key"] = key | |||
if reverse is not None: | |||
kwargs["reverse"] = reverse | |||
item.sort(**kwargs) | |||
self._parent[self._start:self._stop:self._step] = item | |||
else: | |||
@inheritdoc | |||
def sort(self, cmp=None, key=None, reverse=None): | |||
item = self._render() | |||
kwargs = {} | |||
if cmp is not None: | |||
kwargs["cmp"] = cmp | |||
if key is not None: | |||
kwargs["key"] = key | |||
if reverse is not None: | |||
kwargs["reverse"] = reverse | |||
item.sort(**kwargs) | |||
self._parent[self._start:self._stop:self._step] = item | |||
del inheritdoc |
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -40,7 +40,6 @@ def inheritdoc(method): | |||
method.__doc__ = getattr(str, method.__name__).__doc__ | |||
return method | |||
class StringMixIn(object): | |||
"""Implement the interface for ``unicode``/``str`` in a dynamic manner. | |||
@@ -114,6 +113,9 @@ class StringMixIn(object): | |||
def __getitem__(self, key): | |||
return self.__unicode__()[key] | |||
def __reversed__(self): | |||
return reversed(self.__unicode__()) | |||
def __contains__(self, item): | |||
if isinstance(item, StringMixIn): | |||
return str(item) in self.__unicode__() | |||
@@ -123,22 +125,39 @@ class StringMixIn(object): | |||
def capitalize(self): | |||
return self.__unicode__().capitalize() | |||
if py3k: | |||
@inheritdoc | |||
def casefold(self): | |||
return self.__unicode__().casefold() | |||
@inheritdoc | |||
def center(self, width, fillchar=None): | |||
if fillchar is None: | |||
return self.__unicode__().center(width) | |||
return self.__unicode__().center(width, fillchar) | |||
@inheritdoc | |||
def count(self, sub=None, start=None, end=None): | |||
def count(self, sub, start=None, end=None): | |||
return self.__unicode__().count(sub, start, end) | |||
if not py3k: | |||
@inheritdoc | |||
def decode(self, encoding=None, errors=None): | |||
return self.__unicode__().decode(encoding, errors) | |||
kwargs = {} | |||
if encoding is not None: | |||
kwargs["encoding"] = encoding | |||
if errors is not None: | |||
kwargs["errors"] = errors | |||
return self.__unicode__().decode(**kwargs) | |||
@inheritdoc | |||
def encode(self, encoding=None, errors=None): | |||
return self.__unicode__().encode(encoding, errors) | |||
kwargs = {} | |||
if encoding is not None: | |||
kwargs["encoding"] = encoding | |||
if errors is not None: | |||
kwargs["errors"] = errors | |||
return self.__unicode__().encode(**kwargs) | |||
@inheritdoc | |||
def endswith(self, prefix, start=None, end=None): | |||
@@ -146,18 +165,25 @@ class StringMixIn(object): | |||
@inheritdoc | |||
def expandtabs(self, tabsize=None): | |||
if tabsize is None: | |||
return self.__unicode__().expandtabs() | |||
return self.__unicode__().expandtabs(tabsize) | |||
@inheritdoc | |||
def find(self, sub=None, start=None, end=None): | |||
def find(self, sub, start=None, end=None): | |||
return self.__unicode__().find(sub, start, end) | |||
@inheritdoc | |||
def format(self, *args, **kwargs): | |||
return self.__unicode__().format(*args, **kwargs) | |||
if py3k: | |||
@inheritdoc | |||
def format_map(self, mapping): | |||
return self.__unicode__().format_map(mapping) | |||
@inheritdoc | |||
def index(self, sub=None, start=None, end=None): | |||
def index(self, sub, start=None, end=None): | |||
return self.__unicode__().index(sub, start, end) | |||
@inheritdoc | |||
@@ -176,6 +202,11 @@ class StringMixIn(object): | |||
def isdigit(self): | |||
return self.__unicode__().isdigit() | |||
if py3k: | |||
@inheritdoc | |||
def isidentifier(self): | |||
return self.__unicode__().isidentifier() | |||
@inheritdoc | |||
def islower(self): | |||
return self.__unicode__().islower() | |||
@@ -184,6 +215,11 @@ class StringMixIn(object): | |||
def isnumeric(self): | |||
return self.__unicode__().isnumeric() | |||
if py3k: | |||
@inheritdoc | |||
def isprintable(self): | |||
return self.__unicode__().isprintable() | |||
@inheritdoc | |||
def isspace(self): | |||
return self.__unicode__().isspace() | |||
@@ -202,6 +238,8 @@ class StringMixIn(object): | |||
@inheritdoc | |||
def ljust(self, width, fillchar=None): | |||
if fillchar is None: | |||
return self.__unicode__().ljust(width) | |||
return self.__unicode__().ljust(width, fillchar) | |||
@inheritdoc | |||
@@ -212,44 +250,88 @@ class StringMixIn(object): | |||
def lstrip(self, chars=None): | |||
return self.__unicode__().lstrip(chars) | |||
if py3k: | |||
@staticmethod | |||
@inheritdoc | |||
def maketrans(x, y=None, z=None): | |||
if z is None: | |||
if y is None: | |||
return str.maketrans(x) | |||
return str.maketrans(x, y) | |||
return str.maketrans(x, y, z) | |||
@inheritdoc | |||
def partition(self, sep): | |||
return self.__unicode__().partition(sep) | |||
@inheritdoc | |||
def replace(self, old, new, count): | |||
def replace(self, old, new, count=None): | |||
if count is None: | |||
return self.__unicode__().replace(old, new) | |||
return self.__unicode__().replace(old, new, count) | |||
@inheritdoc | |||
def rfind(self, sub=None, start=None, end=None): | |||
def rfind(self, sub, start=None, end=None): | |||
return self.__unicode__().rfind(sub, start, end) | |||
@inheritdoc | |||
def rindex(self, sub=None, start=None, end=None): | |||
def rindex(self, sub, start=None, end=None): | |||
return self.__unicode__().rindex(sub, start, end) | |||
@inheritdoc | |||
def rjust(self, width, fillchar=None): | |||
if fillchar is None: | |||
return self.__unicode__().rjust(width) | |||
return self.__unicode__().rjust(width, fillchar) | |||
@inheritdoc | |||
def rpartition(self, sep): | |||
return self.__unicode__().rpartition(sep) | |||
@inheritdoc | |||
def rsplit(self, sep=None, maxsplit=None): | |||
return self.__unicode__().rsplit(sep, maxsplit) | |||
if py3k: | |||
@inheritdoc | |||
def rsplit(self, sep=None, maxsplit=None): | |||
kwargs = {} | |||
if sep is not None: | |||
kwargs["sep"] = sep | |||
if maxsplit is not None: | |||
kwargs["maxsplit"] = maxsplit | |||
return self.__unicode__().rsplit(**kwargs) | |||
else: | |||
@inheritdoc | |||
def rsplit(self, sep=None, maxsplit=None): | |||
if maxsplit is None: | |||
if sep is None: | |||
return self.__unicode__().rsplit() | |||
return self.__unicode__().rsplit(sep) | |||
return self.__unicode__().rsplit(sep, maxsplit) | |||
@inheritdoc | |||
def rstrip(self, chars=None): | |||
return self.__unicode__().rstrip(chars) | |||
@inheritdoc | |||
def split(self, sep=None, maxsplit=None): | |||
return self.__unicode__().split(sep, maxsplit) | |||
if py3k: | |||
@inheritdoc | |||
def split(self, sep=None, maxsplit=None): | |||
kwargs = {} | |||
if sep is not None: | |||
kwargs["sep"] = sep | |||
if maxsplit is not None: | |||
kwargs["maxsplit"] = maxsplit | |||
return self.__unicode__().split(**kwargs) | |||
else: | |||
@inheritdoc | |||
def split(self, sep=None, maxsplit=None): | |||
if maxsplit is None: | |||
if sep is None: | |||
return self.__unicode__().split() | |||
return self.__unicode__().split(sep) | |||
return self.__unicode__().split(sep, maxsplit) | |||
@inheritdoc | |||
def splitlines(self, keepends=None): | |||
if keepends is None: | |||
return self.__unicode__().splitlines() | |||
return self.__unicode__().splitlines(keepends) | |||
@inheritdoc | |||
@@ -269,8 +351,8 @@ class StringMixIn(object): | |||
return self.__unicode__().title() | |||
@inheritdoc | |||
def translate(self, table, deletechars=None): | |||
return self.__unicode__().translate(table, deletechars) | |||
def translate(self, table): | |||
return self.__unicode__().translate(table) | |||
@inheritdoc | |||
def upper(self): | |||
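# A sketch of the contract StringMixIn establishes: a subclass implements only
# __unicode__() and inherits the rest of the string interface (the Shout class
# below is hypothetical, for illustration only).
from mwparserfromhell.string_mixin import StringMixIn

class Shout(StringMixIn):
    def __init__(self, value):
        self._value = value

    def __unicode__(self):
        return self._value.upper()

loud = Shout("wiki")
assert str(loud) == "WIKI"
assert loud.startswith("WI") and "IK" in loud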
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -34,16 +34,16 @@ from .smart_list import SmartList | |||
def parse_anything(value): | |||
"""Return a :py:class:`~.Wikicode` for *value*, allowing multiple types. | |||
This differs from :py:func:`mwparserfromhell.parse` in that we accept more | |||
than just a string to be parsed. Unicode objects (strings in py3k), strings | |||
(bytes in py3k), integers (converted to strings), ``None``, existing | |||
This differs from :py:meth:`.Parser.parse` in that we accept more than just | |||
a string to be parsed. Unicode objects (strings in py3k), strings (bytes in | |||
py3k), integers (converted to strings), ``None``, existing | |||
:py:class:`~.Node` or :py:class:`~.Wikicode` objects, as well as an | |||
iterable of these types, are supported. This is used to parse input | |||
on-the-fly by various methods of :py:class:`~.Wikicode` and others like | |||
:py:class:`~.Template`, such as :py:meth:`wikicode.insert() | |||
<.Wikicode.insert>` or setting :py:meth:`template.name <.Template.name>`. | |||
""" | |||
from . import parse | |||
from .parser import Parser | |||
from .wikicode import Wikicode | |||
if isinstance(value, Wikicode): | |||
@@ -51,11 +51,11 @@ def parse_anything(value): | |||
elif isinstance(value, Node): | |||
return Wikicode(SmartList([value])) | |||
elif isinstance(value, str): | |||
return parse(value) | |||
return Parser(value).parse() | |||
elif isinstance(value, bytes): | |||
return parse(value.decode("utf8")) | |||
return Parser(value.decode("utf8")).parse() | |||
elif isinstance(value, int): | |||
return parse(str(value)) | |||
return Parser(str(value)).parse() | |||
elif value is None: | |||
return Wikicode(SmartList()) | |||
try: | |||
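# A minimal sketch of the coercions described in the docstring above (values
# are illustrative): strings, integers, None, and iterables of these all end
# up as Wikicode.
from mwparserfromhell.utils import parse_anything

assert str(parse_anything("{{foo}}")) == "{{foo}}"
assert str(parse_anything(42)) == "42"
assert str(parse_anything(None)) == ""
assert str(parse_anything(["{{foo}}", "bar"])) == "{{foo}}bar"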
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -23,8 +23,9 @@ | |||
from __future__ import unicode_literals | |||
import re | |||
from .compat import maxsize, str | |||
from .nodes import Heading, Node, Tag, Template, Text, Wikilink | |||
from .compat import maxsize, py3k, str | |||
from .nodes import (Argument, Comment, Heading, HTMLEntity, Node, Tag, | |||
Template, Text, Wikilink) | |||
from .string_mixin import StringMixIn | |||
from .utils import parse_anything | |||
@@ -68,7 +69,7 @@ class Wikicode(StringMixIn): | |||
Raises ``ValueError`` if *obj* is not within *node*. | |||
""" | |||
for context, child in node.__iternodes__(self._get_all_nodes): | |||
if child is obj: | |||
if self._is_equivalent(obj, child): | |||
return context | |||
raise ValueError(obj) | |||
@@ -88,13 +89,7 @@ class Wikicode(StringMixIn): | |||
If *obj* is a ``Node``, the function will test whether they are the | |||
same object, otherwise it will compare them with ``==``. | |||
""" | |||
if isinstance(obj, Node): | |||
if node is obj: | |||
return True | |||
else: | |||
if node == obj: | |||
return True | |||
return False | |||
return (node is obj) if isinstance(obj, Node) else (node == obj) | |||
def _contains(self, nodes, obj): | |||
"""Return ``True`` if *obj* is inside of *nodes*, else ``False``. | |||
@@ -157,6 +152,36 @@ class Wikicode(StringMixIn): | |||
node.__showtree__(write, get, mark) | |||
return lines | |||
@classmethod | |||
def _build_filter_methods(cls, **meths): | |||
"""Given Node types, build the corresponding i?filter shortcuts. | |||
They should be given as keys storing the method's base name paired | |||
with values storing the corresponding :py:class:`~.Node` type. For | |||
example, the dict may contain the pair ``("templates", Template)``, | |||
which will produce the methods :py:meth:`ifilter_templates` and | |||
:py:meth:`filter_templates`, which are shortcuts for | |||
:py:meth:`ifilter(forcetype=Template) <ifilter>` and | |||
:py:meth:`filter(forcetype=Template) <filter>`, respectively. These | |||
shortcuts are added to the class itself, with an appropriate docstring. | |||
""" | |||
doc = """Iterate over {0}. | |||
This is equivalent to :py:meth:`{1}` with *forcetype* set to | |||
:py:class:`~{2.__module__}.{2.__name__}`. | |||
""" | |||
make_ifilter = lambda ftype: (lambda self, **kw: | |||
self.ifilter(forcetype=ftype, **kw)) | |||
make_filter = lambda ftype: (lambda self, **kw: | |||
self.filter(forcetype=ftype, **kw)) | |||
for name, ftype in (meths.items() if py3k else meths.iteritems()): | |||
ifilter = make_ifilter(ftype) | |||
filter = make_filter(ftype) | |||
ifilter.__doc__ = doc.format(name, "ifilter", ftype) | |||
filter.__doc__ = doc.format(name, "filter", ftype) | |||
setattr(cls, "ifilter_" + name, ifilter) | |||
setattr(cls, "filter_" + name, filter) | |||
@property | |||
def nodes(self): | |||
"""A list of :py:class:`~.Node` objects. | |||
@@ -168,6 +193,8 @@ class Wikicode(StringMixIn): | |||
@nodes.setter | |||
def nodes(self, value): | |||
if not isinstance(value, list): | |||
value = parse_anything(value).nodes | |||
self._nodes = value | |||
def get(self, index): | |||
@@ -188,9 +215,10 @@ class Wikicode(StringMixIn): | |||
raise ValueError("Cannot coerce multiple nodes into one index") | |||
if index >= len(self.nodes) or -1 * index > len(self.nodes): | |||
raise IndexError("List assignment index out of range") | |||
self.nodes.pop(index) | |||
if nodes: | |||
self.nodes[index] = nodes[0] | |||
else: | |||
self.nodes.pop(index) | |||
def index(self, obj, recursive=False): | |||
"""Return the index of *obj* in the list of nodes. | |||
@@ -294,47 +322,11 @@ class Wikicode(StringMixIn): | |||
*flags*. If *forcetype* is given, only nodes that are instances of this | |||
type are yielded. | |||
""" | |||
if recursive: | |||
nodes = self._get_all_nodes(self) | |||
else: | |||
nodes = self.nodes | |||
for node in nodes: | |||
for node in (self._get_all_nodes(self) if recursive else self.nodes): | |||
if not forcetype or isinstance(node, forcetype): | |||
if not matches or re.search(matches, str(node), flags): | |||
yield node | |||
def ifilter_links(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Iterate over wikilink nodes. | |||
This is equivalent to :py:meth:`ifilter` with *forcetype* set to | |||
:py:class:`~.Wikilink`. | |||
""" | |||
return self.ifilter(recursive, matches, flags, forcetype=Wikilink) | |||
def ifilter_templates(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Iterate over template nodes. | |||
This is equivalent to :py:meth:`ifilter` with *forcetype* set to | |||
:py:class:`~.Template`. | |||
""" | |||
return self.filter(recursive, matches, flags, forcetype=Template) | |||
def ifilter_text(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Iterate over text nodes. | |||
This is equivalent to :py:meth:`ifilter` with *forcetype* set to | |||
:py:class:`~.nodes.Text`. | |||
""" | |||
return self.filter(recursive, matches, flags, forcetype=Text) | |||
def ifilter_tags(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Iterate over tag nodes. | |||
This is equivalent to :py:meth:`ifilter` with *forcetype* set to | |||
:py:class:`~.Tag`. | |||
""" | |||
return self.ifilter(recursive, matches, flags, forcetype=Tag) | |||
def filter(self, recursive=False, matches=None, flags=FLAGS, | |||
forcetype=None): | |||
"""Return a list of nodes within our list matching certain conditions. | |||
@@ -343,77 +335,56 @@ class Wikicode(StringMixIn): | |||
""" | |||
return list(self.ifilter(recursive, matches, flags, forcetype)) | |||
def filter_links(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Return a list of wikilink nodes. | |||
This is equivalent to calling :py:func:`list` on | |||
:py:meth:`ifilter_links`. | |||
""" | |||
return list(self.ifilter_links(recursive, matches, flags)) | |||
def filter_templates(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Return a list of template nodes. | |||
This is equivalent to calling :py:func:`list` on | |||
:py:meth:`ifilter_templates`. | |||
""" | |||
return list(self.ifilter_templates(recursive, matches, flags)) | |||
def filter_text(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Return a list of text nodes. | |||
This is equivalent to calling :py:func:`list` on | |||
:py:meth:`ifilter_text`. | |||
""" | |||
return list(self.ifilter_text(recursive, matches, flags)) | |||
def filter_tags(self, recursive=False, matches=None, flags=FLAGS): | |||
"""Return a list of tag nodes. | |||
This is equivalent to calling :py:func:`list` on | |||
:py:meth:`ifilter_tags`. | |||
""" | |||
return list(self.ifilter_tags(recursive, matches, flags)) | |||
def get_sections(self, flat=True, matches=None, levels=None, flags=FLAGS, | |||
include_headings=True): | |||
def get_sections(self, levels=None, matches=None, flags=FLAGS, | |||
include_lead=None, include_headings=True): | |||
"""Return a list of sections within the page. | |||
Sections are returned as :py:class:`~.Wikicode` objects with a shared | |||
node list (implemented using :py:class:`~.SmartList`) so that changes | |||
to sections are reflected in the parent Wikicode object. | |||
With *flat* as ``True``, each returned section contains all of its | |||
subsections within the :py:class:`~.Wikicode`; otherwise, the returned | |||
sections contain only the section up to the next heading, regardless of | |||
its size. If *matches* is given, it should be a regex to be matched | |||
against the titles of section headings; only sections whose headings | |||
match the regex will be included. If *levels* is given, it should be a | |||
list of integers; only sections whose heading levels are within the | |||
list will be returned. If *include_headings* is ``True``, the section's | |||
literal :py:class:`~.Heading` object will be included in returned | |||
:py:class:`~.Wikicode` objects; otherwise, this is skipped. | |||
Each section contains all of its subsections. If *levels* is given, it | |||
should be an iterable of integers; only sections whose heading levels | |||
are within it will be returned. If *matches* is given, it should be a | |||
regex to be matched against the titles of section headings; only | |||
sections whose headings match the regex will be included. *flags* can | |||
be used to override the default regex flags (see :py:meth:`ifilter`) if | |||
*matches* is used. | |||
If *include_lead* is ``True``, the first, lead section (without a | |||
heading) will be included in the list; ``False`` will not include it; | |||
the default will include it only if no specific *levels* were given. If | |||
*include_headings* is ``True``, the section's beginning | |||
:py:class:`~.Heading` object will be included; otherwise, this is | |||
skipped. | |||
""" | |||
if matches: | |||
matches = r"^(=+?)\s*" + matches + r"\s*\1$" | |||
headings = self.filter(recursive=True, matches=matches, flags=flags, | |||
forcetype=Heading) | |||
headings = self.filter_headings(recursive=True) | |||
filtered = self.filter_headings(recursive=True, matches=matches, | |||
flags=flags) | |||
if levels: | |||
headings = [head for head in headings if head.level in levels] | |||
filtered = [head for head in filtered if head.level in levels] | |||
if matches or include_lead is False or (not include_lead and levels): | |||
buffers = [] | |||
else: | |||
buffers = [(maxsize, 0)] | |||
sections = [] | |||
buffers = [[maxsize, 0]] | |||
i = 0 | |||
while i < len(self.nodes): | |||
if self.nodes[i] in headings: | |||
this = self.nodes[i].level | |||
for (level, start) in buffers: | |||
if not flat or this <= level: | |||
buffers.remove([level, start]) | |||
if this <= level: | |||
sections.append(Wikicode(self.nodes[start:i])) | |||
buffers.append([this, i]) | |||
if not include_headings: | |||
i += 1 | |||
buffers = [buf for buf in buffers if buf[0] < this] | |||
if self.nodes[i] in filtered: | |||
if not include_headings: | |||
i += 1 | |||
if i >= len(self.nodes): | |||
break | |||
buffers.append((this, i)) | |||
i += 1 | |||
for (level, start) in buffers: | |||
if start != i: | |||
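# A short sketch of the reworked get_sections() semantics (headings below are
# illustrative): each section carries its subsections, and *levels* narrows
# the result while excluding the heading-less lead by default.
import mwparserfromhell

text = "lead\n== A ==\naaa\n=== A1 ===\nsub\n== B ==\nbbb\n"
code = mwparserfromhell.parse(text)
level2 = code.get_sections(levels=[2])
assert len(level2) == 2                    # the "A" and "B" sections
assert "=== A1 ===" in str(level2[0])      # "A" still contains its subsection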
@@ -473,3 +444,8 @@ class Wikicode(StringMixIn): | |||
""" | |||
marker = object() # Random object we can find with certainty in a list | |||
return "\n".join(self._get_tree(self, [], marker, 0)) | |||
Wikicode._build_filter_methods( | |||
arguments=Argument, comments=Comment, headings=Heading, | |||
html_entities=HTMLEntity, tags=Tag, templates=Template, text=Text, | |||
wikilinks=Wikilink) |
@@ -1,7 +1,7 @@ | |||
#! /usr/bin/env python | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -21,16 +21,24 @@ | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from setuptools import setup, find_packages | |||
from setuptools import setup, find_packages, Extension | |||
from mwparserfromhell import __version__ | |||
from mwparserfromhell.compat import py3k | |||
with open("README.rst") as fp: | |||
long_docs = fp.read() | |||
# builder = Extension("mwparserfromhell.parser._builder", | |||
# sources = ["mwparserfromhell/parser/builder.c"]) | |||
tokenizer = Extension("mwparserfromhell.parser._tokenizer", | |||
sources = ["mwparserfromhell/parser/tokenizer.c"]) | |||
setup( | |||
name = "mwparserfromhell", | |||
packages = find_packages(exclude=("tests",)), | |||
ext_modules = [] if py3k else [tokenizer], | |||
test_suite = "tests", | |||
version = __version__, | |||
author = "Ben Kurtovic", | |||
@@ -0,0 +1,130 @@ | |||
<?xml version="1.0" encoding="UTF-8"?> | |||
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> | |||
<plist version="1.0"> | |||
<dict> | |||
<key>fileTypes</key> | |||
<array> | |||
<string>mwtest</string> | |||
</array> | |||
<key>name</key> | |||
<string>MWParserFromHell Test Case</string> | |||
<key>patterns</key> | |||
<array> | |||
<dict> | |||
<key>match</key> | |||
<string>---</string> | |||
<key>name</key> | |||
<string>markup.heading.divider.mwpfh</string> | |||
</dict> | |||
<dict> | |||
<key>captures</key> | |||
<dict> | |||
<key>1</key> | |||
<dict> | |||
<key>name</key> | |||
<string>keyword.other.name.mwpfh</string> | |||
</dict> | |||
<key>2</key> | |||
<dict> | |||
<key>name</key> | |||
<string>variable.other.name.mwpfh</string> | |||
</dict> | |||
</dict> | |||
<key>match</key> | |||
<string>(name:)\s*(\w*)</string> | |||
<key>name</key> | |||
<string>meta.name.mwpfh</string> | |||
</dict> | |||
<dict> | |||
<key>captures</key> | |||
<dict> | |||
<key>1</key> | |||
<dict> | |||
<key>name</key> | |||
<string>keyword.other.label.mwpfh</string> | |||
</dict> | |||
<key>2</key> | |||
<dict> | |||
<key>name</key> | |||
<string>comment.line.other.label.mwpfh</string> | |||
</dict> | |||
</dict> | |||
<key>match</key> | |||
<string>(label:)\s*(.*)</string> | |||
<key>name</key> | |||
<string>meta.label.mwpfh</string> | |||
</dict> | |||
<dict> | |||
<key>captures</key> | |||
<dict> | |||
<key>1</key> | |||
<dict> | |||
<key>name</key> | |||
<string>keyword.other.input.mwpfh</string> | |||
</dict> | |||
<key>2</key> | |||
<dict> | |||
<key>name</key> | |||
<string>string.quoted.double.input.mwpfh</string> | |||
</dict> | |||
</dict> | |||
<key>match</key> | |||
<string>(input:)\s*(.*)</string> | |||
<key>name</key> | |||
<string>meta.input.mwpfh</string> | |||
</dict> | |||
<dict> | |||
<key>captures</key> | |||
<dict> | |||
<key>1</key> | |||
<dict> | |||
<key>name</key> | |||
<string>keyword.other.output.mwpfh</string> | |||
</dict> | |||
</dict> | |||
<key>match</key> | |||
<string>(output:)</string> | |||
<key>name</key> | |||
<string>meta.output.mwpfh</string> | |||
</dict> | |||
<dict> | |||
<key>captures</key> | |||
<dict> | |||
<key>1</key> | |||
<dict> | |||
<key>name</key> | |||
<string>support.language.token.mwpfh</string> | |||
</dict> | |||
</dict> | |||
<key>match</key> | |||
<string>(\w+)\s*\(</string> | |||
<key>name</key> | |||
<string>meta.name.token.mwpfh</string> | |||
</dict> | |||
<dict> | |||
<key>captures</key> | |||
<dict> | |||
<key>1</key> | |||
<dict> | |||
<key>name</key> | |||
<string>variable.parameter.token.mwpfh</string> | |||
</dict> | |||
</dict> | |||
<key>match</key> | |||
<string>(\w+)\s*(=)</string> | |||
<key>name</key> | |||
<string>meta.name.parameter.token.mwpfh</string> | |||
</dict> | |||
<dict> | |||
<key>match</key> | |||
<string>".*?"</string> | |||
<key>name</key> | |||
<string>string.quoted.double.mwpfh</string> | |||
</dict> | |||
</array> | |||
<key>scopeName</key> | |||
<string>text.mwpfh</string> | |||
<key>uuid</key> | |||
<string>cd3e2ffa-a57d-4c40-954f-1a2e87ffd638</string> | |||
</dict> | |||
</plist> |
@@ -0,0 +1,133 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import print_function, unicode_literals | |||
from os import listdir, path | |||
import sys | |||
from mwparserfromhell.compat import py3k | |||
from mwparserfromhell.parser import tokens | |||
class _TestParseError(Exception): | |||
"""Raised internally when a test could not be parsed.""" | |||
pass | |||
class TokenizerTestCase(object): | |||
"""A base test case for tokenizers, whose tests are loaded dynamically. | |||
Subclassed along with unittest.TestCase to form TestPyTokenizer and | |||
TestCTokenizer. Tests are loaded dynamically from files in the 'tokenizer' | |||
directory. | |||
""" | |||
@classmethod | |||
def _build_test_method(cls, funcname, data): | |||
"""Create and return a method to be treated as a test case method. | |||
*data* is a dict containing multiple keys: the *input* text to be | |||
tokenized, the expected list of tokens as *output*, and an optional | |||
*label* for the method's docstring. | |||
""" | |||
def inner(self): | |||
expected = data["output"] | |||
actual = self.tokenizer().tokenize(data["input"]) | |||
self.assertEqual(expected, actual) | |||
if not py3k: | |||
inner.__name__ = funcname.encode("utf8") | |||
inner.__doc__ = data["label"] | |||
return inner | |||
@classmethod | |||
def _load_tests(cls, filename, name, text): | |||
"""Load all tests in *text* from the file *filename*.""" | |||
tests = text.split("\n---\n") | |||
counter = 1 | |||
digits = len(str(len(tests))) | |||
for test in tests: | |||
data = {"name": None, "label": None, "input": None, "output": None} | |||
try: | |||
for line in test.strip().splitlines(): | |||
if line.startswith("name:"): | |||
data["name"] = line[len("name:"):].strip() | |||
elif line.startswith("label:"): | |||
data["label"] = line[len("label:"):].strip() | |||
elif line.startswith("input:"): | |||
raw = line[len("input:"):].strip() | |||
if raw[0] == '"' and raw[-1] == '"': | |||
raw = raw[1:-1] | |||
raw = raw.encode("raw_unicode_escape") | |||
data["input"] = raw.decode("unicode_escape") | |||
elif line.startswith("output:"): | |||
raw = line[len("output:"):].strip() | |||
try: | |||
data["output"] = eval(raw, vars(tokens)) | |||
except Exception as err: | |||
raise _TestParseError(err) | |||
except _TestParseError as err: | |||
if data["name"]: | |||
error = "Could not parse test '{0}' in '{1}':\n\t{2}" | |||
print(error.format(data["name"], filename, err)) | |||
else: | |||
error = "Could not parse a test in '{0}':\n\t{1}" | |||
print(error.format(filename, err)) | |||
continue | |||
if not data["name"]: | |||
error = "A test in '{0}' was ignored because it lacked a name" | |||
print(error.format(filename)) | |||
continue | |||
if data["input"] is None or data["output"] is None: | |||
error = "Test '{0}' in '{1}' was ignored because it lacked an input or an output" | |||
print(error.format(data["name"], filename)) | |||
continue | |||
number = str(counter).zfill(digits) | |||
fname = "test_{0}{1}_{2}".format(name, number, data["name"]) | |||
meth = cls._build_test_method(fname, data) | |||
setattr(cls, fname, meth) | |||
counter += 1 | |||
@classmethod | |||
def build(cls): | |||
"""Load and install all tests from the 'tokenizer' directory.""" | |||
def load_file(filename): | |||
with open(filename, "rU") as fp: | |||
text = fp.read() | |||
if not py3k: | |||
text = text.decode("utf8") | |||
name = path.split(filename)[1][:0-len(extension)] | |||
cls._load_tests(filename, name, text) | |||
directory = path.join(path.dirname(__file__), "tokenizer") | |||
extension = ".mwtest" | |||
if len(sys.argv) > 2 and sys.argv[1] == "--use": | |||
for name in sys.argv[2:]: | |||
load_file(path.join(directory, name + extension)) | |||
sys.argv = [sys.argv[0]] # So unittest doesn't try to load these | |||
cls.skip_others = True | |||
else: | |||
for filename in listdir(directory): | |||
if not filename.endswith(extension): | |||
continue | |||
load_file(path.join(directory, filename)) | |||
cls.skip_others = False | |||
TokenizerTestCase.build() |
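# A sketch of the .mwtest format the loader above consumes (the test below is
# illustrative, not one of the repository's real test files):
EXAMPLE = '''name: basic_template
label: a simple one-parameter template
input: "{{foo|bar}}"
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateClose()]'''

# Feeding this text to TokenizerTestCase._load_tests("basic.mwtest", "basic",
# EXAMPLE) would attach a test_basic1_basic_template() method to the class.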
@@ -0,0 +1,126 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
from unittest import TestCase | |||
from mwparserfromhell.nodes import (Argument, Comment, Heading, HTMLEntity, | |||
Tag, Template, Text, Wikilink) | |||
from mwparserfromhell.nodes.extras import Attribute, Parameter | |||
from mwparserfromhell.smart_list import SmartList | |||
from mwparserfromhell.wikicode import Wikicode | |||
wrap = lambda L: Wikicode(SmartList(L)) | |||
wraptext = lambda *args: wrap([Text(t) for t in args]) | |||
def getnodes(code): | |||
"""Iterate over all child nodes of a given parent node. | |||
Imitates Wikicode._get_all_nodes(). | |||
""" | |||
for node in code.nodes: | |||
for context, child in node.__iternodes__(getnodes): | |||
yield child | |||
class TreeEqualityTestCase(TestCase): | |||
"""A base test case with support for comparing the equality of node trees. | |||
This adds a number of type equality functions, for Wikicode, Text, | |||
Templates, and Wikilinks. | |||
""" | |||
def assertNodeEqual(self, expected, actual): | |||
"""Assert that two Nodes have the same type and have the same data.""" | |||
registry = { | |||
Argument: self.assertArgumentNodeEqual, | |||
Comment: self.assertCommentNodeEqual, | |||
Heading: self.assertHeadingNodeEqual, | |||
HTMLEntity: self.assertHTMLEntityNodeEqual, | |||
Tag: self.assertTagNodeEqual, | |||
Template: self.assertTemplateNodeEqual, | |||
Text: self.assertTextNodeEqual, | |||
Wikilink: self.assertWikilinkNodeEqual | |||
} | |||
for nodetype in registry: | |||
if isinstance(expected, nodetype): | |||
self.assertIsInstance(actual, nodetype) | |||
registry[nodetype](expected, actual) | |||
def assertArgumentNodeEqual(self, expected, actual): | |||
"""Assert that two Argument nodes have the same data.""" | |||
self.assertWikicodeEqual(expected.name, actual.name) | |||
if expected.default is not None: | |||
self.assertWikicodeEqual(expected.default, actual.default) | |||
else: | |||
self.assertIs(None, actual.default) | |||
def assertCommentNodeEqual(self, expected, actual): | |||
"""Assert that two Comment nodes have the same data.""" | |||
self.assertWikicodeEqual(expected.contents, actual.contents) | |||
def assertHeadingNodeEqual(self, expected, actual): | |||
"""Assert that two Heading nodes have the same data.""" | |||
self.assertWikicodeEqual(expected.title, actual.title) | |||
self.assertEqual(expected.level, actual.level) | |||
def assertHTMLEntityNodeEqual(self, expected, actual): | |||
"""Assert that two HTMLEntity nodes have the same data.""" | |||
self.assertEqual(expected.value, actual.value) | |||
self.assertIs(expected.named, actual.named) | |||
self.assertIs(expected.hexadecimal, actual.hexadecimal) | |||
self.assertEqual(expected.hex_char, actual.hex_char) | |||
def assertTagNodeEqual(self, expected, actual): | |||
"""Assert that two Tag nodes have the same data.""" | |||
self.fail("Holding this until feature/html_tags is ready.") | |||
def assertTemplateNodeEqual(self, expected, actual): | |||
"""Assert that two Template nodes have the same data.""" | |||
self.assertWikicodeEqual(expected.name, actual.name) | |||
length = len(expected.params) | |||
self.assertEqual(length, len(actual.params)) | |||
for i in range(length): | |||
exp_param = expected.params[i] | |||
act_param = actual.params[i] | |||
self.assertWikicodeEqual(exp_param.name, act_param.name) | |||
self.assertWikicodeEqual(exp_param.value, act_param.value) | |||
self.assertIs(exp_param.showkey, act_param.showkey) | |||
def assertTextNodeEqual(self, expected, actual): | |||
"""Assert that two Text nodes have the same data.""" | |||
self.assertEqual(expected.value, actual.value) | |||
def assertWikilinkNodeEqual(self, expected, actual): | |||
"""Assert that two Wikilink nodes have the same data.""" | |||
self.assertWikicodeEqual(expected.title, actual.title) | |||
if expected.text is not None: | |||
self.assertWikicodeEqual(expected.text, actual.text) | |||
else: | |||
self.assertIs(None, actual.text) | |||
def assertWikicodeEqual(self, expected, actual): | |||
"""Assert that two Wikicode objects have the same data.""" | |||
self.assertIsInstance(actual, Wikicode) | |||
length = len(expected.nodes) | |||
self.assertEqual(length, len(actual.nodes)) | |||
for i in range(length): | |||
self.assertNodeEqual(expected.get(i), actual.get(i)) |
@@ -0,0 +1,20 @@ | |||
# -*- coding: utf-8 -*- | |||
""" | |||
Serves the same purpose as mwparserfromhell.compat, but only for objects | |||
required by unit tests. This avoids unnecessary imports (like urllib) within | |||
the main library. | |||
""" | |||
from mwparserfromhell.compat import py3k | |||
if py3k: | |||
range = range | |||
from io import StringIO | |||
from urllib.parse import urlencode | |||
from urllib.request import urlopen | |||
else: | |||
range = xrange | |||
from StringIO import StringIO | |||
from urllib import urlencode, urlopen |
@@ -0,0 +1,107 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import Argument, Text | |||
from ._test_tree_equality import TreeEqualityTestCase, getnodes, wrap, wraptext | |||
class TestArgument(TreeEqualityTestCase): | |||
"""Test cases for the Argument node.""" | |||
def test_unicode(self): | |||
"""test Argument.__unicode__()""" | |||
node = Argument(wraptext("foobar")) | |||
self.assertEqual("{{{foobar}}}", str(node)) | |||
node2 = Argument(wraptext("foo"), wraptext("bar")) | |||
self.assertEqual("{{{foo|bar}}}", str(node2)) | |||
def test_iternodes(self): | |||
"""test Argument.__iternodes__()""" | |||
node1n1 = Text("foobar") | |||
node2n1, node2n2, node2n3 = Text("foo"), Text("bar"), Text("baz") | |||
node1 = Argument(wrap([node1n1])) | |||
node2 = Argument(wrap([node2n1]), wrap([node2n2, node2n3])) | |||
gen1 = node1.__iternodes__(getnodes) | |||
gen2 = node2.__iternodes__(getnodes) | |||
self.assertEqual((None, node1), next(gen1)) | |||
self.assertEqual((None, node2), next(gen2)) | |||
self.assertEqual((node1.name, node1n1), next(gen1)) | |||
self.assertEqual((node2.name, node2n1), next(gen2)) | |||
self.assertEqual((node2.default, node2n2), next(gen2)) | |||
self.assertEqual((node2.default, node2n3), next(gen2)) | |||
self.assertRaises(StopIteration, next, gen1) | |||
self.assertRaises(StopIteration, next, gen2) | |||
def test_strip(self): | |||
"""test Argument.__strip__()""" | |||
node = Argument(wraptext("foobar")) | |||
node2 = Argument(wraptext("foo"), wraptext("bar")) | |||
for a in (True, False): | |||
for b in (True, False): | |||
self.assertIs(None, node.__strip__(a, b)) | |||
self.assertEqual("bar", node2.__strip__(a, b)) | |||
def test_showtree(self): | |||
"""test Argument.__showtree__()""" | |||
output = [] | |||
getter, marker = object(), object() | |||
get = lambda code: output.append((getter, code)) | |||
mark = lambda: output.append(marker) | |||
node1 = Argument(wraptext("foobar")) | |||
node2 = Argument(wraptext("foo"), wraptext("bar")) | |||
node1.__showtree__(output.append, get, mark) | |||
node2.__showtree__(output.append, get, mark) | |||
valid = [ | |||
"{{{", (getter, node1.name), "}}}", "{{{", (getter, node2.name), | |||
" | ", marker, (getter, node2.default), "}}}"] | |||
self.assertEqual(valid, output) | |||
def test_name(self): | |||
"""test getter/setter for the name attribute""" | |||
name = wraptext("foobar") | |||
node1 = Argument(name) | |||
node2 = Argument(name, wraptext("baz")) | |||
self.assertIs(name, node1.name) | |||
self.assertIs(name, node2.name) | |||
node1.name = "héhehé" | |||
node2.name = "héhehé" | |||
self.assertWikicodeEqual(wraptext("héhehé"), node1.name) | |||
self.assertWikicodeEqual(wraptext("héhehé"), node2.name) | |||
def test_default(self): | |||
"""test getter/setter for the default attribute""" | |||
default = wraptext("baz") | |||
node1 = Argument(wraptext("foobar")) | |||
node2 = Argument(wraptext("foobar"), default) | |||
self.assertIs(None, node1.default) | |||
self.assertIs(default, node2.default) | |||
node1.default = "buzz" | |||
node2.default = None | |||
self.assertWikicodeEqual(wraptext("buzz"), node1.default) | |||
self.assertIs(None, node2.default) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
@@ -0,0 +1,247 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.nodes import (Argument, Comment, Heading, HTMLEntity, | |||
Tag, Template, Text, Wikilink) | |||
from mwparserfromhell.nodes.extras import Attribute, Parameter | |||
from mwparserfromhell.parser import tokens | |||
from mwparserfromhell.parser.builder import Builder | |||
from ._test_tree_equality import TreeEqualityTestCase, wrap, wraptext | |||
class TestBuilder(TreeEqualityTestCase): | |||
"""Tests for the builder, which turns tokens into Wikicode objects.""" | |||
def setUp(self): | |||
self.builder = Builder() | |||
def test_text(self): | |||
"""tests for building Text nodes""" | |||
tests = [ | |||
([tokens.Text(text="foobar")], wraptext("foobar")), | |||
([tokens.Text(text="fóóbar")], wraptext("fóóbar")), | |||
([tokens.Text(text="spam"), tokens.Text(text="eggs")], | |||
wraptext("spam", "eggs")), | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_template(self): | |||
"""tests for building Template nodes""" | |||
tests = [ | |||
([tokens.TemplateOpen(), tokens.Text(text="foobar"), | |||
tokens.TemplateClose()], | |||
wrap([Template(wraptext("foobar"))])), | |||
([tokens.TemplateOpen(), tokens.Text(text="spam"), | |||
tokens.Text(text="eggs"), tokens.TemplateClose()], | |||
wrap([Template(wraptext("spam", "eggs"))])), | |||
([tokens.TemplateOpen(), tokens.Text(text="foo"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="bar"), | |||
tokens.TemplateClose()], | |||
wrap([Template(wraptext("foo"), params=[ | |||
Parameter(wraptext("1"), wraptext("bar"), showkey=False)])])), | |||
([tokens.TemplateOpen(), tokens.Text(text="foo"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="bar"), | |||
tokens.TemplateParamEquals(), tokens.Text(text="baz"), | |||
tokens.TemplateClose()], | |||
wrap([Template(wraptext("foo"), params=[ | |||
Parameter(wraptext("bar"), wraptext("baz"))])])), | |||
([tokens.TemplateOpen(), tokens.Text(text="foo"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="bar"), | |||
tokens.TemplateParamEquals(), tokens.Text(text="baz"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="biz"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="buzz"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="3"), | |||
tokens.TemplateParamEquals(), tokens.Text(text="buff"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="baff"), | |||
tokens.TemplateClose()], | |||
wrap([Template(wraptext("foo"), params=[ | |||
Parameter(wraptext("bar"), wraptext("baz")), | |||
Parameter(wraptext("1"), wraptext("biz"), showkey=False), | |||
Parameter(wraptext("2"), wraptext("buzz"), showkey=False), | |||
Parameter(wraptext("3"), wraptext("buff")), | |||
Parameter(wraptext("3"), wraptext("baff"), | |||
showkey=False)])])), | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_argument(self): | |||
"""tests for building Argument nodes""" | |||
tests = [ | |||
([tokens.ArgumentOpen(), tokens.Text(text="foobar"), | |||
tokens.ArgumentClose()], | |||
wrap([Argument(wraptext("foobar"))])), | |||
([tokens.ArgumentOpen(), tokens.Text(text="spam"), | |||
tokens.Text(text="eggs"), tokens.ArgumentClose()], | |||
wrap([Argument(wraptext("spam", "eggs"))])), | |||
([tokens.ArgumentOpen(), tokens.Text(text="foo"), | |||
tokens.ArgumentSeparator(), tokens.Text(text="bar"), | |||
tokens.ArgumentClose()], | |||
wrap([Argument(wraptext("foo"), wraptext("bar"))])), | |||
([tokens.ArgumentOpen(), tokens.Text(text="foo"), | |||
tokens.Text(text="bar"), tokens.ArgumentSeparator(), | |||
tokens.Text(text="baz"), tokens.Text(text="biz"), | |||
tokens.ArgumentClose()], | |||
wrap([Argument(wraptext("foo", "bar"), wraptext("baz", "biz"))])), | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_wikilink(self): | |||
"""tests for building Wikilink nodes""" | |||
tests = [ | |||
([tokens.WikilinkOpen(), tokens.Text(text="foobar"), | |||
tokens.WikilinkClose()], | |||
wrap([Wikilink(wraptext("foobar"))])), | |||
([tokens.WikilinkOpen(), tokens.Text(text="spam"), | |||
tokens.Text(text="eggs"), tokens.WikilinkClose()], | |||
wrap([Wikilink(wraptext("spam", "eggs"))])), | |||
([tokens.WikilinkOpen(), tokens.Text(text="foo"), | |||
tokens.WikilinkSeparator(), tokens.Text(text="bar"), | |||
tokens.WikilinkClose()], | |||
wrap([Wikilink(wraptext("foo"), wraptext("bar"))])), | |||
([tokens.WikilinkOpen(), tokens.Text(text="foo"), | |||
tokens.Text(text="bar"), tokens.WikilinkSeparator(), | |||
tokens.Text(text="baz"), tokens.Text(text="biz"), | |||
tokens.WikilinkClose()], | |||
wrap([Wikilink(wraptext("foo", "bar"), wraptext("baz", "biz"))])), | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_html_entity(self): | |||
"""tests for building HTMLEntity nodes""" | |||
tests = [ | |||
([tokens.HTMLEntityStart(), tokens.Text(text="nbsp"), | |||
tokens.HTMLEntityEnd()], | |||
wrap([HTMLEntity("nbsp", named=True, hexadecimal=False)])), | |||
([tokens.HTMLEntityStart(), tokens.HTMLEntityNumeric(), | |||
tokens.Text(text="107"), tokens.HTMLEntityEnd()], | |||
wrap([HTMLEntity("107", named=False, hexadecimal=False)])), | |||
([tokens.HTMLEntityStart(), tokens.HTMLEntityNumeric(), | |||
tokens.HTMLEntityHex(char="X"), tokens.Text(text="6B"), | |||
tokens.HTMLEntityEnd()], | |||
wrap([HTMLEntity("6B", named=False, hexadecimal=True, | |||
hex_char="X")])), | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_heading(self): | |||
"""tests for building Heading nodes""" | |||
tests = [ | |||
([tokens.HeadingStart(level=2), tokens.Text(text="foobar"), | |||
tokens.HeadingEnd()], | |||
wrap([Heading(wraptext("foobar"), 2)])), | |||
([tokens.HeadingStart(level=4), tokens.Text(text="spam"), | |||
tokens.Text(text="eggs"), tokens.HeadingEnd()], | |||
wrap([Heading(wraptext("spam", "eggs"), 4)])), | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_comment(self): | |||
"""tests for building Comment nodes""" | |||
tests = [ | |||
([tokens.CommentStart(), tokens.Text(text="foobar"), | |||
tokens.CommentEnd()], | |||
wrap([Comment(wraptext("foobar"))])), | |||
([tokens.CommentStart(), tokens.Text(text="spam"), | |||
tokens.Text(text="eggs"), tokens.CommentEnd()], | |||
wrap([Comment(wraptext("spam", "eggs"))])), | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_integration(self): | |||
"""a test for building a combination of templates together""" | |||
# {{{{{{{{foo}}bar|baz=biz}}buzz}}usr|{{bin}}}} | |||
test = [tokens.TemplateOpen(), tokens.TemplateOpen(), | |||
tokens.TemplateOpen(), tokens.TemplateOpen(), | |||
tokens.Text(text="foo"), tokens.TemplateClose(), | |||
tokens.Text(text="bar"), tokens.TemplateParamSeparator(), | |||
tokens.Text(text="baz"), tokens.TemplateParamEquals(), | |||
tokens.Text(text="biz"), tokens.TemplateClose(), | |||
tokens.Text(text="buzz"), tokens.TemplateClose(), | |||
tokens.Text(text="usr"), tokens.TemplateParamSeparator(), | |||
tokens.TemplateOpen(), tokens.Text(text="bin"), | |||
tokens.TemplateClose(), tokens.TemplateClose()] | |||
valid = wrap( | |||
[Template(wrap([Template(wrap([Template(wrap([Template(wraptext( | |||
"foo")), Text("bar")]), params=[Parameter(wraptext("baz"), | |||
wraptext("biz"))]), Text("buzz")])), Text("usr")]), params=[ | |||
Parameter(wraptext("1"), wrap([Template(wraptext("bin"))]), | |||
showkey=False)])]) | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
def test_integration2(self): | |||
"""an even more audacious test for building a horrible wikicode mess""" | |||
# {{a|b|{{c|[[d]]{{{e}}}}}}}[[f|{{{g}}}<!--h-->]]{{i|j= }} | |||
test = [tokens.TemplateOpen(), tokens.Text(text="a"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="b"), | |||
tokens.TemplateParamSeparator(), tokens.TemplateOpen(), | |||
tokens.Text(text="c"), tokens.TemplateParamSeparator(), | |||
tokens.WikilinkOpen(), tokens.Text(text="d"), | |||
tokens.WikilinkClose(), tokens.ArgumentOpen(), | |||
tokens.Text(text="e"), tokens.ArgumentClose(), | |||
tokens.TemplateClose(), tokens.TemplateClose(), | |||
tokens.WikilinkOpen(), tokens.Text(text="f"), | |||
tokens.WikilinkSeparator(), tokens.ArgumentOpen(), | |||
tokens.Text(text="g"), tokens.ArgumentClose(), | |||
tokens.CommentStart(), tokens.Text(text="h"), | |||
tokens.CommentEnd(), tokens.WikilinkClose(), | |||
tokens.TemplateOpen(), tokens.Text(text="i"), | |||
tokens.TemplateParamSeparator(), tokens.Text(text="j"), | |||
tokens.TemplateParamEquals(), tokens.HTMLEntityStart(), | |||
tokens.Text(text="nbsp"), tokens.HTMLEntityEnd(), | |||
tokens.TemplateClose()] | |||
valid = wrap( | |||
[Template(wraptext("a"), params=[Parameter(wraptext("1"), wraptext( | |||
"b"), showkey=False), Parameter(wraptext("2"), wrap([Template( | |||
wraptext("c"), params=[Parameter(wraptext("1"), wrap([Wikilink( | |||
wraptext("d")), Argument(wraptext("e"))]), showkey=False)])]), | |||
showkey=False)]), Wikilink(wraptext("f"), wrap([Argument(wraptext( | |||
"g")), Comment(wraptext("h"))])), Template(wraptext("i"), params=[ | |||
Parameter(wraptext("j"), wrap([HTMLEntity("nbsp", | |||
named=True)]))])]) | |||
self.assertWikicodeEqual(valid, self.builder.build(test)) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
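The token lists above are built by hand, so the same round trip is easy to reproduce outside the test harness. A minimal sketch using only the Builder API and token types these cases exercise (the printed results follow from what the tests assert):

# Build {{foo|bar}} from raw tokens and render it back to wikitext.
from __future__ import print_function, unicode_literals

from mwparserfromhell.parser import tokens
from mwparserfromhell.parser.builder import Builder

toks = [tokens.TemplateOpen(), tokens.Text(text="foo"),
        tokens.TemplateParamSeparator(), tokens.Text(text="bar"),
        tokens.TemplateClose()]
code = Builder().build(toks)             # a Wikicode tree, as in TestBuilder
print(code)                              # -> {{foo|bar}}
print(code.filter_templates()[0].name)   # -> foo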
@@ -0,0 +1,68 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import Comment | |||
from ._test_tree_equality import TreeEqualityTestCase | |||
class TestComment(TreeEqualityTestCase): | |||
"""Test cases for the Comment node.""" | |||
def test_unicode(self): | |||
"""test Comment.__unicode__()""" | |||
node = Comment("foobar") | |||
self.assertEqual("<!--foobar-->", str(node)) | |||
def test_iternodes(self): | |||
"""test Comment.__iternodes__()""" | |||
node = Comment("foobar") | |||
gen = node.__iternodes__(None) | |||
self.assertEqual((None, node), next(gen)) | |||
self.assertRaises(StopIteration, next, gen) | |||
def test_strip(self): | |||
"""test Comment.__strip__()""" | |||
node = Comment("foobar") | |||
for a in (True, False): | |||
for b in (True, False): | |||
self.assertIs(None, node.__strip__(a, b)) | |||
def test_showtree(self): | |||
"""test Comment.__showtree__()""" | |||
output = [] | |||
node = Comment("foobar") | |||
node.__showtree__(output.append, None, None) | |||
self.assertEqual(["<!--foobar-->"], output) | |||
def test_contents(self): | |||
"""test getter/setter for the contents attribute""" | |||
node = Comment("foobar") | |||
self.assertEqual("foobar", node.contents) | |||
node.contents = "barfoo" | |||
self.assertEqual("barfoo", node.contents) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
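A quick usage sketch of the behaviour these cases cover, assuming nothing beyond the Comment API they exercise:

# A Comment node renders as an HTML comment and its contents can be replaced.
from __future__ import print_function, unicode_literals

from mwparserfromhell.nodes import Comment

node = Comment("foobar")
print(node)              # -> <!--foobar-->
node.contents = "barfoo"
print(node)              # -> <!--barfoo-->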
@@ -0,0 +1,48 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
try: | |||
from mwparserfromhell.parser._tokenizer import CTokenizer | |||
except ImportError: | |||
CTokenizer = None | |||
from ._test_tokenizer import TokenizerTestCase | |||
@unittest.skipUnless(CTokenizer, "C tokenizer not available") | |||
class TestCTokenizer(TokenizerTestCase, unittest.TestCase): | |||
"""Test cases for the C tokenizer.""" | |||
@classmethod | |||
def setUpClass(cls): | |||
cls.tokenizer = CTokenizer | |||
if not TokenizerTestCase.skip_others: | |||
def test_uses_c(self): | |||
"""make sure the C tokenizer identifies as using a C extension""" | |||
self.assertTrue(CTokenizer.USES_C) | |||
self.assertTrue(CTokenizer().USES_C) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
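The import guard above doubles as the usual way to probe for the C extension at runtime; a small sketch of that check, assuming only the optional _tokenizer module and the USES_C flag shown in these tests:

# Fall back gracefully when the C extension was not compiled.
from __future__ import print_function, unicode_literals

try:
    from mwparserfromhell.parser._tokenizer import CTokenizer
except ImportError:
    CTokenizer = None

if CTokenizer is not None:
    print("C tokenizer available; USES_C =", CTokenizer.USES_C)
else:
    print("C tokenizer not built; the pure-Python tokenizer is used instead")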
@@ -0,0 +1,131 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import print_function, unicode_literals | |||
import json | |||
import unittest | |||
import mwparserfromhell | |||
from mwparserfromhell.compat import py3k, str | |||
from .compat import StringIO, urlencode, urlopen | |||
class TestDocs(unittest.TestCase): | |||
"""Integration test cases for mwparserfromhell's documentation.""" | |||
def assertPrint(self, input, output): | |||
"""Assertion check that *input*, when printed, produces *output*.""" | |||
buff = StringIO() | |||
print(input, end="", file=buff) | |||
buff.seek(0) | |||
self.assertEqual(output, buff.read()) | |||
def test_readme_1(self): | |||
"""test a block of example code in the README""" | |||
text = "I has a template! {{foo|bar|baz|eggs=spam}} See it?" | |||
wikicode = mwparserfromhell.parse(text) | |||
self.assertPrint(wikicode, | |||
"I has a template! {{foo|bar|baz|eggs=spam}} See it?") | |||
templates = wikicode.filter_templates() | |||
if py3k: | |||
self.assertPrint(templates, "['{{foo|bar|baz|eggs=spam}}']") | |||
else: | |||
self.assertPrint(templates, "[u'{{foo|bar|baz|eggs=spam}}']") | |||
template = templates[0] | |||
self.assertPrint(template.name, "foo") | |||
if py3k: | |||
self.assertPrint(template.params, "['bar', 'baz', 'eggs=spam']") | |||
else: | |||
self.assertPrint(template.params, "[u'bar', u'baz', u'eggs=spam']") | |||
self.assertPrint(template.get(1).value, "bar") | |||
self.assertPrint(template.get("eggs").value, "spam") | |||
def test_readme_2(self): | |||
"""test a block of example code in the README""" | |||
code = mwparserfromhell.parse("{{foo|this {{includes a|template}}}}") | |||
if py3k: | |||
self.assertPrint(code.filter_templates(), | |||
"['{{foo|this {{includes a|template}}}}']") | |||
else: | |||
self.assertPrint(code.filter_templates(), | |||
"[u'{{foo|this {{includes a|template}}}}']") | |||
foo = code.filter_templates()[0] | |||
self.assertPrint(foo.get(1).value, "this {{includes a|template}}") | |||
self.assertPrint(foo.get(1).value.filter_templates()[0], | |||
"{{includes a|template}}") | |||
self.assertPrint(foo.get(1).value.filter_templates()[0].get(1).value, | |||
"template") | |||
def test_readme_3(self): | |||
"""test a block of example code in the README""" | |||
text = "{{foo|{{bar}}={{baz|{{spam}}}}}}" | |||
temps = mwparserfromhell.parse(text).filter_templates(recursive=True) | |||
if py3k: | |||
res = "['{{foo|{{bar}}={{baz|{{spam}}}}}}', '{{bar}}', '{{baz|{{spam}}}}', '{{spam}}']" | |||
else: | |||
res = "[u'{{foo|{{bar}}={{baz|{{spam}}}}}}', u'{{bar}}', u'{{baz|{{spam}}}}', u'{{spam}}']" | |||
self.assertPrint(temps, res) | |||
def test_readme_4(self): | |||
"""test a block of example code in the README""" | |||
text = "{{cleanup}} '''Foo''' is a [[bar]]. {{uncategorized}}" | |||
code = mwparserfromhell.parse(text) | |||
for template in code.filter_templates(): | |||
if template.name == "cleanup" and not template.has_param("date"): | |||
template.add("date", "July 2012") | |||
res = "{{cleanup|date=July 2012}} '''Foo''' is a [[bar]]. {{uncategorized}}" | |||
self.assertPrint(code, res) | |||
code.replace("{{uncategorized}}", "{{bar-stub}}") | |||
res = "{{cleanup|date=July 2012}} '''Foo''' is a [[bar]]. {{bar-stub}}" | |||
self.assertPrint(code, res) | |||
if py3k: | |||
res = "['{{cleanup|date=July 2012}}', '{{bar-stub}}']" | |||
else: | |||
res = "[u'{{cleanup|date=July 2012}}', u'{{bar-stub}}']" | |||
self.assertPrint(code.filter_templates(), res) | |||
text = str(code) | |||
res = "{{cleanup|date=July 2012}} '''Foo''' is a [[bar]]. {{bar-stub}}" | |||
self.assertPrint(text, res) | |||
self.assertEqual(text, code) | |||
def test_readme_5(self): | |||
"""test a block of example code in the README; includes a web call""" | |||
url1 = "http://en.wikipedia.org/w/api.php" | |||
url2 = "http://en.wikipedia.org/w/index.php?title={0}&action=raw" | |||
title = "Test" | |||
data = {"action": "query", "prop": "revisions", "rvlimit": 1, | |||
"rvprop": "content", "format": "json", "titles": title} | |||
try: | |||
raw = urlopen(url1, urlencode(data).encode("utf8")).read() | |||
except IOError: | |||
self.skipTest("cannot continue because of unsuccessful web call") | |||
res = json.loads(raw.decode("utf8")) | |||
text = list(res["query"]["pages"].values())[0]["revisions"][0]["*"] | |||
try: | |||
expected = urlopen(url2.format(title)).read().decode("utf8") | |||
except IOError: | |||
self.skipTest("cannot continue because of unsuccessful web call") | |||
actual = mwparserfromhell.parse(text) | |||
self.assertEqual(expected, actual) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
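test_readme_5 boils down to fetching a page's wikitext through the MediaWiki API and parsing it. A trimmed-down sketch of that flow; the URL and query payload come straight from the test, while the urllib imports use the standard library directly instead of the test suite's local compat shim:

# Fetch the current wikitext of a page and parse it.
from __future__ import print_function, unicode_literals
import json

import mwparserfromhell

try:                                   # Python 3
    from urllib.parse import urlencode
    from urllib.request import urlopen
except ImportError:                    # Python 2
    from urllib import urlencode, urlopen

API_URL = "http://en.wikipedia.org/w/api.php"
data = {"action": "query", "prop": "revisions", "rvlimit": 1,
        "rvprop": "content", "format": "json", "titles": "Test"}
raw = urlopen(API_URL, urlencode(data).encode("utf8")).read()
res = json.loads(raw.decode("utf8"))
text = list(res["query"]["pages"].values())[0]["revisions"][0]["*"]
print(mwparserfromhell.parse(text).filter_templates())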
@@ -0,0 +1,91 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import Heading, Text | |||
from ._test_tree_equality import TreeEqualityTestCase, getnodes, wrap, wraptext | |||
class TestHeading(TreeEqualityTestCase): | |||
"""Test cases for the Heading node.""" | |||
def test_unicode(self): | |||
"""test Heading.__unicode__()""" | |||
node = Heading(wraptext("foobar"), 2) | |||
self.assertEqual("==foobar==", str(node)) | |||
node2 = Heading(wraptext(" zzz "), 5) | |||
self.assertEqual("===== zzz =====", str(node2)) | |||
def test_iternodes(self): | |||
"""test Heading.__iternodes__()""" | |||
text1, text2 = Text("foo"), Text("bar") | |||
node = Heading(wrap([text1, text2]), 3) | |||
gen = node.__iternodes__(getnodes) | |||
self.assertEqual((None, node), next(gen)) | |||
self.assertEqual((node.title, text1), next(gen)) | |||
self.assertEqual((node.title, text2), next(gen)) | |||
self.assertRaises(StopIteration, next, gen) | |||
def test_strip(self): | |||
"""test Heading.__strip__()""" | |||
node = Heading(wraptext("foobar"), 3) | |||
for a in (True, False): | |||
for b in (True, False): | |||
self.assertEqual("foobar", node.__strip__(a, b)) | |||
def test_showtree(self): | |||
"""test Heading.__showtree__()""" | |||
output = [] | |||
getter = object() | |||
get = lambda code: output.append((getter, code)) | |||
node1 = Heading(wraptext("foobar"), 3) | |||
node2 = Heading(wraptext(" baz "), 4) | |||
node1.__showtree__(output.append, get, None) | |||
node2.__showtree__(output.append, get, None) | |||
valid = ["===", (getter, node1.title), "===", | |||
"====", (getter, node2.title), "===="] | |||
self.assertEqual(valid, output) | |||
def test_title(self): | |||
"""test getter/setter for the title attribute""" | |||
title = wraptext("foobar") | |||
node = Heading(title, 3) | |||
self.assertIs(title, node.title) | |||
node.title = "héhehé" | |||
self.assertWikicodeEqual(wraptext("héhehé"), node.title) | |||
def test_level(self): | |||
"""test getter/setter for the level attribute""" | |||
node = Heading(wraptext("foobar"), 3) | |||
self.assertEqual(3, node.level) | |||
node.level = 5 | |||
self.assertEqual(5, node.level) | |||
self.assertRaises(ValueError, setattr, node, "level", 0) | |||
self.assertRaises(ValueError, setattr, node, "level", 7) | |||
self.assertRaises(ValueError, setattr, node, "level", "abc") | |||
self.assertRaises(ValueError, setattr, node, "level", False) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
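A usage sketch of the Heading behaviour checked above. The ValueError cases suggest only levels 1 through 6 are accepted, which is an inference from the tests rather than documented behaviour, and filter_headings() is assumed here by analogy with the filter_templates() calls used elsewhere in this suite:

# The level controls how many equals signs wrap the title.
from __future__ import print_function, unicode_literals

import mwparserfromhell

code = mwparserfromhell.parse("== foobar ==")
heading = code.filter_headings()[0]
print(heading.level)    # -> 2
heading.level = 3
print(code)             # -> === foobar ===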
@@ -0,0 +1,169 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import HTMLEntity | |||
from ._test_tree_equality import TreeEqualityTestCase, wrap | |||
class TestHTMLEntity(TreeEqualityTestCase): | |||
"""Test cases for the HTMLEntity node.""" | |||
def test_unicode(self): | |||
"""test HTMLEntity.__unicode__()""" | |||
node1 = HTMLEntity("nbsp", named=True, hexadecimal=False) | |||
node2 = HTMLEntity("107", named=False, hexadecimal=False) | |||
node3 = HTMLEntity("6b", named=False, hexadecimal=True) | |||
node4 = HTMLEntity("6C", named=False, hexadecimal=True, hex_char="X") | |||
self.assertEqual(" ", str(node1)) | |||
self.assertEqual("k", str(node2)) | |||
self.assertEqual("k", str(node3)) | |||
self.assertEqual("l", str(node4)) | |||
def test_iternodes(self): | |||
"""test HTMLEntity.__iternodes__()""" | |||
node = HTMLEntity("nbsp", named=True, hexadecimal=False) | |||
gen = node.__iternodes__(None) | |||
self.assertEqual((None, node), next(gen)) | |||
self.assertRaises(StopIteration, next, gen) | |||
def test_strip(self): | |||
"""test HTMLEntity.__strip__()""" | |||
node1 = HTMLEntity("nbsp", named=True, hexadecimal=False) | |||
node2 = HTMLEntity("107", named=False, hexadecimal=False) | |||
node3 = HTMLEntity("e9", named=False, hexadecimal=True) | |||
for a in (True, False): | |||
self.assertEqual("\xa0", node1.__strip__(True, a)) | |||
self.assertEqual(" ", node1.__strip__(False, a)) | |||
self.assertEqual("k", node2.__strip__(True, a)) | |||
self.assertEqual("k", node2.__strip__(False, a)) | |||
self.assertEqual("é", node3.__strip__(True, a)) | |||
self.assertEqual("é", node3.__strip__(False, a)) | |||
def test_showtree(self): | |||
"""test HTMLEntity.__showtree__()""" | |||
output = [] | |||
node1 = HTMLEntity("nbsp", named=True, hexadecimal=False) | |||
node2 = HTMLEntity("107", named=False, hexadecimal=False) | |||
node3 = HTMLEntity("e9", named=False, hexadecimal=True) | |||
node1.__showtree__(output.append, None, None) | |||
node2.__showtree__(output.append, None, None) | |||
node3.__showtree__(output.append, None, None) | |||
res = [" ", "k", "é"] | |||
self.assertEqual(res, output) | |||
def test_value(self): | |||
"""test getter/setter for the value attribute""" | |||
node1 = HTMLEntity("nbsp") | |||
node2 = HTMLEntity("107") | |||
node3 = HTMLEntity("e9") | |||
self.assertEqual("nbsp", node1.value) | |||
self.assertEqual("107", node2.value) | |||
self.assertEqual("e9", node3.value) | |||
node1.value = "ffa4" | |||
node2.value = 72 | |||
node3.value = "Sigma" | |||
self.assertEqual("ffa4", node1.value) | |||
self.assertFalse(node1.named) | |||
self.assertTrue(node1.hexadecimal) | |||
self.assertEqual("72", node2.value) | |||
self.assertFalse(node2.named) | |||
self.assertFalse(node2.hexadecimal) | |||
self.assertEqual("Sigma", node3.value) | |||
self.assertTrue(node3.named) | |||
self.assertFalse(node3.hexadecimal) | |||
node1.value = "10FFFF" | |||
node2.value = 110000 | |||
node2.value = 1114111 | |||
self.assertRaises(ValueError, setattr, node3, "value", "") | |||
self.assertRaises(ValueError, setattr, node3, "value", "foobar") | |||
self.assertRaises(ValueError, setattr, node3, "value", True) | |||
self.assertRaises(ValueError, setattr, node3, "value", -1) | |||
self.assertRaises(ValueError, setattr, node1, "value", 110000) | |||
self.assertRaises(ValueError, setattr, node1, "value", "1114112") | |||
def test_named(self): | |||
"""test getter/setter for the named attribute""" | |||
node1 = HTMLEntity("nbsp") | |||
node2 = HTMLEntity("107") | |||
node3 = HTMLEntity("e9") | |||
self.assertTrue(node1.named) | |||
self.assertFalse(node2.named) | |||
self.assertFalse(node3.named) | |||
node1.named = 1 | |||
node2.named = 0 | |||
node3.named = 0 | |||
self.assertTrue(node1.named) | |||
self.assertFalse(node2.named) | |||
self.assertFalse(node3.named) | |||
self.assertRaises(ValueError, setattr, node1, "named", False) | |||
self.assertRaises(ValueError, setattr, node2, "named", True) | |||
self.assertRaises(ValueError, setattr, node3, "named", True) | |||
def test_hexadecimal(self): | |||
"""test getter/setter for the hexadecimal attribute""" | |||
node1 = HTMLEntity("nbsp") | |||
node2 = HTMLEntity("107") | |||
node3 = HTMLEntity("e9") | |||
self.assertFalse(node1.hexadecimal) | |||
self.assertFalse(node2.hexadecimal) | |||
self.assertTrue(node3.hexadecimal) | |||
node1.hexadecimal = False | |||
node2.hexadecimal = True | |||
node3.hexadecimal = False | |||
self.assertFalse(node1.hexadecimal) | |||
self.assertTrue(node2.hexadecimal) | |||
self.assertFalse(node3.hexadecimal) | |||
self.assertRaises(ValueError, setattr, node1, "hexadecimal", True) | |||
def test_hex_char(self): | |||
"""test getter/setter for the hex_char attribute""" | |||
node1 = HTMLEntity("e9") | |||
node2 = HTMLEntity("e9", hex_char="X") | |||
self.assertEqual("x", node1.hex_char) | |||
self.assertEqual("X", node2.hex_char) | |||
node1.hex_char = "X" | |||
node2.hex_char = "x" | |||
self.assertEqual("X", node1.hex_char) | |||
self.assertEqual("x", node2.hex_char) | |||
self.assertRaises(ValueError, setattr, node1, "hex_char", 123) | |||
self.assertRaises(ValueError, setattr, node1, "hex_char", "foobar") | |||
self.assertRaises(ValueError, setattr, node1, "hex_char", True) | |||
def test_normalize(self): | |||
"""test getter/setter for the normalize attribute""" | |||
node1 = HTMLEntity("nbsp") | |||
node2 = HTMLEntity("107") | |||
node3 = HTMLEntity("e9") | |||
node4 = HTMLEntity("1f648") | |||
self.assertEqual("\xa0", node1.normalize()) | |||
self.assertEqual("k", node2.normalize()) | |||
self.assertEqual("é", node3.normalize()) | |||
self.assertEqual("\U0001F648", node4.normalize()) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
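A sketch of the three entity forms these cases cover (named, decimal, and hexadecimal), using only the constructor arguments and the normalize() call exercised above:

from __future__ import print_function, unicode_literals

from mwparserfromhell.nodes import HTMLEntity

named = HTMLEntity("nbsp", named=True, hexadecimal=False)
decimal = HTMLEntity("107", named=False, hexadecimal=False)
hexa = HTMLEntity("6b", named=False, hexadecimal=True)

print(named, decimal, hexa)                    # -> &nbsp; &#107; &#x6b;
print(decimal.normalize(), hexa.normalize())   # both resolve to the letter k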
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -20,100 +20,56 @@ | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.parameter import Parameter | |||
from mwparserfromhell.template import Template | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import Text | |||
from mwparserfromhell.nodes.extras import Parameter | |||
class TestParameter(unittest.TestCase): | |||
def setUp(self): | |||
self.name = "foo" | |||
self.value1 = "bar" | |||
self.value2 = "{{spam}}" | |||
self.value3 = "bar{{spam}}" | |||
self.value4 = "embedded {{eggs|spam|baz=buz}} {{goes}} here" | |||
self.templates2 = [Template("spam")] | |||
self.templates3 = [Template("spam")] | |||
self.templates4 = [Template("eggs", [Parameter("1", "spam"), | |||
Parameter("baz", "buz")]), | |||
Template("goes")] | |||
from ._test_tree_equality import TreeEqualityTestCase, wrap, wraptext | |||
def test_construct(self): | |||
Parameter(self.name, self.value1) | |||
Parameter(self.name, self.value2, self.templates2) | |||
Parameter(name=self.name, value=self.value3) | |||
Parameter(name=self.name, value=self.value4, templates=self.templates4) | |||
class TestParameter(TreeEqualityTestCase): | |||
"""Test cases for the Parameter node extra.""" | |||
def test_unicode(self): | |||
"""test Parameter.__unicode__()""" | |||
node = Parameter(wraptext("1"), wraptext("foo"), showkey=False) | |||
self.assertEqual("foo", str(node)) | |||
node2 = Parameter(wraptext("foo"), wraptext("bar")) | |||
self.assertEqual("foo=bar", str(node2)) | |||
def test_name(self): | |||
params = [ | |||
Parameter(self.name, self.value1), | |||
Parameter(self.name, self.value2, self.templates2), | |||
Parameter(name=self.name, value=self.value3), | |||
Parameter(name=self.name, value=self.value4, | |||
templates=self.templates4) | |||
] | |||
for param in params: | |||
self.assertEqual(param.name, self.name) | |||
"""test getter/setter for the name attribute""" | |||
name1 = wraptext("1") | |||
name2 = wraptext("foobar") | |||
node1 = Parameter(name1, wraptext("foobar"), showkey=False) | |||
node2 = Parameter(name2, wraptext("baz")) | |||
self.assertIs(name1, node1.name) | |||
self.assertIs(name2, node2.name) | |||
node1.name = "héhehé" | |||
node2.name = "héhehé" | |||
self.assertWikicodeEqual(wraptext("héhehé"), node1.name) | |||
self.assertWikicodeEqual(wraptext("héhehé"), node2.name) | |||
def test_value(self): | |||
tests = [ | |||
(Parameter(self.name, self.value1), self.value1), | |||
(Parameter(self.name, self.value2, self.templates2), self.value2), | |||
(Parameter(name=self.name, value=self.value3), self.value3), | |||
(Parameter(name=self.name, value=self.value4, | |||
templates=self.templates4), self.value4) | |||
] | |||
for param, correct in tests: | |||
self.assertEqual(param.value, correct) | |||
def test_templates(self): | |||
tests = [ | |||
(Parameter(self.name, self.value3, self.templates3), | |||
self.templates3), | |||
(Parameter(name=self.name, value=self.value4, | |||
templates=self.templates4), self.templates4) | |||
] | |||
for param, correct in tests: | |||
self.assertEqual(param.templates, correct) | |||
def test_magic(self): | |||
params = [Parameter(self.name, self.value1), | |||
Parameter(self.name, self.value2, self.templates2), | |||
Parameter(self.name, self.value3, self.templates3), | |||
Parameter(self.name, self.value4, self.templates4)] | |||
for param in params: | |||
self.assertEqual(repr(param), repr(param.value)) | |||
self.assertEqual(str(param), str(param.value)) | |||
self.assertIs(param < "eggs", param.value < "eggs") | |||
self.assertIs(param <= "bar{{spam}}", param.value <= "bar{{spam}}") | |||
self.assertIs(param == "bar", param.value == "bar") | |||
self.assertIs(param != "bar", param.value != "bar") | |||
self.assertIs(param > "eggs", param.value > "eggs") | |||
self.assertIs(param >= "bar{{spam}}", param.value >= "bar{{spam}}") | |||
self.assertEquals(bool(param), bool(param.value)) | |||
self.assertEquals(len(param), len(param.value)) | |||
self.assertEquals(list(param), list(param.value)) | |||
self.assertEquals(param[2], param.value[2]) | |||
self.assertEquals(list(reversed(param)), | |||
list(reversed(param.value))) | |||
self.assertIs("bar" in param, "bar" in param.value) | |||
self.assertEquals(param + "test", param.value + "test") | |||
self.assertEquals("test" + param, "test" + param.value) | |||
# add param | |||
# add template left | |||
# add template right | |||
self.assertEquals(param * 3, Parameter(param.name, param.value * 3, | |||
param.templates * 3)) | |||
self.assertEquals(3 * param, Parameter(param.name, 3 * param.value, | |||
3 * param.templates)) | |||
"""test getter/setter for the value attribute""" | |||
value = wraptext("bar") | |||
node = Parameter(wraptext("foo"), value) | |||
self.assertIs(value, node.value) | |||
node.value = "héhehé" | |||
self.assertWikicodeEqual(wraptext("héhehé"), node.value) | |||
# add param inplace | |||
# add template inplace | |||
# add str inplace | |||
# multiply int inplace | |||
self.assertIsInstance(param, Parameter) | |||
self.assertIsInstance(param.value, str) | |||
def test_showkey(self): | |||
"""test getter/setter for the showkey attribute""" | |||
node1 = Parameter(wraptext("1"), wraptext("foo"), showkey=False) | |||
node2 = Parameter(wraptext("foo"), wraptext("bar")) | |||
self.assertFalse(node1.showkey) | |||
self.assertTrue(node2.showkey) | |||
node1.showkey = True | |||
node2.showkey = "" | |||
self.assertTrue(node1.showkey) | |||
self.assertFalse(node2.showkey) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
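The two Parameter flavours tested above are easiest to see on a parsed template: keyword parameters render as name=value, positional ones (showkey=False) as the bare value. A sketch grounded in the same filter_templates() and params calls used in the documentation tests:

from __future__ import print_function, unicode_literals

import mwparserfromhell

template = mwparserfromhell.parse("{{foo|bar|baz=biz}}").filter_templates()[0]
positional, keyword = template.params
print(positional, positional.showkey)   # -> bar False
print(keyword, keyword.showkey)         # -> baz=biz True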
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -20,44 +20,47 @@ | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.parameter import Parameter | |||
from mwparserfromhell.parser import Parser | |||
from mwparserfromhell.template import Template | |||
from mwparserfromhell import parser | |||
from mwparserfromhell.nodes import Template, Text, Wikilink | |||
from mwparserfromhell.nodes.extras import Parameter | |||
TESTS = [ | |||
("", []), | |||
("abcdef ghijhk", []), | |||
("abc{this is not a template}def", []), | |||
("neither is {{this one}nor} {this one {despite}} containing braces", []), | |||
("this is an acceptable {{template}}", [Template("template")]), | |||
("{{multiple}}{{templates}}", [Template("multiple"), | |||
Template("templates")]), | |||
("multiple {{-}} templates {{+}}!", [Template("-"), Template("+")]), | |||
("{{{no templates here}}}", []), | |||
("{ {{templates here}}}", [Template("templates here")]), | |||
("{{{{I do not exist}}}}", []), | |||
("{{foo|bar|baz|eggs=spam}}", | |||
[Template("foo", [Parameter("1", "bar"), Parameter("2", "baz"), | |||
Parameter("eggs", "spam")])]), | |||
("{{abc def|ghi|jk=lmno|pqr|st=uv|wx|yz}}", | |||
[Template("abc def", [Parameter("1", "ghi"), Parameter("jk", "lmno"), | |||
Parameter("2", "pqr"), Parameter("st", "uv"), | |||
Parameter("3", "wx"), Parameter("4", "yz")])]), | |||
("{{this has a|{{template}}|inside of it}}", | |||
[Template("this has a", [Parameter("1", "{{template}}", | |||
[Template("template")]), | |||
Parameter("2", "inside of it")])]), | |||
("{{{{I exist}} }}", [Template("I exist", [] )]), | |||
("{{}}") | |||
] | |||
from ._test_tree_equality import TreeEqualityTestCase, wrap, wraptext | |||
from .compat import range | |||
class TestParser(unittest.TestCase): | |||
def test_parse(self): | |||
parser = Parser() | |||
for unparsed, parsed in TESTS: | |||
self.assertEqual(parser.parse(unparsed), parsed) | |||
class TestParser(TreeEqualityTestCase): | |||
"""Tests for the Parser class itself, which tokenizes and builds nodes.""" | |||
def test_use_c(self): | |||
"""make sure the correct tokenizer is used""" | |||
if parser.use_c: | |||
self.assertTrue(parser.Parser(None)._tokenizer.USES_C) | |||
parser.use_c = False | |||
self.assertFalse(parser.Parser(None)._tokenizer.USES_C) | |||
def test_parsing(self): | |||
"""integration test for parsing overall""" | |||
text = "this is text; {{this|is=a|template={{with|[[links]]|in}}it}}" | |||
expected = wrap([ | |||
Text("this is text; "), | |||
Template(wraptext("this"), [ | |||
Parameter(wraptext("is"), wraptext("a")), | |||
Parameter(wraptext("template"), wrap([ | |||
Template(wraptext("with"), [ | |||
Parameter(wraptext("1"), | |||
wrap([Wikilink(wraptext("links"))]), | |||
showkey=False), | |||
Parameter(wraptext("2"), | |||
wraptext("in"), showkey=False) | |||
]), | |||
Text("it") | |||
])) | |||
]) | |||
]) | |||
actual = parser.Parser(text).parse() | |||
self.assertWikicodeEqual(expected, actual) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
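A sketch of the switch test_use_c exercises: parser.use_c selects the C tokenizer when the extension is importable, and flipping it to False forces the pure-Python one for the same input. The _tokenizer attribute is private and is only inspected here because the test does the same:

from __future__ import print_function, unicode_literals

from mwparserfromhell import parser

text = "this is text; {{this|is=a|template={{with|[[links]]|in}}it}}"
print(parser.Parser(text)._tokenizer.USES_C)   # True only if the C extension loaded

parser.use_c = False                           # force the Python tokenizer
print(parser.Parser(text)._tokenizer.USES_C)   # -> False
print(parser.Parser(text).parse())             # round-trips the original text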
@@ -0,0 +1,44 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.parser.tokenizer import Tokenizer | |||
from ._test_tokenizer import TokenizerTestCase | |||
class TestPyTokenizer(TokenizerTestCase, unittest.TestCase): | |||
"""Test cases for the Python tokenizer.""" | |||
@classmethod | |||
def setUpClass(cls): | |||
cls.tokenizer = Tokenizer | |||
if not TokenizerTestCase.skip_others: | |||
def test_uses_c(self): | |||
"""make sure the Python tokenizer identifies as not using C""" | |||
self.assertFalse(Tokenizer.USES_C) | |||
self.assertFalse(Tokenizer().USES_C) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
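Both tokenizer test modules share one suite: TokenizerTestCase holds the tests and each subclass only chooses the implementation in setUpClass(). A self-contained sketch of that pattern with hypothetical stand-in names:

from __future__ import unicode_literals
import unittest

class SharedSuite(object):
    """Holds the tests; mixed into one concrete TestCase per implementation."""
    def test_implementation_is_callable(self):
        self.assertTrue(callable(self.tokenizer))

class TestImplementationA(SharedSuite, unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.tokenizer = list     # stand-in for Tokenizer or CTokenizer

if __name__ == "__main__":
    unittest.main(verbosity=2)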
@@ -0,0 +1,392 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import py3k | |||
from mwparserfromhell.smart_list import SmartList, _ListProxy | |||
from .compat import range | |||
class TestSmartList(unittest.TestCase): | |||
"""Test cases for the SmartList class and its child, _ListProxy.""" | |||
def _test_get_set_del_item(self, builder): | |||
"""Run tests on __get/set/delitem__ of a list built with *builder*.""" | |||
def assign(L, s1, s2, s3, val): | |||
L[s1:s2:s3] = val | |||
def delete(L, s1): | |||
del L[s1] | |||
list1 = builder([0, 1, 2, 3, "one", "two"]) | |||
list2 = builder(list(range(10))) | |||
self.assertEqual(1, list1[1]) | |||
self.assertEqual("one", list1[-2]) | |||
self.assertEqual([2, 3], list1[2:4]) | |||
self.assertRaises(IndexError, lambda: list1[6]) | |||
self.assertRaises(IndexError, lambda: list1[-7]) | |||
self.assertEqual([0, 1, 2], list1[:3]) | |||
self.assertEqual([0, 1, 2, 3, "one", "two"], list1[:]) | |||
self.assertEqual([3, "one", "two"], list1[3:]) | |||
self.assertEqual(["one", "two"], list1[-2:]) | |||
self.assertEqual([0, 1], list1[:-4]) | |||
self.assertEqual([], list1[6:]) | |||
self.assertEqual([], list1[4:2]) | |||
self.assertEqual([0, 2, "one"], list1[0:5:2]) | |||
self.assertEqual([0, 2], list1[0:-3:2]) | |||
self.assertEqual([0, 1, 2, 3, "one", "two"], list1[::]) | |||
self.assertEqual([2, 3, "one", "two"], list1[2::]) | |||
self.assertEqual([0, 1, 2, 3], list1[:4:]) | |||
self.assertEqual([2, 3], list1[2:4:]) | |||
self.assertEqual([0, 2, 4, 6, 8], list2[::2]) | |||
self.assertEqual([2, 5, 8], list2[2::3]) | |||
self.assertEqual([0, 3], list2[:6:3]) | |||
self.assertEqual([2, 5, 8], list2[-8:9:3]) | |||
self.assertEqual([], list2[100000:1000:-100]) | |||
list1[3] = 100 | |||
self.assertEqual(100, list1[3]) | |||
list1[-3] = 101 | |||
self.assertEqual([0, 1, 2, 101, "one", "two"], list1) | |||
list1[5:] = [6, 7, 8] | |||
self.assertEqual([6, 7, 8], list1[5:]) | |||
self.assertEqual([0, 1, 2, 101, "one", 6, 7, 8], list1) | |||
list1[2:4] = [-1, -2, -3, -4, -5] | |||
self.assertEqual([0, 1, -1, -2, -3, -4, -5, "one", 6, 7, 8], list1) | |||
list1[0:-3] = [99] | |||
self.assertEqual([99, 6, 7, 8], list1) | |||
list2[0:6:2] = [100, 102, 104] | |||
self.assertEqual([100, 1, 102, 3, 104, 5, 6, 7, 8, 9], list2) | |||
list2[::3] = [200, 203, 206, 209] | |||
self.assertEqual([200, 1, 102, 203, 104, 5, 206, 7, 8, 209], list2) | |||
list2[::] = range(7) | |||
self.assertEqual([0, 1, 2, 3, 4, 5, 6], list2) | |||
self.assertRaises(ValueError, assign, list2, 0, 5, 2, | |||
[100, 102, 104, 106]) | |||
del list2[2] | |||
self.assertEqual([0, 1, 3, 4, 5, 6], list2) | |||
del list2[-3] | |||
self.assertEqual([0, 1, 3, 5, 6], list2) | |||
self.assertRaises(IndexError, delete, list2, 100) | |||
self.assertRaises(IndexError, delete, list2, -6) | |||
list2[:] = range(10) | |||
del list2[3:6] | |||
self.assertEqual([0, 1, 2, 6, 7, 8, 9], list2) | |||
del list2[-2:] | |||
self.assertEqual([0, 1, 2, 6, 7], list2) | |||
del list2[:2] | |||
self.assertEqual([2, 6, 7], list2) | |||
list2[:] = range(10) | |||
del list2[2:8:2] | |||
self.assertEqual([0, 1, 3, 5, 7, 8, 9], list2) | |||
def _test_add_radd_iadd(self, builder): | |||
"""Run tests on __r/i/add__ of a list built with *builder*.""" | |||
list1 = builder(range(5)) | |||
list2 = builder(range(5, 10)) | |||
self.assertEqual([0, 1, 2, 3, 4, 5, 6], list1 + [5, 6]) | |||
self.assertEqual([0, 1, 2, 3, 4], list1) | |||
self.assertEqual(list(range(10)), list1 + list2) | |||
self.assertEqual([-2, -1, 0, 1, 2, 3, 4], [-2, -1] + list1) | |||
self.assertEqual([0, 1, 2, 3, 4], list1) | |||
list1 += ["foo", "bar", "baz"] | |||
self.assertEqual([0, 1, 2, 3, 4, "foo", "bar", "baz"], list1) | |||
def _test_other_magic_methods(self, builder): | |||
"""Run tests on other magic methods of a list built with *builder*.""" | |||
list1 = builder([0, 1, 2, 3, "one", "two"]) | |||
list2 = builder([]) | |||
list3 = builder([0, 2, 3, 4]) | |||
list4 = builder([0, 1, 2]) | |||
if py3k: | |||
self.assertEqual("[0, 1, 2, 3, 'one', 'two']", str(list1)) | |||
self.assertEqual(b"\x00\x01\x02", bytes(list4)) | |||
self.assertEqual("[0, 1, 2, 3, 'one', 'two']", repr(list1)) | |||
else: | |||
self.assertEqual("[0, 1, 2, 3, u'one', u'two']", unicode(list1)) | |||
self.assertEqual(b"[0, 1, 2, 3, u'one', u'two']", str(list1)) | |||
self.assertEqual(b"[0, 1, 2, 3, u'one', u'two']", repr(list1)) | |||
self.assertTrue(list1 < list3) | |||
self.assertTrue(list1 <= list3) | |||
self.assertFalse(list1 == list3) | |||
self.assertTrue(list1 != list3) | |||
self.assertFalse(list1 > list3) | |||
self.assertFalse(list1 >= list3) | |||
other1 = [0, 2, 3, 4] | |||
self.assertTrue(list1 < other1) | |||
self.assertTrue(list1 <= other1) | |||
self.assertFalse(list1 == other1) | |||
self.assertTrue(list1 != other1) | |||
self.assertFalse(list1 > other1) | |||
self.assertFalse(list1 >= other1) | |||
other2 = [0, 0, 1, 2] | |||
self.assertFalse(list1 < other2) | |||
self.assertFalse(list1 <= other2) | |||
self.assertFalse(list1 == other2) | |||
self.assertTrue(list1 != other2) | |||
self.assertTrue(list1 > other2) | |||
self.assertTrue(list1 >= other2) | |||
other3 = [0, 1, 2, 3, "one", "two"] | |||
self.assertFalse(list1 < other3) | |||
self.assertTrue(list1 <= other3) | |||
self.assertTrue(list1 == other3) | |||
self.assertFalse(list1 != other3) | |||
self.assertFalse(list1 > other3) | |||
self.assertTrue(list1 >= other3) | |||
self.assertTrue(bool(list1)) | |||
self.assertFalse(bool(list2)) | |||
self.assertEqual(6, len(list1)) | |||
self.assertEqual(0, len(list2)) | |||
out = [] | |||
for obj in list1: | |||
out.append(obj) | |||
self.assertEqual([0, 1, 2, 3, "one", "two"], out) | |||
out = [] | |||
for ch in list2: | |||
out.append(ch) | |||
self.assertEqual([], out) | |||
gen1 = iter(list1) | |||
out = [] | |||
for i in range(len(list1)): | |||
out.append(next(gen1)) | |||
self.assertRaises(StopIteration, next, gen1) | |||
self.assertEqual([0, 1, 2, 3, "one", "two"], out) | |||
gen2 = iter(list2) | |||
self.assertRaises(StopIteration, next, gen2) | |||
self.assertEqual(["two", "one", 3, 2, 1, 0], list(reversed(list1))) | |||
self.assertEqual([], list(reversed(list2))) | |||
self.assertTrue("one" in list1) | |||
self.assertTrue(3 in list1) | |||
self.assertFalse(10 in list1) | |||
self.assertFalse(0 in list2) | |||
self.assertEqual([], list2 * 5) | |||
self.assertEqual([], 5 * list2) | |||
self.assertEqual([0, 1, 2, 0, 1, 2, 0, 1, 2], list4 * 3) | |||
self.assertEqual([0, 1, 2, 0, 1, 2, 0, 1, 2], 3 * list4) | |||
list4 *= 2 | |||
self.assertEqual([0, 1, 2, 0, 1, 2], list4) | |||
def _test_list_methods(self, builder): | |||
"""Run tests on the public methods of a list built with *builder*.""" | |||
list1 = builder(range(5)) | |||
list2 = builder(["foo"]) | |||
list3 = builder([("a", 5), ("d", 2), ("b", 8), ("c", 3)]) | |||
list1.append(5) | |||
list1.append(1) | |||
list1.append(2) | |||
self.assertEqual([0, 1, 2, 3, 4, 5, 1, 2], list1) | |||
self.assertEqual(0, list1.count(6)) | |||
self.assertEqual(2, list1.count(1)) | |||
list1.extend(range(5, 8)) | |||
self.assertEqual([0, 1, 2, 3, 4, 5, 1, 2, 5, 6, 7], list1) | |||
self.assertEqual(1, list1.index(1)) | |||
self.assertEqual(6, list1.index(1, 3)) | |||
self.assertEqual(6, list1.index(1, 3, 7)) | |||
self.assertRaises(ValueError, list1.index, 1, 3, 5) | |||
list1.insert(0, -1) | |||
self.assertEqual([-1, 0, 1, 2, 3, 4, 5, 1, 2, 5, 6, 7], list1) | |||
list1.insert(-1, 6.5) | |||
self.assertEqual([-1, 0, 1, 2, 3, 4, 5, 1, 2, 5, 6, 6.5, 7], list1) | |||
list1.insert(13, 8) | |||
self.assertEqual([-1, 0, 1, 2, 3, 4, 5, 1, 2, 5, 6, 6.5, 7, 8], list1) | |||
self.assertEqual(8, list1.pop()) | |||
self.assertEqual(7, list1.pop()) | |||
self.assertEqual([-1, 0, 1, 2, 3, 4, 5, 1, 2, 5, 6, 6.5], list1) | |||
self.assertEqual(-1, list1.pop(0)) | |||
self.assertEqual(5, list1.pop(5)) | |||
self.assertEqual(6.5, list1.pop(-1)) | |||
self.assertEqual([0, 1, 2, 3, 4, 1, 2, 5, 6], list1) | |||
self.assertEqual("foo", list2.pop()) | |||
self.assertRaises(IndexError, list2.pop) | |||
self.assertEqual([], list2) | |||
list1.remove(6) | |||
self.assertEqual([0, 1, 2, 3, 4, 1, 2, 5], list1) | |||
list1.remove(1) | |||
self.assertEqual([0, 2, 3, 4, 1, 2, 5], list1) | |||
list1.remove(1) | |||
self.assertEqual([0, 2, 3, 4, 2, 5], list1) | |||
self.assertRaises(ValueError, list1.remove, 1) | |||
list1.reverse() | |||
self.assertEqual([5, 2, 4, 3, 2, 0], list1) | |||
list1.sort() | |||
self.assertEqual([0, 2, 2, 3, 4, 5], list1) | |||
list1.sort(reverse=True) | |||
self.assertEqual([5, 4, 3, 2, 2, 0], list1) | |||
if not py3k: | |||
func = lambda x, y: abs(3 - x) - abs(3 - y) # Distance from 3 | |||
list1.sort(cmp=func) | |||
self.assertEqual([3, 4, 2, 2, 5, 0], list1) | |||
list1.sort(cmp=func, reverse=True) | |||
self.assertEqual([0, 5, 4, 2, 2, 3], list1) | |||
list3.sort(key=lambda i: i[1]) | |||
self.assertEqual([("d", 2), ("c", 3), ("a", 5), ("b", 8)], list3) | |||
list3.sort(key=lambda i: i[1], reverse=True) | |||
self.assertEqual([("b", 8), ("a", 5), ("c", 3), ("d", 2)], list3) | |||
def test_docs(self): | |||
"""make sure the methods of SmartList/_ListProxy have docstrings""" | |||
methods = ["append", "count", "extend", "index", "insert", "pop", | |||
"remove", "reverse", "sort"] | |||
for meth in methods: | |||
expected = getattr(list, meth).__doc__ | |||
smartlist_doc = getattr(SmartList, meth).__doc__ | |||
listproxy_doc = getattr(_ListProxy, meth).__doc__ | |||
self.assertEqual(expected, smartlist_doc) | |||
self.assertEqual(expected, listproxy_doc) | |||
def test_doctest(self): | |||
"""make sure the test embedded in SmartList's docstring passes""" | |||
parent = SmartList([0, 1, 2, 3]) | |||
self.assertEqual([0, 1, 2, 3], parent) | |||
child = parent[2:] | |||
self.assertEqual([2, 3], child) | |||
child.append(4) | |||
self.assertEqual([2, 3, 4], child) | |||
self.assertEqual([0, 1, 2, 3, 4], parent) | |||
def test_parent_get_set_del(self): | |||
"""make sure SmartList's getitem/setitem/delitem work""" | |||
self._test_get_set_del_item(SmartList) | |||
def test_parent_add(self): | |||
"""make sure SmartList's add/radd/iadd work""" | |||
self._test_add_radd_iadd(SmartList) | |||
def test_parent_unaffected_magics(self): | |||
"""sanity checks against SmartList features that were not modified""" | |||
self._test_other_magic_methods(SmartList) | |||
def test_parent_methods(self): | |||
"""make sure SmartList's non-magic methods work, like append()""" | |||
self._test_list_methods(SmartList) | |||
def test_child_get_set_del(self): | |||
"""make sure _ListProxy's getitem/setitem/delitem work""" | |||
self._test_get_set_del_item(lambda L: SmartList(list(L))[:]) | |||
self._test_get_set_del_item(lambda L: SmartList([999] + list(L))[1:]) | |||
self._test_get_set_del_item(lambda L: SmartList(list(L) + [999])[:-1]) | |||
builder = lambda L: SmartList([101, 102] + list(L) + [201, 202])[2:-2] | |||
self._test_get_set_del_item(builder) | |||
def test_child_add(self): | |||
"""make sure _ListProxy's add/radd/iadd work""" | |||
self._test_add_radd_iadd(lambda L: SmartList(list(L))[:]) | |||
self._test_add_radd_iadd(lambda L: SmartList([999] + list(L))[1:]) | |||
self._test_add_radd_iadd(lambda L: SmartList(list(L) + [999])[:-1]) | |||
builder = lambda L: SmartList([101, 102] + list(L) + [201, 202])[2:-2] | |||
self._test_add_radd_iadd(builder) | |||
def test_child_other_magics(self): | |||
"""make sure _ListProxy's other magically implemented features work""" | |||
self._test_other_magic_methods(lambda L: SmartList(list(L))[:]) | |||
self._test_other_magic_methods(lambda L: SmartList([999] + list(L))[1:]) | |||
self._test_other_magic_methods(lambda L: SmartList(list(L) + [999])[:-1]) | |||
builder = lambda L: SmartList([101, 102] + list(L) + [201, 202])[2:-2] | |||
self._test_other_magic_methods(builder) | |||
def test_child_methods(self): | |||
"""make sure _ListProxy's non-magic methods work, like append()""" | |||
self._test_list_methods(lambda L: SmartList(list(L))[:]) | |||
self._test_list_methods(lambda L: SmartList([999] + list(L))[1:]) | |||
self._test_list_methods(lambda L: SmartList(list(L) + [999])[:-1]) | |||
builder = lambda L: SmartList([101, 102] + list(L) + [201, 202])[2:-2] | |||
self._test_list_methods(builder) | |||
def test_influence(self): | |||
"""make sure changes are propagated from parents to children""" | |||
parent = SmartList([0, 1, 2, 3, 4, 5]) | |||
child1 = parent[2:] | |||
child2 = parent[2:5] | |||
parent.append(6) | |||
child1.append(7) | |||
child2.append(4.5) | |||
self.assertEqual([0, 1, 2, 3, 4, 4.5, 5, 6, 7], parent) | |||
self.assertEqual([2, 3, 4, 4.5, 5, 6, 7], child1) | |||
self.assertEqual([2, 3, 4, 4.5], child2) | |||
parent.insert(0, -1) | |||
parent.insert(4, 2.5) | |||
parent.insert(10, 6.5) | |||
self.assertEqual([-1, 0, 1, 2, 2.5, 3, 4, 4.5, 5, 6, 6.5, 7], parent) | |||
self.assertEqual([2, 2.5, 3, 4, 4.5, 5, 6, 6.5, 7], child1) | |||
self.assertEqual([2, 2.5, 3, 4, 4.5], child2) | |||
self.assertEqual(7, parent.pop()) | |||
self.assertEqual(6.5, child1.pop()) | |||
self.assertEqual(4.5, child2.pop()) | |||
self.assertEqual([-1, 0, 1, 2, 2.5, 3, 4, 5, 6], parent) | |||
self.assertEqual([2, 2.5, 3, 4, 5, 6], child1) | |||
self.assertEqual([2, 2.5, 3, 4], child2) | |||
parent.remove(-1) | |||
child1.remove(2.5) | |||
self.assertEqual([0, 1, 2, 3, 4, 5, 6], parent) | |||
self.assertEqual([2, 3, 4, 5, 6], child1) | |||
self.assertEqual([2, 3, 4], child2) | |||
self.assertEqual(0, parent.pop(0)) | |||
self.assertEqual([1, 2, 3, 4, 5, 6], parent) | |||
self.assertEqual([2, 3, 4, 5, 6], child1) | |||
self.assertEqual([2, 3, 4], child2) | |||
child2.reverse() | |||
self.assertEqual([1, 4, 3, 2, 5, 6], parent) | |||
self.assertEqual([4, 3, 2, 5, 6], child1) | |||
self.assertEqual([4, 3, 2], child2) | |||
parent.extend([7, 8]) | |||
child1.extend([8.1, 8.2]) | |||
child2.extend([1.9, 1.8]) | |||
self.assertEqual([1, 4, 3, 2, 1.9, 1.8, 5, 6, 7, 8, 8.1, 8.2], parent) | |||
self.assertEqual([4, 3, 2, 1.9, 1.8, 5, 6, 7, 8, 8.1, 8.2], child1) | |||
self.assertEqual([4, 3, 2, 1.9, 1.8], child2) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
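A sketch of the live-slice behaviour that test_doctest and test_influence verify: slicing a SmartList yields a _ListProxy view, so changes on either side are visible from the other:

from __future__ import print_function, unicode_literals

from mwparserfromhell.smart_list import SmartList

parent = SmartList([0, 1, 2, 3])
child = parent[2:]          # a live view, not a copy
child.append(4)
print(parent)               # -> [0, 1, 2, 3, 4]
parent.append(5)
print(child)                # -> [2, 3, 4, 5]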
@@ -0,0 +1,435 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
from sys import getdefaultencoding | |||
from types import GeneratorType | |||
import unittest | |||
from mwparserfromhell.compat import bytes, py3k, str | |||
from mwparserfromhell.string_mixin import StringMixIn | |||
from .compat import range | |||
class _FakeString(StringMixIn): | |||
def __init__(self, data): | |||
self._data = data | |||
def __unicode__(self): | |||
return self._data | |||
class TestStringMixIn(unittest.TestCase): | |||
"""Test cases for the StringMixIn class.""" | |||
def test_docs(self): | |||
"""make sure the various methods of StringMixIn have docstrings""" | |||
methods = [ | |||
"capitalize", "center", "count", "encode", "endswith", | |||
"expandtabs", "find", "format", "index", "isalnum", "isalpha", | |||
"isdecimal", "isdigit", "islower", "isnumeric", "isspace", | |||
"istitle", "isupper", "join", "ljust", "lower", "lstrip", | |||
"partition", "replace", "rfind", "rindex", "rjust", "rpartition", | |||
"rsplit", "rstrip", "split", "splitlines", "startswith", "strip", | |||
"swapcase", "title", "translate", "upper", "zfill"] | |||
if py3k: | |||
methods.extend(["casefold", "format_map", "isidentifier", | |||
"isprintable", "maketrans"]) | |||
else: | |||
methods.append("decode") | |||
for meth in methods: | |||
expected = getattr(str, meth).__doc__ | |||
actual = getattr(StringMixIn, meth).__doc__ | |||
self.assertEqual(expected, actual) | |||
def test_types(self): | |||
"""make sure StringMixIns convert to different types correctly""" | |||
fstr = _FakeString("fake string") | |||
self.assertEqual(str(fstr), "fake string") | |||
self.assertEqual(bytes(fstr), b"fake string") | |||
if py3k: | |||
self.assertEqual(repr(fstr), "'fake string'") | |||
else: | |||
self.assertEqual(repr(fstr), b"u'fake string'") | |||
self.assertIsInstance(str(fstr), str) | |||
self.assertIsInstance(bytes(fstr), bytes) | |||
if py3k: | |||
self.assertIsInstance(repr(fstr), str) | |||
else: | |||
self.assertIsInstance(repr(fstr), bytes) | |||
def test_comparisons(self): | |||
"""make sure comparison operators work""" | |||
str1 = _FakeString("this is a fake string") | |||
str2 = _FakeString("this is a fake string") | |||
str3 = _FakeString("fake string, this is") | |||
str4 = "this is a fake string" | |||
str5 = "fake string, this is" | |||
self.assertFalse(str1 > str2) | |||
self.assertTrue(str1 >= str2) | |||
self.assertTrue(str1 == str2) | |||
self.assertFalse(str1 != str2) | |||
self.assertFalse(str1 < str2) | |||
self.assertTrue(str1 <= str2) | |||
self.assertTrue(str1 > str3) | |||
self.assertTrue(str1 >= str3) | |||
self.assertFalse(str1 == str3) | |||
self.assertTrue(str1 != str3) | |||
self.assertFalse(str1 < str3) | |||
self.assertFalse(str1 <= str3) | |||
self.assertFalse(str1 > str4) | |||
self.assertTrue(str1 >= str4) | |||
self.assertTrue(str1 == str4) | |||
self.assertFalse(str1 != str4) | |||
self.assertFalse(str1 < str4) | |||
self.assertTrue(str1 <= str4) | |||
self.assertTrue(str1 > str5) | |||
self.assertTrue(str1 >= str5) | |||
self.assertFalse(str1 == str5) | |||
self.assertTrue(str1 != str5) | |||
self.assertFalse(str1 < str5) | |||
self.assertFalse(str1 <= str5) | |||
def test_other_magics(self): | |||
"""test other magically implemented features, like len() and iter()""" | |||
str1 = _FakeString("fake string") | |||
str2 = _FakeString("") | |||
expected = ["f", "a", "k", "e", " ", "s", "t", "r", "i", "n", "g"] | |||
self.assertTrue(str1) | |||
self.assertFalse(str2) | |||
self.assertEqual(11, len(str1)) | |||
self.assertEqual(0, len(str2)) | |||
out = [] | |||
for ch in str1: | |||
out.append(ch) | |||
self.assertEqual(expected, out) | |||
out = [] | |||
for ch in str2: | |||
out.append(ch) | |||
self.assertEqual([], out) | |||
gen1 = iter(str1) | |||
gen2 = iter(str2) | |||
self.assertIsInstance(gen1, GeneratorType) | |||
self.assertIsInstance(gen2, GeneratorType) | |||
out = [] | |||
for i in range(len(str1)): | |||
out.append(next(gen1)) | |||
self.assertRaises(StopIteration, next, gen1) | |||
self.assertEqual(expected, out) | |||
self.assertRaises(StopIteration, next, gen2) | |||
self.assertEqual("gnirts ekaf", "".join(list(reversed(str1)))) | |||
self.assertEqual([], list(reversed(str2))) | |||
self.assertEqual("f", str1[0]) | |||
self.assertEqual(" ", str1[4]) | |||
self.assertEqual("g", str1[10]) | |||
self.assertEqual("n", str1[-2]) | |||
self.assertRaises(IndexError, lambda: str1[11]) | |||
self.assertRaises(IndexError, lambda: str2[0]) | |||
self.assertTrue("k" in str1) | |||
self.assertTrue("fake" in str1) | |||
self.assertTrue("str" in str1) | |||
self.assertTrue("" in str1) | |||
self.assertTrue("" in str2) | |||
self.assertFalse("real" in str1) | |||
self.assertFalse("s" in str2) | |||
def test_other_methods(self): | |||
"""test the remaining non-magic methods of StringMixIn""" | |||
str1 = _FakeString("fake string") | |||
self.assertEqual("Fake string", str1.capitalize()) | |||
self.assertEqual(" fake string ", str1.center(15)) | |||
self.assertEqual(" fake string ", str1.center(16)) | |||
self.assertEqual("qqfake stringqq", str1.center(15, "q")) | |||
self.assertEqual(1, str1.count("e")) | |||
self.assertEqual(0, str1.count("z")) | |||
self.assertEqual(1, str1.count("r", 7)) | |||
self.assertEqual(0, str1.count("r", 8)) | |||
self.assertEqual(1, str1.count("r", 5, 9)) | |||
self.assertEqual(0, str1.count("r", 5, 7)) | |||
if not py3k: | |||
str2 = _FakeString("fo") | |||
self.assertEqual(str1, str1.decode()) | |||
actual = _FakeString("\\U00010332\\U0001033f\\U00010344") | |||
self.assertEqual("𐌲𐌿𐍄", actual.decode("unicode_escape")) | |||
self.assertRaises(UnicodeError, str2.decode, "punycode") | |||
self.assertEqual("", str2.decode("punycode", "ignore")) | |||
str3 = _FakeString("𐌲𐌿𐍄") | |||
actual = b"\xF0\x90\x8C\xB2\xF0\x90\x8C\xBF\xF0\x90\x8D\x84" | |||
self.assertEqual(b"fake string", str1.encode()) | |||
self.assertEqual(actual, str3.encode("utf-8")) | |||
self.assertEqual(actual, str3.encode(encoding="utf-8")) | |||
if getdefaultencoding() == "ascii": | |||
self.assertRaises(UnicodeEncodeError, str3.encode) | |||
elif getdefaultencoding() == "utf-8": | |||
self.assertEqual(actual, str3.encode()) | |||
self.assertRaises(UnicodeEncodeError, str3.encode, "ascii") | |||
self.assertRaises(UnicodeEncodeError, str3.encode, "ascii", "strict") | |||
if getdefaultencoding() == "ascii": | |||
self.assertRaises(UnicodeEncodeError, str3.encode, errors="strict") | |||
elif getdefaultencoding() == "utf-8": | |||
self.assertEqual(actual, str3.encode(errors="strict")) | |||
self.assertEqual(b"", str3.encode("ascii", "ignore")) | |||
if getdefaultencoding() == "ascii": | |||
self.assertEqual(b"", str3.encode(errors="ignore")) | |||
elif getdefaultencoding() == "utf-8": | |||
self.assertEqual(actual, str3.encode(errors="ignore")) | |||
self.assertTrue(str1.endswith("ing")) | |||
self.assertFalse(str1.endswith("ingh")) | |||
str4 = _FakeString("\tfoobar") | |||
self.assertEqual("fake string", str1) | |||
self.assertEqual(" foobar", str4.expandtabs()) | |||
self.assertEqual(" foobar", str4.expandtabs(4)) | |||
self.assertEqual(3, str1.find("e")) | |||
self.assertEqual(-1, str1.find("z")) | |||
self.assertEqual(7, str1.find("r", 7)) | |||
self.assertEqual(-1, str1.find("r", 8)) | |||
self.assertEqual(7, str1.find("r", 5, 9)) | |||
self.assertEqual(-1, str1.find("r", 5, 7)) | |||
str5 = _FakeString("foo{0}baz") | |||
str6 = _FakeString("foo{abc}baz") | |||
str7 = _FakeString("foo{0}{abc}buzz") | |||
str8 = _FakeString("{0}{1}") | |||
self.assertEqual("fake string", str1.format()) | |||
self.assertEqual("foobarbaz", str5.format("bar")) | |||
self.assertEqual("foobarbaz", str6.format(abc="bar")) | |||
self.assertEqual("foobarbazbuzz", str7.format("bar", abc="baz")) | |||
self.assertRaises(IndexError, str8.format, "abc") | |||
if py3k: | |||
self.assertEqual("fake string", str1.format_map({})) | |||
self.assertEqual("foobarbaz", str6.format_map({"abc": "bar"})) | |||
self.assertRaises(ValueError, str5.format_map, {0: "abc"}) | |||
self.assertEqual(3, str1.index("e")) | |||
self.assertRaises(ValueError, str1.index, "z") | |||
self.assertEqual(7, str1.index("r", 7)) | |||
self.assertRaises(ValueError, str1.index, "r", 8) | |||
self.assertEqual(7, str1.index("r", 5, 9)) | |||
self.assertRaises(ValueError, str1.index, "r", 5, 7) | |||
str9 = _FakeString("foobar") | |||
str10 = _FakeString("foobar123") | |||
str11 = _FakeString("foo bar") | |||
self.assertTrue(str9.isalnum()) | |||
self.assertTrue(str10.isalnum()) | |||
self.assertFalse(str11.isalnum()) | |||
self.assertTrue(str9.isalpha()) | |||
self.assertFalse(str10.isalpha()) | |||
self.assertFalse(str11.isalpha()) | |||
str12 = _FakeString("123") | |||
str13 = _FakeString("\u2155") | |||
str14 = _FakeString("\u00B2") | |||
self.assertFalse(str9.isdecimal()) | |||
self.assertTrue(str12.isdecimal()) | |||
self.assertFalse(str13.isdecimal()) | |||
self.assertFalse(str14.isdecimal()) | |||
self.assertFalse(str9.isdigit()) | |||
self.assertTrue(str12.isdigit()) | |||
self.assertFalse(str13.isdigit()) | |||
self.assertTrue(str14.isdigit()) | |||
if py3k: | |||
self.assertTrue(str9.isidentifier()) | |||
self.assertTrue(str10.isidentifier()) | |||
self.assertFalse(str11.isidentifier()) | |||
self.assertFalse(str12.isidentifier()) | |||
str15 = _FakeString("") | |||
str16 = _FakeString("FooBar") | |||
self.assertTrue(str9.islower()) | |||
self.assertFalse(str15.islower()) | |||
self.assertFalse(str16.islower()) | |||
self.assertFalse(str9.isnumeric()) | |||
self.assertTrue(str12.isnumeric()) | |||
self.assertTrue(str13.isnumeric()) | |||
self.assertTrue(str14.isnumeric()) | |||
if py3k: | |||
str16B = _FakeString("\x01\x02") | |||
self.assertTrue(str9.isprintable()) | |||
self.assertTrue(str13.isprintable()) | |||
self.assertTrue(str14.isprintable()) | |||
self.assertTrue(str15.isprintable()) | |||
self.assertFalse(str16B.isprintable()) | |||
str17 = _FakeString(" ") | |||
str18 = _FakeString("\t \t \r\n") | |||
self.assertFalse(str1.isspace()) | |||
self.assertFalse(str9.isspace()) | |||
self.assertTrue(str17.isspace()) | |||
self.assertTrue(str18.isspace()) | |||
str19 = _FakeString("This Sentence Looks Like A Title") | |||
str20 = _FakeString("This sentence doesn't LookLikeATitle") | |||
self.assertFalse(str15.istitle()) | |||
self.assertTrue(str19.istitle()) | |||
self.assertFalse(str20.istitle()) | |||
str21 = _FakeString("FOOBAR") | |||
self.assertFalse(str9.isupper()) | |||
self.assertFalse(str15.isupper()) | |||
self.assertTrue(str21.isupper()) | |||
self.assertEqual("foobar", str15.join(["foo", "bar"])) | |||
self.assertEqual("foo123bar123baz", str12.join(("foo", "bar", "baz"))) | |||
self.assertEqual("fake string ", str1.ljust(15)) | |||
self.assertEqual("fake string ", str1.ljust(16)) | |||
self.assertEqual("fake stringqqqq", str1.ljust(15, "q")) | |||
str22 = _FakeString("ß") | |||
self.assertEqual("", str15.lower()) | |||
self.assertEqual("foobar", str16.lower()) | |||
self.assertEqual("ß", str22.lower()) | |||
if py3k: | |||
self.assertEqual("", str15.casefold()) | |||
self.assertEqual("foobar", str16.casefold()) | |||
self.assertEqual("ss", str22.casefold()) | |||
str23 = _FakeString(" fake string ") | |||
self.assertEqual("fake string", str1.lstrip()) | |||
self.assertEqual("fake string ", str23.lstrip()) | |||
self.assertEqual("ke string", str1.lstrip("abcdef")) | |||
self.assertEqual(("fa", "ke", " string"), str1.partition("ke")) | |||
self.assertEqual(("fake string", "", ""), str1.partition("asdf")) | |||
str24 = _FakeString("boo foo moo") | |||
self.assertEqual("real string", str1.replace("fake", "real")) | |||
self.assertEqual("bu fu moo", str24.replace("oo", "u", 2)) | |||
self.assertEqual(3, str1.rfind("e")) | |||
self.assertEqual(-1, str1.rfind("z")) | |||
self.assertEqual(7, str1.rfind("r", 7)) | |||
self.assertEqual(-1, str1.rfind("r", 8)) | |||
self.assertEqual(7, str1.rfind("r", 5, 9)) | |||
self.assertEqual(-1, str1.rfind("r", 5, 7)) | |||
self.assertEqual(3, str1.rindex("e")) | |||
self.assertRaises(ValueError, str1.rindex, "z") | |||
self.assertEqual(7, str1.rindex("r", 7)) | |||
self.assertRaises(ValueError, str1.rindex, "r", 8) | |||
self.assertEqual(7, str1.rindex("r", 5, 9)) | |||
self.assertRaises(ValueError, str1.rindex, "r", 5, 7) | |||
self.assertEqual(" fake string", str1.rjust(15)) | |||
self.assertEqual(" fake string", str1.rjust(16)) | |||
self.assertEqual("qqqqfake string", str1.rjust(15, "q")) | |||
self.assertEqual(("fa", "ke", " string"), str1.rpartition("ke")) | |||
self.assertEqual(("", "", "fake string"), str1.rpartition("asdf")) | |||
str25 = _FakeString(" this is a sentence with whitespace ") | |||
actual = ["this", "is", "a", "sentence", "with", "whitespace"] | |||
self.assertEqual(actual, str25.rsplit()) | |||
self.assertEqual(actual, str25.rsplit(None)) | |||
actual = ["", "", "", "this", "is", "a", "", "", "sentence", "with", | |||
"", "whitespace", ""] | |||
self.assertEqual(actual, str25.rsplit(" ")) | |||
actual = [" this is a", "sentence", "with", "whitespace"] | |||
self.assertEqual(actual, str25.rsplit(None, 3)) | |||
actual = [" this is a sentence with", "", "whitespace", ""] | |||
self.assertEqual(actual, str25.rsplit(" ", 3)) | |||
if py3k: | |||
actual = [" this is a", "sentence", "with", "whitespace"] | |||
self.assertEqual(actual, str25.rsplit(maxsplit=3)) | |||
self.assertEqual("fake string", str1.rstrip()) | |||
self.assertEqual(" fake string", str23.rstrip()) | |||
self.assertEqual("fake stri", str1.rstrip("ngr")) | |||
actual = ["this", "is", "a", "sentence", "with", "whitespace"] | |||
self.assertEqual(actual, str25.split()) | |||
self.assertEqual(actual, str25.split(None)) | |||
actual = ["", "", "", "this", "is", "a", "", "", "sentence", "with", | |||
"", "whitespace", ""] | |||
self.assertEqual(actual, str25.split(" ")) | |||
actual = ["this", "is", "a", "sentence with whitespace "] | |||
self.assertEqual(actual, str25.split(None, 3)) | |||
actual = ["", "", "", "this is a sentence with whitespace "] | |||
self.assertEqual(actual, str25.split(" ", 3)) | |||
if py3k: | |||
actual = ["this", "is", "a", "sentence with whitespace "] | |||
self.assertEqual(actual, str25.split(maxsplit=3)) | |||
str26 = _FakeString("lines\nof\ntext\r\nare\r\npresented\nhere") | |||
self.assertEqual(["lines", "of", "text", "are", "presented", "here"], | |||
str26.splitlines()) | |||
self.assertEqual(["lines\n", "of\n", "text\r\n", "are\r\n", | |||
"presented\n", "here"], str26.splitlines(True)) | |||
self.assertTrue(str1.startswith("fake")) | |||
self.assertFalse(str1.startswith("faker")) | |||
self.assertEqual("fake string", str1.strip()) | |||
self.assertEqual("fake string", str23.strip()) | |||
self.assertEqual("ke stri", str1.strip("abcdefngr")) | |||
self.assertEqual("fOObAR", str16.swapcase()) | |||
self.assertEqual("Fake String", str1.title()) | |||
if py3k: | |||
table1 = StringMixIn.maketrans({97: "1", 101: "2", 105: "3", | |||
111: "4", 117: "5"}) | |||
table2 = StringMixIn.maketrans("aeiou", "12345") | |||
table3 = StringMixIn.maketrans("aeiou", "12345", "rts") | |||
self.assertEqual("f1k2 str3ng", str1.translate(table1)) | |||
self.assertEqual("f1k2 str3ng", str1.translate(table2)) | |||
self.assertEqual("f1k2 3ng", str1.translate(table3)) | |||
else: | |||
table = {97: "1", 101: "2", 105: "3", 111: "4", 117: "5"} | |||
self.assertEqual("f1k2 str3ng", str1.translate(table)) | |||
self.assertEqual("", str15.upper()) | |||
self.assertEqual("FOOBAR", str16.upper()) | |||
self.assertEqual("123", str12.zfill(3)) | |||
self.assertEqual("000123", str12.zfill(6)) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
@@ -1,6 +1,6 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
@@ -20,87 +20,345 @@ | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from itertools import permutations | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.parameter import Parameter | |||
from mwparserfromhell.template import Template | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import HTMLEntity, Template, Text | |||
from mwparserfromhell.nodes.extras import Parameter | |||
from ._test_tree_equality import TreeEqualityTestCase, getnodes, wrap, wraptext | |||
class TestTemplate(unittest.TestCase): | |||
def setUp(self): | |||
self.name = "foo" | |||
self.bar = Parameter("1", "bar") | |||
self.baz = Parameter("2", "baz") | |||
self.eggs = Parameter("eggs", "spam") | |||
self.params = [self.bar, self.baz, self.eggs] | |||
pgens = lambda k, v: Parameter(wraptext(k), wraptext(v), showkey=True) | |||
pgenh = lambda k, v: Parameter(wraptext(k), wraptext(v), showkey=False) | |||
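    # Shorthand factories for the tests below: pgens builds a parameter with a
    # shown key (name=value), pgenh a hidden-key (positional-style) parameter.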
def test_construct(self): | |||
Template(self.name) | |||
Template(self.name, self.params) | |||
Template(name=self.name) | |||
Template(name=self.name, params=self.params) | |||
class TestTemplate(TreeEqualityTestCase): | |||
"""Test cases for the Template node.""" | |||
def test_unicode(self): | |||
"""test Template.__unicode__()""" | |||
node = Template(wraptext("foobar")) | |||
self.assertEqual("{{foobar}}", str(node)) | |||
node2 = Template(wraptext("foo"), | |||
[pgenh("1", "bar"), pgens("abc", "def")]) | |||
self.assertEqual("{{foo|bar|abc=def}}", str(node2)) | |||
def test_iternodes(self): | |||
"""test Template.__iternodes__()""" | |||
node1n1 = Text("foobar") | |||
node2n1, node2n2, node2n3 = Text("foo"), Text("bar"), Text("abc") | |||
node2n4, node2n5 = Text("def"), Text("ghi") | |||
node2p1 = Parameter(wraptext("1"), wrap([node2n2]), showkey=False) | |||
node2p2 = Parameter(wrap([node2n3]), wrap([node2n4, node2n5]), | |||
showkey=True) | |||
node1 = Template(wrap([node1n1])) | |||
node2 = Template(wrap([node2n1]), [node2p1, node2p2]) | |||
gen1 = node1.__iternodes__(getnodes) | |||
gen2 = node2.__iternodes__(getnodes) | |||
self.assertEqual((None, node1), next(gen1)) | |||
self.assertEqual((None, node2), next(gen2)) | |||
self.assertEqual((node1.name, node1n1), next(gen1)) | |||
self.assertEqual((node2.name, node2n1), next(gen2)) | |||
self.assertEqual((node2.params[0].value, node2n2), next(gen2)) | |||
self.assertEqual((node2.params[1].name, node2n3), next(gen2)) | |||
self.assertEqual((node2.params[1].value, node2n4), next(gen2)) | |||
self.assertEqual((node2.params[1].value, node2n5), next(gen2)) | |||
self.assertRaises(StopIteration, next, gen1) | |||
self.assertRaises(StopIteration, next, gen2) | |||
def test_strip(self): | |||
"""test Template.__strip__()""" | |||
node1 = Template(wraptext("foobar")) | |||
node2 = Template(wraptext("foo"), | |||
[pgenh("1", "bar"), pgens("abc", "def")]) | |||
for a in (True, False): | |||
for b in (True, False): | |||
self.assertEqual(None, node1.__strip__(a, b)) | |||
self.assertEqual(None, node2.__strip__(a, b)) | |||
def test_showtree(self): | |||
"""test Template.__showtree__()""" | |||
output = [] | |||
getter, marker = object(), object() | |||
get = lambda code: output.append((getter, code)) | |||
mark = lambda: output.append(marker) | |||
node1 = Template(wraptext("foobar")) | |||
node2 = Template(wraptext("foo"), | |||
[pgenh("1", "bar"), pgens("abc", "def")]) | |||
node1.__showtree__(output.append, get, mark) | |||
node2.__showtree__(output.append, get, mark) | |||
valid = [ | |||
"{{", (getter, node1.name), "}}", "{{", (getter, node2.name), | |||
" | ", marker, (getter, node2.params[0].name), " = ", marker, | |||
(getter, node2.params[0].value), " | ", marker, | |||
(getter, node2.params[1].name), " = ", marker, | |||
(getter, node2.params[1].value), "}}"] | |||
self.assertEqual(valid, output) | |||
def test_name(self): | |||
templates = [ | |||
Template(self.name), | |||
Template(self.name, self.params), | |||
Template(name=self.name), | |||
Template(name=self.name, params=self.params) | |||
] | |||
for template in templates: | |||
self.assertEqual(template.name, self.name) | |||
"""test getter/setter for the name attribute""" | |||
name = wraptext("foobar") | |||
node1 = Template(name) | |||
node2 = Template(name, [pgenh("1", "bar")]) | |||
self.assertIs(name, node1.name) | |||
self.assertIs(name, node2.name) | |||
node1.name = "asdf" | |||
node2.name = "téstïng" | |||
self.assertWikicodeEqual(wraptext("asdf"), node1.name) | |||
self.assertWikicodeEqual(wraptext("téstïng"), node2.name) | |||
def test_params(self): | |||
for template in (Template(self.name), Template(name=self.name)): | |||
self.assertEqual(template.params, []) | |||
for template in (Template(self.name, self.params), | |||
Template(name=self.name, params=self.params)): | |||
self.assertEqual(template.params, self.params) | |||
def test_getitem(self): | |||
template = Template(name=self.name, params=self.params) | |||
self.assertIs(template[0], self.bar) | |||
self.assertIs(template[1], self.baz) | |||
self.assertIs(template[2], self.eggs) | |||
self.assertIs(template["1"], self.bar) | |||
self.assertIs(template["2"], self.baz) | |||
self.assertIs(template["eggs"], self.eggs) | |||
def test_render(self): | |||
tests = [ | |||
(Template(self.name), "{{foo}}"), | |||
(Template(self.name, self.params), "{{foo|bar|baz|eggs=spam}}") | |||
] | |||
for template, rendered in tests: | |||
self.assertEqual(template.render(), rendered) | |||
def test_repr(self): | |||
        correct1 = 'Template(name=foo, params={})'
correct2 = 'Template(name=foo, params={"1": "bar", "2": "baz", "eggs": "spam"})' | |||
tests = [(Template(self.name), correct1), | |||
(Template(self.name, self.params), correct2)] | |||
for template, correct in tests: | |||
self.assertEqual(repr(template), correct) | |||
self.assertEqual(str(template), correct) | |||
def test_cmp(self): | |||
tmp1 = Template(self.name) | |||
tmp2 = Template(name=self.name) | |||
tmp3 = Template(self.name, []) | |||
tmp4 = Template(name=self.name, params=[]) | |||
tmp5 = Template(self.name, self.params) | |||
tmp6 = Template(name=self.name, params=self.params) | |||
for tmpA, tmpB in permutations((tmp1, tmp2, tmp3, tmp4), 2): | |||
self.assertEqual(tmpA, tmpB) | |||
for tmpA, tmpB in permutations((tmp5, tmp6), 2): | |||
self.assertEqual(tmpA, tmpB) | |||
for tmpA in (tmp5, tmp6): | |||
for tmpB in (tmp1, tmp2, tmp3, tmp4): | |||
self.assertNotEqual(tmpA, tmpB) | |||
self.assertNotEqual(tmpB, tmpA) | |||
"""test getter for the params attribute""" | |||
node1 = Template(wraptext("foobar")) | |||
plist = [pgenh("1", "bar"), pgens("abc", "def")] | |||
node2 = Template(wraptext("foo"), plist) | |||
self.assertEqual([], node1.params) | |||
self.assertIs(plist, node2.params) | |||
def test_has_param(self): | |||
"""test Template.has_param()""" | |||
node1 = Template(wraptext("foobar")) | |||
node2 = Template(wraptext("foo"), | |||
[pgenh("1", "bar"), pgens("\nabc ", "def")]) | |||
node3 = Template(wraptext("foo"), | |||
[pgenh("1", "a"), pgens("b", "c"), pgens("1", "d")]) | |||
node4 = Template(wraptext("foo"), [pgenh("1", "a"), pgens("b", " ")]) | |||
self.assertFalse(node1.has_param("foobar")) | |||
self.assertTrue(node2.has_param(1)) | |||
self.assertTrue(node2.has_param("abc")) | |||
self.assertFalse(node2.has_param("def")) | |||
self.assertTrue(node3.has_param("1")) | |||
self.assertTrue(node3.has_param(" b ")) | |||
self.assertFalse(node4.has_param("b")) | |||
self.assertTrue(node3.has_param("b", False)) | |||
self.assertTrue(node4.has_param("b", False)) | |||
def test_get(self): | |||
"""test Template.get()""" | |||
node1 = Template(wraptext("foobar")) | |||
node2p1 = pgenh("1", "bar") | |||
node2p2 = pgens("abc", "def") | |||
node2 = Template(wraptext("foo"), [node2p1, node2p2]) | |||
node3p1 = pgens("b", "c") | |||
node3p2 = pgens("1", "d") | |||
node3 = Template(wraptext("foo"), [pgenh("1", "a"), node3p1, node3p2]) | |||
node4p1 = pgens(" b", " ") | |||
node4 = Template(wraptext("foo"), [pgenh("1", "a"), node4p1]) | |||
self.assertRaises(ValueError, node1.get, "foobar") | |||
self.assertIs(node2p1, node2.get(1)) | |||
self.assertIs(node2p2, node2.get("abc")) | |||
self.assertRaises(ValueError, node2.get, "def") | |||
self.assertIs(node3p1, node3.get("b")) | |||
self.assertIs(node3p2, node3.get("1")) | |||
self.assertIs(node4p1, node4.get("b ")) | |||
def test_add(self): | |||
"""test Template.add()""" | |||
node1 = Template(wraptext("a"), [pgens("b", "c"), pgenh("1", "d")]) | |||
node2 = Template(wraptext("a"), [pgens("b", "c"), pgenh("1", "d")]) | |||
node3 = Template(wraptext("a"), [pgens("b", "c"), pgenh("1", "d")]) | |||
node4 = Template(wraptext("a"), [pgens("b", "c"), pgenh("1", "d")]) | |||
node5 = Template(wraptext("a"), [pgens("b", "c"), | |||
pgens(" d ", "e")]) | |||
node6 = Template(wraptext("a"), [pgens("b", "c"), pgens("b", "d"), | |||
pgens("b", "e")]) | |||
node7 = Template(wraptext("a"), [pgens("b", "c"), pgenh("1", "d")]) | |||
node8p = pgenh("1", "d") | |||
node8 = Template(wraptext("a"), [pgens("b", "c"), node8p]) | |||
node9 = Template(wraptext("a"), [pgens("b", "c"), pgenh("1", "d")]) | |||
node10 = Template(wraptext("a"), [pgens("b", "c"), pgenh("1", "e")]) | |||
node11 = Template(wraptext("a"), [pgens("b", "c")]) | |||
node12 = Template(wraptext("a"), [pgens("b", "c")]) | |||
node13 = Template(wraptext("a"), [ | |||
pgens("\nb ", " c"), pgens("\nd ", " e"), pgens("\nf ", " g")]) | |||
node14 = Template(wraptext("a\n"), [ | |||
pgens("b ", "c\n"), pgens("d ", " e"), pgens("f ", "g\n"), | |||
pgens("h ", " i\n")]) | |||
node15 = Template(wraptext("a"), [ | |||
pgens("b ", " c\n"), pgens("\nd ", " e"), pgens("\nf ", "g ")]) | |||
node16 = Template(wraptext("a"), [ | |||
pgens("\nb ", " c"), pgens("\nd ", " e"), pgens("\nf ", " g")]) | |||
node17 = Template(wraptext("a"), [ | |||
pgens("\nb ", " c"), pgens("\nd ", " e"), pgens("\nf ", " g")]) | |||
node18 = Template(wraptext("a\n"), [ | |||
pgens("b ", "c\n"), pgens("d ", " e"), pgens("f ", "g\n"), | |||
pgens("h ", " i\n")]) | |||
node19 = Template(wraptext("a"), [ | |||
pgens("b ", " c\n"), pgens("\nd ", " e"), pgens("\nf ", "g ")]) | |||
node20 = Template(wraptext("a"), [ | |||
pgens("\nb ", " c"), pgens("\nd ", " e"), pgens("\nf ", " g")]) | |||
node21 = Template(wraptext("a"), [pgenh("1", "b")]) | |||
node22 = Template(wraptext("a"), [pgenh("1", "b")]) | |||
node23 = Template(wraptext("a"), [pgenh("1", "b")]) | |||
node24 = Template(wraptext("a"), [pgenh("1", "b"), pgenh("2", "c"), | |||
pgenh("3", "d"), pgenh("4", "e")]) | |||
node25 = Template(wraptext("a"), [pgenh("1", "b"), pgenh("2", "c"), | |||
pgens("4", "d"), pgens("5", "e")]) | |||
node26 = Template(wraptext("a"), [pgenh("1", "b"), pgenh("2", "c"), | |||
pgens("4", "d"), pgens("5", "e")]) | |||
node27 = Template(wraptext("a"), [pgenh("1", "b")]) | |||
node28 = Template(wraptext("a"), [pgenh("1", "b")]) | |||
node29 = Template(wraptext("a"), [pgens("b", "c")]) | |||
node30 = Template(wraptext("a"), [pgenh("1", "b")]) | |||
node31 = Template(wraptext("a"), [pgenh("1", "b")]) | |||
node32 = Template(wraptext("a"), [pgens("1", "b")]) | |||
node33 = Template(wraptext("a"), [ | |||
pgens("\nb ", " c"), pgens("\nd ", " e"), pgens("\nf ", " g")]) | |||
node34 = Template(wraptext("a\n"), [ | |||
pgens("b ", "c\n"), pgens("d ", " e"), pgens("f ", "g\n"), | |||
pgens("h ", " i\n")]) | |||
node35 = Template(wraptext("a"), [ | |||
pgens("b ", " c\n"), pgens("\nd ", " e"), pgens("\nf ", "g ")]) | |||
node36 = Template(wraptext("a"), [ | |||
pgens("\nb ", " c "), pgens("\nd ", " e "), pgens("\nf ", " g ")]) | |||
node37 = Template(wraptext("a"), [pgens("b", "c"), pgens("d", "e"), | |||
pgens("b", "f"), pgens("b", "h"), | |||
pgens("i", "j")]) | |||
node38 = Template(wraptext("a"), [pgens("1", "b"), pgens("x", "y"), | |||
pgens("1", "c"), pgens("2", "d")]) | |||
node39 = Template(wraptext("a"), [pgens("1", "b"), pgens("x", "y"), | |||
pgenh("1", "c"), pgenh("2", "d")]) | |||
node40 = Template(wraptext("a"), [pgens("b", "c"), pgens("d", "e"), | |||
pgens("f", "g")]) | |||
node1.add("e", "f", showkey=True) | |||
node2.add(2, "g", showkey=False) | |||
node3.add("e", "foo|bar", showkey=True) | |||
node4.add("e", "f", showkey=True, before="b") | |||
node5.add("f", "g", showkey=True, before=" d ") | |||
node6.add("f", "g", showkey=True, before="b") | |||
self.assertRaises(ValueError, node7.add, "e", "f", showkey=True, | |||
before="q") | |||
node8.add("e", "f", showkey=True, before=node8p) | |||
node9.add("e", "f", showkey=True, before=pgenh("1", "d")) | |||
self.assertRaises(ValueError, node10.add, "e", "f", showkey=True, | |||
before=pgenh("1", "d")) | |||
node11.add("d", "foo=bar", showkey=True) | |||
node12.add("1", "foo=bar", showkey=False) | |||
node13.add("h", "i", showkey=True) | |||
node14.add("j", "k", showkey=True) | |||
node15.add("h", "i", showkey=True) | |||
node16.add("h", "i", showkey=True, preserve_spacing=False) | |||
node17.add("h", "i", showkey=False) | |||
node18.add("j", "k", showkey=False) | |||
node19.add("h", "i", showkey=False) | |||
node20.add("h", "i", showkey=False, preserve_spacing=False) | |||
node21.add("2", "c") | |||
node22.add("3", "c") | |||
node23.add("c", "d") | |||
node24.add("5", "f") | |||
node25.add("3", "f") | |||
node26.add("6", "f") | |||
node27.add("c", "foo=bar") | |||
node28.add("2", "foo=bar") | |||
node29.add("b", "d") | |||
node30.add("1", "foo=bar") | |||
node31.add("1", "foo=bar", showkey=True) | |||
node32.add("1", "foo=bar", showkey=False) | |||
node33.add("d", "foo") | |||
node34.add("f", "foo") | |||
node35.add("f", "foo") | |||
node36.add("d", "foo", preserve_spacing=False) | |||
node37.add("b", "k") | |||
node38.add("1", "e") | |||
node39.add("1", "e") | |||
node40.add("d", "h", before="b") | |||
self.assertEqual("{{a|b=c|d|e=f}}", node1) | |||
self.assertEqual("{{a|b=c|d|g}}", node2) | |||
self.assertEqual("{{a|b=c|d|e=foo|bar}}", node3) | |||
self.assertIsInstance(node3.params[2].value.get(1), HTMLEntity) | |||
self.assertEqual("{{a|e=f|b=c|d}}", node4) | |||
self.assertEqual("{{a|b=c|f=g| d =e}}", node5) | |||
self.assertEqual("{{a|b=c|b=d|f=g|b=e}}", node6) | |||
self.assertEqual("{{a|b=c|d}}", node7) | |||
self.assertEqual("{{a|b=c|e=f|d}}", node8) | |||
self.assertEqual("{{a|b=c|e=f|d}}", node9) | |||
self.assertEqual("{{a|b=c|e}}", node10) | |||
self.assertEqual("{{a|b=c|d=foo=bar}}", node11) | |||
self.assertEqual("{{a|b=c|foo=bar}}", node12) | |||
self.assertIsInstance(node12.params[1].value.get(1), HTMLEntity) | |||
self.assertEqual("{{a|\nb = c|\nd = e|\nf = g|\nh = i}}", node13) | |||
self.assertEqual("{{a\n|b =c\n|d = e|f =g\n|h = i\n|j =k\n}}", node14) | |||
self.assertEqual("{{a|b = c\n|\nd = e|\nf =g |h =i}}", node15) | |||
self.assertEqual("{{a|\nb = c|\nd = e|\nf = g|h=i}}", node16) | |||
self.assertEqual("{{a|\nb = c|\nd = e|\nf = g| i}}", node17) | |||
self.assertEqual("{{a\n|b =c\n|d = e|f =g\n|h = i\n|k\n}}", node18) | |||
self.assertEqual("{{a|b = c\n|\nd = e|\nf =g |i}}", node19) | |||
self.assertEqual("{{a|\nb = c|\nd = e|\nf = g|i}}", node20) | |||
self.assertEqual("{{a|b|c}}", node21) | |||
self.assertEqual("{{a|b|3=c}}", node22) | |||
self.assertEqual("{{a|b|c=d}}", node23) | |||
self.assertEqual("{{a|b|c|d|e|f}}", node24) | |||
self.assertEqual("{{a|b|c|4=d|5=e|f}}", node25) | |||
self.assertEqual("{{a|b|c|4=d|5=e|6=f}}", node26) | |||
self.assertEqual("{{a|b|c=foo=bar}}", node27) | |||
self.assertEqual("{{a|b|foo=bar}}", node28) | |||
self.assertIsInstance(node28.params[1].value.get(1), HTMLEntity) | |||
self.assertEqual("{{a|b=d}}", node29) | |||
self.assertEqual("{{a|foo=bar}}", node30) | |||
self.assertIsInstance(node30.params[0].value.get(1), HTMLEntity) | |||
self.assertEqual("{{a|1=foo=bar}}", node31) | |||
self.assertEqual("{{a|foo=bar}}", node32) | |||
self.assertIsInstance(node32.params[0].value.get(1), HTMLEntity) | |||
self.assertEqual("{{a|\nb = c|\nd = foo|\nf = g}}", node33) | |||
self.assertEqual("{{a\n|b =c\n|d = e|f =foo\n|h = i\n}}", node34) | |||
self.assertEqual("{{a|b = c\n|\nd = e|\nf =foo }}", node35) | |||
self.assertEqual("{{a|\nb = c |\nd =foo|\nf = g }}", node36) | |||
self.assertEqual("{{a|b=k|d=e|i=j}}", node37) | |||
self.assertEqual("{{a|1=e|x=y|2=d}}", node38) | |||
self.assertEqual("{{a|x=y|e|d}}", node39) | |||
self.assertEqual("{{a|b=c|d=h|f=g}}", node40) | |||
def test_remove(self): | |||
"""test Template.remove()""" | |||
node1 = Template(wraptext("foobar")) | |||
node2 = Template(wraptext("foo"), [pgenh("1", "bar"), | |||
pgens("abc", "def")]) | |||
node3 = Template(wraptext("foo"), [pgenh("1", "bar"), | |||
pgens("abc", "def")]) | |||
node4 = Template(wraptext("foo"), [pgenh("1", "bar"), | |||
pgenh("2", "baz")]) | |||
node5 = Template(wraptext("foo"), [ | |||
pgens(" a", "b"), pgens("b", "c"), pgens("a ", "d")]) | |||
node6 = Template(wraptext("foo"), [ | |||
pgens(" a", "b"), pgens("b", "c"), pgens("a ", "d")]) | |||
node7 = Template(wraptext("foo"), [ | |||
pgens("1 ", "a"), pgens(" 1", "b"), pgens("2", "c")]) | |||
node8 = Template(wraptext("foo"), [ | |||
pgens("1 ", "a"), pgens(" 1", "b"), pgens("2", "c")]) | |||
node9 = Template(wraptext("foo"), [ | |||
pgens("1 ", "a"), pgenh("1", "b"), pgenh("2", "c")]) | |||
node10 = Template(wraptext("foo"), [ | |||
pgens("1 ", "a"), pgenh("1", "b"), pgenh("2", "c")]) | |||
node2.remove("1") | |||
node2.remove("abc") | |||
node3.remove(1, keep_field=True) | |||
node3.remove("abc", keep_field=True) | |||
node4.remove("1", keep_field=False) | |||
node5.remove("a", keep_field=False) | |||
node6.remove("a", keep_field=True) | |||
node7.remove(1, keep_field=True) | |||
node8.remove(1, keep_field=False) | |||
node9.remove(1, keep_field=True) | |||
node10.remove(1, keep_field=False) | |||
self.assertRaises(ValueError, node1.remove, 1) | |||
self.assertRaises(ValueError, node1.remove, "a") | |||
self.assertRaises(ValueError, node2.remove, "1") | |||
self.assertEqual("{{foo}}", node2) | |||
self.assertEqual("{{foo||abc=}}", node3) | |||
self.assertEqual("{{foo||baz}}", node4) | |||
self.assertEqual("{{foo|b=c}}", node5) | |||
self.assertEqual("{{foo| a=|b=c}}", node6) | |||
self.assertEqual("{{foo|1 =|2=c}}", node7) | |||
self.assertEqual("{{foo|2=c}}", node8) | |||
self.assertEqual("{{foo||c}}", node9) | |||
self.assertEqual("{{foo||c}}", node10) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
@@ -0,0 +1,75 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import Text | |||
class TestText(unittest.TestCase): | |||
"""Test cases for the Text node.""" | |||
def test_unicode(self): | |||
"""test Text.__unicode__()""" | |||
node = Text("foobar") | |||
self.assertEqual("foobar", str(node)) | |||
node2 = Text("fóóbar") | |||
self.assertEqual("fóóbar", str(node2)) | |||
def test_iternodes(self): | |||
"""test Text.__iternodes__()""" | |||
node = Text("foobar") | |||
gen = node.__iternodes__(None) | |||
self.assertEqual((None, node), next(gen)) | |||
self.assertRaises(StopIteration, next, gen) | |||
def test_strip(self): | |||
"""test Text.__strip__()""" | |||
node = Text("foobar") | |||
for a in (True, False): | |||
for b in (True, False): | |||
self.assertIs(node, node.__strip__(a, b)) | |||
def test_showtree(self): | |||
"""test Text.__showtree__()""" | |||
output = [] | |||
node1 = Text("foobar") | |||
node2 = Text("fóóbar") | |||
node3 = Text("𐌲𐌿𐍄") | |||
node1.__showtree__(output.append, None, None) | |||
node2.__showtree__(output.append, None, None) | |||
node3.__showtree__(output.append, None, None) | |||
res = ["foobar", r"f\xf3\xf3bar", "\\U00010332\\U0001033f\\U00010344"] | |||
self.assertEqual(res, output) | |||
def test_value(self): | |||
"""test getter/setter for the value attribute""" | |||
node = Text("foobar") | |||
self.assertEqual("foobar", node.value) | |||
self.assertIsInstance(node.value, str) | |||
node.value = "héhéhé" | |||
self.assertEqual("héhéhé", node.value) | |||
self.assertIsInstance(node.value, str) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
@@ -0,0 +1,108 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import py3k | |||
from mwparserfromhell.parser import tokens | |||
class TestTokens(unittest.TestCase): | |||
"""Test cases for the Token class and its subclasses.""" | |||
def test_issubclass(self): | |||
"""check that all classes within the tokens module are really Tokens""" | |||
for name in tokens.__all__: | |||
klass = getattr(tokens, name) | |||
self.assertTrue(issubclass(klass, tokens.Token)) | |||
self.assertIsInstance(klass(), klass) | |||
self.assertIsInstance(klass(), tokens.Token) | |||
def test_attributes(self): | |||
"""check that Token attributes can be managed properly""" | |||
token1 = tokens.Token() | |||
token2 = tokens.Token(foo="bar", baz=123) | |||
self.assertEqual("bar", token2.foo) | |||
self.assertEqual(123, token2.baz) | |||
self.assertRaises(KeyError, lambda: token1.foo) | |||
self.assertRaises(KeyError, lambda: token2.bar) | |||
token1.spam = "eggs" | |||
token2.foo = "ham" | |||
del token2.baz | |||
self.assertEqual("eggs", token1.spam) | |||
self.assertEqual("ham", token2.foo) | |||
self.assertRaises(KeyError, lambda: token2.baz) | |||
self.assertRaises(KeyError, delattr, token2, "baz") | |||
def test_repr(self): | |||
"""check that repr() on a Token works as expected""" | |||
token1 = tokens.Token() | |||
token2 = tokens.Token(foo="bar", baz=123) | |||
token3 = tokens.Text(text="earwig" * 100) | |||
hundredchars = ("earwig" * 100)[:97] + "..." | |||
self.assertEqual("Token()", repr(token1)) | |||
if py3k: | |||
token2repr1 = "Token(foo='bar', baz=123)" | |||
token2repr2 = "Token(baz=123, foo='bar')" | |||
token3repr = "Text(text='" + hundredchars + "')" | |||
else: | |||
token2repr1 = "Token(foo=u'bar', baz=123)" | |||
token2repr2 = "Token(baz=123, foo=u'bar')" | |||
token3repr = "Text(text=u'" + hundredchars + "')" | |||
token2repr = repr(token2) | |||
self.assertTrue(token2repr == token2repr1 or token2repr == token2repr2) | |||
self.assertEqual(token3repr, repr(token3)) | |||
def test_equality(self): | |||
"""check that equivalent tokens are considered equal""" | |||
token1 = tokens.Token() | |||
token2 = tokens.Token() | |||
token3 = tokens.Token(foo="bar", baz=123) | |||
token4 = tokens.Text(text="asdf") | |||
token5 = tokens.Text(text="asdf") | |||
token6 = tokens.TemplateOpen(text="asdf") | |||
self.assertEqual(token1, token2) | |||
self.assertEqual(token2, token1) | |||
self.assertEqual(token4, token5) | |||
self.assertEqual(token5, token4) | |||
self.assertNotEqual(token1, token3) | |||
self.assertNotEqual(token2, token3) | |||
self.assertNotEqual(token4, token6) | |||
self.assertNotEqual(token5, token6) | |||
def test_repr_equality(self): | |||
"check that eval(repr(token)) == token" | |||
tests = [ | |||
tokens.Token(), | |||
tokens.Token(foo="bar", baz=123), | |||
tokens.Text(text="earwig") | |||
] | |||
for token in tests: | |||
self.assertEqual(token, eval(repr(token), vars(tokens))) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
@@ -0,0 +1,62 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.nodes import Template, Text | |||
from mwparserfromhell.utils import parse_anything | |||
from ._test_tree_equality import TreeEqualityTestCase, wrap, wraptext | |||
class TestUtils(TreeEqualityTestCase): | |||
"""Tests for the utils module, which provides parse_anything().""" | |||
def test_parse_anything_valid(self): | |||
"""tests for valid input to utils.parse_anything()""" | |||
tests = [ | |||
(wraptext("foobar"), wraptext("foobar")), | |||
(Template(wraptext("spam")), wrap([Template(wraptext("spam"))])), | |||
("fóóbar", wraptext("fóóbar")), | |||
(b"foob\xc3\xa1r", wraptext("foobár")), | |||
(123, wraptext("123")), | |||
(True, wraptext("True")), | |||
(None, wrap([])), | |||
([Text("foo"), Text("bar"), Text("baz")], | |||
wraptext("foo", "bar", "baz")), | |||
([wraptext("foo"), Text("bar"), "baz", 123, 456], | |||
wraptext("foo", "bar", "baz", "123", "456")), | |||
([[[([[((("foo",),),)], "bar"],)]]], wraptext("foo", "bar")) | |||
] | |||
for test, valid in tests: | |||
self.assertWikicodeEqual(valid, parse_anything(test)) | |||
def test_parse_anything_invalid(self): | |||
"""tests for invalid input to utils.parse_anything()""" | |||
self.assertRaises(ValueError, parse_anything, Ellipsis) | |||
self.assertRaises(ValueError, parse_anything, object) | |||
self.assertRaises(ValueError, parse_anything, object()) | |||
self.assertRaises(ValueError, parse_anything, type) | |||
self.assertRaises(ValueError, parse_anything, ["foo", [object]]) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
@@ -0,0 +1,364 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import re | |||
from types import GeneratorType | |||
import unittest | |||
from mwparserfromhell.nodes import (Argument, Comment, Heading, HTMLEntity, | |||
Node, Tag, Template, Text, Wikilink) | |||
from mwparserfromhell.smart_list import SmartList | |||
from mwparserfromhell.wikicode import Wikicode | |||
from mwparserfromhell import parse | |||
from mwparserfromhell.compat import py3k, str | |||
from ._test_tree_equality import TreeEqualityTestCase, wrap, wraptext | |||
class TestWikicode(TreeEqualityTestCase): | |||
"""Tests for the Wikicode class, which manages a list of nodes.""" | |||
def test_unicode(self): | |||
"""test Wikicode.__unicode__()""" | |||
code1 = parse("foobar") | |||
code2 = parse("Have a {{template}} and a [[page|link]]") | |||
self.assertEqual("foobar", str(code1)) | |||
self.assertEqual("Have a {{template}} and a [[page|link]]", str(code2)) | |||
def test_nodes(self): | |||
"""test getter/setter for the nodes attribute""" | |||
code = parse("Have a {{template}}") | |||
self.assertEqual(["Have a ", "{{template}}"], code.nodes) | |||
L1 = SmartList([Text("foobar"), Template(wraptext("abc"))]) | |||
L2 = [Text("barfoo"), Template(wraptext("cba"))] | |||
L3 = "abc{{def}}" | |||
code.nodes = L1 | |||
self.assertIs(L1, code.nodes) | |||
code.nodes = L2 | |||
self.assertIs(L2, code.nodes) | |||
code.nodes = L3 | |||
self.assertEqual(["abc", "{{def}}"], code.nodes) | |||
self.assertRaises(ValueError, setattr, code, "nodes", object) | |||
def test_get(self): | |||
"""test Wikicode.get()""" | |||
code = parse("Have a {{template}} and a [[page|link]]") | |||
self.assertIs(code.nodes[0], code.get(0)) | |||
self.assertIs(code.nodes[2], code.get(2)) | |||
self.assertRaises(IndexError, code.get, 4) | |||
def test_set(self): | |||
"""test Wikicode.set()""" | |||
code = parse("Have a {{template}} and a [[page|link]]") | |||
code.set(1, "{{{argument}}}") | |||
self.assertEqual("Have a {{{argument}}} and a [[page|link]]", code) | |||
self.assertIsInstance(code.get(1), Argument) | |||
code.set(2, None) | |||
self.assertEqual("Have a {{{argument}}}[[page|link]]", code) | |||
code.set(-3, "This is an ") | |||
self.assertEqual("This is an {{{argument}}}[[page|link]]", code) | |||
self.assertRaises(ValueError, code.set, 1, "foo {{bar}}") | |||
self.assertRaises(IndexError, code.set, 3, "{{baz}}") | |||
self.assertRaises(IndexError, code.set, -4, "{{baz}}") | |||
def test_index(self): | |||
"""test Wikicode.index()""" | |||
code = parse("Have a {{template}} and a [[page|link]]") | |||
self.assertEqual(0, code.index("Have a ")) | |||
self.assertEqual(3, code.index("[[page|link]]")) | |||
self.assertEqual(1, code.index(code.get(1))) | |||
self.assertRaises(ValueError, code.index, "foo") | |||
code = parse("{{foo}}{{bar|{{baz}}}}") | |||
self.assertEqual(1, code.index("{{bar|{{baz}}}}")) | |||
self.assertEqual(1, code.index("{{baz}}", recursive=True)) | |||
self.assertEqual(1, code.index(code.get(1).get(1).value, | |||
recursive=True)) | |||
self.assertRaises(ValueError, code.index, "{{baz}}", recursive=False) | |||
self.assertRaises(ValueError, code.index, | |||
code.get(1).get(1).value, recursive=False) | |||
def test_insert(self): | |||
"""test Wikicode.insert()""" | |||
code = parse("Have a {{template}} and a [[page|link]]") | |||
code.insert(1, "{{{argument}}}") | |||
self.assertEqual( | |||
"Have a {{{argument}}}{{template}} and a [[page|link]]", code) | |||
self.assertIsInstance(code.get(1), Argument) | |||
code.insert(2, None) | |||
self.assertEqual( | |||
"Have a {{{argument}}}{{template}} and a [[page|link]]", code) | |||
code.insert(-3, Text("foo")) | |||
self.assertEqual( | |||
"Have a {{{argument}}}foo{{template}} and a [[page|link]]", code) | |||
code2 = parse("{{foo}}{{bar}}{{baz}}") | |||
code2.insert(1, "abc{{def}}ghi[[jk]]") | |||
self.assertEqual("{{foo}}abc{{def}}ghi[[jk]]{{bar}}{{baz}}", code2) | |||
self.assertEqual(["{{foo}}", "abc", "{{def}}", "ghi", "[[jk]]", | |||
"{{bar}}", "{{baz}}"], code2.nodes) | |||
code3 = parse("{{foo}}bar") | |||
code3.insert(1000, "[[baz]]") | |||
code3.insert(-1000, "derp") | |||
self.assertEqual("derp{{foo}}bar[[baz]]", code3) | |||
def test_insert_before(self): | |||
"""test Wikicode.insert_before()""" | |||
code = parse("{{a}}{{b}}{{c}}{{d}}") | |||
code.insert_before("{{b}}", "x", recursive=True) | |||
code.insert_before("{{d}}", "[[y]]", recursive=False) | |||
self.assertEqual("{{a}}x{{b}}{{c}}[[y]]{{d}}", code) | |||
code.insert_before(code.get(2), "z") | |||
self.assertEqual("{{a}}xz{{b}}{{c}}[[y]]{{d}}", code) | |||
self.assertRaises(ValueError, code.insert_before, "{{r}}", "n", | |||
recursive=True) | |||
self.assertRaises(ValueError, code.insert_before, "{{r}}", "n", | |||
recursive=False) | |||
code2 = parse("{{a|{{b}}|{{c|d={{f}}}}}}") | |||
code2.insert_before(code2.get(0).params[0].value.get(0), "x", | |||
recursive=True) | |||
code2.insert_before("{{f}}", "y", recursive=True) | |||
self.assertEqual("{{a|x{{b}}|{{c|d=y{{f}}}}}}", code2) | |||
self.assertRaises(ValueError, code2.insert_before, "{{f}}", "y", | |||
recursive=False) | |||
def test_insert_after(self): | |||
"""test Wikicode.insert_after()""" | |||
code = parse("{{a}}{{b}}{{c}}{{d}}") | |||
code.insert_after("{{b}}", "x", recursive=True) | |||
code.insert_after("{{d}}", "[[y]]", recursive=False) | |||
self.assertEqual("{{a}}{{b}}x{{c}}{{d}}[[y]]", code) | |||
code.insert_after(code.get(2), "z") | |||
self.assertEqual("{{a}}{{b}}xz{{c}}{{d}}[[y]]", code) | |||
self.assertRaises(ValueError, code.insert_after, "{{r}}", "n", | |||
recursive=True) | |||
self.assertRaises(ValueError, code.insert_after, "{{r}}", "n", | |||
recursive=False) | |||
code2 = parse("{{a|{{b}}|{{c|d={{f}}}}}}") | |||
code2.insert_after(code2.get(0).params[0].value.get(0), "x", | |||
recursive=True) | |||
code2.insert_after("{{f}}", "y", recursive=True) | |||
self.assertEqual("{{a|{{b}}x|{{c|d={{f}}y}}}}", code2) | |||
self.assertRaises(ValueError, code2.insert_after, "{{f}}", "y", | |||
recursive=False) | |||
def test_replace(self): | |||
"""test Wikicode.replace()""" | |||
code = parse("{{a}}{{b}}{{c}}{{d}}") | |||
code.replace("{{b}}", "x", recursive=True) | |||
code.replace("{{d}}", "[[y]]", recursive=False) | |||
self.assertEqual("{{a}}x{{c}}[[y]]", code) | |||
code.replace(code.get(1), "z") | |||
self.assertEqual("{{a}}z{{c}}[[y]]", code) | |||
self.assertRaises(ValueError, code.replace, "{{r}}", "n", | |||
recursive=True) | |||
self.assertRaises(ValueError, code.replace, "{{r}}", "n", | |||
recursive=False) | |||
code2 = parse("{{a|{{b}}|{{c|d={{f}}}}}}") | |||
code2.replace(code2.get(0).params[0].value.get(0), "x", recursive=True) | |||
code2.replace("{{f}}", "y", recursive=True) | |||
self.assertEqual("{{a|x|{{c|d=y}}}}", code2) | |||
self.assertRaises(ValueError, code2.replace, "y", "z", recursive=False) | |||
def test_append(self): | |||
"""test Wikicode.append()""" | |||
code = parse("Have a {{template}}") | |||
code.append("{{{argument}}}") | |||
self.assertEqual("Have a {{template}}{{{argument}}}", code) | |||
self.assertIsInstance(code.get(2), Argument) | |||
code.append(None) | |||
self.assertEqual("Have a {{template}}{{{argument}}}", code) | |||
code.append(Text(" foo")) | |||
self.assertEqual("Have a {{template}}{{{argument}}} foo", code) | |||
self.assertRaises(ValueError, code.append, slice(0, 1)) | |||
def test_remove(self): | |||
"""test Wikicode.remove()""" | |||
code = parse("{{a}}{{b}}{{c}}{{d}}") | |||
code.remove("{{b}}", recursive=True) | |||
code.remove(code.get(1), recursive=True) | |||
self.assertEqual("{{a}}{{d}}", code) | |||
self.assertRaises(ValueError, code.remove, "{{r}}", recursive=True) | |||
self.assertRaises(ValueError, code.remove, "{{r}}", recursive=False) | |||
code2 = parse("{{a|{{b}}|{{c|d={{f}}{{h}}}}}}") | |||
code2.remove(code2.get(0).params[0].value.get(0), recursive=True) | |||
code2.remove("{{f}}", recursive=True) | |||
self.assertEqual("{{a||{{c|d={{h}}}}}}", code2) | |||
self.assertRaises(ValueError, code2.remove, "{{h}}", recursive=False) | |||
def test_filter_family(self): | |||
"""test the Wikicode.i?filter() family of functions""" | |||
def genlist(gen): | |||
self.assertIsInstance(gen, GeneratorType) | |||
return list(gen) | |||
ifilter = lambda code: (lambda **kw: genlist(code.ifilter(**kw))) | |||
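        # genlist() checks that ifilter() returns a generator; every assertion
        # below is run against both filter() and its ifilter() counterpart.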
code = parse("a{{b}}c[[d]]{{{e}}}{{f}}[[g]]") | |||
for func in (code.filter, ifilter(code)): | |||
self.assertEqual(["a", "{{b}}", "c", "[[d]]", "{{{e}}}", "{{f}}", | |||
"[[g]]"], func()) | |||
self.assertEqual(["{{{e}}}"], func(forcetype=Argument)) | |||
self.assertIs(code.get(4), func(forcetype=Argument)[0]) | |||
self.assertEqual(["a", "c"], func(forcetype=Text)) | |||
self.assertEqual([], func(forcetype=Heading)) | |||
self.assertRaises(TypeError, func, forcetype=True) | |||
funcs = [ | |||
lambda name, **kw: getattr(code, "filter_" + name)(**kw), | |||
lambda name, **kw: genlist(getattr(code, "ifilter_" + name)(**kw)) | |||
] | |||
for get_filter in funcs: | |||
self.assertEqual(["{{{e}}}"], get_filter("arguments")) | |||
self.assertIs(code.get(4), get_filter("arguments")[0]) | |||
self.assertEqual([], get_filter("comments")) | |||
self.assertEqual([], get_filter("headings")) | |||
self.assertEqual([], get_filter("html_entities")) | |||
self.assertEqual([], get_filter("tags")) | |||
self.assertEqual(["{{b}}", "{{f}}"], get_filter("templates")) | |||
self.assertEqual(["a", "c"], get_filter("text")) | |||
self.assertEqual(["[[d]]", "[[g]]"], get_filter("wikilinks")) | |||
code2 = parse("{{a|{{b}}|{{c|d={{f}}{{h}}}}}}") | |||
for func in (code2.filter, ifilter(code2)): | |||
self.assertEqual(["{{a|{{b}}|{{c|d={{f}}{{h}}}}}}"], | |||
func(recursive=False, forcetype=Template)) | |||
self.assertEqual(["{{a|{{b}}|{{c|d={{f}}{{h}}}}}}", "{{b}}", | |||
"{{c|d={{f}}{{h}}}}", "{{f}}", "{{h}}"], | |||
func(recursive=True, forcetype=Template)) | |||
code3 = parse("{{foobar}}{{FOO}}{{baz}}{{bz}}") | |||
for func in (code3.filter, ifilter(code3)): | |||
self.assertEqual(["{{foobar}}", "{{FOO}}"], func(matches=r"foo")) | |||
self.assertEqual(["{{foobar}}", "{{FOO}}"], | |||
func(matches=r"^{{foo.*?}}")) | |||
self.assertEqual(["{{foobar}}"], | |||
func(matches=r"^{{foo.*?}}", flags=re.UNICODE)) | |||
self.assertEqual(["{{baz}}", "{{bz}}"], func(matches=r"^{{b.*?z")) | |||
self.assertEqual(["{{baz}}"], func(matches=r"^{{b.+?z}}")) | |||
self.assertEqual(["{{a|{{b}}|{{c|d={{f}}{{h}}}}}}"], | |||
code2.filter_templates(recursive=False)) | |||
self.assertEqual(["{{a|{{b}}|{{c|d={{f}}{{h}}}}}}", "{{b}}", | |||
"{{c|d={{f}}{{h}}}}", "{{f}}", "{{h}}"], | |||
code2.filter_templates(recursive=True)) | |||
self.assertEqual(["{{baz}}", "{{bz}}"], | |||
code3.filter_templates(matches=r"^{{b.*?z")) | |||
self.assertEqual([], code3.filter_tags(matches=r"^{{b.*?z")) | |||
self.assertEqual([], code3.filter_tags(matches=r"^{{b.*?z", flags=0)) | |||
self.assertRaises(TypeError, code.filter_templates, 100) | |||
self.assertRaises(TypeError, code.filter_templates, a=42) | |||
self.assertRaises(TypeError, code.filter_templates, forcetype=Template) | |||
def test_get_sections(self): | |||
"""test Wikicode.get_sections()""" | |||
page1 = parse("") | |||
page2 = parse("==Heading==") | |||
page3 = parse("===Heading===\nFoo bar baz\n====Gnidaeh====\n") | |||
p4_lead = "This is a lead.\n" | |||
p4_IA = "=== Section I.A ===\nSection I.A [[body]].\n" | |||
p4_IB1 = "==== Section I.B.1 ====\nSection I.B.1 body.\n\n•Some content.\n\n" | |||
p4_IB = "=== Section I.B ===\n" + p4_IB1 | |||
p4_I = "== Section I ==\nSection I body. {{and a|template}}\n" + p4_IA + p4_IB | |||
p4_II = "== Section II ==\nSection II body.\n\n" | |||
p4_IIIA1a = "===== Section III.A.1.a =====\nMore text.\n" | |||
p4_IIIA2ai1 = "======= Section III.A.2.a.i.1 =======\nAn invalid section!" | |||
p4_IIIA2 = "==== Section III.A.2 ====\nEven more text.\n" + p4_IIIA2ai1 | |||
p4_IIIA = "=== Section III.A ===\nText.\n" + p4_IIIA1a + p4_IIIA2 | |||
p4_III = "== Section III ==\n" + p4_IIIA | |||
page4 = parse(p4_lead + p4_I + p4_II + p4_III) | |||
self.assertEqual([], page1.get_sections()) | |||
self.assertEqual(["", "==Heading=="], page2.get_sections()) | |||
self.assertEqual(["", "===Heading===\nFoo bar baz\n====Gnidaeh====\n", | |||
"====Gnidaeh====\n"], page3.get_sections()) | |||
self.assertEqual([p4_lead, p4_IA, p4_I, p4_IB, p4_IB1, p4_II, | |||
p4_IIIA1a, p4_III, p4_IIIA, p4_IIIA2, p4_IIIA2ai1], | |||
page4.get_sections()) | |||
self.assertEqual(["====Gnidaeh====\n"], page3.get_sections(levels=[4])) | |||
self.assertEqual(["===Heading===\nFoo bar baz\n====Gnidaeh====\n"], | |||
page3.get_sections(levels=(2, 3))) | |||
self.assertEqual([], page3.get_sections(levels=[0])) | |||
self.assertEqual(["", "====Gnidaeh====\n"], | |||
page3.get_sections(levels=[4], include_lead=True)) | |||
self.assertEqual(["===Heading===\nFoo bar baz\n====Gnidaeh====\n", | |||
"====Gnidaeh====\n"], | |||
page3.get_sections(include_lead=False)) | |||
self.assertEqual([p4_IB1, p4_IIIA2], page4.get_sections(levels=[4])) | |||
self.assertEqual([""], page2.get_sections(include_headings=False)) | |||
self.assertEqual(["\nSection I.B.1 body.\n\n•Some content.\n\n", | |||
"\nEven more text.\n" + p4_IIIA2ai1], | |||
page4.get_sections(levels=[4], | |||
include_headings=False)) | |||
self.assertEqual([], page4.get_sections(matches=r"body")) | |||
self.assertEqual([p4_IA, p4_I, p4_IB, p4_IB1], | |||
page4.get_sections(matches=r"Section\sI[.\s].*?")) | |||
self.assertEqual([p4_IA, p4_IIIA1a, p4_IIIA, p4_IIIA2, p4_IIIA2ai1], | |||
page4.get_sections(matches=r".*?a.*?")) | |||
self.assertEqual([p4_IIIA1a, p4_IIIA2ai1], | |||
page4.get_sections(matches=r".*?a.*?", flags=re.U)) | |||
self.assertEqual(["\nMore text.\n", "\nAn invalid section!"], | |||
page4.get_sections(matches=r".*?a.*?", flags=re.U, | |||
include_headings=False)) | |||
page5 = parse("X\n== Foo ==\nBar\n== Baz ==\nBuzz") | |||
section = page5.get_sections(matches="Foo")[0] | |||
section.replace("\nBar\n", "\nBarf ") | |||
section.append("{{Haha}}\n") | |||
self.assertEqual("== Foo ==\nBarf {{Haha}}\n", section) | |||
self.assertEqual("X\n== Foo ==\nBarf {{Haha}}\n== Baz ==\nBuzz", page5) | |||
def test_strip_code(self): | |||
"""test Wikicode.strip_code()""" | |||
# Since individual nodes have test cases for their __strip__ methods, | |||
# we're only going to do an integration test: | |||
code = parse("Foo [[bar]]\n\n{{baz}}\n\n[[a|b]] Σ") | |||
self.assertEqual("Foo bar\n\nb Σ", | |||
code.strip_code(normalize=True, collapse=True)) | |||
self.assertEqual("Foo bar\n\n\n\nb Σ", | |||
code.strip_code(normalize=True, collapse=False)) | |||
self.assertEqual("Foo bar\n\nb Σ", | |||
code.strip_code(normalize=False, collapse=True)) | |||
self.assertEqual("Foo bar\n\n\n\nb Σ", | |||
code.strip_code(normalize=False, collapse=False)) | |||
def test_get_tree(self): | |||
"""test Wikicode.get_tree()""" | |||
        # Since individual nodes have test cases for their __showtree__
        # methods, and the docstring covers all possibilities for the output
        # of __showtree__, we only need a single integration test:
code = parse("Lorem ipsum {{foo|bar|{{baz}}|spam=eggs}}") | |||
expected = "Lorem ipsum \n{{\n\t foo\n\t| 1\n\t= bar\n\t| 2\n\t= " + \ | |||
"{{\n\t\t\tbaz\n\t }}\n\t| spam\n\t= eggs\n}}" | |||
self.assertEqual(expected.expandtabs(4), code.get_tree()) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
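The tests above exercise Wikicode's public surface: the filter()/ifilter() family, get_sections(), strip_code(), and get_tree(). As a rough sketch of the same calls in everyday use (illustrative only, not part of the patch; it assumes the package is importable and uses made-up sample text):

    # Sketch: the Wikicode methods covered by the tests above, outside unittest.
    import mwparserfromhell

    code = mwparserfromhell.parse("Intro.\n== Section ==\nSee [[Foo|foo]] and {{bar|baz=1}}.")
    print(code.filter_templates())                 # top-level Template nodes
    print(code.filter_wikilinks())                 # top-level Wikilink nodes
    print(code.get_sections(levels=[2]))           # the "== Section ==" chunk
    print(code.strip_code(normalize=True, collapse=True))   # plain text, entities decoded
    print(code.get_tree())                         # indented node-by-node view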
@@ -0,0 +1,107 @@ | |||
# -*- coding: utf-8 -*- | |||
# | |||
# Copyright (C) 2012-2013 Ben Kurtovic <ben.kurtovic@verizon.net> | |||
# | |||
# Permission is hereby granted, free of charge, to any person obtaining a copy | |||
# of this software and associated documentation files (the "Software"), to deal | |||
# in the Software without restriction, including without limitation the rights | |||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | |||
# copies of the Software, and to permit persons to whom the Software is | |||
# furnished to do so, subject to the following conditions: | |||
# | |||
# The above copyright notice and this permission notice shall be included in | |||
# all copies or substantial portions of the Software. | |||
# | |||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | |||
# SOFTWARE. | |||
from __future__ import unicode_literals | |||
import unittest | |||
from mwparserfromhell.compat import str | |||
from mwparserfromhell.nodes import Text, Wikilink | |||
from ._test_tree_equality import TreeEqualityTestCase, getnodes, wrap, wraptext | |||
class TestWikilink(TreeEqualityTestCase): | |||
"""Test cases for the Wikilink node.""" | |||
def test_unicode(self): | |||
"""test Wikilink.__unicode__()""" | |||
node = Wikilink(wraptext("foobar")) | |||
self.assertEqual("[[foobar]]", str(node)) | |||
node2 = Wikilink(wraptext("foo"), wraptext("bar")) | |||
self.assertEqual("[[foo|bar]]", str(node2)) | |||
def test_iternodes(self): | |||
"""test Wikilink.__iternodes__()""" | |||
node1n1 = Text("foobar") | |||
node2n1, node2n2, node2n3 = Text("foo"), Text("bar"), Text("baz") | |||
node1 = Wikilink(wrap([node1n1])) | |||
node2 = Wikilink(wrap([node2n1]), wrap([node2n2, node2n3])) | |||
gen1 = node1.__iternodes__(getnodes) | |||
gen2 = node2.__iternodes__(getnodes) | |||
self.assertEqual((None, node1), next(gen1)) | |||
self.assertEqual((None, node2), next(gen2)) | |||
self.assertEqual((node1.title, node1n1), next(gen1)) | |||
self.assertEqual((node2.title, node2n1), next(gen2)) | |||
self.assertEqual((node2.text, node2n2), next(gen2)) | |||
self.assertEqual((node2.text, node2n3), next(gen2)) | |||
self.assertRaises(StopIteration, next, gen1) | |||
self.assertRaises(StopIteration, next, gen2) | |||
def test_strip(self): | |||
"""test Wikilink.__strip__()""" | |||
node = Wikilink(wraptext("foobar")) | |||
node2 = Wikilink(wraptext("foo"), wraptext("bar")) | |||
for a in (True, False): | |||
for b in (True, False): | |||
self.assertEqual("foobar", node.__strip__(a, b)) | |||
self.assertEqual("bar", node2.__strip__(a, b)) | |||
def test_showtree(self): | |||
"""test Wikilink.__showtree__()""" | |||
output = [] | |||
getter, marker = object(), object() | |||
get = lambda code: output.append((getter, code)) | |||
mark = lambda: output.append(marker) | |||
node1 = Wikilink(wraptext("foobar")) | |||
node2 = Wikilink(wraptext("foo"), wraptext("bar")) | |||
node1.__showtree__(output.append, get, mark) | |||
node2.__showtree__(output.append, get, mark) | |||
valid = [ | |||
"[[", (getter, node1.title), "]]", "[[", (getter, node2.title), | |||
" | ", marker, (getter, node2.text), "]]"] | |||
self.assertEqual(valid, output) | |||
def test_title(self): | |||
"""test getter/setter for the title attribute""" | |||
title = wraptext("foobar") | |||
node1 = Wikilink(title) | |||
node2 = Wikilink(title, wraptext("baz")) | |||
self.assertIs(title, node1.title) | |||
self.assertIs(title, node2.title) | |||
node1.title = "héhehé" | |||
node2.title = "héhehé" | |||
self.assertWikicodeEqual(wraptext("héhehé"), node1.title) | |||
self.assertWikicodeEqual(wraptext("héhehé"), node2.title) | |||
def test_text(self): | |||
"""test getter/setter for the text attribute""" | |||
text = wraptext("baz") | |||
node1 = Wikilink(wraptext("foobar")) | |||
node2 = Wikilink(wraptext("foobar"), text) | |||
self.assertIs(None, node1.text) | |||
self.assertIs(text, node2.text) | |||
node1.text = "buzz" | |||
node2.text = None | |||
self.assertWikicodeEqual(wraptext("buzz"), node1.text) | |||
self.assertIs(None, node2.text) | |||
if __name__ == "__main__": | |||
unittest.main(verbosity=2) |
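Outside the unittest harness, the same Wikilink attributes are reachable from parsed text. A minimal sketch (parse() and filter_wikilinks() are the entry points used elsewhere in this patch; the sample text is made up):

    # Sketch: Wikilink.title and Wikilink.text as exercised by TestWikilink.
    import mwparserfromhell

    link = mwparserfromhell.parse("[[Foo|bar]]").filter_wikilinks()[0]
    print(link.title)   # "Foo"
    print(link.text)    # "bar"; a pipe-less link such as [[Foo]] has text None
    link.text = None    # drop the display text again
    print(link)         # "[[Foo]]"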
@@ -0,0 +1,130 @@ | |||
name: blank | |||
label: argument with no content | |||
input: "{{{}}}" | |||
output: [ArgumentOpen(), ArgumentClose()] | |||
--- | |||
name: blank_with_default | |||
label: argument with no content but a pipe | |||
input: "{{{|}}}" | |||
output: [ArgumentOpen(), ArgumentSeparator(), ArgumentClose()] | |||
--- | |||
name: basic | |||
label: simplest type of argument | |||
input: "{{{argument}}}" | |||
output: [ArgumentOpen(), Text(text="argument"), ArgumentClose()] | |||
--- | |||
name: default | |||
label: argument with a default value | |||
input: "{{{foo|bar}}}" | |||
output: [ArgumentOpen(), Text(text="foo"), ArgumentSeparator(), Text(text="bar"), ArgumentClose()] | |||
--- | |||
name: blank_with_multiple_defaults | |||
label: no content, multiple pipes | |||
input: "{{{|||}}}" | |||
output: [ArgumentOpen(), ArgumentSeparator(), Text(text="||"), ArgumentClose()] | |||
--- | |||
name: multiple_defaults | |||
label: multiple values separated by pipes | |||
input: "{{{foo|bar|baz}}}" | |||
output: [ArgumentOpen(), Text(text="foo"), ArgumentSeparator(), Text(text="bar|baz"), ArgumentClose()] | |||
--- | |||
name: newline | |||
label: newline as only content | |||
input: "{{{\n}}}" | |||
output: [ArgumentOpen(), Text(text="\n"), ArgumentClose()] | |||
--- | |||
name: right_braces | |||
label: multiple } scattered throughout text | |||
input: "{{{foo}b}a}r}}}" | |||
output: [ArgumentOpen(), Text(text="foo}b}a}r"), ArgumentClose()] | |||
--- | |||
name: right_braces_default | |||
label: multiple } scattered throughout text, with a default value | |||
input: "{{{foo}b}|}a}r}}}" | |||
output: [ArgumentOpen(), Text(text="foo}b}"), ArgumentSeparator(), Text(text="}a}r"), ArgumentClose()] | |||
--- | |||
name: nested | |||
label: an argument nested within another argument | |||
input: "{{{{{{foo}}}|{{{bar}}}}}}" | |||
output: [ArgumentOpen(), ArgumentOpen(), Text(text="foo"), ArgumentClose(), ArgumentSeparator(), ArgumentOpen(), Text(text="bar"), ArgumentClose(), ArgumentClose()] | |||
--- | |||
name: invalid_braces | |||
label: invalid argument: multiple braces that are not part of a template or argument | |||
input: "{{{foo{{[a}}}}}" | |||
output: [Text(text="{{{foo{{[a}}}}}")] | |||
--- | |||
name: incomplete_open_only | |||
label: incomplete arguments: just an open | |||
input: "{{{" | |||
output: [Text(text="{{{")] | |||
--- | |||
name: incomplete_open_text | |||
label: incomplete arguments: an open with some text | |||
input: "{{{foo" | |||
output: [Text(text="{{{foo")] | |||
--- | |||
name: incomplete_open_text_pipe | |||
label: incomplete arguments: an open, text, then a pipe | |||
input: "{{{foo|" | |||
output: [Text(text="{{{foo|")] | |||
--- | |||
name: incomplete_open_pipe | |||
label: incomplete arguments: an open, then a pipe | |||
input: "{{{|" | |||
output: [Text(text="{{{|")] | |||
--- | |||
name: incomplete_open_pipe_text | |||
label: incomplete arguments: an open, then a pipe, then text | |||
input: "{{{|foo" | |||
output: [Text(text="{{{|foo")] | |||
--- | |||
name: incomplete_open_pipes_text | |||
label: incomplete arguments: an open, then a pipe, then text, then two pipes
input: "{{{|f||" | |||
output: [Text(text="{{{|f||")] | |||
--- | |||
name: incomplete_open_partial_close | |||
label: incomplete arguments: an open, then one right brace | |||
input: "{{{{}" | |||
output: [Text(text="{{{{}")] | |||
--- | |||
name: incomplete_preserve_previous | |||
label: incomplete arguments: a valid argument followed by an invalid one | |||
input: "{{{foo}}} {{{bar" | |||
output: [ArgumentOpen(), Text(text="foo"), ArgumentClose(), Text(text=" {{{bar")] |
@@ -0,0 +1,39 @@ | |||
name: blank | |||
label: a blank comment | |||
input: "<!---->" | |||
output: [CommentStart(), CommentEnd()] | |||
--- | |||
name: basic | |||
label: a basic comment | |||
input: "<!-- comment -->" | |||
output: [CommentStart(), Text(text=" comment "), CommentEnd()] | |||
--- | |||
name: tons_of_nonsense | |||
label: a comment with tons of ignorable garbage in it | |||
input: "<!-- foo{{bar}}[[basé\n\n]{}{}{}{}]{{{{{{haha{{--a>aa<!--aa -->" | |||
output: [CommentStart(), Text(text=" foo{{bar}}[[basé\n\n]{}{}{}{}]{{{{{{haha{{--a>aa<!--aa "), CommentEnd()] | |||
--- | |||
name: incomplete_blank | |||
label: a comment that doesn't close | |||
input: "<!--" | |||
output: [Text(text="<!--")] | |||
--- | |||
name: incomplete_text | |||
label: a comment that doesn't close, with text | |||
input: "<!-- foo" | |||
output: [Text(text="<!-- foo")] | |||
--- | |||
name: incomplete_partial_close | |||
label: a comment that doesn't close, with a partial close | |||
input: "<!-- foo --\x01>" | |||
output: [Text(text="<!-- foo --\x01>")] |
@@ -0,0 +1,109 @@ | |||
name: level_1 | |||
label: a basic level-1 heading | |||
input: "= Heading =" | |||
output: [HeadingStart(level=1), Text(text=" Heading "), HeadingEnd()] | |||
--- | |||
name: level_2 | |||
label: a basic level-2 heading | |||
input: "== Heading ==" | |||
output: [HeadingStart(level=2), Text(text=" Heading "), HeadingEnd()] | |||
--- | |||
name: level_3 | |||
label: a basic level-3 heading | |||
input: "=== Heading ===" | |||
output: [HeadingStart(level=3), Text(text=" Heading "), HeadingEnd()] | |||
--- | |||
name: level_4 | |||
label: a basic level-4 heading | |||
input: "==== Heading ====" | |||
output: [HeadingStart(level=4), Text(text=" Heading "), HeadingEnd()] | |||
--- | |||
name: level_5 | |||
label: a basic level-5 heading | |||
input: "===== Heading =====" | |||
output: [HeadingStart(level=5), Text(text=" Heading "), HeadingEnd()] | |||
--- | |||
name: level_6 | |||
label: a basic level-6 heading | |||
input: "====== Heading ======" | |||
output: [HeadingStart(level=6), Text(text=" Heading "), HeadingEnd()] | |||
--- | |||
name: level_7 | |||
label: a level-6 heading that pretends to be a level-7 heading | |||
input: "======= Heading =======" | |||
output: [HeadingStart(level=6), Text(text="= Heading ="), HeadingEnd()] | |||
--- | |||
name: level_3_2 | |||
label: a level-2 heading that pretends to be a level-3 heading | |||
input: "=== Heading ==" | |||
output: [HeadingStart(level=2), Text(text="= Heading "), HeadingEnd()] | |||
--- | |||
name: level_4_6 | |||
label: a level-4 heading that pretends to be a level-6 heading | |||
input: "==== Heading ======" | |||
output: [HeadingStart(level=4), Text(text=" Heading =="), HeadingEnd()] | |||
--- | |||
name: newline_before | |||
label: a heading that starts after a newline | |||
input: "This is some text.\n== Foobar ==\nbaz" | |||
output: [Text(text="This is some text.\n"), HeadingStart(level=2), Text(text=" Foobar "), HeadingEnd(), Text(text="\nbaz")] | |||
--- | |||
name: text_after | |||
label: text on the same line after | |||
input: "This is some text.\n== Foobar == baz" | |||
output: [Text(text="This is some text.\n"), HeadingStart(level=2), Text(text=" Foobar "), HeadingEnd(), Text(text=" baz")] | |||
--- | |||
name: invalid_text_before | |||
label: invalid headings: text on the same line before | |||
input: "This is some text. == Foobar ==\nbaz" | |||
output: [Text(text="This is some text. == Foobar ==\nbaz")] | |||
--- | |||
name: invalid_newline_middle | |||
label: invalid headings: newline in the middle | |||
input: "This is some text.\n== Foo\nbar ==" | |||
output: [Text(text="This is some text.\n== Foo\nbar ==")] | |||
--- | |||
name: invalid_newline_end | |||
label: invalid headings: newline at the end
input: "This is some text.\n=== Foo\n===" | |||
output: [Text(text="This is some text.\n=== Foo\n===")] | |||
--- | |||
name: invalid_nesting | |||
label: invalid headings: attempts at nesting | |||
input: "== Foo === Bar === Baz ==" | |||
output: [HeadingStart(level=2), Text(text=" Foo === Bar === Baz "), HeadingEnd()] | |||
--- | |||
name: incomplete | |||
label: a heading that starts but doesn't finish | |||
input: "Foobar. \n== Heading " | |||
output: [Text(text="Foobar. \n== Heading ")] |
@@ -0,0 +1,144 @@ | |||
name: named | |||
label: a basic named HTML entity | |||
input: " " | |||
output: [HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd()] | |||
--- | |||
name: numeric_decimal | |||
label: a basic decimal HTML entity | |||
input: "k" | |||
output: [HTMLEntityStart(), HTMLEntityNumeric(), Text(text="107"), HTMLEntityEnd()] | |||
--- | |||
name: numeric_hexadecimal_x | |||
label: a basic hexadecimal HTML entity, using 'x' as a signal | |||
input: "k" | |||
output: [HTMLEntityStart(), HTMLEntityNumeric(), HTMLEntityHex(char="x"), Text(text="6B"), HTMLEntityEnd()] | |||
--- | |||
name: numeric_hexadecimal_X | |||
label: a basic hexadecimal HTML entity, using 'X' as a signal | |||
input: "k" | |||
output: [HTMLEntityStart(), HTMLEntityNumeric(), HTMLEntityHex(char="X"), Text(text="6B"), HTMLEntityEnd()] | |||
--- | |||
name: numeric_decimal_max | |||
label: the maximum acceptable decimal numeric entity | |||
input: "" | |||
output: [HTMLEntityStart(), HTMLEntityNumeric(), Text(text="1114111"), HTMLEntityEnd()] | |||
--- | |||
name: numeric_hex_max | |||
label: the maximum acceptable hexadecimal numeric entity | |||
input: "" | |||
output: [HTMLEntityStart(), HTMLEntityNumeric(), HTMLEntityHex(char="x"), Text(text="10FFFF"), HTMLEntityEnd()] | |||
--- | |||
name: numeric_zeros | |||
label: zeros accepted at the beginning of a numeric entity | |||
input: "k" | |||
output: [HTMLEntityStart(), HTMLEntityNumeric(), Text(text="0000000107"), HTMLEntityEnd()] | |||
--- | |||
name: numeric_hex_zeros | |||
label: zeros accepted at the beginning of a hex numeric entity | |||
input: "ć" | |||
output: [HTMLEntityStart(), HTMLEntityNumeric(), HTMLEntityHex(char="x"), Text(text="0000000107"), HTMLEntityEnd()] | |||
--- | |||
name: invalid_named_too_long | |||
label: a named entity that is too long | |||
input: "&sigmaSigma;" | |||
output: [Text(text="&sigmaSigma;")] | |||
--- | |||
name: invalid_named_undefined | |||
label: a named entity that doesn't exist | |||
input: "&foobar;" | |||
output: [Text(text="&foobar;")] | |||
--- | |||
name: invalid_named_nonascii | |||
label: a named entity with non-ASCII characters | |||
input: "&sígma;" | |||
output: [Text(text="&sígma;")] | |||
--- | |||
name: invalid_numeric_out_of_range_1 | |||
label: a numeric entity that is out of range: < 1 | |||
input: "�" | |||
output: [Text(text="�")] | |||
--- | |||
name: invalid_numeric_out_of_range_2 | |||
label: a hex numeric entity that is out of range: < 1 | |||
input: "�" | |||
output: [Text(text="�")] | |||
--- | |||
name: invalid_numeric_out_of_range_3 | |||
label: a numeric entity that is out of range: > 0x10FFFF | |||
input: "�" | |||
output: [Text(text="�")] | |||
--- | |||
name: invalid_numeric_out_of_range_4 | |||
label: a hex numeric entity that is out of range: > 0x10FFFF | |||
input: "�" | |||
output: [Text(text="�")] | |||
--- | |||
name: invalid_partial_amp | |||
label: invalid entities: just an ampersand | |||
input: "&" | |||
output: [Text(text="&")] | |||
--- | |||
name: invalid_partial_amp_semicolon | |||
label: invalid entities: an ampersand and semicolon | |||
input: "&;" | |||
output: [Text(text="&;")] | |||
--- | |||
name: invalid_partial_amp_pound_semicolon | |||
label: invalid entities: an ampersand, pound sign, and semicolon | |||
input: "&#;" | |||
output: [Text(text="&#;")] | |||
--- | |||
name: invalid_partial_amp_pound_x_semicolon | |||
label: invalid entities: an ampersand, pound sign, x, and semicolon | |||
input: "&#x;" | |||
output: [Text(text="&#x;")] | |||
--- | |||
name: invalid_partial_amp_pound_numbers | |||
label: invalid entities: an ampersand, pound sign, and numbers
input: "&#123"
output: [Text(text="&#123")]
--- | |||
name: invalid_partial_amp_pound_x
label: invalid entities: an ampersand, pound sign, and x | |||
input: "&#x" | |||
output: [Text(text="&#x")] |
@@ -0,0 +1,46 @@ | |||
name: empty | |||
label: sanity check that parsing an empty string yields nothing | |||
input: "" | |||
output: [] | |||
--- | |||
name: template_argument_mix | |||
label: an ambiguous mix of templates and arguments | |||
input: "{{{{{{{{foo}}}}}}}}{{{{{{{bar}}baz}}}buz}}" | |||
output: [TemplateOpen(), ArgumentOpen(), ArgumentOpen(), Text(text="foo"), ArgumentClose(), ArgumentClose(), TemplateClose(), TemplateOpen(), ArgumentOpen(), TemplateOpen(), Text(text="bar"), TemplateClose(), Text(text="baz"), ArgumentClose(), Text(text="buz"), TemplateClose()] | |||
--- | |||
name: rich_heading | |||
label: a heading with templates/wikilinks in it | |||
input: "== Head{{ing}} [[with]] {{{funky|{{stuf}}}}} ==" | |||
output: [HeadingStart(level=2), Text(text=" Head"), TemplateOpen(), Text(text="ing"), TemplateClose(), Text(text=" "), WikilinkOpen(), Text(text="with"), WikilinkClose(), Text(text=" "), ArgumentOpen(), Text(text="funky"), ArgumentSeparator(), TemplateOpen(), Text(text="stuf"), TemplateClose(), ArgumentClose(), Text(text=" "), HeadingEnd()] | |||
--- | |||
name: html_entity_with_template | |||
label: an HTML entity with a template embedded inside
input: "&n{{bs}}p;" | |||
output: [Text(text="&n"), TemplateOpen(), Text(text="bs"), TemplateClose(), Text(text="p;")] | |||
--- | |||
name: html_entity_with_comment | |||
label: an HTML entity with a comment embedded inside
input: "&n<!--foo-->bsp;" | |||
output: [Text(text="&n"), CommentStart(), Text(text="foo"), CommentEnd(), Text(text="bsp;")] | |||
--- | |||
name: wildcard | |||
label: a wildcard assortment of various things | |||
input: "{{{{{{{{foo}}bar|baz=biz}}buzz}}usr|{{bin}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateParamSeparator(), Text(text="baz"), TemplateParamEquals(), Text(text="biz"), TemplateClose(), Text(text="buzz"), TemplateClose(), Text(text="usr"), TemplateParamSeparator(), TemplateOpen(), Text(text="bin"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: wildcard_redux | |||
label: an even wilder assortment of various things | |||
input: "{{a|b|{{c|[[d]]{{{e}}}}}}}[[f|{{{g}}}<!--h-->]]{{i|j= }}" | |||
output: [TemplateOpen(), Text(text="a"), TemplateParamSeparator(), Text(text="b"), TemplateParamSeparator(), TemplateOpen(), Text(text="c"), TemplateParamSeparator(), WikilinkOpen(), Text(text="d"), WikilinkClose(), ArgumentOpen(), Text(text="e"), ArgumentClose(), TemplateClose(), TemplateClose(), WikilinkOpen(), Text(text="f"), WikilinkSeparator(), ArgumentOpen(), Text(text="g"), ArgumentClose(), CommentStart(), Text(text="h"), CommentEnd(), WikilinkClose(), TemplateOpen(), Text(text="i"), TemplateParamSeparator(), Text(text="j"), TemplateParamEquals(), HTMLEntityStart(), Text(text="nbsp"), HTMLEntityEnd(), TemplateClose()] |
@@ -0,0 +1,641 @@ | |||
name: blank | |||
label: template with no content | |||
input: "{{}}" | |||
output: [TemplateOpen(), TemplateClose()] | |||
--- | |||
name: blank_with_params | |||
label: template with no content, but pipes and equal signs | |||
input: "{{||=|}}" | |||
output: [TemplateOpen(), TemplateParamSeparator(), TemplateParamSeparator(), TemplateParamEquals(), TemplateParamSeparator(), TemplateClose()] | |||
--- | |||
name: no_params | |||
label: simplest type of template | |||
input: "{{template}}" | |||
output: [TemplateOpen(), Text(text="template"), TemplateClose()] | |||
--- | |||
name: one_param_unnamed | |||
label: basic template with one unnamed parameter | |||
input: "{{foo|bar}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateClose()] | |||
--- | |||
name: one_param_named | |||
label: basic template with one named parameter | |||
input: "{{foo|bar=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: multiple_unnamed_params | |||
label: basic template with multiple unnamed parameters | |||
input: "{{foo|bar|baz|biz|buzz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateParamSeparator(), Text(text="baz"), TemplateParamSeparator(), Text(text="biz"), TemplateParamSeparator(), Text(text="buzz"), TemplateClose()] | |||
--- | |||
name: multiple_named_params | |||
label: basic template with multiple named parameters | |||
input: "{{foo|bar=baz|biz=buzz|buff=baff|usr=bin}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateParamEquals(), Text(text="baz"), TemplateParamSeparator(), Text(text="biz"), TemplateParamEquals(), Text(text="buzz"), TemplateParamSeparator(), Text(text="buff"), TemplateParamEquals(), Text(text="baff"), TemplateParamSeparator(), Text(text="usr"), TemplateParamEquals(), Text(text="bin"), TemplateClose()] | |||
--- | |||
name: multiple_mixed_params | |||
label: basic template with multiple unnamed/named parameters | |||
input: "{{foo|bar=baz|biz|buzz=buff|usr|bin}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateParamEquals(), Text(text="baz"), TemplateParamSeparator(), Text(text="biz"), TemplateParamSeparator(), Text(text="buzz"), TemplateParamEquals(), Text(text="buff"), TemplateParamSeparator(), Text(text="usr"), TemplateParamSeparator(), Text(text="bin"), TemplateClose()] | |||
--- | |||
name: multiple_mixed_params2 | |||
label: basic template with multiple unnamed/named parameters in another order | |||
input: "{{foo|bar|baz|biz=buzz|buff=baff|usr=bin}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateParamSeparator(), Text(text="baz"), TemplateParamSeparator(), Text(text="biz"), TemplateParamEquals(), Text(text="buzz"), TemplateParamSeparator(), Text(text="buff"), TemplateParamEquals(), Text(text="baff"), TemplateParamSeparator(), Text(text="usr"), TemplateParamEquals(), Text(text="bin"), TemplateClose()] | |||
--- | |||
name: nested_unnamed_param | |||
label: nested template as an unnamed parameter | |||
input: "{{foo|{{bar}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_named_param_value | |||
label: nested template as a parameter value with a named parameter | |||
input: "{{foo|bar={{baz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateParamEquals(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_named_param_name_and_value | |||
label: nested templates as a parameter name and value | |||
input: "{{foo|{{bar}}={{baz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateParamEquals(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_start | |||
label: nested template at the beginning of a template name | |||
input: "{{{{foo}}bar}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateClose()] | |||
--- | |||
name: nested_name_start_unnamed_param | |||
label: nested template at the beginning of a template name and as an unnamed parameter | |||
input: "{{{{foo}}bar|{{baz}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateParamSeparator(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_start_named_param_value | |||
label: nested template at the beginning of a template name and as a parameter value with a named parameter | |||
input: "{{{{foo}}bar|baz={{biz}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateParamSeparator(), Text(text="baz"), TemplateParamEquals(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_start_named_param_name_and_value | |||
label: nested template at the beginning of a template name and as a parameter name and value | |||
input: "{{{{foo}}bar|{{baz}}={{biz}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateParamSeparator(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateParamEquals(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_end | |||
label: nested template at the end of a template name | |||
input: "{{foo{{bar}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_end_unnamed_param | |||
label: nested template at the end of a template name and as an unnamed parameter | |||
input: "{{foo{{bar}}|{{baz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateParamSeparator(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_end_named_param_value | |||
label: nested template at the end of a template name and as a parameter value with a named parameter | |||
input: "{{foo{{bar}}|baz={{biz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateParamSeparator(), Text(text="baz"), TemplateParamEquals(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_end_named_param_name_and_value | |||
label: nested template at the end of a template name and as a parameter name and value | |||
input: "{{foo{{bar}}|{{baz}}={{biz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateParamSeparator(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateParamEquals(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_mid | |||
label: nested template in the middle of a template name | |||
input: "{{foo{{bar}}baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: nested_name_mid_unnamed_param | |||
label: nested template in the middle of a template name and as an unnamed parameter | |||
input: "{{foo{{bar}}baz|{{biz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateParamSeparator(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_mid_named_param_value | |||
label: nested template in the middle of a template name and as a parameter value with a named parameter | |||
input: "{{foo{{bar}}baz|biz={{buzz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateParamSeparator(), Text(text="biz"), TemplateParamEquals(), TemplateOpen(), Text(text="buzz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_mid_named_param_name_and_value | |||
label: nested template in the middle of a template name and as a parameter name and value | |||
input: "{{foo{{bar}}baz|{{biz}}={{buzz}}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateOpen(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateParamSeparator(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateParamEquals(), TemplateOpen(), Text(text="buzz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_start_end | |||
label: nested template at the beginning and end of a template name | |||
input: "{{{{foo}}{{bar}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_start_end_unnamed_param | |||
label: nested template at the beginning and end of a template name and as an unnamed parameter | |||
input: "{{{{foo}}{{bar}}|{{baz}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateParamSeparator(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_start_end_named_param_value | |||
label: nested template at the beginning and end of a template name and as a parameter value with a named parameter | |||
input: "{{{{foo}}{{bar}}|baz={{biz}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateParamSeparator(), Text(text="baz"), TemplateParamEquals(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_name_start_end_named_param_name_and_value | |||
label: nested template at the beginning and end of a template name and as a parameter name and value | |||
input: "{{{{foo}}{{bar}}|{{baz}}={{biz}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), TemplateOpen(), Text(text="bar"), TemplateClose(), TemplateParamSeparator(), TemplateOpen(), Text(text="baz"), TemplateClose(), TemplateParamEquals(), TemplateOpen(), Text(text="biz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_names_multiple | |||
label: multiple nested templates within nested templates | |||
input: "{{{{{{{{foo}}bar}}baz}}biz}}" | |||
output: [TemplateOpen(), TemplateOpen(), TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateClose(), Text(text="biz"), TemplateClose()] | |||
--- | |||
name: nested_names_multiple_unnamed_param | |||
label: multiple nested templates within nested templates with a nested unnamed parameter | |||
input: "{{{{{{{{foo}}bar}}baz}}biz|{{buzz}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateClose(), Text(text="biz"), TemplateParamSeparator(), TemplateOpen(), Text(text="buzz"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_names_multiple_named_param_value | |||
label: multiple nested templates within nested templates with a nested parameter value in a named parameter | |||
input: "{{{{{{{{foo}}bar}}baz}}biz|buzz={{bin}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateClose(), Text(text="biz"), TemplateParamSeparator(), Text(text="buzz"), TemplateParamEquals(), TemplateOpen(), Text(text="bin"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: nested_names_multiple_named_param_name_and_value | |||
label: multiple nested templates within nested templates with a nested parameter name and value | |||
input: "{{{{{{{{foo}}bar}}baz}}biz|{{buzz}}={{bin}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateClose(), Text(text="baz"), TemplateClose(), Text(text="biz"), TemplateParamSeparator(), TemplateOpen(), Text(text="buzz"), TemplateClose(), TemplateParamEquals(), TemplateOpen(), Text(text="bin"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: mixed_nested_templates | |||
label: mixed assortment of nested templates within template names, parameter names, and values | |||
input: "{{{{{{{{foo}}bar|baz=biz}}buzz}}usr|{{bin}}}}" | |||
output: [TemplateOpen(), TemplateOpen(), TemplateOpen(), TemplateOpen(), Text(text="foo"), TemplateClose(), Text(text="bar"), TemplateParamSeparator(), Text(text="baz"), TemplateParamEquals(), Text(text="biz"), TemplateClose(), Text(text="buzz"), TemplateClose(), Text(text="usr"), TemplateParamSeparator(), TemplateOpen(), Text(text="bin"), TemplateClose(), TemplateClose()] | |||
--- | |||
name: newlines_start | |||
label: a newline at the start of a template name | |||
input: "{{\nfoobar}}" | |||
output: [TemplateOpen(), Text(text="\nfoobar"), TemplateClose()] | |||
--- | |||
name: newlines_end | |||
label: a newline at the end of a template name | |||
input: "{{foobar\n}}" | |||
output: [TemplateOpen(), Text(text="foobar\n"), TemplateClose()] | |||
--- | |||
name: newlines_start_end | |||
label: a newline at the start and end of a template name | |||
input: "{{\nfoobar\n}}" | |||
output: [TemplateOpen(), Text(text="\nfoobar\n"), TemplateClose()] | |||
--- | |||
name: newlines_mid | |||
label: a newline at the middle of a template name | |||
input: "{{foo\nbar}}" | |||
output: [Text(text="{{foo\nbar}}")] | |||
--- | |||
name: newlines_start_mid | |||
label: a newline at the start and middle of a template name | |||
input: "{{\nfoo\nbar}}" | |||
output: [Text(text="{{\nfoo\nbar}}")] | |||
--- | |||
name: newlines_mid_end | |||
label: a newline at the middle and end of a template name | |||
input: "{{foo\nbar\n}}" | |||
output: [Text(text="{{foo\nbar\n}}")] | |||
--- | |||
name: newlines_start_mid_end | |||
label: a newline at the start, middle, and end of a template name | |||
input: "{{\nfoo\nbar\n}}" | |||
output: [Text(text="{{\nfoo\nbar\n}}")] | |||
--- | |||
name: newlines_unnamed_param | |||
label: newlines within an unnamed template parameter | |||
input: "{{foo|\nb\nar\n}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="\nb\nar\n"), TemplateClose()] | |||
--- | |||
name: newlines_enclose_template_name_unnamed_param | |||
label: newlines enclosing a template name and within an unnamed template parameter | |||
input: "{{\nfoo\n|\nb\nar\n}}" | |||
output: [TemplateOpen(), Text(text="\nfoo\n"), TemplateParamSeparator(), Text(text="\nb\nar\n"), TemplateClose()] | |||
--- | |||
name: newlines_within_template_name_unnamed_param | |||
label: newlines within a template name and within an unnamed template parameter | |||
input: "{{\nfo\no\n|\nb\nar\n}}" | |||
output: [Text(text="{{\nfo\no\n|\nb\nar\n}}")] | |||
--- | |||
name: newlines_enclose_template_name_named_param_value | |||
label: newlines enclosing a template name and within a named parameter value | |||
input: "{{\nfoo\n|1=\nb\nar\n}}" | |||
output: [TemplateOpen(), Text(text="\nfoo\n"), TemplateParamSeparator(), Text(text="1"), TemplateParamEquals(), Text(text="\nb\nar\n"), TemplateClose()] | |||
--- | |||
name: newlines_within_template_name_named_param_value | |||
label: newlines within a template name and within a named parameter value | |||
input: "{{\nf\noo\n|1=\nb\nar\n}}" | |||
output: [Text(text="{{\nf\noo\n|1=\nb\nar\n}}")] | |||
--- | |||
name: newlines_named_param_name | |||
label: newlines within a parameter name | |||
input: "{{foo|\nb\nar\n=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="\nb\nar\n"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: newlines_named_param_name_param_value | |||
label: newlines within a parameter name and within a parameter value | |||
input: "{{foo|\nb\nar\n=\nba\nz\n}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="\nb\nar\n"), TemplateParamEquals(), Text(text="\nba\nz\n"), TemplateClose()] | |||
--- | |||
name: newlines_enclose_template_name_named_param_name | |||
label: newlines enclosing a template name and within a parameter name | |||
input: "{{\nfoo\n|\nb\nar\n=baz}}" | |||
output: [TemplateOpen(), Text(text="\nfoo\n"), TemplateParamSeparator(), Text(text="\nb\nar\n"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: newlines_enclose_template_name_named_param_name_param_value | |||
label: newlines enclosing a template name and within a parameter name and within a parameter value | |||
input: "{{\nfoo\n|\nb\nar\n=\nba\nz\n}}" | |||
output: [TemplateOpen(), Text(text="\nfoo\n"), TemplateParamSeparator(), Text(text="\nb\nar\n"), TemplateParamEquals(), Text(text="\nba\nz\n"), TemplateClose()] | |||
--- | |||
name: newlines_within_template_name_named_param_name | |||
label: newlines within a template name and within a parameter name | |||
input: "{{\nfo\no\n|\nb\nar\n=baz}}" | |||
output: [Text(text="{{\nfo\no\n|\nb\nar\n=baz}}")] | |||
--- | |||
name: newlines_within_template_name_named_param_name_param_value | |||
label: newlines within a template name and within a parameter name and within a parameter value | |||
input: "{{\nf\noo\n|\nb\nar\n=\nba\nz\n}}" | |||
output: [Text(text="{{\nf\noo\n|\nb\nar\n=\nba\nz\n}}")] | |||
--- | |||
name: newlines_wildcard | |||
label: a random, complex assortment of templates and newlines | |||
input: "{{\nfoo\n|\nb\nar\n=\nb\naz\n|\nb\nuz\n}}" | |||
output: [TemplateOpen(), Text(text="\nfoo\n"), TemplateParamSeparator(), Text(text="\nb\nar\n"), TemplateParamEquals(), Text(text="\nb\naz\n"), TemplateParamSeparator(), Text(text="\nb\nuz\n"), TemplateClose()] | |||
--- | |||
name: newlines_wildcard_redux | |||
label: an even more random and complex assortment of templates and newlines | |||
input: "{{\nfoo\n|\n{{\nbar\n|\nb\naz\n=\nb\niz\n}}\n=\nb\nuzz\n}}" | |||
output: [TemplateOpen(), Text(text="\nfoo\n"), TemplateParamSeparator(), Text(text="\n"), TemplateOpen(), Text(text="\nbar\n"), TemplateParamSeparator(), Text(text="\nb\naz\n"), TemplateParamEquals(), Text(text="\nb\niz\n"), TemplateClose(), Text(text="\n"), TemplateParamEquals(), Text(text="\nb\nuzz\n"), TemplateClose()] | |||
--- | |||
name: newlines_wildcard_redux_invalid | |||
label: a variation of the newlines_wildcard_redux test that is invalid | |||
input: "{{\nfoo\n|\n{{\nb\nar\n|\nb\naz\n=\nb\niz\n}}\n=\nb\nuzz\n}}" | |||
output: [Text(text="{{\nfoo\n|\n{{\nb\nar\n|\nb\naz\n=\nb\niz\n}}\n=\nb\nuzz\n}}")] | |||
--- | |||
name: invalid_name_left_brace_middle | |||
label: invalid characters in template name: left brace in middle | |||
input: "{{foo{bar}}" | |||
output: [Text(text="{{foo{bar}}")] | |||
--- | |||
name: invalid_name_right_brace_middle | |||
label: invalid characters in template name: right brace in middle | |||
input: "{{foo}bar}}" | |||
output: [Text(text="{{foo}bar}}")] | |||
--- | |||
name: invalid_name_left_braces | |||
label: invalid characters in template name: two left braces in middle | |||
input: "{{foo{b{ar}}" | |||
output: [Text(text="{{foo{b{ar}}")] | |||
--- | |||
name: invalid_name_left_bracket_middle | |||
label: invalid characters in template name: left bracket in middle | |||
input: "{{foo[bar}}" | |||
output: [Text(text="{{foo[bar}}")] | |||
--- | |||
name: invalid_name_right_bracket_middle | |||
label: invalid characters in template name: right bracket in middle | |||
input: "{{foo]bar}}" | |||
output: [Text(text="{{foo]bar}}")] | |||
--- | |||
name: invalid_name_left_bracket_start | |||
label: invalid characters in template name: left bracket at start | |||
input: "{{[foobar}}" | |||
output: [Text(text="{{[foobar}}")] | |||
--- | |||
name: invalid_name_right_bracket_start | |||
label: invalid characters in template name: right bracket at end | |||
input: "{{foobar]}}" | |||
output: [Text(text="{{foobar]}}")] | |||
--- | |||
name: valid_name_left_brace_start | |||
label: valid characters in template name: left brace at start | |||
input: "{{{foobar}}" | |||
output: [Text(text="{"), TemplateOpen(), Text(text="foobar"), TemplateClose()] | |||
--- | |||
name: valid_unnamed_param_left_brace | |||
label: valid characters in unnamed template parameter: left brace | |||
input: "{{foo|ba{r}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="ba{r"), TemplateClose()] | |||
--- | |||
name: valid_unnamed_param_braces | |||
label: valid characters in unnamed template parameter: left and right braces | |||
input: "{{foo|ba{r}}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="ba{r"), TemplateClose(), Text(text="}")] | |||
--- | |||
name: valid_param_name_braces | |||
label: valid characters in template parameter name: left and right braces | |||
input: "{{foo|ba{r}=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="ba{r}"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: valid_param_name_brackets | |||
label: valid characters in template parameter name: left and right brackets
input: "{{foo|ba[r]=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="ba[r]"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: valid_param_name_double_left_brackets | |||
label: valid characters in template parameter name: double left brackets
input: "{{foo|bar[[in\nvalid=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar[[in\nvalid"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: valid_param_name_double_right_brackets | |||
label: valid characters in template parameter name: double right brackets
input: "{{foo|bar]]=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar]]"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: valid_param_name_double_brackets | |||
label: valid characters in template parameter name: double left and right brackets
input: "{{foo|bar[[in\nvalid]]=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar[[in\nvalid]]"), TemplateParamEquals(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: invalid_param_name_double_left_braces | |||
label: invalid characters in template parameter name: double left braces | |||
input: "{{foo|bar{{in\nvalid=baz}}" | |||
output: [Text(text="{{foo|bar{{in\nvalid=baz}}")] | |||
--- | |||
name: invalid_param_name_double_braces | |||
label: invalid characters in template parameter name: double left and right braces | |||
input: "{{foo|bar{{in\nvalid}}=baz}}" | |||
output: [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar{{in\nvalid"), TemplateClose(), Text(text="=baz}}")] | |||
--- | |||
name: incomplete_stub | |||
label: incomplete templates that should fail gracefully: just an opening | |||
input: "{{" | |||
output: [Text(text="{{")] | |||
--- | |||
name: incomplete_plain | |||
label: incomplete templates that should fail gracefully: no close whatsoever | |||
input: "{{stuff}} {{foobar" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foobar")] | |||
--- | |||
name: incomplete_right_brace | |||
label: incomplete templates that should fail gracefully: only one right brace | |||
input: "{{stuff}} {{foobar}" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foobar}")] | |||
--- | |||
name: incomplete_pipe | |||
label: incomplete templates that should fail gracefully: a pipe | |||
input: "{{stuff}} {{foobar|" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foobar|")] | |||
--- | |||
name: incomplete_unnamed_param | |||
label: incomplete templates that should fail gracefully: an unnamed parameter | |||
input: "{{stuff}} {{foo|bar" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar")] | |||
--- | |||
name: incomplete_unnamed_param_pipe | |||
label: incomplete templates that should fail gracefully: an unnamed parameter, then a pipe | |||
input: "{{stuff}} {{foo|bar|" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar|")] | |||
--- | |||
name: incomplete_valueless_param | |||
label: incomplete templates that should fail gracefully: a named parameter with no value
input: "{{stuff}} {{foo|bar=" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar=")] | |||
--- | |||
name: incomplete_valueless_param_pipe | |||
label: incomplete templates that should fail gracefully: a named parameter with no value, then a pipe | |||
input: "{{stuff}} {{foo|bar=|" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar=|")] | |||
--- | |||
name: incomplete_named_param | |||
label: incomplete templates that should fail gracefully: a named parameter with a value | |||
input: "{{stuff}} {{foo|bar=baz" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar=baz")] | |||
--- | |||
name: incomplete_named_param_pipe | |||
label: incomplete templates that should fail gracefully: a named parameter with a value, then a pipe
input: "{{stuff}} {{foo|bar=baz|" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar=baz|")] | |||
--- | |||
name: incomplete_two_unnamed_params | |||
label: incomplete templates that should fail gracefully: two unnamed parameters | |||
input: "{{stuff}} {{foo|bar|baz" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar|baz")] | |||
--- | |||
name: incomplete_unnamed_param_valueless_param | |||
label: incomplete templates that should fail gracefully: an unnamed parameter, then a named parameter with no value | |||
input: "{{stuff}} {{foo|bar|baz=" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar|baz=")] | |||
--- | |||
name: incomplete_unnamed_param_named_param | |||
label: incomplete templates that should fail gracefully: an unnamed parameter, then a named parameter with a value | |||
input: "{{stuff}} {{foo|bar|baz=biz" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar|baz=biz")] | |||
--- | |||
name: incomplete_named_param_unnamed_param | |||
label: incomplete templates that should fail gracefully: a named parameter with a value, then an unnamed parameter | |||
input: "{{stuff}} {{foo|bar=baz|biz" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar=baz|biz")] | |||
--- | |||
name: incomplete_named_param_valueless_param | |||
label: incomplete templates that should fail gracefully: a named parameter with a value, then a named parameter with no value | |||
input: "{{stuff}} {{foo|bar=baz|biz=" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar=baz|biz=")] | |||
--- | |||
name: incomplete_two_named_params | |||
label: incomplete templates that should fail gracefully: two named parameters with values | |||
input: "{{stuff}} {{foo|bar=baz|biz=buzz" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar=baz|biz=buzz")] | |||
--- | |||
name: incomplete_nested_template_as_unnamed_param | |||
label: incomplete templates that should fail gracefully: a valid nested template as an unnamed parameter | |||
input: "{{stuff}} {{foo|{{bar}}" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|"), TemplateOpen(), Text(text="bar"), TemplateClose()] | |||
--- | |||
name: incomplete_nested_template_as_param_value | |||
label: incomplete templates that should fail gracefully: a valid nested template as a parameter value | |||
input: "{{stuff}} {{foo|bar={{baz}}" | |||
output: [TemplateOpen(), Text(text="stuff"), TemplateClose(), Text(text=" {{foo|bar="), TemplateOpen(), Text(text="baz"), TemplateClose()] | |||
--- | |||
name: recursion_five_hundred_opens | |||
label: test potentially dangerous recursion: five hundred template openings, without spaces | |||
input: "{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{" | |||
output: [Text(text="{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{")] | |||
--- | |||
name: recursion_one_hundred_opens | |||
label: test potentially dangerous recursion: one hundred template openings, with spaces | |||
input: "{{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{" | |||
output: [Text(text="{{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{ {{")] | |||
--- | |||
name: recursion_opens_and_closes | |||
label: test potentially dangerous recursion: template openings and closings | |||
input: "{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}" | |||
output: [Text(text="{{|"), TemplateOpen(), TemplateClose(), Text(text="{{|"), TemplateOpen(), TemplateClose(), TemplateOpen(), TemplateParamSeparator(), TemplateOpen(), TemplateClose(), Text(text="{{"), TemplateParamSeparator(), Text(text="{{"), TemplateClose(), Text(text="{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}{{|{{}}")] |
@@ -0,0 +1,25 @@ | |||
name: basic | |||
label: sanity check for basic text parsing, no gimmicks | |||
input: "foobar" | |||
output: [Text(text="foobar")] | |||
--- | |||
name: newlines | |||
label: slightly more complex text parsing, with newlines | |||
input: "This is a line of text.\nThis is another line of text.\nThis is another." | |||
output: [Text(text="This is a line of text.\nThis is another line of text.\nThis is another.")] | |||
--- | |||
name: unicode | |||
label: ensure unicode data is handled properly | |||
input: "Thís ís å sëñtënce with diœcritiçs." | |||
output: [Text(text="Thís ís å sëñtënce with diœcritiçs.")] | |||
--- | |||
name: unicode2 | |||
label: additional unicode check for non-BMP codepoints | |||
input: "𐌲𐌿𐍄𐌰𐍂𐌰𐌶𐌳𐌰" | |||
output: [Text(text="𐌲𐌿𐍄𐌰𐍂𐌰𐌶𐌳𐌰")] |
@@ -0,0 +1,158 @@ | |||
name: blank | |||
label: wikilink with no content | |||
input: "[[]]" | |||
output: [WikilinkOpen(), WikilinkClose()] | |||
--- | |||
name: blank_with_text | |||
label: wikilink with no content but a pipe | |||
input: "[[|]]" | |||
output: [WikilinkOpen(), WikilinkSeparator(), WikilinkClose()] | |||
--- | |||
name: basic | |||
label: simplest type of wikilink | |||
input: "[[wikilink]]" | |||
output: [WikilinkOpen(), Text(text="wikilink"), WikilinkClose()] | |||
--- | |||
name: with_text | |||
label: wikilink with a text value | |||
input: "[[foo|bar]]" | |||
output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar"), WikilinkClose()] | |||
--- | |||
name: blank_with_multiple_texts | |||
label: no content, multiple pipes | |||
input: "[[|||]]" | |||
output: [WikilinkOpen(), WikilinkSeparator(), Text(text="||"), WikilinkClose()] | |||
--- | |||
name: multiple_texts | |||
label: multiple text values separated by pipes | |||
input: "[[foo|bar|baz]]" | |||
output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar|baz"), WikilinkClose()] | |||
--- | |||
name: nested | |||
label: a wikilink nested within the value of another | |||
input: "[[foo|[[bar]]]]" | |||
output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), WikilinkOpen(), Text(text="bar"), WikilinkClose(), WikilinkClose()] | |||
--- | |||
name: nested_with_text | |||
label: a wikilink nested within the value of another, separated by other data | |||
input: "[[foo|a[[b]]c]]" | |||
output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c"), WikilinkClose()] | |||
--- | |||
name: invalid_newline | |||
label: invalid wikilink: newline as only content | |||
input: "[[\n]]" | |||
output: [Text(text="[[\n]]")] | |||
--- | |||
name: invalid_right_brace | |||
label: invalid wikilink: right brace | |||
input: "[[foo}b}a}r]]" | |||
output: [Text(text="[[foo}b}a}r]]")] | |||
--- | |||
name: invalid_left_brace | |||
label: invalid wikilink: left brace | |||
input: "[[foo{{[a}}]]" | |||
output: [Text(text="[[foo{{[a}}]]")] | |||
--- | |||
name: invalid_right_bracket | |||
label: invalid wikilink: right bracket | |||
input: "[[foo]bar]]" | |||
output: [Text(text="[[foo]bar]]")] | |||
--- | |||
name: invalid_left_bracket | |||
label: invalid wikilink: left bracket | |||
input: "[[foo[bar]]" | |||
output: [Text(text="[[foo[bar]]")] | |||
--- | |||
name: invalid_nested | |||
label: invalid wikilink: trying to nest in the wrong context | |||
input: "[[foo[[bar]]]]" | |||
output: [Text(text="[[foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="]]")] | |||
--- | |||
name: invalid_nested_text | |||
label: invalid wikilink: trying to nest in the wrong context, with a text param | |||
input: "[[foo[[bar]]|baz]]" | |||
output: [Text(text="[[foo"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="|baz]]")] | |||
--- | |||
name: incomplete_open_only | |||
label: incomplete wikilinks: just an open | |||
input: "[[" | |||
output: [Text(text="[[")] | |||
--- | |||
name: incomplete_open_text | |||
label: incomplete wikilinks: an open with some text | |||
input: "[[foo" | |||
output: [Text(text="[[foo")] | |||
--- | |||
name: incomplete_open_text_pipe | |||
label: incomplete wikilinks: an open, text, then a pipe | |||
input: "[[foo|" | |||
output: [Text(text="[[foo|")] | |||
--- | |||
name: incomplete_open_pipe | |||
label: incomplete wikilinks: an open, then a pipe | |||
input: "[[|" | |||
output: [Text(text="[[|")] | |||
--- | |||
name: incomplete_open_pipe_text | |||
label: incomplete wikilinks: an open, then a pipe, then text | |||
input: "[[|foo" | |||
output: [Text(text="[[|foo")] | |||
--- | |||
name: incomplete_open_pipes_text | |||
label: incomplete wikilinks: an open, then a pipe, then text, then two pipes
input: "[[|f||" | |||
output: [Text(text="[[|f||")] | |||
--- | |||
name: incomplete_open_partial_close | |||
label: incomplete wikilinks: an open, then one right brace | |||
input: "[[{}" | |||
output: [Text(text="[[{}")] | |||
--- | |||
name: incomplete_preserve_previous | |||
label: incomplete wikilinks: a valid wikilink followed by an invalid one | |||
input: "[[foo]] [[bar" | |||
output: [WikilinkOpen(), Text(text="foo"), WikilinkClose(), Text(text=" [[bar")] |