This regression seems more severe than the bug the commit was
attempting to fix (incorrect parsing of nested wikilinks in normal
links), so that bug is reintroduced until localization-aware parsing
that allows us to detect file links is added.
This commit partially reverts fac60dee48.
* Proposed fix for https://github.com/earwig/mwparserfromhell/issues/197
* Port the fix for #197 to the C tokenizer
* Fix parsing of external links where the URL is terminated by some special character
- One existing test case was found to be wrong -- the current MediaWiki
version always terminates the URL when an opening bracket is
encountered.
- Other test cases added: a double quote, two single quotes, and angle
brackets always terminate the URL (regardless of whether it is a free
link or an external link inside brackets). A single quote does not
terminate the URL.
* Fix case-insensitive parsing of URI schemes
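
The termination rules above can be illustrated with a toy scanner. This is
only a sketch, not the tokenizer's actual logic: `url_span` and
`has_known_scheme` are hypothetical helpers, the scheme set is an assumed
sample, and treating whitespace and a closing bracket as terminators is a
simplifying assumption.

```python
# Toy illustration of the URL termination rules; not the real tokenizer.
KNOWN_SCHEMES = {"http", "https", "ftp"}  # assumed sample, not the library's list


def has_known_scheme(url):
    """Match the URI scheme case-insensitively."""
    scheme, sep, _ = url.partition("://")
    return bool(sep) and scheme.lower() in KNOWN_SCHEMES


def url_span(text):
    """Return the leading part of text that would count as the URL."""
    for i, ch in enumerate(text):
        # Brackets, a double quote, angle brackets and whitespace end the URL.
        if ch in '[]"<>' or ch.isspace():
            return text[:i]
        # Two consecutive single quotes end the URL; a lone one does not.
        if text.startswith("''", i):
            return text[:i]
    return text
```

For example, `url_span("http://example.com/it's")` keeps the lone single
quote, while `url_span("http://example.com/''x''")` stops before the pair.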
It's EOL and AppVeyor builds are broken under it if we want to support Python 3.9.
Not fully deprecating 3.5 for now because it's still used on Toolforge.
* nodes: add a `default` param to Template.get
Similar to dict.get, Template.get with a default param supplied will
return that value instead of raising an exception. If default is unset,
Template.get will keep its previous behavior and raise an exception.
* nodes: Add __getitem__, __setitem__, and __delitem__ to Template
These are just aliases for existing methods, without the ability to
specify additional parameters. However, including them makes Template
more dict-like, so it's a good idea to have them.
* nodes: Use def instead of assignment of a lambda in Template
Per PEP8, there is no benefit to using a lambda here, and some
downsides. It's the same number of SLOC either way, so might as well
change it.
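
The `default` parameter and the dict-style aliases described above can be
sketched with a small stand-in class. `TemplateSketch` is hypothetical and
stores its parameters in a plain dict. The `_UNSET` sentinel and the
`ValueError` raised for a missing name are assumptions made for this
example, not a statement about the library's internals.

```python
_UNSET = object()  # sentinel, so that None can still be passed as a default


class TemplateSketch:
    """Hypothetical stand-in for Template; parameters live in a dict."""

    def __init__(self, **params):
        self._params = dict(params)

    def get(self, name, default=_UNSET):
        if name in self._params:
            return self._params[name]
        if default is _UNSET:
            raise ValueError(name)  # assumed exception type
        return default

    def add(self, name, value):
        self._params[name] = value

    def remove(self, name):
        del self._params[name]

    # Dict-style aliases: same behavior, but no extra keyword arguments.
    # Written with def rather than as lambdas assigned to names.
    def __getitem__(self, name):
        return self.get(name)

    def __setitem__(self, name, value):
        return self.add(name, value)

    def __delitem__(self, name):
        return self.remove(name)
```

With this sketch, `t.get("missing", None)` returns `None`, while
`t.get("missing")` and `t["missing"]` both raise.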
Just like the Windows wheels, these allow Linux users to install
mwparserfromhell and use the faster CTokenizer without needing to
have build tools installed.
Under the hood, this uses the PyPA manylinux1 Docker image to build and
tag the wheels, then publishes them to PyPI if a new tag was pushed.
Fixes #170.
pytest is the preferred way to write and run unit tests these days, and
it has a cleaner interface, so let's switch to it. The tokenizer tests
especially are much easier to read/understand.
This was mostly done with find/replace regexes and then cleaned up
manually.
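
As a rough illustration of the difference in style, the same check is
written below both ways. `strip_brackets` is a hypothetical function under
test, not part of the project, and neither test is taken from the actual
test suite.

```python
import unittest


def strip_brackets(text):
    """Hypothetical function under test."""
    return text.strip("[]")


# unittest style: a class, a method, and an assertion helper.
class TestStripBrackets(unittest.TestCase):
    def test_wikilink(self):
        self.assertEqual(strip_brackets("[[Foo]]"), "Foo")


# pytest style: a plain function and a bare assert.
def test_wikilink():
    assert strip_brackets("[[Foo]]") == "Foo"
```

pytest collects functions named `test_*` on its own, so the second form
needs no class or import in the test module.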
In addition to replacing the manual version check, this will also
instruct pip to download an older version of mwparserfromhell for
users running earlier Python versions, rather than leaving them with
something broken.
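
A minimal sketch of the two approaches follows, assuming the mechanism is
setuptools' `python_requires` metadata. The minimum version, `check_python`
and `SETUP_KWARGS` are all illustrative assumptions and are not taken from
the actual setup.py.

```python
import sys

MINIMUM_PYTHON = (3, 5)  # assumed value, for illustration only


# Before: an imperative check that fails only once setup.py is running.
def check_python(version_info=sys.version_info):
    if tuple(version_info[:2]) < MINIMUM_PYTHON:
        raise RuntimeError("unsupported Python version")


# After: declarative metadata that would be passed to setuptools.setup().
# pip reads it and can select an older, compatible release instead.
SETUP_KWARGS = {"python_requires": ">={}.{}".format(*MINIMUM_PYTHON)}
```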