Jakub Klinkovský
bb51e8f282
Some fixes for the parsing of external links (#232)
* Proposed fix for https://github.com/earwig/mwparserfromhell/issues/197
* Port the fix for #197 to the C tokenizer
* Fix parsing of external links where the URL is terminated by some special character
  - One existing test case has been found wrong -- current MediaWiki
    version always terminates the URL when an opening bracket is
    encountered.
  - Other test cases added: double quote, two single quotes and angles
    always terminate the URL (regardless of whether it is a free link or
    an external link inside brackets). One single quote does not terminate the URL.
* Fix case-insensitive parsing of URI schemes
3 years ago
Jakub Klinkovský
90061b6844
Fix parsing of section headings inside templates (#233)
Fixes #198
Co-authored-by: Ben Kurtovic <ben.kurtovic@gmail.com>
3 years ago
Ben Kurtovic
b7b3b2e33e
Update changelog; minor tweak to file headers
3 years ago
Ben Kurtovic
1c983d3738
Assorted cleanup, linter fixes, and improvements for Python 3
3 years ago
Kunal Mehta
03181bcd8b
Port tests to use pytest
pytest is the preferred way to write and run unit tests these days and
it has a cleaner interface - so let's switch to it. The tokenizer tests
especially are much easier to read/understand.
This was mostly done with find/replace regexes and then cleaned up
manually.
4 years ago
Kunal Mehta
7e5297fbe6
Drop Python 2 support
Fixes #221.
4 years ago
Yuri Astrakhan
aa37425a9b
move smart_list into sub-package/multiple files
Step one of refactoring - making SmartList into its own
package, with each class having its own file. No code
changes were made.
Note that SmartList and ListProxy import each other,
so SmartList had to be imported by its full package name
rather than with a from ... import ... construct.
4 years ago
Ben Kurtovic
b09b619709
Switch to 'unittest discover' over 'setup.py test'
5 years ago
Ben Kurtovic
b3c98efd22
Fix a parsing bug involving deeply nested style tags (fixes #224)
5 years ago
Ben Kurtovic
2a4e1f4316
Add contexts.describe() for debugging
5 years ago
Ben Kurtovic
b6e4c59004
Switch to requests for basic API example (closes #219); update links
5 years ago
Ben Kurtovic
6136b1b205
Make Wikicode.matches() treat _ and space as equivalent (fixes #216)
5 years ago
Ben Kurtovic
4775131717
Fix not memoizing bad routes after failing inside a table (fixes #206)
5 years ago
Ben Kurtovic
6e61c99c90
Update API query example; clarify docstring
5 years ago
Ben Kurtovic
83bcb902b8
Support manual construction of Node objects (fixes #214)
5 years ago
Ben Kurtovic
0ae5f6d641
Fix regression in previous commit on _ListProxy transformations (fixes #213)
5 years ago
Ben Kurtovic
840a88bcd6
Fix Wikicode transformation methods on empty sections (fixes #212)
5 years ago
Ben Kurtovic
de6e671c40
Update changelog, remove now-unneeded test discovery script, cleanup
6 years ago
Hugo
f372d3d495
Upgrade unit test asserts
6 years ago
Hugo
e457b39f32
Upgrade Python syntax with pyupgrade https://github.com/asottile/pyupgrade
6 years ago
Hugo
59636609db
Drop support for EOL Python
6 years ago
Ben Kurtovic
46000ee7c8
Fix test on old Python versions
7 years ago
Ben Kurtovic
253102be35
Minor change to template test_formatting format.
7 years ago
Ben Kurtovic
7a30e47f76
Some improvements to whitespace recognition; unit tests (#185).
7 years ago
Ben Kurtovic
cd4f90e663
Fix a rare parsing bug involving nested broken tags.
7 years ago
Ben Kurtovic
8a9c9224be
Speed up parsing deeply nested syntax by caching bad routes (fixes #42)
Also removed the max cycles stop-gap, allowing much more complex pages
to be parsed quickly without losing nodes at the end
Also fixes #65, fixes #102, fixes #165, fixes #183
Also fixes #81 (Rafael Nadal parsing bug)
Also fixes #53, fixes #58, fixes #88, fixes #152 (duplicate issues)
7 years ago
Ben Kurtovic
d7c755f526
Add Wikicode.contains(), Wikicode.get_ancestors(), Wikicode.get_parent() (#177)
7 years ago
Ben Kurtovic
68ded2f890
Add keep_template_params to Wikicode.strip_code (#175)
7 years ago
Ben Kurtovic
6159171e04
Make Template.remove(keep_field=True) slightly more reasonable.
7 years ago
Ben Kurtovic
f34f662f35
Fix len() sometimes raising ValueError on empty node lists (fixes #174)
8 years ago
Ben Kurtovic
aaffb7f66b
Update copyright for 2016.
8 years ago
Ben Kurtovic
4707b455b5
Add a new test to check for parsing bug; fix an existing test (#142)
8 years ago
Ben Kurtovic
61b6b98470
Fix two parser bugs involving wikitable error handling.
9 years ago
Ben Kurtovic
50b401549b
Add failing test cases for #125.
9 years ago
Ben Kurtovic
4f3ab48375
Edge cases involving wikilink -> external link fallback (fixes #120)
9 years ago
Ben Kurtovic
67214b7c05
Add some failing tests for SmartList features.
9 years ago
Ben Kurtovic
ab9f6a97fb
Use weakrefs for SmartList children; remove _ListProxy.detach().
9 years ago
Ben Kurtovic
2a3a978986
Incomplete code for C tokenizer textbuffer.
9 years ago
Ben Kurtovic
f16c7e25ca
Fully fix parsing templates with blank names, I hope (#111)
9 years ago
Ben Kurtovic
7993224926
Add a failing test for a missed component of #59
9 years ago
Ben Kurtovic
56f1797cfe
Add failing tests for #111
9 years ago
Ben Kurtovic
46cb714344
Fix unit tests for 699d063 (#109)
9 years ago
Ben Kurtovic
4c2540060b
Fix preserve_spacing behavior in Template.add() on hidden keys (#109)
9 years ago
Ben Kurtovic
3a57756068
Fix HTTPS requirement for enwiki API.
9 years ago
Ben Kurtovic
efc571c5c0
Refactor _test_tokenizer; add syntax for running just one test.
9 years ago
Ben Kurtovic
07d4577c33
Add tests for < and > in wikilink titles/template names (#104)
9 years ago
Ben Kurtovic
e71e7b4ece
Update copyright years for 2015; fix whitespace in docs.
10 years ago
Ben Kurtovic
a64bae35c9
Add support for a NOWEB env var, update docs.
10 years ago
Ben Kurtovic
a00c645bd8
Fix handling of tag closes within <nowiki> (fixes #89).
10 years ago
Ben Kurtovic
47b44a9730
Add a failing test for #89.
10 years ago