Ben Kurtovic
074e368684
Clean up pytest-ported tests
3 years ago
Jakub Klinkovský
bb51e8f282
Some fixes for the parsing of external links (#232)
* Proposed fix for https://github.com/earwig/mwparserfromhell/issues/197
* Port the fix for #197 to the C tokenizer
* Fix parsing of external links where the URL is terminated by some special character
- One existing test case has been found wrong -- current MediaWiki
version always terminates the URL when an opening bracket is
encountered.
- Other test cases added: double quote, two single quotes and angles
always terminate the URL (regardless if it is a free link or external
link inside brackets). One single quote does not terminate the URL.
* Fix case-insensitive parsing of URI schemes
3 years ago
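The URL-termination rules listed in the commit message for bb51e8f282 can be illustrated with a small sketch. This is not the library's tokenizer code: `scan_url` is a hypothetical helper, and treating whitespace as a terminator is an assumption that the commit message does not state.

```python
# Simplified model of the URL-termination rules from the commit message.
# `scan_url` is a hypothetical name; it is not part of mwparserfromhell.

# Single characters that always end the URL according to the commit message:
# opening bracket, double quote, and angle brackets.
TERMINATORS = '[<>"'


def scan_url(text: str, start: int = 0) -> str:
    """Return the part of `text` from `start` up to the first terminator."""
    i = start
    while i < len(text):
        ch = text[i]
        # Assumption: whitespace also ends the URL (not stated in the message).
        if ch.isspace() or ch in TERMINATORS:
            break
        # Two consecutive single quotes end the URL; a lone quote does not.
        if text.startswith("''", i):
            break
        i += 1
    return text[start:i]
```

For example, `scan_url("http://example.com''bold''")` stops before the two single quotes, while `scan_url("http://example.com/it's")` keeps the lone quote as part of the URL.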
Jakub Klinkovský
90061b6844
Fix parsing of section headings inside templates (#233)
Fixes #198
Co-authored-by: Ben Kurtovic <ben.kurtovic@gmail.com>
3 years ago
Ben Kurtovic
b7b3b2e33e
Update changelog; minor tweak to file headers
3 years ago
Ben Kurtovic
1c983d3738
Assorted cleanup, linter fixes, and improvements for Python 3
3 years ago
Ben Kurtovic
237798a17e
Update tag definitions
3 years ago
Kunal Mehta
7e5297fbe6
Drop Python 2 support
Fixes #221.
4 years ago
Ben Kurtovic
b3c98efd22
Fix a parsing bug involving deeply nested style tags (fixes #224)
4 years ago
Ben Kurtovic
2a4e1f4316
Add contexts.describe() for debugging
4 years ago
Ben Kurtovic
8c5f554406
Add guard against a rare crash in the C tokenizer
5 years ago
Ben Kurtovic
fa98aad408
Bump copyright [skip ci]
5 years ago
Ben Kurtovic
4775131717
Fix not memoizing bad routes after failing inside a table (fixes #206)
5 years ago
Ben Kurtovic
708bee59e1
Minor cleanup
5 years ago
Ben Kurtovic
83bcb902b8
Support manual construction of Node objects (fixes #214)
5 years ago
Ben Kurtovic
2c206bc16b
Fix crash due to PyList_GET_SIZE being applied to a dict (fixes #208)
5 years ago
Ben Kurtovic
6de7d41733
Fix signals getting stuck inside the C tokenizer (#206)
5 years ago
Hugo
e457b39f32
Upgrade Python syntax with pyupgrade https://github.com/asottile/pyupgrade
5 years ago
Kunal Mehta
e506380318
Add <wbr> to definitions.py
Added to MediaWiki in ff74113bea (T54468).
5 years ago
Ben Kurtovic
86c805d59b
Don't get stuck in tags with unclosed quoted attributes (fixes #190).
6 years ago
Ben Kurtovic
cd4f90e663
Fix a rare parsing bug involving nested broken tags.
7 years ago
Ben Kurtovic
5a99597eb3
Another C89 fix for MSVC.
7 years ago
Ben Kurtovic
0ef6a2ffbe
Fix declarations for C89 compatibility (forgot MSVC needed that...)
7 years ago
Ben Kurtovic
dc0b3ae446
Enable Windows builds on Python 3.6; try to fix again.
7 years ago
Ben Kurtovic
6ad3b9fb2a
inttypes.h doesn't exist on Windows, so try using stdint.h
7 years ago
Ben Kurtovic
2593675651
Remove stdbool.h from avl_tree since MSVC doesn't like it.
7 years ago
Ben Kurtovic
6ee61789da
Fix compilation issue on Travis since GCC uses C90 by default there.
7 years ago
Ben Kurtovic
8a9c9224be
Speed up parsing deeply nested syntax by caching bad routes (fixes #42)
Also removed the max cycles stop-gap, allowing much more complex pages
to be parsed quickly without losing nodes at the end
Also fixes #65, fixes #102, fixes #165, fixes #183
Also fixes #81 (Rafael Nadal parsing bug)
Also fixes #53, fixes #58, fixes #88, fixes #152 (duplicate issues)
7 years ago
Ben Kurtovic
aaffb7f66b
Update copyright for 2016.
8 years ago
Ben Kurtovic
8835ca313a
Don't preserve context when popping template key stack (fixes #142, hopefully).
8 years ago
Ben Kurtovic
61b6b98470
Fix two parser bugs involving wikitable error handling.
8 years ago
Ben Kurtovic
651b63b7f6
Windows fix (#126)
8 years ago
Ben Kurtovic
23d97583bf
Fix regression in C tokenizer (#125)
8 years ago
Ben Kurtovic
460199488f
Fix a couple sign compare issues.
8 years ago
Ben Kurtovic
90bd12dd47
Fix a C tokenizer crash when parsing is interrupted (fixes #97)
8 years ago
Ben Kurtovic
4f3ab48375
Edge cases involving wikilink -> external link fallback (fixes #120)
8 years ago
Ben Kurtovic
8e7a600b51
Fix use-after-free bug.
9 years ago
Ben Kurtovic
8963c1f683
Fix Textbuffer_reverse()
9 years ago
Ben Kurtovic
1357da119d
Finish improved Unicode support for PEP 393.
9 years ago
Ben Kurtovic
c1d4feea66
Py_UNICODE -> Unicode everywhere; bugfix for PEP 393.
9 years ago
Ben Kurtovic
5eac0ab16f
More PEP 393 work; update Textbuffer interface and usage.
9 years ago
Ben Kurtovic
2072a10b67
More reworking of CTokenizer Unicode support (incomplete)
9 years ago
Ben Kurtovic
2a3a978986
Incomplete code for C tokenizer textbuffer.
9 years ago
Ben Kurtovic
f16c7e25ca
Fully fix parsing templates with blank names, I hope (#111)
9 years ago
John Vandenberg
ab0a58121a
Delay loading of pure Python tokenizer
9 years ago
Ben Kurtovic
40fed91806
Fix C tokenizer leaking memory.
9 years ago
Ben Kurtovic
7345a3742e
Fix a thread safety issue involving route state.
9 years ago
Ben Kurtovic
2005efd309
Split up C tokenizer into tag_data, tok_parse, tok_support, tokens.
9 years ago
Ben Kurtovic
0e547aa416
Begin splitting up C tokenizer.
9 years ago
Ben Kurtovic
a8c0ff3f29
Remove stdint.h include for MSVC 2008.
9 years ago
Ben Kurtovic
dad042bc2c
Fix C warnings in MSVC.
9 years ago