Jakub Klinkovský
90061b6844
Fix parsing of section headings inside templates (#233)
Fixes #198
Co-authored-by: Ben Kurtovic <ben.kurtovic@gmail.com>
3 years ago
Kunal Mehta
7e5297fbe6
Drop Python 2 support
Fixes #221.
4 years ago
Ben Kurtovic
b3c98efd22
Fix a parsing bug involving deeply nested style tags (fixes #224)
4 years ago
Ben Kurtovic
8c5f554406
Add guard against a rare crash in the C tokenizer
5 years ago
Ben Kurtovic
fa98aad408
Bump copyright [skip ci]
5 years ago
Ben Kurtovic
4775131717
Fix not memoizing bad routes after failing inside a table (fixes #206)
5 years ago
Ben Kurtovic
6de7d41733
Fix signals getting stuck inside the C tokenizer (#206)
6 years ago
Ben Kurtovic
86c805d59b
Don't get stuck in tags with unclosed quoted attributes (fixes #190).
6 years ago
Ben Kurtovic
cd4f90e663
Fix a rare parsing bug involving nested broken tags.
7 years ago
Ben Kurtovic
0ef6a2ffbe
Fix declarations for C89 compatibility (forgot MSVC needed that...)
7 years ago
Ben Kurtovic
8a9c9224be
Speed up parsing deeply nested syntax by caching bad routes (fixes #42)
Also removed the max cycles stop-gap, allowing much more complex pages
to be parsed quickly without losing nodes at the end
Also fixes #65, fixes #102, fixes #165, fixes #183
Also fixes #81 (Rafael Nadal parsing bug)
Also fixes #53, fixes #58, fixes #88, fixes #152 (duplicate issues)
7 years ago
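The idea of caching bad routes, as described in the commit above, can be sketched roughly as follows. This is a minimal illustration and not the tokenizer's actual code: the names `Route`, `BadRouteCache`, `cache_init`, `cache_contains`, `cache_add`, and `MAX_BAD_ROUTES` are all hypothetical, and a fixed-size array with linear search stands in for whatever data structure the real implementation uses. The assumption is that a failed parse attempt can be identified by the input position where it started together with the context active at that point.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BAD_ROUTES 64 /* hypothetical fixed capacity for this sketch */

/* A route is assumed to be identified by where it started and in what context. */
typedef struct {
    size_t head;      /* input position where the failed attempt began */
    uint64_t context; /* parse context active at that position */
} Route;

typedef struct {
    Route routes[MAX_BAD_ROUTES];
    size_t count;
} BadRouteCache;

static void cache_init(BadRouteCache *cache)
{
    cache->count = 0;
}

/* Returns 1 if this (head, context) pair is already known to fail. */
static int cache_contains(const BadRouteCache *cache, size_t head,
                          uint64_t context)
{
    size_t i;

    for (i = 0; i < cache->count; i++) {
        if (cache->routes[i].head == head &&
            cache->routes[i].context == context)
            return 1;
    }
    return 0;
}

/* Records a failed route. Returns 1 if it is now cached, 0 if the cache is full. */
static int cache_add(BadRouteCache *cache, size_t head, uint64_t context)
{
    if (cache_contains(cache, head, context))
        return 1;
    if (cache->count >= MAX_BAD_ROUTES)
        return 0;
    cache->routes[cache->count].head = head;
    cache->routes[cache->count].context = context;
    cache->count++;
    return 1;
}
```

A parser using such a cache would check `cache_contains` before attempting a route and skip the attempt when the pair is already recorded, so the same failing work on deeply nested input is not repeated.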
Ben Kurtovic
aaffb7f66b
Update copyright for 2016.
8 years ago
Ben Kurtovic
8835ca313a
Don't preserve context when popping template key stack (fixes #142, hopefully).
8 years ago
Ben Kurtovic
61b6b98470
Fix two parser bugs involving wikitable error handling.
8 years ago
Ben Kurtovic
460199488f
Fix a couple sign compare issues.
9 years ago
Ben Kurtovic
90bd12dd47
Fix a C tokenizer crash when parsing is interrupted (fixes #97)
9 years ago
Ben Kurtovic
4f3ab48375
Edge cases involving wikilink -> external link fallback (fixes #120)
9 years ago
Ben Kurtovic
c1d4feea66
Py_UNICODE -> Unicode everywhere; bugfix for PEP 393.
9 years ago
Ben Kurtovic
5eac0ab16f
More PEP 393 work; update Textbuffer interface and usage.
9 years ago
Ben Kurtovic
2072a10b67
More reworking of CTokenizer Unicode support (incomplete)
9 years ago
Ben Kurtovic
2a3a978986
Incomplete code for C tokenizer textbuffer.
9 years ago
Ben Kurtovic
40fed91806
Fix C tokenizer leaking memory.
9 years ago
Ben Kurtovic
2005efd309
Split up C tokenizer into tag_data, tok_parse, tok_support, tokens.
9 years ago
Ben Kurtovic
0e547aa416
Begin splitting up C tokenizer.
9 years ago
Ben Kurtovic
dad042bc2c
Fix C warnings in MSVC.
9 years ago
Ben Kurtovic
1d5bbbe25b
Disallow < and > in wikilink titles/template names (fixes #104)
9 years ago
Ben Kurtovic
e71e7b4ece
Update copyright years for 2015; fix whitespace in docs.
9 years ago
Ben Kurtovic
a00c645bd8
Fix handling of tag closes within <nowiki> (fixes #89).
9 years ago
Ben Kurtovic
a15f6172c0
Minor bugfix.
10 years ago
Ben Kurtovic
9fc4b909e1
Refactor a lot of table error recovery code.
10 years ago
Ben Kurtovic
fb261450d8
Port tokenizer updates to C.
10 years ago
Ben Kurtovic
640005dbb2
Tokenizer cleanup; make inline table syntax invalid as it should be.
10 years ago
Ben Kurtovic
913ff590c8
Cleanup; add a missing test.
10 years ago
Ben Kurtovic
5d29bff918
Remove an incorrect usage of Py_XDECREF().
10 years ago
Ben Kurtovic
7489253e32
Break at 80 cols for most lines.
10 years ago
David Winegar
1a4c88e11f
Correctly handle no table endings
Tests were not correctly testing the situations without a table close.
Fixed tests and then fixed tokenizers for failing tests. Also refactored
pytokenizer to more closely match the ctokenizer by only holding the
`_parse` methods in the try blocks and no other code.
10 years ago
David Winegar
c63108039b
Fix C code to make declarations before statements
Python 3.4 compiles C extensions with the
`-Werror=declaration-after-statement` flag that enforces C90 more
strictly than previous versions. Move all statements after declarations
to make sure this extension builds on 3.4.
10 years ago
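The rule enforced by `-Werror=declaration-after-statement`, as described in the commit above, can be illustrated with a short function in which every declaration sits at the top of its block. This is a sketch only: `count_char` is a hypothetical helper and is not taken from the tokenizer.

```c
#include <stddef.h>

/* Hypothetical helper: counts occurrences of `needle` in the string `text`. */
static size_t count_char(const char *text, char needle)
{
    /* All declarations come first, before any statement in the block. */
    size_t count = 0;
    const char *p;

    /* Statements follow. Declaring `p` inside the `for` header, or after
       the loop had started, would trip the flag. */
    for (p = text; *p != '\0'; p++) {
        if (*p == needle)
            count++;
    }
    return count;
}
```

Written this way, the function builds the same with or without the flag, which is the property the commit needed for Python 3.4 builds.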
David Winegar
213c105666
Table tags are no longer self-closing
Table tags no longer self-closing. Rows and cells now contain their
contents. Also refactored out an `emit_table_tag` method.
Note: this will require changes to the Tag node and possibly the builder,
those changes will be in the next commit.
10 years ago
David Winegar
0128b1f78a
Implement CTokenizer for tables
CTokenizer is completely implemented in this commit - it didn't
make much sense to me to split it up. All tests passing, memory test
shows no leaks on Linux.
10 years ago
David Winegar
2d945b30e5
Use uint64_t for context
For the C tokenizer, include `<stdint.h>` and use `uint64_t` instead
of `int` for context. Changes to tables mean that context can be
larger than 32 bits, and it is possible for `int` to only have 16
bits anyways (though this is very unlikely).
10 years ago
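The reason for the switch to `uint64_t`, as described in the commit above, can be shown with a small sketch of context stored as bit flags. The flag names and helper functions here are hypothetical and chosen for illustration; only `<stdint.h>` and `uint64_t` come from the commit message.

```c
#include <stdint.h>

/* Hypothetical flags. The cast matters: a plain `1 << 35` would shift an int. */
#define CTX_EXAMPLE_LOW  ((uint64_t)1 << 3)
#define CTX_EXAMPLE_HIGH ((uint64_t)1 << 35) /* does not fit in 32 bits */

/* Returns `context` with `flag` set. */
static uint64_t context_set(uint64_t context, uint64_t flag)
{
    return context | flag;
}

/* Returns 1 if `flag` is set in `context`, 0 otherwise. */
static int context_has(uint64_t context, uint64_t flag)
{
    return (context & flag) != 0;
}
```

With a 32-bit (or narrower) type for the context, a flag such as `CTX_EXAMPLE_HIGH` would be lost on assignment, which is the failure the commit guards against.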
Ben Kurtovic
6954480263
Fix template parsing when comments are inside the name (fixes #59).
10 years ago
Ben Kurtovic
ded89fb14e
Add a few unit tests for untested code; remove a useless conditional.
10 years ago
Ben Kurtovic
b997e4cd71
Support attributes quoted with '; add required quotes in value setter.
10 years ago
Ben Kurtovic
a4c2fd023a
Remove some useless code in the tokenizers.
10 years ago
Ben Kurtovic
08cafc0576
Raise ParserError for internal problems. Improve coverage. Cleanup.
10 years ago
Ben Kurtovic
02eff0fc49
Fully fix #74. Add another tokenizer test.
10 years ago
Ben Kurtovic
0497b54f03
Fix _handle_single_tag_end()'s token search order (fixes #74)
10 years ago
Ben Kurtovic
5c5fd6b3cb
Fix a bug involving nested links (closes #61 and #62).
10 years ago
Ben Kurtovic
1312a1fb8a
Some clean up for Python 2.6 support.
* Removed unittest2 dependency on Python >2.6.
* Moved discover_tests.py into tests/.
* tokenizer.c: Fixed errors noted by -Wshorten-64-to-32.
10 years ago
Ben Kurtovic
e5f17eea00
Update copyright notices for 2014.
10 years ago