David Winegar
0128b1f78a
Implement CTokenizer for tables
CTokenizer is completely implemented in this commit - it didn't
make much sense to me to split it up. All tests passing, memory test
shows no leaks on Linux.
10 years ago
David Winegar
2d945b30e5
Use uint64_t for context
For the C tokenizer, include `<stdint.h>` and use `uint64_t` instead
of `int` for context. Changes to tables mean that context can be
larger than 32 bits, and it is possible for `int` to only have 16
bits anyways (though this is very unlikely).
10 years ago
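The reasoning in the commit above can be sketched as follows. This is a minimal illustration, assuming the context is a set of bit flags; the flag names and bit positions here are hypothetical and are not taken from the tokenizer source. It shows why a flag above bit 31 is lost once the context is stored in a 32-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical context flags, for illustration only. */
#define CTX_TEMPLATE   ((uint64_t)1 << 0)
#define CTX_TABLE_OPEN ((uint64_t)1 << 33)  /* needs more than 32 bits */

/* Returns 1 if `flag` is set in `context`, 0 otherwise. */
static int has_flag(uint64_t context, uint64_t flag)
{
    return (context & flag) != 0;
}

/* Demonstrates that narrowing the context to 32 bits drops the high flag. */
static void check_context_width(void)
{
    uint64_t context = CTX_TEMPLATE | CTX_TABLE_OPEN;
    uint32_t narrowed = (uint32_t)context;  /* keeps only the low 32 bits */

    assert(has_flag(context, CTX_TABLE_OPEN));    /* preserved in 64 bits */
    assert(!has_flag(narrowed, CTX_TABLE_OPEN));  /* lost after narrowing */
    assert(has_flag(narrowed, CTX_TEMPLATE));     /* low flag survives */
}
```

Using the fixed-width `uint64_t` from `<stdint.h>` guarantees the width on every platform, whereas the C standard only requires `int` to hold at least 16 bits.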
Ben Kurtovic
6954480263
Fix template parsing when comments are inside the name (fixes #59).
10 years ago
Ben Kurtovic
ded89fb14e
Add a few unit tests for untested code; remove a useless conditional.
10 years ago
Ben Kurtovic
b997e4cd71
Support attributes quoted with '; add required quotes in value setter.
10 years ago
Ben Kurtovic
a4c2fd023a
Remove some useless code in the tokenizers.
10 years ago
Ben Kurtovic
08cafc0576
Raise ParserError for internal problems. Improve coverage. Cleanup.
10 years ago
Ben Kurtovic
02eff0fc49
Fully fix #74. Add another tokenizer test.
10 years ago
Ben Kurtovic
0497b54f03
Fix _handle_single_tag_end()'s token search order (fixes #74)
10 years ago
Ben Kurtovic
5c5fd6b3cb
Fix a bug involving nested links (closes #61 and #62).
10 years ago
Ben Kurtovic
1312a1fb8a
Some clean up for Python 2.6 support.
* Removed unittest2 dependency on Python >2.6.
* Moved discover_tests.py into tests/.
* tokenizer.c: Fixed errors noted by -Wshorten-64-to-32.
10 years ago
Ben Kurtovic
e5f17eea00
Update copyright notices for 2014.
10 years ago
Ben Kurtovic
1946cf621d
Add a temporary skip_style_tags until we resolve some issues.
10 years ago
Ben Kurtovic
066049b46a
Update email address.
10 years ago
Ben Kurtovic
38050f6878
C code cleanup and speed improvements.
10 years ago
Ben Kurtovic
951a8737a5
Don't pass underlying context if this is a bracketed link.
11 years ago
Ben Kurtovic
287bf71158
Condense code.
11 years ago
Ben Kurtovic
1bf9868753
Proper sentinel handling with free links in the C tokenizer.
11 years ago
Ben Kurtovic
77092e066c
Fix C tokenizer behavior re: some single_only tag edge cases.
11 years ago
Ben Kurtovic
6784ff73bf
Fix an edge case when we recurse too deeply.
11 years ago
Ben Kurtovic
4d04cae780
Fix a segfault with GCC.
11 years ago
Ben Kurtovic
67f1762aa4
Doc updates, and allow passing a starting context to tokenize().
11 years ago
Ben Kurtovic
f1b95758d6
Squash a memory leak.
11 years ago
Ben Kurtovic
2561cf5b5e
Fix all bugs in C implementation of external links.
11 years ago
Ben Kurtovic
c1b502bbe6
Finish external links implementation.
11 years ago
Ben Kurtovic
7dcfa3fe92
Implement Tokenizer_really_parse_external_link(), some other fixes
11 years ago
Ben Kurtovic
6ecf15cad4
Tokenizer_parse_external_link()
11 years ago
Ben Kurtovic
a1948b06aa
Tokenizer_parse_bracketed/free_uri_scheme(), other adjustments
11 years ago
Ben Kurtovic
7b84b3f0df
Refactor out C's is_marker(); hooks for ext links.
11 years ago
Ben Kurtovic
d42e05a554
Implement improved wikilink handling.
11 years ago
Ben Kurtovic
5e6e5b6301
tag_defs.py -> definitions.py; more outline stuff
11 years ago
Ben Kurtovic
cbf67c7842
Add hooks for some ext link stuff; add an INVALID_LINK aggregate context.
11 years ago
Ben Kurtovic
0d934f8ad1
Squash a couple memory leaks.
11 years ago
Ben Kurtovic
8923d96a57
More unification.
11 years ago
Ben Kurtovic
5e8e050ca3
A few tweaks; py3k support now complete.
11 years ago
Ben Kurtovic
db86176c08
wiki_markup attr should be unicode, not bytes
11 years ago
Ben Kurtovic
b5ec7f3beb
Fix py3k module importing; stick a bunch of macros in one place.
11 years ago
Ben Kurtovic
e02ad8239f
Make load_entitydefs() work on Python 3.
11 years ago
Ben Kurtovic
25d53cacf8
Begin porting C tokenizer to Python 3.
11 years ago
Ben Kurtovic
be5d2cbb07
Support HTML entities inside parser-blacklisted tags (closes #36)
11 years ago
Ben Kurtovic
ebf99d722c
Combine emit()/emit_first() internally.
11 years ago
Ben Kurtovic
95efa7dde9
emit_FAST() -> emit(); emit() -> emit_kwargs()
11 years ago
Ben Kurtovic
a07a96d4ba
Finish emit()'s kwargs version.
11 years ago
Ben Kurtovic
6036dc9d62
Finish new emit_first() and emit_first_kwargs()
11 years ago
Ben Kurtovic
51ac97de04
Make macros out of the failing/unsafe contexts.
11 years ago
Ben Kurtovic
df9f7388b6
emit_FAST(), emit_first_FAST(); update comment parsing
11 years ago
Ben Kurtovic
36180a9e47
To clarify usage, emit_text() -> emit_char() and emit_string() -> emit_text()
11 years ago
Ben Kurtovic
c1379d5f21
Add an emit_string() as a shortcut; a bunch of minor cleanup.
11 years ago
Ben Kurtovic
bbcb906f37
handle_dl_term()
11 years ago
Ben Kurtovic
9993ffe8bf
handle_hr()
11 years ago