Ben Kurtovic
8cd0bdb322
Autoformat: black + clang-format + clang-tidy
2 years ago
Ben Kurtovic
297bcb0cee
Move mwparserfromhell to src/ dir
3 years ago
Kunal Mehta
7e5297fbe6
Drop Python 2 support
Fixes #221.
4 years ago
Ben Kurtovic
aaffb7f66b
Update copyright for 2016.
8 years ago
Ben Kurtovic
2005efd309
Split up C tokenizer into tag_data, tok_parse, tok_support, tokens.
9 years ago
Ben Kurtovic
0e547aa416
Begin splitting up C tokenizer.
9 years ago
Ben Kurtovic
a8c0ff3f29
Remove stdint.h include for MSVC 2008.
9 years ago
Ben Kurtovic
e71e7b4ece
Update copyright years for 2015; fix whitespace in docs.
9 years ago
Ben Kurtovic
9fc4b909e1
Refactor a lot of table error recovery code.
9 years ago
David Winegar
c802b1f814
Change context to uint64_t
One-line fix
9 years ago
David Winegar
213c105666
Table tags are no longer self-closing
Table tags no longer self-closing. Rows and cells now contain their
contents. Also refactored out an `emit_table_tag` method.
Note: this will require changes to the Tag node and possibly the builder,
those changes will be in the next commit.
10 years ago
David Winegar
0128b1f78a
Implement CTokenizer for tables
CTokenizer is completely implemented in this commit - it didn't
make much sense to me to split it up. All tests passing, memory test
shows no leaks on Linux.
10 years ago
David Winegar
2d945b30e5
Use uint64_t for context
For the C tokenizer, include `<stdint.h>` and use `uint64_t` instead
of `int` for context. Changes to tables mean that context can be
larger than 32 bits, and it is possible for `int` to only have 16
bits anyways (though this is very unlikely).
10 years ago
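The reasoning in the `uint64_t` commit above can be illustrated with a minimal sketch. The flag names and bit positions below are hypothetical, chosen only to show a context flag that sits above bit 31; they are not the tokenizer's actual constants.

```c
#include <stdint.h>

/* Hypothetical context flags, for illustration only. */
#define CTX_TEMPLATE   (UINT64_C(1) << 0)
#define CTX_TABLE_CELL (UINT64_C(1) << 33) /* does not fit in 32 bits */

/* Return nonzero if any of the given flags are set in the context. */
static int context_has(uint64_t context, uint64_t flags)
{
    return (context & flags) != 0;
}

/* Return the context with the given flags set. */
static uint64_t context_set(uint64_t context, uint64_t flags)
{
    return context | flags;
}

/* Return the context with the given flags cleared. */
static uint64_t context_clear(uint64_t context, uint64_t flags)
{
    return context & ~flags;
}
```

With a 32-bit `int`, a flag such as the assumed `CTX_TABLE_CELL` would be lost on assignment; a fixed-width `uint64_t` keeps all 64 flag bits regardless of the platform's `int` size.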
Ben Kurtovic
b997e4cd71
Support attributes quoted with '; add required quotes in value setter.
10 years ago
Ben Kurtovic
08cafc0576
Raise ParserError for internal problems. Improve coverage. Cleanup.
10 years ago
Ben Kurtovic
5c5fd6b3cb
Fix a bug involving nested links (closes #61 and #62).
10 years ago
Ben Kurtovic
e5f17eea00
Update copyright notices for 2014.
10 years ago
Ben Kurtovic
1946cf621d
Add a temporary skip_style_tags until we resolve some issues.
10 years ago
Ben Kurtovic
066049b46a
Update email address.
10 years ago
Ben Kurtovic
38050f6878
C code cleanup and speed improvements.
10 years ago
Ben Kurtovic
1bf9868753
Proper sentinel handling with free links in the C tokenizer.
10 years ago
Ben Kurtovic
fcdc0abd22
Fix autofail contexts.
10 years ago
Ben Kurtovic
2561cf5b5e
Fix all bugs in C implementation of external links.
10 years ago
Ben Kurtovic
7dcfa3fe92
Implement Tokenizer_really_parse_external_link(), some other fixes
10 years ago
Ben Kurtovic
6ecf15cad4
Tokenizer_parse_external_link()
10 years ago
Ben Kurtovic
a1948b06aa
Tokenizer_parse_bracketed/free_uri_scheme(), other adjustments
10 years ago
Ben Kurtovic
7b84b3f0df
Refactor out C's is_marker(); hooks for ext links.
10 years ago
Ben Kurtovic
d42e05a554
Implement improved wikilink handling.
10 years ago
Ben Kurtovic
5e6e5b6301
tag_defs.py -> definitions.py; more outline stuff
10 years ago
Ben Kurtovic
cbf67c7842
Add hooks for some ext link stuff; add a INVALID_LINK aggregate context.
10 years ago
Ben Kurtovic
8923d96a57
More unification.
10 years ago
Ben Kurtovic
b5ec7f3beb
Fix py3k module importing; stick a bunch of macros in one place.
10 years ago
Ben Kurtovic
25d53cacf8
Begin porting C tokenizer to Python 3.
10 years ago
Ben Kurtovic
ebf99d722c
Combine emit()/emit_first() internally.
10 years ago
Ben Kurtovic
51ac97de04
Make macros out of the failing/unsafe contexts.
10 years ago
Ben Kurtovic
df9f7388b6
emit_FAST(), emit_first_FAST(); update comment parsing
10 years ago
Ben Kurtovic
c20d3f2a6a
handle_list_marker() and handle_list()
10 years ago
Ben Kurtovic
9b98907751
Add C hooks and prototypes for wiki-markup tags.
10 years ago
Ben Kurtovic
4663563ce4
Remove unnecessary markers.
11 years ago
Ben Kurtovic
e83f321340
Rearrange functions; remove useless prototypes.
11 years ago
Ben Kurtovic
e3fc27c9e3
Refactor TagData code into dedicated functions.
11 years ago
Ben Kurtovic
d02a6da81e
Implement Tokenizer_handle_tag_space(); refactor textbuffer writing.
- Add a test for very long strings of text.
11 years ago
Ben Kurtovic
9365fcf6e4
Implement Tokenizer_handle_tag_data(); add a read-backwards macro.
11 years ago
Ben Kurtovic
e636bf77cf
Implement Tokenizer_push_tag_buffer()
11 years ago
Ben Kurtovic
653071379b
Finish porting misc changes; add prototypes for remaining functions.
11 years ago
Ben Kurtovic
aca0f78cd7
Port more Python tokenizer updates to C.
11 years ago
Ben Kurtovic
f67cf46900
Start C port of tag tokenization; refactor the init func.
11 years ago
Ben
a689467577
Replace broken log2 function; add a missing comment.
11 years ago
Ben Kurtovic
9ede1121ba
Fix tokenizer.c on Windows; add another template test (#25)
Mostly by @gdooms, with tweaks.
11 years ago
Ben Kurtovic
debcb6577e
Fix recursion issues by giving up at a certain point (closes #16).
- Stop parsing new templates if the template depth gets above
MAX_DEPTH (40) or if we've already tried to parse over MAX_CYCLES
(100,000) templates.
- Add two tests to ensure recursion works somewhat correctly.
- Fix parsing the string "{{" with the Python tokenizer; add a test.
11 years ago
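The recursion guard described in the commit above can be sketched roughly as follows. Only `MAX_DEPTH` (40) and `MAX_CYCLES` (100,000) come from the commit message; the `ParseState` struct, the function names, and the exact boundary comparison are assumptions made for illustration, not the tokenizer's actual code.

```c
#define MAX_DEPTH  40      /* from the commit message */
#define MAX_CYCLES 100000  /* from the commit message */

/* Hypothetical bookkeeping for template parsing. */
typedef struct {
    int depth;   /* current template nesting depth */
    long cycles; /* total template parse attempts so far */
} ParseState;

/* Return 1 if a new template parse may start, 0 if we should give up.
   Boundary choice follows one reading of the message: stop once depth
   is above MAX_DEPTH or more than MAX_CYCLES attempts were made. */
static int can_recurse(const ParseState *st)
{
    return st->depth <= MAX_DEPTH && st->cycles <= MAX_CYCLES;
}

/* Try to enter a nested template; returns 1 on success, 0 if a limit hit. */
static int enter_template(ParseState *st)
{
    if (!can_recurse(st))
        return 0; /* caller would treat the braces as plain text */
    st->depth++;
    st->cycles++;
    return 1;
}

/* Leave a template, never dropping below depth zero. */
static void leave_template(ParseState *st)
{
    if (st->depth > 0)
        st->depth--;
}
```

Capping both depth and total attempts bounds the work done on pathological input: the depth limit stops runaway nesting, while the cycle limit stops repeated failed attempts across a long document.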