Tests were not correctly testing the situations without a table close.
Fixed tests and then fixed tokenizers for failing tests. Also refactored
pytokenizer to more closely match the ctokenizer by only holding the
`_parse` methods in the try blocks and no other code.
Python 3.4 compiles C extensions with the
`-Werror=declaration-after-statement` flag that enforces C90 more
strictly than previous versions. Move all statements after declarations
to make sure this extension builds on 3.4.
For wiki syntax tables, add `wiki_style_separator` as an attribute
for the Tag node. Also reorder `closing_wiki_markup` property and tests
to match its place in the constructor.
Table tags no longer self-closing. Rows and cells now contain their
contents. Also refactored out an `emit_table_tag` method.
Note: this will require changes to the Tag node and possibly the builder,
those changes will be in the next commit.
CTokenizer is completely implemented in this commit - it didn't
make much sense to me to split it up. All tests passing, memory test
shows no leaks on Linux.
For the C tokenizer, include `<stdint.h>` and use `uint64_t` instead
of `int` for context. Changes to tables mean that context can be
larger than 32 bits, and it is possible for `int` to only have 16
bits anyways (though this is very unlikely).
Various changes to avoid returning tuples - working on the C tokenizer
made me realize this was a bad idea for compatability/similarity between
the two.
Make sure py tokenizer methods only call methods that have been declared
earlier. Not necessary but makes it much easier to maintain/write
the C tokenizer if methods are in the same order.
Removed the `StopIteration()` exception for handling table style
and instead call `_handle_table_cell_end()` with a new parameter.
Also added some random tests for table openings.
Changed row recursion handling to make sure the tag is emitted even
when hitting recursion limits. Need to test table recursion to make
sure that works. Also fixed a bug in which tables were eating the
trailing token. Added several tests for rows and trailing tokens with
tables.
Tables and rows use newlines as padding, partly because these characters
are pretty important to the integrity of the table. They might need
to be in the preceding whitespace of inner tags instead as padding after,
not sure.