diff --git a/README.rst b/README.rst
index b7d324c..6fd3be5 100644
--- a/README.rst
+++ b/README.rst
@@ -113,23 +113,38 @@ saving the page!) by calling ``str()`` on it::
 
 Likewise, use ``unicode(code)`` in Python 2.
 
-Caveats
--------
+Limitations
+-----------
 
+While the MediaWiki parser generates HTML, mwparserfromhell acts as an interface to
+the source code. mwparserfromhell therefore is unaware of template definitions since
+if it would substitute templates with their output you would no longer be working
+with the source code. This has several implications:
+
+* Start and end tags generated by templates aren't recognized e.g. ``foobar{{bold-end}}``.
+
+* Templates adjacent to external links e.g. ``http://example.com{{foo}}`` are
+  considered part of the link.
+
+* Crossed constructs like ``{{echo|''Hello}}, world!''`` are not supported,
+  the first node is treated as plain text.
+
+  The current workaround for cases where you are not interested in text
+  formatting is to pass ``skip_style_tags=True`` to ``mwparserfromhell.parse()``.
+  This treats ``''`` and ``'''`` like plain text.
+
+  A future version of mwparserfromhell will include multiple parsing modes to get
+  around this restriction.
+
+Configuration unawareness
+-------------------------
-An inherent limitation in wikicode prevents us from generating complete parse
-trees in certain cases. For example, the string ``{{echo|''Hello}}, world!''``
-produces the valid output ``Hello, world!`` in MediaWiki, assuming
-``{{echo}}`` is a template that returns its first parameter. But since
-representing this in mwparserfromhell's node tree would be impossible, we
-compromise by treating the first node (i.e., the template) as plain text,
-parsing only the italics.
+* `word-ending links`_ are not supported since the linktrail rules are language-specific.
 
-The current workaround for cases where you are not interested in text
-formatting is to pass ``skip_style_tags=True`` to ``mwparserfromhell.parse()``.
-This treats ``''`` and ``'''`` like plain text.
+* Localized namespace names aren't recognized, e.g. ``[[File:...]]``
+  links are treated as regular wikilinks.
 
-A future version of mwparserfromhell will include multiple parsing modes to get
-around this restriction.
+* Anything that looks like an XML tag is parsed as a tag
+  since the available tags are extension-dependent.
 
 Integration
 -----------
@@ -174,6 +189,7 @@ Python 3 code (via the API_)::
 .. _GitHub: https://github.com/earwig/mwparserfromhell
 .. _Python Package Index: http://pypi.python.org
 .. _get pip: http://pypi.python.org/pypi/pip
+.. _word-ending links: https://www.mediawiki.org/wiki/Help:Links#linktrail
 .. _EarwigBot: https://github.com/earwig/earwigbot
 .. _Pywikibot: https://www.mediawiki.org/wiki/Manual:Pywikibot
 .. _API: http://mediawiki.org/wiki/API