Fix regression in parsing nested wikilinks in file captions

This regression seems more severe than the bug the commit was
attempting to fix (incorrect parsing of nested wikilinks in normal
links), so that bug is reintroduced until localization-aware parsing
that allows us to detect file links is added.

This commit partially reverts fac60dee48.
tags/v0.6.4
Ben Kurtovic, 2 years ago
Parent
Commit
2155638b91
5 changed files with 34 additions and 20 deletions
  1. +4 -1 CHANGELOG
  2. +7 -1 docs/changelog.rst
  3. +6 -4 src/mwparserfromhell/parser/ctokenizer/tok_parse.c
  4. +3 -2 src/mwparserfromhell/parser/tokenizer.py
  5. +14 -12 tests/tokenizer/wikilinks.mwtest

+4 -1 CHANGELOG

@@ -1,7 +1,10 @@
-v0.7 (unreleased):
+v0.6.4 (unreleased):
 
 - Dropped support for end-of-life Python 3.5.
 - Added support for Python 3.10. (#278)
+- Fixed a regression in v0.6.2 that broke parsing of nested wikilinks in file
+  captions. For now, the parser will interpret nested wikilinks in normal links
+  as well, even though this differs from MediaWiki. (#270)
 
 v0.6.3 (released September 2, 2021):



+7 -1 docs/changelog.rst

@@ -1,14 +1,19 @@
 Changelog
 =========
 
-v0.7
+v0.6.4
+------
 
 Unreleased
 (`changes <https://github.com/earwig/mwparserfromhell/compare/v0.6.3...develop>`__):
 
 - Dropped support for end-of-life Python 3.5.
 - Added support for Python 3.10.
+  (`#278 <https://github.com/earwig/mwparserfromhell/issues/278>`_)
+- Fixed a regression in v0.6.2 that broke parsing of nested wikilinks in file
+  captions. For now, the parser will interpret nested wikilinks in normal links
+  as well, even though this differs from MediaWiki.
+  (`#270 <https://github.com/earwig/mwparserfromhell/issues/270>`_)
 
 v0.6.3
 ------


+6 -4 src/mwparserfromhell/parser/ctokenizer/tok_parse.c

@@ -51,7 +51,8 @@ static int Tokenizer_parse_tag(Tokenizer *);
 /*
     Determine whether the given code point is a marker.
 */
-static int is_marker(Py_UCS4 this)
+static int
+is_marker(Py_UCS4 this)
 {
     int i;
 
@@ -2929,9 +2930,10 @@ Tokenizer_parse(Tokenizer *self, uint64_t context, int push)
                 return NULL;
             }
         } else if (this == next && next == '[' && Tokenizer_CAN_RECURSE(self)) {
-            if (this_context & LC_WIKILINK_TEXT) {
-                return Tokenizer_fail_route(self);
-            }
+            // TODO: Only do this if not in a file context:
+            // if (this_context & LC_WIKILINK_TEXT) {
+            //     return Tokenizer_fail_route(self);
+            // }
             if (!(this_context & AGG_NO_WIKILINKS)) {
                 if (Tokenizer_parse_wikilink(self)) {
                     return NULL;


+3 -2 src/mwparserfromhell/parser/tokenizer.py

@@ -1406,8 +1406,9 @@ class Tokenizer:
                     return self._handle_argument_end()
                 self._emit_text("}")
             elif this == nxt == "[" and self._can_recurse():
-                if self._context & contexts.WIKILINK_TEXT:
-                    self._fail_route()
+                # TODO: Only do this if not in a file context:
+                # if self._context & contexts.WIKILINK_TEXT:
+                #     self._fail_route()
                 if not self._context & contexts.NO_WIKILINKS:
                     self._parse_wikilink()
                 else:
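
The effect of commenting out the `WIKILINK_TEXT` check can be illustrated with a small standalone sketch. This is not the mwparserfromhell tokenizer: `tokenize`, `BadRoute`, the tuple token format, and the `allow_nested_in_text` flag are all hypothetical names invented here, and the sketch assumes that a `[[` inside a link title always abandons the outer link. With the flag off it mimics the removed check; with it on it mimics the behavior after this commit.

```python
from typing import List, Tuple

Token = Tuple[str, ...]  # e.g. ("WikilinkOpen",) or ("Text", "foo")


class BadRoute(Exception):
    """Raised when the current wikilink attempt has to be abandoned."""


def _emit_text(tokens: List[Token], chunk: str) -> None:
    # Merge adjacent text so output resembles a single Text token.
    if tokens and tokens[-1][0] == "Text":
        tokens[-1] = ("Text", tokens[-1][1] + chunk)
    else:
        tokens.append(("Text", chunk))


def _parse_link(text: str, pos: int, allow_nested_in_text: bool):
    """Parse a link body starting just after '[['; return (tokens, new_pos)."""
    tokens: List[Token] = [("WikilinkOpen",)]
    in_text = False  # False while in the title, True after the first '|'
    while pos < len(text):
        if text.startswith("[[", pos):
            # Assumption: nesting in the title always fails the outer link.
            # Nesting in the text fails only when the old check is enabled.
            if not in_text or not allow_nested_in_text:
                raise BadRoute()
            try:
                inner, pos = _parse_link(text, pos + 2, allow_nested_in_text)
                tokens.extend(inner)
            except BadRoute:
                _emit_text(tokens, "[[")
                pos += 2
        elif text.startswith("]]", pos):
            tokens.append(("WikilinkClose",))
            return tokens, pos + 2
        elif text[pos] == "|" and not in_text:
            tokens.append(("WikilinkSeparator",))
            in_text = True
            pos += 1
        else:
            _emit_text(tokens, text[pos])
            pos += 1
    raise BadRoute()  # ran out of input before the closing ']]'


def tokenize(text: str, allow_nested_in_text: bool = True) -> List[Token]:
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        if text.startswith("[[", pos):
            try:
                link, pos = _parse_link(text, pos + 2, allow_nested_in_text)
                tokens.extend(link)
            except BadRoute:
                # Failed route: treat the brackets as plain text and go on.
                _emit_text(tokens, "[[")
                pos += 2
        else:
            _emit_text(tokens, text[pos])
            pos += 1
    return tokens


print(tokenize("[[file:foo|[[bar]]]]"))
print(tokenize("[[foo|[[bar]]]]", allow_nested_in_text=False))
```

Because the sketch, like the tokenizer at this commit, cannot tell a file link from a normal link, turning the flag on accepts nested links in both cases, which is the trade-off the commit message describes.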


+14 -12 tests/tokenizer/wikilinks.mwtest

@@ -54,6 +54,20 @@ output: [WikilinkOpen(), Text(text="foo"), WikilinkSeparator(), Text(text="bar[b
 
 ---
 
+name: nested
+label: a wikilink nested within another
+input: "[[file:foo|[[bar]]]]"
+output: [WikilinkOpen(), Text(text="file:foo"), WikilinkSeparator(), WikilinkOpen(), Text(text="bar"), WikilinkClose(), WikilinkClose()]
+
+---
+
+name: nested_padding
+label: a wikilink nested within another, separated by other data
+input: "[[file:foo|a[[b]]c]]"
+output: [WikilinkOpen(), Text(text="file:foo"), WikilinkSeparator(), Text(text="a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c"), WikilinkClose()]
+
+---
+
 name: invalid_newline
 label: invalid wikilink: newline as only content
 input: "[[\n]]"
@@ -89,20 +103,6 @@ output: [Text(text="[[foo[bar]]")]
 
 ---
 
-name: invalid_nested_text
-label: invalid wikilink: nested within the text of another
-input: "[[foo|[[bar]]]]"
-output: [Text(text="[[foo|"), WikilinkOpen(), Text(text="bar"), WikilinkClose(), Text(text="]]")]
-
----
-
-name: invalid_nested_text_2
-label: invalid wikilink: a wikilink nested within the text of another, with additional content
-input: "[[foo|a[[b]]c]]"
-output: [Text(text="[[foo|a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c]]")]
-
----
-
 name: invalid_nested_title
 label: invalid wikilink: nested within the title of another
 input: "[[foo[[bar]]]]"
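
For readers unfamiliar with the `.mwtest` layout used above, the following is a rough sketch of how such a file could be split into cases. It is an illustration only: `parse_mwtest` is a hypothetical helper, not the project's actual test loader, and it assumes that cases are separated by a `---` line, that each field sits on one `key: value` line, and that `input` is a quoted Python-style string literal. The `output` field is kept as a raw string rather than evaluated.

```python
import ast
from typing import Dict, List

FIELDS = {"name", "label", "input", "output"}

# Two cases copied from the hunk above.
SAMPLE = r'''name: nested
label: a wikilink nested within another
input: "[[file:foo|[[bar]]]]"
output: [WikilinkOpen(), Text(text="file:foo"), WikilinkSeparator(), WikilinkOpen(), Text(text="bar"), WikilinkClose(), WikilinkClose()]

---

name: nested_padding
label: a wikilink nested within another, separated by other data
input: "[[file:foo|a[[b]]c]]"
output: [WikilinkOpen(), Text(text="file:foo"), WikilinkSeparator(), Text(text="a"), WikilinkOpen(), Text(text="b"), WikilinkClose(), Text(text="c"), WikilinkClose()]
'''


def parse_mwtest(source: str) -> List[Dict[str, str]]:
    """Split .mwtest-style text into one dict per test case."""
    cases = []
    for block in source.split("\n---\n"):
        case: Dict[str, str] = {}
        for line in block.splitlines():
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep and key in FIELDS:
                case[key] = value.strip()
        if "input" in case:
            # Decode the quoted literal (handles escapes such as \n).
            case["input"] = ast.literal_eval(case["input"])
        if case:
            cases.append(case)
    return cases


for case in parse_mwtest(SAMPLE):
    print(case["name"], "->", case["input"])
```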

