copyvios: Catch PDF parser exceptions more aggressively.

tags/v0.3
Ben Kurtovic, 5 years ago
Commit 42a224f365
1 file changed, 1 addition and 3 deletions
earwigbot/wiki/copyvios/parsers.py | +1 -3

@@ -34,8 +34,6 @@ nltk = importer.new("nltk")
 converter = importer.new("pdfminer.converter")
 pdfinterp = importer.new("pdfminer.pdfinterp")
 pdfpage = importer.new("pdfminer.pdfpage")
-pdftypes = importer.new("pdfminer.pdftypes")
-psparser = importer.new("pdfminer.psparser")

 __all__ = ["ArticleTextParser", "get_parser"]

@@ -294,7 +292,7 @@ class _PDFParser(_BaseTextParser):
             pages = pdfpage.PDFPage.get_pages(StringIO(self.text))
             for page in pages:
                 interp.process_page(page)
-        except (pdftypes.PDFException, psparser.PSException, AssertionError):
+        except Exception:  # pylint: disable=broad-except
             return output.getvalue().decode("utf8")
         finally:
             conv.close()
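
The pattern in this hunk (catch any exception raised while processing pages, return whatever text was extracted so far, and always release the converter) can be sketched in isolation. This is a minimal illustration only, not the earwigbot implementation: `extract_text`, `process_page`, `cleanup`, and `flaky` are hypothetical names, and plain strings stand in for pdfminer page objects.

```python
from io import StringIO


def extract_text(pages, process_page, cleanup=lambda: None):
    """Join the text of each page, keeping partial output if parsing fails.

    All three parameters are hypothetical stand-ins: `pages` for the page
    iterator, `process_page` for the interpreter call, and `cleanup` for
    the resource release that the diff performs with conv.close().
    """
    output = StringIO()
    try:
        for page in pages:
            output.write(process_page(page))
    except Exception:  # pylint: disable=broad-except
        # Malformed input can raise nearly any exception type, so stop here
        # and return the text gathered before the failure.
        return output.getvalue()
    finally:
        # Runs on both the success path and the early-return path.
        cleanup()
    return output.getvalue()


def flaky(page):
    """Hypothetical page handler that fails on one specific input."""
    if page == "bad":
        raise ValueError("malformed page")
    return page.upper()


print(extract_text(["a", "b", "bad", "c"], flaky))
```

The trade-off of a broad `except Exception` is that genuine programming errors are swallowed along with parser failures, which is presumably why the commit pairs it with an explicit pylint suppression instead of leaving it unannotated.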

