959 Commits (77514ee925db8acdbfc95bbfe42b359c4231a237)
 

Author SHA1 Message Date
  Ben Kurtovic 77514ee925 Add another PDF string substitution. 10 years ago
  Ben Kurtovic 0bdcbca8b0 Rudimentary solution for PDF parsing (closes earwig/copyvios#18) 10 years ago
  Ben Kurtovic 30f72df470 Refactor parsers; fix empty document behavior. 10 years ago
  Ben Kurtovic 5349179088 Fix parsing of plain text documents (earwig/copyvios#3) 10 years ago
  Ben Kurtovic f10908e34e Handle struct.error from GzipFile.read() (Python bug?) 10 years ago
  Ben Kurtovic 693cdc302f Catch errors while searching. 10 years ago
  Ben Kurtovic 303c39c8c7 Add an option to disable short-circuiting. 10 years ago
  Ben Kurtovic f8f4669460 Remove unnecessary key attribute of sources. 10 years ago
  Ben Kurtovic 9fd145da5c Add some docs; better sorting function. 10 years ago
  Ben Kurtovic 7afb484cea Refactor a bunch of copyvio internals. Store all sources with a result object. 10 years ago
  Ben Kurtovic e88d1c2c70 Fix lazy module behavior after failure. 10 years ago
  Ben Kurtovic 54ddff049f Make CopyvioSource public; tweaks. 10 years ago
  Ben Kurtovic 0438766ee4 Handle empty URLs better. 10 years ago
  Ben Kurtovic 2147207388 Remove unnecessary variable assign. 10 years ago
  Ben Kurtovic f94a67e0e3 Define num_queries in the proper place. 10 years ago
  Ben Kurtovic 12247dd756 Add no_links and no_searches to copyvio_check(). 10 years ago
  Ben Kurtovic f37621e5ec Use a deque for a FIFO instead of the python list LIFO. 10 years ago
  Ben Kurtovic 8e439e1eea source.join() now blocks when in the middle of processing. 10 years ago
  Ben Kurtovic dbb1ae5483 Handle empty queues correctly. Remove some log messages. 10 years ago
  Ben Kurtovic 2fa8aeba5b Fix a blocking issue. 10 years ago
  Ben Kurtovic c56838e742 Only spawn one worker for comparisons in local mode. 10 years ago
  Ben Kurtovic 939d8be08f Fix variable. 10 years ago
  Ben Kurtovic 3ed8837a3e Fix stopping queues in local mode. 10 years ago
  Ben Kurtovic de7576728f Fix dequeueing logic a bit. 10 years ago
  Ben Kurtovic b939262b11 Bugfix. 10 years ago
  Ben Kurtovic 32ef0fbf1f Add a bunch of temporary debugging code. 10 years ago
  Ben Kurtovic c7b3b7bc7f CopyvioSource.workspace should be public. 10 years ago
  Ben Kurtovic e73e626994 Some locks needed to be tightened. 10 years ago
  Ben Kurtovic 486c4692ed Remove _workers attr of workspaces. 10 years ago
  Ben Kurtovic 7c0e98596c Some bugfixes. 10 years ago
  Ben Kurtovic 361f7709f8 Starting work on global workers. 10 years ago
  Ben Kurtovic bdcbfa5327 Catch errors around response.read(). 10 years ago
  Ben Kurtovic 9b87e2e5f7 Fix trying to remove a node that was already removed. 10 years ago
  Ben Kurtovic 24dd497fd9 Catch more general socket.error. 10 years ago
  Ben Kurtovic 5e72e74759 Employ new piecewise article-delta confidence function. 10 years ago
  Ben Kurtovic 193f96451e Also strip <ref>s in ArticleTextParser.strip(). 10 years ago
  Ben Kurtovic c4dede1459 Reorder length check to potentially fix an empty-query bug. 10 years ago
  Ben Kurtovic 203c65280c Float delta. 10 years ago
  Ben Kurtovic 6b0f8ad311 Fix reference. 10 years ago
  Ben Kurtovic e2d7c7aef6 Update with new confidence function; fix unicode. 10 years ago
  Ben Kurtovic 05010933c7 Reorder some URL opening code; zip protection. 10 years ago
  Ben Kurtovic 4f5a22a2e5 Apparently oauth2 converts the query to unicode. 10 years ago
  Ben Kurtovic 5003c21ff6 Quoting the entire query works now. 10 years ago
  Ben Kurtovic 5677664476 Properly encode URL for the search engine. 10 years ago
  Ben Kurtovic 5890ee6e6a Don't quote_plus() the query. 10 years ago
  Ben Kurtovic 2bddf79a3d Fix deadlock when calling queue.put() while holding the mutex. 10 years ago
  Ben Kurtovic 7a4fcd7807 Fix queue clear call. 10 years ago
  Ben Kurtovic efae85a1fe Move thread spawning code to worker class. 10 years ago
  Ben Kurtovic 6a90efc812 Improve !threads command output. 10 years ago
  Ben Kurtovic 7137dda920 Update copyvio checker to not make concurrent requests to a single domain. 10 years ago