74 Commits (a73f618e0aec26efdc28b85ae824911e1b9536c1)

Author SHA1 Message Date
  Ben Kurtovic a73f618e0a Initial conversion to Python 3 3 years ago
  Ben Kurtovic 92d43e566e copyvios: Missed a file 3 years ago
  Ben Kurtovic a49a82e263 Fix a few bugs 3 years ago
  Ben Kurtovic 2b5914b6ae Support parser-directed URL redirecting (for Wayback Machine PDFs) 3 years ago
  Ben Kurtovic 774628b34e OAuth support; switch to requests; update login flow 5 years ago
  Ben Kurtovic 7853bcc0f3 Fix dependency checking for search engines. 8 years ago
  Ben Kurtovic 04ed5257c7 Refactor search engines. 8 years ago
  Ben Kurtovic 977b587e5e Add support for Bing Search 8 years ago
  Ben Kurtovic 69cdb41d07 Adjust mirror hints to include direct links back to the article. 8 years ago
  Ben Kurtovic f92fb34d0e Improve sentence splitting, again. 8 years ago
  Ben Kurtovic 108eca13ac Finish mirror hinting algorithm. 8 years ago
  Ben Kurtovic 91846ce4fb Refactor out mirror hinting logic in source parsers. 8 years ago
  Ben Kurtovic 03910b6cb5 Add mirror detection logic to parsers; fixes. 8 years ago
  Ben Kurtovic 4e8be871b7 Update copyright year for 2015. 9 years ago
  Ben Kurtovic 5194525a32 Note when sources might have been missed. 9 years ago
  Ben Kurtovic 303c39c8c7 Add an option to disable short-circuiting. 9 years ago
  Ben Kurtovic 9fd145da5c Add some docs; better sorting function. 9 years ago
  Ben Kurtovic 7afb484cea Refactor a bunch of copyvio internals. Store all sources with a result object. 9 years ago
  Ben Kurtovic f94a67e0e3 Define num_queries in the proper place. 9 years ago
  Ben Kurtovic 12247dd756 Add no_links and no_searches to copyvio_check(). 9 years ago
  Ben Kurtovic c56838e742 Only spawn one worker for comparisons in local mode. 9 years ago
  Ben Kurtovic 7c0e98596c Some bugfixes. 9 years ago
  Ben Kurtovic 361f7709f8 Starting work on global workers. 9 years ago
  Ben Kurtovic bdcbfa5327 Catch errors around response.read(). 9 years ago
  Ben Kurtovic 24dd497fd9 Catch more general socket.error. 9 years ago
  Ben Kurtovic 5e72e74759 Employ new piecewise article-delta confidence function. 9 years ago
  Ben Kurtovic 203c65280c Float delta. 9 years ago
  Ben Kurtovic 6b0f8ad311 Fix reference. 9 years ago
  Ben Kurtovic e2d7c7aef6 Update with new confidence function; fix unicode. 9 years ago
  Ben Kurtovic 05010933c7 Reorder some URL opening code; zip protection. 9 years ago
  Ben Kurtovic 2bddf79a3d Fix deadlock when calling queue.put() while holding the mutex. 9 years ago
  Ben Kurtovic 7a4fcd7807 Fix queue clear call. 9 years ago
  Ben Kurtovic efae85a1fe Move thread spawning code to worker class. 9 years ago
  Ben Kurtovic 7137dda920 Update copyvio checker to not make concurrent requests to a single domain. 9 years ago
  Ben Kurtovic 5874467ec3 Bugfix, cleanup. 9 years ago
  Ben Kurtovic cc7ac52a05 Fix query counting. 10 years ago
  Ben Kurtovic d672e670fa Fix param name. 10 years ago
  Ben Kurtovic 0e28f89466 Update logging. 10 years ago
  Ben Kurtovic ae0c390ceb Redesign copyvio internals to parallelize URL loading/parsing. 10 years ago
  Ben Kurtovic 1501341000 Allow even more time for a URL to time out. 10 years ago
  Ben Kurtovic ccb3c022ca Some servers don't leave a space before the content type parameter list. 10 years ago
  Ben Kurtovic 5e9d4cfa78 copyvios: use a different timeout for direct URL comparisons. 10 years ago
  Ben Kurtovic ea14f39e73 Split content type correctly. 10 years ago
  Ben Kurtovic e0cd174310 Refactor out empty chain definitions. 10 years ago
  Ben Kurtovic 0eadf65a09 Only accept HTML and plain text for copyvio checks. 10 years ago
  Ben Kurtovic c3ddc3d35a Return the correct empty chain. 10 years ago
  Ben Kurtovic 39d5c7c149 Update copyright notices for 2014. 10 years ago
  Ben Kurtovic ed95c99f0e Update email address. 10 years ago
  Ben Kurtovic 5931f375de Put response.read() in the try:, since that's what throws the timeout. 11 years ago
  Ben Kurtovic 0b7a13eca5 Update copyright notices for 2013. 11 years ago