Ben Kurtovic
a73f618e0a
Initial conversion to Python 3
3 years ago
Ben Kurtovic
9d66ebc6b2
copyvios: Config-directed URL proxying
3 years ago
Ben Kurtovic
91846ce4fb
Refactor out mirror hinting logic in source parsers.
8 years ago
Ben Kurtovic
03910b6cb5
Add mirror detection logic to parsers; fixes.
8 years ago
Ben Kurtovic
bb819c9306
Explicitly include excluded URLs in the result set; mark as excluded.
8 years ago
Ben Kurtovic
4e8be871b7
Update copyright year for 2015.
9 years ago
Ben Kurtovic
5194525a32
Note when sources might have been missed.
9 years ago
Ben Kurtovic
30f72df470
Refactor parsers; fix empty document behavior.
9 years ago
Ben Kurtovic
f8f4669460
Remove unnecessary key attribute of sources.
9 years ago
Ben Kurtovic
9fd145da5c
Add some docs; better sorting function.
9 years ago
Ben Kurtovic
7afb484cea
Refactor a bunch of copyvio internals. Store all sources with a result object.
9 years ago
Ben Kurtovic
54ddff049f
Make CopyvioSource public; tweaks.
9 years ago
Ben Kurtovic
ae0c390ceb
Redesign copyvio internals to parallelize URL loading/parsing.
10 years ago
Ben Kurtovic
39d5c7c149
Update copyright notices for 2014.
10 years ago
Ben Kurtovic
ed95c99f0e
Update email address.
10 years ago
Ben Kurtovic
0b7a13eca5
Update copyright notices for 2013.
11 years ago
Ben Kurtovic
bcf9b70107
Keep track of how long generating results takes; support 'max_time'.
11 years ago
Ben Kurtovic
a4dda89a61
Various fixes for copyvios.
- Fix a bug in ExclusionsDB; improve URL regexes.
- NLTK's LookupError is actually an IOError.
- Fix bug in __repr__ for CopyvioCheckResult.
- Rewrite YahooBOSSSearchEngine to actually work with oauth2.
- Search engines now take a URL opener in addition to credentials.
11 years ago
Ben Kurtovic
a074da853b
More work on copyvios, including an exclusions database (#5)
* Added exclusions module with a fully implemented ExclusionsDB that can pull
from multiple sources for different sites.
* Moved CopyvioCheckResult to its own module, to be imported by __init__.
* Some other related changes.
12 years ago