Try using pentagrams rather than trigrams for copyvio Markov chains.

tags/v0.2
Ben Kurtovic, 9 years ago
parent
commit
d741667c4c
2 changed files with 3 additions and 2 deletions
  1. CHANGELOG (+2, -1)
  2. earwigbot/wiki/copyvios/markov.py (+1, -1)

CHANGELOG (+2, -1)

@@ -15,7 +15,8 @@ v0.2 (unreleased):
 - Added copyvio detector functionality: specifying a max time for checks;
   improved exclusion support. URL loading and parsing is parallelized to speed
   up check times, with a multi-threaded worker model that avoids concurrent
-  requests to the same domain. Fixed assorted bugs.
+  requests to the same domain. Improvements to the comparison algorithm. Fixed
+  assorted bugs.
 - Added support for Wikimedia Labs when creating a config file.
 - Added and improved lazy importing for various dependencies.
 - Fixed a bug in job scheduling.


earwigbot/wiki/copyvios/markov.py (+1, -1)

@@ -30,7 +30,7 @@ class MarkovChain(object):
     """Implements a basic ngram Markov chain of words."""
     START = -1
     END = -2
-    degree = 3  # 2 for bigrams, 3 for trigrams, etc.
+    degree = 5  # 2 for bigrams, 3 for trigrams, etc.

     def __init__(self, text):
         self.text = text
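
To illustrate what raising `degree` from 3 to 5 does to a comparison, here is a minimal standalone sketch. It is not the code in `markov.py`: the helper names `build_ngrams` and `shared_ngrams`, the whitespace tokenization, and the choice to pad with `degree - 1` sentinels on each side are all assumptions made for illustration. Only `START`, `END`, and the idea of a word-level n-gram chain come from the diff above.

```python
from collections import Counter

START = -1  # sentinel values as they appear in the diff
END = -2


def build_ngrams(text, degree):
    """Count each run of `degree` consecutive words (hypothetical helper).

    The word list is padded with degree - 1 START/END sentinels; this
    padding scheme is an assumption, not confirmed by the diff.
    """
    words = text.lower().split()
    padded = [START] * (degree - 1) + words + [END] * (degree - 1)
    return Counter(
        tuple(padded[i:i + degree])
        for i in range(len(padded) - degree + 1)
    )


def shared_ngrams(a, b, degree):
    """Number of n-gram occurrences that both texts have in common."""
    common = build_ngrams(a, degree) & build_ngrams(b, degree)
    return sum(common.values())


article = "the quick brown fox jumps over the lazy dog"
source = "a quick brown fox leaps over the lazy dog"

for degree in (3, 5):
    print(degree, shared_ngrams(article, source, degree))
```

With a higher degree, a single changed word breaks every n-gram that spans it, so short coincidental overlaps contribute less and only longer verbatim runs count as matches.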

