@@ -1,12 +1,20 @@
 v0.4 (unreleased):
-- Copyvio detector: improved parsing of excluded URL lists.
-- Wiki: fixed not sending Content-Type header in POST requests.
+- Migrated to Python 3.
+- Copyvios: Configurable proxy support for specific domains.
+- Copyvios: Parser-directed URL redirection.
+- Copyvios: General parsing improvements.
+- Copyvios: URL exclusion improvements.
+- Copyvios: Removed long-deprecated Yahoo! BOSS search engine.
+- Wiki: Fixed not sending Content-Type header in POST requests.
+- IRC: Remember joined channels across restarts.
+- IRC: Added !listchans.
+- IRC > !stalk: Added modifiers to change message format or filter messages.
 v0.3 (released March 24, 2019):
 - Added various new features to the WikiProjectTagger task.
-- Copyvio detector: improved sentence splitting algorithm; many performance
+- Copyvio detector: Improved sentence splitting algorithm; many performance
   improvements.
 - Improved config file command/task exclusion logic.
 - Wiki: Added logging for warnings.
@@ -9,16 +9,15 @@ online at PyPI_).
 History
 -------
-Development began, based on the `Pywikipedia framework`_, in early 2009.
-Approval for its first task, a `copyright violation detector`_, was carried out
-in May, and the bot has been running consistently ever since (with the
-exception of Jan/Feb 2011). It currently handles `several ongoing tasks`_
-ranging from statistics generation to category cleanup, and on-demand tasks
-such as WikiProject template tagging. Since it started running, the bot has
-made over 50,000 edits.
+Development began, based on `Pywikibot`_, in early 2009. Approval for its
+first task, a `copyright violation detector`_, was carried out in May, and the
+bot has been running consistently ever since (with the exception of Jan/Feb 2011).
+It currently handles `several ongoing tasks`_ ranging from statistics generation
+to category cleanup, and on-demand tasks such as WikiProject template tagging.
+Since it started running, the bot has made over 250,000 edits.
 A project to rewrite it from scratch began in early April 2011, thus moving
-away from the Pywikipedia framework and allowing for less overall code, better
+away from the Pywikibot framework and allowing for less overall code, better
 integration between bot parts, and easier maintenance.
 Installation
@@ -181,21 +180,21 @@ Footnotes
 .. [1] ``python setup.py install``/``develop`` may require root, or use the
    ``--user`` switch to install for the current user only.
-.. _EarwigBot: http://en.wikipedia.org/wiki/User:EarwigBot
-.. _Python: http://python.org/
-.. _Wikipedia: http://en.wikipedia.org/
-.. _IRC: http://en.wikipedia.org/wiki/Internet_Relay_Chat
-.. _PyPI: http://packages.python.org/earwigbot
-.. _Pywikipedia framework: http://pywikipediabot.sourceforge.net/
-.. _copyright violation detector: http://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/EarwigBot_1
-.. _several ongoing tasks: http://en.wikipedia.org/wiki/User:EarwigBot#Tasks
-.. _my instance of EarwigBot: http://en.wikipedia.org/wiki/User:EarwigBot
+.. _EarwigBot: https://en.wikipedia.org/wiki/User:EarwigBot
+.. _Python: https://python.org/
+.. _Wikipedia: https://en.wikipedia.org/
+.. _IRC: https://en.wikipedia.org/wiki/Internet_Relay_Chat
+.. _PyPI: https://packages.python.org/earwigbot
+.. _Pywikibot: https://www.mediawiki.org/wiki/Manual:Pywikibot
+.. _copyright violation detector: https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/EarwigBot_1
+.. _several ongoing tasks: https://en.wikipedia.org/wiki/User:EarwigBot#Tasks
+.. _my instance of EarwigBot: https://en.wikipedia.org/wiki/User:EarwigBot
 .. _earwigbot-plugins: https://github.com/earwig/earwigbot-plugins
 .. _Python Package Index: https://pypi.python.org/pypi/earwigbot
-.. _get pip: http://pypi.python.org/pypi/pip
-.. _this StackOverflow post: http://stackoverflow.com/questions/6504810/how-to-install-lxml-on-ubuntu/6504860#6504860
-.. _git flow: http://nvie.com/posts/a-successful-git-branching-model/
-.. _explanation of YAML: http://en.wikipedia.org/wiki/YAML
+.. _get pip: https://pypi.python.org/pypi/pip
+.. _this StackOverflow post: https://stackoverflow.com/questions/6504810/how-to-install-lxml-on-ubuntu/6504860#6504860
+.. _git flow: https://nvie.com/posts/a-successful-git-branching-model/
+.. _explanation of YAML: https://en.wikipedia.org/wiki/YAML
 .. _earwigbot.bot.Bot: https://github.com/earwig/earwigbot/blob/develop/earwigbot/bot.py
 .. _earwigbot.config.BotConfig: https://github.com/earwig/earwigbot/blob/develop/earwigbot/config.py
 .. _earwigbot.commands.Command: https://github.com/earwig/earwigbot/blob/develop/earwigbot/commands/__init__.py
@@ -7,25 +7,25 @@ over IRC_.
 History
 -------
-Development began, based on the `Pywikipedia framework`_, in early 2009.
-Approval for its fist task, a `copyright violation detector`_, was carried out
-in May, and the bot has been running consistently ever since (with the
-exception of Jan/Feb 2011). It currently handles `several ongoing tasks`_
-ranging from statistics generation to category cleanup, and on-demand tasks
-such as WikiProject template tagging. Since it started running, the bot has
-made over 50,000 edits.
+Development began, based on `Pywikibot`_, in early 2009. Approval for its
+first task, a `copyright violation detector`_, was carried out in May, and the
+bot has been running consistently ever since (with the exception of Jan/Feb 2011).
+It currently handles `several ongoing tasks`_ ranging from statistics generation
+to category cleanup, and on-demand tasks such as WikiProject template tagging.
+Since it started running, the bot has made over 250,000 edits.
 A project to rewrite it from scratch began in early April 2011, thus moving
-away from the Pywikipedia framework and allowing for less overall code, better
+away from the Pywikibot framework and allowing for less overall code, better
 integration between bot parts, and easier maintenance.
-.. _EarwigBot: http://en.wikipedia.org/wiki/User:EarwigBot
-.. _Python: http://python.org/
-.. _Wikipedia: http://en.wikipedia.org/
-.. _IRC: http://en.wikipedia.org/wiki/Internet_Relay_Chat
-.. _Pywikipedia framework: http://pywikipediabot.sourceforge.net/
-.. _copyright violation detector: http://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/EarwigBot_1
-.. _several ongoing tasks: http://en.wikipedia.org/wiki/User:EarwigBot#Tasks
+.. _EarwigBot: https://en.wikipedia.org/wiki/User:EarwigBot
+.. _Python: https://python.org/
+.. _Wikipedia: https://en.wikipedia.org/
+.. _IRC: https://en.wikipedia.org/wiki/Internet_Relay_Chat
+.. _PyPI: https://packages.python.org/earwigbot
+.. _Pywikibot: https://www.mediawiki.org/wiki/Manual:Pywikibot
+.. _copyright violation detector: https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/EarwigBot_1
+.. _several ongoing tasks: https://en.wikipedia.org/wiki/User:EarwigBot#Tasks
 Contents
 --------
@@ -50,9 +50,9 @@ features (``feature/*`` branches)::
    or use the :command:`--user` switch to install for the current user
    only.
-.. _my instance of EarwigBot: http://en.wikipedia.org/wiki/User:EarwigBot
+.. _my instance of EarwigBot: https://en.wikipedia.org/wiki/User:EarwigBot
 .. _earwigbot-plugins: https://github.com/earwig/earwigbot-plugins
-.. _Python Package Index: http://pypi.python.org
-.. _get pip: http://pypi.python.org/pypi/pip
-.. _this StackOverflow post: http://stackoverflow.com/questions/6504810/how-to-install-lxml-on-ubuntu/6504860#6504860
-.. _git flow: http://nvie.com/posts/a-successful-git-branching-model/
+.. _Python Package Index: https://pypi.python.org/pypi/earwigbot
+.. _get pip: https://pypi.python.org/pypi/pip
+.. _this StackOverflow post: https://stackoverflow.com/questions/6504810/how-to-install-lxml-on-ubuntu/6504860#6504860
+.. _git flow: https://nvie.com/posts/a-successful-git-branching-model/
@@ -25,4 +25,4 @@ You can stop the bot at any time with :kbd:`Control-c`, same as you stop a
 normal Python program, and it will try to exit safely. You can also use the
 "``!quit``" command on IRC.
-.. _explanation of YAML: http://en.wikipedia.org/wiki/YAML
+.. _explanation of YAML: https://en.wikipedia.org/wiki/YAML
@@ -40,7 +40,7 @@ Tips
    <earwigbot.managers._ResourceManager.load>` and
    :py:meth:`bot.tasks.load() <earwigbot.managers._ResourceManager.load>`!
-.. _logging: http://docs.python.org/library/logging.html
+.. _logging: https://docs.python.org/library/logging.html
 .. _!git plugin: https://github.com/earwig/earwigbot-plugins/blob/develop/commands/git.py
 .. _Let me know: ben.kurtovic@gmail.com
 .. _create an issue: https://github.com/earwig/earwigbot/issues
@@ -1,7 +1,7 @@
 The Wiki Toolset
 ================
-EarwigBot's answer to the `Pywikipedia framework`_ is the Wiki Toolset
+EarwigBot's answer to `Pywikibot`_ is the Wiki Toolset
 (:py:mod:`earwigbot.wiki`), which you will mainly access through
 :py:attr:`bot.wiki <earwigbot.bot.Bot.wiki>`.
@@ -43,7 +43,7 @@ wikis, you can usually use code like this::
     try:
         site = bot.wiki.get_site(project=project, lang=lang)
     except earwigbot.SiteNotFoundError:
-        # Load site info from http://es.wikipedia.org/w/api.php:
+        # Load site info from https://es.wikipedia.org/w/api.php:
         site = bot.wiki.add_site(project=project, lang=lang)
 This works because EarwigBot assumes that the URL for the site is
@@ -56,8 +56,8 @@ like::
     try:
         site = bot.wiki.get_site(project=project, lang=lang)
     except earwigbot.SiteNotFoundError:
-        # Load site info from http://mysite.net/mywiki/it/s/api.php:
-        base_url = "http://mysite.net/" + project + "/" + lang
+        # Load site info from https://mysite.net/mywiki/it/s/api.php:
+        base_url = "https://mysite.net/" + project + "/" + lang
         db_name = lang + project + "_p"
         sql = {host: "sql.mysite.net", db: db_name}
         site = bot.wiki.add_site(base_url=base_url, script_path="/s", sql=sql)
@@ -242,6 +242,6 @@ docstrings`_ to learn how to use it in a more hands-on fashion. For reference,
 :py:class:`earwigbot.wiki.SitesDB <earwigbot.wiki.sitesdb.SitesDB>` tied to the
 :file:`sites.db` file in the bot's working directory.
-.. _Pywikipedia framework: http://pywikipediabot.sourceforge.net/
-.. _CentralAuth: http://www.mediawiki.org/wiki/Extension:CentralAuth
+.. _Pywikibot: https://www.mediawiki.org/wiki/Manual:Pywikibot
+.. _CentralAuth: https://www.mediawiki.org/wiki/Extension:CentralAuth
 .. _its code and docstrings: https://github.com/earwig/earwigbot/tree/develop/earwigbot/wiki
@@ -26,7 +26,7 @@ Wikipedia and interacts with people over IRC.
 See :file:`README.rst` for an overview, or the :file:`docs/` directory for
 details. This documentation is also available `online
-<http://packages.python.org/earwigbot>`_.
+<https://packages.python.org/earwigbot>`_.
 """
 __author__ = "Ben Kurtovic"
@@ -32,7 +32,7 @@ from earwigbot.wiki import SitesDB
 __all__ = ["Bot"]
-class Bot(object):
+class Bot:
     """
     **EarwigBot: Main Bot Class**
@@ -147,7 +147,7 @@ class Bot(object):
         advance warning of their forced shutdown.
         """
         tasks = []
-        component_names = self.config.components.keys()
+        component_names = list(self.config.components.keys())
         skips = component_names + ["MainThread", "reminder", "irc:quit"]
         for thread in enumerate_threads():
             if thread.is_alive() and not any(
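The `list()` wrapper in the hunk above is needed because Python 3 dict methods return view objects, which cannot be concatenated with a list using `+`. A minimal standalone sketch (the names here are illustrative, not the bot's actual config):

```python
# Hypothetical component mapping, standing in for self.config.components:
components = {"irc_frontend": {}, "irc_watcher": {}}

# Python 3: dict.keys() is a view; view + list raises TypeError.
try:
    skips = components.keys() + ["MainThread"]
except TypeError:
    # The Python 3 fix: materialize the view into a list first.
    skips = list(components.keys()) + ["MainThread"]

print(skips)  # ['irc_frontend', 'irc_watcher', 'MainThread']
```

The same applies to `values()` and `items()`, which is why the `itervalues()`/`iteritems()` calls elsewhere in this PR become plain `values()`/`items()`.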
@@ -22,7 +22,7 @@
 __all__ = ["Command"]
-class Command(object):
+class Command:
     """
     **EarwigBot: Base IRC Command**
@@ -54,9 +54,8 @@ class Command(object):
         This is called once when the command is loaded (from
         :py:meth:`commands.load() <earwigbot.managers._ResourceManager.load>`).
         *bot* is our base :py:class:`~earwigbot.bot.Bot` object. Don't override
-        this directly; if you do, remember to place
-        ``super(Command, self).__init__()`` first. Use :py:meth:`setup` for
-        typical command-init/setup needs.
+        this directly; if you do, remember to place ``super().__init__()`` first.
+        Use :py:meth:`setup` for typical command-init/setup needs.
         """
         self.bot = bot
         self.config = bot.config
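The docstring change above reflects Python 3's zero-argument `super()` form. A simplified, hypothetical sketch of the pattern it describes (not the real `Command` class, which takes more setup than shown here):

```python
class Command:
    """Simplified stand-in for the base IRC command class."""
    def __init__(self, bot):
        self.bot = bot
        self.setup()  # subclass hook; override this instead of __init__

    def setup(self):
        pass

class Stalk(Command):
    """Hypothetical subclass, for illustration only."""
    def __init__(self, bot):
        # If __init__ must be overridden, chain up first, using the
        # Python 3 zero-argument form instead of super(Command, self):
        super().__init__(bot)
        self.extra = True

    def setup(self):
        self.targets = []

cmd = Stalk(bot=None)
print(cmd.targets, cmd.extra)  # [] True
```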
@@ -21,12 +21,13 @@
 # SOFTWARE.
 import re
-import urllib
+import urllib.request
+import urllib.parse
 from earwigbot.commands import Command
 class Calc(Command):
-    """A somewhat advanced calculator: see http://futureboy.us/fsp/frink.fsp
+    """A somewhat advanced calculator: see https://futureboy.us/fsp/frink.fsp
     for details."""
     name = "calc"
@@ -38,9 +39,9 @@ class Calc(Command):
         query = ' '.join(data.args)
         query = self.cleanup(query)
-        url = "http://futureboy.us/fsp/frink.fsp?fromVal={0}"
-        url = url.format(urllib.quote(query))
-        result = urllib.urlopen(url).read()
+        url = "https://futureboy.us/fsp/frink.fsp?fromVal={0}"
+        url = url.format(urllib.parse.quote(query))
+        result = urllib.request.urlopen(url).read().decode()
         r_result = re.compile(r'(?i)<A NAME=results>(.*?)</A>')
         r_tag = re.compile(r'<\S+.*?>')
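Python 3 split the old `urllib` module into `urllib.request` (fetching) and `urllib.parse` (quoting), which is what the hunk above migrates to. The quoting half can be exercised without any network access:

```python
import urllib.parse

# Python 2's urllib.quote is now urllib.parse.quote; spaces and '+' in the
# query must be percent-encoded before being placed in the URL:
query = "2 + 2 feet in meters"
url = "https://futureboy.us/fsp/frink.fsp?fromVal={0}".format(
    urllib.parse.quote(query))
print(url)
# https://futureboy.us/fsp/frink.fsp?fromVal=2%20%2B%202%20feet%20in%20meters
```

Note also that `urlopen(...).read()` returns `bytes` on Python 3, hence the added `.decode()` in the diff.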
@@ -64,13 +65,14 @@ class Calc(Command):
         res = "%s = %s" % (query, result)
         self.reply(data, res)
-    def cleanup(self, query):
+    @staticmethod
+    def cleanup(query):
         fixes = [
             (' in ', ' -> '),
             (' over ', ' / '),
-            (u'£', 'GBP '),
-            (u'€', 'EUR '),
-            ('\$', 'USD '),
+            ('£', 'GBP '),
+            ('€', 'EUR '),
+            (r'\$', 'USD '),
             (r'\bKB\b', 'kilobytes'),
             (r'\bMB\b', 'megabytes'),
             (r'\bGB\b', 'gigabytes'),
@@ -134,7 +134,7 @@ class CIDR(Command):
             bin_ips[i] = bin_ips[i][:ip.size] + suffix
         size = len(bin_ips[0])
-        for i in xrange(len(bin_ips[0])):
+        for i in range(len(bin_ips[0])):
             if any(ip[i] == "X" for ip in bin_ips) or (
                     any(ip[i] == "0" for ip in bin_ips) and
                     any(ip[i] == "1" for ip in bin_ips)):
@@ -154,7 +154,7 @@ class CIDR(Command):
     def _format_bin(family, binary):
         """Convert an IP's binary representation to presentation format."""
         return socket.inet_ntop(family, "".join(
-            chr(int(binary[i:i + 8], 2)) for i in xrange(0, len(binary), 8)))
+            chr(int(binary[i:i + 8], 2)) for i in range(0, len(binary), 8)))
 @staticmethod
 def _format_count(count):
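The `xrange` → `range` swap above is mechanical, but the 8-bit chunking idiom it touches can be exercised standalone. One caveat worth noting: on Python 3, `socket.inet_ntop` ultimately wants `bytes` rather than a `str` built from `chr()`, so this sketch packs the chunks into a `bytes` object:

```python
import socket

# Chunk a binary string into 8-bit groups, as the range-based loop above
# does, and pack each group into a byte:
binary = "11000000101010000000000000000001"  # 192.168.0.1 in binary
packed = bytes(int(binary[i:i + 8], 2) for i in range(0, len(binary), 8))
print(packed)  # b'\xc0\xa8\x00\x01'
print(socket.inet_ntop(socket.AF_INET, packed))  # 192.168.0.1
```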
@@ -20,16 +20,20 @@
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.
+import base64
 import hashlib
+import os
 from earwigbot import importer
 from earwigbot.commands import Command
-Blowfish = importer.new("Crypto.Cipher.Blowfish")
+fernet = importer.new("cryptography.fernet")
+hashes = importer.new("cryptography.hazmat.primitives.hashes")
+pbkdf2 = importer.new("cryptography.hazmat.primitives.kdf.pbkdf2")
 class Crypt(Command):
     """Provides hash functions with !hash (!hash list for supported algorithms)
-    and Blowfish encryption with !encrypt and !decrypt."""
+    and basic encryption with !encrypt and !decrypt."""
     name = "crypt"
     commands = ["crypt", "hash", "encrypt", "decrypt"]
@@ -47,12 +51,12 @@ class Crypt(Command):
         if data.command == "hash":
             algo = data.args[0]
             if algo == "list":
-                algos = ', '.join(hashlib.algorithms)
+                algos = ', '.join(hashlib.algorithms_available)
                 msg = algos.join(("Supported algorithms: ", "."))
                 self.reply(data, msg)
-            elif algo in hashlib.algorithms:
+            elif algo in hashlib.algorithms_available:
                 string = ' '.join(data.args[1:])
-                result = getattr(hashlib, algo)(string).hexdigest()
+                result = getattr(hashlib, algo)(string.encode()).hexdigest()
                 self.reply(data, result)
             else:
                 msg = "Unknown algorithm: '{0}'.".format(algo)
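Two Python 3 changes meet in the hunk above: `hashlib.algorithms` no longer exists (its replacement, `algorithms_available`, lists what the linked OpenSSL supports), and hash constructors require `bytes`, hence the added `.encode()`. A standalone check:

```python
import hashlib

# algorithms_available replaces the removed hashlib.algorithms attribute:
assert "sha256" in hashlib.algorithms_available

# Hash input must be bytes on Python 3, so the string is encoded first:
digest = getattr(hashlib, "sha256")("".encode()).hexdigest()
print(digest)
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```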
@@ -61,6 +65,7 @@ class Crypt(Command):
         else:
             key = data.args[0]
             text = " ".join(data.args[1:])
+            saltlen = 16
             if not text:
                 msg = "A key was provided, but text to {0} was not."
@@ -68,19 +73,31 @@ class Crypt(Command):
                 return
             try:
-                cipher = Blowfish.new(hashlib.sha256(key).digest())
-            except ImportError:
-                msg = "This command requires the 'pycrypto' package: https://www.dlitz.net/software/pycrypto/"
-                self.reply(data, msg)
-                return
-            try:
                 if data.command == "encrypt":
-                    if len(text) % 8:
-                        pad = 8 - len(text) % 8
-                        text = text.ljust(len(text) + pad, "\x00")
-                    self.reply(data, cipher.encrypt(text).encode("hex"))
+                    salt = os.urandom(saltlen)
+                    kdf = pbkdf2.PBKDF2HMAC(
+                        algorithm=hashes.SHA256(),
+                        length=32,
+                        salt=salt,
+                        iterations=100000,
+                    )
+                    f = fernet.Fernet(base64.urlsafe_b64encode(kdf.derive(key.encode())))
+                    ciphertext = f.encrypt(text.encode())
+                    self.reply(data, base64.b64encode(salt + ciphertext).decode())
                 else:
-                    self.reply(data, cipher.decrypt(text.decode("hex")))
-            except (ValueError, TypeError) as error:
-                self.reply(data, error.message)
+                    if len(text) < saltlen:
+                        raise ValueError("Ciphertext is too short")
+                    raw = base64.b64decode(text)
+                    salt, ciphertext = raw[:saltlen], raw[saltlen:]
+                    kdf = pbkdf2.PBKDF2HMAC(
+                        algorithm=hashes.SHA256(),
+                        length=32,
+                        salt=salt,
+                        iterations=100000,
+                    )
+                    f = fernet.Fernet(base64.urlsafe_b64encode(kdf.derive(key.encode())))
+                    self.reply(data, f.decrypt(ciphertext).decode())
+            except ImportError:
+                self.reply(data, "This command requires the 'cryptography' package: https://cryptography.io/")
+            except Exception as error:
+                self.reply(data, "{}: {}".format(type(error).__name__, str(error)))
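The new `!encrypt` path above derives a Fernet key from the user's passphrase via PBKDF2 with a random salt, then prepends the salt to the ciphertext so `!decrypt` can re-derive the same key. The key-derivation step can be sketched with only the standard library: `hashlib.pbkdf2_hmac` computes the same function as the `cryptography` package's `PBKDF2HMAC` with matching parameters (this is a stand-in sketch, not the bot's code):

```python
import base64
import hashlib
import os

password = b"hunter2"  # illustrative passphrase
salt = os.urandom(16)  # stored alongside the ciphertext, as in the diff

# Same KDF parameters as the new !encrypt code: SHA-256, 32-byte key,
# 100000 iterations:
key = hashlib.pbkdf2_hmac("sha256", password, salt, 100000, dklen=32)
fernet_key = base64.urlsafe_b64encode(key)  # the form Fernet() consumes

# Re-deriving with the same salt reproduces the key; a fresh salt does not:
assert hashlib.pbkdf2_hmac("sha256", password, salt, 100000, dklen=32) == key
assert hashlib.pbkdf2_hmac("sha256", password, os.urandom(16), 100000, dklen=32) != key
```

Storing the salt next to the ciphertext is standard practice: the salt is not secret, it only ensures that identical passphrases do not yield identical keys.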
@@ -63,23 +63,23 @@ class Dictionary(Command):
         level, languages = self.get_languages(entry)
         if not languages:
-            return u"Couldn't parse {0}!".format(page.url)
+            return "Couldn't parse {0}!".format(page.url)
         if "#" in term: # Requesting a specific language
             lcase_langs = {lang.lower(): lang for lang in languages}
             request = term.rsplit("#", 1)[1]
             lang = lcase_langs.get(request.lower())
             if not lang:
-                resp = u"Language {0} not found in definition."
+                resp = "Language {0} not found in definition."
                 return resp.format(request)
             definition = self.get_definition(languages[lang], level)
-            return u"({0}) {1}".format(lang, definition)
+            return "({0}) {1}".format(lang, definition)
         result = []
         for lang, section in sorted(languages.items()):
             definition = self.get_definition(section, level)
-            result.append(u"({0}) {1}".format(lang, definition))
-        return u"; ".join(result)
+            result.append("({0}) {1}".format(lang, definition))
+        return "; ".join(result)
     def get_languages(self, entry, level=2):
         regex = r"(?:\A|\n)==\s*([a-zA-Z0-9_ ]*?)\s*==(?:\Z|\n)"
@@ -93,7 +93,7 @@ class Dictionary(Command):
             split.pop(0)
         languages = {}
-        for i in xrange(0, len(split), 2):
+        for i in range(0, len(split), 2):
             languages[split[i]] = split[i + 1]
         return level, languages
@@ -118,40 +118,40 @@ class Dictionary(Command):
         }
         blocks = "=" * (level + 1)
         defs = []
-        for part, basename in parts_of_speech.iteritems():
-            fullnames = [basename, "\{\{" + basename + "\}\}",
-                         "\{\{" + basename.lower() + "\}\}"]
+        for part, basename in parts_of_speech.items():
+            fullnames = [basename, r"\{\{" + basename + r"\}\}",
+                         r"\{\{" + basename.lower() + r"\}\}"]
             for fullname in fullnames:
-                regex = blocks + "\s*" + fullname + "\s*" + blocks
+                regex = blocks + r"\s*" + fullname + r"\s*" + blocks
                 if re.search(regex, section):
-                    regex = blocks + "\s*" + fullname
-                    regex += "\s*{0}(.*?)(?:(?:{0})|\Z)".format(blocks)
+                    regex = blocks + r"\s*" + fullname
+                    regex += r"\s*{0}(.*?)(?:(?:{0})|\Z)".format(blocks)
                     bodies = re.findall(regex, section, re.DOTALL)
                     if bodies:
                         for body in bodies:
                             definition = self.parse_body(body)
                             if definition:
-                                msg = u"\x02{0}\x0F {1}"
+                                msg = "\x02{0}\x0F {1}"
                                 defs.append(msg.format(part, definition))
         return "; ".join(defs)
     def parse_body(self, body):
         substitutions = [
-            ("<!--(.*?)-->", ""),
-            ("<ref>(.*?)</ref>", ""),
-            ("\[\[[^\]|]*?\|([^\]|]*?)\]\]", r"\1"),
-            ("\{\{unsupported\|(.*?)\}\}", r"\1"),
-            ("\{\{(.*?) of\|([^}|]*?)(\|(.*?))?\}\}", r"\1 of \2."),
-            ("\{\{w\|(.*?)\}\}", r"\1"),
-            ("\{\{surname(.*?)\}\}", r"A surname."),
-            ("\{\{given name\|([^}|]*?)(\|(.*?))?\}\}", r"A \1 given name."),
+            (r"<!--(.*?)-->", ""),
+            (r"<ref>(.*?)</ref>", ""),
+            (r"\[\[[^\]|]*?\|([^\]|]*?)\]\]", r"\1"),
+            (r"\{\{unsupported\|(.*?)\}\}", r"\1"),
+            (r"\{\{(.*?) of\|([^}|]*?)(\|(.*?))?\}\}", r"\1 of \2."),
+            (r"\{\{w\|(.*?)\}\}", r"\1"),
+            (r"\{\{surname(.*?)\}\}", r"A surname."),
+            (r"\{\{given name\|([^}|]*?)(\|(.*?))?\}\}", r"A \1 given name."),
         ]
         senses = []
         for line in body.splitlines():
             line = line.strip()
-            if re.match("#\s*[^:*#]", line):
+            if re.match(r"#\s*[^:*#]", line):
                 for regex, repl in substitutions:
                     line = re.sub(regex, repl, line)
                 line = self.strip_templates(line)
@@ -167,7 +167,7 @@ class Dictionary(Command):
         result = [] # Number the senses incrementally
         for i, sense in enumerate(senses):
-            result.append(u"{0}. {1}".format(i + 1, sense))
+            result.append("{0}. {1}".format(i + 1, sense))
         return " ".join(result)
     def strip_templates(self, line):
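The many `"\{\{..."` → `r"\{\{..."` changes above add raw-string prefixes: on Python 3.6+ an unrecognized escape sequence like `\s` in a plain string literal raises a DeprecationWarning (slated to become an error), even though it currently produces the same two characters as the raw form. The fixed pattern can be checked standalone:

```python
import re

# The raw-string version of the definition-line check from the hunk above:
# a '#' followed by optional whitespace and something that is not ':', '*',
# or '#' marks a definition line in Wiktionary markup.
definition_line = re.compile(r"#\s*[^:*#]")

print(bool(definition_line.match("# A sense of the word.")))   # True
print(bool(definition_line.match("#: An example sentence.")))  # False
```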
@@ -20,7 +20,7 @@
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.
-from urllib import quote_plus
+from urllib.parse import quote_plus
 from earwigbot import exceptions
 from earwigbot.commands import Command
@@ -47,7 +47,7 @@ class Editcount(Command):
             return
         safe = quote_plus(user.name.encode("utf8"))
-        url = "http://tools.wmflabs.org/xtools-ec/index.php?user={0}&lang={1}&wiki={2}"
-        fullurl = url.format(safe, site.lang, site.project)
+        url = "https://xtools.wmflabs.org/ec/{}/{}"
+        fullurl = url.format(site.domain, safe)
         msg = "\x0302{0}\x0F has {1} edits ({2})."
         self.reply(data, msg.format(name, count, fullurl))
@@ -64,7 +64,7 @@ class Help(Command):
             if command.name == target or target in command.commands:
                 if command.__doc__:
                     doc = command.__doc__.replace("\n", "")
-                    doc = re.sub("\s\s+", " ", doc)
+                    doc = re.sub(r"\s\s+", " ", doc)
                     msg = 'Help for command \x0303{0}\x0F: "{1}"'
                     self.reply(data, msg.format(target, doc))
                     return
@@ -39,7 +39,7 @@ class Langcode(Command):
         del matrix["count"]
         del matrix["specials"]
-        for site in matrix.itervalues():
+        for site in matrix.values():
             if not site["name"]:
                 continue
             name = site["name"].encode("utf8")
@@ -32,21 +32,21 @@ class Link(Command): | |||||
self.last = {} | self.last = {} | ||||
def check(self, data): | def check(self, data): | ||||
if re.search("(\[\[(.*?)\]\])|(\{\{(.*?)\}\})", data.msg): | |||||
if re.search(r"(\[\[(.*?)\]\])|(\{\{(.*?)\}\})", data.msg): | |||||
self.last[data.chan] = data.msg # Store most recent link | self.last[data.chan] = data.msg # Store most recent link | ||||
return data.is_command and data.command == self.name | return data.is_command and data.command == self.name | ||||
def process(self, data): | def process(self, data): | ||||
self.site = self.bot.wiki.get_site() | self.site = self.bot.wiki.get_site() | ||||
if re.search("(\[\[(.*?)\]\])|(\{\{(.*?)\}\})", data.msg): | |||||
links = u" , ".join(self.parse_line(data.msg)) | |||||
if re.search(r"(\[\[(.*?)\]\])|(\{\{(.*?)\}\})", data.msg): | |||||
links = " , ".join(self.parse_line(data.msg)) | |||||
self.reply(data, links.encode("utf8")) | |||||
self.reply(data, links) | |||||
elif data.command == "link": | elif data.command == "link": | ||||
if not data.args: | if not data.args: | ||||
if data.chan in self.last: | if data.chan in self.last: | ||||
links = u" , ".join(self.parse_line(self.last[data.chan])) | |||||
links = " , ".join(self.parse_line(self.last[data.chan])) | |||||
self.reply(data, links.encode("utf8")) | |||||
self.reply(data, links) | |||||
else: | else: | ||||
self.reply(data, "What do you want me to link to?") | self.reply(data, "What do you want me to link to?") | ||||
@@ -60,17 +60,17 @@ class Link(Command): | |||||
results = [] | results = [] | ||||
# Destroy {{{template parameters}}}: | # Destroy {{{template parameters}}}: | ||||
line = re.sub("\{\{\{(.*?)\}\}\}", "", line) | |||||
line = re.sub(r"\{\{\{(.*?)\}\}\}", "", line) | |||||
# Find all [[links]]: | # Find all [[links]]: | ||||
links = re.findall("(\[\[(.*?)(\||\]\]))", line) | |||||
links = re.findall(r"(\[\[(.*?)(\||\]\]))", line) | |||||
if links: | if links: | ||||
# re.findall() returns a list of tuples, but we only want the 2nd | # re.findall() returns a list of tuples, but we only want the 2nd | ||||
# item in each tuple: | # item in each tuple: | ||||
results = [self.site.get_page(name[1]).url for name in links] | results = [self.site.get_page(name[1]).url for name in links] | ||||
# Find all {{templates}} | # Find all {{templates}} | ||||
templates = re.findall("(\{\{(.*?)(\||\}\}))", line) | |||||
templates = re.findall(r"(\{\{(.*?)(\||\}\}))", line) | |||||
if templates: | if templates: | ||||
p_tmpl = lambda name: self.site.get_page("Template:" + name).url | p_tmpl = lambda name: self.site.get_page("Template:" + name).url | ||||
templates = [p_tmpl(i[1]) for i in templates] | templates = [p_tmpl(i[1]) for i in templates] | ||||
@@ -96,7 +96,7 @@ class Notes(Command): | |||||
except IndexError: | except IndexError: | ||||
msg = ("\x0302The Earwig Mini-Wiki\x0F: running v{0}. Subcommands " | msg = ("\x0302The Earwig Mini-Wiki\x0F: running v{0}. Subcommands " | ||||
"are: {1}. You can get help on any with '!{2} help subcommand'.") | "are: {1}. You can get help on any with '!{2} help subcommand'.") | ||||
cmnds = ", ".join((info.keys())) | |||||
cmnds = ", ".join(info.keys()) | |||||
self.reply(data, msg.format(self.version, cmnds, data.command)) | self.reply(data, msg.format(self.version, cmnds, data.command)) | ||||
return | return | ||||
if command in self.aliases: | if command in self.aliases: | ||||
@@ -80,7 +80,7 @@ class Remind(Command): | |||||
def _evaluate(node): | def _evaluate(node): | ||||
"""Convert an AST node into a real number or raise an exception.""" | """Convert an AST node into a real number or raise an exception.""" | ||||
if isinstance(node, ast.Num): | if isinstance(node, ast.Num): | ||||
if not isinstance(node.n, (int, long, float)): | |||||
if not isinstance(node.n, (int, float)): | |||||
raise ValueError(node.n) | raise ValueError(node.n) | ||||
return node.n | return node.n | ||||
elif isinstance(node, ast.BinOp): | elif isinstance(node, ast.BinOp): | ||||
@@ -89,7 +89,7 @@ class Remind(Command): | |||||
else: | else: | ||||
raise ValueError(node) | raise ValueError(node) | ||||
for unit, factor in time_units.iteritems(): | |||||
for unit, factor in time_units.items(): | |||||
arg = arg.replace(unit, "*" + str(factor)) | arg = arg.replace(unit, "*" + str(factor)) | ||||
try: | try: | ||||
@@ -112,7 +112,7 @@ class Remind(Command): | |||||
def _get_new_id(self): | def _get_new_id(self): | ||||
"""Get a free ID for a new reminder.""" | """Get a free ID for a new reminder.""" | ||||
taken = set(robj.id for robj in chain(*self.reminders.values())) | |||||
taken = set(robj.id for robj in chain(*self.reminders.values())) | |||||
num = random.choice(list(set(range(4096)) - taken)) | num = random.choice(list(set(range(4096)) - taken)) | ||||
return "R{0:03X}".format(num) | return "R{0:03X}".format(num) | ||||
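The `_get_new_id` change above preserves a simple pattern worth noting: choosing a random free slot out of a bounded pool and rendering it in fixed-width hex. A standalone sketch (the pool size and `R{0:03X}` format mirror the diff; the function name and everything else are illustrative):

```python
import random

def get_new_id(taken):
    """Pick a free reminder ID from a fixed pool of 4096 slots.

    `taken` holds the integer IDs already in use; the result is "R" plus
    three uppercase hex digits, matching the format used in the diff.
    """
    free = set(range(4096)) - taken
    num = random.choice(list(free))  # random.choice needs a sequence, not a set
    return "R{0:03X}".format(num)
```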
@@ -232,7 +232,7 @@ class Remind(Command): | |||||
fmt = lambda robj, user: '\x0303{0}\x0F (for {1} {2}, {3})'.format( | fmt = lambda robj, user: '\x0303{0}\x0F (for {1} {2}, {3})'.format( | ||||
robj.id, user, dest(robj.data), robj.end_time) | robj.id, user, dest(robj.data), robj.end_time) | ||||
rlist = (fmt(rem, user) for user, rems in self.reminders.iteritems() | |||||
rlist = (fmt(rem, user) for user, rems in self.reminders.items() | |||||
for rem in rems) | for rem in rems) | ||||
self.reply(data, "All reminders: {0}.".format(", ".join(rlist))) | self.reply(data, "All reminders: {0}.".format(", ".join(rlist))) | ||||
@@ -363,7 +363,7 @@ class Remind(Command): | |||||
permdb.set_attr("command:remind", "data", str(database)) | permdb.set_attr("command:remind", "data", str(database)) | ||||
class _ReminderThread(object): | |||||
class _ReminderThread: | |||||
"""A single thread that handles reminders.""" | """A single thread that handles reminders.""" | ||||
def __init__(self, lock): | def __init__(self, lock): | ||||
@@ -429,7 +429,7 @@ class _ReminderThread(object): | |||||
self._thread = None | self._thread = None | ||||
class _Reminder(object): | |||||
class _Reminder: | |||||
"""Represents a single reminder.""" | """Represents a single reminder.""" | ||||
def __init__(self, rid, user, wait, message, data, cmdobj, end=None): | def __init__(self, rid, user, wait, message, data, cmdobj, end=None): | ||||
self.id = rid | self.id = rid | ||||
@@ -146,7 +146,7 @@ class Stalk(Command): | |||||
return target.startswith("re:") and re.match(target[3:], tag) | return target.startswith("re:") and re.match(target[3:], tag) | ||||
def _process(table, tag, flags): | def _process(table, tag, flags): | ||||
for target, stalks in table.iteritems(): | |||||
for target, stalks in table.items(): | |||||
if target == tag or _regex_match(target, tag): | if target == tag or _regex_match(target, tag): | ||||
_update_chans(stalks, flags) | _update_chans(stalks, flags) | ||||
@@ -161,7 +161,7 @@ class Stalk(Command): | |||||
with self.bot.component_lock: | with self.bot.component_lock: | ||||
frontend = self.bot.frontend | frontend = self.bot.frontend | ||||
if frontend and not frontend.is_stopped(): | if frontend and not frontend.is_stopped(): | ||||
for chan, users in chans.iteritems(): | |||||
for chan, users in chans.items(): | |||||
if chan.startswith("#") and chan not in frontend.channels: | if chan.startswith("#") and chan not in frontend.channels: | ||||
continue | continue | ||||
pretty = rc.prettify(color=chan not in nocolor) | pretty = rc.prettify(color=chan not in nocolor) | ||||
@@ -178,7 +178,7 @@ class Stalk(Command): | |||||
def _get_stalks_by_nick(nick, table): | def _get_stalks_by_nick(nick, table): | ||||
"""Return a dictionary of stalklist entries by the given nick.""" | """Return a dictionary of stalklist entries by the given nick.""" | ||||
entries = {} | entries = {} | ||||
for target, stalks in table.iteritems(): | |||||
for target, stalks in table.items(): | |||||
for info in stalks: | for info in stalks: | ||||
if info[0] == nick: | if info[0] == nick: | ||||
if target in entries: | if target in entries: | ||||
@@ -293,7 +293,7 @@ class Stalk(Command): | |||||
def _format_stalks(stalks): | def _format_stalks(stalks): | ||||
return ", ".join( | return ", ".join( | ||||
"\x0302{0}\x0F ({1})".format(target, _format_chans(chans)) | "\x0302{0}\x0F ({1})".format(target, _format_chans(chans)) | ||||
for target, chans in stalks.iteritems()) | |||||
for target, chans in stalks.items()) | |||||
users = self._get_stalks_by_nick(nick, self._users) | users = self._get_stalks_by_nick(nick, self._users) | ||||
pages = self._get_stalks_by_nick(nick, self._pages) | pages = self._get_stalks_by_nick(nick, self._pages) | ||||
@@ -325,7 +325,7 @@ class Stalk(Command): | |||||
def _format_stalks(stalks): | def _format_stalks(stalks): | ||||
return ", ".join( | return ", ".join( | ||||
"\x0302{0}\x0F ({1})".format(target, _format_data(data)) | "\x0302{0}\x0F ({1})".format(target, _format_data(data)) | ||||
for target, data in stalks.iteritems()) | |||||
for target, data in stalks.items()) | |||||
users, pages = self._users, self._pages | users, pages = self._users, self._pages | ||||
if users: | if users: | ||||
@@ -80,7 +80,7 @@ class Threads(Command): | |||||
t = "\x0302copyvio worker\x0F (site {0})" | t = "\x0302copyvio worker\x0F (site {0})" | ||||
daemon_threads.append(t.format(tname[len("cvworker-"):])) | daemon_threads.append(t.format(tname[len("cvworker-"):])) | ||||
else: | else: | ||||
match = re.findall("^(.*?) \((.*?)\)$", tname) | |||||
match = re.findall(r"^(.*?) \((.*?)\)$", tname) | |||||
if match: | if match: | ||||
t = "\x0302{0}\x0F (id {1}, since {2})" | t = "\x0302{0}\x0F (id {1}, since {2})" | ||||
thread_info = t.format(match[0][0], ident, match[0][1]) | thread_info = t.format(match[0][0], ident, match[0][1]) | ||||
@@ -60,7 +60,7 @@ class Time(Command): | |||||
try: | try: | ||||
tzinfo = pytz.timezone(timezone) | tzinfo = pytz.timezone(timezone) | ||||
except ImportError: | except ImportError: | ||||
msg = "This command requires the 'pytz' package: http://pytz.sourceforge.net/" | |||||
msg = "This command requires the 'pytz' package: https://pypi.org/project/pytz/" | |||||
self.reply(data, msg) | self.reply(data, msg) | ||||
return | return | ||||
except pytz.exceptions.UnknownTimeZoneError: | except pytz.exceptions.UnknownTimeZoneError: | ||||
@@ -35,7 +35,7 @@ class Watchers(Command): | |||||
site = self.bot.wiki.get_site() | site = self.bot.wiki.get_site() | ||||
query = site.api_query(action="query", prop="info", inprop="watchers", | query = site.api_query(action="query", prop="info", inprop="watchers", | ||||
titles=" ".join(data.args)) | titles=" ".join(data.args)) | ||||
page = query["query"]["pages"].values()[0] | |||||
page = list(query["query"]["pages"].values())[0] | |||||
title = page["title"].encode("utf8") | |||||
title = page["title"] | |||||
if "invalid" in page: | if "invalid" in page: | ||||
@@ -20,9 +20,9 @@ | |||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||||
# SOFTWARE. | # SOFTWARE. | ||||
import base64 | |||||
from collections import OrderedDict | from collections import OrderedDict | ||||
from getpass import getpass | from getpass import getpass | ||||
from hashlib import sha256 | |||||
import logging | import logging | ||||
import logging.handlers | import logging.handlers | ||||
from os import mkdir, path | from os import mkdir, path | ||||
@@ -38,12 +38,13 @@ from earwigbot.config.permissions import PermissionsDB | |||||
from earwigbot.config.script import ConfigScript | from earwigbot.config.script import ConfigScript | ||||
from earwigbot.exceptions import NoConfigError | from earwigbot.exceptions import NoConfigError | ||||
Blowfish = importer.new("Crypto.Cipher.Blowfish") | |||||
bcrypt = importer.new("bcrypt") | |||||
fernet = importer.new("cryptography.fernet") | |||||
hashes = importer.new("cryptography.hazmat.primitives.hashes") | |||||
pbkdf2 = importer.new("cryptography.hazmat.primitives.kdf.pbkdf2") | |||||
__all__ = ["BotConfig"] | __all__ = ["BotConfig"] | ||||
class BotConfig(object): | |||||
class BotConfig: | |||||
""" | """ | ||||
**EarwigBot: YAML Config File Manager** | **EarwigBot: YAML Config File Manager** | ||||
@@ -109,9 +110,9 @@ class BotConfig(object): | |||||
return "<BotConfig at {0}>".format(self.root_dir) | return "<BotConfig at {0}>".format(self.root_dir) | ||||
def _handle_missing_config(self): | def _handle_missing_config(self): | ||||
print "Config file missing or empty:", self._config_path | |||||
print("Config file missing or empty:", self._config_path) | |||||
msg = "Would you like to create a config file now? [Y/n] " | msg = "Would you like to create a config file now? [Y/n] " | ||||
choice = raw_input(msg) | |||||
choice = input(msg) | |||||
if choice.lower().startswith("n"): | if choice.lower().startswith("n"): | ||||
raise NoConfigError() | raise NoConfigError() | ||||
else: | else: | ||||
@@ -127,7 +128,7 @@ class BotConfig(object): | |||||
try: | try: | ||||
self._data = yaml.load(fp, OrderedLoader) | self._data = yaml.load(fp, OrderedLoader) | ||||
except yaml.YAMLError: | except yaml.YAMLError: | ||||
print "Error parsing config file {0}:".format(filename) | |||||
print("Error parsing config file {0}:".format(filename)) | |||||
raise | raise | ||||
def _setup_logging(self): | def _setup_logging(self): | ||||
@@ -148,7 +149,7 @@ class BotConfig(object): | |||||
mkdir(log_dir, stat.S_IWUSR|stat.S_IRUSR|stat.S_IXUSR) | mkdir(log_dir, stat.S_IWUSR|stat.S_IRUSR|stat.S_IXUSR) | ||||
else: | else: | ||||
msg = "log_dir ({0}) exists but is not a directory!" | msg = "log_dir ({0}) exists but is not a directory!" | ||||
print msg.format(log_dir) | |||||
print(msg.format(log_dir)) | |||||
return | return | ||||
main_handler = hand(logfile("bot.log"), "midnight", 1, 7) | main_handler = hand(logfile("bot.log"), "midnight", 1, 7) | ||||
@@ -173,7 +174,7 @@ class BotConfig(object): | |||||
try: | try: | ||||
node._decrypt(self._decryption_cipher, nodes[:-1], nodes[-1]) | node._decrypt(self._decryption_cipher, nodes[:-1], nodes[-1]) | ||||
except ValueError: | |||||
except (ValueError, fernet.InvalidToken): | |||||
print "Error decrypting passwords:" | |||||
print("Error decrypting passwords:") | |||||
raise | raise | ||||
@property | @property | ||||
@@ -257,7 +258,7 @@ class BotConfig(object): | |||||
exit. | exit. | ||||
Data from the config file is stored in six | Data from the config file is stored in six | ||||
:py:class:`~earwigbot.config.ConfigNode`\ s (:py:attr:`components`, | |||||
:py:class:`~earwigbot.config.ConfigNode`\\ s (:py:attr:`components`, | |||||
:py:attr:`wiki`, :py:attr:`irc`, :py:attr:`commands`, :py:attr:`tasks`, | :py:attr:`wiki`, :py:attr:`irc`, :py:attr:`commands`, :py:attr:`tasks`, | ||||
:py:attr:`metadata`) for easy access (as well as the lower-level | :py:attr:`metadata`) for easy access (as well as the lower-level | ||||
:py:attr:`data` attribute). If passwords are encrypted, we'll use | :py:attr:`data` attribute). If passwords are encrypted, we'll use | ||||
@@ -283,18 +284,19 @@ class BotConfig(object): | |||||
if self.is_encrypted(): | if self.is_encrypted(): | ||||
if not self._decryption_cipher: | if not self._decryption_cipher: | ||||
try: | try: | ||||
blowfish_new = Blowfish.new | |||||
hashpw = bcrypt.hashpw | |||||
salt = base64.b64decode(self.metadata["salt"]) | |||||
kdf = pbkdf2.PBKDF2HMAC( | |||||
algorithm=hashes.SHA256(), | |||||
length=32, | |||||
salt=salt, | |||||
iterations=ConfigScript.PBKDF_ROUNDS, | |||||
) | |||||
except ImportError: | except ImportError: | ||||
url1 = "http://www.mindrot.org/projects/py-bcrypt" | |||||
url2 = "https://www.dlitz.net/software/pycrypto/" | |||||
e = "Encryption requires the 'py-bcrypt' and 'pycrypto' packages: {0}, {1}" | |||||
raise NoConfigError(e.format(url1, url2)) | |||||
e = "Encryption requires the 'cryptography' package: https://cryptography.io/" | |||||
raise NoConfigError(e) | |||||
key = getpass("Enter key to decrypt bot passwords: ") | key = getpass("Enter key to decrypt bot passwords: ") | ||||
self._decryption_cipher = blowfish_new(sha256(key).digest()) | |||||
signature = self.metadata["signature"] | |||||
if hashpw(key, signature) != signature: | |||||
raise RuntimeError("Incorrect password.") | |||||
self._decryption_cipher = fernet.Fernet( | |||||
base64.urlsafe_b64encode(kdf.derive(key.encode()))) | |||||
for node, nodes in self._decryptable_nodes: | for node, nodes in self._decryptable_nodes: | ||||
self._decrypt(node, nodes) | self._decrypt(node, nodes) | ||||
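For reference, the same key derivation can be reproduced with the standard library alone: `hashlib.pbkdf2_hmac` computes the identical bytes as `cryptography`'s `PBKDF2HMAC` for these parameters, and Fernet accepts the URL-safe base64 of the 32-byte result as its key. A minimal sketch, assuming the salt is stored base64-encoded the way `ConfigScript` writes it (the function name is illustrative):

```python
import base64
import hashlib

PBKDF_ROUNDS = 100000  # matches ConfigScript.PBKDF_ROUNDS in this diff

def derive_fernet_key(passphrase, salt_b64):
    """Recompute the Fernet key from a passphrase and the salt stored
    (base64-encoded) in the config metadata."""
    salt = base64.b64decode(salt_b64)  # metadata stores the salt as base64 text
    raw = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode(), salt, PBKDF_ROUNDS, dklen=32)
    # Fernet keys are the 32 derived bytes in URL-safe base64 (44 chars):
    return base64.urlsafe_b64encode(raw)
```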
@@ -26,7 +26,7 @@ __all__ = ["BotFormatter"] | |||||
class BotFormatter(logging.Formatter): | class BotFormatter(logging.Formatter): | ||||
def __init__(self, color=False): | def __init__(self, color=False): | ||||
self._format = super(BotFormatter, self).format | |||||
self._format = super().format | |||||
if color: | if color: | ||||
fmt = "[%(asctime)s %(lvl)s] %(name)s: %(message)s" | fmt = "[%(asctime)s %(lvl)s] %(name)s: %(message)s" | ||||
self.format = lambda rec: self._format(self.format_color(rec)) | self.format = lambda rec: self._format(self.format_color(rec)) | ||||
@@ -34,7 +34,7 @@ class BotFormatter(logging.Formatter): | |||||
fmt = "[%(asctime)s %(levelname)-8s] %(name)s: %(message)s" | fmt = "[%(asctime)s %(levelname)-8s] %(name)s: %(message)s" | ||||
self.format = self._format | self.format = self._format | ||||
datefmt = "%Y-%m-%d %H:%M:%S" | datefmt = "%Y-%m-%d %H:%M:%S" | ||||
super(BotFormatter, self).__init__(fmt=fmt, datefmt=datefmt) | |||||
super().__init__(fmt=fmt, datefmt=datefmt) | |||||
def format_color(self, record): | def format_color(self, record): | ||||
l = record.levelname.ljust(8) | l = record.levelname.ljust(8) | ||||
@@ -20,18 +20,19 @@ | |||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||||
# SOFTWARE. | # SOFTWARE. | ||||
import base64 | |||||
from collections import OrderedDict | from collections import OrderedDict | ||||
__all__ = ["ConfigNode"] | __all__ = ["ConfigNode"] | ||||
class ConfigNode(object): | |||||
class ConfigNode: | |||||
def __init__(self): | def __init__(self): | ||||
self._data = OrderedDict() | self._data = OrderedDict() | ||||
def __repr__(self): | def __repr__(self): | ||||
return self._data | |||||
return repr(self._data) | |||||
def __nonzero__(self): | |||||
def __bool__(self): | |||||
return bool(self._data) | return bool(self._data) | ||||
def __len__(self): | def __len__(self): | ||||
@@ -45,12 +46,12 @@ class ConfigNode(object): | |||||
def __getattr__(self, key): | def __getattr__(self, key): | ||||
if key == "_data": | if key == "_data": | ||||
return super(ConfigNode, self).__getattr__(key) | |||||
return super().__getattribute__(key) | |||||
return self._data[key] | return self._data[key] | ||||
def __setattr__(self, key, item): | def __setattr__(self, key, item): | ||||
if key == "_data": | if key == "_data": | ||||
super(ConfigNode, self).__setattr__(key, item) | |||||
super().__setattr__(key, item) | |||||
else: | else: | ||||
self._data[key] = item | self._data[key] = item | ||||
@@ -63,7 +64,7 @@ class ConfigNode(object): | |||||
def _dump(self): | def _dump(self): | ||||
data = self._data.copy() | data = self._data.copy() | ||||
for key, val in data.iteritems(): | |||||
for key, val in data.items(): | |||||
if isinstance(val, ConfigNode): | if isinstance(val, ConfigNode): | ||||
data[key] = val._dump() | data[key] = val._dump() | ||||
return data | return data | ||||
@@ -79,26 +80,26 @@ class ConfigNode(object): | |||||
except KeyError: | except KeyError: | ||||
return | return | ||||
if item in base: | if item in base: | ||||
ciphertext = base[item].decode("hex") | |||||
base[item] = cipher.decrypt(ciphertext).rstrip("\x00") | |||||
ciphertext = base64.b64decode(base[item]) | |||||
base[item] = cipher.decrypt(ciphertext).decode() | |||||
def get(self, *args, **kwargs): | def get(self, *args, **kwargs): | ||||
return self._data.get(*args, **kwargs) | return self._data.get(*args, **kwargs) | ||||
def keys(self): | def keys(self): | ||||
return self._data.keys() | |||||
return list(self._data.keys()) | |||||
def values(self): | def values(self): | ||||
return self._data.values() | |||||
return list(self._data.values()) | |||||
def items(self): | def items(self): | ||||
return self._data.items() | |||||
return list(self._data.items()) | |||||
def iterkeys(self): | def iterkeys(self): | ||||
return self._data.iterkeys() | |||||
return iter(self._data.keys()) | |||||
def itervalues(self): | def itervalues(self): | ||||
return self._data.itervalues() | |||||
return iter(self._data.values()) | |||||
def iteritems(self): | def iteritems(self): | ||||
return self._data.iteritems() | |||||
return iter(self._data.items()) |
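The `list(...)` wrappers and `iter*` shims kept in `ConfigNode` exist because Python 3's dict methods return live views rather than lists. A short illustration of the difference being papered over (the data is purely illustrative):

```python
from collections import OrderedDict

data = OrderedDict([("irc", 1), ("wiki", 2)])

# In Python 3, .items() returns a live view that tracks later mutations:
view = data.items()
data["tasks"] = 3
assert ("tasks", 3) in view

# Wrapping in list(...) freezes a snapshot, restoring the Python 2
# list semantics that ConfigNode's callers may still rely on:
snapshot = list(data.keys())
data["meta"] = 4
assert "meta" not in snapshot
assert "meta" in data.keys()
```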
@@ -24,7 +24,7 @@ | |||||
Based on: | Based on: | ||||
* https://gist.github.com/844388 | * https://gist.github.com/844388 | ||||
* http://pyyaml.org/attachment/ticket/161/use_ordered_dict.py | |||||
* https://pyyaml.org/attachment/ticket/161/use_ordered_dict.py | |||||
with modifications. | with modifications. | ||||
""" | """ | ||||
@@ -39,10 +39,10 @@ class OrderedLoader(yaml.Loader): | |||||
"""A YAML loader that loads mappings into ordered dictionaries.""" | """A YAML loader that loads mappings into ordered dictionaries.""" | ||||
def __init__(self, *args, **kwargs): | def __init__(self, *args, **kwargs): | ||||
super(OrderedLoader, self).__init__(*args, **kwargs) | |||||
super().__init__(*args, **kwargs) | |||||
constructor = type(self).construct_yaml_map | constructor = type(self).construct_yaml_map | ||||
self.add_constructor(u"tag:yaml.org,2002:map", constructor) | |||||
self.add_constructor(u"tag:yaml.org,2002:omap", constructor) | |||||
self.add_constructor("tag:yaml.org,2002:map", constructor) | |||||
self.add_constructor("tag:yaml.org,2002:omap", constructor) | |||||
def construct_yaml_map(self, node): | def construct_yaml_map(self, node): | ||||
data = OrderedDict() | data = OrderedDict() | ||||
@@ -63,7 +63,7 @@ class OrderedLoader(yaml.Loader): | |||||
key = self.construct_object(key_node, deep=deep) | key = self.construct_object(key_node, deep=deep) | ||||
try: | try: | ||||
hash(key) | hash(key) | ||||
except TypeError, exc: | |||||
except TypeError as exc: | |||||
raise yaml.constructor.ConstructorError( | raise yaml.constructor.ConstructorError( | ||||
"while constructing a mapping", node.start_mark, | "while constructing a mapping", node.start_mark, | ||||
"found unacceptable key ({0})".format(exc), | "found unacceptable key ({0})".format(exc), | ||||
@@ -77,7 +77,7 @@ class OrderedDumper(yaml.SafeDumper): | |||||
"""A YAML dumper that dumps ordered dictionaries into mappings.""" | """A YAML dumper that dumps ordered dictionaries into mappings.""" | ||||
def __init__(self, *args, **kwargs): | def __init__(self, *args, **kwargs): | ||||
super(OrderedDumper, self).__init__(*args, **kwargs) | |||||
super().__init__(*args, **kwargs) | |||||
self.add_representer(OrderedDict, type(self).represent_dict) | self.add_representer(OrderedDict, type(self).represent_dict) | ||||
def represent_mapping(self, tag, mapping, flow_style=None): | def represent_mapping(self, tag, mapping, flow_style=None): | ||||
@@ -26,7 +26,7 @@ from threading import Lock | |||||
__all__ = ["PermissionsDB"] | __all__ = ["PermissionsDB"] | ||||
class PermissionsDB(object): | |||||
class PermissionsDB: | |||||
""" | """ | ||||
**EarwigBot: Permissions Database Manager** | **EarwigBot: Permissions Database Manager** | ||||
@@ -198,7 +198,7 @@ class PermissionsDB(object): | |||||
with self._db_access_lock, sqlite.connect(self._dbfile) as conn: | with self._db_access_lock, sqlite.connect(self._dbfile) as conn: | ||||
conn.execute(query, (user, key)) | conn.execute(query, (user, key)) | ||||
class User(object): | |||||
class User: | |||||
"""A class that represents an IRC user for the purpose of testing rules.""" | """A class that represents an IRC user for the purpose of testing rules.""" | ||||
def __init__(self, nick, ident, host): | def __init__(self, nick, ident, host): | ||||
self.nick = nick | self.nick = nick | ||||
@@ -20,9 +20,10 @@ | |||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||||
# SOFTWARE. | # SOFTWARE. | ||||
import base64 | |||||
from collections import OrderedDict | from collections import OrderedDict | ||||
from getpass import getpass | from getpass import getpass | ||||
from hashlib import sha256 | |||||
import os | |||||
from os import chmod, makedirs, mkdir, path | from os import chmod, makedirs, mkdir, path | ||||
import re | import re | ||||
import stat | import stat | ||||
@@ -34,8 +35,9 @@ import yaml | |||||
from earwigbot import exceptions, importer | from earwigbot import exceptions, importer | ||||
from earwigbot.config.ordered_yaml import OrderedDumper | from earwigbot.config.ordered_yaml import OrderedDumper | ||||
Blowfish = importer.new("Crypto.Cipher.Blowfish") | |||||
bcrypt = importer.new("bcrypt") | |||||
fernet = importer.new("cryptography.fernet") | |||||
hashes = importer.new("cryptography.hazmat.primitives.hashes") | |||||
pbkdf2 = importer.new("cryptography.hazmat.primitives.kdf.pbkdf2") | |||||
__all__ = ["ConfigScript"] | __all__ = ["ConfigScript"] | ||||
@@ -48,11 +50,11 @@ def process(bot, rc): | |||||
pass | pass | ||||
""" | """ | ||||
class ConfigScript(object): | |||||
class ConfigScript: | |||||
"""A script to guide a user through the creation of a new config file.""" | """A script to guide a user through the creation of a new config file.""" | ||||
WIDTH = 79 | WIDTH = 79 | ||||
PROMPT = "\x1b[32m> \x1b[0m" | PROMPT = "\x1b[32m> \x1b[0m" | ||||
BCRYPT_ROUNDS = 12 | |||||
PBKDF_ROUNDS = 100000 | |||||
def __init__(self, config): | def __init__(self, config): | ||||
self.config = config | self.config = config | ||||
@@ -72,24 +74,24 @@ class ConfigScript(object): | |||||
self._lang = None | self._lang = None | ||||
def _print(self, text): | def _print(self, text): | ||||
print fill(re.sub("\s\s+", " ", text), self.WIDTH) | |||||
print(fill(re.sub(r"\s\s+", " ", text), self.WIDTH)) | |||||
def _print_no_nl(self, text): | def _print_no_nl(self, text): | ||||
sys.stdout.write(fill(re.sub("\s\s+", " ", text), self.WIDTH)) | |||||
sys.stdout.write(fill(re.sub(r"\s\s+", " ", text), self.WIDTH)) | |||||
sys.stdout.flush() | sys.stdout.flush() | ||||
def _pause(self): | def _pause(self): | ||||
raw_input(self.PROMPT + "Press enter to continue: ") | |||||
input(self.PROMPT + "Press enter to continue: ") | |||||
def _ask(self, text, default=None, require=True): | def _ask(self, text, default=None, require=True): | ||||
text = self.PROMPT + text | text = self.PROMPT + text | ||||
if default: | if default: | ||||
text += " \x1b[33m[{0}]\x1b[0m".format(default) | text += " \x1b[33m[{0}]\x1b[0m".format(default) | ||||
lines = wrap(re.sub("\s\s+", " ", text), self.WIDTH) | |||||
lines = wrap(re.sub(r"\s\s+", " ", text), self.WIDTH) | |||||
if len(lines) > 1: | if len(lines) > 1: | ||||
print "\n".join(lines[:-1]) | |||||
print("\n".join(lines[:-1])) | |||||
while True: | while True: | ||||
answer = raw_input(lines[-1] + " ") or default | |||||
answer = input(lines[-1] + " ") or default | |||||
if answer or not require: | if answer or not require: | ||||
return answer | return answer | ||||
@@ -99,11 +101,11 @@ class ConfigScript(object): | |||||
text += " \x1b[33m[Y/n]\x1b[0m" | text += " \x1b[33m[Y/n]\x1b[0m" | ||||
else: | else: | ||||
text += " \x1b[33m[y/N]\x1b[0m" | text += " \x1b[33m[y/N]\x1b[0m" | ||||
lines = wrap(re.sub("\s\s+", " ", text), self.WIDTH) | |||||
lines = wrap(re.sub(r"\s\s+", " ", text), self.WIDTH) | |||||
if len(lines) > 1: | if len(lines) > 1: | ||||
print "\n".join(lines[:-1]) | |||||
print("\n".join(lines[:-1])) | |||||
while True: | while True: | ||||
answer = raw_input(lines[-1] + " ").lower() | |||||
answer = input(lines[-1] + " ").lower() | |||||
if not answer: | if not answer: | ||||
return default | return default | ||||
if answer.startswith("y"): | if answer.startswith("y"): | ||||
@@ -119,59 +121,57 @@ class ConfigScript(object): | |||||
def _encrypt(self, password): | def _encrypt(self, password): | ||||
if self._cipher: | if self._cipher: | ||||
mod = len(password) % 8 | |||||
if mod: | |||||
password = password.ljust(len(password) + (8 - mod), "\x00") | |||||
return self._cipher.encrypt(password).encode("hex") | |||||
return base64.b64encode(self._cipher.encrypt(password.encode())).decode() | |||||
else: | else: | ||||
return password | return password | ||||
def _ask_list(self, text): | def _ask_list(self, text): | ||||
print fill(re.sub("\s\s+", " ", self.PROMPT + text), self.WIDTH) | |||||
print "[one item per line; blank line to end]:" | |||||
print(fill(re.sub(r"\s\s+", " ", self.PROMPT + text), self.WIDTH)) | |||||
print("[one item per line; blank line to end]:") | |||||
result = [] | result = [] | ||||
while True: | while True: | ||||
line = raw_input(self.PROMPT) | |||||
line = input(self.PROMPT) | |||||
if line: | if line: | ||||
result.append(line) | result.append(line) | ||||
else: | else: | ||||
return result | return result | ||||
def _set_metadata(self): | def _set_metadata(self): | ||||
print() | |||||
self.data["metadata"] = OrderedDict([("version", 1)]) | self.data["metadata"] = OrderedDict([("version", 1)]) | ||||
self._print("""I can encrypt passwords stored in your config file in | self._print("""I can encrypt passwords stored in your config file in | ||||
addition to preventing other users on your system from | addition to preventing other users on your system from | ||||
reading the file. Encryption is recommended if the bot | reading the file. Encryption is recommended if the bot | ||||
is to run on a public server like Wikimedia Labs, but | |||||
otherwise the need to enter a key every time you start | |||||
the bot may be an inconvenience.""") | |||||
is to run on a public server like Toolforge, but the | |||||
need to enter a key every time you start the bot may be | |||||
an inconvenience.""") | |||||
self.data["metadata"]["encryptPasswords"] = False | self.data["metadata"]["encryptPasswords"] = False | ||||
if self._ask_bool("Encrypt stored passwords?"): | if self._ask_bool("Encrypt stored passwords?"): | ||||
key = getpass(self.PROMPT + "Enter an encryption key: ") | key = getpass(self.PROMPT + "Enter an encryption key: ") | ||||
msg = "Running {0} rounds of bcrypt...".format(self.BCRYPT_ROUNDS) | |||||
self._print_no_nl(msg) | |||||
self._print_no_nl("Generating key...") | |||||
try: | try: | ||||
salt = bcrypt.gensalt(self.BCRYPT_ROUNDS) | |||||
signature = bcrypt.hashpw(key, salt) | |||||
self._cipher = Blowfish.new(sha256(key).digest()) | |||||
salt = os.urandom(16) | |||||
kdf = pbkdf2.PBKDF2HMAC( | |||||
algorithm=hashes.SHA256(), | |||||
length=32, | |||||
salt=salt, | |||||
iterations=self.PBKDF_ROUNDS, | |||||
) | |||||
self._cipher = fernet.Fernet(base64.urlsafe_b64encode(kdf.derive(key.encode()))) | |||||
except ImportError: | except ImportError: | ||||
print " error!" | |||||
self._print("""Encryption requires the 'py-bcrypt' and | |||||
'pycrypto' packages:""") | |||||
strt, end = " * \x1b[36m", "\x1b[0m" | |||||
print strt + "http://www.mindrot.org/projects/py-bcrypt/" + end | |||||
print strt + "https://www.dlitz.net/software/pycrypto/" + end | |||||
print(" error!") | |||||
self._print("""Encryption requires the 'cryptography' package: | |||||
https://cryptography.io/""") | |||||
self._print("""I will disable encryption for now; restart | self._print("""I will disable encryption for now; restart | ||||
configuration after installing these packages if | |||||
configuration after installing this package if | |||||
you want it.""") | you want it.""") | ||||
self._pause() | self._pause() | ||||
else: | else: | ||||
self.data["metadata"]["encryptPasswords"] = True | self.data["metadata"]["encryptPasswords"] = True | ||||
self.data["metadata"]["signature"] = signature | |||||
print " done." | |||||
self.data["metadata"]["salt"] = base64.b64encode(salt).decode() | |||||
print(" done.") | |||||
print() | |||||
self._print("""The bot can temporarily store its logs in the logs/ | self._print("""The bot can temporarily store its logs in the logs/ | ||||
subdirectory. Error logs are kept for a month whereas | subdirectory. Error logs are kept for a month whereas | ||||
normal logs are kept for a week. If you disable this, | normal logs are kept for a week. If you disable this, | ||||
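The hunk above replaces the old bcrypt/Blowfish scheme with a key derived via PBKDF2 and used with Fernet. As a minimal sketch of the same derivation using only the standard library's `hashlib.pbkdf2_hmac` (the `cryptography` package's `PBKDF2HMAC` computes the same function; the passphrase, salt size, and round count here are illustrative, not the bot's actual values):

```python
import base64
import hashlib
import os

def derive_fernet_key(passphrase: str, salt: bytes, rounds: int = 100_000) -> bytes:
    """Derive a 32-byte key and base64-encode it the way Fernet expects."""
    raw = hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, rounds)
    return base64.urlsafe_b64encode(raw)

salt = os.urandom(16)
key = derive_fernet_key("hunter2", salt)
assert len(key) == 44                             # 32 raw bytes -> 44 base64 chars
assert key == derive_fernet_key("hunter2", salt)  # deterministic for a fixed salt
```

Persisting the salt, as the diff does with `base64.b64encode(salt)`, is what lets the bot re-derive the same key (and cipher) from the passphrase on the next run.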
@@ -180,7 +180,7 @@ class ConfigScript(object):
         self.data["metadata"]["enableLogging"] = logging

     def _set_components(self):
-        print
+        print()
         self._print("""The bot contains three separate components that can run
                        independently of each other.""")
         self._print("""- The IRC front-end runs on a normal IRC server, like
@@ -209,8 +209,8 @@ class ConfigScript(object):
         try:
             site = self.config.bot.wiki.add_site(**kwargs)
         except exceptions.APIError as exc:
-            print " API error!"
-            print "\x1b[31m" + exc.message + "\x1b[0m"
+            print(" API error!")
+            print("\x1b[31m" + exc.message + "\x1b[0m")
             question = "Would you like to re-enter the site information?"
             if self._ask_bool(question):
                 return self._set_wiki()
@@ -219,8 +219,8 @@ class ConfigScript(object):
                 raise exceptions.NoConfigError()
             return self._set_wiki()
         except exceptions.LoginError as exc:
-            print " login error!"
-            print "\x1b[31m" + exc.message + "\x1b[0m"
+            print(" login error!")
+            print("\x1b[31m" + exc.message + "\x1b[0m")
             question = "Would you like to re-enter your login information?"
             if self._ask_bool(question):
                 self.data["wiki"]["username"] = self._ask("Bot username:")
@@ -232,17 +232,17 @@ class ConfigScript(object):
             question = "Would you like to re-enter the site information?"
             if self._ask_bool(question):
                 return self._set_wiki()
-            print
+            print()
             self._print("""Moving on. You can modify the login information
                            stored in the bot's config in the future.""")
             self.data["wiki"]["password"] = None  # Clear so we don't login
             self.config.wiki._load(self.data["wiki"])
             self._print_no_nl("Trying to connect to the site...")
             site = self.config.bot.wiki.add_site(**kwargs)
-            print " success."
+            print(" success.")
             self.data["wiki"]["password"] = password  # Reset original value
         else:
-            print " success."
+            print(" success.")

         # Remember to store the encrypted password:
         password = self._encrypt(self.data["wiki"]["password"])
@@ -250,7 +250,7 @@ class ConfigScript(object):
         return site

     def _set_wiki(self):
-        print
+        print()
         self._wmf = self._ask_bool("""Will this bot run on Wikimedia Foundation
                                       wikis, like Wikipedia?""")
         if self._wmf:
@@ -296,7 +296,7 @@ class ConfigScript(object):
         self.data["wiki"]["shutoff"] = {}
         msg = "Would you like to enable an automatic shutoff page for the bot?"
         if self._ask_bool(msg):
-            print
+            print()
             self._print("""The page title can contain two wildcards: $1 will be
                            substituted with the bot's username, and $2 with the
                            current task number. This can be used to implement a
@@ -311,7 +311,7 @@ class ConfigScript(object):
     def _set_irc(self):
         if self.data["components"]["irc_frontend"]:
-            print
+            print()
             frontend = self.data["irc"]["frontend"] = OrderedDict()
             msg = "Hostname of the frontend's IRC server, without 'irc://':"
             frontend["host"] = self._ask(msg, "irc.freenode.net")
@@ -328,7 +328,7 @@ class ConfigScript(object):
                 frontend["nickservPassword"] = ns_pass
             chan_question = "Frontend channels to join by default:"
             frontend["channels"] = self._ask_list(chan_question)
-            print
+            print()
             self._print("""The bot keeps a database of its admins (users who
                            can use certain sensitive commands) and owners
                            (users who can quit the bot and modify its access
@@ -347,7 +347,7 @@ class ConfigScript(object):
             frontend = {}
         if self.data["components"]["irc_watcher"]:
-            print
+            print()
             watcher = self.data["irc"]["watcher"] = OrderedDict()
             if self._wmf:
                 watcher["host"] = "irc.wikimedia.org"
@@ -375,7 +375,7 @@ class ConfigScript(object):
         else:
             chan_question = "Watcher channels to join by default:"
             watcher["channels"] = self._ask_list(chan_question)
-        print
+        print()
         self._print("""I am now creating a blank 'rules.py' file, which
                        will determine how the bot handles messages received
                        from the IRC watcher. It contains a process()
@@ -390,13 +390,13 @@ class ConfigScript(object):
         self.data["irc"]["version"] = "EarwigBot - $1 - Python/$2 https://github.com/earwig/earwigbot"
     def _set_commands(self):
-        print
+        print()
         msg = """Would you like to disable the default IRC commands? You can
                  fine-tune which commands are disabled later on."""
         if (not self.data["components"]["irc_frontend"] or
                 self._ask_bool(msg, default=False)):
             self.data["commands"]["disable"] = True
-        print
+        print()
         self._print("""I am now creating the 'commands/' directory, where you
                        can place custom IRC commands and plugins. Creating your
                        own commands is described in the documentation.""")
@@ -404,7 +404,7 @@ class ConfigScript(object):
         self._pause()

     def _set_tasks(self):
-        print
+        print()
         self._print("""I am now creating the 'tasks/' directory, where you can
                        place custom bot tasks and plugins. Creating your own
                        tasks is described in the documentation.""")
@@ -412,22 +412,22 @@ class ConfigScript(object):
         self._pause()

     def _set_schedule(self):
-        print
+        print()
         self._print("""The final section of your config file, 'schedule', is a
                        list of bot tasks to be started by the wiki scheduler.
                        Each entry contains cron-like time quantifiers and a
                        list of tasks. For example, the following starts the
                        'foobot' task every hour on the half-hour:""")
-        print "\x1b[33mschedule:"
-        print "    - minute: 30"
-        print "      tasks:"
-        print "          - foobot\x1b[0m"
+        print("\x1b[33mschedule:")
+        print("    - minute: 30")
+        print("      tasks:")
+        print("          - foobot\x1b[0m")
         self._print("""The following starts the 'barbot' task with the keyword
                        arguments 'action="baz"' every Monday at 05:00 UTC:""")
-        print "\x1b[33m    - week_day: 1"
-        print "      hour: 5"
-        print "      tasks:"
-        print '          - ["barbot", {"action": "baz"}]\x1b[0m'
+        print("\x1b[33m    - week_day: 1")
+        print("      hour: 5")
+        print("      tasks:")
+        print('          - ["barbot", {"action": "baz"}]\x1b[0m')
         self._print("""The full list of quantifiers is minute, hour, month_day,
                        month, and week_day. See the documentation for more
                        information.""")
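As a rough illustration of the schedule semantics documented above (each quantifier constrains one time field; an omitted quantifier matches any value), here is a hypothetical matcher. `entry_matches` is not EarwigBot's actual scheduler, and the assumption that `week_day: 1` means Monday is taken from the "every Monday at 05:00 UTC" example:

```python
from datetime import datetime

def entry_matches(entry: dict, when: datetime) -> bool:
    """Hypothetical check: does a schedule entry fire at the given time?"""
    fields = {
        "minute": when.minute,
        "hour": when.hour,
        "month_day": when.day,
        "month": when.month,
        "week_day": when.isoweekday(),  # assumes 1 = Monday, per the example
    }
    # Every quantifier present in the entry must match; 'tasks' is the payload.
    return all(entry[q] == fields[q] for q in entry if q != "tasks")

# The 'foobot' example fires every hour on the half-hour:
assert entry_matches({"minute": 30, "tasks": ["foobot"]},
                     datetime(2019, 3, 25, 17, 30))
# The 'barbot' example fires Mondays at 05:00 (2019-03-25 was a Monday):
assert entry_matches({"week_day": 1, "hour": 5, "tasks": []},
                     datetime(2019, 3, 25, 5, 0))
```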
@@ -449,7 +449,7 @@ class ConfigScript(object):
             open(self.config.path, "w").close()
             chmod(self.config.path, stat.S_IRUSR|stat.S_IWUSR)
         except IOError:
-            print "I can't seem to write to the config file:"
+            print("I can't seem to write to the config file:")
             raise
         self._set_metadata()
         self._set_components()
@@ -461,7 +461,7 @@ class ConfigScript(object):
             self._set_tasks()
         if components["wiki_scheduler"]:
             self._set_schedule()
-        print
+        print()
         self._print("""I am now saving config.yml with your settings. YAML is a
                        relatively straightforward format and you should be able
                        to update these settings in the future when necessary.
@@ -268,5 +268,5 @@ class ParserRedirectError(CopyvioCheckError):
     exposed in client code.
     """

     def __init__(self, url):
-        super(ParserRedirectError, self).__init__()
+        super().__init__()
         self.url = url
@@ -28,7 +28,7 @@ from earwigbot.exceptions import BrokenSocketError
 __all__ = ["IRCConnection"]

-class IRCConnection(object):
+class IRCConnection:
     """Interface with an IRC server."""

     def __init__(self, host, port, nick, ident, realname, logger):
@@ -84,7 +84,7 @@ class IRCConnection(object):
         if not data:
             # Socket isn't giving us any data, so it is dead or broken:
             raise BrokenSocketError()
-        return data
+        return data.decode(errors="ignore")

     def _send(self, msg, hidelog=False):
         """Send data to the server."""
@@ -93,7 +93,7 @@ class IRCConnection(object):
             if time_since_last < 0.75:
                 sleep(0.75 - time_since_last)
             try:
-                self._sock.sendall(msg + "\r\n")
+                self._sock.sendall(msg.encode() + b"\r\n")
             except socket.error:
                 self._is_running = False
             else:
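The two hunks above move the str/bytes boundary to the socket: Python 3 sockets carry bytes, so outgoing lines are encoded before `sendall()` and incoming data is decoded, with `errors="ignore"` dropping bytes that aren't valid UTF-8 (IRC mandates no particular encoding). A quick sketch of that round trip; the message text is made up:

```python
# Outgoing: text is encoded to bytes before sendall(); IRC lines end in CRLF.
outgoing = "PRIVMSG #channel :hello"
wire = outgoing.encode() + b"\r\n"
assert wire == b"PRIVMSG #channel :hello\r\n"

# Incoming: errors="ignore" silently drops undecodable bytes instead of
# raising UnicodeDecodeError, so one legacy-encoded message can't kill the bot.
received = b"caf\xe9 time"  # latin-1 e-acute, not valid UTF-8
assert received.decode(errors="ignore") == "caf time"
```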
@@ -177,7 +177,7 @@ class IRCConnection(object):
     def ident(self):
         """Our ident on the server, like ``"earwig"``.

-        See http://en.wikipedia.org/wiki/Ident.
+        See https://en.wikipedia.org/wiki/Ident_protocol.
         """
         return self._ident
@@ -24,7 +24,7 @@ import re
 __all__ = ["Data"]

-class Data(object):
+class Data:
     """Store data from an individual line received on IRC."""

     def __init__(self, my_nick, line, msgtype):
@@ -160,7 +160,7 @@ class Data(object):
     @property
     def ident(self):
-        """`Ident <http://en.wikipedia.org/wiki/Ident>`_ of the sender."""
+        """`Ident <https://en.wikipedia.org/wiki/Ident_protocol>`_ of the sender."""
         return self._ident

     @property
@@ -43,9 +43,8 @@ class Frontend(IRCConnection):
     def __init__(self, bot):
         self.bot = bot
         cf = bot.config.irc["frontend"]
-        base = super(Frontend, self)
-        base.__init__(cf["host"], cf["port"], cf["nick"], cf["ident"],
-                      cf["realname"], bot.logger.getChild("frontend"))
+        super().__init__(cf["host"], cf["port"], cf["nick"], cf["ident"],
+                         cf["realname"], bot.logger.getChild("frontend"))
         self._auth_wait = False
         self._channels = set()
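This hunk relies on Python 3's zero-argument `super()`, which picks up the class and instance from the enclosing method, so the old `base = super(Frontend, self)` two-step is no longer needed. A toy demonstration of the equivalence (class names here are illustrative, not EarwigBot's):

```python
class Connection:
    def __init__(self, host):
        self.host = host

class OldStyle(Connection):
    def __init__(self):
        # Python 2-compatible spelling, still valid in Python 3:
        super(OldStyle, self).__init__("irc.example.org")

class NewStyle(Connection):
    def __init__(self):
        # Python 3 zero-argument form:
        super().__init__("irc.example.org")

assert OldStyle().host == NewStyle().host == "irc.example.org"
```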
@@ -24,7 +24,7 @@ import re
 __all__ = ["RC"]

-class RC(object):
+class RC:
     """Store data from an event received from our IRC watcher."""

     re_color = re.compile("\x03([0-9]{1,2}(,[0-9]{1,2})?)?")
     re_edit = re.compile("\A\[\[(.*?)\]\]\s(.*?)\s(https?://.*?)\s\*\s(.*?)\s\*\s(.*?)\Z")
@@ -40,9 +40,8 @@ class Watcher(IRCConnection):
     def __init__(self, bot):
         self.bot = bot
         cf = bot.config.irc["watcher"]
-        base = super(Watcher, self)
-        base.__init__(cf["host"], cf["port"], cf["nick"], cf["ident"],
-                      cf["realname"], bot.logger.getChild("watcher"))
+        super().__init__(cf["host"], cf["port"], cf["nick"], cf["ident"],
+                         cf["realname"], bot.logger.getChild("watcher"))
         self._prepare_process_hook()
         self._connect()
@@ -22,12 +22,13 @@
 """
 Implements a hierarchy of importing classes as defined in `PEP 302
-<http://www.python.org/dev/peps/pep-0302/>`_ to load modules in a safe yet lazy
+<https://www.python.org/dev/peps/pep-0302/>`_ to load modules in a safe yet lazy
 manner, so that they can be referred to by name but are not actually loaded
 until they are used (i.e. their attributes are read or modified).
 """

 from imp import acquire_lock, release_lock
+import importlib
 import sys
 from threading import RLock
 from types import ModuleType
@@ -46,7 +47,7 @@ def _mock_get(self, attr):
     if _real_get(self, "_unloaded"):
         type(self)._unloaded = False
         try:
-            reload(self)
+            importlib.reload(self)
         except ImportError as exc:
             type(self).__getattribute__ = _create_failing_get(exc)
             del type(self)._lock
@@ -77,7 +78,7 @@ class _LazyModule(type):
             release_lock()

-class LazyImporter(object):
+class LazyImporter:
     """An importer for modules that are loaded lazily.

     This inserts itself into :py:data:`sys.meta_path`, storing a dictionary of
@@ -32,7 +32,7 @@ from earwigbot.tasks import Task
 __all__ = ["CommandManager", "TaskManager"]

-class _ResourceManager(object):
+class _ResourceManager:
     """
     **EarwigBot: Resource Manager**
@@ -69,7 +69,7 @@ class _ResourceManager(object):
     def __iter__(self):
         with self.lock:
-            for resource in self._resources.itervalues():
+            for resource in self._resources.values():
                 yield resource

     def _is_disabled(self, name):
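Python 3 drops `dict.itervalues()`; plain `.values()` now returns a lazy view, so the locked iteration above keeps the same no-copy behavior. For instance (the resource names are made up):

```python
resources = {"copyvios": 1, "afc_statistics": 2}

# A values view doesn't copy the dict and tracks later mutations,
# much like Python 2's itervalues() iterator did.
view = resources.values()
assert len(view) == 2
resources["wikiproject_tagger"] = 3
assert len(view) == 3          # the view reflects the new entry
assert sorted(view) == [1, 2, 3]
```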
@@ -201,7 +201,7 @@ class CommandManager(_ResourceManager):
     Manages (i.e., loads, reloads, and calls) IRC commands.
     """

     def __init__(self, bot):
-        super(CommandManager, self).__init__(bot, "commands", Command)
+        super().__init__(bot, "commands", Command)

     def _wrap_check(self, command, data):
         """Check whether a command should be called, catching errors."""
@@ -248,7 +248,7 @@ class TaskManager(_ResourceManager):
     Manages (i.e., loads, reloads, schedules, and runs) wiki bot tasks.
     """

     def __init__(self, bot):
-        super(TaskManager, self).__init__(bot, "tasks", Task)
+        super().__init__(bot, "tasks", Task)

     def _wrapper(self, task, **kwargs):
         """Wrapper for task classes: run the task and catch any errors."""
@@ -25,7 +25,7 @@ from earwigbot import wiki
 __all__ = ["Task"]

-class Task(object):
+class Task:
     """
     **EarwigBot: Base Bot Task**
@@ -48,8 +48,8 @@ class Task(object):
         This is called once immediately after the task class is loaded by
         the task manager (in :py:meth:`tasks.load()
         <earwigbot.managers._ResourceManager.load>`). Don't override this
-        directly; if you do, remember to place ``super(Task, self).__init()``
-        first. Use :py:meth:`setup` for typical task-init/setup needs.
+        directly; if you do, remember to place ``super().__init()`` first.
+        Use :py:meth:`setup` for typical task-init/setup needs.
         """
         self.bot = bot
         self.config = bot.config
@@ -183,11 +183,11 @@ class WikiProjectTagger(Task):
         """
         prefix = title.split(":", 1)[0]
         if prefix == title:
-            return u":".join((site.namespace_id_to_name(assumed), title))
+            return ":".join((site.namespace_id_to_name(assumed), title))
         try:
             site.namespace_name_to_id(prefix)
         except exceptions.NamespaceNotFoundError:
-            return u":".join((site.namespace_id_to_name(assumed), title))
+            return ":".join((site.namespace_id_to_name(assumed), title))
         return title

     def get_names(self, site, banner):
@@ -197,7 +197,7 @@ class WikiProjectTagger(Task):
             banner = banner.split(":", 1)[1]
         page = site.get_page(title)
         if page.exists != page.PAGE_EXISTS:
-            self.logger.error(u"Banner [[%s]] does not exist", title)
+            self.logger.error("Banner [[%s]] does not exist", title)
             return banner, None
         names = {banner, title}
@@ -208,17 +208,17 @@ class WikiProjectTagger(Task):
             if backlink["ns"] == constants.NS_TEMPLATE:
                 names.add(backlink["title"].split(":", 1)[1])
-        log = u"Found %s aliases for banner [[%s]]"
+        log = "Found %s aliases for banner [[%s]]"
         self.logger.debug(log, len(names), title)
         return banner, names

     def process_category(self, page, job, recursive):
         """Try to tag all pages in the given category."""
         if page.title in job.processed_cats:
-            self.logger.debug(u"Skipping category, already processed: [[%s]]",
+            self.logger.debug("Skipping category, already processed: [[%s]]",
                               page.title)
             return
-        self.logger.info(u"Processing category: [[%s]]", page.title)
+        self.logger.info("Processing category: [[%s]]", page.title)
         job.processed_cats.add(page.title)
         if job.tag_categories:
@@ -243,7 +243,7 @@ class WikiProjectTagger(Task):
             page = page.toggle_talk()
         if page.title in job.processed_pages:
-            self.logger.debug(u"Skipping page, already processed: [[%s]]",
+            self.logger.debug("Skipping page, already processed: [[%s]]",
                               page.title)
             return
         job.processed_pages.add(page.title)
@@ -259,7 +259,7 @@ class WikiProjectTagger(Task):
                 self.process_new_page(page, job)
                 return
         except exceptions.InvalidPageError:
-            self.logger.error(u"Skipping invalid page: [[%s]]", page.title)
+            self.logger.error("Skipping invalid page: [[%s]]", page.title)
             return

         is_update = False
@@ -270,28 +270,28 @@ class WikiProjectTagger(Task):
                 is_update = True
                 break
             else:
-                log = u"Skipping page: [[%s]]; already tagged with '%s'"
+                log = "Skipping page: [[%s]]; already tagged with '%s'"
                 self.logger.info(log, page.title, template.name)
                 return

         if job.only_with:
             if not any(template.name.matches(job.only_with)
                        for template in code.ifilter_templates(recursive=True)):
-                log = u"Skipping page: [[%s]]; fails only-with condition"
+                log = "Skipping page: [[%s]]; fails only-with condition"
                 self.logger.info(log, page.title)
                 return

         if is_update:
-            old_banner = unicode(banner)
+            old_banner = str(banner)
             self.update_banner(banner, job, code)
             if banner == old_banner:
-                log = u"Skipping page: [[%s]]; already tagged and no updates"
+                log = "Skipping page: [[%s]]; already tagged and no updates"
                 self.logger.info(log, page.title)
                 return
-            self.logger.info(u"Updating banner on page: [[%s]]", page.title)
+            self.logger.info("Updating banner on page: [[%s]]", page.title)
             banner = banner.encode("utf8")
         else:
-            self.logger.info(u"Tagging page: [[%s]]", page.title)
+            self.logger.info("Tagging page: [[%s]]", page.title)
             banner = self.make_banner(job, code)
         shell = self.get_banner_shell(code)
         if shell:
@@ -299,22 +299,22 @@ class WikiProjectTagger(Task):
         else:
             self.add_banner(code, banner)
-        self.save_page(page, job, unicode(code), banner)
+        self.save_page(page, job, str(code), banner)

     def process_new_page(self, page, job):
         """Try to tag a *page* that doesn't exist yet using the *job*."""
         if job.nocreate or job.only_with:
-            log = u"Skipping nonexistent page: [[%s]]"
+            log = "Skipping nonexistent page: [[%s]]"
             self.logger.info(log, page.title)
         else:
-            self.logger.info(u"Tagging new page: [[%s]]", page.title)
+            self.logger.info("Tagging new page: [[%s]]", page.title)
             banner = self.make_banner(job)
             self.save_page(page, job, banner, banner)

     def save_page(self, page, job, text, banner):
         """Save a page with an updated banner."""
         if job.dry_run:
-            self.logger.debug(u"[DRY RUN] Banner: %s", banner)
+            self.logger.debug("[DRY RUN] Banner: %s", banner)
         else:
             summary = job.summary.replace("$3", banner)
             page.edit(text, self.make_summary(summary), minor=True)
@@ -365,7 +365,7 @@ class WikiProjectTagger(Task):
         classes = {klass: 0 for klass in classnames}
         for template in code.ifilter_templates(recursive=True):
             if template.has("class"):
-                value = unicode(template.get("class").value).lower()
+                value = str(template.get("class").value).lower()
                 if value in classes:
                     classes[value] += 1
@@ -388,14 +388,14 @@ class WikiProjectTagger(Task):
         if not shells:
             shells = code.filter_templates(matches=regex, recursive=True)
         if shells:
-            log = u"Inserting banner into shell: %s"
+            log = "Inserting banner into shell: %s"
             self.logger.debug(log, shells[0].name)
             return shells[0]

     def add_banner_to_shell(self, shell, banner):
         """Add *banner* to *shell*."""
         if shell.has_param(1):
-            if unicode(shell.get(1).value).endswith("\n"):
+            if str(shell.get(1).value).endswith("\n"):
                 banner += "\n"
             else:
                 banner = "\n" + banner
@@ -410,16 +410,16 @@ class WikiProjectTagger(Task):
             name = template.name.lower().replace("_", " ")
             for regex in self.TOP_TEMPS:
                 if re.match(regex, name):
-                    self.logger.debug(u"Skipping past top template: %s", name)
+                    self.logger.debug("Skipping past top template: %s", name)
                     predecessor = template
                     break
             if "wikiproject" in name or name.startswith("wp"):
-                self.logger.debug(u"Skipping past banner template: %s", name)
+                self.logger.debug("Skipping past banner template: %s", name)
                 predecessor = template
         if predecessor:
             self.logger.debug("Inserting banner after template")
-            if not unicode(predecessor).endswith("\n"):
+            if not str(predecessor).endswith("\n"):
                 banner = "\n" + banner
             post = code.index(predecessor) + 1
             if len(code.nodes) > post and not code.get(post).startswith("\n"):
@@ -429,7 +429,7 @@ class WikiProjectTagger(Task):
             self.logger.debug("Inserting banner at beginning")
             code.insert(0, banner + "\n")

-class _Job(object):
+class _Job:
     """Represents a single wikiproject-tagging task.

     Stores information on the banner to add, the edit summary to use, whether
@@ -128,8 +128,8 @@ def main(): | |||||
level = logging.DEBUG | level = logging.DEBUG | ||||
elif args.quiet: | elif args.quiet: | ||||
level = logging.WARNING | level = logging.WARNING | ||||
print version | |||||
print(version) | |||||
print() | |||||
bot = Bot(path.abspath(args.path), level=level) | bot = Bot(path.abspath(args.path), level=level) | ||||
if args.task: | if args.task: | ||||
@@ -25,8 +25,8 @@ | |||||
This is a collection of classes and functions to read from and write to | This is a collection of classes and functions to read from and write to | ||||
Wikipedia and other wiki sites. No connection whatsoever to `python-wikitools | Wikipedia and other wiki sites. No connection whatsoever to `python-wikitools | ||||
<http://code.google.com/p/python-wikitools/>`_ written by `Mr.Z-man | |||||
<http://en.wikipedia.org/wiki/User:Mr.Z-man>`_, other than a similar purpose. | |||||
<https://github.com/alexz-enwp/wikitools>`_ written by `Mr.Z-man | |||||
<https://en.wikipedia.org/wiki/User:Mr.Z-man>`_, other than a similar purpose. | |||||
We share no code. | We share no code. | ||||
Import the toolset directly with ``from earwigbot import wiki``. If using the | Import the toolset directly with ``from earwigbot import wiki``. If using the | ||||
@@ -99,7 +99,7 @@ class Category(Page): | |||||
base = row[0].replace("_", " ").decode("utf8") | base = row[0].replace("_", " ").decode("utf8") | ||||
namespace = self.site.namespace_id_to_name(row[1]) | namespace = self.site.namespace_id_to_name(row[1]) | ||||
if namespace: | if namespace: | ||||
title = u":".join((namespace, base)) | |||||
title = ":".join((namespace, base)) | |||||
else: # Avoid doing a silly (albeit valid) ":Pagename" thing | else: # Avoid doing a silly (albeit valid) ":Pagename" thing | ||||
title = base | title = base | ||||
yield self.site.get_page(title, follow_redirects=follow, | yield self.site.get_page(title, follow_redirects=follow, | ||||
@@ -109,7 +109,7 @@ class Category(Page): | |||||
"""Return the size of the category using the API.""" | """Return the size of the category using the API.""" | ||||
result = self.site.api_query(action="query", prop="categoryinfo", | result = self.site.api_query(action="query", prop="categoryinfo", | ||||
titles=self.title) | titles=self.title) | ||||
info = result["query"]["pages"].values()[0]["categoryinfo"] | |||||
info = list(result["query"]["pages"].values())[0]["categoryinfo"] | |||||
return info[member_type] | return info[member_type] | ||||
def _get_size_via_sql(self, member_type): | def _get_size_via_sql(self, member_type): | ||||
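The `.values()` change in the hunk above touches one of the subtler Python 3 migration points: `dict.values()` now returns a lazy view that cannot be indexed, so `values()[0]` must become `list(values())[0]` (or `next(iter(values()))`). A minimal sketch, using a made-up response shaped like MediaWiki's `action=query` output:

```python
# Hypothetical API response shaped like the one the hunk indexes into.
result = {"query": {"pages": {"736": {"categoryinfo": {"pages": 42}}}}}

pages = result["query"]["pages"]
# pages.values()[0]  # Python 2: fine; Python 3: TypeError, views aren't subscriptable

# Wrapping the view in list() (or using next(iter(...))) restores indexing:
info = list(pages.values())[0]["categoryinfo"]
print(info["pages"])  # 42
```

`next(iter(pages.values()))` avoids materializing the whole list, but `list(...)` is the more direct mechanical translation used throughout this diff.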
@@ -21,7 +21,7 @@
 # SOFTWARE.

 from time import sleep, time
-from urllib2 import build_opener
+from urllib.request import build_opener

 from earwigbot import exceptions
 from earwigbot.wiki.copyvios.markov import MarkovChain
@@ -32,7 +32,7 @@ from earwigbot.wiki.copyvios.workers import (
 __all__ = ["CopyvioMixIn", "globalize", "localize"]


-class CopyvioMixIn(object):
+class CopyvioMixIn:
     """
     **EarwigBot: Wiki Toolset: Copyright Violation MixIn**
@@ -114,7 +114,7 @@ class CopyvioMixIn(object):
         (:exc:`.UnknownSearchEngineError`, :exc:`.SearchQueryError`, ...) on
         errors.
         """
-        log = u"Starting copyvio check for [[{0}]]"
+        log = "Starting copyvio check for [[{0}]]"
         self._logger.info(log.format(self.title))
         searcher = self._get_search_engine()
         parser = ArticleTextParser(self.get(), args={
@@ -151,7 +151,7 @@ class CopyvioMixIn(object):
             if short_circuit and workspace.finished:
                 workspace.possible_miss = True
                 break
-            log = u"[[{0}]] -> querying {1} for {2!r}"
+            log = "[[{0}]] -> querying {1} for {2!r}"
             self._logger.debug(log.format(self.title, searcher.name, chunk))
             workspace.enqueue(searcher.search(chunk))
             num_queries += 1
@@ -183,7 +183,7 @@ class CopyvioMixIn(object):
         Since no searching is done, neither :exc:`.UnknownSearchEngineError`
         nor :exc:`.SearchQueryError` will be raised.
         """
-        log = u"Starting copyvio compare for [[{0}]] against {1}"
+        log = "Starting copyvio compare for [[{0}]] against {1}"
         self._logger.info(log.format(self.title, url))
         article = MarkovChain(ArticleTextParser(self.get()).strip())
         workspace = CopyvioWorkspace(
@@ -24,7 +24,7 @@ import re
 import sqlite3 as sqlite
 from threading import Lock
 from time import time
-from urlparse import urlparse
+from urllib.parse import urlparse

 from earwigbot import exceptions
@@ -45,7 +45,7 @@ DEFAULT_SOURCES = {
 _RE_STRIP_PREFIX = r"^https?://(www\.)?"


-class ExclusionsDB(object):
+class ExclusionsDB:
     """
     **EarwigBot: Wiki Toolset: Exclusions Database Manager**
@@ -77,7 +77,7 @@ class ExclusionsDB(object):
         """
         query = "INSERT INTO sources VALUES (?, ?);"
         sources = []
-        for sitename, pages in DEFAULT_SOURCES.iteritems():
+        for sitename, pages in DEFAULT_SOURCES.items():
             for page in pages:
                 sources.append((sitename, page))
@@ -168,11 +168,11 @@ class ExclusionsDB(object):
         max_staleness = 60 * 60 * (12 if sitename == "all" else 48)
         time_since_update = int(time() - self._get_last_update(sitename))
         if force or time_since_update > max_staleness:
-            log = u"Updating stale database: {0} (last updated {1} seconds ago)"
+            log = "Updating stale database: {0} (last updated {1} seconds ago)"
             self._logger.info(log.format(sitename, time_since_update))
             self._update(sitename)
         else:
-            log = u"Database for {0} is still fresh (last updated {1} seconds ago)"
+            log = "Database for {0} is still fresh (last updated {1} seconds ago)"
             self._logger.debug(log.format(sitename, time_since_update))
         if sitename != "all":
             self.sync("all", force=force)
@@ -202,11 +202,11 @@ class ExclusionsDB(object):
             else:
                 matches = normalized.startswith(excl)
             if matches:
-                log = u"Exclusion detected in {0} for {1}"
+                log = "Exclusion detected in {0} for {1}"
                 self._logger.debug(log.format(sitename, url))
                 return True

-        log = u"No exclusions in {0} for {1}".format(sitename, url)
+        log = "No exclusions in {0} for {1}".format(sitename, url)
         self._logger.debug(log)
         return False
@@ -25,7 +25,7 @@ from re import sub, UNICODE
 __all__ = ["EMPTY", "EMPTY_INTERSECTION", "MarkovChain",
            "MarkovChainIntersection"]


-class MarkovChain(object):
+class MarkovChain:
     """Implements a basic ngram Markov chain of words."""

     START = -1
     END = -2
@@ -43,7 +43,7 @@ class MarkovChain(object):
         words = ([self.START] * padding) + words + ([self.END] * padding)
         chain = {}
-        for i in xrange(len(words) - self.degree + 1):
+        for i in range(len(words) - self.degree + 1):
             phrase = tuple(words[i:i+self.degree])
             if phrase in chain:
                 chain[phrase] += 1
@@ -53,7 +53,7 @@ class MarkovChain(object):
     def _get_size(self):
         """Return the size of the Markov chain: the total number of nodes."""
-        return sum(self.chain.itervalues())
+        return sum(self.chain.values())

     def __repr__(self):
         """Return the canonical string representation of the MarkovChain."""
@@ -23,9 +23,9 @@
 import json
 from os import path
 import re
-from StringIO import StringIO
-import urllib
-import urlparse
+from io import StringIO
+import urllib.parse
+import urllib.request

 import mwparserfromhell
@@ -40,7 +40,7 @@ pdfpage = importer.new("pdfminer.pdfpage")
 __all__ = ["ArticleTextParser", "get_parser"]


-class _BaseTextParser(object):
+class _BaseTextParser:
     """Base class for a parser that handles text."""

     TYPE = None
@@ -93,8 +93,8 @@ class ArticleTextParser(_BaseTextParser):
                 self._merge_templates(param.value)
                 chunks.append(param.value)
         if chunks:
-            subst = u" ".join(map(unicode, chunks))
-            code.replace(template, u" " + subst + u" ")
+            subst = " ".join(map(str, chunks))
+            code.replace(template, " " + subst + " ")
         else:
             code.remove(template)
@@ -178,7 +178,7 @@ class ArticleTextParser(_BaseTextParser):
         self._merge_templates(wikicode)
         clean = wikicode.strip_code(normalize=True, collapse=True)
-        self.clean = re.sub("\n\n+", "\n", clean).strip()
+        self.clean = re.sub(r"\n\n+", "\n", clean).strip()
         return self.clean

     def chunk(self, max_chunks, min_query=8, max_query=128, split_thresh=32):
@@ -191,7 +191,7 @@ class ArticleTextParser(_BaseTextParser):
         and *max_chunks* is low, so we don't end up just searching for just the
         first paragraph.

-        This is implemented using :py:mod:`nltk` (http://nltk.org/). A base
+        This is implemented using :py:mod:`nltk` (https://nltk.org/). A base
         directory (*nltk_dir*) is required to store nltk's punctuation
         database, and should be passed as an argument to the constructor. It is
         typically located in the bot's working directory.
@@ -223,7 +223,7 @@ class ArticleTextParser(_BaseTextParser):
         """
         schemes = ("http://", "https://")
         links = mwparserfromhell.parse(self.text).ifilter_external_links()
-        return [unicode(link.url) for link in links
+        return [str(link.url) for link in links
                 if link.url.startswith(schemes)]
@@ -288,7 +288,7 @@ class _HTMLParser(_BaseTextParser):
             "dynamicviews": "1",
             "rewriteforssl": "true",
         }
-        raw = self._open(url + urllib.urlencode(params),
+        raw = self._open(url + urllib.parse.urlencode(params),
                          allow_content_types=["application/json"])
         if raw is None:
             return ""
@@ -307,9 +307,9 @@ class _HTMLParser(_BaseTextParser):
         """Return the actual text contained within an HTML document.

         Implemented using :py:mod:`BeautifulSoup <bs4>`
-        (http://www.crummy.com/software/BeautifulSoup/).
+        (https://www.crummy.com/software/BeautifulSoup/).
         """
-        url = urlparse.urlparse(self.url) if self.url else None
+        url = urllib.parse.urlparse(self.url) if self.url else None
         soup = self._get_soup(self.text)
         if not soup.body:
             # No <body> tag present in HTML ->
@@ -336,8 +336,8 @@ class _PDFParser(_BaseTextParser):
     """A parser that can extract text from a PDF file."""
     TYPE = "PDF"
     substitutions = [
-        (u"\x0c", u"\n"),
-        (u"\u2022", u" "),
+        ("\x0c", "\n"),
+        ("\u2022", " "),
     ]

     def parse(self):
@@ -359,7 +359,7 @@ class _PDFParser(_BaseTextParser):
         value = output.getvalue().decode("utf8")
         for orig, new in self.substitutions:
             value = value.replace(orig, new)
-        return re.sub("\n\n+", "\n", value).strip()
+        return re.sub(r"\n\n+", "\n", value).strip()


 class _PlainTextParser(_BaseTextParser):
@@ -27,7 +27,7 @@ from earwigbot.wiki.copyvios.markov import EMPTY, EMPTY_INTERSECTION
 __all__ = ["CopyvioSource", "CopyvioCheckResult"]


-class CopyvioSource(object):
+class CopyvioSource:
     """
     **EarwigBot: Wiki Toolset: Copyvio Source**
@@ -110,7 +110,7 @@ class CopyvioSource(object):
             event.wait()


-class CopyvioCheckResult(object):
+class CopyvioCheckResult:
     """
     **EarwigBot: Wiki Toolset: Copyvio Check Result**
@@ -167,9 +167,9 @@ class CopyvioCheckResult(object):
     def get_log_message(self, title):
         """Build a relevant log message for this copyvio check result."""
         if not self.sources:
-            log = u"No violation for [[{0}]] (no sources; {1} queries; {2} seconds)"
+            log = "No violation for [[{0}]] (no sources; {1} queries; {2} seconds)"
            return log.format(title, self.queries, self.time)

-        log = u"{0} for [[{1}]] (best: {2} ({3} confidence); {4} sources; {5} queries; {6} seconds)"
+        log = "{0} for [[{1}]] (best: {2} ({3} confidence); {4} sources; {5} queries; {6} seconds)"
         is_vio = "Violation detected" if self.violation else "No violation"
         return log.format(is_vio, title, self.url, self.confidence,
                           len(self.sources), self.queries, self.time)
@@ -24,20 +24,18 @@ from gzip import GzipFile
 from json import loads
 from re import sub as re_sub
 from socket import error
-from StringIO import StringIO
-from urllib import quote, urlencode
-from urllib2 import URLError
+from io import StringIO
+from urllib.parse import quote, urlencode
+from urllib.error import URLError

 from earwigbot import importer
 from earwigbot.exceptions import SearchQueryError

 lxml = importer.new("lxml")
-oauth = importer.new("oauth2")

-__all__ = ["BingSearchEngine", "GoogleSearchEngine", "YahooBOSSSearchEngine",
-           "YandexSearchEngine", "SEARCH_ENGINES"]
+__all__ = ["BingSearchEngine", "GoogleSearchEngine", "YandexSearchEngine", "SEARCH_ENGINES"]


-class _BaseSearchEngine(object):
+class _BaseSearchEngine:
     """Base class for a simple search engine interface."""

     name = "Base"
@@ -95,7 +93,7 @@ class BingSearchEngine(_BaseSearchEngine):
     name = "Bing"

     def __init__(self, cred, opener):
-        super(BingSearchEngine, self).__init__(cred, opener)
+        super().__init__(cred, opener)

         key = self.cred["key"]
         auth = (key + ":" + key).encode("base64").replace("\n", "")
@@ -170,60 +168,6 @@ class GoogleSearchEngine(_BaseSearchEngine):
         return []


-class YahooBOSSSearchEngine(_BaseSearchEngine):
-    """A search engine interface with Yahoo! BOSS."""
-    name = "Yahoo! BOSS"
-
-    @staticmethod
-    def _build_url(base, params):
-        """Works like urllib.urlencode(), but uses %20 for spaces over +."""
-        enc = lambda s: quote(s.encode("utf8"), safe="")
-        args = ["=".join((enc(k), enc(v))) for k, v in params.iteritems()]
-        return base + "?" + "&".join(args)
-
-    @staticmethod
-    def requirements():
-        return ["oauth2"]
-
-    def search(self, query):
-        """Do a Yahoo! BOSS web search for *query*.
-
-        Returns a list of URLs ranked by relevance (as determined by Yahoo).
-        Raises :py:exc:`~earwigbot.exceptions.SearchQueryError` on errors.
-        """
-        key, secret = self.cred["key"], self.cred["secret"]
-        consumer = oauth.Consumer(key=key, secret=secret)
-        url = "http://yboss.yahooapis.com/ysearch/web"
-        params = {
-            "oauth_version": oauth.OAUTH_VERSION,
-            "oauth_nonce": oauth.generate_nonce(),
-            "oauth_timestamp": oauth.Request.make_timestamp(),
-            "oauth_consumer_key": consumer.key,
-            "q": '"' + query.encode("utf8") + '"',
-            "count": str(self.count),
-            "type": "html,text,pdf",
-            "format": "json",
-        }
-        req = oauth.Request(method="GET", url=url, parameters=params)
-        req.sign_request(oauth.SignatureMethod_HMAC_SHA1(), consumer, None)
-
-        result = self._open(self._build_url(url, req))
-        try:
-            res = loads(result)
-        except ValueError:
-            err = "Yahoo! BOSS Error: JSON could not be decoded"
-            raise SearchQueryError(err)
-
-        try:
-            results = res["bossresponse"]["web"]["results"]
-        except KeyError:
-            return []
-        return [result["url"] for result in results]
-
-
 class YandexSearchEngine(_BaseSearchEngine):
     """A search engine interface with Yandex Search."""
     name = "Yandex"
@@ -263,6 +207,5 @@ class YandexSearchEngine(_BaseSearchEngine):
 SEARCH_ENGINES = {
     "Bing": BingSearchEngine,
     "Google": GoogleSearchEngine,
-    "Yahoo! BOSS": YahooBOSSSearchEngine,
     "Yandex": YandexSearchEngine
 }
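With Yahoo! BOSS removed, `SEARCH_ENGINES` keeps acting as a simple name-to-class registry. A minimal sketch of how such a registry is typically consumed (the `get_engine` helper and stand-in classes here are illustrative, not EarwigBot's actual interface):

```python
class BaseEngine:
    """Stand-in for a search engine interface; real engines take credentials."""
    name = "Base"
    def __init__(self, cred):
        self.cred = cred

class BingEngine(BaseEngine):
    name = "Bing"

class YandexEngine(BaseEngine):
    name = "Yandex"

# Registry mapping the config-file engine name to its implementing class.
SEARCH_ENGINES = {cls.name: cls for cls in (BingEngine, YandexEngine)}

def get_engine(engine_name, cred):
    """Look up and instantiate an engine; reject names not in the registry."""
    try:
        cls = SEARCH_ENGINES[engine_name]
    except KeyError:
        raise ValueError("unknown search engine: {0}".format(engine_name))
    return cls(cred)

engine = get_engine("Yandex", cred={"key": "..."})
print(engine.name)  # Yandex
```

Removing an engine from the registry (as this hunk does) is enough to make stale config entries fail loudly at lookup time rather than at query time.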
@@ -25,17 +25,18 @@ import collections
 from collections import deque
 import functools
 from gzip import GzipFile
-from httplib import HTTPException
+from http.client import HTTPException
 from logging import getLogger
 from math import log
-from Queue import Empty, Queue
+from queue import Empty, Queue
 from socket import error as socket_error
-from StringIO import StringIO
+from io import StringIO
 from struct import error as struct_error
 from threading import Lock, Thread
 import time
-from urllib2 import build_opener, Request, URLError
-import urlparse
+from urllib.error import URLError
+import urllib.parse
+from urllib.request import build_opener, Request

 from earwigbot import importer
 from earwigbot.exceptions import ParserExclusionError, ParserRedirectError
@@ -72,7 +73,7 @@ def globalize(num_workers=8):
         return

     _global_queues = _CopyvioQueues()
-    for i in xrange(num_workers):
+    for i in range(num_workers):
         worker = _CopyvioWorker("global-{0}".format(i), _global_queues)
         worker.start()
         _global_workers.append(worker)
@@ -91,14 +92,14 @@ def localize():
     if not _is_globalized:
         return

-    for i in xrange(len(_global_workers)):
+    for i in range(len(_global_workers)):
         _global_queues.unassigned.put((StopIteration, None))
     _global_queues = None
     _global_workers = []
     _is_globalized = False


-class _CopyvioQueues(object):
+class _CopyvioQueues:
     """Stores data necessary to maintain the various queues during a check."""

     def __init__(self):
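The `localize()` hunk above shuts down the global worker pool by feeding one `(StopIteration, None)` sentinel per worker into the shared queue, the standard poison-pill pattern for thread pools. A minimal standalone sketch of that pattern (names here are illustrative, not the bot's API):

```python
import threading
from queue import Queue

SENTINEL = (StopIteration, None)  # one poison pill per worker

def worker(tasks, results):
    while True:
        site, payload = tasks.get()
        if site is StopIteration:  # poison pill: exit the loop
            break
        results.append((site, payload))

tasks, results = Queue(), []
threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(4)]
for t in threads:
    t.start()

for item in [("en", 1), ("de", 2), ("fr", 3)]:
    tasks.put(item)
for _ in threads:   # one sentinel per worker guarantees every thread exits
    tasks.put(SENTINEL)
for t in threads:
    t.join()

print(sorted(results))
```

Because `queue.Queue.get()` hands each sentinel to exactly one consumer, putting exactly `len(threads)` sentinels is what makes the shutdown deterministic.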
@@ -107,7 +108,7 @@ class _CopyvioQueues(object): | |||||
self.unassigned = Queue() | self.unassigned = Queue() | ||||
class _CopyvioWorker(object): | |||||
class _CopyvioWorker: | |||||
"""A multithreaded URL opener/parser instance.""" | """A multithreaded URL opener/parser instance.""" | ||||
def __init__(self, name, queues, until=None): | def __init__(self, name, queues, until=None): | ||||
@@ -149,8 +150,8 @@ class _CopyvioWorker(object): | |||||
None will be returned for URLs that cannot be read for whatever reason. | None will be returned for URLs that cannot be read for whatever reason. | ||||
""" | """ | ||||
parsed = urlparse.urlparse(url) | |||||
if not isinstance(url, unicode): | |||||
parsed = urllib.parse.urlparse(url) | |||||
if not isinstance(url, str): | |||||
url = url.encode("utf8") | url = url.encode("utf8") | ||||
extra_headers = {} | extra_headers = {} | ||||
url, _ = self._try_map_proxy_url(url, parsed, extra_headers) | url, _ = self._try_map_proxy_url(url, parsed, extra_headers) | ||||
@@ -251,7 +252,7 @@ class _CopyvioWorker(object): | |||||
site, queue = self._queues.unassigned.get(timeout=timeout) | site, queue = self._queues.unassigned.get(timeout=timeout) | ||||
if site is StopIteration: | if site is StopIteration: | ||||
raise StopIteration | raise StopIteration | ||||
self._logger.debug(u"Acquired new site queue: {0}".format(site)) | |||||
self._logger.debug("Acquired new site queue: {0}".format(site)) | |||||
self._site = site | self._site = site | ||||
self._queue = queue | self._queue = queue | ||||
@@ -260,7 +261,7 @@ class _CopyvioWorker(object): | |||||
if not self._site: | if not self._site: | ||||
self._acquire_new_site() | self._acquire_new_site() | ||||
logmsg = u"Fetching source URL from queue {0}" | |||||
logmsg = "Fetching source URL from queue {0}" | |||||
self._logger.debug(logmsg.format(self._site)) | self._logger.debug(logmsg.format(self._site)) | ||||
self._queues.lock.acquire() | self._queues.lock.acquire() | ||||
try: | try: | ||||
@@ -273,7 +274,7 @@ class _CopyvioWorker(object): | |||||
self._queues.lock.release() | self._queues.lock.release() | ||||
return self._dequeue() | return self._dequeue() | ||||
self._logger.debug(u"Got source URL: {0}".format(source.url)) | |||||
self._logger.debug("Got source URL: {0}".format(source.url)) | |||||
if source.skipped: | if source.skipped: | ||||
self._logger.debug("Source has been skipped") | self._logger.debug("Source has been skipped") | ||||
self._queues.lock.release() | self._queues.lock.release() | ||||
@@ -331,7 +332,7 @@ class _CopyvioWorker(object): | |||||
thread.start() | thread.start() | ||||
class CopyvioWorkspace(object): | |||||
class CopyvioWorkspace: | |||||
"""Manages a single copyvio check distributed across threads.""" | """Manages a single copyvio check distributed across threads.""" | ||||
def __init__(self, article, min_confidence, max_time, logger, headers, | def __init__(self, article, min_confidence, max_time, logger, headers, | ||||
@@ -359,7 +360,7 @@ class CopyvioWorkspace(object): | |||||
else: | else: | ||||
self._queues = _CopyvioQueues() | self._queues = _CopyvioQueues() | ||||
self._num_workers = num_workers | self._num_workers = num_workers | ||||
for i in xrange(num_workers): | |||||
for i in range(num_workers): | |||||
name = "local-{0:04}.{1}".format(id(self) % 10000, i) | name = "local-{0:04}.{1}".format(id(self) % 10000, i) | ||||
_CopyvioWorker(name, self._queues, self._until).start() | _CopyvioWorker(name, self._queues, self._until).start() | ||||
@@ -371,7 +372,7 @@ class CopyvioWorkspace(object): | |||||
# reaches the default "suspect" confidence threshold, at which | # reaches the default "suspect" confidence threshold, at which | ||||
# point it transitions to polynomial growth with a limit of 1 as | # point it transitions to polynomial growth with a limit of 1 as | ||||
# (delta / article) approaches 1. | # (delta / article) approaches 1. | ||||
# A graph can be viewed here: http://goo.gl/mKPhvr | |||||
# A graph can be viewed here: https://goo.gl/mKPhvr | |||||
ratio = delta / article | ratio = delta / article | ||||
if ratio <= 0.52763: | if ratio <= 0.52763: | ||||
return -log(1 - ratio) | return -log(1 - ratio) | ||||
@@ -383,7 +384,7 @@ class CopyvioWorkspace(object): | |||||
# This piecewise function was derived from experimental data using | # This piecewise function was derived from experimental data using | ||||
# reference points at (0, 0), (100, 0.5), (250, 0.75), (500, 0.9), | # reference points at (0, 0), (100, 0.5), (250, 0.75), (500, 0.9), | ||||
# and (1000, 0.95), with a limit of 1 as delta approaches infinity. | # and (1000, 0.95), with a limit of 1 as delta approaches infinity. | ||||
# A graph can be viewed here: http://goo.gl/lVl7or | |||||
# A graph can be viewed here: https://goo.gl/lVl7or | |||||
if delta <= 100: | if delta <= 100: | ||||
return delta / (delta + 100) | return delta / (delta + 100) | ||||
elif delta <= 250: | elif delta <= 250: | ||||
@@ -417,22 +418,22 @@ class CopyvioWorkspace(object): | |||||
self.sources.append(source) | self.sources.append(source) | ||||
if self._exclude_check and self._exclude_check(url): | if self._exclude_check and self._exclude_check(url): | ||||
self._logger.debug(u"enqueue(): exclude {0}".format(url)) | |||||
self._logger.debug("enqueue(): exclude {0}".format(url)) | |||||
source.excluded = True | source.excluded = True | ||||
source.skip() | source.skip() | ||||
continue | continue | ||||
if self._short_circuit and self.finished: | if self._short_circuit and self.finished: | ||||
self._logger.debug(u"enqueue(): auto-skip {0}".format(url)) | |||||
self._logger.debug("enqueue(): auto-skip {0}".format(url)) | |||||
source.skip() | source.skip() | ||||
continue | continue | ||||
try: | try: | ||||
key = tldextract.extract(url).registered_domain | key = tldextract.extract(url).registered_domain | ||||
except ImportError: # Fall back on very naive method | except ImportError: # Fall back on very naive method | ||||
from urlparse import urlparse | |||||
key = u".".join(urlparse(url).netloc.split(".")[-2:]) | |||||
from urllib.parse import urlparse | |||||
key = ".".join(urlparse(url).netloc.split(".")[-2:]) | |||||
logmsg = u"enqueue(): {0} {1} -> {2}" | |||||
logmsg = "enqueue(): {0} {1} -> {2}" | |||||
if key in self._queues.sites: | if key in self._queues.sites: | ||||
self._logger.debug(logmsg.format("append", key, url)) | self._logger.debug(logmsg.format("append", key, url)) | ||||
self._queues.sites[key].append(source) | self._queues.sites[key].append(source) | ||||
@@ -449,7 +450,7 @@ class CopyvioWorkspace(object): | |||||
conf = self._calculate_confidence(delta) | conf = self._calculate_confidence(delta) | ||||
else: | else: | ||||
conf = 0.0 | conf = 0.0 | ||||
self._logger.debug(u"compare(): {0} -> {1}".format(source.url, conf)) | |||||
self._logger.debug("compare(): {0} -> {1}".format(source.url, conf)) | |||||
with self._finish_lock: | with self._finish_lock: | ||||
if source_chain: | if source_chain: | ||||
source.update(conf, source_chain, delta) | source.update(conf, source_chain, delta) | ||||
@@ -468,7 +469,7 @@ class CopyvioWorkspace(object): | |||||
with self._finish_lock: | with self._finish_lock: | ||||
pass # Wait for any remaining comparisons to be finished | pass # Wait for any remaining comparisons to be finished | ||||
if not _is_globalized: | if not _is_globalized: | ||||
for i in xrange(self._num_workers): | |||||
for i in range(self._num_workers): | |||||
self._queues.unassigned.put((StopIteration, None)) | self._queues.unassigned.put((StopIteration, None)) | ||||
def get_result(self, num_queries=0): | def get_result(self, num_queries=0): | ||||
@@ -24,7 +24,7 @@ from hashlib import md5
 from logging import getLogger, NullHandler
 import re
 from time import gmtime, strftime
-from urllib import quote
+from urllib.parse import quote
 
 import mwparserfromhell
@@ -94,7 +94,7 @@ class Page(CopyvioMixIn):
         __init__() will not do any API queries, but it will use basic namespace
         logic to determine our namespace ID and if we are a talkpage.
         """
-        super(Page, self).__init__(site)
+        super().__init__(site)
         self._site = site
         self._title = title.strip()
         self._follow_redirects = self._keep_following = follow_redirects
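Python 3's zero-argument `super()` resolves the current class and instance automatically, so `super(Page, self).__init__(site)` collapses to `super().__init__(site)` here and in the test cases later in this diff. A minimal sketch of the two spellings (class names are illustrative):

```python
class Base:
    def __init__(self, site):
        self.site = site

class Py2Style(Base):
    def __init__(self, site):
        # Python 2 form (still legal in 3): name the class and instance explicitly
        super(Py2Style, self).__init__(site)

class Py3Style(Base):
    def __init__(self, site):
        # Python 3 only: the compiler supplies the class and instance
        super().__init__(site)

assert Py2Style("enwiki").site == Py3Style("enwiki").site == "enwiki"
```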
@@ -157,7 +157,7 @@ class Page(CopyvioMixIn):
         contains "[") it will always be invalid, and cannot be edited.
         """
         if self._exists == self.PAGE_INVALID:
-            e = u"Page '{0}' is invalid.".format(self._title)
+            e = "Page '{0}' is invalid.".format(self._title)
             raise exceptions.InvalidPageError(e)
 
     def _assert_existence(self):
@@ -169,7 +169,7 @@ class Page(CopyvioMixIn):
         """
         self._assert_validity()
         if self._exists == self.PAGE_MISSING:
-            e = u"Page '{0}' does not exist.".format(self._title)
+            e = "Page '{0}' does not exist.".format(self._title)
             raise exceptions.PageNotFoundError(e)
 
     def _load(self):
@@ -217,11 +217,11 @@ class Page(CopyvioMixIn):
             self._exists = self.PAGE_INVALID
             return
 
-        res = result["query"]["pages"].values()[0]
+        res = list(result["query"]["pages"].values())[0]
         self._title = res["title"]  # Normalize our pagename/title
         self._is_redirect = "redirect" in res
 
-        self._pageid = int(result["query"]["pages"].keys()[0])
+        self._pageid = int(list(result["query"]["pages"].keys())[0])
         if self._pageid < 0:
             if "missing" in res:
                 # If it has a negative ID and it's missing; we can still get
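The `list(...)` wrappers are needed because Python 3's `dict.keys()` and `dict.values()` return view objects, which can be iterated but not indexed. A quick illustration with a dict shaped like the API's `result["query"]["pages"]` (the key and title are made-up sample data):

```python
pages = {"12345": {"title": "Example page"}}

# Python 2: pages.values()[0] worked because values() returned a list.
# Python 3: views aren't indexable, so materialize them first...
res = list(pages.values())[0]
# ...or take the first item lazily, which avoids building a list:
first = next(iter(pages.values()))
assert res is first

pageid = int(list(pages.keys())[0])
assert pageid == 12345
```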
@@ -267,7 +267,7 @@ class Page(CopyvioMixIn):
                                      rvprop="content|timestamp", rvslots="main",
                                      titles=self._title)
-        res = result["query"]["pages"].values()[0]
+        res = list(result["query"]["pages"].values())[0]
         try:
             revision = res["revisions"][0]
             self._content = revision["slots"]["main"]["*"]
@@ -322,7 +322,7 @@ class Page(CopyvioMixIn):
     def _build_edit_params(self, text, summary, minor, bot, force, section,
                            captcha_id, captcha_word):
         """Given some keyword arguments, build an API edit query string."""
-        unitxt = text.encode("utf8") if isinstance(text, unicode) else text
+        unitxt = text.encode("utf8") if isinstance(text, str) else text
         hashed = md5(unitxt).hexdigest()  # Checksum to ensure text is correct
         params = {
             "action": "edit", "title": self._title, "text": text,
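The `unicode` → `str` check matters here because `hashlib.md5` only accepts bytes in Python 3: text must be encoded before hashing, while already-encoded bytes pass through. A standalone sketch of the checksum logic (the function name is illustrative):

```python
from hashlib import md5

def edit_checksum(text):
    # Encode str to UTF-8 bytes; bytes pass through unchanged (as in the diff).
    unitxt = text.encode("utf8") if isinstance(text, str) else text
    return md5(unitxt).hexdigest()

# A str and its UTF-8 encoding hash identically:
assert edit_checksum("café") == edit_checksum("café".encode("utf8"))
```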
@@ -455,7 +455,7 @@ class Page(CopyvioMixIn):
         encoded = self._title.encode("utf8").replace(" ", "_")
         slug = quote(encoded, safe="/:").decode("utf8")
         path = self.site._article_path.replace("$1", slug)
-        return u"".join((self.site.url, path))
+        return "".join((self.site.url, path))
 
     @property
     def namespace(self):
@@ -546,7 +546,7 @@ class Page(CopyvioMixIn):
         """
         if self._namespace < 0:
             ns = self.site.namespace_id_to_name(self._namespace)
-            e = u"Pages in the {0} namespace can't have talk pages.".format(ns)
+            e = "Pages in the {0} namespace can't have talk pages.".format(ns)
             raise exceptions.InvalidPageError(e)
         if self._is_talkpage:
@@ -564,7 +564,7 @@ class Page(CopyvioMixIn):
         # If the new page is in namespace 0, don't do ":Title" (it's correct,
         # but unnecessary), just do "Title":
         if new_prefix:
-            new_title = u":".join((new_prefix, body))
+            new_title = ":".join((new_prefix, body))
         else:
             new_title = body
@@ -20,14 +20,13 @@
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 # SOFTWARE.
 
-from cookielib import CookieJar
+from http.cookiejar import CookieJar
 from json import dumps
 from logging import getLogger, NullHandler
 from os.path import expanduser
 from threading import RLock
 from time import sleep, time
-from urllib import unquote_plus
-from urlparse import urlparse
+from urllib.parse import unquote_plus, urlparse
 
 import requests
 from requests_oauthlib import OAuth1
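Several standard-library modules used here were renamed or merged in Python 3: `cookielib` became `http.cookiejar`, and `urllib`/`urlparse` were consolidated into `urllib.parse`, which is why the two old imports collapse into one line. A quick check that the functions behave as the code expects:

```python
from http.cookiejar import CookieJar  # was: from cookielib import CookieJar
from urllib.parse import unquote_plus, urlparse  # was split across urllib/urlparse

jar = CookieJar()  # Starts empty; the bot loads persisted cookies into it
assert len(jar) == 0
assert unquote_plus("foo+bar%21") == "foo bar!"
assert urlparse("https://en.wikipedia.org/w/api.php").netloc == "en.wikipedia.org"
```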
@@ -42,7 +41,7 @@ oursql = importer.new("oursql")
 
 __all__ = ["Site"]
 
-class Site(object):
+class Site:
     """
     **EarwigBot: Wiki Toolset: Site**
@@ -208,9 +207,9 @@ class Site(object):
     def _unicodeify(self, value, encoding="utf8"):
         """Return input as unicode if it's not unicode to begin with."""
-        if isinstance(value, unicode):
+        if isinstance(value, str):
             return value
-        return unicode(value, encoding)
+        return str(value, encoding)
 
     def _api_query(self, params, tries=0, wait=5, ignore_maxlag=False,
                    no_assert=False, ae_retry=True):
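This rename works because Python 3's `str` plays the role of Python 2's `unicode`, and `str(value, encoding)` decodes bytes exactly as `unicode(value, encoding)` did. A sketch of the same helper as a standalone function:

```python
def unicodeify(value, encoding="utf8"):
    # Text passes through; bytes are decoded
    # (str(b, enc) is equivalent to b.decode(enc)).
    if isinstance(value, str):
        return value
    return str(value, encoding)

assert unicodeify("Ärticle") == "Ärticle"
assert unicodeify(b"\xc3\x84rticle") == "Ärticle"
```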
@@ -304,7 +303,7 @@ class Site(object):
             info = res["error"]["info"]
         except (TypeError, KeyError):  # If there's no error code/info, return
             if "query" in res and "tokens" in res["query"]:
-                for name, token in res["query"]["tokens"].iteritems():
+                for name, token in res["query"]["tokens"].items():
                     self._tokens[name.split("token")[0]] = token
             return res
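`dict.iteritems()` no longer exists in Python 3; `items()` now returns a lazy view, so it is the direct replacement used throughout this diff. Illustrated with a token dict shaped like the API response (token values are made up):

```python
tokens = {"csrftoken": "abc+\\", "watchtoken": "def+\\"}

# Python 2 would use tokens.iteritems() to avoid building a list;
# Python 3's items() is already a lazy view, so no special method is needed.
cache = {}
for name, token in tokens.items():
    cache[name.split("token")[0]] = token

assert cache == {"csrf": "abc+\\", "watch": "def+\\"}
```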
@@ -574,7 +573,7 @@ class Site(object):
         establish a connection.
         """
         args = self._sql_data
-        for key, value in kwargs.iteritems():
+        for key, value in kwargs.items():
             args[key] = value
         if "read_default_file" not in args and "user" not in args and "passwd" not in args:
             args["read_default_file"] = expanduser("~/.my.cnf")
@@ -588,7 +587,7 @@ class Site(object):
         try:
             self._sql_conn = oursql.connect(**args)
         except ImportError:
-            e = "SQL querying requires the 'oursql' package: http://packages.python.org/oursql/"
+            e = "SQL querying requires the 'oursql' package: https://pythonhosted.org/oursql/"
             raise exceptions.SQLError(e)
 
     def _get_service_order(self):
@@ -743,7 +742,7 @@ class Site(object):
         See :py:meth:`_sql_connect` for information on how a connection is
         acquired. Also relevant is `oursql's documentation
-        <http://packages.python.org/oursql>`_ for details on that package.
+        <https://pythonhosted.org/oursql/>`_ for details on that package.
         """
         if not cursor_class:
             if dict_cursor:
@@ -865,7 +864,7 @@ class Site(object):
             if lname in lnames:
                 return ns_id
 
-        e = u"There is no namespace with name '{0}'.".format(name)
+        e = "There is no namespace with name '{0}'.".format(name)
         raise exceptions.NamespaceNotFoundError(e)
 
     def get_page(self, title, follow_redirects=False, pageid=None):
@@ -900,7 +899,7 @@ class Site(object):
         """
         catname = self._unicodeify(catname)
         prefix = self.namespace_id_to_name(constants.NS_CATEGORY)
-        pagename = u':'.join((prefix, catname))
+        pagename = ':'.join((prefix, catname))
         return Category(self, pagename, follow_redirects, pageid, self._logger)
 
     def get_user(self, username=None):
@@ -21,8 +21,8 @@
 # SOFTWARE.
 
 from collections import OrderedDict
-from cookielib import LWPCookieJar, LoadError
 import errno
+from http.cookiejar import LWPCookieJar, LoadError
 from os import chmod, path
 from platform import python_version
 import stat
@@ -35,7 +35,7 @@ from earwigbot.wiki.site import Site
 
 __all__ = ["SitesDB"]
 
-class SitesDB(object):
+class SitesDB:
     """
     **EarwigBot: Wiki Toolset: Sites Database Manager**
@@ -207,8 +207,8 @@ class SitesDB(object):
         if not sql:
             sql = config.wiki.get("sql", OrderedDict()).copy()
-            for key, value in sql.iteritems():
-                if isinstance(value, basestring) and "$1" in value:
+            for key, value in sql.items():
+                if isinstance(value, str) and "$1" in value:
                     sql[key] = value.replace("$1", name)
 
         return Site(name=name, project=project, lang=lang, base_url=base_url,
@@ -257,9 +257,9 @@ class SitesDB(object):
         name = site.name
         sites_data = (name, site.project, site.lang, site._base_url,
                       site._article_path, site._script_path)
-        sql_data = [(name, key, val) for key, val in site._sql_data.iteritems()]
+        sql_data = [(name, key, val) for key, val in site._sql_data.items()]
         ns_data = []
-        for ns_id, ns_names in site._namespaces.iteritems():
+        for ns_id, ns_names in site._namespaces.items():
             ns_data.append((name, ns_id, ns_names.pop(0), True))
             for ns_name in ns_names:
                 ns_data.append((name, ns_id, ns_name, False))
@@ -30,7 +30,7 @@ from earwigbot.wiki.page import Page
 
 __all__ = ["User"]
 
-class User(object):
+class User:
     """
     **EarwigBot: Wiki Toolset: User**
@@ -106,7 +106,7 @@ class User(object):
         if not hasattr(self, attr):
             self._load_attributes()
         if not self._exists:
-            e = u"User '{0}' does not exist.".format(self._name)
+            e = "User '{0}' does not exist.".format(self._name)
             raise UserNotFoundError(e)
         return getattr(self, attr)
@@ -143,7 +143,7 @@ class User(object):
         self._groups = res["groups"]
         try:
-            self._rights = res["rights"].values()
+            self._rights = list(res["rights"].values())
         except AttributeError:
             self._rights = res["rights"]
         self._editcount = res["editcount"]
@@ -1,7 +1,7 @@
 #! /usr/bin/env python
 # -*- coding: utf-8 -*-
 #
-# Copyright (C) 2009-2019 Ben Kurtovic <ben.kurtovic@gmail.com>
+# Copyright (C) 2009-2021 Ben Kurtovic <ben.kurtovic@gmail.com>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
@@ -26,31 +26,29 @@ from setuptools import setup, find_packages
 from earwigbot import __version__
 
 required_deps = [
-    "PyYAML >= 3.12",  # Parsing config files
-    "mwparserfromhell >= 0.5",  # Parsing wikicode for manipulation
-    "requests >= 2.21.0",  # Wiki API requests
-    "requests_oauthlib >= 1.2.0",  # API authentication via OAuth
+    "PyYAML >= 5.4.1",  # Parsing config files
+    "mwparserfromhell >= 0.6",  # Parsing wikicode for manipulation
+    "requests >= 2.25.1",  # Wiki API requests
+    "requests_oauthlib >= 1.3.0",  # API authentication via OAuth
 ]
 
 extra_deps = {
     "crypto": [
-        "py-bcrypt >= 0.4",  # Hashing the bot key in the config file
-        "pycrypto >= 2.6.1",  # Storing bot passwords + keys in the config file
+        "cryptography >= 3.4.7",  # Storing bot passwords + keys in the config file
     ],
     "sql": [
-        "oursql >= 0.9.3.2",  # Interfacing with MediaWiki databases
+        "oursql3 >= 0.9.4",  # Interfacing with MediaWiki databases
     ],
     "copyvios": [
-        "beautifulsoup4 >= 4.6.0",  # Parsing/scraping HTML
-        "cchardet >= 2.1.1",  # Encoding detection for BeautifulSoup
-        "lxml >= 3.8.0",  # Faster parser for BeautifulSoup
-        "nltk >= 3.2.4",  # Parsing sentences to split article content
-        "oauth2 >= 1.9.0",  # Interfacing with Yahoo! BOSS Search
-        "pdfminer >= 20140328",  # Extracting text from PDF files
-        "tldextract >= 2.1.0",  # Getting domains for the multithreaded workers
+        "beautifulsoup4 >= 4.9.3",  # Parsing/scraping HTML
+        "cchardet >= 2.1.7",  # Encoding detection for BeautifulSoup
+        "lxml >= 4.6.3",  # Faster parser for BeautifulSoup
+        "nltk >= 3.6.1",  # Parsing sentences to split article content
+        "pdfminer >= 20191125",  # Extracting text from PDF files
+        "tldextract >= 3.1.0",  # Getting domains for the multithreaded workers
     ],
     "time": [
-        "pytz >= 2017.2",  # Handling timezones for the !time IRC command
+        "pytz >= 2021.1",  # Handling timezones for the !time IRC command
     ],
 }
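These lists feed setuptools' `install_requires` and `extras_require`, so the optional feature groups install via pip's extras syntax (e.g. `pip install "earwigbot[sql,copyvios]"`). A minimal sketch of that wiring with abbreviated stand-ins for the lists above; the `all` extra is an illustrative addition, not part of the original file:

```python
# Abbreviated stand-ins for the full lists above:
required_deps = ["PyYAML >= 5.4.1"]
extra_deps = {"time": ["pytz >= 2021.1"]}

# An "all" extra collecting every optional group (illustrative, not in the file):
extra_deps["all"] = sorted({dep for deps in extra_deps.values() for dep in deps})

# These dicts map onto the setup() call as:
#   setup(..., install_requires=required_deps, extras_require=extra_deps)
# after which users can run, e.g.:  pip install "earwigbot[copyvios,sql]"
```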
@@ -81,7 +79,7 @@ setup(
         "License :: OSI Approved :: MIT License",
         "Natural Language :: English",
         "Operating System :: OS Independent",
-        "Programming Language :: Python :: 2.7",
+        "Programming Language :: Python :: 3",
         "Topic :: Communications :: Chat :: Internet Relay Chat",
         "Topic :: Internet :: WWW/HTTP"
     ],
@@ -28,7 +28,7 @@ from tests import CommandTestCase
 class TestCalc(CommandTestCase):
 
     def setUp(self):
-        super(TestCalc, self).setUp(Command)
+        super().setUp(Command)
 
     def test_check(self):
         self.assertFalse(self.command.check(self.make_msg("bloop")))
@@ -28,7 +28,7 @@ from tests import CommandTestCase
 class TestTest(CommandTestCase):
 
     def setUp(self):
-        super(TestTest, self).setUp(Command)
+        super().setUp(Command)
 
     def test_check(self):
         self.assertFalse(self.command.check(self.make_msg("bloop")))
@@ -42,7 +42,7 @@ class TestTest(CommandTestCase):
         self.command.process(self.make_msg("test"))
         self.assertSaidIn(["Hey \x02Foo\x0F!", "'sup \x02Foo\x0F?"])
 
-        for i in xrange(64):
+        for i in range(64):
             test()
 
 if __name__ == "__main__":