Browse Source

Merge develop into master (release/0.3)

legacy-python2
Ben Kurtovic 5 years ago
parent
commit
f47dfee2fa
32 changed files with 1395 additions and 632 deletions
  1. +22
    -0
      CHANGELOG
  2. +1
    -1
      LICENSE
  3. +3
    -3
      README.rst
  4. +3
    -3
      docs/conf.py
  5. +7
    -0
      docs/customizing.rst
  6. +1
    -1
      docs/index.rst
  7. +2
    -2
      docs/installation.rst
  8. +4
    -4
      earwigbot/__init__.py
  9. +2
    -1
      earwigbot/bot.py
  10. +179
    -0
      earwigbot/commands/cidr.py
  11. +3
    -3
      earwigbot/commands/lag.py
  12. +37
    -21
      earwigbot/commands/notes.py
  13. +292
    -219
      earwigbot/commands/remind.py
  14. +11
    -7
      earwigbot/commands/stalk.py
  15. +4
    -4
      earwigbot/commands/threads.py
  16. +9
    -5
      earwigbot/config/script.py
  17. +29
    -4
      earwigbot/irc/connection.py
  18. +12
    -6
      earwigbot/irc/data.py
  19. +29
    -6
      earwigbot/irc/frontend.py
  20. +26
    -11
      earwigbot/managers.py
  21. +4
    -0
      earwigbot/tasks/__init__.py
  22. +229
    -99
      earwigbot/tasks/wikiproject_tagger.py
  23. +24
    -18
      earwigbot/wiki/copyvios/__init__.py
  24. +13
    -11
      earwigbot/wiki/copyvios/exclusions.py
  25. +27
    -23
      earwigbot/wiki/copyvios/markov.py
  26. +83
    -31
      earwigbot/wiki/copyvios/parsers.py
  27. +170
    -23
      earwigbot/wiki/copyvios/search.py
  28. +26
    -20
      earwigbot/wiki/copyvios/workers.py
  29. +10
    -8
      earwigbot/wiki/page.py
  30. +112
    -82
      earwigbot/wiki/site.py
  31. +9
    -6
      earwigbot/wiki/sitesdb.py
  32. +12
    -10
      setup.py

+ 22
- 0
CHANGELOG View File

@@ -1,3 +1,25 @@
v0.3 (released March 24, 2019):

- Added various new features to the WikiProjectTagger task.
- Copyvio detector: improved sentence splitting algorithm; many performance
improvements.
- Improved config file command/task exclusion logic.
- Wiki: Added logging for warnings.
- Wiki: Added OAuth support.
- Wiki: Switched to requests from urllib2.
- Wiki: Updated some deprecated API calls.
- Wiki: Fixed Page.toggle_talk() behavior on mainspace titles with colons.
- IRC > !cidr: Added; new command for calculating range blocks.
- IRC > !notes: Improved help and added aliases.
- IRC > !remind: Added !remind all. Fixed multithreading efficiency issues.
Improved time detection and argument parsing. Newly expired reminders are now
triggered on bot startup.
- IRC > !stalk: Allow regular expressions as page titles or usernames.
- IRC: Added a per-channel quiet config setting.
- IRC: Try not to join channels before NickServ auth has completed.
- IRC: Improved detection of maximum IRC message length.
- IRC: Improved some help commands.

v0.2 (released November 8, 2015):

- Added a new command syntax allowing the caller to redirect replies to another


+ 1
- 1
LICENSE View File

@@ -1,4 +1,4 @@
Copyright (C) 2009-2015 Ben Kurtovic <ben.kurtovic@gmail.com>
Copyright (C) 2009-2017 Ben Kurtovic <ben.kurtovic@gmail.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal


+ 3
- 3
README.rst View File

@@ -10,7 +10,7 @@ History
-------

Development began, based on the `Pywikipedia framework`_, in early 2009.
Approval for its fist task, a `copyright violation detector`_, was carried out
Approval for its first task, a `copyright violation detector`_, was carried out
in May, and the bot has been running consistently ever since (with the
exception of Jan/Feb 2011). It currently handles `several ongoing tasks`_
ranging from statistics generation to category cleanup, and on-demand tasks
@@ -36,7 +36,7 @@ setup.py test`` from the project's root directory. Note that some
tests require an internet connection, and others may take a while to run.
Coverage is currently rather incomplete.

Latest release (v0.2)
Latest release (v0.3)
~~~~~~~~~~~~~~~~~~~~~

EarwigBot is available from the `Python Package Index`_, so you can install the
@@ -47,7 +47,7 @@ some header files. For example, on Ubuntu, see `this StackOverflow post`_.

You can also install it from source [1]_ directly::

curl -Lo earwigbot.tgz https://github.com/earwig/earwigbot/tarball/v0.2
curl -Lo earwigbot.tgz https://github.com/earwig/earwigbot/tarball/v0.3
tar -xf earwigbot.tgz
cd earwig-earwigbot-*
python setup.py install


+ 3
- 3
docs/conf.py View File

@@ -41,16 +41,16 @@ master_doc = 'index'

# General information about the project.
project = u'EarwigBot'
copyright = u'2009-2015 Ben Kurtovic'
copyright = u'2009-2016 Ben Kurtovic'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = '0.2'
version = '0.3'
# The full version, including alpha/beta/rc tags.
release = '0.2'
release = '0.3'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.


+ 7
- 0
docs/customizing.rst View File

@@ -174,6 +174,13 @@ The bot has a wide selection of built-in commands and plugins to act as sample
code and/or to give ideas. Start with test_, and then check out chanops_ and
afc_status_ for some more complicated scripts.

By default, the bot loads every built-in and custom command available. You can
disable *all* built-in commands with the config entry
:py:attr:`config.commands["disable"]` set to ``True``, or a subset of commands
by setting it to a list of command class names or module names. If using the
former method, you can specifically enable certain built-in commands with
:py:attr:`config.commands["enable"]` set to a list of command module names.

Custom bot tasks
----------------



+ 1
- 1
docs/index.rst View File

@@ -1,4 +1,4 @@
EarwigBot v0.2 Documentation
EarwigBot v0.3 Documentation
============================

EarwigBot_ is a Python_ robot that edits Wikipedia_ and interacts with people


+ 2
- 2
docs/installation.rst View File

@@ -13,7 +13,7 @@ It's recommended to run the bot's unit tests before installing. Run
some tests require an internet connection, and others may take a while to run.
Coverage is currently rather incomplete.

Latest release (v0.2)
Latest release (v0.3)
---------------------

EarwigBot is available from the `Python Package Index`_, so you can install the
@@ -24,7 +24,7 @@ some header files. For example, on Ubuntu, see `this StackOverflow post`_.

You can also install it from source [1]_ directly::

curl -Lo earwigbot.tgz https://github.com/earwig/earwigbot/tarball/v0.2
curl -Lo earwigbot.tgz https://github.com/earwig/earwigbot/tarball/v0.3
tar -xf earwigbot.tgz
cd earwig-earwigbot-*
python setup.py install


+ 4
- 4
earwigbot/__init__.py View File

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2009-2015 Ben Kurtovic <ben.kurtovic@gmail.com>
# Copyright (C) 2009-2019 Ben Kurtovic <ben.kurtovic@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -30,9 +30,9 @@ details. This documentation is also available `online
"""

__author__ = "Ben Kurtovic"
__copyright__ = "Copyright (C) 2009-2015 Ben Kurtovic"
__copyright__ = "Copyright (C) 2009-2019 Ben Kurtovic"
__license__ = "MIT License"
__version__ = "0.2"
__version__ = "0.3"
__email__ = "ben.kurtovic@gmail.com"
__release__ = False

@@ -45,7 +45,7 @@ if not __release__:
commit_id = Repo(path).head.object.hexsha
return commit_id[:8]
try:
__version__ += "+git-" + _get_git_commit_id()
__version__ += "+" + _get_git_commit_id()
except Exception:
pass
finally:


+ 2
- 1
earwigbot/bot.py View File

@@ -150,7 +150,8 @@ class Bot(object):
component_names = self.config.components.keys()
skips = component_names + ["MainThread", "reminder", "irc:quit"]
for thread in enumerate_threads():
if thread.name not in skips and thread.is_alive():
if thread.is_alive() and not any(
thread.name.startswith(skip) for skip in skips):
tasks.append(thread.name)
if tasks:
log = "The following commands or tasks will be killed: {0}"


+ 179
- 0
earwigbot/commands/cidr.py View File

@@ -0,0 +1,179 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2009-2016 Ben Kurtovic <ben.kurtovic@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

from collections import namedtuple
import re
import socket
from socket import AF_INET, AF_INET6

from earwigbot.commands import Command

_IP = namedtuple("_IP", ["family", "ip", "size"])
_Range = namedtuple("_Range", [
"family", "range", "low", "high", "size", "addresses"])

class CIDR(Command):
"""Calculates the smallest CIDR range that encompasses a list of IP
addresses. Used to make range blocks."""
name = "cidr"
commands = ["cidr", "range", "rangeblock", "rangecalc", "blockcalc",
"iprange", "cdir"]

# https://www.mediawiki.org/wiki/Manual:$wgBlockCIDRLimit
LIMIT_IPv4 = 16
LIMIT_IPv6 = 19

def process(self, data):
if not data.args:
msg = ("Specify a list of IP addresses to calculate a CIDR range "
"for. For example, \x0306!{0} 192.168.0.3 192.168.0.15 "
"192.168.1.4\x0F or \x0306!{0} 2500:1:2:3:: "
"2500:1:2:3:dead:beef::\x0F.")
self.reply(data, msg.format(data.command))
return

try:
ips = [self._parse_ip(arg) for arg in data.args]
except ValueError as exc:
msg = "Can't parse IP address \x0302{0}\x0F."
self.reply(data, msg.format(exc.message))
return

if any(ip.family == AF_INET for ip in ips) and any(
ip.family == AF_INET6 for ip in ips):
msg = "Can't calculate a range for both IPv4 and IPv6 addresses."
self.reply(data, msg)
return

cidr = self._calculate_range(ips[0].family, ips)
descr = self._describe(cidr.family, cidr.size)

msg = ("Smallest CIDR range is \x02{0}\x0F, covering {1} from "
"\x0305{2}\x0F to \x0305{3}\x0F{4}.")
self.reply(data, msg.format(
cidr.range, cidr.addresses, cidr.low, cidr.high,
" (\x0304{0}\x0F)".format(descr) if descr else ""))

def _parse_ip(self, arg):
"""Converts an argument into an IP address object."""
arg = self._parse_arg(arg)
oldarg = arg
size = None
if "/" in arg:
arg, size = arg.split("/", 1)
try:
size = int(size, 10)
except ValueError:
raise ValueError(oldarg)
if size < 0 or size > 128:
raise ValueError(oldarg)

try:
ip = _IP(AF_INET, socket.inet_pton(AF_INET, arg), size)
except socket.error:
try:
return _IP(AF_INET6, socket.inet_pton(AF_INET6, arg), size)
except socket.error:
raise ValueError(oldarg)
if size > 32:
raise ValueError(oldarg)
return ip

def _parse_arg(self, arg):
"""Converts an argument into an IP address string."""
if "[[" in arg and "]]" in arg:
regex = r"\[\[\s*(?:User(?:\stalk)?:)?(.*?)(?:\|.*?)?\s*\]\]"
match = re.search(regex, arg, re.I)
if not match:
raise ValueError(arg)
arg = match.group(1)

if re.match(r"https?://", arg):
if "target=" in arg:
regex = r"target=(.*?)(?:&|$)"
elif "page=" in arg:
regex = r"page=(?:User(?:(?:\s|_)talk)?(?::|%3A))?(.*?)(?:&|$)"
elif re.search(r"Special(:|%3A)Contributions/", arg, re.I):
regex = r"Special(?:\:|%3A)Contributions/(.*?)(?:\&|\?|$)"
elif re.search(r"User((\s|_)talk)?(:|%3A)", arg, re.I):
regex = r"User(?:(?:\s|_)talk)?(?:\:|%3A)(.*?)(?:\&|\?|$)"
else:
raise ValueError(arg)
match = re.search(regex, arg, re.I)
if not match:
raise ValueError(arg)
arg = match.group(1)
return arg

def _calculate_range(self, family, ips):
"""Calculate the smallest CIDR range encompassing a list of IPs."""
bin_ips = ["".join(
bin(ord(octet))[2:].zfill(8) for octet in ip.ip) for ip in ips]
for i, ip in enumerate(ips):
if ip.size is not None:
suffix = "X" * (len(bin_ips[i]) - ip.size)
bin_ips[i] = bin_ips[i][:ip.size] + suffix

size = len(bin_ips[0])
for i in xrange(len(bin_ips[0])):
if any(ip[i] == "X" for ip in bin_ips) or (
any(ip[i] == "0" for ip in bin_ips) and
any(ip[i] == "1" for ip in bin_ips)):
size = i
break

bin_low = bin_ips[0][:size].ljust(len(bin_ips[0]), "0")
bin_high = bin_ips[0][:size].ljust(len(bin_ips[0]), "1")
low = self._format_bin(family, bin_low)
high = self._format_bin(family, bin_high)

return _Range(
family, low + "/" + str(size), low, high, size,
self._format_count(2 ** (len(bin_ips[0]) - size)))

@staticmethod
def _format_bin(family, binary):
"""Convert an IP's binary representation to presentation format."""
return socket.inet_ntop(family, "".join(
chr(int(binary[i:i + 8], 2)) for i in xrange(0, len(binary), 8)))

@staticmethod
def _format_count(count):
"""Nicely format a number of addresses affected by a range block."""
if count == 1:
return "1 address"
if count > 2 ** 32:
base = "{0:.2E} addresses".format(count)
if count == 2 ** 64:
return base + " (1 /64 subnet)"
if count > 2 ** 96:
return base + " ({0:.2E} /64 subnets)".format(count >> 64)
if count > 2 ** 63:
return base + " ({0:,} /64 subnets)".format(count >> 64)
return base
return "{0:,} addresses".format(count)

def _describe(self, family, size):
"""Return an optional English description of a range."""
if (family == AF_INET and size < self.LIMIT_IPv4) or (
family == AF_INET6 and size < self.LIMIT_IPv6):
return "too large to block"

+ 3
- 3
earwigbot/commands/lag.py View File

@@ -37,7 +37,7 @@ class Lag(Command):
msg = base.format(site.name, self.get_replag(site))
elif data.command == "maxlag":
base = "\x0302{0}\x0F: {1}."
msg = base.format(site.name, self.get_maxlag(site).capitalize())
msg = base.format(site.name, self.get_maxlag(site))
else:
base = "\x0302{0}\x0F: {1}; {2}."
msg = base.format(site.name, self.get_replag(site),
@@ -45,10 +45,10 @@ class Lag(Command):
self.reply(data, msg)

def get_replag(self, site):
return "replag is {0}".format(self.time(site.get_replag()))
return "SQL replag is {0}".format(self.time(site.get_replag()))

def get_maxlag(self, site):
return "database maxlag is {0}".format(self.time(site.get_maxlag()))
return "API maxlag is {0}".format(self.time(site.get_maxlag()))

def get_site(self, data):
if data.kwargs and "project" in data.kwargs and "lang" in data.kwargs:


+ 37
- 21
earwigbot/commands/notes.py View File

@@ -32,7 +32,19 @@ class Notes(Command):
"""A mini IRC-based wiki for storing notes, tips, and reminders."""
name = "notes"
commands = ["notes", "note", "about"]
version = 2
version = "2.1"

aliases = {
"all": "list",
"show": "read",
"get": "read",
"add": "edit",
"write": "edit",
"change": "edit",
"modify": "edit",
"move": "rename",
"remove": "delete"
}

def setup(self):
self._dbfile = path.join(self.config.root_dir, "notes.db")
@@ -50,14 +62,13 @@ class Notes(Command):
}

if not data.args:
msg = ("\x0302The Earwig Mini-Wiki\x0F: running v{0}. Subcommands "
"are: {1}. You can get help on any with '!{2} help subcommand'.")
cmnds = ", ".join((commands))
self.reply(data, msg.format(self.version, cmnds, data.command))
self.do_help(data)
return
command = data.args[0].lower()
if command in commands:
commands[command](data)
elif command in self.aliases:
commands[self.aliases[command]](data)
else:
msg = "Unknown subcommand: \x0303{0}\x0F.".format(command)
self.reply(data, msg)
@@ -83,8 +94,13 @@ class Notes(Command):
try:
command = data.args[1]
except IndexError:
self.reply(data, "Please specify a subcommand to get help on.")
msg = ("\x0302The Earwig Mini-Wiki\x0F: running v{0}. Subcommands "
"are: {1}. You can get help on any with '!{2} help subcommand'.")
cmnds = ", ".join((info.keys()))
self.reply(data, msg.format(self.version, cmnds, data.command))
return
if command in self.aliases:
command = self.aliases[command]
try:
help_ = re.sub(r"\s\s+", " ", info[command].replace("\n", ""))
self.reply(data, "\x0303{0}\x0F: ".format(command) + help_)
@@ -113,7 +129,7 @@ class Notes(Command):
INNER JOIN revisions ON entry_revision = rev_id
WHERE entry_slug = ?"""
try:
slug = self.slugify(data.args[1])
slug = self._slugify(data.args[1])
except IndexError:
self.reply(data, "Please specify an entry to read from.")
return
@@ -141,7 +157,7 @@ class Notes(Command):
query3 = "INSERT INTO entries VALUES (?, ?, ?, ?)"
query4 = "UPDATE entries SET entry_revision = ? WHERE entry_id = ?"
try:
slug = self.slugify(data.args[1])
slug = self._slugify(data.args[1])
except IndexError:
self.reply(data, "Please specify an entry to edit.")
return
@@ -157,17 +173,17 @@ class Notes(Command):
create = False
except sqlite.OperationalError:
id_, title, author = 1, data.args[1].decode("utf8"), data.host
self.create_db(conn)
self._create_db(conn)
except TypeError:
id_ = self.get_next_entry(conn)
id_ = self._get_next_entry(conn)
title, author = data.args[1].decode("utf8"), data.host
permdb = self.config.irc["permissions"]
if author != data.host and not permdb.is_admin(data):
msg = "You must be an author or a bot admin to edit this entry."
self.reply(data, msg)
return
revid = self.get_next_revision(conn)
userid = self.get_user(conn, data.host)
revid = self._get_next_revision(conn)
userid = self._get_user(conn, data.host)
now = datetime.utcnow().strftime("%b %d, %Y %H:%M:%S")
conn.execute(query2, (revid, id_, userid, now, content))
if create:
@@ -185,7 +201,7 @@ class Notes(Command):
INNER JOIN users ON rev_user = user_id
WHERE entry_slug = ?"""
try:
slug = self.slugify(data.args[1])
slug = self._slugify(data.args[1])
except IndexError:
self.reply(data, "Please specify an entry to get info on.")
return
@@ -221,7 +237,7 @@ class Notes(Command):
query2 = """UPDATE entries SET entry_slug = ?, entry_title = ?
WHERE entry_id = ?"""
try:
slug = self.slugify(data.args[1])
slug = self._slugify(data.args[1])
except IndexError:
self.reply(data, "Please specify an entry to rename.")
return
@@ -246,7 +262,7 @@ class Notes(Command):
msg = "You must be an author or a bot admin to rename this entry."
self.reply(data, msg)
return
args = (self.slugify(newtitle), newtitle.decode("utf8"), id_)
args = (self._slugify(newtitle), newtitle.decode("utf8"), id_)
conn.execute(query2, args)

msg = "Entry \x0302{0}\x0F renamed to \x0302{1}\x0F."
@@ -261,7 +277,7 @@ class Notes(Command):
query2 = "DELETE FROM entries WHERE entry_id = ?"
query3 = "DELETE FROM revisions WHERE rev_entry = ?"
try:
slug = self.slugify(data.args[1])
slug = self._slugify(data.args[1])
except IndexError:
self.reply(data, "Please specify an entry to delete.")
return
@@ -283,11 +299,11 @@ class Notes(Command):

self.reply(data, "Entry \x0302{0}\x0F deleted.".format(data.args[1]))

def slugify(self, name):
def _slugify(self, name):
"""Convert *name* into an identifier for storing in the database."""
return name.lower().replace("_", "").replace("-", "").decode("utf8")

def create_db(self, conn):
def _create_db(self, conn):
"""Initialize the notes database with its necessary tables."""
script = """
CREATE TABLE entries (entry_id, entry_slug, entry_title,
@@ -298,19 +314,19 @@ class Notes(Command):
"""
conn.executescript(script)

def get_next_entry(self, conn):
def _get_next_entry(self, conn):
"""Get the next entry ID."""
query = "SELECT MAX(entry_id) FROM entries"
later = conn.execute(query).fetchone()[0]
return later + 1 if later else 1

def get_next_revision(self, conn):
def _get_next_revision(self, conn):
"""Get the next revision ID."""
query = "SELECT MAX(rev_id) FROM revisions"
later = conn.execute(query).fetchone()[0]
return later + 1 if later else 1

def get_user(self, conn, host):
def _get_user(self, conn, host):
"""Get the user ID corresponding to a hostname, or make one."""
query1 = "SELECT user_id FROM users WHERE user_host = ?"
query2 = "SELECT MAX(user_id) FROM users"


+ 292
- 219
earwigbot/commands/remind.py View File

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2009-2015 Ben Kurtovic <ben.kurtovic@gmail.com>
# Copyright (C) 2009-2016 Ben Kurtovic <ben.kurtovic@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -21,7 +21,6 @@
# SOFTWARE.

import ast
from contextlib import contextmanager
from itertools import chain
import operator
import random
@@ -31,13 +30,24 @@ import time
from earwigbot.commands import Command
from earwigbot.irc import Data

DISPLAY = ["display", "show", "list", "info", "details"]
DISPLAY = ["display", "show", "info", "details"]
CANCEL = ["cancel", "stop", "delete", "del", "stop", "unremind", "forget",
"disregard"]
SNOOZE = ["snooze", "delay", "reset", "adjust", "modify", "change"]
SNOOZE_ONLY = ["snooze", "delay", "reset"]

def _format_time(epoch):
"""Format a UNIX timestamp nicely."""
lctime = time.localtime(epoch)
if lctime.tm_year == time.localtime().tm_year:
return time.strftime("%b %d %H:%M:%S %Z", lctime)
else:
return time.strftime("%b %d, %Y %H:%M:%S %Z", lctime)


class Remind(Command):
"""Set a message to be repeated to you in a certain amount of time."""
"""Set a message to be repeated to you in a certain amount of time. See
usage with !remind help."""
name = "remind"
commands = ["remind", "reminder", "reminders", "snooze", "cancel",
"unremind", "forget"]
@@ -49,8 +59,10 @@ class Remind(Command):
return "display"
if command in CANCEL:
return "cancel"
if command in SNOOZE:
if command in SNOOZE_ONLY:
return "snooze"
if command in SNOOZE: # "adjust" == snoozing active reminders
return "adjust"

@staticmethod
def _parse_time(arg):
@@ -77,26 +89,18 @@ class Remind(Command):
else:
raise ValueError(node)

if arg and arg[-1] in time_units:
factor, arg = time_units[arg[-1]], arg[:-1]
else:
factor = 1
for unit, factor in time_units.iteritems():
arg = arg.replace(unit, "*" + str(factor))

try:
parsed = int(_evaluate(ast.parse(arg, mode="eval").body) * factor)
parsed = int(_evaluate(ast.parse(arg, mode="eval").body))
except (SyntaxError, KeyError):
raise ValueError(arg)
if parsed <= 0:
raise ValueError(parsed)
return parsed

@contextmanager
def _db(self):
"""Return a threadsafe context manager for the permissions database."""
with self._db_lock:
yield self.config.irc["permissions"]

def _really_get_reminder_by_id(self, user, rid):
def _get_reminder_by_id(self, user, rid):
"""Return the _Reminder object that corresponds to a particular ID.

Raises IndexError on failure.
@@ -106,17 +110,6 @@ class Remind(Command):
raise IndexError(rid)
return [robj for robj in self.reminders[user] if robj.id == rid][0]

def _get_reminder_by_id(self, user, rid, data):
"""Return the _Reminder object that corresponds to a particular ID.

Sends an error message to the user on failure.
"""
try:
return self._really_get_reminder_by_id(user, rid)
except IndexError:
msg = "Couldn't find a reminder for \x0302{0}\x0F with ID \x0303{1}\x0F."
self.reply(data, msg.format(user, rid))

def _get_new_id(self):
"""Get a free ID for a new reminder."""
taken = set(robj.id for robj in chain(*self.reminders.values()))
@@ -125,13 +118,13 @@ class Remind(Command):

def _start_reminder(self, reminder, user):
"""Start the given reminder object for the given user."""
reminder.start()
if user in self.reminders:
self.reminders[user].append(reminder)
else:
self.reminders[user] = [reminder]
self._thread.add(reminder)

def _create_reminder(self, data, user):
def _create_reminder(self, data):
"""Create a new reminder for the given user."""
try:
wait = self._parse_time(data.args[0])
@@ -144,7 +137,6 @@ class Remind(Command):
msg = "Given time \x02{0}\x0F is too large. Keep it reasonable."
return self.reply(data, msg.format(data.args[0]))

end = time.time() + wait
message = " ".join(data.args[1:])
try:
rid = self._get_new_id()
@@ -152,8 +144,8 @@ class Remind(Command):
msg = "Couldn't set a new reminder: no free IDs available."
return self.reply(data, msg)

reminder = _Reminder(rid, user, wait, end, message, data, self)
self._start_reminder(reminder, user)
reminder = _Reminder(rid, data.host, wait, message, data, self)
self._start_reminder(reminder, data.host)
msg = "Set reminder \x0303{0}\x0F ({1})."
self.reply(data, msg.format(rid, reminder.end_time))

@@ -164,104 +156,85 @@ class Remind(Command):
reminder.message)
self.reply(data, msg)

def _cancel_reminder(self, data, user, reminder):
def _cancel_reminder(self, data, reminder):
"""Cancel a pending reminder."""
reminder.stop()
self.reminders[user].remove(reminder)
if not self.reminders[user]:
del self.reminders[user]
self._thread.remove(reminder)
self.unstore_reminder(reminder.id)
self.reminders[data.host].remove(reminder)
if not self.reminders[data.host]:
del self.reminders[data.host]
msg = "Reminder \x0303{0}\x0F canceled."
self.reply(data, msg.format(reminder.id))

def _snooze_reminder(self, data, reminder, arg=None):
"""Snooze a reminder to be re-triggered after a period of time."""
verb = "snoozed" if reminder.end < time.time() else "adjusted"
if arg:
try:
duration = self._parse_time(data.args[arg])
reminder.wait = duration
except (IndexError, ValueError):
pass

reminder.end = time.time() + reminder.wait
reminder.start()
end = time.strftime("%b %d %H:%M:%S %Z", time.localtime(reminder.end))
verb = "snoozed" if reminder.expired else "adjusted"
try:
duration = self._parse_time(arg) if arg else None
except ValueError:
duration = None

reminder.reset(duration)
end = _format_time(reminder.end)
msg = "Reminder \x0303{0}\x0F {1} until {2}."
self.reply(data, msg.format(reminder.id, verb, end))

def _load_reminders(self):
"""Load previously made reminders from the database."""
with self._db() as permdb:
try:
database = permdb.get_attr("command:remind", "data")
except KeyError:
return
permdb.set_attr("command:remind", "data", "[]")
permdb = self.config.irc["permissions"]
try:
database = permdb.get_attr("command:remind", "data")
except KeyError:
return
permdb.set_attr("command:remind", "data", "[]")

connect_wait = 30
for item in ast.literal_eval(database):
rid, user, wait, end, message, data = item
if end < time.time():
continue
if end < time.time() + connect_wait:
# Make reminders that have expired while the bot was offline
# trigger shortly after startup
end = time.time() + connect_wait
data = Data.unserialize(data)
reminder = _Reminder(rid, user, wait, end, message, data, self)
reminder = _Reminder(rid, user, wait, message, data, self, end)
self._start_reminder(reminder, user)

def _handle_command(self, command, data, user, reminder, arg=None):
"""Handle a reminder-processing subcommand."""
if command in DISPLAY:
self._display_reminder(data, reminder)
elif command in CANCEL:
self._cancel_reminder(data, user, reminder)
elif command in SNOOZE:
self._snooze_reminder(data, reminder, arg)
else:
msg = "Unknown action \x02{0}\x0F for reminder \x0303{1}\x0F."
self.reply(data, msg.format(command, reminder.id))

def _show_reminders(self, data, user):
def _show_reminders(self, data):
"""Show all of a user's current reminders."""
shorten = lambda s: (s[:37] + "..." if len(s) > 40 else s)
tmpl = '\x0303{0}\x0F ("{1}", {2})'
fmt = lambda robj: tmpl.format(robj.id, shorten(robj.message),
robj.end_time)

if user in self.reminders:
rlist = ", ".join(fmt(robj) for robj in self.reminders[user])
msg = "Your reminders: {0}.".format(rlist)
else:
msg = ("You have no reminders. Set one with \x0306!remind [time] "
"[message]\x0F. See also: \x0306!remind help\x0F.")
self.reply(data, msg)

def _process_snooze_command(self, data, user):
"""Process the !snooze command."""
if not data.args:
if user not in self.reminders:
self.reply(data, "You have no reminders to snooze.")
elif len(self.reminders[user]) == 1:
self._snooze_reminder(data, self.reminders[user][0])
else:
msg = "You have {0} reminders. Snooze which one?"
self.reply(data, msg.format(len(self.reminders[user])))
if data.host not in self.reminders:
self.reply(data, "You have no reminders. Set one with "
"\x0306!remind [time] [message]\x0F. See also: "
"\x0306!remind help\x0F.")
return
reminder = self._get_reminder_by_id(user, data.args[0], data)
if reminder:
self._snooze_reminder(data, reminder, 1)

def _process_cancel_command(self, data, user):
"""Process the !cancel, !unremind, and !forget commands."""
if not data.args:
if user not in self.reminders:
self.reply(data, "You have no reminders to cancel.")
elif len(self.reminders[user]) == 1:
self._cancel_reminder(data, user, self.reminders[user][0])
else:
msg = "You have {0} reminders. Cancel which one?"
self.reply(data, msg.format(len(self.reminders[user])))
shorten = lambda s: (s[:37] + "..." if len(s) > 40 else s)
dest = lambda data: (
"privately" if data.is_private else "in {0}".format(data.chan))
fmt = lambda robj: '\x0303{0}\x0F ("{1}" {2}, {3})'.format(
robj.id, shorten(robj.message), dest(robj.data), robj.end_time)

rlist = ", ".join(fmt(robj) for robj in self.reminders[data.host])
self.reply(data, "Your reminders: {0}.".format(rlist))

def _show_all_reminders(self, data):
"""Show all reminders to bot admins."""
if not self.config.irc["permissions"].is_admin(data):
self.reply(data, "You must be a bot admin to view other users' "
"reminders. View your own with "
"\x0306!reminders\x0F.")
return
reminder = self._get_reminder_by_id(user, data.args[0], data)
if reminder:
self._cancel_reminder(data, user, reminder)
if not self.reminders:
self.reply(data, "There are no active reminders.")
return

dest = lambda data: (
"privately" if data.is_private else "in {0}".format(data.chan))
fmt = lambda robj, user: '\x0303{0}\x0F (for {1} {2}, {3})'.format(
robj.id, user, dest(robj.data), robj.end_time)

rlist = (fmt(rem, user) for user, rems in self.reminders.iteritems()
for rem in rems)
self.reply(data, "All reminders: {0}.".format(", ".join(rlist)))

def _show_help(self, data):
"""Reply to the user with help for all major subcommands."""
@@ -271,122 +244,205 @@ class Remind(Command):
("Get info", "!remind [id]"),
("Cancel", "!remind cancel [id]"),
("Adjust", "!remind adjust [id] [time]"),
("Restart", "!snooze [id]")
("Restart", "!snooze [id] [time]"),
("Admin", "!remind all")
]
extra = "In most cases, \x0306[id]\x0F can be omitted if you have only one reminder."
extra = "The \x0306[id]\x0F can be omitted if you have only one reminder."
joined = " ".join("{0}: \x0306{1}\x0F.".format(k, v) for k, v in parts)
self.reply(data, joined + " " + extra)

def setup(self):
self.reminders = {}
self._db_lock = RLock()
self._load_reminders()

def process(self, data):
if data.command == "snooze":
return self._process_snooze_command(data, data.host)
if data.command in ["cancel", "unremind", "forget"]:
return self._process_cancel_command(data, data.host)
if not data.args:
return self._show_reminders(data, data.host)

def _dispatch_command(self, data, command, args):
"""Handle a reminder-processing subcommand."""
user = data.host
if len(data.args) == 1:
command = data.args[0]
if command == "help":
return self._show_help(data)
if command in DISPLAY + CANCEL + SNOOZE:
if user not in self.reminders:
msg = "You have no reminders to {0}."
self.reply(data, msg.format(self._normalize(command)))
elif len(self.reminders[user]) == 1:
reminder = self.reminders[user][0]
self._handle_command(command, data, user, reminder)
else:
msg = "You have {0} reminders. {1} which one?"
num = len(self.reminders[user])
command = self._normalize(command).capitalize()
self.reply(data, msg.format(num, command))
reminder = None
if args and args[0].upper().startswith("R"):
try:
reminder = self._get_reminder_by_id(user, args[0])
except IndexError:
msg = "Couldn't find a reminder for \x0302{0}\x0F with ID \x0303{1}\x0F."
self.reply(data, msg.format(user, args[0]))
return
reminder = self._get_reminder_by_id(user, data.args[0], data)
if reminder:
self._display_reminder(data, reminder)
args.pop(0)
elif user not in self.reminders:
msg = "You have no reminders to {0}."
self.reply(data, msg.format(self._normalize(command)))
return
elif len(self.reminders[user]) == 1:
reminder = self.reminders[user][0]
elif command in SNOOZE_ONLY: # Select most recent expired reminder
rmds = [rmd for rmd in self.reminders[user] if rmd.expired]
rmds.sort(key=lambda rmd: rmd.end)
if len(rmds) > 0:
reminder = rmds[-1]
elif command in SNOOZE or command in CANCEL: # Select only active one
rmds = [rmd for rmd in self.reminders[user] if not rmd.expired]
if len(rmds) == 1:
reminder = rmds[0]
if not reminder:
msg = "You have {0} reminders. {1} which one?"
num = len(self.reminders[user])
command = self._normalize(command).capitalize()
self.reply(data, msg.format(num, command))
return

if command in DISPLAY:
self._display_reminder(data, reminder)
elif command in CANCEL:
self._cancel_reminder(data, reminder)
elif command in SNOOZE:
self._snooze_reminder(data, reminder, args[0] if args else None)
else:
msg = "Unknown action \x02{0}\x0F for reminder \x0303{1}\x0F."
self.reply(data, msg.format(command, reminder.id))

def _process(self, data):
"""Main entry point."""
if data.command in SNOOZE + CANCEL:
return self._dispatch_command(data, data.command, data.args)
if not data.args:
return self._show_reminders(data)

if data.args[0] == "help":
return self._show_help(data)
if data.args[0] == "list":
return self._show_reminders(data)
if data.args[0] == "all":
return self._show_all_reminders(data)
if data.args[0] in DISPLAY + CANCEL + SNOOZE:
reminder = self._get_reminder_by_id(user, data.args[1], data)
if reminder:
self._handle_command(data.args[0], data, user, reminder, 2)
return
return self._dispatch_command(data, data.args[0], data.args[1:])

try:
reminder = self._really_get_reminder_by_id(user, data.args[0])
self._get_reminder_by_id(data.host, data.args[0])
except IndexError:
return self._create_reminder(data, user)
return self._create_reminder(data)
if len(data.args) == 1:
return self._dispatch_command(data, "display", data.args)
self._dispatch_command(
data, data.args[1], [data.args[0]] + data.args[2:])

self._handle_command(data.args[1], data, user, reminder, 2)
@property
def lock(self):
"""Return the reminder modification/access lock."""
return self._lock

def setup(self):
self.reminders = {}
self._lock = RLock()
self._thread = _ReminderThread(self._lock)
self._load_reminders()

def process(self, data):
with self.lock:
self._process(data)

def unload(self):
for reminder in chain(*self.reminders.values()):
reminder.stop(delete=False)
self._thread.stop()

def store_reminder(self, reminder):
"""Store a serialized reminder into the database."""
with self._db() as permdb:
try:
dump = permdb.get_attr("command:remind", "data")
except KeyError:
dump = "[]"
permdb = self.config.irc["permissions"]
try:
dump = permdb.get_attr("command:remind", "data")
except KeyError:
dump = "[]"

database = ast.literal_eval(dump)
database.append(reminder)
permdb.set_attr("command:remind", "data", str(database))
database = ast.literal_eval(dump)
database.append(reminder)
permdb.set_attr("command:remind", "data", str(database))

def unstore_reminder(self, rid):
"""Remove a reminder from the database by ID."""
with self._db() as permdb:
try:
dump = permdb.get_attr("command:remind", "data")
except KeyError:
dump = "[]"
permdb = self.config.irc["permissions"]
try:
dump = permdb.get_attr("command:remind", "data")
except KeyError:
dump = "[]"

database = ast.literal_eval(dump)
database = [item for item in database if item[0] != rid]
permdb.set_attr("command:remind", "data", str(database))


class _ReminderThread(object):
"""A single thread that handles reminders."""

def __init__(self, lock):
self._thread = None
self._abort = False
self._active = {}
self._lock = lock

def _running(self):
"""Return if the thread should still be running."""
return self._active and not self._abort

def _get_soonest(self):
"""Get the soonest reminder to trigger."""
return min(self._active.values(), key=lambda robj: robj.end)

def _get_ready_reminder(self):
"""Block until a reminder is ready to be triggered."""
while self._running():
if self._get_soonest().end <= time.time():
return self._get_soonest()
self._lock.release()
time.sleep(0.25)
self._lock.acquire()

def _callback(self):
"""Internal callback function to be executed by the reminder thread."""
with self._lock:
while True:
reminder = self._get_ready_reminder()
if not reminder:
break

if reminder.trigger():
del self._active[reminder.id]
self._thread = None

def _start(self):
"""Start the thread."""
self._thread = Thread(target=self._callback, name="reminder")
self._thread.daemon = True
self._thread.start()
self._abort = False

def add(self, reminder):
"""Add a reminder to the table of active reminders."""
self._active[reminder.id] = reminder
if not self._thread:
self._start()

def remove(self, reminder):
"""Remove a reminder from the table of active reminders."""
if reminder.id in self._active:
del self._active[reminder.id]
if not self._active:
self.stop()

def stop(self):
"""Stop the thread."""
if not self._thread:
return
self._abort = True
self._thread = None

database = ast.literal_eval(dump)
database = [item for item in database if item[0] != rid]
permdb.set_attr("command:remind", "data", str(database))

class _Reminder(object):
"""Represents a single reminder."""

def __init__(self, rid, user, wait, end, message, data, cmdobj):
def __init__(self, rid, user, wait, message, data, cmdobj, end=None):
self.id = rid
self.wait = wait
self.end = end
self.end = time.time() + wait if end is None else end
self.message = message

self._user = user
self._data = data
self._cmdobj = cmdobj
self._thread = None
self._expired = False

def _callback(self):
"""Internal callback function to be executed by the reminder thread."""
thread = self._thread
while time.time() < thread.end:
time.sleep(1)
if thread.abort:
return
self._cmdobj.reply(self._data, self.message)
self._delete()
for i in xrange(60):
time.sleep(1)
if thread.abort:
return
try:
self._cmdobj.reminders[self._user].remove(self)
if not self._cmdobj.reminders[self._user]:
del self._cmdobj.reminders[self._user]
except (KeyError, ValueError): # Already canceled by the user
pass
self._save()

def _save(self):
"""Save this reminder to the database."""
@@ -394,37 +450,54 @@ class _Reminder(object):
item = (self.id, self._user, self.wait, self.end, self.message, data)
self._cmdobj.store_reminder(item)

def _delete(self):
"""Remove this reminder from the database."""
def _fire(self):
"""Activate the reminder for the user."""
self._cmdobj.reply(self._data, self.message)
self._cmdobj.unstore_reminder(self.id)
self.end = time.time() + (60 * 60 * 24)
self._expired = True

def _finalize(self):
"""Clean up after a reminder has been expired for too long."""
try:
self._cmdobj.reminders[self._user].remove(self)
if not self._cmdobj.reminders[self._user]:
del self._cmdobj.reminders[self._user]
except (KeyError, ValueError): # Already canceled by the user
pass

@property
def data(self):
"""Return the IRC data object associated with this reminder."""
return self._data

@property
def end_time(self):
"""Return a string representing the end time of a reminder."""
if self.end >= time.time():
lctime = time.localtime(self.end)
if lctime.tm_year == time.localtime().tm_year:
ends = time.strftime("%b %d %H:%M:%S %Z", lctime)
else:
ends = time.strftime("%b %d, %Y %H:%M:%S %Z", lctime)
return "ends {0}".format(ends)
return "expired"

def start(self):
"""Start the reminder timer thread. Stops it if already running."""
self.stop()
self._thread = Thread(target=self._callback, name="remind-" + self.id)
self._thread.end = self.end
self._thread.daemon = True
self._thread.abort = False
self._thread.start()
if self._expired or self.end < time.time():
return "expired"
return "ends {0}".format(_format_time(self.end))
@property
def expired(self):
"""Return whether the reminder is expired."""
return self._expired
def reset(self, wait=None):
"""Reactivate a reminder."""
if wait is not None:
self.wait = wait
self.end = self.wait + time.time()
self._expired = False
self._cmdobj.unstore_reminder(self.id)
self._save()

def stop(self, delete=True):
"""Stop a currently running reminder."""
if not self._thread:
return
if delete:
self._delete()
self._thread.abort = True
self._thread = None
def trigger(self):
"""Hook run by the reminder thread."""
if not self._expired:
self._fire()
return False
else:
self._finalize()
return True

+ 11
- 7
earwigbot/commands/stalk.py View File

@@ -21,13 +21,14 @@
# SOFTWARE.

from ast import literal_eval
import re

from earwigbot.commands import Command
from earwigbot.irc import RC

class Stalk(Command):
"""Stalk a particular user (!stalk/!unstalk) or page (!watch/!unwatch) for
edits. Applies to the current bot session only."""
edits. Prefix regular expressions with "re:" (uses re.match)."""
name = "stalk"
commands = ["stalk", "watch", "unstalk", "unwatch", "stalks", "watches",
"allstalks", "allwatches", "unstalkall", "unwatchall"]
@@ -79,9 +80,12 @@ class Stalk(Command):
target = " ".join(data.args).replace("_", " ")
if target.startswith("[[") and target.endswith("]]"):
target = target[2:-2]
if target.startswith("User:") and "stalk" in data.command:
target = target[5:]
target = target[0].upper() + target[1:]
if target.startswith("re:"):
target = "re:" + target[3:].lstrip()
else:
if target.startswith("User:") and "stalk" in data.command:
target = target[5:]
target = target[0].upper() + target[1:]

if data.command in ["stalk", "watch"]:
if data.is_private:
@@ -119,12 +123,12 @@ class Stalk(Command):
else:
chans[item[0]] = None

def _wildcard_match(target, tag):
return target[-1] == "*" and tag.startswith(target[:-1])
def _regex_match(target, tag):
return target.startswith("re:") and re.match(target[3:], tag)

def _process(table, tag):
for target, stalks in table.iteritems():
if target == tag or _wildcard_match(target, tag):
if target == tag or _regex_match(target, tag):
_update_chans(stalks)

chans = {}


+ 4
- 4
earwigbot/commands/threads.py View File

@@ -71,14 +71,11 @@ class Threads(Command):
tname = thread.name
ident = thread.ident % 10000
if tname == "MainThread":
t = "\x0302MainThread\x0F (id {0})"
t = "\x0302main\x0F (id {0})"
normal_threads.append(t.format(ident))
elif tname in self.config.components:
t = "\x0302{0}\x0F (id {1})"
normal_threads.append(t.format(tname, ident))
elif tname.startswith("remind-"):
t = "\x0302reminder\x0F (id {0})"
daemon_threads.append(t.format(tname[len("remind-"):]))
elif tname.startswith("cvworker-"):
t = "\x0302copyvio worker\x0F (site {0})"
daemon_threads.append(t.format(tname[len("cvworker-"):]))
@@ -145,6 +142,9 @@ class Threads(Command):
return

data.kwargs["fromIRC"] = True
data.kwargs["_IRCCallback"] = lambda: self.reply(
data, "Task \x0302{0}\x0F finished.".format(task_name))

self.bot.tasks.start(task_name, **data.kwargs)
msg = "Task \x0302{0}\x0F started.".format(task_name)
self.reply(data, msg)

+ 9
- 5
earwigbot/config/script.py View File

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2009-2015 Ben Kurtovic <ben.kurtovic@gmail.com>
# Copyright (C) 2009-2016 Ben Kurtovic <ben.kurtovic@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -143,9 +143,9 @@ class ConfigScript(object):
self._print("""I can encrypt passwords stored in your config file in
addition to preventing other users on your system from
reading the file. Encryption is recommended if the bot
is to run on a public computer like the Toolserver, but
otherwise the need to enter a key everytime you start
the bot may be annoying.""")
is to run on a public server like Wikimedia Labs, but
otherwise the need to enter a key every time you start
the bot may be an inconvenience.""")
self.data["metadata"]["encryptPasswords"] = False
if self._ask_bool("Encrypt stored passwords?"):
key = getpass(self.PROMPT + "Enter an encryption key: ")
@@ -270,7 +270,7 @@ class ConfigScript(object):
password = self._ask_pass("Bot password:", encrypt=False)
self.data["wiki"]["password"] = password
self.data["wiki"]["userAgent"] = "EarwigBot/$1 (Python/$2; https://github.com/earwig/earwigbot)"
self.data["wiki"]["summary"] = "([[WP:BOT|Bot]]): $2"
self.data["wiki"]["summary"] = "([[WP:BOT|Bot]]) $2"
self.data["wiki"]["useHTTPS"] = True
self.data["wiki"]["assert"] = "user"
self.data["wiki"]["maxlag"] = 10
@@ -442,6 +442,10 @@ class ConfigScript(object):
"""Make a new config file based on the user's input."""
try:
makedirs(path.dirname(self.config.path))
except OSError as exc:
if exc.errno != 17:
raise
try:
open(self.config.path, "w").close()
chmod(self.config.path, stat.S_IRUSR|stat.S_IWUSR)
except IOError:


+ 29
- 4
earwigbot/irc/connection.py View File

@@ -45,6 +45,7 @@ class IRCConnection(object):
self._last_recv = time()
self._last_send = 0
self._last_ping = 0
self._myhost = "." * 63 # default: longest possible hostname

def __repr__(self):
"""Return the canonical string representation of the IRCConnection."""
@@ -100,8 +101,19 @@ class IRCConnection(object):
self.logger.debug(msg)
self._last_send = time()

def _split(self, msgs, maxlen, maxsplits=3):
"""Split a large message into multiple messages smaller than maxlen."""
def _get_maxlen(self, extra):
"""Return our best guess of the maximum length of a standard message.

This applies mainly to PRIVMSGs and NOTICEs.
"""
base_max = 512
userhost = len(self.nick) + len(self.ident) + len(self._myhost) + 2
padding = 4 # "\r\n" at end, ":" at beginning, and " " after userhost
return base_max - userhost - padding - extra

def _split(self, msgs, extralen, maxsplits=3):
"""Split a large message into multiple messages."""
maxlen = self._get_maxlen(extralen)
words = msgs.split(" ")
splits = 0
while words and splits < maxsplits:
@@ -128,6 +140,19 @@ class IRCConnection(object):
self._last_recv = time()
if line[0] == "PING": # If we are pinged, pong back
self.pong(line[1][1:])
elif line[1] == "001": # Update nickname on startup
if line[2] != self.nick:
self.logger.warn("Nickname changed from {0} to {1}".format(
self.nick, line[2]))
self._nick = line[2]
elif line[1] == "376": # After sign-on, get our userhost
self._send("WHOIS {0}".format(self.nick))
elif line[1] == "311": # Receiving WHOIS result
if line[2] == self.nick:
self._ident = line[4]
self._myhost = line[5]
elif line[1] == "396": # Hostname change
self._myhost = line[3]

def _process_message(self, line):
"""To be overridden in subclasses."""
@@ -163,7 +188,7 @@ class IRCConnection(object):

def say(self, target, msg, hidelog=False):
"""Send a private message to a target on the server."""
for msg in self._split(msg, 400):
for msg in self._split(msg, len(target) + 10):
msg = "PRIVMSG {0} :{1}".format(target, msg)
self._send(msg, hidelog)

@@ -182,7 +207,7 @@ class IRCConnection(object):

def notice(self, target, msg, hidelog=False):
"""Send a notice to a target on the server."""
for msg in self._split(msg, 400):
for msg in self._split(msg, len(target) + 9):
msg = "NOTICE {0} :{1}".format(target, msg)
self._send(msg, hidelog)



+ 12
- 6
earwigbot/irc/data.py View File

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2009-2015 Ben Kurtovic <ben.kurtovic@gmail.com>
# Copyright (C) 2009-2016 Ben Kurtovic <ben.kurtovic@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -50,20 +50,26 @@ class Data(object):

def _parse(self):
"""Parse a line from IRC into its components as instance attributes."""
sender = re.findall(r":(.*?)!(.*?)@(.*?)\Z", self.line[0])[0]
self._chan = self.line[2]
try:
sender = re.findall(r":(.*?)!(.*?)@(.*?)\Z", self.line[0])[0]
except IndexError:
self._host = self.line[0][1:]
self._nick = self._ident = self._reply_nick = "*"
return
self._nick, self._ident, self._host = sender
self._reply_nick = self._nick
self._chan = self.line[2]

if self._msgtype == "PRIVMSG":
if self._msgtype in ["PRIVMSG", "NOTICE"]:
if self.chan.lower() == self.my_nick:
# This is a privmsg to us, so set 'chan' as the nick of the
# sender instead of the 'channel', which is ourselves:
self._chan = self._nick
self._is_private = True
self._msg = " ".join(self.line[3:])[1:]
self._parse_args()
self._parse_kwargs()
if self._msgtype == "PRIVMSG":
self._parse_args()
self._parse_kwargs()

def _parse_args(self):
"""Parse command arguments from the message.


+ 29
- 6
earwigbot/irc/frontend.py View File

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2009-2015 Ben Kurtovic <ben.kurtovic@gmail.com>
# Copyright (C) 2009-2016 Ben Kurtovic <ben.kurtovic@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -20,6 +20,8 @@
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

from time import sleep

from earwigbot.irc import IRCConnection, Data

__all__ = ["Frontend"]
@@ -36,6 +38,7 @@ class Frontend(IRCConnection):
:py:mod:`earwigbot.commands` or the bot's custom command directory
(explained in the :doc:`documentation </customizing>`).
"""
NICK_SERVICES = "NickServ"

def __init__(self, bot):
self.bot = bot
@@ -43,6 +46,8 @@ class Frontend(IRCConnection):
base = super(Frontend, self)
base.__init__(cf["host"], cf["port"], cf["nick"], cf["ident"],
cf["realname"], bot.logger.getChild("frontend"))

self._auth_wait = False
self._connect()

def __repr__(self):
@@ -56,6 +61,11 @@ class Frontend(IRCConnection):
res = "<Frontend {0}!{1} at {2}:{3}>"
return res.format(self.nick, self.ident, self.host, self.port)

def _join_channels(self):
"""Join all startup channels as specified by the config file."""
for chan in self.bot.config.irc["frontend"]["channels"]:
self.join(chan)

def _process_message(self, line):
"""Process a single message from IRC."""
if line[1] == "JOIN":
@@ -74,17 +84,30 @@ class Frontend(IRCConnection):
self.bot.commands.call("msg_public", data)
self.bot.commands.call("msg", data)

elif line[1] == "NOTICE":
data = Data(self.nick, line, msgtype="NOTICE")
if self._auth_wait and data.nick == self.NICK_SERVICES:
if data.msg.startswith("This nickname is registered."):
return
self._auth_wait = False
sleep(2) # Wait for hostname change to propagate
self._join_channels()

elif line[1] == "376": # On successful connection to the server
# If we're supposed to auth to NickServ, do that:
try:
username = self.bot.config.irc["frontend"]["nickservUsername"]
password = self.bot.config.irc["frontend"]["nickservPassword"]
except KeyError:
pass
self._join_channels()
else:
self.logger.debug("Identifying with services")
msg = "IDENTIFY {0} {1}".format(username, password)
self.say("NickServ", msg, hidelog=True)
self.say(self.NICK_SERVICES, msg, hidelog=True)
self._auth_wait = True

# Join all of our startup channels:
for chan in self.bot.config.irc["frontend"]["channels"]:
self.join(chan)
elif line[1] == "401": # No such nickname
if self._auth_wait and line[3] == self.NICK_SERVICES:
# Services is down, or something...?
self._auth_wait = False
self._join_channels()

+ 26
- 11
earwigbot/managers.py View File

@@ -72,14 +72,21 @@ class _ResourceManager(object):
for resource in self._resources.itervalues():
yield resource

def _is_disabled(self, name):
"""Check whether a resource should be disabled."""
conf = getattr(self.bot.config, self._resource_name)
disabled = conf.get("disable", [])
enabled = conf.get("enable", [])
return name not in enabled and (disabled is True or name in disabled)

def _load_resource(self, name, path, klass):
"""Instantiate a resource class and add it to the dictionary."""
res_type = self._resource_name[:-1] # e.g. "command" or "task"
if hasattr(klass, "name"):
res_config = getattr(self.bot.config, self._resource_name)
if getattr(klass, "name") in res_config.get("disable", []):
classname = getattr(klass, "name")
if self._is_disabled(name) and self._is_disabled(classname):
log = "Skipping disabled {0} {1}"
self.logger.debug(log.format(res_type, getattr(klass, "name")))
self.logger.debug(log.format(res_type, classname))
return
try:
resource = klass(self.bot) # Create instance of resource
@@ -119,8 +126,6 @@ class _ResourceManager(object):
def _load_directory(self, dir):
"""Load all valid resources in a given directory."""
self.logger.debug("Loading directory {0}".format(dir))
res_config = getattr(self.bot.config, self._resource_name)
disabled = res_config.get("disable", [])
processed = []
for name in listdir(dir):
if not name.endswith(".py") and not name.endswith(".pyc"):
@@ -128,14 +133,14 @@ class _ResourceManager(object):
if name.startswith("_") or name.startswith("."):
continue
modname = sub("\.pyc?$", "", name) # Remove extension
if modname in disabled:
if modname in processed:
continue
processed.append(modname)
if self._is_disabled(modname):
log = "Skipping disabled module {0}".format(modname)
self.logger.debug(log)
processed.append(modname)
continue
if modname not in processed:
self._load_module(modname, dir)
processed.append(modname)
self._load_module(modname, dir)

def _unload_resources(self):
"""Unload all resources, calling their unload hooks in the process."""
@@ -162,7 +167,8 @@ class _ResourceManager(object):
self._unload_resources()
builtin_dir = path.join(path.dirname(__file__), name)
plugins_dir = path.join(self.bot.config.root_dir, name)
if getattr(self.bot.config, name).get("disable") is True:
conf = getattr(self.bot.config, name)
if conf.get("disable") is True and not conf.get("enable"):
log = "Skipping disabled builtins directory: {0}"
self.logger.debug(log.format(builtin_dir))
else:
@@ -219,6 +225,13 @@ class CommandManager(_ResourceManager):
.. note::
The special ``rc`` hook actually passes a :class:`~.RC` object.
"""
try:
quiet = self.bot.config.irc["frontend"]["quiet"][data.chan]
if quiet is True or hook in quiet:
return
except KeyError:
pass

for command in self:
if hook in command.hooks and self._wrap_check(command, data):
thread = Thread(target=self._wrap_process,
@@ -247,6 +260,8 @@ class TaskManager(_ResourceManager):
else:
msg = "Task '{0}' finished successfully"
self.logger.info(msg.format(task.name))
if kwargs.get("fromIRC"):
kwargs.get("_IRCCallback")()

def start(self, task_name, **kwargs):
"""Start a given task in a new daemon thread, and return the thread.


+ 4
- 0
earwigbot/tasks/__init__.py View File

@@ -54,6 +54,10 @@ class Task(object):
self.bot = bot
self.config = bot.config
self.logger = bot.tasks.logger.getChild(self.name)

number = self.config.tasks.get(self.name, {}).get("number")
if number is not None:
self.number = number
self.setup()

def __repr__(self):


+ 229
- 99
earwigbot/tasks/wikiproject_tagger.py View File

@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2009-2015 Ben Kurtovic <ben.kurtovic@gmail.com>
# Copyright (C) 2009-2017 Ben Kurtovic <ben.kurtovic@gmail.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
@@ -30,16 +30,16 @@ class WikiProjectTagger(Task):
"""A task to tag talk pages with WikiProject banners.

Usage: :command:`earwigbot -t wikiproject_tagger PATH
--banner BANNER (--category CAT | --file FILE) [--summary SUM]
[--append TEXT] [--autoassess] [--nocreate] [--recursive NUM]
[--site SITE]`
--banner BANNER (--category CAT | --file FILE) [--summary SUM] [--update]
[--append PARAMS] [--autoassess [CLASSES]] [--only-with BANNER]
[--nocreate] [--recursive [NUM]] [--site SITE] [--dry-run]`

.. glossary::

``--banner BANNER``
the page name of the banner to add, without a namespace (unless the
namespace is something other than ``Template``) so
``--banner WikiProject Biography`` for ``{{WikiProject Biography}}``
``--banner "WikiProject Biography"`` for ``{{WikiProject Biography}}``
``--category CAT`` or ``--file FILE``
determines which pages to tag; either all pages in a category (to
include subcategories as well, see ``--recursive``) or all
@@ -47,21 +47,33 @@ class WikiProjectTagger(Task):
current directory)
``--summary SUM``
an optional edit summary to use; defaults to
``"Adding WikiProject banner {{BANNER}}."``
``--append TEXT``
optional text to append to the banner (after an autoassessment, if
any), like ``|importance=low``
``--autoassess``
``"Tagging with WikiProject banner {{BANNER}}."``
``--update``
updates existing banners with new fields; should include at least one
of ``--append`` or ``--autoassess`` to be useful
``--append PARAMS``
optional comma-separated parameters to append to the banner (after an
auto-assessment, if any); use syntax ``importance=low,taskforce=yes``
to add ``|importance=low|taskforce=yes``
``--autoassess [CLASSES]``
try to assess each article's class automatically based on the class of
other banners on the same page
other banners on the same page; if CLASSES is given as a
comma-separated list, only those classes will be auto-assessed
``--only-with BANNER``
only tag pages that already have the given banner
``--nocreate``
don't create new talk pages with just a banner if the page doesn't
already exist
``--recursive NUM``
recursively go through subcategories up to a maximum depth of ``NUM``,
or if ``NUM`` isn't provided, go infinitely (this can be dangerous)
``--tag-categories``
also tag category pages
``--site SITE``
the ID of the site to tag pages on, defaulting to the... default site
the ID of the site to tag pages on, defaulting to the default site
``--dry-run``
don't actually make any edits, just log the pages that would have been
edited

"""
name = "wikiproject_tagger"
@@ -90,11 +102,10 @@ class WikiProjectTagger(Task):
r"failed ?ga$",
r"old ?prod( ?full)?$",
r"(old|previous) ?afd$",

r"((wikiproject|wp) ?)?bio(graph(y|ies))?$",
]

def _upperfirst(self, text):
@staticmethod
def _upperfirst(text):
"""Try to uppercase the first letter of a string."""
try:
return text[0].upper() + text[1:]
@@ -114,15 +125,29 @@ class WikiProjectTagger(Task):

site = self.bot.wiki.get_site(name=kwargs.get("site"))
banner = kwargs["banner"]
summary = kwargs.get("summary", "Adding WikiProject banner $3.")
summary = kwargs.get("summary", "Tagging with WikiProject banner $3.")
update = kwargs.get("update", False)
append = kwargs.get("append")
autoassess = kwargs.get("autoassess", False)
ow_banner = kwargs.get("only-with")
nocreate = kwargs.get("nocreate", False)
recursive = kwargs.get("recursive", 0)
tag_categories = kwargs.get("tag-categories", False)
dry_run = kwargs.get("dry-run", False)
banner, names = self.get_names(site, banner)
if not names:
return
job = _Job(banner, names, summary, append, autoassess, nocreate)
if ow_banner:
_, only_with = self.get_names(site, ow_banner)
if not only_with:
return
else:
only_with = None

job = _Job(banner=banner, names=names, summary=summary, update=update,
append=append, autoassess=autoassess, only_with=only_with,
nocreate=nocreate, tag_categories=tag_categories,
dry_run=dry_run)

try:
self.run_job(kwargs, site, job, recursive)
@@ -172,139 +197,237 @@ class WikiProjectTagger(Task):
banner = banner.split(":", 1)[1]
page = site.get_page(title)
if page.exists != page.PAGE_EXISTS:
self.logger.error(u"Banner [[{0}]] does not exist".format(title))
self.logger.error(u"Banner [[%s]] does not exist", title)
return banner, None

if banner == title:
names = [self._upperfirst(banner)]
else:
names = [self._upperfirst(banner), self._upperfirst(title)]
names = {banner, title}
result = site.api_query(action="query", list="backlinks", bllimit=500,
blfilterredir="redirects", bltitle=title)
for backlink in result["query"]["backlinks"]:
names.append(backlink["title"])
names.add(backlink["title"])
if backlink["ns"] == constants.NS_TEMPLATE:
names.append(backlink["title"].split(":", 1)[1])
names.add(backlink["title"].split(":", 1)[1])

log = u"Found {0} aliases for banner [[{1}]]".format(len(names), title)
self.logger.debug(log)
log = u"Found %s aliases for banner [[%s]]"
self.logger.debug(log, len(names), title)
return banner, names

def process_category(self, page, job, recursive):
"""Try to tag all pages in the given category."""
self.logger.info(u"Processing category: [[{0]]".format(page.title))
if page.title in job.processed_cats:
self.logger.debug(u"Skipping category, already processed: [[%s]]",
page.title)
return
self.logger.info(u"Processing category: [[%s]]", page.title)
job.processed_cats.add(page.title)

if job.tag_categories:
self.process_page(page, job)
for member in page.get_members():
if member.namespace == constants.NS_CATEGORY:
nspace = member.namespace
if nspace == constants.NS_CATEGORY:
if recursive is True:
self.process_category(member, job, True)
elif recursive:
elif recursive > 0:
self.process_category(member, job, recursive - 1)
elif job.tag_categories:
self.process_page(member, job)
elif nspace in (constants.NS_USER, constants.NS_USER_TALK):
continue
else:
self.process_page(member, job)

def process_page(self, page, job):
"""Try to tag a specific *page* using the *job* description."""
if not page.is_talkpage:
page = page.toggle_talk()

if page.title in job.processed_pages:
self.logger.debug(u"Skipping page, already processed: [[%s]]",
page.title)
return
job.processed_pages.add(page.title)

if job.counter % 10 == 0: # Do a shutoff check every ten pages
if self.shutoff_enabled(page.site):
raise _ShutoffEnabled()
job.counter += 1

if not page.is_talkpage:
page = page.toggle_talk()
try:
code = page.parse()
except exceptions.PageNotFoundError:
if job.nocreate:
log = u"Skipping nonexistent page: [[{0}]]".format(page.title)
self.logger.info(log)
else:
log = u"Tagging new page: [[{0}]]".format(page.title)
self.logger.info(log)
banner = "{{" + job.banner + job.append + "}}"
summary = job.summary.replace("$3", banner)
page.edit(banner, self.make_summary(summary))
self.process_new_page(page, job)
return
except exceptions.InvalidPageError:
log = u"Skipping invalid page: [[{0}]]".format(page.title)
self.logger.error(log)
self.logger.error(u"Skipping invalid page: [[%s]]", page.title)
return

is_update = False
for template in code.ifilter_templates(recursive=True):
name = self._upperfirst(template.name.strip())
if name in job.names:
log = u"Skipping page: [[{0}]]; already tagged with '{1}'"
self.logger.info(log.format(page.title, name))
if template.name.matches(job.names):
if job.update:
banner = template
is_update = True
break
else:
log = u"Skipping page: [[%s]]; already tagged with '%s'"
self.logger.info(log, page.title, template.name)
return

if job.only_with:
if not any(template.name.matches(job.only_with)
for template in code.ifilter_templates(recursive=True)):
log = u"Skipping page: [[%s]]; fails only-with condition"
self.logger.info(log, page.title)
return

banner = self.make_banner(job, code)
shell = self.get_banner_shell(code)
if shell:
if shell.has_param(1):
shell.get(1).value.insert(0, banner + "\n")
else:
shell.add(1, banner)
if is_update:
old_banner = unicode(banner)
self.update_banner(banner, job, code)
if banner == old_banner:
log = u"Skipping page: [[%s]]; already tagged and no updates"
self.logger.info(log, page.title)
return
self.logger.info(u"Updating banner on page: [[%s]]", page.title)
banner = banner.encode("utf8")
else:
self.add_banner(code, banner)
self.apply_genfixes(code)
self.logger.info(u"Tagging page: [[%s]]", page.title)
banner = self.make_banner(job, code)
shell = self.get_banner_shell(code)
if shell:
self.add_banner_to_shell(shell, banner)
else:
self.add_banner(code, banner)

self.save_page(page, job, unicode(code), banner)

self.logger.info(u"Tagging page: [[{0}]]".format(page.title))
summary = job.summary.replace("$3", banner)
page.edit(unicode(code), self.make_summary(summary))
def process_new_page(self, page, job):
"""Try to tag a *page* that doesn't exist yet using the *job*."""
if job.nocreate or job.only_with:
log = u"Skipping nonexistent page: [[%s]]"
self.logger.info(log, page.title)
else:
self.logger.info(u"Tagging new page: [[%s]]", page.title)
banner = self.make_banner(job)
self.save_page(page, job, banner, banner)

def save_page(self, page, job, text, banner):
"""Save a page with an updated banner."""
if job.dry_run:
self.logger.debug(u"[DRY RUN] Banner: %s", banner)
else:
summary = job.summary.replace("$3", banner)
page.edit(text, self.make_summary(summary), minor=True)

def make_banner(self, job, code):
def make_banner(self, job, code=None):
"""Return banner text to add based on a *job* and a page's *code*."""
banner = "{{" + job.banner
if job.autoassess:
classes = {"fa": 0, "fl": 0, "ga": 0, "a": 0, "b": 0, "start": 0,
"stub": 0, "list": 0, "dab": 0, "c": 0, "redirect": 0,
"book": 0, "template": 0, "category": 0}
for template in code.ifilter_templates(recursive