The Wiki Toolset
================

EarwigBot's answer to the `Pywikipedia framework`_ is the Wiki Toolset
(:py:mod:`earwigbot.wiki`), which you will mainly access through
:py:attr:`bot.wiki <earwigbot.bot.Bot.wiki>`.

:py:attr:`bot.wiki <earwigbot.bot.Bot.wiki>` provides three methods for the
management of Sites - :py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site`,
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site`, and
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.remove_site`. Sites are objects that
simply represent a MediaWiki site. A single instance of EarwigBot (i.e. a
single *working directory*) is expected to relate to a single site or group of
sites using the same login info (like all WMF wikis with `CentralAuth`_).

Load your default site (the one that you picked during setup) with
``site = bot.wiki.get_site()``.

Dealing with other sites
~~~~~~~~~~~~~~~~~~~~~~~~

*Skip this section if you're only working with one site.*

If a site is *already known to the bot* (meaning that it is stored in the
:file:`sites.db` file, which includes just your default wiki at first), you can
load it with ``site = bot.wiki.get_site(name)``, where ``name`` might be
``"enwiki"`` or ``"frwiktionary"`` (you can also do
``site = bot.wiki.get_site(project="wikipedia", lang="en")``). Recall that not
giving any arguments to ``get_site()`` will return the default site.

:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site` is used to add new sites to
the sites database. It may be called with arguments similar to those of
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site`, but the difference is
important. :py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site` only needs
enough information to identify the site in its database, which is usually just
its name; the database stores all other necessary connection info. With
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site`, you need to provide enough
connection info so the toolset can successfully access the site's API/SQL
databases and store that information for later. That might not be much; for WMF
wikis, you can usually use code like this::

    project, lang = "wikipedia", "es"
    try:
        site = bot.wiki.get_site(project=project, lang=lang)
    except earwigbot.SiteNotFoundError:
        # Load site info from http://es.wikipedia.org/w/api.php:
        site = bot.wiki.add_site(project=project, lang=lang)

This works because EarwigBot assumes that the URL for the site is
``"//{lang}.{project}.org"``, the API is at ``/w/api.php``, and the SQL
connection info (if any) is stored as ``config.wiki["sql"]``. This might change
if you're dealing with non-WMF wikis, where the code might look something more
like::

    project, lang = "mywiki", "it"
    try:
        site = bot.wiki.get_site(project=project, lang=lang)
    except earwigbot.SiteNotFoundError:
        # Load site info from http://mysite.net/mywiki/it/s/api.php:
        base_url = "http://mysite.net/" + project + "/" + lang
        db_name = lang + project + "_p"
        # Dictionary keys must be quoted strings:
        sql = {"host": "sql.mysite.net", "db": db_name}
        site = bot.wiki.add_site(base_url=base_url, script_path="/s", sql=sql)

:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.remove_site` does the opposite of
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site`: give it a site's name or a
project/lang pair like :py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site`
takes, and it'll remove that site from the sites database.

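To make the URL assumptions described in this section concrete, here is a small sketch that composes an API endpoint from a base URL and a script path. It does not call EarwigBot at all: ``wmf_base_url`` and ``api_endpoint`` are hypothetical helper names invented for illustration, and the toolset's real internals may well differ.

```python
def wmf_base_url(project, lang):
    """Build the protocol-relative base URL assumed for WMF wikis."""
    return "//{lang}.{project}.org".format(lang=lang, project=project)


def api_endpoint(base_url, script_path="/w"):
    """Join a base URL and a script path into the location of api.php."""
    return base_url.rstrip("/") + "/" + script_path.strip("/") + "/api.php"


# WMF-style defaults, as in the first example of this section:
print(api_endpoint(wmf_base_url("wikipedia", "es")))

# A custom wiki with an explicit base URL and script path:
print(api_endpoint("http://mysite.net/mywiki/it", script_path="/s"))
```

The two printed URLs should match the locations mentioned in the comments of the examples in this section, which is a quick way to check that a ``base_url``/``script_path`` pair points where you expect before handing it to ``add_site()``.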
Sites
~~~~~

:py:class:`earwigbot.wiki.Site <earwigbot.wiki.site.Site>` objects provide the
following attributes:

- :py:attr:`~earwigbot.wiki.site.Site.name`: the site's name (or "wikiid"),
  like ``"enwiki"``
- :py:attr:`~earwigbot.wiki.site.Site.project`: the site's project name, like
  ``"wikipedia"``
- :py:attr:`~earwigbot.wiki.site.Site.lang`: the site's language code, like
  ``"en"``
- :py:attr:`~earwigbot.wiki.site.Site.domain`: the site's web domain, like
  ``"en.wikipedia.org"``
- :py:attr:`~earwigbot.wiki.site.Site.url`: the site's full base URL, like
  ``"https://en.wikipedia.org"``

and the following methods:

- :py:meth:`api_query(**kwargs) <earwigbot.wiki.site.Site.api_query>`: does an
  API query with the given keyword arguments as params
- :py:meth:`sql_query(query, params=(), ...)
  <earwigbot.wiki.site.Site.sql_query>`: does an SQL query and yields its
  results (as a generator)
- :py:meth:`~earwigbot.wiki.site.Site.get_replag`: returns the estimated
  database replication lag (if we have the site's SQL connection info)
- :py:meth:`namespace_id_to_name(id, all=False)
  <earwigbot.wiki.site.Site.namespace_id_to_name>`: given a namespace ID,
  returns the primary associated namespace name (or a list of all names when
  ``all`` is ``True``)
- :py:meth:`namespace_name_to_id(name)
  <earwigbot.wiki.site.Site.namespace_name_to_id>`: given a namespace name,
  returns the associated namespace ID
- :py:meth:`get_page(title, follow_redirects=False, ...)
  <earwigbot.wiki.site.Site.get_page>`: returns a ``Page`` object for the given
  title (or a :py:class:`~earwigbot.wiki.category.Category` object if the
  page's namespace is "``Category:``")
- :py:meth:`get_category(catname, follow_redirects=False, ...)
  <earwigbot.wiki.site.Site.get_category>`: returns a ``Category`` object for
  the given title (sans namespace)
- :py:meth:`get_user(username) <earwigbot.wiki.site.Site.get_user>`: returns a
  :py:class:`~earwigbot.wiki.user.User` object for the given username
- :py:meth:`delegate(services, ...) <earwigbot.wiki.site.Site.delegate>`:
  delegates a task to either the API or SQL depending on various conditions,
  such as server lag

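As a rough illustration of the two namespace methods working together, the sketch below looks up the talk namespace that pairs with a subject namespace. It relies on the MediaWiki convention that subject namespaces have even, non-negative IDs and that the matching talk namespace is the next odd ID. ``talk_namespace_name`` and ``FakeSite`` are hypothetical names; ``FakeSite`` is only an offline stand-in so the example runs without a bot, and a real ``Site`` would be passed in its place.

```python
def talk_namespace_name(site, subject_name):
    """Return the name of the talk namespace paired with a subject namespace.

    Assumes the MediaWiki convention: subject namespaces have even,
    non-negative IDs, and the talk namespace has the next odd ID.
    """
    ns_id = site.namespace_name_to_id(subject_name)
    if ns_id < 0 or ns_id % 2 == 1:
        raise ValueError("not a subject namespace: %r" % subject_name)
    return site.namespace_id_to_name(ns_id + 1)


class FakeSite:
    """Offline stand-in for a Site object (illustration only)."""

    _names = {-1: "Special", 0: "", 1: "Talk", 2: "User", 3: "User talk"}

    def namespace_id_to_name(self, ns_id, all=False):
        # The real method can return every alias when all=True; this
        # stand-in only knows one name per namespace.
        return self._names[ns_id]

    def namespace_name_to_id(self, name):
        for ns_id, ns_name in self._names.items():
            if ns_name == name:
                return ns_id
        raise KeyError(name)


print(talk_namespace_name(FakeSite(), "User"))
```

With a live bot you would call ``talk_namespace_name(site, "User")`` on a site returned by ``bot.wiki.get_site()``; how the real methods behave for unknown names is not covered here.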
Pages and categories
~~~~~~~~~~~~~~~~~~~~

Create :py:class:`earwigbot.wiki.Page <earwigbot.wiki.page.Page>` objects with
:py:meth:`site.get_page(title) <earwigbot.wiki.site.Site.get_page>`,
:py:meth:`page.toggle_talk() <earwigbot.wiki.page.Page.toggle_talk>`,
:py:meth:`user.get_userpage() <earwigbot.wiki.user.User.get_userpage>`, or
:py:meth:`user.get_talkpage() <earwigbot.wiki.user.User.get_talkpage>`. They
provide the following attributes:

- :py:attr:`~earwigbot.wiki.page.Page.site`: the page's corresponding
  :py:class:`~earwigbot.wiki.site.Site` object
- :py:attr:`~earwigbot.wiki.page.Page.title`: the page's title, or pagename
- :py:attr:`~earwigbot.wiki.page.Page.exists`: whether or not the page exists
- :py:attr:`~earwigbot.wiki.page.Page.pageid`: an integer ID representing the
  page
- :py:attr:`~earwigbot.wiki.page.Page.url`: the page's URL
- :py:attr:`~earwigbot.wiki.page.Page.namespace`: the page's namespace as an
  integer
- :py:attr:`~earwigbot.wiki.page.Page.protection`: the page's current
  protection status
- :py:attr:`~earwigbot.wiki.page.Page.is_talkpage`: ``True`` if the page is a
  talkpage, else ``False``
- :py:attr:`~earwigbot.wiki.page.Page.is_redirect`: ``True`` if the page is a
  redirect, else ``False``

and the following methods:

- :py:meth:`~earwigbot.wiki.page.Page.reload`: forcibly reloads the page's
  attributes (emphasis on *reload* - this is only necessary if there is reason
  to believe they have changed)
- :py:meth:`toggle_talk(...) <earwigbot.wiki.page.Page.toggle_talk>`: returns a
  content page's talk page, or vice versa
- :py:meth:`~earwigbot.wiki.page.Page.get`: returns page content
- :py:meth:`~earwigbot.wiki.page.Page.get_redirect_target`: if the page is a
  redirect, returns its destination
- :py:meth:`~earwigbot.wiki.page.Page.get_creator`: returns a
  :py:class:`~earwigbot.wiki.user.User` object representing the first user to
  edit the page
- :py:meth:`edit(text, summary, minor=False, bot=True, force=False)
  <earwigbot.wiki.page.Page.edit>`: replaces the page's content with ``text``
  or creates a new page
- :py:meth:`add_section(text, title, minor=False, bot=True, force=False)
  <earwigbot.wiki.page.Page.add_section>`: adds a new section named ``title``
  at the bottom of the page
- :py:meth:`copyvio_check(...)
  <earwigbot.wiki.copyvios.CopyvioMixIn.copyvio_check>`: checks the page for
  copyright violations
- :py:meth:`copyvio_compare(url, ...)
  <earwigbot.wiki.copyvios.CopyvioMixIn.copyvio_compare>`: checks the page like
  :py:meth:`~earwigbot.wiki.copyvios.CopyvioMixIn.copyvio_check`, but
  against a specific URL
- :py:meth:`check_exclusion(username=None, optouts=None)
  <earwigbot.wiki.page.Page.check_exclusion>`: checks whether or not we are
  allowed to edit the page per ``{{bots}}``/``{{nobots}}``

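The page methods above combine naturally into a "check, read, write" pattern. The sketch below is one possible shape for it, not the bot's own code: ``append_if_allowed`` and ``FakePage`` are hypothetical names, ``FakePage`` is an offline stand-in for a real ``Page``, and the sketch assumes that ``check_exclusion()`` returns a truthy value when editing is permitted. Check the method's docstring before relying on that.

```python
def append_if_allowed(page, addition, summary):
    """Append text to a page unless {{bots}}/{{nobots}} forbids it.

    Assumption: page.check_exclusion() is truthy when we may edit.
    Returns True if an edit was made, False if the page was skipped.
    """
    if not page.check_exclusion():
        return False
    text = page.get()
    page.edit(text + "\n" + addition, summary, minor=True)
    return True


class FakePage:
    """Offline stand-in for a Page object (illustration only)."""

    def __init__(self, text, allowed=True):
        self.text = text
        self.allowed = allowed
        self.last_summary = None

    def get(self):
        return self.text

    def check_exclusion(self, username=None, optouts=None):
        return self.allowed

    def edit(self, text, summary, minor=False, bot=True, force=False):
        # A real Page would save to the wiki; this one just records the edit.
        self.text = text
        self.last_summary = summary


sandbox = FakePage("Hello.")
print(append_if_allowed(sandbox, "Appended line.", "Testing"))
print(sandbox.text)
```

In a real task or command you would pass a page from ``site.get_page(title)`` instead of a ``FakePage``; error handling for missing or protected pages is deliberately left out of this sketch.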
Additionally, :py:class:`~earwigbot.wiki.category.Category` objects (created
with :py:meth:`site.get_category(name) <earwigbot.wiki.site.Site.get_category>`
or :py:meth:`site.get_page(title) <earwigbot.wiki.site.Site.get_page>` where
``title`` is in the ``Category:`` namespace) provide the following additional
attributes:

- :py:attr:`~earwigbot.wiki.category.Category.size`: the total number of
  members in the category
- :py:attr:`~earwigbot.wiki.category.Category.pages`: the number of pages in
  the category
- :py:attr:`~earwigbot.wiki.category.Category.files`: the number of files in
  the category
- :py:attr:`~earwigbot.wiki.category.Category.subcats`: the number of
  subcategories in the category

And the following additional method:

- :py:meth:`get_members(limit=None, ...)
  <earwigbot.wiki.category.Category.get_members>`: iterates over
  :py:class:`~earwigbot.wiki.page.Page`\ s in the category, until either the
  category is exhausted or (if given) ``limit`` is reached

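Because ``get_members()`` is an iterator over pages, it can be consumed in a plain ``for`` loop. The following sketch groups member titles by their integer namespace; ``titles_by_namespace`` and ``FakeCategory`` are hypothetical names, and ``FakeCategory`` only imitates the ``limit`` behaviour described above so that the example runs offline.

```python
from collections import defaultdict
from itertools import islice
from types import SimpleNamespace


def titles_by_namespace(category, limit=None):
    """Group the titles of a category's members by namespace ID."""
    grouped = defaultdict(list)
    for page in category.get_members(limit=limit):
        grouped[page.namespace].append(page.title)
    return dict(grouped)


class FakeCategory:
    """Offline stand-in for a Category object (illustration only)."""

    def __init__(self, pages):
        self._pages = pages

    def get_members(self, limit=None):
        # Stop at the end of the list or after `limit` pages.
        return islice(iter(self._pages), limit)


demo = FakeCategory([
    SimpleNamespace(title="Foo", namespace=0),
    SimpleNamespace(title="File:Bar.png", namespace=6),
    SimpleNamespace(title="Baz", namespace=0),
])
print(titles_by_namespace(demo))
print(titles_by_namespace(demo, limit=2))
```

With a real bot, ``category`` would come from ``site.get_category(name)``; passing a ``limit`` is a sensible precaution for large categories.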
Users
~~~~~

Create :py:class:`earwigbot.wiki.User <earwigbot.wiki.user.User>` objects with
:py:meth:`site.get_user(name) <earwigbot.wiki.site.Site.get_user>` or
:py:meth:`page.get_creator() <earwigbot.wiki.page.Page.get_creator>`. They
provide the following attributes:

- :py:attr:`~earwigbot.wiki.user.User.site`: the user's corresponding
  :py:class:`~earwigbot.wiki.site.Site` object
- :py:attr:`~earwigbot.wiki.user.User.name`: the user's username
- :py:attr:`~earwigbot.wiki.user.User.exists`: ``True`` if the user exists, or
  ``False`` if they do not
- :py:attr:`~earwigbot.wiki.user.User.userid`: an integer ID representing the
  user
- :py:attr:`~earwigbot.wiki.user.User.blockinfo`: information about any current
  blocks on the user (``False`` if no block, or a dict of
  ``{"by": blocking_user, "reason": block_reason,
  "expiry": block_expire_time}``)
- :py:attr:`~earwigbot.wiki.user.User.groups`: a list of the user's groups
- :py:attr:`~earwigbot.wiki.user.User.rights`: a list of the user's rights
- :py:attr:`~earwigbot.wiki.user.User.editcount`: the number of edits made by
  the user
- :py:attr:`~earwigbot.wiki.user.User.registration`: the time the user
  registered as a :py:obj:`time.struct_time`
- :py:attr:`~earwigbot.wiki.user.User.emailable`: ``True`` if you can email the
  user, ``False`` if you cannot
- :py:attr:`~earwigbot.wiki.user.User.gender`: the user's gender (``"male"``,
  ``"female"``, or ``"unknown"``)
- :py:attr:`~earwigbot.wiki.user.User.is_ip`: ``True`` if the user is an IP
  address, IPv4 or IPv6, otherwise ``False``

and the following methods:

- :py:meth:`~earwigbot.wiki.user.User.reload`: forcibly reloads the user's
  attributes (emphasis on *reload* - this is only necessary if there is reason
  to believe they have changed)
- :py:meth:`~earwigbot.wiki.user.User.get_userpage`: returns a
  :py:class:`~earwigbot.wiki.page.Page` object representing the user's userpage
- :py:meth:`~earwigbot.wiki.user.User.get_talkpage`: returns a
  :py:class:`~earwigbot.wiki.page.Page` object representing the user's talkpage

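To show how a few of these attributes fit together, here is a sketch that turns a user into a one-line summary. It handles both shapes of ``blockinfo`` described above (``False`` or a dict) and formats ``registration`` with :py:func:`time.strftime`. ``describe_user`` is a hypothetical helper, the users are offline stand-ins built with ``SimpleNamespace``, and the sample values (including the expiry string) are invented for the demonstration.

```python
import time
from types import SimpleNamespace


def describe_user(user):
    """Return a one-line summary of a user's registration and block status."""
    registered = time.strftime("%Y-%m-%d", user.registration)
    info = user.blockinfo
    if info:
        # blockinfo is a dict with "by", "reason", and "expiry" keys.
        status = "blocked by {by} until {expiry} ({reason})".format(**info)
    else:
        # blockinfo is False when there is no current block.
        status = "not blocked"
    return "{0}: registered {1}, {2} edits, {3}".format(
        user.name, registered, user.editcount, status
    )


# Offline stand-ins for User objects (illustration only):
joined = time.strptime("2012-04-10", "%Y-%m-%d")
alice = SimpleNamespace(
    name="Alice", registration=joined, editcount=42, blockinfo=False
)
bob = SimpleNamespace(
    name="Bob",
    registration=joined,
    editcount=7,
    blockinfo={"by": "Admin", "reason": "spam", "expiry": "infinity"},
)

print(describe_user(alice))
print(describe_user(bob))
```

With a live bot you would pass the result of ``site.get_user(name)``. The sketch does not handle IP addresses or nonexistent users, whose ``registration`` value is not documented here.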
Additional features
~~~~~~~~~~~~~~~~~~~

Not all aspects of the toolset are covered here. Explore `its code and
docstrings`_ to learn how to use it in a more hands-on fashion. For reference,
:py:attr:`bot.wiki <earwigbot.bot.Bot.wiki>` is an instance of
:py:class:`earwigbot.wiki.SitesDB <earwigbot.wiki.sitesdb.SitesDB>` tied to the
:file:`sites.db` file in the bot's working directory.

.. _Pywikipedia framework: http://pywikipediabot.sourceforge.net/
.. _CentralAuth: http://www.mediawiki.org/wiki/Extension:CentralAuth
.. _its code and docstrings: https://github.com/earwig/earwigbot/tree/develop/earwigbot/wiki