A Python robot that edits Wikipedia and interacts with people over IRC https://en.wikipedia.org/wiki/User:EarwigBot

The Wiki Toolset
================

EarwigBot's answer to the `Pywikipedia framework`_ is the Wiki Toolset
(:py:mod:`earwigbot.wiki`), which you will mainly access through
:py:attr:`bot.wiki <earwigbot.bot.Bot.wiki>`.

:py:attr:`bot.wiki <earwigbot.bot.Bot.wiki>` provides three methods for the
management of Sites - :py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site`,
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site`, and
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.remove_site`. Sites are objects that
simply represent a MediaWiki site. A single instance of EarwigBot (i.e. a
single *working directory*) is expected to relate to a single site or group of
sites using the same login info (like all WMF wikis with `CentralAuth`_).

Load your default site (the one that you picked during setup) with
``site = bot.wiki.get_site()``.


Dealing with other sites
~~~~~~~~~~~~~~~~~~~~~~~~

*Skip this section if you're only working with one site.*

If a site is *already known to the bot* (meaning that it is stored in the
:file:`sites.db` file, which includes just your default wiki at first), you can
load a site with ``site = bot.wiki.get_site(name)``, where ``name`` might be
``"enwiki"`` or ``"frwiktionary"`` (you can also do
``site = bot.wiki.get_site(project="wikipedia", lang="en")``). Recall that not
giving any arguments to ``get_site()`` will return the default site.

:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site` is used to add new sites to
the sites database. It may be called with arguments similar to those of
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site`, but the difference is
important. :py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site` only needs
enough information to identify the site in its database, which is usually just
its name; the database stores all other necessary connection info. With
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site`, you need to provide enough
connection info so the toolset can successfully access the site's API/SQL
databases and store that information for later. That might not be much; for WMF
wikis, you can usually use code like this::

    project, lang = "wikipedia", "es"
    try:
        site = bot.wiki.get_site(project=project, lang=lang)
    except earwigbot.SiteNotFoundError:
        # Load site info from http://es.wikipedia.org/w/api.php:
        site = bot.wiki.add_site(project=project, lang=lang)

This works because EarwigBot assumes that the URL for the site is
``"//{lang}.{project}.org"``, the API is at ``/w/api.php``, and the SQL
connection info (if any) is stored as ``config.wiki["sql"]``. This might change
if you're dealing with non-WMF wikis, where the code might look something more
like::

    project, lang = "mywiki", "it"
    try:
        site = bot.wiki.get_site(project=project, lang=lang)
    except earwigbot.SiteNotFoundError:
        # Load site info from http://mysite.net/mywiki/it/s/api.php:
        base_url = "http://mysite.net/" + project + "/" + lang
        db_name = lang + project + "_p"
        # The dict keys must be strings; bare names would raise NameError.
        sql = {"host": "sql.mysite.net", "db": db_name}
        site = bot.wiki.add_site(base_url=base_url, script_path="/s", sql=sql)

:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.remove_site` does the opposite of
:py:meth:`~earwigbot.wiki.sitesdb.SitesDB.add_site`: give it a site's name or a
project/lang pair like :py:meth:`~earwigbot.wiki.sitesdb.SitesDB.get_site`
takes, and it'll remove that site from the sites database.

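The get-then-add pattern described in this section can be wrapped in a small helper. The following is only a sketch: ``get_or_add_site``, ``FakeSitesDB``, and the local ``SiteNotFoundError`` class are hypothetical names invented for illustration so that the example runs without a configured bot. With a real bot you would pass ``bot.wiki`` and the bot's own exception class instead.

```python
class SiteNotFoundError(Exception):
    """Stand-in for the exception raised when a site is not in the database."""


def get_or_add_site(sites_db, not_found_error, **identity):
    """Return a known site, adding it to the database if the lookup fails."""
    try:
        return sites_db.get_site(**identity)
    except not_found_error:
        return sites_db.add_site(**identity)


class FakeSitesDB:
    """Hypothetical in-memory stand-in for bot.wiki (illustration only)."""

    def __init__(self):
        self._sites = {}

    def get_site(self, project=None, lang=None):
        try:
            return self._sites[(project, lang)]
        except KeyError:
            raise SiteNotFoundError((project, lang)) from None

    def add_site(self, project=None, lang=None):
        site = {"project": project, "lang": lang}
        self._sites[(project, lang)] = site
        return site


db = FakeSitesDB()
# The first call falls through to add_site(); the second finds the stored site.
first = get_or_add_site(db, SiteNotFoundError, project="wikipedia", lang="es")
second = get_or_add_site(db, SiteNotFoundError, project="wikipedia", lang="es")
```

Passing the exception class as an argument keeps the helper independent of where that class is imported from in your installation.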

Sites
~~~~~

:py:class:`earwigbot.wiki.Site <earwigbot.wiki.site.Site>` objects provide the
following attributes:

- :py:attr:`~earwigbot.wiki.site.Site.name`: the site's name (or "wikiid"),
  like ``"enwiki"``
- :py:attr:`~earwigbot.wiki.site.Site.project`: the site's project name, like
  ``"wikipedia"``
- :py:attr:`~earwigbot.wiki.site.Site.lang`: the site's language code, like
  ``"en"``
- :py:attr:`~earwigbot.wiki.site.Site.domain`: the site's web domain, like
  ``"en.wikipedia.org"``
- :py:attr:`~earwigbot.wiki.site.Site.url`: the site's full base URL, like
  ``"https://en.wikipedia.org"``

and the following methods:

- :py:meth:`api_query(**kwargs) <earwigbot.wiki.site.Site.api_query>`: does an
  API query with the given keyword arguments as params
- :py:meth:`sql_query(query, params=(), ...)
  <earwigbot.wiki.site.Site.sql_query>`: does an SQL query and yields its
  results (as a generator)
- :py:meth:`~earwigbot.wiki.site.Site.get_replag`: returns the estimated
  database replication lag (if we have the site's SQL connection info)
- :py:meth:`namespace_id_to_name(id, all=False)
  <earwigbot.wiki.site.Site.namespace_id_to_name>`: given a namespace ID,
  returns the primary associated namespace name (or a list of all names when
  ``all`` is ``True``)
- :py:meth:`namespace_name_to_id(name)
  <earwigbot.wiki.site.Site.namespace_name_to_id>`: given a namespace name,
  returns the associated namespace ID
- :py:meth:`get_page(title, follow_redirects=False, ...)
  <earwigbot.wiki.site.Site.get_page>`: returns a ``Page`` object for the given
  title (or a :py:class:`~earwigbot.wiki.category.Category` object if the
  page's namespace is "``Category:``")
- :py:meth:`get_category(catname, follow_redirects=False, ...)
  <earwigbot.wiki.site.Site.get_category>`: returns a ``Category`` object for
  the given title (sans namespace)
- :py:meth:`get_user(username) <earwigbot.wiki.site.Site.get_user>`: returns a
  :py:class:`~earwigbot.wiki.user.User` object for the given username
- :py:meth:`delegate(services, ...) <earwigbot.wiki.site.Site.delegate>`:
  delegates a task to either the API or SQL depending on various conditions,
  such as server lag

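As one illustration of the namespace helpers listed above, the sketch below finds the talk namespace that belongs to a given namespace. It relies on the MediaWiki convention that each even (subject) namespace ID is paired with the next odd (talk) ID. ``talk_namespace_name`` and ``FakeSite`` are hypothetical names; ``FakeSite`` carries a hand-written namespace table and exists only so the example runs without a live wiki.

```python
class FakeSite:
    """Hypothetical stand-in exposing only namespace_id_to_name()."""

    _names = {
        0: "", 1: "Talk",
        2: "User", 3: "User talk",
        4: "Wikipedia", 5: "Wikipedia talk",
    }

    def namespace_id_to_name(self, ns_id, all=False):
        name = self._names[ns_id]
        return [name] if all else name


def talk_namespace_name(site, namespace_id):
    """Return the name of the talk namespace paired with ``namespace_id``."""
    if namespace_id < 0:
        raise ValueError("special namespaces have no talk namespace")
    # Odd IDs are already talk namespaces; even IDs map to the next odd ID.
    talk_id = namespace_id if namespace_id % 2 else namespace_id + 1
    return site.namespace_id_to_name(talk_id)


site = FakeSite()
project_talk = talk_namespace_name(site, 4)
```

With a real site object, the same function would be called as ``talk_namespace_name(bot.wiki.get_site(), 4)``.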

Pages and categories
~~~~~~~~~~~~~~~~~~~~

Create :py:class:`earwigbot.wiki.Page <earwigbot.wiki.page.Page>` objects with
:py:meth:`site.get_page(title) <earwigbot.wiki.site.Site.get_page>`,
:py:meth:`page.toggle_talk() <earwigbot.wiki.page.Page.toggle_talk>`,
:py:meth:`user.get_userpage() <earwigbot.wiki.user.User.get_userpage>`, or
:py:meth:`user.get_talkpage() <earwigbot.wiki.user.User.get_talkpage>`. They
provide the following attributes:

- :py:attr:`~earwigbot.wiki.page.Page.site`: the page's corresponding
  :py:class:`~earwigbot.wiki.site.Site` object
- :py:attr:`~earwigbot.wiki.page.Page.title`: the page's title, or pagename
- :py:attr:`~earwigbot.wiki.page.Page.exists`: whether or not the page exists
- :py:attr:`~earwigbot.wiki.page.Page.pageid`: an integer ID representing the
  page
- :py:attr:`~earwigbot.wiki.page.Page.url`: the page's URL
- :py:attr:`~earwigbot.wiki.page.Page.namespace`: the page's namespace as an
  integer
- :py:attr:`~earwigbot.wiki.page.Page.protection`: the page's current
  protection status
- :py:attr:`~earwigbot.wiki.page.Page.is_talkpage`: ``True`` if the page is a
  talkpage, else ``False``
- :py:attr:`~earwigbot.wiki.page.Page.is_redirect`: ``True`` if the page is a
  redirect, else ``False``

and the following methods:

- :py:meth:`~earwigbot.wiki.page.Page.reload`: forcibly reloads the page's
  attributes (emphasis on *reload* - this is only necessary if there is reason
  to believe they have changed)
- :py:meth:`toggle_talk(...) <earwigbot.wiki.page.Page.toggle_talk>`: returns a
  content page's talk page, or vice versa
- :py:meth:`~earwigbot.wiki.page.Page.get`: returns page content
- :py:meth:`~earwigbot.wiki.page.Page.get_redirect_target`: if the page is a
  redirect, returns its destination
- :py:meth:`~earwigbot.wiki.page.Page.get_creator`: returns a
  :py:class:`~earwigbot.wiki.user.User` object representing the first user to
  edit the page
- :py:meth:`edit(text, summary, minor=False, bot=True, force=False)
  <earwigbot.wiki.page.Page.edit>`: replaces the page's content with ``text``
  or creates a new page
- :py:meth:`add_section(text, title, minor=False, bot=True, force=False)
  <earwigbot.wiki.page.Page.add_section>`: adds a new section named ``title``
  at the bottom of the page
- :py:meth:`copyvio_check(...)
  <earwigbot.wiki.copyvios.CopyvioMixIn.copyvio_check>`: checks the page for
  copyright violations
- :py:meth:`copyvio_compare(url, ...)
  <earwigbot.wiki.copyvios.CopyvioMixIn.copyvio_compare>`: checks the page like
  :py:meth:`~earwigbot.wiki.copyvios.CopyvioMixIn.copyvio_check`, but
  against a specific URL
- :py:meth:`check_exclusion(username=None, optouts=None)
  <earwigbot.wiki.page.Page.check_exclusion>`: checks whether or not we are
  allowed to edit the page per ``{{bots}}``/``{{nobots}}``

Additionally, :py:class:`~earwigbot.wiki.category.Category` objects (created
with :py:meth:`site.get_category(name) <earwigbot.wiki.site.Site.get_category>`
or :py:meth:`site.get_page(title) <earwigbot.wiki.site.Site.get_page>` where
``title`` is in the ``Category:`` namespace) provide the following additional
attributes:

- :py:attr:`~earwigbot.wiki.category.Category.size`: the total number of
  members in the category
- :py:attr:`~earwigbot.wiki.category.Category.pages`: the number of pages in
  the category
- :py:attr:`~earwigbot.wiki.category.Category.files`: the number of files in
  the category
- :py:attr:`~earwigbot.wiki.category.Category.subcats`: the number of
  subcategories in the category

And the following additional method:

- :py:meth:`get_members(limit=None, ...)
  <earwigbot.wiki.category.Category.get_members>`: iterates over
  :py:class:`~earwigbot.wiki.page.Page`\ s in the category, until either the
  category is exhausted or (if given) ``limit`` is reached

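A typical read-modify-write cycle with the page methods above might look like the following sketch. It assumes that ``check_exclusion()`` returns ``True`` when the bot *is* allowed to edit; that reading is an assumption here, so confirm it against the method's docstring. ``append_line`` and ``FakePage`` are hypothetical names, and ``FakePage`` only mimics the three methods used so the example runs without a live wiki.

```python
class FakePage:
    """Hypothetical stand-in mimicking get(), edit(), and check_exclusion()."""

    def __init__(self, text, allowed=True):
        self.text = text
        self.allowed = allowed
        self.last_summary = None

    def get(self):
        return self.text

    def check_exclusion(self, username=None, optouts=None):
        # Assumption: True means that editing is permitted.
        return self.allowed

    def edit(self, text, summary, minor=False, bot=True, force=False):
        self.text = text
        self.last_summary = summary


def append_line(page, line, summary):
    """Append ``line`` to the page unless the page opts out of bot edits."""
    if not page.check_exclusion():
        return False
    new_text = page.get().rstrip("\n") + "\n" + line + "\n"
    page.edit(new_text, summary, minor=True)
    return True


page = FakePage("First line\n")
changed = append_line(page, "Second line", "Add a line")

opted_out = FakePage("{{nobots}}\n", allowed=False)
skipped = append_line(opted_out, "Second line", "Add a line")
```

Checking exclusion before fetching the content avoids a needless read when the page cannot be edited anyway.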
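Because ``get_members()`` iterates lazily, a ``limit`` keeps the work bounded for large categories. The sketch below collects the titles of at most ``limit`` members. ``member_titles`` and ``FakeCategory`` are hypothetical names; the fake yields simple objects with a ``title`` attribute in place of real ``Page`` objects.

```python
from types import SimpleNamespace


class FakeCategory:
    """Hypothetical stand-in mimicking Category.get_members()."""

    def __init__(self, titles):
        self._titles = list(titles)
        self.size = len(self._titles)

    def get_members(self, limit=None):
        for index, title in enumerate(self._titles):
            if limit is not None and index >= limit:
                return
            # Real members would be Page objects; only .title is needed here.
            yield SimpleNamespace(title=title)


def member_titles(category, limit=None):
    """Return the titles of up to ``limit`` members of ``category``."""
    return [page.title for page in category.get_members(limit=limit)]


category = FakeCategory(["Alpha", "Beta", "Gamma"])
first_two = member_titles(category, limit=2)
everything = member_titles(category)
```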

Users
~~~~~

Create :py:class:`earwigbot.wiki.User <earwigbot.wiki.user.User>` objects with
:py:meth:`site.get_user(name) <earwigbot.wiki.site.Site.get_user>` or
:py:meth:`page.get_creator() <earwigbot.wiki.page.Page.get_creator>`. They
provide the following attributes:

- :py:attr:`~earwigbot.wiki.user.User.site`: the user's corresponding
  :py:class:`~earwigbot.wiki.site.Site` object
- :py:attr:`~earwigbot.wiki.user.User.name`: the user's username
- :py:attr:`~earwigbot.wiki.user.User.exists`: ``True`` if the user exists, or
  ``False`` if they do not
- :py:attr:`~earwigbot.wiki.user.User.userid`: an integer ID representing the
  user
- :py:attr:`~earwigbot.wiki.user.User.blockinfo`: information about any current
  blocks on the user (``False`` if no block, or a dict of
  ``{"by": blocking_user, "reason": block_reason,
  "expiry": block_expire_time}``)
- :py:attr:`~earwigbot.wiki.user.User.groups`: a list of the user's groups
- :py:attr:`~earwigbot.wiki.user.User.rights`: a list of the user's rights
- :py:attr:`~earwigbot.wiki.user.User.editcount`: the number of edits made by
  the user
- :py:attr:`~earwigbot.wiki.user.User.registration`: the time the user
  registered as a :py:obj:`time.struct_time`
- :py:attr:`~earwigbot.wiki.user.User.emailable`: ``True`` if you can email the
  user, ``False`` if you cannot
- :py:attr:`~earwigbot.wiki.user.User.gender`: the user's gender (``"male"``,
  ``"female"``, or ``"unknown"``)
- :py:attr:`~earwigbot.wiki.user.User.is_ip`: ``True`` if the user is an IP
  address, IPv4 or IPv6, otherwise ``False``

and the following methods:

- :py:meth:`~earwigbot.wiki.user.User.reload`: forcibly reloads the user's
  attributes (emphasis on *reload* - this is only necessary if there is reason
  to believe they have changed)
- :py:meth:`~earwigbot.wiki.user.User.get_userpage`: returns a
  :py:class:`~earwigbot.wiki.page.Page` object representing the user's userpage
- :py:meth:`~earwigbot.wiki.user.User.get_talkpage`: returns a
  :py:class:`~earwigbot.wiki.page.Page` object representing the user's talkpage

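The user attributes above combine naturally into small policy checks. The sketch below decides whether a user should receive a talk page notice: it skips nonexistent users, IP addresses, and anyone with a current block. The policy and the name ``talkpage_for_notice`` are hypothetical, and the ``SimpleNamespace`` objects merely stand in for real ``User`` objects so the example runs on its own.

```python
from types import SimpleNamespace


def talkpage_for_notice(user):
    """Return the user's talk page if a notice is appropriate, else None."""
    if not user.exists or user.is_ip:
        return None
    # blockinfo is False when there is no block, or a dict describing it.
    if user.blockinfo:
        return None
    return user.get_talkpage()


def make_user(name, exists=True, is_ip=False, blockinfo=False):
    """Build a hypothetical stand-in carrying just the attributes used above."""
    return SimpleNamespace(
        name=name,
        exists=exists,
        is_ip=is_ip,
        blockinfo=blockinfo,
        get_talkpage=lambda: "User talk:" + name,
    )


regular = talkpage_for_notice(make_user("Example"))
anonymous = talkpage_for_notice(make_user("192.0.2.1", is_ip=True))
blocked = talkpage_for_notice(
    make_user("Vandal", blockinfo={"by": "Admin", "reason": "spam",
                                   "expiry": "infinity"})
)
```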

Additional features
~~~~~~~~~~~~~~~~~~~

Not all aspects of the toolset are covered here. Explore `its code and
docstrings`_ to learn how to use it in a more hands-on fashion. For reference,
:py:attr:`bot.wiki <earwigbot.bot.Bot.wiki>` is an instance of
:py:class:`earwigbot.wiki.SitesDB <earwigbot.wiki.sitesdb.SitesDB>` tied to the
:file:`sites.db` file in the bot's working directory.

.. _Pywikipedia framework: http://pywikipediabot.sourceforge.net/
.. _CentralAuth: http://www.mediawiki.org/wiki/Extension:CentralAuth
.. _its code and docstrings: https://github.com/earwig/earwigbot/tree/develop/earwigbot/wiki