The Wiki Toolset
================

EarwigBot's answer to the `Pywikipedia framework`_ is the Wiki Toolset
(``earwigbot.wiki``), which you will mainly access through ``bot.wiki``.

``bot.wiki`` provides three methods for the management of Sites -
``get_site()``, ``add_site()``, and ``remove_site()``. Sites are objects that
simply represent a MediaWiki site. A single instance of EarwigBot (i.e. a
single *working directory*) is expected to relate to a single site or group of
sites using the same login info (like all WMF wikis with CentralAuth).

Load your default site (the one that you picked during setup) with
``site = bot.wiki.get_site()``.


Dealing with other sites
~~~~~~~~~~~~~~~~~~~~~~~~

*Skip this section if you're only working with one site.*

If a site is *already known to the bot* (meaning that it is stored in the
``sites.db`` file, which includes just your default wiki at first), you can
load a site with ``site = bot.wiki.get_site(name)``, where ``name`` might be
``"enwiki"`` or ``"frwiktionary"`` (you can also do
``site = bot.wiki.get_site(project="wikipedia", lang="en")``). Recall that not
giving any arguments to ``get_site()`` will return the default site.

``add_site()`` is used to add new sites to the sites database. It may be called
with arguments similar to those of ``get_site()``, but the difference is
important. ``get_site()`` only needs enough information to identify the site in
its database, which is usually just its name; the database stores all other
necessary connection info. With ``add_site()``, you need to provide enough
connection info so the toolset can successfully access the site's API/SQL
databases and store that information for later. That might not be much; for
WMF wikis, you can usually use code like this::

    import earwigbot

    project, lang = "wikipedia", "es"
    try:
        site = bot.wiki.get_site(project=project, lang=lang)
    except earwigbot.SiteNotFoundError:
        # Load site info from http://es.wikipedia.org/w/api.php:
        site = bot.wiki.add_site(project=project, lang=lang)

This works because EarwigBot assumes that the URL for the site is
``"//{lang}.{project}.org"`` and the API is at ``/w/api.php``; this might
change if you're dealing with non-WMF wikis, where the code might look
something more like::

    project, lang = "mywiki", "it"
    try:
        site = bot.wiki.get_site(project=project, lang=lang)
    except earwigbot.SiteNotFoundError:
        # Load site info from http://mysite.net/mywiki/it/s/api.php:
        base_url = "http://mysite.net/" + project + "/" + lang
        db_name = lang + project + "_p"
        # Dict keys must be strings; bare names would raise NameError.
        sql = {"host": "sql.mysite.net", "db": db_name}
        site = bot.wiki.add_site(base_url=base_url, script_path="/s", sql=sql)

``remove_site()`` does the opposite of ``add_site()``: give it a site's name
or a project/lang pair like ``get_site()`` takes, and it'll remove that site
from the sites database.

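The URL convention described in this section can be illustrated with a small
standalone sketch. The function ``guess_api_url`` and its defaults are
hypothetical and not part of EarwigBot; it only mirrors the documented
assumption that a site lives at ``//{lang}.{project}.org`` with its API under
the script path (``/w`` on WMF wikis), and that an explicit ``base_url`` and
``script_path`` replace those defaults. EarwigBot's own logic may differ.

```python
def guess_api_url(project=None, lang=None, base_url=None, script_path="/w"):
    """Build an API URL following the convention described in this section.

    Hypothetical helper for illustration only; not an EarwigBot function.
    """
    if base_url is None:
        if project is None or lang is None:
            raise ValueError("need either base_url or both project and lang")
        # Protocol-relative URL, matching the documented default.
        base_url = "//{lang}.{project}.org".format(lang=lang, project=project)
    # The API endpoint sits directly under the script path.
    return base_url.rstrip("/") + script_path + "/api.php"


# WMF-style site: only project and lang are needed.
print(guess_api_url(project="wikipedia", lang="es"))
# Non-WMF site: base URL and script path are given explicitly.
print(guess_api_url(base_url="http://mysite.net/mywiki/it", script_path="/s"))
```

The two calls print ``//es.wikipedia.org/w/api.php`` and
``http://mysite.net/mywiki/it/s/api.php``, which correspond to the API
locations named in the comments of the two ``add_site()`` examples.
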

Sites
~~~~~

``Site`` objects provide the following attributes:

- ``name``: the site's name (or "wikiid"), like ``"enwiki"``
- ``project``: the site's project name, like ``"wikipedia"``
- ``lang``: the site's language code, like ``"en"``
- ``domain``: the site's web domain, like ``"en.wikipedia.org"``

and the following methods:

- ``api_query(**kwargs)``: does an API query with the given keyword arguments
  as params
- ``sql_query(query, params=(), ...)``: does an SQL query and yields its
  results (as a generator)
- ``get_replag()``: returns the estimated database replication lag (if we have
  the site's SQL connection info)
- ``namespace_id_to_name(id, all=False)``: given a namespace ID, returns the
  primary associated namespace name (or a list of all names when ``all`` is
  ``True``)
- ``namespace_name_to_id(name)``: given a namespace name, returns the
  associated namespace ID
- ``get_page(title, follow_redirects=False)``: returns a ``Page`` object for
  the given title (or a ``Category`` object if the page's namespace is
  "``Category:``")
- ``get_category(catname, follow_redirects=False)``: returns a ``Category``
  object for the given title (sans namespace)
- ``get_user(username)``: returns a ``User`` object for the given username

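As a rough illustration of the namespace methods listed above, the sketch
below builds a full page title from a namespace ID. ``StubSite`` and
``full_title`` are hypothetical names used only so the example runs without a
configured bot; a real ``Site`` would be obtained from ``bot.wiki.get_site()``.
The sketch assumes that the main namespace (ID ``0``) has an empty name, as is
standard in MediaWiki.

```python
class StubSite:
    """Hypothetical stand-in implementing only the two namespace methods."""

    _names = {0: "", 1: "Talk", 2: "User", 14: "Category"}

    def namespace_id_to_name(self, ns_id, all=False):
        # Mirrors the documented signature: primary name, or a list if all=True.
        name = self._names[ns_id]
        return [name] if all else name

    def namespace_name_to_id(self, name):
        for ns_id, ns_name in self._names.items():
            if ns_name == name:
                return ns_id
        raise KeyError(name)


def full_title(site, ns_id, base_title):
    """Prefix ``base_title`` with the primary name of namespace ``ns_id``."""
    prefix = site.namespace_id_to_name(ns_id)
    # The main namespace has an empty name, so no prefix is added for it.
    return "{0}:{1}".format(prefix, base_title) if prefix else base_title


site = StubSite()
print(full_title(site, 14, "Living people"))
print(full_title(site, 0, "Python"))
```

Because ``full_title`` relies only on ``namespace_id_to_name()``, the same
function should work with any object that offers that method.
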

Pages (and Categories)
~~~~~~~~~~~~~~~~~~~~~~

Create ``Page`` objects with ``site.get_page(title)``,
``page.toggle_talk()``, ``user.get_userpage()``, or ``user.get_talkpage()``.
They provide the following attributes:

- ``title``: the page's title, or pagename
- ``exists``: whether the page exists
- ``pageid``: an integer ID representing the page
- ``url``: the page's URL
- ``namespace``: the page's namespace as an integer
- ``protection``: the page's current protection status
- ``is_talkpage``: ``True`` if the page is a talkpage, else ``False``
- ``is_redirect``: ``True`` if the page is a redirect, else ``False``

and the following methods:

- ``reload()``: forcibly reload the page's attributes (emphasis on *reload* -
  this is only necessary if there is reason to believe they have changed)
- ``toggle_talk(...)``: returns a content page's talk page, or vice versa
- ``get()``: returns page content
- ``get_redirect_target()``: if the page is a redirect, returns its destination
- ``get_creator()``: returns a ``User`` object representing the first user to
  edit the page
- ``edit(text, summary, minor=False, bot=True, force=False)``: replaces the
  page's content with ``text`` or creates a new page
- ``add_section(text, title, minor=False, bot=True, force=False)``: adds a new
  section named ``title`` at the bottom of the page
- ``copyvio_check(...)``: checks the page for copyright violations
- ``copyvio_compare(url, ...)``: checks the page like ``copyvio_check()``, but
  against a specific URL

Additionally, ``Category`` objects (created with ``site.get_category(name)`` or
``site.get_page(title)`` where ``title`` is in the ``Category:`` namespace)
provide the following additional method:

- ``get_members(use_sql=False, limit=None)``: returns a list of page titles in
  the category (limit is ``50`` by default if using the API)

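A minimal sketch of how ``is_talkpage`` and ``toggle_talk()`` might be combined
to always end up on the content page, whichever side you start from.
``StubPage`` and ``subject_page`` are hypothetical names introduced here so the
example runs on its own; the stub only imitates the documented behaviour for
the ``Talk:`` namespace and is not how EarwigBot implements ``Page``.

```python
class StubPage:
    """Hypothetical stand-in with only ``title``, ``is_talkpage``, ``toggle_talk()``."""

    def __init__(self, title, is_talkpage):
        self.title = title
        self.is_talkpage = is_talkpage

    def toggle_talk(self):
        # Talk page -> content page, or content page -> talk page.
        if self.is_talkpage:
            return StubPage(self.title.split(":", 1)[1], False)
        return StubPage("Talk:" + self.title, True)


def subject_page(page):
    """Return the content (non-talk) page that ``page`` belongs to."""
    return page.toggle_talk() if page.is_talkpage else page


print(subject_page(StubPage("Talk:Python", True)).title)
print(subject_page(StubPage("Python", False)).title)
```

With a real bot, ``page`` would come from ``site.get_page(title)`` instead of
the stub.
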

Users
~~~~~

Create ``User`` objects with ``site.get_user(name)`` or
``page.get_creator()``. They provide the following attributes:

- ``name``: the user's username
- ``exists``: ``True`` if the user exists, or ``False`` if they do not
- ``userid``: an integer ID representing the user
- ``blockinfo``: information about any current blocks on the user (``False`` if
  no block, or a dict of ``{"by": blocking_user, "reason": block_reason,
  "expiry": block_expire_time}``)
- ``groups``: a list of the user's groups
- ``rights``: a list of the user's rights
- ``editcount``: the number of edits made by the user
- ``registration``: the time the user registered as a ``time.struct_time``
- ``emailable``: ``True`` if you can email the user, ``False`` if you cannot
- ``gender``: the user's gender (``"male"``, ``"female"``, or ``"unknown"``)

and the following methods:

- ``reload()``: forcibly reload the user's attributes (emphasis on *reload* -
  this is only necessary if there is reason to believe they have changed)
- ``get_userpage()``: returns a ``Page`` object representing the user's
  userpage
- ``get_talkpage()``: returns a ``Page`` object representing the user's
  talkpage

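Since ``blockinfo`` is either ``False`` or a dict, code that reads it has to
handle both shapes. The sketch below shows one way to do that. The function
``describe_block`` is hypothetical, and ``types.SimpleNamespace`` objects stand
in for real ``User`` objects so the example runs without a configured bot; the
sample values are made up for illustration.

```python
from types import SimpleNamespace


def describe_block(user):
    """Summarize a user's block status from ``name`` and ``blockinfo``."""
    info = user.blockinfo
    if not info:
        # Documented as False when the user is not blocked.
        return "{0} is not blocked".format(user.name)
    return "{0} was blocked by {1} ({2}); expires {3}".format(
        user.name, info["by"], info["reason"], info["expiry"]
    )


# Stand-ins for User objects; the values are illustrative only.
free_user = SimpleNamespace(name="Alice", blockinfo=False)
blocked_user = SimpleNamespace(
    name="Bob",
    blockinfo={"by": "AdminUser", "reason": "spam", "expiry": "infinity"},
)

print(describe_block(free_user))
print(describe_block(blocked_user))
```

With a real bot, the argument would be the result of ``site.get_user(name)``.
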

Additional features
~~~~~~~~~~~~~~~~~~~

Not all aspects of the toolset are covered here. Explore `its code and
docstrings`_ to learn how to use it in a more hands-on fashion. For reference,
``bot.wiki`` is an instance of ``earwigbot.wiki.SitesDB`` tied to the
``sites.db`` file in the bot's working directory.

.. _Pywikipedia framework: http://pywikipediabot.sourceforge.net/
.. _its code and docstrings: https://github.com/earwig/earwigbot/tree/develop/earwigbot/wiki