Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations, etc from Wikipedia. Documentation provides code snippets for the most common use cases.

build status Documentation Status Test Coverage Version Py Versions GitHub stars

Installation

This package requires at least Python 3.4 to install because it's using IntEnum.

pip3 install wikipedia-api

Usage

Goal of Wikipedia-API is to provide simple and easy to use API for retrieving informations from Wikipedia. Bellow are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting single page is straightforward. You have to initialize Wikipedia object and ask for page by its name. It's parameter language has be one of supported languages.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

For checking, whether page exists, you can use function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" %     page_missing.exists())
# Page - Exists: False

How To Get Page Summary

Class WikipediaPage has property summary, which returns description of Wiki page.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    print("Page - Title: %s" % page_py.title)
    # Page - Title: Python (programming language)

    print("Page - Summary: %s" % page_py.summary[0:60])
    # Page - Summary: Python is a widely used high-level programming language for

How To Get Page URL

WikipediaPage has two properties with URL of the page. It is fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

How To Get Full Text

To get full text of Wikipedia page you should use property text which constructs text of the page as concatanation of summary and sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

How To Get Page Sections

To get all top level sections of page, you have to use property sections. It returns list of WikipediaPageSection, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
        for s in sections:
                print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
                print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of given page, you should use property langlinks. It is map, where key is language code and value is WikipediaPage.

def print_langlinks(page):
        langlinks = page.langlinks
        for k in sorted(langlinks.keys()):
            v = langlinks[k]
            print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

How To Get Links To Other Pages

If you want to get all links to other wiki pages from given page, you need to use property links. It's map, where key is page title and value is WikipediaPage.

def print_links(page):
        links = page.links
        for title in sorted(links.keys()):
            print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

How To Get Page Categories

If you want to get all categories under which page belongs, you should use property categories. It's map, where key is category title and value is WikipediaPage.

def print_categories(page):
        categories = page.categories
        for title in sorted(categories.keys()):
            print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

How To Get All Pages From Category

To get all pages from given category, you should use property categorymembers. It returns all members of given category. You have to implement recursion and deduplication by yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
        for c in categorymembers.values():
            print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
            if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
                print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

How To See Underlying API Call

If you have problems with retrieving data you can get URL of undrerlying API call. This will help you determine if the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki

External Links

Other Badges

Code Climate Issue Count Coveralls Version Py Versions implementations Downloads Tags github-release Github commits (since latest release) GitHub forks GitHub stars GitHub watchers GitHub commit activity the past week, 4 weeks, year Last commit GitHub code size in bytes GitHub repo size in bytes PyPi License PyPi Wheel PyPi Format PyPi PyVersions PyPi Implementations PyPi Status PyPi Downloads - Day PyPi Downloads - Week PyPi Downloads - Month Libraries.io - SourceRank Libraries.io - Dependent Repos

Other Pages

.. toctree::
        :maxdepth: 2

        API
        CHANGES
        DEVELOPMENT
        wikipediaapi/api

Owner
Martin Majlis
Martin Majlis
A Pythonic client for the official https://data.gov.gr API.

pydatagovgr An unofficial Pythonic client for the official data.gov.gr API. Aims to be an easy, intuitive and out-of-the-box way to: find data publish

Ilias Antonopoulos 40 Nov 10, 2022
Discord-RAID-Tool - Hacks/tools

How to use Python must be installed run install-config If you dont have python installed, download python 3.7.6 and make sure you click on the 'ADD TO

1 Jan 01, 2022
Step by Step Guide To Install Discord Py Master Branch on Replit

Guide to Install Discord Py Master Branch on Replit Step 1 Create an empty repl on replit Step 2 Add this Basic Code to the file main.py so as to chec

Pranav Saxena 7 Nov 18, 2022
Sail is a free CLI tool to deploy, manage and scale WordPress applications in the DigitalOcean cloud.

Deploy WordPress to DigitalOcean with Sail Sail is a free CLI tool to deploy, manage and scale WordPress applications in the DigitalOcean cloud. Conte

Konstantin Kovshenin 159 Dec 12, 2022
A small module to communicate with Triller's API

A small, UNOFFICIAL module to communicate with Triller's API. I plan to add more features/methods in the future.

A3R0 1 Nov 01, 2022
A simple and stupid Miinto API wrapper

miinto-api-wrapper Miinto API Wrapper is a simple python wrapper for Miinto API. Miinto is a fashion luxury marketplace. For more information see the

Giuseppe Checchia 3 Jan 09, 2022
GUI Pancakeswap V2 and Uniswap V3 trading client (and bot)MOST ADVANCE TRADING BOT SUPPORT WINDOWS LINUX MAC

GUI Pancakeswap 2 and Uniswap 3 trading client (and bot) (MOST ADVANCE TRADING BOT SUPPORT WINDOWS LINUX MAC) UPDATE: MUTI TRADE TOKEN ENABLE ,TRADE 1

2 Dec 27, 2021
Generate discord nitro codes and check them

Discord Nitro Generator and Checker A discord nitro generator and checker for all your nitro needs Explore the docs » Report Bug · Request Feature · J

509 Jan 02, 2023
PunkScape Discord bot to lookup rarities, create diptychs and more.

PunkScape Discord Bot A Discord bot created for the Discord server of PunkScapes, a banner NFT project. It was intially created to lookup rarities of

Akuti 4 Jun 24, 2022
A Bot to Upload files to Many Cloud services. Powered by Telethon.

oVo MultiUpload V1.0 👀 A Bot to Upload files to Many Cloud services. Powered by Telethon _ 🎯 Follow me and star this repo for more telegram bots. @H

32 Dec 30, 2022
Slack Developer Kit for Python

Python Slack SDK The Slack platform offers several APIs to build apps. Each Slack API delivers part of the capabilities from the platform, so that you

SlackAPI 3.5k Jan 02, 2023
a script to bulk check usernames on multiple site. includes proxy & threading support.

linked-bulk-checker bulk checks username availability on multiple sites info people have been selling these so i just made one to release dm my discor

krul 9 Sep 20, 2021
Telegram 隨機色圖,支援每日自動爬取

Telegram 隨機色圖機器人 使用此原始碼的Bot 開放的隨機色圖機器人: @katonei_bot 已實現的功能 爬取每日R18排行榜 不夠色!再來一張 Tag 索引,指定Tag色圖 將爬取到的色圖轉為 WebP 格式儲存,節省空間 需要注意的事件 好久之前的怪東西,代碼質量不保證 請在使用A

cluckbird 15 Oct 18, 2021
Semplice pagina di informazione per sapere se e quando è uscito Joypad, il podcast a tema videoludico di Matteo Bordone (Corri!), Francesco Fossetti (Salta!) e Alessandro Zampini (Spara! per finta).

È uscito Joypad? Semplice pagina di informazione per sapere se e quando è uscito Joypad, il podcast a tema videoludico di Matteo Bordone (Corri!), Fra

Paolo Donadeo 32 Jan 02, 2023
Battle.net and PlayStation title watcher that reports updates via Discord.

Renovate Renovate is a Battle.net and PlayStation title watcher that reports updates via Discord. Usage Open config_example.json and provide the confi

Ethan 1 Nov 23, 2022
Raphtory-client - The python client for the Raphtory project

Raphtory Client This is the python client for the Raphtory project Install via p

Raphtory 5 Apr 28, 2022
Improved file host. Change of interface and storage: 15 GB available.

File hosting v2 Improved file host. Change of interface and storage: 15 GB available. This app now uses the Google API to store, view, and delete file

Sarusman 1 Jan 18, 2022
Awslogs - AWS CloudWatch logs for Humans™

awslogs awslogs is a simple command line tool for querying groups, streams and events from Amazon CloudWatch logs. One of the most powerful features i

Jorge Bastida 4.5k Dec 30, 2022
Yet another discord-BOT

Note I have not added comments to the initial code as it is for my educational purpose. Use This is the code for a discord-BOT API py-cord-2.0.0a4178+

IRONMELTS 1 Dec 18, 2021
Telegram Group Manager Bot + Userbot Written In Python Using Pyrogram.

Telegram Group Manager Bot + Userbot Written In Python Using PyrogramTelegram Group Manager Bot + Userbot Written In Python Using Pyrogram

1 Nov 11, 2021