Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations, etc from Wikipedia. Documentation provides code snippets for the most common use cases.

build status Documentation Status Test Coverage Version Py Versions GitHub stars

Installation

This package requires at least Python 3.4 to install because it's using IntEnum.

pip3 install wikipedia-api

Usage

Goal of Wikipedia-API is to provide simple and easy to use API for retrieving informations from Wikipedia. Bellow are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting single page is straightforward. You have to initialize Wikipedia object and ask for page by its name. It's parameter language has be one of supported languages.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

For checking, whether page exists, you can use function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" %     page_missing.exists())
# Page - Exists: False

How To Get Page Summary

Class WikipediaPage has property summary, which returns description of Wiki page.

import wikipediaapi
    wiki_wiki = wikipediaapi.Wikipedia('en')

    print("Page - Title: %s" % page_py.title)
    # Page - Title: Python (programming language)

    print("Page - Summary: %s" % page_py.summary[0:60])
    # Page - Summary: Python is a widely used high-level programming language for

How To Get Page URL

WikipediaPage has two properties with URL of the page. It is fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

How To Get Full Text

To get full text of Wikipedia page you should use property text which constructs text of the page as concatanation of summary and sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

How To Get Page Sections

To get all top level sections of page, you have to use property sections. It returns list of WikipediaPageSection, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
        for s in sections:
                print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
                print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of given page, you should use property langlinks. It is map, where key is language code and value is WikipediaPage.

def print_langlinks(page):
        langlinks = page.langlinks
        for k in sorted(langlinks.keys()):
            v = langlinks[k]
            print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

How To Get Links To Other Pages

If you want to get all links to other wiki pages from given page, you need to use property links. It's map, where key is page title and value is WikipediaPage.

def print_links(page):
        links = page.links
        for title in sorted(links.keys()):
            print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

How To Get Page Categories

If you want to get all categories under which page belongs, you should use property categories. It's map, where key is category title and value is WikipediaPage.

def print_categories(page):
        categories = page.categories
        for title in sorted(categories.keys()):
            print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

How To Get All Pages From Category

To get all pages from given category, you should use property categorymembers. It returns all members of given category. You have to implement recursion and deduplication by yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
        for c in categorymembers.values():
            print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
            if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
                print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

How To See Underlying API Call

If you have problems with retrieving data you can get URL of undrerlying API call. This will help you determine if the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki

External Links

Other Badges

Code Climate Issue Count Coveralls Version Py Versions implementations Downloads Tags github-release Github commits (since latest release) GitHub forks GitHub stars GitHub watchers GitHub commit activity the past week, 4 weeks, year Last commit GitHub code size in bytes GitHub repo size in bytes PyPi License PyPi Wheel PyPi Format PyPi PyVersions PyPi Implementations PyPi Status PyPi Downloads - Day PyPi Downloads - Week PyPi Downloads - Month Libraries.io - SourceRank Libraries.io - Dependent Repos

Other Pages

.. toctree::
        :maxdepth: 2

        API
        CHANGES
        DEVELOPMENT
        wikipediaapi/api

Owner
Martin Majlis
Martin Majlis
The smart farm is an idea that designing Smart Farm by IoT

The smart farm is an idea that designing Smart Farm by IoT. Using Raspberry Pi 4 detect the data from different sensors(Raindrop sensor and DHT22 sensor), and push the data to Azure IoT central.

Jiage 1 Jan 11, 2022
YouTube bot, this is just my introduction to api and requests, this isn't intended on being an actual view bot.

YouTube bot, this is just my introduction to api and requests, this isn't intended on being an actual view bot.

Aran 2 Jul 25, 2022
52pojie 吾爱破解论坛 签到 支持云函数/服务器等Py3环境运行

52pojie-Checkin 52pojie 吾爱破解论坛 签到 Py3单程序 支持云函数/服务器等Py3环境运行 只需要Cookie即可运行 新版说明 依赖包请用项目 https://github.com/BlueSkyXN/requirements-serverless 需要填写的参数有 co

BlueSkyXN 22 Sep 15, 2022
This is a very easy to use tool developed in python that will search for free courses from multiple sites including youtube and enroll in the ones in which it can.

Free-Course-Hunter-and-Enroller This is a very easy to use tool developed in python that will search for free courses from multiple sites including yo

Zain 12 Nov 12, 2022
A simple discord tool that translates english to either spanish, german or french and sends it. Free to rework but please give me credit.

discord-translator A simple discord tool that translates english to either spanish, german or french and sends it. Free to rework but please give me c

TrolledTooHard 2 Oct 04, 2021
Change your discord avatar every x h/d based on a list of images

Discord-Avatar-Autochange Introduction A simple script that automatically keeps changing your discord avatar after a given amount of time based on the

Armin Amiri 5 Apr 30, 2022
Download nitro generator that generates free nitro code that you can use for Discord

Download nitro generator that generates free nitro code that you can use for Discord, run it and wait for free nitro to come

Umut Bayraktar 154 Jan 05, 2023
Cool Discord bot for you

BountyBot Баунти – современный бот созданный с целью сделать ваш сервер лучше! В кратце В нем присутствует множество основных и интересных функций, та

Leestarb Original 1 Nov 22, 2021
Бот для скачивания треков с Deezer используя ISRC и UPC коды

deez_robot Запуск Установите необходимые библиотеки pip install -r requirements.txt Создайте файл config.py и поместите туда токен бота и ARL-токен De

Max 4 Jul 31, 2022
Adds a new git subcommand named "ranch".

Git Ranch This script adds ranch, a new subcommand for git that makes it easier to order 1 Gallon of Kraft Ranch Salad Dressing from Amazon. Installat

Austin T Schaffer 8 Jul 06, 2022
A python to scratch API connector. Can fetch data from the API and send it back in cloud variables.

Scratch2py Scratch2py or S2py is a easy to use, versatile tool to communicate with the Scratch API Based of scratchclient by Raihan142857 Installation

20 Jun 18, 2022
Linky bot, A open-source discord bot that allows you to add links to ur website, youtube url, etc for the people all around discord to see!

LinkyBot Linky bot, An open-source discord bot that allows you to add links to ur website, youtube url, etc for the people all around discord to see!

AlexyDaCoder 1 Sep 20, 2022
Github integration with Telegram

The Telegram bot myGit is your GiHub assistant. In your conversations with your team, you can simply insert the information about the projects you are working at.

Alexandru Buzescu 2 Jan 06, 2022
GET-ACQ is a python tool used to gather all companies acquired by a given company domain name.

get-acq 🏢 GET-ACQ is a python tool used to gather all companies acquired by a given company domain name. It is done by calling SecurityTrails API. Us

Milan 7 Dec 19, 2022
BT CCXT Store

bt-ccxt-store-cn backtrader是一个非常好的开源量化回测平台,我自己也时常用它,backtrader也能接入实盘,而bt-ccxt-store就是帮助backtrader接入数字货币实盘交易的一个插件,但是bt-ccxt-store的某些实现并不是很好,无节制的网络轮询,一些

moses 40 Dec 31, 2022
Info gathering | API hacketarget.com

InfoFetch Info gathering | API hackertarget.com set-up: apt-get install python3 pip3 install requests apt-get install git git clone https://github.com

Muhammed Rizad 4 Nov 22, 2021
A Flask & Twilio Secret Santa app.

🎄 ✨ Secret Santa Twilio ✨ 📱 A contactless Secret Santa game built with Python, Flask and Twilio! Prerequisites 📝 A Twilio account. Sign up here ngr

Sangeeta Jadoonanan 5 Dec 23, 2021
Mailjet API implementation in Python

READ THIS FIRST!! This repository isn't compatible with the current Mailjet API (v3) and, as a consequence, is considered deprecated and won't undergo

Rick van Hattem 18 Oct 21, 2022
基于nonebot2开发的群管机器人qbot,支持上传并运行python代码以及一些基础管理功能

nonebot2-Eleina 基于nonebot2开发的群管机器人qbot,支持上传并运行python代码以及一些基础管理功能 Readme 环境:python3.7.3+,go-cqhttp 安装及配置:参见(https://v2.nonebot.dev/guide/installation.h

1 Dec 06, 2022
Probably Overengineered Unimore Booker

POUB Probably Overengineered Unimore Booker A python-powered, actor-based, telegram-facing, timetable-aware booker for unimore (if you know more adjec

Lorenzo Rossi 3 Feb 20, 2022