PyMongo - the Python driver for MongoDB

Overview

PyMongo

Info: See the mongo site for more information. See GitHub for the latest source.
Documentation: Available at pymongo.readthedocs.io
Author: Mike Dirolf
Maintainer: Bernie Hackett <[email protected]>

About

The PyMongo distribution contains tools for interacting with MongoDB database from Python. The bson package is an implementation of the BSON format for Python. The pymongo package is a native Python driver for MongoDB. The gridfs package is a gridfs implementation on top of pymongo.

PyMongo supports MongoDB 2.6, 3.0, 3.2, 3.4, 3.6, 4.0, 4.2, and 4.4.

Support / Feedback

For issues with, questions about, or feedback for PyMongo, please look into our support channels. Please do not email any of the PyMongo developers directly with issues or questions - you're more likely to get an answer on the MongoDB Community Forums.

Bugs / Feature Requests

Think you’ve found a bug? Want to see a new feature in PyMongo? Please open a case in our issue management tool, JIRA:

Bug reports in JIRA for all driver projects (i.e. PYTHON, CSHARP, JAVA) and the Core Server (i.e. SERVER) project are public.

How To Ask For Help

Please include all of the following information when opening an issue:

  • Detailed steps to reproduce the problem, including full traceback, if possible.

  • The exact python version used, with patch level:

    $ python -c "import sys; print(sys.version)"
    
  • The exact version of PyMongo used, with patch level:

    $ python -c "import pymongo; print(pymongo.version); print(pymongo.has_c())"
    
  • The operating system and version (e.g. Windows 7, OSX 10.8, ...)

  • Web framework or asynchronous network library used, if any, with version (e.g. Django 1.7, mod_wsgi 4.3.0, gevent 1.0.1, Tornado 4.0.2, ...)

Security Vulnerabilities

If you’ve identified a security vulnerability in a driver or any other MongoDB project, please report it according to the instructions here.

Installation

PyMongo can be installed with pip:

$ python -m pip install pymongo

Or easy_install from setuptools:

$ python -m easy_install pymongo

You can also download the project source and do:

$ python setup.py install

Do not install the "bson" package from pypi. PyMongo comes with its own bson package; doing "easy_install bson" installs a third-party package that is incompatible with PyMongo.

Dependencies

PyMongo supports CPython 3.6+ and PyPy3.6+.

Optional dependencies:

GSSAPI authentication requires pykerberos on Unix or WinKerberos on Windows. The correct dependency can be installed automatically along with PyMongo:

$ python -m pip install pymongo[gssapi]

MONGODB-AWS authentication requires pymongo-auth-aws:

$ python -m pip install pymongo[aws]

Support for mongodb+srv:// URIs requires dnspython:

$ python -m pip install pymongo[srv]

OCSP (Online Certificate Status Protocol) requires PyOpenSSL, requests, service_identity and may require certifi:

$ python -m pip install pymongo[ocsp]

Wire protocol compression with snappy requires python-snappy:

$ python -m pip install pymongo[snappy]

Wire protocol compression with zstandard requires zstandard:

$ python -m pip install pymongo[zstd]

Client-Side Field Level Encryption requires pymongocrypt:

$ python -m pip install pymongo[encryption]

You can install all dependencies automatically with the following command:

$ python -m pip install pymongo[gssapi,aws,ocsp,snappy,srv,tls,zstd,encryption]

Additional dependencies are:

  • (to generate documentation) sphinx

Examples

Here's a basic example (for more see the examples section of the docs):

>>> import pymongo
>>> client = pymongo.MongoClient("localhost", 27017)
>>> db = client.test
>>> db.name
'test'
>>> db.my_collection
Collection(Database(MongoClient('localhost', 27017), 'test'), 'my_collection')
>>> db.my_collection.insert_one({"x": 10}).inserted_id
ObjectId('4aba15ebe23f6b53b0000000')
>>> db.my_collection.insert_one({"x": 8}).inserted_id
ObjectId('4aba160ee23f6b543e000000')
>>> db.my_collection.insert_one({"x": 11}).inserted_id
ObjectId('4aba160ee23f6b543e000002')
>>> db.my_collection.find_one()
{'x': 10, '_id': ObjectId('4aba15ebe23f6b53b0000000')}
>>> for item in db.my_collection.find():
...     print(item["x"])
...
10
8
11
>>> db.my_collection.create_index("x")
'x_1'
>>> for item in db.my_collection.find().sort("x", pymongo.ASCENDING):
...     print(item["x"])
...
8
10
11
>>> [item["x"] for item in db.my_collection.find().limit(2).skip(1)]
[8, 11]

Documentation

Documentation is available at pymongo.readthedocs.io.

To build the documentation, you will need to install sphinx. Documentation can be generated by running python setup.py doc. Generated documentation can be found in the doc/build/html/ directory.

Testing

The easiest way to run the tests is to run python setup.py test in the root of the distribution.

To verify that PyMongo works with Gevent's monkey-patching:

$ python green_framework_test.py gevent

Or with Eventlet's:

$ python green_framework_test.py eventlet
Comments
  • PYTHON-436 Add support for a maximum number of open connections

    PYTHON-436 Add support for a maximum number of open connections

    This patch adds a semaphore to keep track of sockets opened and released from the pool in order to be able to enforce a maximum number of open connections. I've also fixed a few places where sockets were being leaked from the pool (never returned) and an extra socket return in a test.

    Timeout support for the semaphore is added in this patch to support connection timeouts properly.

    Changes have been made to make sure that internal functions can always get a socket from the pool. even if it's at max, so it's possible for more than the configured max to be opened, but these "forced" connections are still tracked and should always be returned.

    https://jira.mongodb.org/browse/PYTHON-436

    opened by reversefold 26
  • PYTHON-436 Change max_pool_size to limit the maximum concurrent connections

    PYTHON-436 Change max_pool_size to limit the maximum concurrent connections

    PYTHON-436 Change max_pool_size to limit the maximum concurrent connections rather than just the idle connections in the pool. Also add support for waitQueueTimeoutMS and waitQueueMultiple.

    https://jira.mongodb.org/browse/PYTHON-436

    See previous pull request for history: https://github.com/mongodb/mongo-python-driver/pull/163

    opened by reversefold 15
  • Change exception wrapping to use repr instead of str

    Change exception wrapping to use repr instead of str

    I have an app that makes heavy use of pymongo. However once in a while I get a "InvalidBSON('',)" exception. I haven't been able to track the original exception because pymongo discards useful information when wrapping exceptions.

    tracked-in-jira 
    opened by ionelmc 14
  • MemoryError while retrieving large cursors

    MemoryError while retrieving large cursors

    This is the issue we experience on windows machines: Python version v2.7.2 Mongo v2.0.7 (running on ubuntu)

    >>> ================================ RESTART ================================
    >>> from pymongo import Connection
    >>> c = Connection("mongodb://user:[email protected]/admin")
    >>> c_users  = c["data"]["user"]
    >>> users = c_users.find({}, ["_id"])
    >>> users.count()
    193845
    >>> for u in users:
        s = u["_id"]
    
    
    
    Traceback (most recent call last):
      File "<pyshell#24>", line 1, in <module>
        for u in users:
      File "build\bdist.win32\egg\pymongo\cursor.py", line 778, in next
        if len(self.__data) or self._refresh():
      File "build\bdist.win32\egg\pymongo\cursor.py", line 742, in _refresh
        limit, self.__id))
      File "build\bdist.win32\egg\pymongo\cursor.py", line 666, in __send_message
        **kwargs)
      File "build\bdist.win32\egg\pymongo\connection.py", line 907, in _send_message_with_response
        return self.__send_and_receive(message, sock_info)
      File "build\bdist.win32\egg\pymongo\connection.py", line 885, in __send_and_receive
        return self.__receive_message_on_socket(1, request_id, sock_info)
      File "build\bdist.win32\egg\pymongo\connection.py", line 877, in __receive_message_on_socket
        return self.__receive_data_on_socket(length - 16, sock_info)
      File "build\bdist.win32\egg\pymongo\connection.py", line 858, in __receive_data_on_socket
        chunk = sock_info.sock.recv(length)
    MemoryError
    >>> 
    

    At this point I am not even convinced that this is pymongo's issue, and not python's, but the change seems to fix the this without any other bad side affects. Please do not hesitate to request for more info, I'm quite eager to have it resolved in driver's mainstream code.

    opened by TomasB 14
  • PYTHON-2467 Fix hex only representation of UUID by allowing for canonical represe…

    PYTHON-2467 Fix hex only representation of UUID by allowing for canonical represe…

    …ntation (Shouldn't be too confusing.. doesn't stray from current behaviour)

    Added in canonical_uuid option to json_options. This was the only way I could think to not break current API globally but also allow for this option to exist.

    >>> test_uuid = uuid.uuid1()
    >>> test_uuid
    UUID('e387e6cc-3b45-11eb-9ec8-00163e0987ed')
    >>> JSON_OPTIONS = bson.json_util.DEFAULT_JSON_OPTIONS.with_options(canonical_uuid=True)
    >>> payload = bson.json_util.dumps({'uuid': test_uuid}, json_options=JSON_OPTIONS)
    >>> payload
    '{"uuid": {"$uuid": "e387e6cc-3b45-11eb-9ec8-00163e0987ed"}}'
    >>> bson.json_util.loads(payload)
    {'uuid': UUID('e387e6cc-3b45-11eb-9ec8-00163e0987ed')}
    >>> 
    
    tracked-in-jira 
    opened by whardier 11
  • Fix up shutdown for periodic executors

    Fix up shutdown for periodic executors

    Hello.

    It's from a robot test that uses loop operator. FOR ${lengthOfSCHMCDXXXX} IN 35 15 36

    Just temporary debug messages:

    _shutdown_executors set([]) _shutdown_executors None _shutdown_executors None

    We get this error when we launch tests.

    unexpected error: Error in atexit._run_exitfuncs:
    Traceback (most recent call last):
      File "C:\Python27\lib\atexit.py", line 24, in _run_exitfuncs
        func(*targs, **kargs)
      File "C:\Python27\lib\site-packages\pymongo\periodic_executor.py", line 133, in _shutdown_executors
        executors = list(_EXECUTORS)
    TypeError: 'NoneType' object is not iterable
    Error in atexit._run_exitfuncs:
    Traceback (most recent call last):
      File "C:\Python27\lib\atexit.py", line 24, in _run_exitfuncs
        func(*targs, **kargs)
      File "C:\Python27\lib\site-packages\pymongo\periodic_executor.py", line 133, in _shutdown_executors
        executors = list(_EXECUTORS)
    TypeError: 'NoneType' object is not iterable
    Error in sys.exitfunc:
    
    [2016.04.26-15:25:09] : accessed by C:\Python27\lib\site-packages\robot\run.py
    Traceback (most recent call last):
      File "C:\Python27\lib\atexit.py", line 24, in _run_exitfuncs
        func(*targs, **kargs)
      File "C:\Python27\lib\site-packages\pymongo\periodic_executor.py", line 133, in _shutdown_executors
        executors = list(_EXECUTORS)
    TypeError: 'NoneType' object is not iterable
    
    test finished 20160426 15:25:09
    
    opened by ghost 11
  • bson.decode_file_iter: allow to bypass read/decode errors

    bson.decode_file_iter: allow to bypass read/decode errors

    If a file is corrupted or cannot be processed by the python driver, the iterator used to stop the processing of the file on the first error. With the new yield_errors optional argument the user can bypass errors.

    Signed-off-by: Nicolas Sebrecht [email protected]

    opened by nicolas33 10
  • Avoid MemoryError on large queries

    Avoid MemoryError on large queries

    When receiving data from large queries we were running into a MemoryError. From investigating sock_info.sock.recv returns a buffer of length size, which is then inserted into the chunks list. Unfortunately we were only receiving a small amount of bytes per iteration so chunks was filling up with items of size (approximately) length and quickly running out of memory. In total our query looked like it would try to allocate about 4Tb worth of memory.

    I've rewritten the function to behave more like the a previous version which fixes the memory issue as the chunk memory is freed once the data has been concatenated to the message.

    opened by tompko 10
  • add hostname validation option during CA validation

    add hostname validation option during CA validation

    There are circumstances where you still require a CA to validate that your key and cert are valid, but forgo verifying that the hostname matches the CN.

    Link to JIRA: https://jira.mongodb.org/browse/PYTHON-834

    This patch adds an extra parameter: ssl_validate_hostname which defaults to True.

    I'm hoping that this will make it into the 2.8 and future branches.

    This is already done in TxMongo which uses Twisted's SSL Context Factory to achieve the same thing.

    opened by psi29a 9
  • AutoReconnect error should be raised instead of TypeError when ReadPrefrence.SECONDARY is used

    AutoReconnect error should be raised instead of TypeError when ReadPrefrence.SECONDARY is used

    This fix is for a bug where non-strings are joined using .join(...) in case the user has ReadPreference.SECONDARY. This causes a TypeError to be thrown instead of the expected AutoReconnect error.

    opened by eKIK 9
  • Automatic per-socket authentication in Connection class

    Automatic per-socket authentication in Connection class

    I have modified the pymongo's Connection class to perform per-socket authentication for databases when credentials are provided in advance. The connection caches information about already-authenticated socket+database combinations, so authentication is performed only once for each new socket.

    Please consider pulling this feature because it will make it much easier to efficiently use MongoDB's auth within multi-threaded web frameworks.

    The changeset includes unit tests that exercise the authentication in single- and multi-threaded scenarios -- these tests are skipped if the local server does not have auth enabled. There is no API documentation yet for the new methods on pymongo.Connection, but I would be happy to add some if it's likely my changes will be accepted.

    Cheers, James

    opened by jmurty 9
Releases(4.3.3)
  • 4.3.3(Nov 17, 2022)

  • 3.13.0(Nov 1, 2022)

  • 4.3.2(Oct 18, 2022)

  • 4.2.0(Jul 20, 2022)

  • 4.2.0b0(Jun 8, 2022)

  • 4.1.1(Apr 13, 2022)

  • 4.1.0(Apr 4, 2022)

  • 4.0.2(Mar 3, 2022)

  • 4.0.1(Mar 3, 2022)

  • 3.12.3(Mar 3, 2022)

  • 4.0(Mar 3, 2022)

  • 3.12.0(Jul 13, 2021)

  • 3.12.0b1(May 28, 2021)

  • 3.11.4(May 28, 2021)

  • 3.12.0b0(Mar 31, 2021)

  • 3.11.2(Dec 2, 2020)

  • 3.11.1(Nov 17, 2020)

  • 3.11.0(Jul 30, 2020)

    PyMongo 3.11.0 - Add support for MongoDB 4.4

    Release notes: https://developer.mongodb.com/community/forums/t/pymongo-3-11-0-released/7371

    Source code(tar.gz)
    Source code(zip)
  • 3.11.0b1(Jun 9, 2020)

    PyMongo 3.11.0b1 - Beta support for MongoDB 4.4

    Release notes: https://developer.mongodb.com/community/forums/t/pymongo-3-11-0b1-released/5156

    Source code(tar.gz)
    Source code(zip)
  • 3.11.0b0(Apr 10, 2020)

  • 3.10.1(Feb 3, 2020)

    Version 3.10.1 fixes the following issues discovered since the release of 3.10.0:

    • Fix a TypeError logged to stderr that could be triggered during server maintenance or during pymongo.mongo_client.MongoClient.close().
    • Avoid creating new connections during pymongo.mongo_client.MongoClient.close().

    Documentation - https://pymongo.readthedocs.io/en/3.10.1/ Changelog - https://pymongo.readthedocs.io/en/3.10.1/changelog.html Installation - https://pymongo.readthedocs.io/en/3.10.1/installation.html

    Source code(tar.gz)
    Source code(zip)
  • 3.10.0(Feb 3, 2020)

    Support for Client-Side Field Level Encryption with MongoDB 4.2 and support for Python 3.8.

    Documentation - https://pymongo.readthedocs.io/en/3.10.0/ Changelog - https://pymongo.readthedocs.io/en/3.10.0/changelog.html Installation - https://pymongo.readthedocs.io/en/3.10.0/installation.html

    Source code(tar.gz)
    Source code(zip)
  • 3.9.0(Feb 3, 2020)

    MongoDB 4.2 support.

    Documentation - https://pymongo.readthedocs.io/en/3.9.0/ Changelog - https://pymongo.readthedocs.io/en/3.9.0/changelog.html Installation - https://pymongo.readthedocs.io/en/3.9.0/installation.html

    Source code(tar.gz)
    Source code(zip)
  • 3.9.0b1(Jun 16, 2019)

    Beta release for MongoDB 4.2 support.

    Documentation - https://api.mongodb.com/python/3.9.0b1/ Changelog - https://api.mongodb.com/python/3.9.0b1/changelog.html#changes-in-version-3-9-0b1 Installation - https://api.mongodb.com/python/3.9.0b1/installation.html#installing-a-beta-or-release-candidate

    Source code(tar.gz)
    Source code(zip)
Sample scripts to show extracting details directly from the AIQUM database

Sample scripts to show extracting details directly from the AIQUM database

1 Nov 19, 2021
Pysolr — Python Solr client

pysolr pysolr is a lightweight Python client for Apache Solr. It provides an interface that queries the server and returns results based on the query.

Haystack Search 626 Dec 01, 2022
Redis client for Python asyncio (PEP 3156)

Redis client for Python asyncio. Redis client for the PEP 3156 Python event loop. This Redis library is a completely asynchronous, non-blocking client

Jonathan Slenders 554 Dec 04, 2022
Logica is a logic programming language that compiles to StandardSQL and runs on Google BigQuery.

Logica: language of Big Data Logica is an open source declarative logic programming language for data manipulation. Logica is a successor to Yedalog,

Evgeny Skvortsov 1.5k Dec 30, 2022
A tool to snapshot sqlite databases you don't own

The core here is my first attempt at a solution of this, combining ideas from browser_history.py and karlicoss/HPI/sqlite.py to create a library/CLI tool to (as safely as possible) copy databases whi

Sean Breckenridge 10 Dec 22, 2022
A Python-based RPC-like toolkit for interfacing with QuestDB.

pykit A Python-based RPC-like toolkit for interfacing with QuestDB. Requirements Python 3.9 Java Azul

QuestDB 11 Aug 03, 2022
Python ODBC bridge

pyodbc pyodbc is an open source Python module that makes accessing ODBC databases simple. It implements the DB API 2.0 specification but is packed wit

Michael Kleehammer 2.6k Dec 27, 2022
Google Cloud Client Library for Python

Google Cloud Python Client Python idiomatic clients for Google Cloud Platform services. Stability levels The development status classifier on PyPI ind

Google APIs 4.1k Jan 01, 2023
A tutorial designed to introduce you to SQlite 3 database using python

SQLite3-python-tutorial A tutorial designed to introduce you to SQlite 3 database using python What is SQLite? SQLite is an in-process library that im

0 Dec 28, 2021
Database connection pooler for Python

Nimue Strange women lying in ponds distributing swords is no basis for a system of government! --Dennis, Peasant Nimue is a database connection pool f

1 Nov 09, 2021
asyncio (PEP 3156) Redis support

aioredis asyncio (PEP 3156) Redis client library. Features hiredis parser Yes Pure-python parser Yes Low-level & High-level APIs Yes Connections Pool

aio-libs 2.2k Jan 04, 2023
A fast PostgreSQL Database Client Library for Python/asyncio.

asyncpg -- A fast PostgreSQL Database Client Library for Python/asyncio asyncpg is a database interface library designed specifically for PostgreSQL a

magicstack 5.8k Dec 31, 2022
asyncio compatible driver for elasticsearch

asyncio client library for elasticsearch aioes is a asyncio compatible library for working with Elasticsearch The project is abandoned aioes is not su

97 Sep 05, 2022
Kafka Connect JDBC Docker Image.

kafka-connect-jdbc This is a dockerized version of the Confluent JDBC database connector. Usage This image is running the connect-standalone command w

Marc Horlacher 1 Jan 05, 2022
Python PostgreSQL adapter to stream results of multi-statement queries without a server-side cursor

streampq Stream results of multi-statement PostgreSQL queries from Python without server-side cursors. Has benefits over some other Python PostgreSQL

Department for International Trade 6 Oct 31, 2022
Neo4j Bolt driver for Python

Neo4j Bolt Driver for Python This repository contains the official Neo4j driver for Python. Each driver release (from 4.0 upwards) is built specifical

Neo4j 762 Dec 30, 2022
Apache Libcloud is a Python library which hides differences between different cloud provider APIs and allows you to manage different cloud resources through a unified and easy to use API

Apache Libcloud - a unified interface for the cloud Apache Libcloud is a Python library which hides differences between different cloud provider APIs

The Apache Software Foundation 1.9k Dec 25, 2022
Python PostgreSQL database performance insights. Locks, index usage, buffer cache hit ratios, vacuum stats and more.

Python PG Extras Python port of Heroku PG Extras with several additions and improvements. The goal of this project is to provide powerful insights int

Paweł Urbanek 35 Nov 01, 2022
PubMed Mapper: A Python library that map PubMed XML to Python object

pubmed-mapper: A Python Library that map PubMed XML to Python object 中文文档 1. Philosophy view UML Programmatically access PubMed article is a common ta

灵魂工具人 33 Dec 08, 2022
This is a repository for a task assigned to me by Bilateral solutions!

Processing-Files-using-MySQL This is a repository for a task assigned to me by Bilateral solutions! Task: Make Folders named Processing,queue and proc

Kandal Khandeka 1 Nov 07, 2022