Demonstration of how to use async Python to control multiple Playwright browsers for web scraping

Last update: Oct 27, 2022

Overview

Playwright Browser Pool

This example shows how to use a pool of browsers to retrieve pages from a list of URLs within a single asynchronous process.
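The pooling idea itself can be sketched without Playwright. The snippet below is a minimal illustration, not the repository's implementation: `ResourcePool`, `fake_get_page`, the `browser-N` labels and the `example.com` URLs are all hypothetical stand-ins. It assumes the pool holds a fixed number of browsers in an `asyncio.Queue` and lends each one to a single task at a time.

```python
import asyncio


class ResourcePool:
    """Lends a fixed set of resources to concurrent tasks, one task per resource at a time."""

    def __init__(self, resources):
        self._queue = asyncio.Queue()
        for resource in resources:
            self._queue.put_nowait(resource)

    async def run(self, job, *args):
        # Wait until a resource is free, use it, then always hand it back.
        resource = await self._queue.get()
        try:
            return await job(resource, *args)
        finally:
            self._queue.put_nowait(resource)


async def fake_get_page(browser, url):
    # Hypothetical stand-in for page retrieval: sleeps instead of loading a page.
    await asyncio.sleep(0.01)
    return {"browser": browser, "url": url}


async def demo():
    # Three stand-in "browsers" serve six URLs within one event loop.
    pool = ResourcePool(["browser-1", "browser-2", "browser-3"])
    urls = [f"https://example.com/{i}" for i in range(6)]
    return await asyncio.gather(*(pool.run(fake_get_page, url) for url in urls))


results = asyncio.run(demo())
print(len(results))
```

Because every task has to take a browser from the queue before it can work, no more than three retrievals run at once here, however many URLs are submitted.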

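import asyncio

# Assumption: BrowserPool is the pool class provided by this repository; the
# module name below is a guess, so adjust the import to match the repository layout.
from browser_pool import BrowserPool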


async def run():
    # some example urls
    urls = [
        "https://www.airbnb.com/experiences/2496585",
        "https://www.airbnb.com/experiences/2488061",
        "https://www.airbnb.com/experiences/2563542",
        "https://www.airbnb.com/experiences/3010357",
        "https://www.airbnb.com/experiences/2624432",
        "https://www.airbnb.com/experiences/3033250",
    ]
    # start a browser pool
    async with BrowserPool(pool_size=3, browser_type="chromium", browser_kwargs={"headless": True}) as pool:
        # concurrently execute page retrieval
        for task in asyncio.as_completed(
            [pool.get_page(url) for url in urls]
        ):
            # as_completed yields awaitables; await each one to get its result
            data = await task
            print(data)
            # will print:
            # {
            #   "content": ...,
            #   "response": ...  (contains response status, headers etc.)
            # }


if __name__ == '__main__':
    asyncio.run(run())
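The example consumes results through `asyncio.as_completed`, which yields awaitables in the order the underlying jobs finish rather than the order they were submitted. The following sketch shows that behaviour using only the standard library. The `fetch` coroutine, its delays and the `example.com` URLs are hypothetical stand-ins for real page retrieval.

```python
import asyncio


async def fetch(url, delay):
    # Hypothetical stand-in for a page retrieval that takes `delay` seconds.
    await asyncio.sleep(delay)
    return {"url": url}


async def demo():
    jobs = [
        fetch("https://example.com/slow", 0.2),
        fetch("https://example.com/fast", 0.01),
    ]
    finished = []
    for next_done in asyncio.as_completed(jobs):
        data = await next_done  # each yielded item must be awaited
        finished.append(data["url"])
    return finished


order = asyncio.run(demo())
print(order)
# ['https://example.com/fast', 'https://example.com/slow']
```

Since the submission order is lost, it helps that each result carries its own URL, so the caller can tell which page a given result belongs to.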

Owner

Bernardas Ališauskas

I like python, education and free software. More on https://gitlab.com/granitosaurus

Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.

1 Jan 12, 2022

A Python module to bypass Cloudflare's anti-bot page.

cloudflare-scrape A simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Reque

3k Jan 04, 2023

Dictionary - Application focused on word search through web scraping

Dictionary - Application focused on word search through web scraping, in addition to other functions such as dictation, spell and conjugation of syllables.

2 May 09, 2022

Telegram group scraper tool

Telegram Group Scrapper

2 Jan 11, 2022

Ebay Webscraper for Getting Average Product Price

Ebay-Webscraper-for-Getting-Average-Product-Price The code in this repo is used to determine the average price of an item on Ebay given a valid search

17 Jan 05, 2023

Minecraft Item Scraper

Minecraft Item Scraper To run, first ensure you have the BeautifulSoup module: pip install bs4 Then run, python minecraft_items.py folder-to-save-ima

1 Dec 29, 2021

A web scraper for nomadlist.com, made to avoid website restrictions.

Gypsylist gypsylist.py is a web scraper for nomadlist.com, made to avoid website restrictions. nomadlist.com is a website with a lot of information fo

5 Nov 24, 2022

A distributed crawler for Weibo, built with Celery and Requests.

4.8k Jan 03, 2023

Danbooru scraper written in Python

Danbooru Version: 0.0.1 License under: MIT License Dependencies Python: = 3.9.7 beautifulsoup4 cloudscraper Example of use Danbooru from danbooru imp

2 Oct 27, 2022

Jobinja.ir jobs scraper.

Jobinja.ir Dataset Introduction This project is a simple web scraper that scrapes pages of jobinja.ir concurrently and writes and updates (if file gets

3 Apr 15, 2022

Scrape Twitter for Tweets

2.2k Jan 05, 2023

This program will help you to properly scrape all data from a specific website

0 May 15, 2022

This scraper collects the email IDs of faculty members from a given link/page and stores them in a CSV file

1 Feb 10, 2022

Python script that reads AliExpress offer URLs from an Excel file (.csv) and posts them to a Telegram channel using a bot

6 Dec 06, 2022