Make Selenium work on Github Actions

Overview

Make Selenium work on Github Actions

Scraping with BeautifulSoup on GitHub Actions is easy-peasy. But what about Selenium?? After you jump through some hoops it's just as simple.

How to use this

I think you can just fork this repository and get to work on scraper.py.

Otherwise, just steal scraper.py and .github/workflows/scrape.yml and you'll be good to go.

Saving CSV files

If you want every scrape to update a CSV or something like this, you'll need to edit your scrape.yml to commit to the repo after the scraper is done running.

Take a look at the bottom of autoscraper-history to see how to do that. It should be one more step, something like this:

      - name: Commit and push if content changed
        run: |-
          git config user.name "Automated"
          git config user.email "[email protected]"
          git add -A
          timestamp=$(date -u)
          git commit -m "Latest data: ${timestamp}" || exit 0
          git push

I'm pretty sure I lifted that code 100% from Simmon Willison.

Scheduling

Right now the yaml is set to only scrape when you click the "Run workflow" button in the Actions tab, but you can add something like

  schedule:
    - cron: '0 * * * *'

To make it run the first minute of every hour.

How this all works

To make Selenium + GitHub Actions work, this repo does a few magic things. In a normal world, you start Chrome like this:

from selenium import webdriver

driver = webdriver.Chrome()

But we.... do things a little differently. Let me walk you through the changes!

webdriver-manager to manage the webdriver

You can add a special action to set up Chromedriver but I feel it's honestly easier to use webdriver-manager. It magically picks out the right version of chromedriver for you.

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())

It works on your local machine, too!

Chromium, not Chrome

When you're running GitHub Actions, it's probably on a nice little Ubuntu Linux machine. In those situations, you install software using apt. Since you can't install Chrome with apt, you'll install Chromium instead, the open-source version of Chrome. Works the same, just opens a little differently.

As a result, our chromedriver install code changed a little more:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from webdriver_manager.utils import ChromeType

driver_path = ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install()
driver = webdriver.Chrome(driver_path)

This also means we need to add a line that does apt-get install -y chromium-browser to our scrape.yml)

Headless Chromium

Since we don't have a monitor plugged into GitHub Actions, we can't actually see what's going on in the browser. In the olden days you had to construct some odd technical fake screen, but these days you just run in headless mode!

from selenium import webdriver 
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(options=chrome_options)

Other options and more management of things

To combine the webdriver-manager driver path and the headless Chrome option, there are a lot of hoops to jump through. During the research process a lot of extra chrome options poked up, so I thought hey, let's just add all those, too.

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from webdriver_manager.utils import ChromeType
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

chrome_service = Service(ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install())

chrome_options = Options()
options = [
    "--headless",
    "--disable-gpu",
    "--window-size=1920,1200",
    "--ignore-certificate-errors",
    "--disable-extensions",
    "--no-sandbox",
    "--disable-dev-shm-usage"
]
for option in options:
    chrome_options.add_argument(option)

driver = webdriver.Chrome(service=chrome_service, options=chrome_options)

The biggest change here beyond all those options is the Service thing. Apparently just giving it the path to chromdriver isn't good enough? Who knows, I just do what works.

Owner
Jonathan Soma
baby data journo wrangler @ledeprogram + @littlecolumns, cat wrangler @cat-republic
Jonathan Soma
Flexible test automation for Python

Nox - Flexible test automation for Python nox is a command-line tool that automates testing in multiple Python environments, similar to tox. Unlike to

Stargirl Flowers 941 Jan 03, 2023
A Proof of concept of a modern python CLI with click, pydantic, rich and anyio

httpcli This project is a proof of concept of a modern python networking cli which can be simple and easy to maintain using some of the best packages

Kevin Tewouda 17 Nov 15, 2022
🏃💨 For when you need to fill out feedback in the last minute.

BMSCE Auto Feedback For when you need to fill out feedback in the last minute. 🏃 💨 Setup Clone the repository Run pip install selenium Set the RATIN

Shaan Subbaiah 10 May 23, 2022
Generic automation framework for acceptance testing and RPA

Robot Framework Introduction Installation Example Usage Documentation Support and contact Contributing License Introduction Robot Framework is a gener

Robot Framework 7.7k Jan 07, 2023
输入Google Hacking语句,自动调用Chrome浏览器爬取结果

Google-Hacking-Crawler 该脚本可输入Google Hacking语句,自动调用Chrome浏览器爬取结果 环境配置 python -m pip install -r requirements.txt 下载Chrome浏览器

Jarcis 4 Jun 21, 2022
:game_die: Pytest plugin to randomly order tests and control random.seed

pytest-randomly Pytest plugin to randomly order tests and control random.seed. Features All of these features are on by default but can be disabled wi

pytest-dev 471 Dec 30, 2022
The Good Old Days. | Testing Out A New Module-

The-Good-Old-Days. The Good Old Days. | Testing Out A New Module- Installation Asciimatics supports Python versions 2 & 3. For the precise list of tes

Syntax. 2 Jun 08, 2022
Kent - Fake Sentry server for local development, debugging, and integration testing

Kent is a service for debugging and integration testing Sentry.

Will Kahn-Greene 100 Dec 15, 2022
Automated tests for OKAY websites in Python (Selenium) - user friendly version

Okay Selenium Testy Aplikace určená k testování produkčních webů společnosti OKAY s.r.o. Závislosti K běhu aplikace je potřeba mít v počítači nainstal

Viktor Bem 0 Oct 01, 2022
Show surprise when tests are passing

pytest-pikachu pytest-pikachu prints ascii art of Surprised Pikachu when all tests pass. Installation $ pip install pytest-pikachu Usage Pass the --p

Charlie Hornsby 13 Apr 15, 2022
A toolbar overlay for debugging Flask applications

Flask Debug-toolbar This is a port of the excellent django-debug-toolbar for Flask applications. Installation Installing is simple with pip: $ pip ins

863 Dec 29, 2022
Python dilinin Selenium kütüphanesini kullanarak; Amazon, LinkedIn ve ÇiçekSepeti üzerinde test işlemleri yaptığımız bir case study reposudur.

Python dilinin Selenium kütüphanesini kullanarak; Amazon, LinkedIn ve ÇiçekSepeti üzerinde test işlemleri yaptığımız bir case study reposudur. LinkedI

Furkan Gulsen 8 Nov 01, 2022
AutoExploitSwagger is an automated API security testing exploit tool that can be combined with xray, BurpSuite and other scanners.

AutoExploitSwagger is an automated API security testing exploit tool that can be combined with xray, BurpSuite and other scanners.

6 Jan 28, 2022
This is a web test framework based on python+selenium

Basic thoughts for this framework There should have a BasePage.py to be the parent page and all the page object should inherit this class BasePage.py

Cactus 2 Mar 09, 2022
API Test Automation with Requests and Pytest

api-testing-requests-pytest Install Make sure you have Python 3 installed on your machine. Then: 1.Install pipenv sudo apt-get install pipenv 2.Go to

Sulaiman Haque 2 Nov 21, 2021
Django-google-optimize is a Django application designed to make running server side Google Optimize A/B tests easy.

Django-google-optimize Django-google-optimize is a Django application designed to make running Google Optimize A/B tests easy. Here is a tutorial on t

Adin Hodovic 39 Oct 25, 2022
This file will contain a series of Python functions that use the Selenium library to search for elements in a web page while logging everything into a file

element_search with Selenium (Now With docstrings 😎 ) Just to mention, I'm a beginner to all this, so it it's very possible to make some mistakes The

2 Aug 12, 2021
A modern API testing tool for web applications built with Open API and GraphQL specifications.

Schemathesis Schemathesis is a modern API testing tool for web applications built with Open API and GraphQL specifications. It reads the application s

Schemathesis.io 1.6k Jan 06, 2023
bulk upload files to libgen.lc (Selenium script)

LibgenBulkUpload bulk upload files to http://libgen.lc/librarian.php (Selenium script) Usage ./upload.py to_upload uploaded rejects So title and autho

8 Jul 07, 2022
Code coverage measurement for Python

Coverage.py Code coverage testing for Python. Coverage.py measures code coverage, typically during test execution. It uses the code analysis tools and

Ned Batchelder 2.3k Jan 04, 2023