This program scrapes information and images for movies and TV shows.

Last update: Dec 05, 2021

Media-WebScraper

Summary

For more information on the program, read the WebScrape_help text file (this can also be accessed while running the program).

For each title in a given media list, the program scrapes and saves general information, the poster image and, for TV shows, optional episode information.

General Information (default):

Saved as a .txt file

This will scrape general information:

Title
Release date
Runtime
Genre
Director
Cast
Plot description

Additional information saved:

Source database used for scrape
ID for media in source database
Poster image link

Images (default):

Saved as a .jpg file

This will scrape the poster.

Episode Information (if specified):

Saved as a .csv file

This will scrape information for each episode for a TV show:

Season number
Episode number
Episode title
Episode air date
Episode description
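
As an illustration of the episode output, the sketch below writes one row per episode to a .csv file using Python's standard csv module. The column names and the function names `save_episodes` and `load_episodes` are assumptions made for this example; the program's actual column layout may differ.

```python
import csv
from pathlib import Path

# Column names are an assumption based on the episode fields listed above.
FIELDNAMES = ["season", "episode", "title", "air_date", "description"]


def save_episodes(episodes, csv_path):
    """Write a header row followed by one row per episode dict."""
    csv_path = Path(csv_path)
    # newline="" lets the csv module control line endings on every platform.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(episodes)
    return csv_path


def load_episodes(csv_path):
    """Read the file back as a list of dicts (all values are strings)."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Using `csv.DictWriter` means that commas and quotes inside an episode description are escaped automatically, so the file can be read back without losing fields.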

Features:

Multithreaded scraping of the media list, which greatly reduces the time taken to scrape large lists.
Can generate a media list from folders and files in a specified directory or from user input.
Can specify save location for scraped data.
Can specify search tags for media list for a more accurate scrape.
Can choose to scrape all episode information for a TV show.
Can detect data that has already been scraped, which makes adding new media to a previously scraped list very efficient.
Can recover one or more missing scraped files without rescraping all data.
Can retry incomplete scrapes before exiting the program (successfully scraped files will not be altered or rescraped).
Currently only supports scraping data from IMDb.
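
Several of the features above (building a media list from a directory, skipping data that is already scraped, and multithreaded scraping) can be sketched together with the standard library. This is a minimal sketch and not the program's actual code: the `<title>.txt` / `<title>.jpg` / `<title>.csv` naming scheme, the function names, and the `scrape_one` callable are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def media_list_from_directory(directory):
    """Use folder names and file names (without extension) as media titles."""
    return sorted({entry.stem if entry.is_file() else entry.name
                   for entry in Path(directory).iterdir()})


def missing_files(title, save_dir, want_episodes=False):
    """Return the expected output files for `title` that do not exist yet.

    The one-file-per-suffix naming scheme is an assumption for this sketch.
    """
    suffixes = [".txt", ".jpg"] + ([".csv"] if want_episodes else [])
    candidates = [Path(save_dir) / (title + suffix) for suffix in suffixes]
    return [path for path in candidates if not path.exists()]


def scrape_all(titles, save_dir, scrape_one, max_workers=8):
    """Scrape only the titles with missing files, using a thread pool.

    `scrape_one(title, missing)` is a hypothetical callable supplied by the
    caller. Returns {title: exception or None} for every title attempted, so
    failed titles can be retried without touching the completed ones.
    """
    pending = {}
    for title in titles:
        missing = missing_files(title, save_dir)
        if missing:  # fully scraped titles are skipped entirely
            pending[title] = missing

    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scrape_one, title, missing): title
                   for title, missing in pending.items()}
        for future, title in futures.items():
            # exception() waits for the task and returns None on success.
            results[title] = future.exception()
    return results
```

Threads suit this workload because scraping is dominated by waiting on network responses rather than by computation.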

Usage:

For more information on the program, read the WebScrape_help text file (this can also be accessed while running the program).

Currently a terminal-based program.

Running the program using python:

Requirements: Python 3.2+ (additional libraries: requests, beautifulsoup4)
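
Before running from source, it can be useful to confirm that the additional libraries are installed. The sketch below is one way to check this with the standard library; the function name `missing_packages` is an assumption for this example and is not part of the program.

```python
import importlib.util

# Package name as installed -> module name as imported.
# beautifulsoup4 is imported under the name "bs4".
REQUIRED = {"requests": "requests", "beautifulsoup4": "bs4"}


def missing_packages(required=REQUIRED):
    """Return the package names whose modules cannot be found."""
    return [package for package, module in required.items()
            if importlib.util.find_spec(module) is None]
```

Calling `missing_packages()` returns an empty list when both libraries are available, or the names of the packages that still need to be installed.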

Running the program from bundled executable file (created using pyinstaller):

Requirements: Windows 10
Creates a 'temp' folder containing extracted libraries and support files in the same location as the program while running.
- The temporary files are deleted automatically, but if the program is closed abruptly, they will remain.
- The 'temp' folder can be manually deleted after closing the program.
- (As of pyinstaller v4.7, a one-file bundled executable will leave any temp '_MEIxxxxxx' folders if the program is force closed)
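
If the program was force closed, the leftover folder can also be removed with a short script instead of by hand. This is a minimal sketch that assumes the folder is named 'temp' and sits next to the executable, as described above; the function name `remove_leftover_temp` is hypothetical. It should only be run after the program has been closed.

```python
import shutil
from pathlib import Path


def remove_leftover_temp(program_dir, folder_name="temp"):
    """Delete the leftover temp folder next to the program, if it exists.

    Returns True if a folder was removed and False if there was nothing to do.
    """
    target = Path(program_dir) / folder_name
    if target.is_dir():
        # Removes the folder and everything in it, including any
        # '_MEIxxxxxx' subfolders left behind by the bundled executable.
        shutil.rmtree(target)
        return True
    return False
```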

Updates:

For information on version history, read the HISTORY markdown file.

Releases (v1.3.0)

v1.3.0 (Dec 5, 2021)

WebScrape v1.3.0

See version history document for all changes.

Running the program using python:

Download the source code.

Requirements:

Python 3.2+ (additional libraries: requests, beautifulsoup4)

Running the program from bundled executable:

Download the WebScrape-1.3.0 zip file containing the bundled executable (created using pyinstaller).

Requirements:

Windows 10

Note:

The executable file creates a 'temp' folder containing extracted libraries and support files in the same location as the program while running.

The temporary files are deleted automatically, but if the program is closed abruptly, they will remain.

The 'temp' folder can be manually deleted after closing the program.

(As of pyinstaller v4.7, a one-file bundled executable will leave any temp '_MEIxxxxxx' folders if the program is force closed)

Source code(tar.gz)
Source code(zip)
WebScrape-1.3.0.zip(8.71 MB)
