A Spider for BiliBili comments with a simple API server.

Last update: Jul 05, 2021

Overview

BiliComment

A spider for BiliBili comments.

Spider Usage

Put config.json into the config directory, then run:

python . ./config/config.json

An example config is located at /config/config.json.example.
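The spider receives the path to its JSON config as its single command-line argument. A minimal sketch of how such an entry point might read that file is shown below. The names load_config and main are hypothetical, and the config's actual keys are not documented here, so the sketch only lists whatever keys it finds.

```python
import json
import sys
from pathlib import Path


def load_config(path):
    """Read a JSON config file and return its contents as a dict.

    Hypothetical helper: the real project's loader is not shown in the README.
    """
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"config file not found: {config_path}")
    with config_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def main(argv):
    """Expect exactly one argument: the path to config.json."""
    if len(argv) != 2:
        print("usage: python . <path-to-config.json>", file=sys.stderr)
        return 2
    config = load_config(argv[1])
    # The config's real keys are unknown here, so only report what was loaded.
    print("loaded keys:", sorted(config))
    return 0
```

Calling main(sys.argv) from a package's __main__.py would mirror the python . ./config/config.json invocation above.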

API Usage

We use uwsgi to start our web server (with a very basic frontend):

uwsgi --ini uwsgi.ini
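For readers unfamiliar with uwsgi: it serves a WSGI callable, and by default it looks for one named application. The sketch below is a generic, hypothetical WSGI app that returns a small JSON body. It is not this project's API server, whose routes and framework are not described here.

```python
import json


def application(environ, start_response):
    """Minimal WSGI callable; `application` is the name uWSGI looks for by default.

    Hypothetical example: echoes the request path back as JSON.
    """
    payload = {"path": environ.get("PATH_INFO", "/"), "status": "ok"}
    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of bytes objects.
    return [body]
```

Saved as a file such as app.py (a hypothetical name), this could be served directly with uwsgi --http :8000 --wsgi-file app.py. An ini file like the project's uwsgi.ini usually holds the same kind of options.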

ProxyPool Required

This project uses jhao104/proxy_pool as its proxy pool. You need to deploy it before starting the spider.
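To check that the proxy pool is reachable before starting the spider, a small client like the one below can help. It is a sketch under assumptions: it relies on proxy_pool's HTTP API as described in that project's README (a /get/ endpoint returning JSON with a proxy field in host:port form, served at 127.0.0.1:5010 by default). The helper names are hypothetical, and how BiliComment itself consumes the pool is not shown here.

```python
import json
from typing import Optional
from urllib.request import urlopen

# Assumption: proxy_pool is running locally on its default port.
PROXY_POOL_URL = "http://127.0.0.1:5010"


def proxies_from_payload(payload: dict) -> Optional[dict]:
    """Turn a proxy_pool /get/ response into a requests-style proxies mapping.

    Returns None when the pool has no proxy to offer.
    """
    proxy = payload.get("proxy")
    if not proxy:
        return None
    return {"http": f"http://{proxy}", "https": f"http://{proxy}"}


def get_proxy(base_url: str = PROXY_POOL_URL, timeout: float = 5.0) -> Optional[dict]:
    """Ask the pool for one proxy; raises URLError if the pool is unreachable."""
    with urlopen(f"{base_url}/get/", timeout=timeout) as resp:
        payload = json.load(resp)
    return proxies_from_payload(payload)
```

If get_proxy() raises or returns None, the pool is either not deployed or has no working proxies yet.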

Owner

Hao

Scrapy, a fast high-level web crawling & scraping framework for Python.

Scrapy Overview Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pag

45.5k Jan 07, 2023

An experiment to deploy a serverless infrastructure for a scrapy project.

Serverless Scrapy project This project aims to evaluate the feasibility of an architecture based on serverless technology for a web crawler using scra

5 Jul 08, 2022

Extract embedded metadata from HTML markup

extruct extruct is a library for extracting embedded metadata from HTML markup. Currently, extruct supports: W3C's HTML Microdata embedded JSON-LD Mic

725 Jan 03, 2023

PyQuery-based scraping micro-framework.

demiurge PyQuery-based scraping micro-framework. Supports Python 2.x and 3.x. Documentation: http://demiurge.readthedocs.org Installing demiurge $ pip

109 Jul 20, 2022

Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2

Iceberg Locations Antarctic large iceberg positions derived from ASCAT and OSCAT-2. All data collected here are from the NASA SCP website Overview Thi

5 Jul 27, 2022

Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.

543 Jan 03, 2023

A happy and lightweight Python package that searches the Google News RSS feed, returns a usable JSON response, and scrapes complete articles. No need to write scrapers for article fetching anymore.

GNews 🚩 A Happy and lightweight Python Package that searches Google News RSS Feed and returns a usable JSON response 🚩 As well as you can fetch full

273 Dec 31, 2022

A helper library to scrape data from TikTok in one line, using the Influencer Hunters APIs.

TikTok Scraper A utility library to scrape data from TikTok hassle-free Go to the website » View Demo · Report Bug · Request Feature About The Projec

6 Jan 08, 2023

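An m3u8 video stream download script

Introduction to a Python m3u8 streaming video download script. m3u8 streaming video is increasingly common, and there are already many good downloaders. I am sharing a small script I wrote earlier for everyone to use. The purpose of this program is to give video download enthusiasts a download example that can be called directly, so there is no need to reinvent the wheel. Usage: run the program directly in Python or call it externally: import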

0 Oct 10, 2021

This is a sports analytics project that combines knowledge of OOP and web scraping

This is a sports analytics project that combines knowledge of Object Oriented Programming (OOP) and web scraping. The English Premier League table is scraped weekly to assess th

1 Nov 26, 2021

Minimal set of tools to conduct stealthy scraping.

Stealthy Scraping Tools Do not use puppeteer and playwright for scraping. Explanation. We only use the CDP to obtain the page source and to get the ab

88 Jan 04, 2023

This is a webscraper for a specific website

This is a webscraper for a specific website. It is tuned to extract the headlines of that website. With some little adjustments the webscraper is able to extract any part of the website.

1 Dec 13, 2021

Github scraper app is used to scrape data for a specific user profile created using streamlit and BeautifulSoup python packages

Github Scraper Github scraper app is used to scrape data for a specific user profile. Github scraper app gets a github profile name and check whether

6 Apr 05, 2022

Brute-forcing sites protected by CAPTCHAs, for safe and legal testing

Find thumbnails and original images from URL or HTML file.

Python script for snapping up JD.com flash-sale items

WebScraping - Scrapes Job website for python developer jobs and exports the data to a csv file

Scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info

Amazon web scraping using Scrapy Framework