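A collection of web-crawler case studies. Covers, among others: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, Pinduoduo (pdd), Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, VIP (Weipu) and Wanfang, Z-Library, OALib, novel sites, bidding sites, procurement sites, and Xiaohongshu.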

Overview

lxSpider


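Introduction

  • Time flies, and I have lost count of how many cases I have written. The author's articles are published on CSDN, and the code is pushed to GitHub afterwards. Some of the CSDN articles are paid cases; subscribe as you see fit.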

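Disclaimer

  • This repository is intended for teaching. Nothing it provides may be used for commercial purposes or in any illegal or non-compliant scenario.

  • The author accepts no liability for any loss or harm of any kind, to users or to others, arising for any reason from use of the code and strategies provided in this repository.

  • Any dispute arising from or related to this repository should be settled by the parties through friendly negotiation; the author bears no responsibility for any consequences if negotiation fails.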


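Columns

Web crawler basics: for readers who know basic Python syntax and are getting ready to learn crawling

Web reverse-engineering basics: crawler experience is enough (includes walkthroughs of the Yuanrenxue crawler challenges)

Android reverse-engineering basics: tool introductions, reverse-engineering notes, and case studies

Crawler case collection: paid column, classic cases, continuously updated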


Table of contents

Blog

Recommended

Contact

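Releases (Kuaishou live-comment collection tool)
  • Kuaishou live-comment (danmaku) collection tool (Jan 30, 2021)

    Usage:

    • 1. Launch run.exe in the dist directory.
    • 2. Enter the streamer's uid, your cookie, and the room id.
    • 3. Click Start and wait. Do not click it again.
    • 4. Make sure the streamer is still live.

    How to get the parameters:

    Streamer uid: the last path segment of the URL shown in the browser.

    For example, if the URL is https://live.kuaishou.com/u/yingjia2019

    the streamer's uid is yingjia2019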
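For readers who would rather not pick the uid out by hand, the rule above (take the last path segment of the live-room URL) can be sketched in Python. The function name `extract_streamer_uid` is made up for this illustration and is not part of the released tool.

```python
from urllib.parse import urlparse


def extract_streamer_uid(live_url: str) -> str:
    """Return the last non-empty path segment of a live-room URL.

    Hypothetical helper for illustration; not taken from the released tool.
    """
    path = urlparse(live_url).path
    # Drop empty pieces so a trailing slash does not yield an empty uid.
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segment found in {live_url!r}")
    return segments[-1]


print(extract_streamer_uid("https://live.kuaishou.com/u/yingjia2019"))
# prints: yingjia2019
```

Parsing with `urlparse` rather than splitting the raw string keeps any query string or fragment out of the result.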

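    Your cookie:

    • 1. Open the developer tools: right-click the page and choose Inspect, or press F12.
    • 2. Switch to the Network tab.
    • 3. Refresh the page (F5 works).
    • 4. Find the HTML document whose name matches the streamer's uid, then click Headers on the right.
    • 5. Scroll to the bottom and find the cookie line. Copy the did=web_xxxxxxxxxxxxxx; entry from it.
    • 6. The cookie to enter in the tool is web_xxxxxxxxxxxxxx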
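If the whole cookie line has been copied from the Headers panel, the `did` value described in steps 5 and 6 can be pulled out with a few lines of Python. This is a minimal sketch: `extract_did` is a hypothetical helper, and the `foo` and `bar` cookies in the sample string are placeholders, not real Kuaishou cookie names.

```python
def extract_did(cookie_header: str) -> str:
    """Return the value of the `did` cookie from a raw Cookie header string.

    Hypothetical helper for illustration; not taken from the released tool.
    """
    for pair in cookie_header.split(";"):
        name, separator, value = pair.strip().partition("=")
        if separator and name == "did":
            return value
    raise KeyError("did")


# `foo` and `bar` are placeholder cookies; only `did` comes from the steps above.
sample = "foo=1; did=web_xxxxxxxxxxxxxx; bar=2"
print(extract_did(sample))
# prints: web_xxxxxxxxxxxxxx
```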

    Room id:

    • 1. Switch to the Elements tab, press Ctrl+F to open the search box, and search for: live-stream-id
    • 2. Copy live-stream-id="Zo9Upaz8w90"
    • 3. The room id to enter is Zo9Upaz8w90
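The same lookup can be sketched in Python against a saved copy of the page source. The attribute name `live-stream-id` comes from the steps above; the surrounding `<div>` in the sample and the helper name `extract_room_id` are assumptions made for this illustration.

```python
import re
from typing import Optional

# Matches live-stream-id="..." and captures the quoted value.
LIVE_STREAM_ID = re.compile(r'live-stream-id="([^"]+)"')


def extract_room_id(html: str) -> Optional[str]:
    """Return the first live-stream-id attribute value, or None if absent.

    Hypothetical helper for illustration; not taken from the released tool.
    """
    match = LIVE_STREAM_ID.search(html)
    return match.group(1) if match else None


# The enclosing tag is invented; only the attribute appears in the steps above.
sample_html = '<div live-stream-id="Zo9Upaz8w90"></div>'
print(extract_room_id(sample_html))
# prints: Zo9Upaz8w90
```

A regular expression is enough here because only one attribute value is needed; a full HTML parser would be the safer choice if the markup had to be navigated.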

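    Keep the page open while the tool runs; some time after the page is closed, the cookie expires.

    This tool is for learning purposes only. Do not abuse it.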

    Source code(tar.gz)
    Source code(zip)
    default.rar(21.47 MB)
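  • Novel downloader (Feb 2, 2021)

    Introduction

    1. Novel download (advantage: fast, because complete txt files are collected directly from the web)
    2. Online novel crawling (advantage: broad coverage; almost any published novel can be found)

    Special notice:

    • This script is for testing, learning, and research only. Commercial use is prohibited. Its legality, accuracy, completeness, and validity are not guaranteed; judge for yourself according to your situation.

    • No WeChat official account or self-media outlet may repost or publish any resource file in this project in any form.

    • No responsibility is accepted for any script problem in this project, including but not limited to any loss or damage caused by any script error.

    • Do not use any content of this project for commercial or illegal purposes; otherwise you bear the consequences.

    • This project is released under the GPL-3.0 License. Where this special notice conflicts with the GPL-3.0 License, this special notice prevails.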

    Source code(tar.gz)
    Source code(zip)
    default.zip(44.16 MB)
Owner
lx
Every noble work is at first impossible.