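A collection of web-scraping case studies, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, Weipu (VIP) and Wanfang, Zlibrary, Oalib, novels, bidding sites, procurement sites, Xiaohongshu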

Overview

lxSpider

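Introduction

  • CSDN
  • Time flies, and I have lost count of how many case studies I have written. Articles are published on CSDN first, and the code is pushed to GitHub afterwards. Some of the CSDN articles are paid case studies; subscribe as you see fit.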

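Disclaimer

  • This repository is intended for teaching. Nothing it provides may be used for commercial purposes or in any illegal or non-compliant scenario.

  • The author accepts no liability for any loss or harm, of any kind and for any reason, caused to users or to others by using the code and strategies provided in this repository.

  • Any dispute arising from or related to this repository should be resolved by the parties through friendly negotiation; the author bears no responsibility for any consequences if negotiation fails.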


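Columns

Web scraping basics: for readers who know Python syntax and are about to learn scraping

Web reverse engineering basics: scraping experience is enough (includes solutions to the Yuanrenxue scraping challenges)

Android reverse engineering basics: tool introductions, reversing notes, case sharing

Scraping case collection: paid column, classic cases, continuously updated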


Contents

Blog

Recommended

Community

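Releases (Kuaishou live-comment collection tool)
  • Kuaishou live-comment (danmaku) collection tool (Jan 30, 2021)

    Usage:

    • 1. Launch run.exe in the dist directory.
    • 2. Enter the streamer's uid, your cookie, and the room id.
    • 3. Click Start and wait; do not click it repeatedly.
    • 4. Confirm that the streamer is currently live.

    Getting the parameters:

    Streamer uid: the last path segment of the URL shown in the browser.

    For example, if the URL is: https://live.kuaishou.com/u/yingjia2019

    the streamer's uid is: yingjia2019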
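
The uid rule above (take the last segment of the live-room URL) can be sketched in Python as follows. This is only an illustration, not code from the tool itself; the function name `extract_uid` is hypothetical, and it assumes the uid is always the final non-empty path segment.

```python
from urllib.parse import urlparse


def extract_uid(live_url: str) -> str:
    """Return the last non-empty path segment of a live-room URL.

    Hypothetical helper: assumes the streamer uid is the final path segment,
    as in https://live.kuaishou.com/u/yingjia2019.
    """
    path = urlparse(live_url).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segment found in URL: {live_url!r}")
    return segments[-1]


print(extract_uid("https://live.kuaishou.com/u/yingjia2019"))  # yingjia2019
```

Parsing with `urlparse` rather than splitting the raw string means a trailing slash or a query string does not end up in the uid.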

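    Your cookie:

    • 1. Open the developer tools: right-click and choose Inspect, or press F12.
    • 2. Click the Network tab.
    • 3. Refresh the page (you can press F5).
    • 4. Find the HTML document named after the streamer uid, then click Headers on the right.
    • 5. Scroll to the bottom, find the cookie line, and copy the did=web_xxxxxxxxxxxxxx; part.
    • 6. The cookie to enter in the tool is web_xxxxxxxxxxxxxx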
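
If you copy the whole cookie header instead of picking out the `did` entry by hand, a small helper can pull the value out. This is a minimal sketch under the assumption that the header is a standard `name=value; name=value` string; the function name `extract_did` and the other cookie names in the example are made up for illustration.

```python
def extract_did(cookie_header: str) -> str:
    """Return the value of the `did` cookie from a raw Cookie header string."""
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name == "did":
            return value
    raise KeyError("no 'did' entry found in the cookie header")


# Hypothetical header: only the did=web_... entry mirrors the steps above.
header = "example_a=1; did=web_xxxxxxxxxxxxxx; example_b=2"
print(extract_did(header))  # web_xxxxxxxxxxxxxx
```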

    Room id:

    • 1. Click the Elements tab, press Ctrl+F to open the search box, and enter: live-stream-id
    • 2. Copy live-stream-id="Zo9Upaz8w90"
    • 3. The room id to enter is Zo9Upaz8w90
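
The same lookup can be done on saved page source with a regular expression. The sketch below assumes the attribute appears in the HTML exactly as `live-stream-id="..."`; the function name `extract_room_id` and the surrounding `<div>` in the example are hypothetical, not taken from the actual page.

```python
import re
from typing import Optional

# Matches live-stream-id="..." and captures the quoted value.
LIVE_STREAM_ID = re.compile(r'live-stream-id="([^"]+)"')


def extract_room_id(page_html: str) -> Optional[str]:
    """Return the first live-stream-id value in the HTML, or None if absent."""
    match = LIVE_STREAM_ID.search(page_html)
    return match.group(1) if match else None


# Hypothetical snippet; only the attribute itself comes from the steps above.
sample = '<div class="player" live-stream-id="Zo9Upaz8w90"></div>'
print(extract_room_id(sample))  # Zo9Upaz8w90
```

Returning `None` when the attribute is missing lets the caller treat that case as "stream not live or page not fully loaded" instead of crashing.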

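    Keep the page open while the tool is running; some time after the page is closed, the cookie becomes invalid.

    This tool is for learning purposes; do not abuse it.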

    Source code(tar.gz)
    Source code(zip)
    default.rar(21.47 MB)
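  • Novel downloader (Feb 2, 2021)

    Introduction

    1. Novel download (advantage: fast, because complete txt files are collected directly from the web)
    2. Online novel scraping (advantage: comprehensive; almost any published novel can be found)

    Special notice:

    • This script is for testing, learning, and research only; commercial use is prohibited. Its legality, accuracy, completeness, and validity cannot be guaranteed, so judge for yourself as appropriate.

    • No WeChat official account or self-media outlet may repost or publish any resource file in this project in any form.

    • No responsibility is taken for any script problem in this project, including but not limited to any loss or damage caused by script errors.

    • Do not use any content of this project for commercial or illegal purposes; otherwise you bear the consequences yourself.

    • This project follows the GPL-3.0 License. Where this special notice conflicts with the GPL-3.0 License, this special notice prevails.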

    Source code(tar.gz)
    Source code(zip)
    default.zip(44.16 MB)
Owner
lx
Every noble work is at first impossible.
lx