Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool for scraping Amazon product data to CSV or JSON format(s).

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually

You'll also need to install requirements for the project to run. Locate amazon-product-scraper folder via terminal and type pip install -r requirements.txt:

Usage

To launch the Amazon scraper locate the amazon-product-scraper folder via terminal and type python amazon_scraper.py -k "your keyword". This will start the program.

NOTE: you must declare either -k or --keyword before entering your keyword. It's a required argument.

Example:

amazon_scraper.py - the name of a scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For highest success rate, I suggest Residential Proxies over Datacenter as they're almost impossible to detect and have the smallest footprint. If you decide to use different proxy provider services keep in mind that you'll have to make some minor adjustments in get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. Default output format is .csv.

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }]

Example of product data #2: CSV

amazon-csv-product-data-example

YouCompleteMe: a code-completion engine for Vim

YouCompleteMe: a code-completion engine for Vim Help, Advice, Support Looking for help, advice or support? Having problems getting YCM to work? First

24.5k Jan 06, 2023
A Python package for Misty II development

Misty2py Misty2py is a Python 3 package for Misty II development using Misty's REST API. Read the full documentation here! Installation Poetry To inst

Chris Scarred 1 Mar 07, 2022
A multipurpose discord bot with more than 220 commands

Welcome WM Bot A advanced bot with more than 220 commands to fit your needs Explore the commands » View Demo · Report Bug · Request Feature Table of C

Wasi Master 12 Dec 16, 2022
An interactive aquarium for your terminal.

sipedon An interactive aquarium for your terminal, written using pytermgui. The project got its name from the Common Watersnake, also known as Nerodia

17 Nov 07, 2022
Themes for Windows Terminal

Windows Terminal Themes Preview and copy themes for the new Windows Terminal. Use the project at windowsterminalthemes.dev How to use the themes This

Tom 1.1k Jan 03, 2023
Random scripts and other bits for interacting with the SpaceX Starlink user terminal hardware

starlink-grpc-tools This repository has a handful of tools for interacting with the gRPC service implemented on the Starlink user terminal (AKA "the d

270 Dec 29, 2022
Shellcode runner to execute malicious payload and bypass AV

buffshark-shellcode-runner Python Shellcode Runner to execute malicious payload and bypass AV This script utilizes mmap(for linux) and win api wrapper

Momo Lenard 9 Dec 29, 2022
A command line tool to query source code from your current Python env

wxc wxc (pronounced "which") allows you to inspect source code in your Python environment from the command line. It is based on the inspect module fro

Clément Robert 13 Nov 08, 2022
Convert markdown to HTML using the GitHub API and some additional tweaks with Python.

Convert markdown to HTML using the GitHub API and some additional tweaks with Python. Comes with full formula support and image compression.

phseiff 70 Dec 23, 2022
liquidctl – liquid cooler control Cross-platform tool and drivers for liquid coolers and other devices

Cross-platform CLI and Python drivers for AIO liquid coolers and other devices

1.7k Jan 08, 2023
A python package to display progress of loops to the user

ProgressBars A python package to display progress of loops to the user. Installation This package can be installed using pip. pip install progressbars

Matthias 3 Jan 16, 2022
A Julia library for solving Wordle puzzles.

Wordle.jl A Julia library for solving Wordle puzzles. Usage julia import Wordle: play julia play("panic") 4 julia play("panic", verbose = true) I

John Myles White 3 Jan 23, 2022
Display Images in your terminal with python

Term-Img Display Images in your terminal with python NOTE: This project is a work in progress and not everything on here has actually been implemented

My avatar ;D 118 Jan 05, 2023
Personal and work vim 8 configuration with submodules

vimfiles Windows Vim 8 configuration files based on the recommendations of Ruslan Osipov, Keep Your vimrc file clean and The musings of bluz71. :help

1 Aug 27, 2022
ghfetch is ai customizable CLI GitHub personal README generator.

ghfetch is ai customizable CLI GitHub personal README generator. Inspired by famous fetch such as screenfetch, neofetch and ufetch, the purpose of this tool is to introduce yourself as if you were a

Alessio Celentano 3 Sep 10, 2021
Notion-cli-list-manager - A simple command-line tool for managing Notion databases

A simple command-line tool for managing Notion List databases. ✨

Giacomo Salici 75 Dec 04, 2022
Apple Silicon 'top' CLI

asitop pip install asitop What A nvtop/htop style/inspired command line tool for Apple Silicon (aka M1) Macs. Note that it requires sudo to run due to

Timothy Liu 1.2k Dec 31, 2022
Analysis of a daily word game "Wordle"

Wordle Analysis of a daily word game "Wordle" https://www.powerlanguage.co.uk/wordle/ Description Worlde is a daily word game in which a player attemp

Bartek 1 Feb 07, 2022
commandpack - A package of modules for working with commands, command packages, files with command packages.

commandpack Help the project financially: Donate: https://smartlegion.github.io/donate/ Yandex Money: https://yoomoney.ru/to/4100115206129186 PayPal:

4 Sep 04, 2021
Install python modules from pypi from a previous date in history

pip-rewind is a command-line tool that can rewind pypi module versions (given as command-line arguments or read from a requirements.txt file) to a previous date in time.

Amar Paul 4 Jul 03, 2021