Web-Scraping using Selenium

Why is Selenium needed?

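Some websites do not like being scraped and block simple scripted requests. In that case you need to disguise your web-scraping bot as a human visitor. Selenium helps by driving a real browser, which makes the bot's traffic look much more like that of a person browsing the site.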

What is a locator, CSS selector, or XPath?

A locator is an address that uniquely identifies a web element within a web page. Locators are built from the HTML properties of a web element and tell Selenium which element it needs to act on.

There is a diverse range of web elements. The most common are text boxes, buttons, drop-downs, hyperlinks, check boxes, and radio buttons.
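
As a minimal sketch of how a locator is used in practice, the function below opens a page in headless Chrome and reads one element located by its tag name. It assumes Selenium 4 for Python and a local Chrome installation; the function name `scrape_heading` and the URL in the usage note are hypothetical.

```python
def scrape_heading(url):
    """Open url in headless Chrome and return the text of the first h1 element."""
    # Imported inside the function so this module still loads on a machine
    # where Selenium is not installed (assumption: Selenium 4 or newer).
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # (By.TAG_NAME, "h1") is the locator: a strategy plus a value.
        return driver.find_element(By.TAG_NAME, "h1").text
    finally:
        driver.quit()  # always release the browser, even if lookup fails
```

A call such as `scrape_heading("https://example.com")` would return the heading text of that page, or raise `NoSuchElementException` if the page has no `h1` element.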

Types of Locators in Selenium

Photo credit: www.softwaretestinghelp.com
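
Selenium's Python bindings expose eight locator strategies as constants on the `By` class. The sketch below lists them together with the strategy string each constant stands for. The example values (`email`, `btn-primary`, and so on) are hypothetical and would have to match the page being scraped.

```python
# Each entry: (By constant, strategy string it stands for, hypothetical value).
# In Selenium's Python bindings the By constants are plain strings,
# so By.ID == "id", By.CSS_SELECTOR == "css selector", and so on.
LOCATOR_EXAMPLES = [
    ("By.ID", "id", "email"),
    ("By.NAME", "name", "password"),
    ("By.CLASS_NAME", "class name", "btn-primary"),
    ("By.TAG_NAME", "tag name", "h1"),
    ("By.LINK_TEXT", "link text", "Sign in"),
    ("By.PARTIAL_LINK_TEXT", "partial link text", "Sign"),
    ("By.CSS_SELECTOR", "css selector", "input[name='email']"),
    ("By.XPATH", "xpath", "//input[@name='email']"),
]


def describe(examples):
    """Return one line per strategy showing the equivalent find_element call."""
    return [
        f"{constant}: find_element({strategy!r}, {value!r})"
        for constant, strategy, value in examples
    ]


for line in describe(LOCATOR_EXAMPLES):
    print(line)
```

With a live driver, the idiomatic form is `driver.find_element(By.ID, "email")`; the strings are shown here only to make the strategy names visible without a browser.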

XPath

XPath locates a web element by its path through the document tree. It was designed for XML (Extensible Markup Language), which is used to store, organize, and transport arbitrary data. XML stores data in nested, tagged elements with attributes, much like HTML. Because both are markup languages with the same tree structure, XPath can also be used to locate HTML elements.

The fundamental idea behind locating elements with XPath is traversing from one element to another across the page, which lets you find an element by reference to another element.
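
The traversal idea can be tried without a browser. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath, on a small hand-written page. The markup and the element names in it are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed page (hypothetical markup) to run the queries against.
PAGE = """
<html>
  <body>
    <form id="login">
      <div class="field">
        <label for="email">Email</label>
        <input type="text" name="email" />
      </div>
      <div class="field">
        <label for="password">Password</label>
        <input type="password" name="password" />
      </div>
      <button type="submit">Sign in</button>
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# 1. Locate an element directly by one of its attributes.
email_input = root.find(".//input[@name='email']")

# 2. Locate an element by reference to another one:
#    find the label, step up to its parent (..), then down to the input.
password_input = root.find(".//label[@for='password']/../input")

# 3. Locate an element by walking the full path from the root.
first_input = root.find("./body/form/div/input")

print(email_input.get("type"))
print(password_input.get("name"))
print(first_input is email_input)
```

In Selenium the second query would be written as `driver.find_element(By.XPATH, "//label[@for='password']/../input")`, since a browser's XPath engine starts from the document root.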

CSS Selector

A CSS selector is a combination of an element selector and a selector value that identifies a web element within a web page. The combination of element selector and selector value is known as a selector pattern.

Photo credit: www.softwaretestinghelp.com

Primitive types of CSS Selector

Different Types of CSS Selector
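
To illustrate how an element selector and a selector value combine into a selector pattern, the helpers below build a few common CSS selector forms: by ID, by class, by attribute, and by attribute substring. The helper names are hypothetical, and the values are assumed to contain no quote characters.

```python
def id_selector(tag, id_value):
    """tag#id matches the element with that id, e.g. input#email."""
    return f"{tag}#{id_value}"


def class_selector(tag, class_name):
    """tag.class matches elements carrying that class, e.g. button.primary."""
    return f"{tag}.{class_name}"


def attribute_selector(tag, attribute, value):
    """tag[attr='value'] matches on an exact attribute value."""
    return f"{tag}[{attribute}='{value}']"


# CSS operators for matching part of an attribute value.
_SUBSTRING_OPERATORS = {"prefix": "^=", "suffix": "$=", "contains": "*="}


def substring_selector(tag, attribute, value, match="contains"):
    """tag[attr^='v'], tag[attr$='v'] or tag[attr*='v'] for partial matches."""
    operator = _SUBSTRING_OPERATORS[match]
    return f"{tag}[{attribute}{operator}'{value}']"


print(id_selector("input", "email"))
print(class_selector("button", "primary"))
print(attribute_selector("input", "name", "password"))
print(substring_selector("a", "href", "/login", match="suffix"))
```

Any of these patterns can be passed to Selenium as `driver.find_element(By.CSS_SELECTOR, pattern)`.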

Owner

Md Rashidul Islam
