Scrapping Connections' info on Linkedin

Overview

Scrap It!

! Disclaimer:

  1. THIS CODE HAS BEEN IMPLEMENTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE INTERVIEW PROCESS OF MCI.IR AND INTERVIEWEES WERE SUPPOSED TO PUSH THE CODE ON THEIR GITHUB. CONTACT ME TO REMOVE THIS REPOSITORY, IN CASE IT IS AGAINST YOUR TOS.
  2. IF ANY CONNECTION IS NOT OK TO THEIR CONTACT INFO BE HERE, CONTACT ME TO REMOVE THEM ASAP.

Functionalities:

This script automatically:

  • opens your Linkedin profile
  • accesses your connections page
  • crawls the page for grabbing their profile links
  • scraps each person's information and dumps it to Sqlite db
  • and simultaneously logs all necessary level of info into Linkedin.log

DataFlowDiagram

db

Enlisted desing patterns are (but not limited to):

  • Creator
  • Low Coupling
  • High Cohesion
  • Indirection
  • Modularization
  • Information Expert

Log/DB files:

db log log2

Further develepments notes:

  • Check out other DBs that supports multithreading which anable us dumpping all information rows at once
  • change IP per request (You can find its code on my "Social Media Computing course" repository)
  • Sometimes you need to scroll down manually when "connection" page is being loaded. You can add one line code to scroll down for you.

References:

https://www.linkedin.com/pulse/how-easy-scraping-data-from-linkedin-profiles-david-craven

https://www.geeksforgeeks.org/scrape-linkedin-using-selenium-and-beautiful-soup-in-python/

https://stackoverflow.com/questions/28883769/remove-odd-indexed-elements-from-list-in-python#:~:text=Fun%20fact%3A%20to%20remove%20all,remove(x)%20.

https://stackoverflow.com/questions/34759787/fetch-all-href-link-using-selenium-in-python

https://www.tutorialspoint.com/fetch-all-href-link-using-selenium-in-python

https://stackoverflow.com/questions/64717302/deprecationwarning-executable-path-has-been-deprecated-selenium-python

https://chromedriver.chromium.org/home

https://www.youtube.com/watch?v=-ARI4Cz-awo

Owner
MohammadReza Ardestani
CS @ AUT
MohammadReza Ardestani
Get-web-images - A python code that get images from any site

image retrieval This is a python code to retrieve an image from the internet, a

CODE 1 Dec 30, 2021
CreamySoup - a helper script for automated SourceMod plugin updates management.

CreamySoup/"Creamy SourceMod Updater" (or just soup for short), a helper script for automated SourceMod plugin updates management.

3 Jan 03, 2022
A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

Alex Papadopoulos 1 Nov 13, 2021
This is a webscraper for a specific website

This is a webscraper for a specific website. It is tuned to extract the headlines of that website. With some little adjustments the webscraper is able to extract any part of the website.

Rahul Siyanwal 1 Dec 13, 2021
Docker containerized Python Flask API that uses selenium to scrape and interact with websites

Docker containerized Python Flask API that uses selenium to scrape and interact with websites

Christian Gracia 0 Jan 22, 2022
Web scrapping

Project Setup Table of Contents Project Setup Table of Contents Run project locally Install Requirements Run script Run project locally Install Requir

Charles 3 Feb 04, 2022
A Spider for BiliBili comments with a simple API server.

BiliComment A spider for BiliBili comment. Spider Usage Put config.json into config directory, and then python . ./config/config.json. A example confi

Hao 3 Jul 05, 2021
🐞 Douban Movie / Douban Book Scarpy

Python3-based Douban Movie/Douban Book Scarpy crawler for cover downloading + data crawling + review entry.

Xingbo Jia 1 Dec 03, 2022
Nekopoi scraper using python3

Features Scrap from url Todo [+] Search by genre [+] Search by query [+] Scrap from homepage Example # Hentai Scraper from nekopoi import Hent

MhankBarBar 9 Apr 06, 2022
联通手机营业厅自动做任务、签到、领流量、领积分等。

联通手机营业厅自动完成每日任务,领流量、签到获取积分等,月底流量不发愁。 功能 沃之树领流量、浇水(12M日流量) 每日签到(1积分+翻倍4积分+第七天1G流量日包) 天天抽奖,每天三次免费机会(随机奖励) 游戏中心每日打卡(连续打卡,积分递增至最高

2k May 06, 2021
SearchifyX, predecessor to Searchify, is a fast Quizlet, Quizizz, and Brainly webscraper with various stealth features.

SearchifyX SearchifyX, predecessor to Searchify, is a fast Quizlet, Quizizz, and Brainly webscraper with various stealth features. SearchifyX lets you

28 Dec 20, 2022
A tool to easily scrape youtube data using the Google API

YouTube data scraper To easily scrape any data from the youtube homepage, a youtube channel/user, search results, playlists, and a single video itself

7 Dec 03, 2022
河南工业大学 完美校园 自动校外打卡

HAUT-checkin 河南工业大学自动校外打卡 由于github actions存在明显延迟,建议直接使用腾讯云函数 特点 多人打卡 使用简单,仅需账号密码以及用于微信推送的uid 自动获取上一次打卡信息用于打卡 向所有成员微信单独推送打卡状态 完美校园服务器繁忙时造成打卡失败会自动重新打卡

36 Oct 27, 2022
This is a script that scrapes the longitude and latitude on food.grab.com

grab This is a script that scrapes the longitude and latitude for any restaurant in Manila on food.grab.com, location can be adjusted. Search Result p

0 Nov 22, 2021
Scrape and display grades onto the console

WebScrapeGrades About The Project This Project is a personal project where I learned how to webscrape using python requests. Being able to get request

Cyrus Baybay 1 Oct 23, 2021
for those who dont want to pay $10/month for high school game footage with ads

nfhs-scraper Disclaimer: I am in no way responsible for what you choose to do with this script and guide. I do not endorse avoiding paywalls or any il

Conrad Crawford 5 Apr 12, 2022
Using Selenium with Python to Web Scrap Popular Youtube Tech Channels.

Web Scrapping Popular Youtube Tech Channels with Selenium Data Mining, Data Wrangling, and Exploratory Data Analysis About the Data Web scrapi

David Rusho 0 Aug 18, 2021
a way to scrape a database of all of the isef projects

ISEF Database This is a simple web scraper which gets all of the projects and abstract information from here. My goal for this is for someone to get i

William Kaiser 1 Mar 18, 2022
Scraping news from Ucsal portal with Scrapy.

NewsScraping Esse é um projeto de raspagem das últimas noticias, de 2021, do portal da universidade Ucsal http://noosfero.ucsal.br/institucional Tecno

Crissiano Pires 0 Sep 30, 2021
薅薅乐 - JD 测试脚本

薅薅乐 安裝 使用docker docker一键安装: docker run -d --name jd classmatelin/hhl:latest. 使用 进入容器: docker exec -it jd bash 获取JD_COOKIES: python get_jd_cookies.py,

ClassmateLin 575 Dec 28, 2022