Generate a repository with mirror links for DriveDroid app

Last update: Nov 19, 2022

Overview

DriveDroid Repository Generator

Generate a repository for DriveDroid, an app that lets you boot a PC from ISO files stored on your Android phone.

See also the official scraper, written in JavaScript.

Try Already Built Repo

Add one of the following links to the image repositories in the DriveDroid app:

https://dd.hexed.pw

https://raw.githubusercontent.com/flameshikari/ddrg/master/repo/repo.json

Requirements
Usage
How to Make a Scraper
Misc
Roadmap
Credits
License

Requirements

Python 3.6+ with the packages listed in requirements.txt.

I recommend creating a venv and installing the packages there.

Usage

python ./src/main.py [-i dir] [-o dir] [-g]

-i dir sets the directory with distro scrapers (default: ./src/distros).

-o dir sets the directory where the built repo is saved (default: ./build).

-g generates a webpage that presents the contents of repo.json.

-h shows the help message.
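As a rough illustration of how these options fit together, here is a minimal argparse sketch. It only mirrors the flags and defaults documented above; the function name build_parser and the attribute names input_dir, output_dir and generate_page are assumptions, not the project's actual code.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors the documented options.
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-i", dest="input_dir", metavar="dir",
                        default="./src/distros",
                        help="directory with distro scrapers")
    parser.add_argument("-o", dest="output_dir", metavar="dir",
                        default="./build",
                        help="directory where the built repo is saved")
    parser.add_argument("-g", dest="generate_page", action="store_true",
                        help="generate a webpage presenting repo.json")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-o", "/tmp/repo", "-g"])
print(args.input_dir, args.output_dir, args.generate_page)
# prints: ./src/distros /tmp/repo True
```

argparse adds -h automatically, which matches the note that the help option is always available.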

How to Make a Scraper

Create a folder in ./src/distros with the following structure:

distro_name
├── info.toml
├── logo.png
└── scraper.py

If distro_name starts with an underscore (e.g. _disabled), the folder is skipped.
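The underscore rule can be sketched as a small directory filter. This is an illustration under the assumption that every non-underscore subfolder counts as a scraper; find_distro_dirs is a hypothetical name, not a function from the project.

```python
from pathlib import Path


def find_distro_dirs(root):
    """Return scraper folders under root, skipping names that start with '_'."""
    return sorted(
        path for path in Path(root).iterdir()
        if path.is_dir() and not path.name.startswith("_")
    )
```

Calling find_distro_dirs("./src/distros") would then list arch but leave out _disabled.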

Let's take a look at each file.

`info.toml`

info.toml contains the distro name and a link to its official website. Example info.toml for Arch Linux:

name = "Arch Linux" # name of distro
url  = "https://example.com" # official site

If info.toml is missing or a value is not provided, fallback values are used. For Arch Linux (folder name arch), the fallback values are:

name = "arch" # distro folder name as value, also used in url
url  = "https://distrowatch.com/table.php?distribution=arch"
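The fallback rule can be expressed as a small function. The sketch below assumes info.toml has already been parsed into a dict (or is absent, in which case None is passed); resolve_info is a hypothetical name used only for illustration.

```python
def resolve_info(folder_name, info=None):
    """Fill in a missing name or url from the distro folder name."""
    info = info or {}
    return {
        # Fall back to the folder name when no name is given.
        "name": info.get("name") or folder_name,
        # Fall back to the DistroWatch page built from the folder name.
        "url": info.get("url")
        or f"https://distrowatch.com/table.php?distribution={folder_name}",
    }


print(resolve_info("arch"))
print(resolve_info("arch", {"name": "Arch Linux"}))
```

The second call keeps the provided name and falls back only for the missing url.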

`logo.png`

Should be 128x128 px with a transparent background.

If logo.png is missing, a fallback logo is used.

`scraper.py`

A scraper can be written however you like, as long as it returns the expected values.

It must return a list of tuples; each tuple contains iso_url, iso_arch, iso_size and iso_version, in that order.

The Arch Linux scraper returns the following values:

[
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.05.01/archlinux-2021.05.01-x86_64.iso',
    'x86_64',
    792014848,
    '2021.05.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.06.01/archlinux-2021.06.01-x86_64.iso',
    'x86_64',
    811937792,
    '2021.06.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.07.01/archlinux-2021.07.01-x86_64.iso',
    'x86_64',
    817180672,
    '2021.07.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/archboot/2020.07/archlinux-2020.07-1-archboot-network.iso',
    'x86_64',
    516947968,
    '2020.07'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/archboot/2020.07/archlinux-2020.07-1-archboot.iso',
    'x86_64',
    1280491520,
    '2020.07'
  )
]
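When writing a new scraper, it can help to check its output against this shape before building the repo. The sketch below is one possible check; the type expectations (string URL, string architecture, integer size in bytes, string version) are inferred from the example above, and validate_entries is a hypothetical helper, not part of the project.

```python
def validate_entries(entries):
    """Raise ValueError unless every entry is (iso_url, iso_arch, iso_size, iso_version)."""
    for index, entry in enumerate(entries):
        if not isinstance(entry, tuple) or len(entry) != 4:
            raise ValueError(f"entry {index}: expected a 4-tuple")
        iso_url, iso_arch, iso_size, iso_version = entry
        if not (isinstance(iso_url, str)
                and iso_url.startswith(("http://", "https://"))):
            raise ValueError(f"entry {index}: iso_url must be an http(s) URL")
        if not isinstance(iso_arch, str) or not isinstance(iso_version, str):
            raise ValueError(f"entry {index}: iso_arch and iso_version must be strings")
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(iso_size, bool) or not isinstance(iso_size, int) or iso_size < 0:
            raise ValueError(f"entry {index}: iso_size must be a byte count")
    return entries
```

A scraper could be tested with validate_entries(init()) before it is added to ./src/distros.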

A scraper starts with from public import *, which brings the following into the namespace:

bs (short for BeautifulSoup)
json
re
requests

It also provides these functions:

get_afh_url(iso_url): returns a download link for a file hosted on AndroidFileHost; iso_url must look like https://androidfilehost.com/?fid=8889791610682936459
get_iso_arch(iso_url): returns the processor architecture of the ISO at iso_url
get_iso_size(iso_url): returns the size in bytes of the file at iso_url
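To give a feel for what an architecture helper might do, here is a sketch that guesses the architecture from the file name alone. It is not the project's get_iso_arch: the function name guess_iso_arch, the pattern list and the x86_64 default are all assumptions made for illustration.

```python
import re

# Hypothetical patterns; the real helper may recognise a different set.
ARCH_PATTERNS = [
    ("x86_64", re.compile(r"x86[_-]64|amd64|x64", re.IGNORECASE)),
    ("aarch64", re.compile(r"aarch64|arm64", re.IGNORECASE)),
    ("i386", re.compile(r"i[3-6]86", re.IGNORECASE)),
]


def guess_iso_arch(iso_url, default="x86_64"):
    """Guess the architecture from the file name part of iso_url."""
    filename = iso_url.rsplit("/", 1)[-1]
    for arch, pattern in ARCH_PATTERNS:
        if pattern.search(filename):
            return arch
    # Assumed default for file names that carry no architecture marker.
    return default
```

Only the last path component is inspected, so a mirror path that contains a different architecture name does not affect the result.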

Arch Linux scraper.py example:

from public import *  # noqa


def init():
    """Return a list of (iso_url, iso_arch, iso_size, iso_version) tuples."""

    array = []
    base_urls = [
        "https://mirror.yandex.ru/archlinux/iso/latest",
        "https://mirror.yandex.ru/archlinux/iso/archboot/latest"
    ]

    for base_url in base_urls:

        # Fetch and parse the mirror's directory listing
        html = bs(requests.get(base_url).text, "html.parser")

        # Collect every link whose href ends with .iso
        for filename in html.find_all("a", {"href": re.compile(r"^.*\.iso$")}):

            iso_url = f"{base_url}/{filename['href']}"
            iso_arch = get_iso_arch(iso_url)
            iso_size = get_iso_size(iso_url)
            # Dots are escaped so that "2020.07-1" yields "2020.07"
            iso_version = re.search(r"-(\d+\.\d+(\.\d+)?)", iso_url).group(1)

            array.append((iso_url, iso_arch, iso_size, iso_version))

    return array
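The version pattern can be tried on its own, without network access. The sketch below assumes the dots in a version string should match only literal dots, and it returns None instead of raising when nothing matches; extract_version is a hypothetical name.

```python
import re

# A dash, then two or three dot-separated groups of digits.
VERSION_RE = re.compile(r"-(\d+\.\d+(\.\d+)?)")


def extract_version(iso_url):
    """Return the first version-like token after a dash, or None."""
    match = VERSION_RE.search(iso_url)
    return match.group(1) if match else None


print(extract_version(
    "https://mirror.yandex.ru/archlinux/iso/latest/archlinux-2021.07.01-x86_64.iso"
))
print(extract_version(
    "https://mirror.yandex.ru/archlinux/iso/archboot/latest/archlinux-2020.07-1-archboot.iso"
))
```

With unescaped dots, the second URL would yield 2020.07-1, because a bare dot also matches the dash.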

Misc

Here is an nginx snippet in case you self-host the repository together with the website and want DriveDroid to get repo.json by hostname alone. Place it in the server section of your config:

location = / {
  if ($http_user_agent ~* 'okhttp') {
    rewrite ^/(.*)$ /repo.json break;
  }
}

Roadmap

Option to generate a webpage
Add a mechanism to retry scraping if a network error occurs
Option to select mirrors (mainly uses mirrors based in Russia)
Package this project perhaps
Probably make the code better

Credits

afh-dl by kade-robertson
Yandex.Disk direct links by DokPub

License

MIT License

Owner

Evgeny
