Pyspark sam - Analyze Big Sequence Alignments with PySpark in AWS EMR

Overview

pyspark_sam

This repo hosts my code for the article "Analyze Big Sequence Alignments with PySpark in AWS EMR".

Prerequisites

  1. Spark

  2. AWS CLI

  3. AWS Account

Run

Follow the instructions in the article. Once you have uploaded the files to your S3 bucket, run the following command, replacing every [your_S3_bucket] with the name of your bucket:

aws emr create-cluster --name "Spark_step_pip" \
    --release-label emr-6.5.0 \
    --applications Name=Spark \
    --log-uri s3://[your_S3_bucket]/logs/ \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --bootstrap-actions Path=s3://[your_S3_bucket]/emr_bootstrap.sh \
    --use-default-roles --auto-terminate \
    --steps "Type=Spark,Name=SparkProgram,ActionOnFailure=CONTINUE,Args=[--deploy-mode,cluster,--master,yarn,--py-files,s3://[your_S3_bucket]/helper_function.py,s3://[your_S3_bucket]/spark_3mer.py,s3://[your_S3_bucket]/test.sam,[your_S3_bucket],sankey.json]" 
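
The bucket placeholder appears seven times in the command above, so it is easy to miss one when editing by hand. The following is a minimal sketch, not part of this repo, that assembles the same command from a single bucket name. The function name build_emr_command and its parameters are hypothetical; the flags and values are copied from the command above.

```python
import shlex
from typing import List


def build_emr_command(bucket: str,
                      sam_file: str = "test.sam",
                      output: str = "sankey.json") -> List[str]:
    """Return the `aws emr create-cluster` argument list for one S3 bucket.

    Hypothetical helper: it only mirrors the command shown in this README.
    """
    s3 = f"s3://{bucket}"

    # Arguments passed to spark-submit by the EMR step, in the README's order.
    step_args = [
        "--deploy-mode", "cluster",
        "--master", "yarn",
        "--py-files", f"{s3}/helper_function.py",
        f"{s3}/spark_3mer.py",   # main PySpark script
        f"{s3}/{sam_file}",      # input alignment file
        bucket,                  # bucket name, passed without the s3:// prefix
        output,                  # name of the output file
    ]
    step = (
        "Type=Spark,Name=SparkProgram,ActionOnFailure=CONTINUE,"
        f"Args=[{','.join(step_args)}]"
    )

    return [
        "aws", "emr", "create-cluster",
        "--name", "Spark_step_pip",
        "--release-label", "emr-6.5.0",
        "--applications", "Name=Spark",
        "--log-uri", f"{s3}/logs/",
        "--instance-type", "m5.xlarge",
        "--instance-count", "3",
        "--bootstrap-actions", f"Path={s3}/emr_bootstrap.sh",
        "--use-default-roles", "--auto-terminate",
        "--steps", step,
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line; "my-bucket" is a placeholder name.
    print(shlex.join(build_emr_command("my-bucket")))
```

Printing the command rather than running it lets you review it first. To launch the cluster directly, the returned list could be passed to subprocess.run.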

When the job finishes, download sankey.json and run this command to visualize it:

python sankey.py sankey.json

Authors

  • Sixing Huang - Concept and Coding

License

This project is licensed under the MIT License. See the LICENSE file for details.

Owner
Sixing Huang
A triple Neo4j certified data scientist. I am currently working at BGI in Shenzhen.