Implementation of the bachelor's thesis "Real-time stock predictions with deep learning and news scraping".

Overview

Real-time stock predictions with deep learning and news scraping

This repository contains a partial implementation of my bachelor's thesis "Real-time stock predictions with deep learning and news scraping". The code has been built using PyTorch Lightning, read its documentation to get a complete overview of how this repository is structured.

Disclaimer: Neither the pipeline nor the model published in this repository are the ones used in the thesis. On the pipeline side, notice that the model tries to match headlines and prices of the same day, while in the thesis we used news published the day before. For the case of the model, the one shared here has nothing to do with the original and should be considered a toy model.

Preparing the data

The data used in the thesis has been completely crawled and put together from scratch. Specifically, you can find the titles and descriptions of the news published on Reuters.com from January 2010 to May 2018. In addition to that, you also have the stock prices (end of the day) of S&P 500 companies extracted from AlphaVantage.co. Everything is compressed in a H5DF file that you can download from this link.

The first step is to clone this repository and install its dependencies:

git clone https://github.com/davidalvarezdlt/bachelor_thesis.git
cd bachelor_thesis
pip install -r requirements.txt

Move both bachelor_thesis_data.hdf5 and word2vec.bin inside ./data. The resulting folder structure should look like this:

bachelor_thesis/
    bachelor_thesis/
    data/
        bachelor_thesis_data.hdf5
        word2vec.bin
    lightning_logs/
    .gitignore
    .pre-commit-config.yaml
    LICENSE
    README.md
    requirements.txt

Training the model

In short, you can train the model by calling:

python -m bachelor_thesis

You can modify the default parameters of the code by using CLI parameters. Get a complete list of the available parameters by calling:

python -m bachelor_thesis --help

For instance, if we want to train the model using GOOGL stock prices, with a batch size of 32 and using one GPUs, we would call:

python -m bachelor_thesis --symbol GOOGL --batch_size 32 --gpus 1

Every time you train the model, a new folder inside ./lightning_logs will be created. Each folder represents a different version of the model, containing its checkpoints and auxiliary files.

Testing the model

You can measure the loss and the accuracy of the model (number of times the prediction is correct) and store it in TensorBoard by calling:

python -m bachelor_thesis --test --test_checkpoint <test_checkpoint>

Where --test_checkpoint is a valid path to the model checkpoint that should be used.

Citation

If you use the data provided in this repository or if you find this thesis useful, please use the following citation:

@thesis{Alvarez2018,
    type = {Bachelor's Thesis},
    author = {David Álvarez de la Torre},
    title = {Real-time stock predictions with Deep Learning and news scrapping},
    school = {Universitat Politècnica de Catalunya},
    year = 2018,
}
Owner
David Álvarez de la Torre
Founder of @lemonplot. Alumni of UPC and ETH.
David Álvarez de la Torre
UniMoCo: Unsupervised, Semi-Supervised and Full-Supervised Visual Representation Learning

UniMoCo: Unsupervised, Semi-Supervised and Full-Supervised Visual Representation Learning This is the official PyTorch implementation for UniMoCo pape

dddzg 49 Jan 02, 2023
DeepSpamReview: Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures. Summer Internship project at CoreView Systems.

Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures Dataset: https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polar

Ashish Salunkhe 37 Dec 17, 2022
This is the official repository of the paper Stocastic bandits with groups of similar arms (NeurIPS 2021). It contains the code that was used to compute the figures and experiments of the paper.

Experiments How to reproduce experimental results of Stochastic bandits with groups of similar arms submitted paper ? Section 5 of the paper To reprod

Fabien 0 Oct 25, 2021
Doosan robotic arm, simulation, control, visualization in Gazebo and ROS2 for Reinforcement Learning.

Robotic Arm Simulation in ROS2 and Gazebo General Overview This repository includes: First, how to simulate a 6DoF Robotic Arm from scratch using GAZE

David Valencia 12 Jan 02, 2023
[ICCV21] Self-Calibrating Neural Radiance Fields

Self-Calibrating Neural Radiance Fields, ICCV, 2021 Project Page | Paper | Video Author Information Yoonwoo Jeong [Google Scholar] Seokjun Ahn [Google

381 Dec 30, 2022
Quantized tflite models for ailia TFLite Runtime

ailia-models-tflite Quantized tflite models for ailia TFLite Runtime About ailia TFLite Runtime ailia TF Lite Runtime is a TensorFlow Lite compatible

ax Inc. 13 Dec 23, 2022
PyTorch implementation of an end-to-end Handwritten Text Recognition (HTR) system based on attention encoder-decoder networks

AttentionHTR PyTorch implementation of an end-to-end Handwritten Text Recognition (HTR) system based on attention encoder-decoder networks. Scene Text

Dmitrijs Kass 31 Dec 22, 2022
Collection of sports betting AI tools.

sports-betting sports-betting is a collection of tools that makes it easy to create machine learning models for sports betting and evaluate their perf

George Douzas 109 Dec 31, 2022
End-to-end speech secognition toolkit

End-to-end speech secognition toolkit This is an E2E ASR toolkit modified from Espnet1 (version 0.9.9). This is the official implementation of paper:

Jinchuan Tian 147 Dec 28, 2022
Deep learning model for EEG artifact removal

DeepSeparator Introduction Electroencephalogram (EEG) recordings are often contaminated with artifacts. Various methods have been developed to elimina

23 Dec 21, 2022
Some code of the implements of Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network

3D-GMPDCNN Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network PyTorch implementation of "Geological Modeling Usin

5 Nov 21, 2022
Code accompanying the NeurIPS 2021 paper "Generating High-Quality Explanations for Navigation in Partially-Revealed Environments"

Generating High-Quality Explanations for Navigation in Partially-Revealed Environments This work presents an approach to explainable navigation under

RAIL Group @ George Mason University 1 Oct 28, 2022
Gif-caption - A straightforward GIF Captioner written in Python

Broksy's GIF Captioner Have you ever wanted to easily caption a GIF without havi

3 Apr 09, 2022
Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

DroneCrowd Paper Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark. Introduction This paper proposes a space-time multi-scale atte

VisDrone 98 Nov 16, 2022
A developer interface for creating Chat AIs for the Chai app.

ChaiPy A developer interface for creating Chat AIs for the Chai app. Usage Local development A quick start guide is available here, with a minimal exa

Chai 28 Dec 28, 2022
Official DGL implementation of "Rethinking High-order Graph Convolutional Networks"

SE Aggregation This is the implementation for Rethinking High-order Graph Convolutional Networks. Here we show the codes for citation networks as an e

Tianqi Zhang (张天启) 32 Jul 19, 2022
Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

IterMVS official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo' Introduction IterMVS is a novel lear

Fangjinhua Wang 127 Jan 04, 2023
Automatic Attendance marker for LMS Practice School Division, BITS Pilani

LMS Attendance Marker Automatic script for lazy people to mark attendance on LMS for Practice School 1. Setup Add your LMS credentials and time slot t

Nihar Bansal 3 Jun 12, 2021
Automated image registration. Registrationimation was too much of a mouthful.

alignimation Automated image registration. Registrationimation was too much of a mouthful. This repo contains the code used for my blog post Alignimat

Ethan Rosenthal 9 Oct 13, 2022
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning using 🤗 transformers

hierarchical-transformer-1d Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning using 🤗 transformers In Progress!! 2021.

MyungHoon Jin 7 Nov 06, 2022