Predicting the price of books from a provided dataset of book features, using various proposed methods and models.

Last update: Jan 13, 2022

Overview

Predict-The-Price-Of-Books

For this task, a large dataset consisting of books of different genres and authors was used. The dataset includes various book features, such as Author, Edition, and Reviews. These features were used as regressors to predict the price of books with the various proposed methods and models.

Author: Nikolas Petrou, MSc in Data Science

Technical-Report and Code Availability

A complete file-folder guide is located in the folder-file guide folder
The technical report and analysis of the work are available in the report.pdf file
The implementation and code of the project are located in the code files folder

Dataset Overview

The data for this work comes from an online competition for this task, which has been running since 27/09/2019 and currently has 3579 participants in total. The data was downloaded directly from MachineHack as two files, one for the training set and one for the test set, containing 6237 and 1560 records respectively. The values of the target variable (Price) are not included in the test set, because predictions on the test set are evaluated through the MachineHack website.

Methodology

Some of the key methods which were used throughout the work are:

Visualization
TF-IDF and LDA Topic Extraction
Text translation using the Google Translate Ajax API
Cyclical feature encoding for time-based feature extraction
Price Prediction using different conventional and advanced algorithms (e.g. GBM, RF, SVM, CatBoost, LightGBM)
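
As an illustration of the cyclical feature encoding listed above, here is a minimal sketch and not the project's actual code. It assumes the time-based feature is a month number from 1 to 12; the function names encode_cyclical and encode_month are hypothetical. Mapping a periodic value onto a sine/cosine pair keeps December and January as close together as any other pair of neighbouring months, which a plain integer encoding does not.

```python
import math


def encode_cyclical(value, period):
    """Map a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def encode_month(month):
    """Encode a month given as 1..12 (assumed input format)."""
    return encode_cyclical(month - 1, 12)


def gap(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Neighbouring months are equally far apart, including across the year boundary.
dec_jan = gap(encode_month(12), encode_month(1))
jan_feb = gap(encode_month(1), encode_month(2))
print(f"Dec-Jan gap: {dec_jan:.4f}, Jan-Feb gap: {jan_feb:.4f}")
```

Both components are needed: the sine alone would assign the same value to two different months.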

An abstract methodology scheme of the work is illustrated in the following Figure.

In summary, the work began with exploratory data understanding. Each feature was assessed in order to understand what it represents and how it could affect book pricing. Next, each feature was brought into a format appropriate for model development. Visualization was then used to examine how the different features correlate with the dependent (target) variable. The processed data were subsequently used to train the employed models, and the prediction-modelling phase was conducted with two different approaches. The whole procedure was cyclical: these steps were repeated until the final prediction model was implemented.
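
The model comparison step of such a workflow could look like the sketch below. It is an assumption-laden illustration, not the setup used in the report: the data are synthetic (generated with scikit-learn's make_regression) rather than the processed book features, only two of the listed model families are shown, the hyperparameters are arbitrary, and the helper name compare_models is hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the processed features and prices (not the real data).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Two of the model families named in the methodology; settings are illustrative.
models = {
    "RF": RandomForestRegressor(n_estimators=50, random_state=0),
    "GBM": GradientBoostingRegressor(n_estimators=50, random_state=0),
}


def compare_models(models, X, y, folds=5):
    """Return the mean cross-validated RMSE per model (lower is better)."""
    results = {}
    for name, model in models.items():
        scores = cross_val_score(
            model, X, y, cv=folds, scoring="neg_root_mean_squared_error"
        )
        # scikit-learn reports negated errors, so flip the sign back.
        results[name] = -scores.mean()
    return results


results = compare_models(models, X, y)
for name, rmse in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: mean RMSE = {rmse:.2f}")
```

Scoring every candidate on the same folds keeps the comparison fair while features are revised in each pass of the cycle.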
