This is a project for analysis and estimation of House Prices in King County USA The .csv file contains the data of the house and the .ipynb file contians the analysis and code This project is done on Jupyter notebook The project uses Linear Regression and Pipeline() to fit and predict the prices.
This is an analysis and prediction project for house prices in King County, USA based on certain features of the house
Overview
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.
PizzaOrders_DataPipeline There is a Tony who is owning a New Pizza shop. He knew that pizza alone was not going to help him get seed funding to expand
This module is used to create Convolutional AutoEncoders for Variational Data Assimilation
VarDACAE This module is used to create Convolutional AutoEncoders for Variational Data Assimilation. A user can define, create and train an AE for Dat
Shot notebooks resuming the main functions of GeoPandas
Shot notebooks resuming the main functions of GeoPandas, 2 notebooks written as Exercises to apply these functions.
A forecasting system dedicated to smart city data
smart-city-predictions System prognostyczny dedykowany dla danych inteligentnych miast Praca inżynierska realizowana przez Michała Stawikowskiego and
PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams
PLStream: A Framework for Fast Polarity Labelling of Massive Data Streams Motivation When dataset freshness is critical, the annotating of high speed
This program analyzes a DNA sequence and outputs snippets of DNA that are likely to be protein-coding genes.
This program analyzes a DNA sequence and outputs snippets of DNA that are likely to be protein-coding genes.
CS50 pset9: Using flask API to create a web application to exchange stocks' shares.
C$50 Finance In this guide we want to implement a website via which users can “register”, “login” “buy” and “sell” stocks, like below: Background If y
This creates a ohlc timeseries from downloaded CSV files from NSE India website and makes a SQLite database for your research.
NSE-timeseries-form-CSV-file-creator-and-SQL-appender- This creates a ohlc timeseries from downloaded CSV files from National Stock Exchange India (NS
Data Science Environment Setup in single line
datascienv is package that helps your to setup your environment in single line of code with all dependency and it is also include pyforest that provide single line of import all required ml libraries
ASOUL直播间弹幕抓取&&数据分析
ASOUL直播间弹幕抓取&&数据分析(更新中) 这些文件用于爬取ASOUL直播间的弹幕(其他直播间也可以)和其他信息,以及简单的数据分析生成。
Probabilistic reasoning and statistical analysis in TensorFlow
TensorFlow Probability TensorFlow Probability is a library for probabilistic reasoning and statistical analysis in TensorFlow. As part of the TensorFl
The Spark Challenge Student Check-In/Out Tracking Script
The Spark Challenge Student Check-In/Out Tracking Script This Python Script uses the Student ID Database to match the entries with the ID Card Swipe a
Data imputations library to preprocess datasets with missing data
Impyute is a library of missing data imputation algorithms. This library was designed to be super lightweight, here's a sneak peak at what impyute can do.
Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.
Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.
Detecting Underwater Objects (DUO)
Underwater object detection for robot picking has attracted a lot of interest. However, it is still an unsolved problem due to several challenges. We take steps towards making it more realistic by ad
Creating a statistical model to predict 10 year treasury yields
Predicting 10-Year Treasury Yields Intitially, I wanted to see if the volatility in the stock market, represented by the VIX index (data source), had
Vectorizers for a range of different data types
Vectorizers for a range of different data types
Elementary is an open-source data reliability framework for modern data teams. The first module of the framework is data lineage.
Data lineage made simple, reliable, and automated. Effortlessly track the flow of data, understand dependencies and analyze impact. Features Visualiza
Desafio 1 ~ Bantotal
Challenge 01 | Bantotal Please read the instructions for the challenge by selecting your preferred language below: Español Português License Copyright
Stochastic Gradient Trees implementation in Python
Stochastic Gradient Trees - Python Stochastic Gradient Trees1 by Henry Gouk, Bernhard Pfahringer, and Eibe Frank implementation in Python. Based on th