Genshin Impact gacha record dataset (原神抽卡记录数据集)

Last update: Dec 27, 2022

Related tags

Text Data & NLP genshin-impact

Overview

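Summary

Genshin Impact gacha records are still being collected.

You can export your gacha history as JSON with a gacha record export tool and send the JSON file to [email protected]. I will remove personal information and then add the file here. Either of the two export tools below will do.

A gacha record export tool from sunfkny, with a usage demonstration video

Another gacha record export tool, an Electron version, from lvlvl

The dataset currently contains 195,917 gacha records.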

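Data usage notes

As an individual, you may freely use this project's data to study the gacha mechanics, and you may freely modify and publish my analysis code (although you would probably be better off rewriting it from scratch).

However, do not republish the gacha dataset on other platforms or merge it into them. If that happens, anyone who later combines gacha data from multiple sources may run into serious data duplication problems. Please direct people who want the gacha data to download it from GitHub, or state that the data comes from this project.

Before drawing any conclusion from this dataset, ask yourself whether the process was rigorous and whether the conclusion is credible. Do not publish gacha models that are obviously wrong, or models that are incorrect and could cause harm. If harm results, the dataset maintainer and the players who contributed data bear no responsibility.

After a period of study, I have worked out essentially all of the Genshin Impact gacha mechanics:

Summary of all Genshin Impact gacha mechanics

Some tools for analyzing gacha mechanics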

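Data format

Entries in the dataset_02 folder are numbered sequentially starting from 0001

Each folder contains the gacha records of one account

gacha100.csv holds pulls from the Beginners' Wish (初行者推荐祈愿)

gacha200.csv holds pulls from the standard (permanent) wish (常驻祈愿)

gacha301.csv holds pulls from the character event wish (角色活动祈愿)

gacha302.csv holds pulls from the weapon event wish (武器活动祈愿)

Records in the CSV files use the following format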

抽卡时间 (pull time)	名称 (name)	类别 (type)	星级 (rarity)
YYYY-MM-DD HH:MM:SS	full item name	角色/武器 (character/weapon)	3/4/5
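
The layout above can be read with Python's standard csv module. The following is only a minimal sketch, not part of the project's code. It assumes comma-separated UTF-8 files with one header row and columns in the order shown above; the function names, the dictionary keys, and the banner labels are all hypothetical, so adjust them to the actual files.

```python
import csv
from pathlib import Path

# File name -> banner label. The file names come from this README;
# the labels on the right are made up for this sketch.
BANNER_FILES = {
    "gacha100.csv": "beginner",
    "gacha200.csv": "standard",
    "gacha301.csv": "character_event",
    "gacha302.csv": "weapon_event",
}


def read_gacha_csv(path, delimiter=",", has_header=True, encoding="utf-8-sig"):
    """Read one gachaXXX.csv file into a list of dicts.

    delimiter, has_header and encoding are assumptions about the files.
    """
    records = []
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for index, row in enumerate(reader):
            if has_header and index == 0:
                continue  # skip the column-name row
            if len(row) < 4:
                continue  # skip blank or malformed lines
            time, name, kind, rarity = row[:4]
            records.append(
                {"time": time, "name": name, "kind": kind, "rarity": int(rarity)}
            )
    return records


def load_account(folder, **csv_options):
    """Load every banner file that exists in one account folder."""
    folder = Path(folder)
    account = {}
    for file_name, banner in BANNER_FILES.items():
        path = folder / file_name
        if path.exists():
            account[banner] = read_gacha_csv(path, **csv_options)
    return account
```

For example, `load_account("dataset_02/0001")` would return a dictionary with one list of records per banner file found in that folder. If the files turn out to be tab-separated, pass `delimiter="\t"`.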

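Analysis tools

DataAnalysis.py analyzes the CSV gacha files. The code is still being rewritten and is very hard to use, so treat it as a reference only. When run, it prints reference statistics and plots distributions. The theoretical values in the plots come from a probability growth model that I inferred from the actual data and some game files.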
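
One statistic that such an analysis could start from is the number of pulls needed for each 5-star item. The sketch below is not taken from DataAnalysis.py. It is a hypothetical illustration that assumes the star ratings of one banner are already in pull order, and the function names are invented for this example.

```python
from collections import Counter


def pulls_between(rarities, target=5):
    """Return the number of pulls spent on each `target`-star item.

    `rarities` is one banner's star ratings in pull order. Pulls after
    the last `target`-star item form an unfinished run and are dropped.
    """
    intervals = []
    count = 0
    for rarity in rarities:
        count += 1
        if rarity == target:
            intervals.append(count)
            count = 0
    return intervals


def summarize(intervals):
    """Return the mean interval and a histogram of interval lengths."""
    if not intervals:
        return None, Counter()
    return sum(intervals) / len(intervals), Counter(intervals)
```

Note that if an exported history does not begin at the account's first pull on that banner, the first interval undercounts the real number of pulls, so it may be safer to discard it before comparing against a model.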

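DistributionMatrix.py analyzes the pull probabilities and distributions of a designed model in which the 4-star and 5-star mechanics are coupled. It is a powerful tool for computing a gacha model's consolidated probability and expected number of pulls.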
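
To illustrate what "consolidated probability and expectation" mean, here is a much simpler, single-rarity sketch. It is not DistributionMatrix.py and it ignores the 4-star/5-star coupling that the script handles. It assumes a model given as conditional success probabilities per pull, ending in a hard guarantee; the numbers in `toy_model` are made up and are not the game's rates.

```python
def pull_distribution(conditional):
    """Turn per-pull conditional probabilities into a distribution.

    conditional[k - 1] is the chance of success on pull k given that
    pulls 1..k-1 all failed. The last entry must be 1.0 (a hard
    guarantee) so that the distribution sums to 1.
    """
    if not conditional or conditional[-1] != 1.0:
        raise ValueError("last conditional probability must be 1.0")
    distribution = []
    survive = 1.0  # probability that every earlier pull failed
    for p in conditional:
        distribution.append(survive * p)
        survive *= 1.0 - p
    return distribution


def expected_pulls(distribution):
    """Mean number of pulls needed for one success."""
    return sum(k * p for k, p in enumerate(distribution, start=1))


toy_model = [0.5, 0.5, 1.0]  # made-up numbers, not the game's real rates
dist = pull_distribution(toy_model)
mean = expected_pulls(dist)
overall_rate = 1.0 / mean  # long-run share of pulls that succeed
```

Because the counter resets after every success, the long-run share of successful pulls is the reciprocal of the expected number of pulls per success, which is why `overall_rate` is computed as `1.0 / mean`.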
