Exploring dimension-reduced embeddings

Last update: Nov 29, 2022

sleepwalk

This is the code repository. The Sleepwalk web page is at https://anders-biostat.github.io/sleepwalk/.

License and disclaimer

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Comments

Error running sleepwalk: cannot open the connection
Dear sleepwalk developers, thanks a lot for providing such a nice method. I could install the package, but I get the following error when I try to run it:

> sleepwalk([email protected][email protected], [email protected][email protected])
Estimating 'maxdist' for feature matrix 1
Server has been stopped.
Server has been stopped.
Error in app$openPage(useViewer, browser) : Timeout waiting for websocket.
In addition: Warning messages:
1: In file(con, "r") : cannot open file 'sleepwalk_canvas.html': No such file or directory
2: In func(req) : File '/favicon.ico' is not found

I know this is probably not a sleepwalk specific error, but I couldn't find a solution for this. Any hints/help on how to fix this issue?

Also, I have a question about the output. Besides using the interactive mode to manually inspect cells that might be "misplaced" in the reduced-dimension space, I would like to systematically find the cells that don't quite fit the clusters they were originally assigned to. In other words, how would you suggest using sleepwalk to refine my clustering, since I suspect that many of my cells were wrongly assigned to their clusters? I am using the Seurat package for dimension reduction and clustering.

Thank you very much, Gustavo
opened by gufranca 2
Error: 'browser' must be a non-empty character string
Hello,

After calling the sleepwalk function on a Seurat object, I got this error:

> sleepwalk( as.matrix([email protected][email protected]), as.matrix([email protected][email protected]) )
Estimating 'maxdist' for feature matrix 1
Error in browseURL(str_c("http://localhost:", port, "/", pageobj$startPage), :
  'browser' must be a non-empty character string

I have loaded the stringr library (which contains the function str_c()), and I cannot find the file that causes this error. Has anyone else run into this problem?

Thank you
opened by PedroRaposo 2
slw_on_selection error when sleepwalk is not attached

Running sleepwalk without attaching the package (i.e., NOT specifying library(sleepwalk)) like this works fine:

sleepwalk::sleepwalk(se[email protected][email protected], t([email protected][[email protected],]))

But the moment you select cells with your mouse, it crashes (the browser tab closes) and R gives this error:

Error in slw_on_selection(selPoints, 1) : could not find function "slw_on_selection"

Loading the package using library(sleepwalk) solves the issue, but it'd be nice if it weren't necessary.

opened by FelixTheStudent 0
doc for comparison

The example on the web page for comparing two embeddings still uses the old version, where both distances are used concurrently. We also need to change the explanation below it to say that the same cell always has the same colour in all embeddings.

opened by simon-anders 0
Suggestion: Link embeddings from transposed table

Let's say I have a matrix with individuals (e.g. cells) as rows and features as columns, and I run a UMAP on both the original matrix and the transposed one. It would be natural to look at the UMAP of individuals with the default usage (the distances to other individuals), but it would also be interesting to see the features for a given individual (and vice versa).

Is it clear what I mean?

opened by StaffanBetner 2

Releases(v0.3.2)

v0.3.2(Sep 17, 2021)
jrc (as of v0.5.0) now uses the setLimits function for all security restrictions. This update fixes the dependency problem caused by that change.

v0.3.1(Sep 30, 2020)
Fixed the broken path to the start page, which was caused by a jrc update.

v0.3.0(Feb 27, 2020)
New argument metric allows using angular distance (metric = "cosine") as an alternative to the default Euclidean distance (metric = "euclid").

If compare = "distances", it is no longer required to provide several embeddings. If only one embedding is given, it will be used for all the distances.
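
To give a feel for how the two metrics differ, here is a minimal base-R sketch. It is not Sleepwalk's internal code: the feature matrix is made up, the helper names euclid_distance and cosine_distance are hypothetical, and "1 minus cosine similarity" is only one common definition of a cosine-based distance, so the package's exact formula may differ.

```r
# Illustrative sketch only; this is not Sleepwalk's internal code.
set.seed(1)

# Hypothetical feature matrix: 5 cells (rows) x 3 features (columns).
features <- matrix(rnorm(15), nrow = 5)

# Pairwise Euclidean distances between rows, as a full matrix.
euclid_distance <- function(m) as.matrix(dist(m, method = "euclidean"))

# Pairwise cosine distances between rows: 1 - cos(angle between the rows).
cosine_distance <- function(m) {
  norms <- sqrt(rowSums(m^2))
  1 - (m %*% t(m)) / outer(norms, norms)
}

# Multiply the first cell's features by 10 (same direction, larger magnitude).
scaled <- features
scaled[1, ] <- 10 * scaled[1, ]

# Euclidean distances involving cell 1 change ...
print(round(euclid_distance(features)[1, ], 3))
print(round(euclid_distance(scaled)[1, ], 3))

# ... while cosine distances stay the same up to rounding error.
print(all.equal(cosine_distance(features), cosine_distance(scaled)))
```

A cosine-type metric ignores the overall magnitude of each row, which can be useful when rows differ mainly in total signal rather than in profile.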

v0.2.1(Oct 2, 2019)
Changes due to an update of the jrc package.

Indices of selected points are no longer stored in a variable and can be accessed only via the callback function. Thus, no changes are made to the global environment unless the user makes them explicitly.

Added the possibility to pass arguments to jrc::openPage (such as the port number or the browser in which to open the app).
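
The two changes above could be combined roughly as follows. This is a sketch under assumptions, not the package's documented usage: it assumes the callback is passed through an argument named on_selection (please check ?sleepwalk), that it receives the selected indices and the embedding index, and that extra arguments such as port and browser are forwarded to jrc::openPage. The names selected_cells, remember_selection and launch_sleepwalk are hypothetical.

```r
# Explicit storage chosen by the user; nothing is written to the
# global environment behind the user's back.
selected_cells <- new.env()

# Callback taking two arguments: indices of the selected points and
# the index of the embedding in which they were selected (assumed).
remember_selection <- function(points, embedding_index) {
  selected_cells$points <- points
  selected_cells$embedding <- embedding_index
  invisible(points)
}

# Hypothetical helper: starts the app only in an interactive session
# with the package available; extra arguments (e.g. port, browser)
# are passed on via '...'.
launch_sleepwalk <- function(embedding, features, ...) {
  # require() attaches the package; one issue above reports that
  # selection fails when the package is only referenced via '::'.
  if (!interactive() ||
      !suppressWarnings(require("sleepwalk", quietly = TRUE,
                                character.only = TRUE))) {
    message("Sleepwalk not launched (non-interactive session or package missing).")
    return(invisible(FALSE))
  }
  sleepwalk::sleepwalk(embedding, features,
                       on_selection = remember_selection, ...)
  invisible(TRUE)
}

# Example call (opens a browser tab; 'emb' and 'feat' are placeholders):
# launch_sleepwalk(emb, feat, port = 8080, browser = "firefox")
```

After a selection in the app, selected_cells$points would hold the indices of the chosen points, so they can be used for subsetting in R.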

v0.2.0(Sep 27, 2019)
The embedding is now plotted with HTML Canvas. This makes Sleepwalk faster and allows more points to be displayed simultaneously.

New parameter mode = c("canvas", "svg") allows the user to go back to the old SVG-based version of the Sleepwalk app.

Bug in slw_snapshot is fixed. The function no longer returns a list of identical plots, when used with several different embeddings.


Owner

S. Anders's research group at ZMBH

GitHub Repository https://anders-biostat.github.io/sleepwalk/
