Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

Last update: Nov 21, 2022

Overview

BigGAN Audio Visualizer

Description

This visualizer explores the latent space of BigGAN (Brock et al., 2018). It uses the pitch and tempo of an audio file to generate the noise and class vectors fed to the model and to interpolate between them. Classes are chosen manually or, optionally, by semantic similarity over BERT encodings of a lyrics corpus.
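As a rough illustration of the idea described above, the sketch below maps the pitch-class (chroma) energies of one audio frame onto a BigGAN-style class vector and linearly interpolates between two frames. It is not the repository's code: the names `class_vector` and `interpolate`, the normalization, and the toy energy values are assumptions made for illustration, and a real pipeline would likely use NumPy arrays and an audio analysis library instead of plain lists.

```python
NUM_IMAGENET_CLASSES = 1000  # BigGAN is conditioned on the 1000 ImageNet classes


def class_vector(chroma, classes, size=NUM_IMAGENET_CLASSES):
    """Spread normalized pitch-class energies over the chosen class indices."""
    total = sum(chroma) or 1.0  # avoid division by zero on silent frames
    vec = [0.0] * size
    for energy, idx in zip(chroma, classes):
        vec[idx] += energy / total
    return vec


def interpolate(a, b, t):
    """Linear interpolation between two vectors, with t in [0, 1]."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


# Toy example: two pitch classes mapped to two hypothetical class indices.
classes = [5, 7]
frame_a = class_vector([1.0, 3.0], classes)
frame_b = class_vector([3.0, 1.0], classes)
midpoint = interpolate(frame_a, frame_b, 0.5)
print(midpoint[5], midpoint[7])
```

Interpolating between per-frame vectors, rather than switching abruptly, is what keeps consecutive video frames visually continuous.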

Usage:

usage: visualize.py [-h] -s SONG [--resolution {128,256,512}] [-d DURATION]
               [-ps [200-295]] [-ts [0.05-0.8]]
               [--classes CLASSES [CLASSES ...]] [-n NUM_CLASSES]
               [--jitter [0-1]] [--frame_length i*2^6] [--truncation [0.1-1]]
               [--smooth_factor [10-30]] [--batch_size BATCH_SIZE]
               [-o OUTPUT_FILE] [--use_last_vectors] [--use_last_classes]
               [-l LYRICS]
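For readers who want to see how a command line like this could be declared, here is a minimal `argparse` sketch covering a subset of the options listed above. It is a reconstruction from the usage text, not the actual contents of `visualize.py`: the argument types are assumptions, and range checks such as `[200-295]` are omitted.

```python
import argparse


def build_parser():
    """Declare a subset of the documented options (types are assumed)."""
    parser = argparse.ArgumentParser(prog="visualize.py")
    # The usage line shows -s SONG without brackets, so it is treated as required.
    parser.add_argument("-s", "--song", required=True,
                        help="path to input audio file")
    parser.add_argument("--resolution", type=int, choices=[128, 256, 512],
                        default=512, help="output video resolution")
    parser.add_argument("-ps", "--pitch_sensitivity", type=int, default=220)
    parser.add_argument("-ts", "--tempo_sensitivity", type=float, default=0.25)
    parser.add_argument("-n", "--num_classes", type=int, default=12)
    parser.add_argument("--truncation", type=float, default=1.0)
    parser.add_argument("--use_last_vectors", action="store_true")
    parser.add_argument("-l", "--lyrics", default=None,
                        help="path to lyrics file")
    return parser


# Parse an example command line built only from documented flags.
args = build_parser().parse_args(
    ["-s", "input/romantic.mp3", "--resolution", "256", "-n", "6"]
)
print(args.song, args.resolution, args.num_classes)
```

Options that are not passed fall back to the defaults shown in the arguments table, for example `args.pitch_sensitivity` is `220` here.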

Arguments

short	long	default	range	help
`-h`	`--help`			show this help message and exit
`-s`	`--song`	`input/romantic.mp3`		path to input audio file
	`--resolution`	`512`	`{128,256,512}`	output video resolution
`-d`	`--duration`	`None`		output video duration
`-ps`	`--pitch_sensitivity`	`220`	`[200-295]`	controls the sensitivity of the class vector to changes in pitch
`-ts`	`--tempo_sensitivity`	`0.25`	`[0.05-0.8]`	controls the sensitivity of the noise vector to changes in volume and tempo
	`--classes`	`None`		manually specify [--num_classes] ImageNet classes
`-n`	`--num_classes`	`12`		number of unique classes to use
	`--jitter`	`0.5`	`[0-1]`	controls jitter of the noise vector to reduce repetition
	`--frame_length`	`512`	`i*2^6`	number of audio frames per video frame in the output
	`--truncation`	`1`	`[0.1-1]`	BigGAN truncation parameter; controls the complexity of structure within frames
	`--smooth_factor`	`20`	`[10-30]`	controls interpolation between class vectors to smooth rapid fluctuations
	`--batch_size`	`30`		BigGAN batch_size
`-o`	`--output_file`			name of output file stored in output/, defaults to [--song] path base_name
	`--use_last_vectors`	`False`		set flag to use previous saved class/noise vectors
	`--use_last_classes`	`False`		set flag to use previous classes
`-l`	`--lyrics`	`None`		path to lyrics file; setting [--lyrics LYRICS] computes classes by semantic similarity under BERT encodings
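The `--lyrics` option selects classes by semantic similarity. One common way to do this is to rank candidate class names by cosine similarity between their embeddings and an embedding of the lyrics, then keep the top `--num_classes` entries. The sketch below shows only that ranking step. The three-dimensional vectors are toy placeholders standing in for BERT encodings, and `cosine_similarity` and `top_classes` are hypothetical helper names, not functions from this project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_classes(lyrics_embedding, class_embeddings, num_classes):
    """Return the num_classes class names most similar to the lyrics."""
    ranked = sorted(
        class_embeddings,
        key=lambda name: cosine_similarity(lyrics_embedding,
                                           class_embeddings[name]),
        reverse=True,
    )
    return ranked[:num_classes]


# Toy placeholder embeddings; a real run would use BERT encodings instead.
class_embeddings = {
    "goldfish": [1.0, 0.0, 0.0],
    "volcano": [0.0, 1.0, 0.0],
    "seashore": [0.7, 0.0, 0.7],
}
lyrics_embedding = [0.9, 0.1, 0.2]
chosen = top_classes(lyrics_embedding, class_embeddings, 2)
print(chosen)
```

With these toy vectors the ranking is `goldfish`, then `seashore`, then `volcano`, so the two selected classes are `goldfish` and `seashore`.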

Owner

Rush Kapoor
