Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations


The Second Situated Interactive MultiModal Conversations (SIMMC 2.0) Challenge 2021

Welcome to the Second Situated Interactive Multimodal Conversations (SIMMC 2.0) Track for DSTC10 2021.

The SIMMC challenge aims to lay the foundations for real-world assistant agents that can handle multimodal inputs and perform multimodal actions. As in the first SIMMC challenge (part of DSTC9), we focus on task-oriented dialogs that encompass a situated multimodal user context in the form of a co-observed and immersive virtual reality (VR) environment. The conversational context is dynamically updated on each turn based on user actions (e.g., verbal interactions or navigation within the scene). For this challenge, we release a new Immersive SIMMC 2.0 dataset in two shopping domains: furniture and fashion.

Organizers: Seungwhan Moon, Satwik Kottur, Paul A. Crook, Ahmad Beirami, Babak Damavandi, Alborz Geramifard

Figure: Example from SIMMC-Furniture Dataset

Latest News

  • [June 14, 2021] Challenge announcement. Training / development datasets (SIMMC v2.0) are released.

Important Links

Timeline

| Date | Milestone |
| :--- | :--- |
| June 14, 2021 | Training & development data released |
| Sept 24, 2021 | Test-Std data released, End of Challenge Phase 1 |
| Oct 1, 2021 | Entry submission deadline, End of Challenge Phase 2 |
| Oct 8, 2021 | Final results announced |

Track Description

Tasks and Metrics

We present four sub-tasks primarily aimed at replicating human-assistant actions in order to enable rich and interactive shopping scenarios.

| Sub-Task #1 | Multimodal Disambiguation |
| :--- | :--- |
| Goal | To classify whether the assistant should disambiguate in the next turn |
| Input | Current user utterance, Dialog context, Multimodal context |
| Output | Binary label |
| Metrics | Binary classification accuracy |

| Sub-Task #2 | Multimodal Coreference Resolution |
| :--- | :--- |
| Goal | To resolve referent objects to their canonical ID(s) as defined by the catalog |
| Input | Current user utterance with object mentions, Dialog context, Multimodal context |
| Output | Canonical object IDs |
| Metrics | Coref F1 / Precision / Recall |

| Sub-Task #3 | Multimodal Dialog State Tracking (MM-DST) |
| :--- | :--- |
| Goal | To track user belief states across multiple turns |
| Input | Current user utterance, Dialog context, Multimodal context |
| Output | Belief state for current user utterance |
| Metrics | Slot F1, Intent F1 |

| Sub-Task #4 | Multimodal Dialog Response Generation & Retrieval |
| :--- | :--- |
| Goal | To generate Assistant responses or retrieve from a candidate pool |
| Input | Current user utterance, Dialog context, Multimodal context, (Ground-truth API Calls) |
| Output | Assistant response utterance |
| Metrics | Generation: BLEU-4, Retrieval: MRR, R@1, R@5, R@10, Mean Rank |
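
To make the Sub-Task #2 metrics concrete, here is a minimal sketch of precision, recall, and F1 computed over sets of canonical object IDs. The function name `coref_prf` and the choice of micro-averaging across turns are assumptions made for illustration; the evaluation script shipped in the repository remains the authoritative definition.

```python
from typing import Iterable, Set, Tuple


def coref_prf(
    predicted: Iterable[Set[int]], gold: Iterable[Set[int]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over per-turn object-ID sets.

    Hypothetical helper: counts are pooled over all turns (micro-averaging),
    which is an assumption and may differ from the official script.
    """
    n_correct = n_pred = n_gold = 0
    for pred_ids, gold_ids in zip(predicted, gold):
        n_correct += len(pred_ids & gold_ids)  # IDs predicted and actually referred to
        n_pred += len(pred_ids)
        n_gold += len(gold_ids)

    # Guard against empty predictions or empty references.
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Two turns: the model over-predicts in the first and under-predicts in the second.
    predictions = [{1, 2}, {3}]
    references = [{1}, {3, 4}]
    print(coref_prf(predictions, references))
```

In this toy example two of the three predicted IDs are correct and two of the three gold IDs are recovered, so precision, recall, and F1 all come out to 2/3.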

Please check the task input file for a full description of inputs for each subtask.
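
The retrieval metrics for Sub-Task #4 (mean reciprocal rank, recall at k, and mean rank) can be sketched as follows. This assumes each turn comes with one score per candidate response and the index of the ground-truth candidate; the names `rank_of_gold` and `retrieval_metrics`, the cutoffs (1, 5, 10), and the tie-breaking rule are illustrative assumptions rather than the repository's actual API.

```python
from typing import Dict, Sequence


def rank_of_gold(scores: Sequence[float], gold_index: int) -> int:
    """1-based rank of the gold candidate when sorting by descending score.

    Assumption: ties are resolved in favor of the gold candidate.
    """
    return 1 + sum(s > scores[gold_index] for s in scores)


def retrieval_metrics(
    gold_ranks: Sequence[int], ks: Sequence[int] = (1, 5, 10)
) -> Dict[str, float]:
    """Aggregate MRR, recall@k and mean rank from 1-based gold ranks."""
    if not gold_ranks:
        raise ValueError("need at least one ranked turn")
    n = len(gold_ranks)
    metrics = {
        "mrr": sum(1.0 / r for r in gold_ranks) / n,  # mean reciprocal rank
        "mean_rank": sum(gold_ranks) / n,
    }
    for k in ks:
        # Fraction of turns whose gold response is ranked within the top k.
        metrics[f"r@{k}"] = sum(r <= k for r in gold_ranks) / n
    return metrics


if __name__ == "__main__":
    # Gold response ranked 1st, 3rd and 12th in three hypothetical turns.
    print(retrieval_metrics([1, 3, 12]))
```

Higher is better for MRR and recall at k, while lower is better for mean rank, so the numbers should be read together rather than in isolation.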

Evaluation

For the DSTC10 SIMMC Track, we will run a two-phase evaluation as follows.

Challenge Period 1: Participants evaluate their models on the provided devtest set. At the end of Challenge Period 1 (Sept 24), we ask participants to submit their model prediction results and a link to their code repository.

Challenge Period 2: A test-std set will be released on Sept 28 for the participants who submitted results for Challenge Period 1. We ask participants to submit their model predictions on the test-std set by Oct 1. We will announce the final results and the winners on Oct 8.

Challenge Instructions

(1) Challenge Registration

  • Fill out this form to register at DSTC10. Check “Track 3: SIMMC 2.0: Situated Interactive Multimodal Conversational AI” along with other tracks you are participating in.

(2) Download Datasets and Code

  • Irrespective of participation in the challenge, we'd like to encourage those interested in this dataset to complete this optional survey. This will also help us communicate any future updates on the codebase, the datasets, and the challenge track.

  • Git clone our repository to download the datasets and the code. You may use the provided baselines as a starting point to develop your models.

$ git lfs install
$ git clone https://github.com/facebookresearch/simmc2.git

(3) Reporting Results for Challenge Phase 1

  • Submit your model prediction results on the devtest set, following the submission instructions.
  • We will release the test-std set (with ground-truth labels hidden) on Sept 24.

(4) Reporting Results for Challenge Phase 2

  • Submit your model prediction results on the test-std set, following the submission instructions.
  • We will evaluate the participants’ model predictions using the same evaluation script as in Phase 1, and announce the results.

Contact

Questions related to SIMMC Track, Data, and Baselines

Please contact [email protected], or leave comments in the GitHub repository.

DSTC Mailing List

If you want to get the latest updates about DSTC10, join the DSTC mailing list.

Citations

If you want to publish experimental results with our datasets or use the baseline models, please cite the following article:

@article{kottur2021simmc,
  title={SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations},
  author={Kottur, Satwik and Moon, Seungwhan and Geramifard, Alborz and Damavandi, Babak},
  journal={arXiv preprint arXiv:2104.08667},
  year={2021}
}

NOTE: The paper above describes in detail the datasets, the collection process, and some of the baselines we provide in this challenge. The paper reports the results from an earlier version of the dataset and with different train-dev-test splits, hence the baseline performances on the challenge resources will be slightly different.

License

SIMMC 2.0 is released under CC-BY-NC-SA-4.0, see LICENSE for details.

Owner
Facebook Research