The Submission for SIMMC 2.0 Challenge 2021


Requirements

Preprocessing

  1. Download data
  • Download the data provided by the challenge organizers and put it in the data folder.
  • Unzip the data files.
  2. Image saving
  • Preprocess the image files in advance. The preprocessed result uses the image name as the key and the visual features as the value (see the sketch after the commands below).
python3 image_preprocessor.py
python3 image_preprocessor_final.py
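
A minimal sketch of consuming the preprocessed result described above, assuming the scripts write a pickled dictionary; the file name image_features.pkl is a placeholder, not necessarily the actual output path:

import pickle

# Load the preprocessed visual features (file name is an assumption;
# check the output path used by image_preprocessor.py).
with open("image_features.pkl", "rb") as f:
    image_features = pickle.load(f)

# Each entry maps an image name to its precomputed visual representation.
for image_name, visual in list(image_features.items())[:3]:
    print(image_name, getattr(visual, "shape", type(visual)))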

Step 1 (ITM)

First, the model is post-trained with image-to-text matching (ITM): the image is an individual object and the text is that object's visual metadata. The code is provided in the ITM folder.
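
As a rough illustration of this post-training objective (not the exact code in the ITM folder), the model classifies whether an object image and a visual-metadata text belong together; a minimal PyTorch sketch, with the fused-feature dimension and all names assumed:

import torch
import torch.nn as nn

class ITMHead(nn.Module):
    """Binary matched / not-matched classifier over a fused image-text representation."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, 2)

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        return self.classifier(fused)  # logits of shape [batch, 2]

# Hypothetical fused features for (object image, visual metadata) pairs.
fused = torch.randn(8, 768)
labels = torch.randint(0, 2, (8,))  # 1 = matched pair, 0 = mismatched pair
loss = nn.CrossEntropyLoss()(ITMHead(768)(fused), labels)
loss.backward()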

Step 2 (BTM)

Second, pretraining is performed so that the background representation of the image can be used in the subtasks. As in ITM, the model is trained to match an image and a text, but here the image is the background of the dialog and the text is the entire dialog context. The code is provided in the BTM folder.
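
The difference from ITM lies in how the pairs are formed: a positive pair is a dialog's background image with that dialog's full context, and a negative pair swaps in the background of a different dialog. A minimal sketch of such pair construction; the field names are assumptions, not the repository's actual data format:

import random

def build_btm_pairs(dialogs):
    """Return (background_image, dialog_context, label) tuples for matching."""
    pairs = []
    for d in dialogs:
        # Positive pair: the dialog context with its own background image.
        pairs.append((d["background_image"], d["context"], 1))
        # Negative pair: the same context with a background from another dialog.
        other = random.choice([x for x in dialogs if x is not d])
        pairs.append((other["background_image"], d["context"], 0))
    return pairs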

Step 3

This is the training process for each subtask. You can train the model in each of the subtask folders (sub1, sub2_1, sub2_2, sub2_3, sub2_4, sub4).

Model

All models can be downloaded from the following link

model.pt is the model used to evaluate devtest, and its results are saved in the dstc10-simmc-entry folder. model_final.pt is the model used to evaluate teststd, and its results are saved in the dstc10-simmc-final-entry folder. However, because training of model_final.pt was not completed within the challenge period, we used model.pt for inference on the teststd data in subtask 2.
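
A minimal sketch of loading one of these checkpoints for inference, assuming they are standard PyTorch checkpoints; build_model is a hypothetical placeholder for the model construction code in the corresponding subtask folder:

import torch

# model.pt       -> devtest results  (dstc10-simmc-entry)
# model_final.pt -> teststd results  (dstc10-simmc-final-entry)
state_dict = torch.load("model.pt", map_location="cpu")

# model = build_model()            # hypothetical: use the subtask's model class
# model.load_state_dict(state_dict)
# model.eval()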

Evaluation

We evaluate with the scripts provided by the SIMMC challenge organizers:

$ python tools/disambiguator_evaluation.py \
    --pred_file="{PATH_TO_PRED_FILE}" \
    --test_file="{PATH_TO_TEST_FILE}"

(line-by-line evaluation)
$ python -m gpt2_dst.scripts.evaluate \
    --input_path_target={PATH_TO_GROUNDTRUTH_TARGET} \
    --input_path_predicted={PATH_TO_MODEL_PREDICTIONS} \
    --output_path_report={PATH_TO_REPORT}

(Or, dialog level evaluation)
$ python -m utils.evaluate_dst \
    --input_path_target={PATH_TO_GROUNDTRUTH_TARGET} \
    --input_path_predicted={PATH_TO_MODEL_PREDICTIONS} \
    --output_path_report={PATH_TO_REPORT}

$ python tools/response_evaluation.py \
    --data_json_path={PATH_TO_GOLD_RESPONSES} \
    --model_response_path={PATH_TO_MODEL_RESPONSES} \
    --single_round_evaluation

$ python tools/retrieval_evaluation.py \
    --retrieval_json_path={PATH_TO_GROUNDTRUTH_RETRIEVAL} \
    --model_score_path={PATH_TO_MODEL_CANDIDATE_SCORES} \
    --single_round_evaluation

DevTest Results

Subtask #1: Multimodal Disambiguation

Test Method Accuracy
GPT2 from CO(Challenge Organizer) 73.9
Ours 92.28

Subtask #2: Multimodal Coreference Resolution

Test Method       Object F1
GPT2 from CO      0.366
Ours-1 (sub2_1)   0.595
Ours-2 (sub2_2)   0.604
Ours-3 (sub2_3)   0.607
Ours-4 (sub2_4)   0.608

Subtask #3: Multimodal Dialog State Tracking

No Training/Testing

Subtask #4: Multimodal Dialog Response Generation

Generation

Baseline             BLEU
GPT2 from CO         0.192
MTN-SIMMC2 from CO   0.217
Ours                 0.285

Retrieval

No Training/Testing
