ComPhy: Compositional Physical Reasoning of Objects and Events from Videos

Last update: Dec 29, 2022

Overview

ComPhy

This repository holds the code for the paper.

ComPhy: Compositional Physical Reasoning of Objects and Events from Videos (under review)

PDF

Project Website

Framework

Code Preparation

git clone https://github.com/comphyreasoning/compositional_physics_learner.git

Installation

pip install -r requirements

Data Preparation

Download the videos, video annotations, and questions from the project website.

Fast Evaluation

Download the region proposals with attribute and physical property predictions from the anonymous Google Drive.
Download the dynamics predictions from the anonymous Google Drive.
Run the executor for factual questions.

sh scripts/test_oe_release.sh

Run the executor for multiple-choice questions.

sh scripts/test_mc_release.sh
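The two evaluation commands above can also be run from a single Python helper. The following is only a minimal sketch, not part of the repository: the script paths are the ones given above, while the function name `run_eval_scripts` and its `repo_root` parameter are hypothetical. It assumes `sh` is on the path and that it is pointed at the cloned repository.

```python
import subprocess
from pathlib import Path

# Script paths are taken from the commands above; everything else is illustrative.
EVAL_SCRIPTS = ["scripts/test_oe_release.sh", "scripts/test_mc_release.sh"]


def run_eval_scripts(scripts=EVAL_SCRIPTS, repo_root="."):
    """Run each script with `sh` from repo_root.

    Returns a dict mapping each script path to its exit code,
    or to None when the script file does not exist.
    """
    results = {}
    for script in scripts:
        if not (Path(repo_root) / script).is_file():
            results[script] = None  # not found, so nothing is run
            continue
        completed = subprocess.run(["sh", script], cwd=repo_root)
        results[script] = completed.returncode
    return results


if __name__ == "__main__":
    for script, code in run_eval_scripts().items():
        status = "missing" if code is None else f"exit code {code}"
        print(f"{script}: {status}")
```

A missing script is reported rather than raised, so a partially prepared checkout still produces a status line for each script.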

Supporting sub-modules

Physical Property Learner and Dynamic predictor

Please refer to this repo for property learning and dynamics prediction.

Perception

This module uses the public NS-VQA perception module for object detection and visual attribute extraction.

Program parser

This module uses the public NS-VQA program parser module to transform language into executable programs.
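To illustrate what "executable program" means here, the toy sketch below runs a hand-written program step by step over a small scene description. It is an assumption-laden illustration only: the operation names (`scene`, `filter_color`, `filter_shape`, `count`), the scene schema, and the `execute` function are hypothetical and are not the vocabulary or API of ComPhy or NS-VQA.

```python
# Toy scene: each object is a dict of attributes (hypothetical schema).
SCENE = [
    {"id": 0, "color": "red", "shape": "cube"},
    {"id": 1, "color": "blue", "shape": "sphere"},
    {"id": 2, "color": "red", "shape": "sphere"},
]


def execute(program, scene):
    """Run a list of (operation, argument) steps, feeding each output to the next step."""
    state = None
    for op, arg in program:
        if op == "scene":
            state = list(scene)  # start from all objects
        elif op == "filter_color":
            state = [obj for obj in state if obj["color"] == arg]
        elif op == "filter_shape":
            state = [obj for obj in state if obj["shape"] == arg]
        elif op == "count":
            state = len(state)
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


# "How many red spheres are there?" written by hand as a program.
PROGRAM = [
    ("scene", None),
    ("filter_color", "red"),
    ("filter_shape", "sphere"),
    ("count", None),
]

print(execute(PROGRAM, SCENE))  # prints 1
```

In this sketch the parser's job would be to produce a sequence like `PROGRAM` from the question text, and the executor's job is the loop in `execute`.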
