PyTorch framework, for reproducing experiments from the paper Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

Overview

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks.

Code, based on the PyTorch framework, for reproducing experiments from the paper Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks.

Install Requirements

Tested with python 3.8.

pip install -r requirements.txt

1. Incremental Hierarchical Tensor Rank Learning

1.1 Generating Data

Matrix Completion/Sensing

python matrix_factorization_data_generator.py --task_type completion
  • Setting task_type to "sensing" will generate matrix sensing data.
  • Use the -h flag for information on the customizable run arguments.

Tensor Completion/Sensing

python tensor_sensing_data_generator.py --task_type completion
  • Setting task_type to "sensing" will generate tensor sensing data.
  • Use the -h flag for information on the customizable run arguments.

1.2 Running Experiments

Matrix Factorization

python matrix_factorization_experiments_runner.py \
--dataset_path 
   
     \
--epochs 500000 \
--num_train_samples 2048 \
--outputs_dir "outputs/mf_exps" \
--save_logs \
--save_metric_plots \
--save_checkpoints \
--validate_every 25 \
--save_every_num_val 50 \
--epoch_log_interval 25 \
--train_batch_log_interval -1 

   
  • dataset_path should point to the dataset file generated in the previous step.
  • A folder with checkpoints, metric plots, and a log file will be automatically created under the directory specified by outputs_dir.
  • Use the -h flag for information on the customizable run arguments.

Tensor Factorization

python tensor_factorization_experiments_runner.py \
--dataset_path 
   
     \
--epochs 500000 \
--num_train_samples 2048 \
--outputs_dir "outputs/tf_exps" \
--save_logs \
--save_metric_plots \
--save_checkpoints \
--validate_every 25 \
--save_every_num_val 50 \
--epoch_log_interval 25 \
--train_batch_log_interval -1 

   
  • dataset_path should point to the dataset file generated in the previous step.
  • A folder with checkpoints, metric plots, and a log file will be automatically created under the directory specified by outputs_dir.
  • Use the -h flag for information on the customizable run arguments.

Hierarchical Tensor Factorization

python hierarchical_tensor_factorization_experiments_runner.py \
--dataset_path 
   
     \
--epochs 500000 \
--num_train_samples 2048 \
--outputs_dir "outputs/htf_exps" \
--save_logs \
--save_metric_plots \
--save_checkpoints \
--validate_every 25 \
--save_every_num_val 50 \
--epoch_log_interval 25 \
--train_batch_log_interval -1 

   
  • dataset_path should point to the dataset file generated in the previous step.
  • A folder with checkpoints, metric plots, and a log file will be automatically created under the directory specified by outputs_dir.
  • Use the -h flag for information on the customizable run arguments.

1.3 Plotting Results

Plotting metrics against the number of iterations for an experiment (or multiple experiments) can be done by:

python dynamical_analysis_results_multi_plotter.py \
--plot_config_path 
   

   
  • plot_config_path should point to a file with the plot configuration. For example, plot_configs/mf_tf_htf_dyn_plot_config.json is the configuration used to create the plot below. To run it, it suffices to fill in the checkpoint_path fields (checkpoints are created during training inside the respective experiment's folder).

Example plot:

2. Countering Locality Bias of Convolutional Networks via Regularization

2.1. Is Same Class

2.1.1 Generating Data

Generating train data is done by running:

python is_same_class_data_generator.py --train --num_samples 5000

For test data use:

python is_same_class_data_generator.py --num_samples 10000
  • Use the output_dir argument to set the output directory in which the datasets will be saved (default is ./data/is_same).
  • The flag train determines whether to generate the dataset using the train or test set of the original dataset.
  • Specify num_samples to set how many samples to generate.
  • Use the -h flag for information on the customizable run arguments.

2.1.2 Running Experiments

python is_same_class_experiments_runner.py \
--train_dataset_path 
   
     \
--test_dataset_path 
    
      \
--epochs 150 \
--outputs_dir "outputs/is_same_exps" \
--save_logs \
--save_metric_plots \
--save_checkpoints \
--validate_every 1 \
--save_every_num_val 1 \
--epoch_log_interval 1 \
--train_batch_log_interval 50 \
--stop_on_perfect_train_acc \
--stop_on_perfect_train_acc_patience 20 \
--model resnet18 \
--distance 0 \
--grad_change_reg_coeff 0

    
   
  • train_dataset_path and test_dataset_path are the paths of the train and test dataset files, respectively.
  • A folder with checkpoints, metric plots, and a log file will be automatically created under the directory specified by outputs_dir.
  • Use the -h flag for information on the customizable run arguments.

2.1.3 Plotting Results

Plotting different regularization options against the task difficulty can be done by:

\ --error_bars_opacity 0.5 ">
python locality_bias_plotter.py \
--experiments_dir 
   
     \
--experiment_groups_dir_names 
     
     
       .. \
--per_experiment_group_y_axis_value_name 
       
       
         .. \ --per_experiment_group_label 
         
         
           .. \ --x_axis_value_name "distance" \ --plot_title "Is Same Class" \ --x_label "distance between images" \ --y_label "test accuracy (%)" \ --save_plot_to 
          
            \ --error_bars_opacity 0.5 
          
         
        
       
      
     
    
   
  • Set experiments_dir to the directory containing the experiments you would like to plot.
  • Specify after experiment_groups_dir_names the names of the experiment groups, each group name should correspond to a sub-directory with the group name under experiments_dir path.
  • Use per_experiment_group_y_axis_value_name to name the report value for each experiment. Name should match key in experiment's summary.json files. Use dot notation for nested keys.
  • per_experiment_group_label sets a label for the groups by the same order they were mentioned.
  • save_plot_to is the path to save the plot at.
  • Use x_axis_value_name to set the name of the value to use as the x-axis. This should match to a key in either summary.json or config.json files. Use dot notation for nested keys.
  • Use the -h flag for information on the customizable run arguments.

Example plots:

2.2. Pathfinder

2.2.1 Generating Data

To generate Pathfinder datasets, first run the following command to create raw image samples for all specified path lengths:

python pathfinder_raw_images_generator.py \
--num_samples 20000 \
--path_lengths 3 5 7 9
  • Use the output_dir argument to set the output directory in which the raw samples will be saved (default is ./data/pathfinder/raw).
  • The samples for each path length are separated to different directories.
  • Use the -h flag for information on the customizable run arguments.

Then, use the following command to create the dataset files for all path lengths (one dataset per length):

python pathfinder_data_generator.py \
--dataset_path data/pathfinder/raw \
--num_train_samples 10000 \
--num_test_samples 10000
  • dataset_path is the path to the directory of the raw images.
  • Use the output_dir argument to set the output directory in which the datasets will be saved (default is ./data/pathfinder).
  • Use the -h flag for information on the customizable run arguments.

2.2.2 Running Experiments

python pathfinder_experiments_runner.py \
--dataset_path 
   
     \
--epochs 150 \
--outputs_dir "outputs/pathfinder_exps" \
--save_logs \
--save_metric_plots \
--save_checkpoints \
--validate_every 1 \
--save_every_num_val 1 \
--epoch_log_interval 1 \
--train_batch_log_interval 50 \
--stop_on_perfect_train_acc \
--stop_on_perfect_train_acc_patience 20 \
--model resnet18 \
--grad_change_reg_coeff 0

   
  • dataset_path should point to the dataset file generated in the previous step.
  • A folder with checkpoints, metric plots, and a log file will be automatically created under the directory specified by outputs_dir.
  • Use the -h flag for information on the customizable run arguments.

2.2.3 Plotting Results

Plotting different regularization options against the task difficulty can be done by:

\ --error_bars_opacity 0.5">
python locality_bias_plotter.py \
--experiments_dir 
   
     \
--experiment_groups_dir_names 
     
     
       .. \
--per_experiment_group_y_axis_value_name 
       
       
         .. \ --per_experiment_group_label 
         
         
           .. \ --x_axis_value_name "dataset_path" \ --plot_title "Pathfinder" \ --x_label "path length" \ --y_label "test accuracy (%)" \ --x_axis_ticks 3 5 7 9 \ --save_plot_to 
          
            \ --error_bars_opacity 0.5 
          
         
        
       
      
     
    
   
  • Set experiments_dir to the directory containing the experiments you would like to plot.
  • Specify after experiment_groups_dir_names the names of the experiment groups, each group name should correspond to a sub-directory with the group name under experiments_dir path.
  • Use per_experiment_group_y_axis_value_name to name the report value for each experiment. Name should match key in experiment's summary.json files. Use dot notation for nested keys.
  • per_experiment_group_label sets a label for the groups by the same order they were mentioned.
  • save_plot_to is the path to save the plot at.
  • Use x_axis_value_name to set the name of the value to use as the x-axis. This should match to a key in either summary.json or config.json files. Use dot notation for nested keys.
  • Use the -h flag for information on the customizable run arguments.

Example plots:

Citation

For citing the paper, you can use:

@article{razin2022implicit,
  title={Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks},
  author={Razin, Noam and Maman, Asaf and Cohen, Nadav},
  journal={arXiv preprint arXiv:2201.11729},
  year={2022}
}
Owner
Asaf
MS.c Student Computer Science
Asaf
High-Fidelity Pluralistic Image Completion with Transformers (ICCV 2021)

Image Completion Transformer (ICT) Project Page | Paper (ArXiv) | Pre-trained Models | Supplemental Material This repository is the official pytorch i

Ziyu Wan 243 Jan 03, 2023
CVPR2020 Counterfactual Samples Synthesizing for Robust VQA

CVPR2020 Counterfactual Samples Synthesizing for Robust VQA This repo contains code for our paper "Counterfactual Samples Synthesizing for Robust Visu

72 Dec 22, 2022
A Comparative Framework for Multimodal Recommender Systems

Cornac Cornac is a comparative framework for multimodal recommender systems. It focuses on making it convenient to work with models leveraging auxilia

Preferred.AI 671 Jan 03, 2023
R3Det based on mmdet 2.19.0

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object Installation # install mmdetection first if you haven't installed it

SJTU-Thinklab-Det 38 Dec 15, 2022
DLWP: Deep Learning Weather Prediction

DLWP: Deep Learning Weather Prediction DLWP is a Python project containing data-

Kushal Shingote 3 Aug 14, 2022
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading

A tour through tensorflow with financial data I present several models ranging in complexity from simple regression to LSTM and policy networks. The s

195 Dec 07, 2022
Unsupervised MRI Reconstruction via Zero-Shot Learned Adversarial Transformers

Official TensorFlow implementation of the unsupervised reconstruction model using zero-Shot Learned Adversarial TransformERs (SLATER). (https://arxiv.

ICON Lab 22 Dec 22, 2022
[NeurIPS 2021] Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods

Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods Large Scale Learning on Non-Homophilous Graphs: New Benchmark

60 Jan 03, 2023
MIRACLE (Missing data Imputation Refinement And Causal LEarning)

MIRACLE (Missing data Imputation Refinement And Causal LEarning) Code Author: Trent Kyono This repository contains the code used for the "MIRACLE: Cau

van_der_Schaar \LAB 15 Dec 29, 2022
[NeurIPS 2020] Official Implementation: "SMYRF: Efficient Attention using Asymmetric Clustering".

SMYRF: Efficient attention using asymmetric clustering Get started: Abstract We propose a novel type of balanced clustering algorithm to approximate a

Giannis Daras 46 Dec 22, 2022
Experiments and examples converting Transformers to ONNX

Experiments and examples converting Transformers to ONNX This repository containes experiments and examples on converting different Transformers to ON

Philipp Schmid 4 Dec 24, 2022
Fast Axiomatic Attribution for Neural Networks (NeurIPS*2021)

Fast Axiomatic Attribution for Neural Networks This is the official repository accompanying the NeurIPS 2021 paper: R. Hesse, S. Schaub-Meyer, and S.

Visual Inference Lab @TU Darmstadt 11 Nov 21, 2022
PyTorch implementation of Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy

Anomaly Transformer in PyTorch This is an implementation of Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy. This pape

spencerbraun 160 Dec 19, 2022
UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities

Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.

Microsoft 7.6k Jan 01, 2023
Hunt down social media accounts by username across social networks

Hunt down social media accounts by username across social networks Installation | Usage | Docker Notes | Contributing Installation # clone the repo $

1 Dec 14, 2021
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 07, 2022
Implementation of the bachelor's thesis "Real-time stock predictions with deep learning and news scraping".

Real-time stock predictions with deep learning and news scraping This repository contains a partial implementation of my bachelor's thesis "Real-time

David Álvarez de la Torre 0 Feb 09, 2022
PyTorch implementation of our paper How robust are discriminatively trained zero-shot learning models?

How robust are discriminatively trained zero-shot learning models? This repository contains the PyTorch implementation of our paper How robust are dis

Mehmet Kerim Yucel 5 Feb 04, 2022
An implementation of Fastformer: Additive Attention Can Be All You Need in TensorFlow

Fast Transformer This repo implements Fastformer: Additive Attention Can Be All You Need by Wu et al. in TensorFlow. Fast Transformer is a Transformer

Rishit Dagli 139 Dec 28, 2022