Release of PromptSRC with pretrained models.
99 docs/Co-CoOp.md Normal file
@@ -0,0 +1,99 @@
# Conditional Prompt Learning for Vision-Language Models (Co-CoOp, CVPR'22)

[arXiv](https://arxiv.org/abs/2203.05557)

We provide the scripts in [scripts/cocoop](../scripts/cocoop) to reproduce Co-CoOp results (CVPR'22).

Make sure to configure the dataset paths in the environment variable `DATA` and run the commands from the main directory `PromptSRC/`.

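For example, a minimal shell setup could look like this (the dataset root path below is an assumption; point it at wherever you stored the datasets described in [DATASETS.md](DATASETS.md)):

```bash
# illustrative only: set the dataset root and run everything from the repository root
export DATA=/path/to/datasets
cd PromptSRC/
```
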
## Generalization From Base to New Classes

This corresponds to the experiments in Section 4.1, i.e., Table 1.

You will need both `scripts/cocoop/base2new_train.sh` and `scripts/cocoop/base2new_test.sh`. The former trains a model on base classes while the latter evaluates the trained model on new classes. Both scripts take two input arguments, i.e., `DATASET` and `SEED`.

`DATASET` takes as input a dataset name, such as `imagenet` or `caltech101`. The valid names are the filenames in `CoOp/configs/datasets/`.

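For instance, the available names can be listed directly (a sketch assuming the path above; the scripts use the file names without the `.yaml` extension):

```bash
ls CoOp/configs/datasets/
```
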
Below we provide an example of how to train and evaluate the model on ImageNet.

```bash
# seed=1
bash scripts/cocoop/base2new_train.sh imagenet 1
bash scripts/cocoop/base2new_test.sh imagenet 1

# seed=2
bash scripts/cocoop/base2new_train.sh imagenet 2
bash scripts/cocoop/base2new_test.sh imagenet 2

# seed=3
bash scripts/cocoop/base2new_train.sh imagenet 3
bash scripts/cocoop/base2new_test.sh imagenet 3
```

When the evaluation is done, you can use `parse_test_res.py` to automatically calculate the average results. For instance, after you finish the evaluation (including `base2new_train.sh` and `base2new_test.sh`) on ImageNet using the commands above, you would get

```
output
|–– base2new/
| |–– test_new/
| | |–– imagenet/
| | | |–– shots_16/
| | | | |–– CoCoOp/
| | | | | |–– vit_b16_c4_ep10_batch1_ctxv1/
| | | | | | |–– seed1/
| | | | | | |–– seed2/
| | | | | | |–– seed3/
| |–– train_base/
| | |–– imagenet/
| | | |–– shots_16/
| | | | |–– CoCoOp/
| | | | | |–– vit_b16_c4_ep10_batch1_ctxv1/
| | | | | | |–– seed1/
| | | | | | |–– seed2/
| | | | | | |–– seed3/
```

Then, to get the average performance on the base classes, run

```bash
python parse_test_res.py output/base2new/train_base/imagenet/shots_16/CoCoOp/vit_b16_c4_ep10_batch1_ctxv1
```

To get the average performance on the new classes, run

```bash
python parse_test_res.py output/base2new/test_new/imagenet/shots_16/CoCoOp/vit_b16_c4_ep10_batch1_ctxv1 --test-log
```

## Cross-Dataset Transfer

This corresponds to the experiments in Section 4.2, i.e., Table 2.

The relevant scripts are `scripts/cocoop/xd_train.sh` and `scripts/cocoop/xd_test.sh`, where the `DATASET` variable is set to the default, namely `imagenet`. To train the model, run

```bash
# seed=1
bash scripts/cocoop/xd_train.sh 1

# seed=2
bash scripts/cocoop/xd_train.sh 2

# seed=3
bash scripts/cocoop/xd_train.sh 3
```

Then evaluate the model on other datasets, e.g.,

```bash
for SEED in 1 2 3
do
    bash scripts/cocoop/xd_test.sh caltech101 ${SEED}
    bash scripts/cocoop/xd_test.sh oxford_pets ${SEED}
    bash scripts/cocoop/xd_test.sh stanford_cars ${SEED}
done
```

## Domain Generalization

This corresponds to the experiments in Section 4.3, i.e., Table 3.

The steps are similar to those discussed in "Cross-Dataset Transfer", except that you evaluate the model on the variants of ImageNet, i.e., `imagenetv2`, `imagenet_sketch`, `imagenet_a` and `imagenet_r`.

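For example, the evaluation loop then simply swaps in the ImageNet-variant names, using the same `xd_test.sh` call as above:

```bash
for SEED in 1 2 3
do
    bash scripts/cocoop/xd_test.sh imagenetv2 ${SEED}
    bash scripts/cocoop/xd_test.sh imagenet_sketch ${SEED}
    bash scripts/cocoop/xd_test.sh imagenet_a ${SEED}
    bash scripts/cocoop/xd_test.sh imagenet_r ${SEED}
done
```
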
99 docs/CoOp.md Normal file
233 docs/DATASETS.md Normal file
@@ -0,0 +1,233 @@
# How to install datasets

### Acknowledgement: This readme file for installing datasets has been borrowed directly from [MaPLe's](https://github.com/muzairkhattak/multimodal-prompt-learning) official repository.

We recommend putting all datasets under the same folder (say `$DATA`) to ease management, and following the instructions below to organize the datasets so that the source code does not need to be modified. The file structure should look like:

```
$DATA/
|–– imagenet/
|–– caltech-101/
|–– oxford_pets/
|–– stanford_cars/
```

If you have some datasets already installed somewhere else, you can create symbolic links in `$DATA/dataset_name` that point to the original data to avoid duplicate downloads.

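For example (the source path is illustrative):

```bash
# reuse an existing copy of a dataset instead of downloading it again
ln -s /path/to/existing/oxford_pets $DATA/oxford_pets
```
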
Datasets list:
- [ImageNet](#imagenet)
- [Caltech101](#caltech101)
- [OxfordPets](#oxfordpets)
- [StanfordCars](#stanfordcars)
- [Flowers102](#flowers102)
- [Food101](#food101)
- [FGVCAircraft](#fgvcaircraft)
- [SUN397](#sun397)
- [DTD](#dtd)
- [EuroSAT](#eurosat)
- [UCF101](#ucf101)
- [ImageNetV2](#imagenetv2)
- [ImageNet-Sketch](#imagenet-sketch)
- [ImageNet-A](#imagenet-a)
- [ImageNet-R](#imagenet-r)

The instructions to prepare each dataset are detailed below. To ensure reproducibility and fair comparison for future work, we provide fixed train/val/test splits for all datasets except ImageNet, where the validation set is used as the test set. The fixed splits are either taken from the original datasets (if available) or created by us.

### ImageNet
- Create a folder named `imagenet/` under `$DATA`.
- Create `images/` under `imagenet/`.
- Download the dataset from the [official website](https://image-net.org/index.php) and extract the training and validation sets to `$DATA/imagenet/images`. The directory structure should look like
```
imagenet/
|–– images/
| |–– train/ # contains 1,000 folders like n01440764, n01443537, etc.
| |–– val/
```
- If you had downloaded the ImageNet dataset before, you can create symbolic links to map the training and validation sets to `$DATA/imagenet/images`.
- Download the `classnames.txt` to `$DATA/imagenet/` from this [link](https://drive.google.com/file/d/1-61f_ol79pViBFDG_IDlUQSwoLcn2XXF/view?usp=sharing). The class names are copied from [CLIP](https://github.com/openai/CLIP/blob/main/notebooks/Prompt_Engineering_for_ImageNet.ipynb).

### Caltech101
- Create a folder named `caltech-101/` under `$DATA`.
- Download `101_ObjectCategories.tar.gz` from http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz and extract the file under `$DATA/caltech-101`.
- Download `split_zhou_Caltech101.json` from this [link](https://drive.google.com/file/d/1hyarUivQE36mY6jSomru6Fjd-JzwcCzN/view?usp=sharing) and put it under `$DATA/caltech-101`.

The directory structure should look like
```
caltech-101/
|–– 101_ObjectCategories/
|–– split_zhou_Caltech101.json
```
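As a concrete example, the download-and-extract steps for Caltech101 could be scripted as follows (a sketch; the URL is the one given above, and `wget`/`tar` are assumed to be available):

```bash
mkdir -p $DATA/caltech-101
wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz -P $DATA/caltech-101
tar -xzf $DATA/caltech-101/101_ObjectCategories.tar.gz -C $DATA/caltech-101
```
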
### OxfordPets
- Create a folder named `oxford_pets/` under `$DATA`.
- Download the images from https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz.
- Download the annotations from https://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz.
- Download `split_zhou_OxfordPets.json` from this [link](https://drive.google.com/file/d/1501r8Ber4nNKvmlFVQZ8SeUHTcdTTEqs/view?usp=sharing).

The directory structure should look like
```
oxford_pets/
|–– images/
|–– annotations/
|–– split_zhou_OxfordPets.json
```

### StanfordCars
- Create a folder named `stanford_cars/` under `$DATA`.
- Download the train images from http://ai.stanford.edu/~jkrause/car196/cars_train.tgz.
- Download the test images from http://ai.stanford.edu/~jkrause/car196/cars_test.tgz.
- Download the train labels from https://ai.stanford.edu/~jkrause/cars/car_devkit.tgz.
- Download the test labels from http://ai.stanford.edu/~jkrause/car196/cars_test_annos_withlabels.mat.
- Download `split_zhou_StanfordCars.json` from this [link](https://drive.google.com/file/d/1ObCFbaAgVu0I-k_Au-gIUcefirdAuizT/view?usp=sharing).

The directory structure should look like
```
stanford_cars/
|–– cars_test/
|–– cars_test_annos_withlabels.mat
|–– cars_train/
|–– devkit/
|–– split_zhou_StanfordCars.json
```

### Flowers102
- Create a folder named `oxford_flowers/` under `$DATA`.
- Download the images and labels from https://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz and https://www.robots.ox.ac.uk/~vgg/data/flowers/102/imagelabels.mat respectively.
- Download `cat_to_name.json` from [here](https://drive.google.com/file/d/1AkcxCXeK_RCGCEC_GvmWxjcjaNhu-at0/view?usp=sharing).
- Download `split_zhou_OxfordFlowers.json` from [here](https://drive.google.com/file/d/1Pp0sRXzZFZq15zVOzKjKBu4A9i01nozT/view?usp=sharing).

The directory structure should look like
```
oxford_flowers/
|–– cat_to_name.json
|–– imagelabels.mat
|–– jpg/
|–– split_zhou_OxfordFlowers.json
```

### Food101
- Download the dataset from https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/ and extract the file `food-101.tar.gz` under `$DATA`, resulting in a folder named `$DATA/food-101/`.
- Download `split_zhou_Food101.json` from [here](https://drive.google.com/file/d/1QK0tGi096I0Ba6kggatX1ee6dJFIcEJl/view?usp=sharing).

The directory structure should look like
```
food-101/
|–– images/
|–– license_agreement.txt
|–– meta/
|–– README.txt
|–– split_zhou_Food101.json
```

### FGVCAircraft
- Download the data from https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/archives/fgvc-aircraft-2013b.tar.gz.
- Extract `fgvc-aircraft-2013b.tar.gz` and keep only `data/`.
- Move `data/` to `$DATA` and rename the folder to `fgvc_aircraft/`.

The directory structure should look like
```
fgvc_aircraft/
|–– images/
|–– ... # a bunch of .txt files
```

### SUN397
- Create a folder named `sun397/` under `$DATA`.
- Download the images from http://vision.princeton.edu/projects/2010/SUN/SUN397.tar.gz.
- Download the partitions from https://vision.princeton.edu/projects/2010/SUN/download/Partitions.zip.
- Extract these files under `$DATA/sun397/`.
- Download `split_zhou_SUN397.json` from this [link](https://drive.google.com/file/d/1y2RD81BYuiyvebdN-JymPfyWYcd8_MUq/view?usp=sharing).

The directory structure should look like
```
sun397/
|–– SUN397/
|–– split_zhou_SUN397.json
|–– ... # a bunch of .txt files
```

### DTD
- Download the dataset from https://www.robots.ox.ac.uk/~vgg/data/dtd/download/dtd-r1.0.1.tar.gz and extract it to `$DATA`. This should lead to `$DATA/dtd/`.
- Download `split_zhou_DescribableTextures.json` from this [link](https://drive.google.com/file/d/1u3_QfB467jqHgNXC00UIzbLZRQCg2S7x/view?usp=sharing).

The directory structure should look like
```
dtd/
|–– images/
|–– imdb/
|–– labels/
|–– split_zhou_DescribableTextures.json
```

### EuroSAT
- Create a folder named `eurosat/` under `$DATA`.
- Download the dataset from http://madm.dfki.de/files/sentinel/EuroSAT.zip and extract it to `$DATA/eurosat/`.
- Download `split_zhou_EuroSAT.json` from [here](https://drive.google.com/file/d/1Ip7yaCWFi0eaOFUGga0lUdVi_DDQth1o/view?usp=sharing).

The directory structure should look like
```
eurosat/
|–– 2750/
|–– split_zhou_EuroSAT.json
```

### UCF101
- Create a folder named `ucf101/` under `$DATA`.
- Download the zip file `UCF-101-midframes.zip` from [here](https://drive.google.com/file/d/10Jqome3vtUA2keJkNanAiFpgbyC9Hc2O/view?usp=sharing) and extract it to `$DATA/ucf101/`. This zip file contains the extracted middle video frames.
- Download `split_zhou_UCF101.json` from this [link](https://drive.google.com/file/d/1I0S0q91hJfsV9Gf4xDIjgDq4AqBNJb1y/view?usp=sharing).

The directory structure should look like
```
ucf101/
|–– UCF-101-midframes/
|–– split_zhou_UCF101.json
```

### ImageNetV2
- Create a folder named `imagenetv2/` under `$DATA`.
- Go to this GitHub repo: https://github.com/modestyachts/ImageNetV2.
- Download the matched-frequency dataset from https://s3-us-west-2.amazonaws.com/imagenetv2public/imagenetv2-matched-frequency.tar.gz and extract it to `$DATA/imagenetv2/`.
- Copy `$DATA/imagenet/classnames.txt` to `$DATA/imagenetv2/`.

The directory structure should look like
```
imagenetv2/
|–– imagenetv2-matched-frequency-format-val/
|–– classnames.txt
```

### ImageNet-Sketch
- Download the dataset from https://github.com/HaohanWang/ImageNet-Sketch.
- Extract the dataset to `$DATA/imagenet-sketch`.
- Copy `$DATA/imagenet/classnames.txt` to `$DATA/imagenet-sketch/`.

The directory structure should look like
```
imagenet-sketch/
|–– images/ # contains 1,000 folders whose names have the format of n*
|–– classnames.txt
```

### ImageNet-A
- Create a folder named `imagenet-adversarial/` under `$DATA`.
- Download the dataset from https://github.com/hendrycks/natural-adv-examples and extract it to `$DATA/imagenet-adversarial/`.
- Copy `$DATA/imagenet/classnames.txt` to `$DATA/imagenet-adversarial/`.

The directory structure should look like
```
imagenet-adversarial/
|–– imagenet-a/ # contains 200 folders whose names have the format of n*
|–– classnames.txt
```

### ImageNet-R
- Create a folder named `imagenet-rendition/` under `$DATA`.
- Download the dataset from https://github.com/hendrycks/imagenet-r and extract it to `$DATA/imagenet-rendition/`.
- Copy `$DATA/imagenet/classnames.txt` to `$DATA/imagenet-rendition/`.

The directory structure should look like
```
imagenet-rendition/
|–– imagenet-r/ # contains 200 folders whose names have the format of n*
|–– classnames.txt
```
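Since ImageNetV2, ImageNet-Sketch, ImageNet-A and ImageNet-R all reuse the ImageNet class names, the copy steps above can also be done in one go (a sketch; folder names as listed above):

```bash
for D in imagenetv2 imagenet-sketch imagenet-adversarial imagenet-rendition; do
    cp $DATA/imagenet/classnames.txt $DATA/$D/
done
```
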
149 docs/EVAL.md Normal file
@@ -0,0 +1,149 @@
# Evaluating and Reproducing PromptSRC Results

We provide bash scripts in the [scripts/](../scripts) directory for evaluating PromptSRC and the independent V-L prompting baseline using the provided pre-trained model checkpoints.

Make sure to update the `DATA` variable with the dataset path in the script file and run the commands from the main directory `PromptSRC/`.
Below we provide evaluation instructions for the PromptSRC pre-trained models. The same instructions apply for reproducing results of the baseline *independent V-L prompting* and MaPLe.

## PromptSRC

#### (1) Base-to-Novel class generalization setting
The base-to-novel PromptSRC configuration is provided in the config file at `configs/trainers/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx.yaml`. No hyper-parameters or other settings should be changed in the config file when evaluating the pre-trained models.

We show an example to reproduce results for imagenet. Follow the instructions below to reproduce results using our pre-trained model weights:
* Download the zipped folder containing base-to-novel generalization pre-trained weights for a single dataset from this [link](https://mbzuaiac-my.sharepoint.com/:f:/g/personal/syed_wasim_mbzuai_ac_ae/Em_3tkSj6T9AmhVjmzKTL3gBYNehhvfJl8ke2pU3U0nabA?e=9ecjQA). After unzipping, the directory should look like this:

```
imagenet
|–– base/
| |–– seed1/
| |–– seed2/
| |–– seed3/
```

Now use the evaluation script `scripts/promptsrc/reproduce_base2novel_setting.sh` and run the commands below to calculate the results over 3 seeds:
```bash
# Other possible dataset values include [caltech101, food101, dtd, ucf101, oxford_flowers, oxford_pets, fgvc_aircraft, stanford_cars, sun397, eurosat]

# evaluate on base and novel classes for SEED1
bash scripts/promptsrc/reproduce_base2novel_setting.sh imagenet 1 /path/to/imagenet/weights/folder
# evaluate on base and novel classes for SEED2
bash scripts/promptsrc/reproduce_base2novel_setting.sh imagenet 2 /path/to/imagenet/weights/folder
# evaluate on base and novel classes for SEED3
bash scripts/promptsrc/reproduce_base2novel_setting.sh imagenet 3 /path/to/imagenet/weights/folder
```

This should evaluate and save the log files in the `output/` directory. To obtain the averaged results, run:

```bash
# prints averaged results for base classes
python parse_test_res.py output/base2new/test_base/imagenet/shots_16/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx --test-log
# prints averaged results for novel classes
python parse_test_res.py output/base2new/test_new/imagenet/shots_16/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx --test-log
```
The same steps can be repeated for other individual datasets by providing the respective dataset name and checkpoint path.

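To sweep several datasets and seeds in one go, a loop like the following can be used (a sketch; the dataset list and the per-dataset weight folders are assumptions based on the commands above):

```bash
for DATASET in imagenet caltech101 food101; do
    for SEED in 1 2 3; do
        bash scripts/promptsrc/reproduce_base2novel_setting.sh ${DATASET} ${SEED} /path/to/${DATASET}/weights/folder
    done
done
```
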
#### (2) Cross-dataset and domain generalization setting
In the cross-dataset and domain generalization settings, we first train PromptSRC on ImageNet-1k in a few-shot manner with 16 shots for all 3 seeds, and then evaluate the trained model directly on the cross-evaluation datasets and out-of-distribution datasets.

We provide the instructions below to reproduce the cross-dataset and domain generalization results using our pre-trained ImageNet model weights for PromptSRC:
* Download the zipped folder containing pre-trained weights for imagenet from this [link](https://mbzuaiac-my.sharepoint.com/:f:/g/personal/syed_wasim_mbzuai_ac_ae/Ekr9qF0cSaVDr0X6OlP2JAEBG1xjlTMjHNLc28g1SjwW-w?e=AA5ABi). After unzipping, the directory should look like this:

```
imagenet
|–– seed1/
|–– seed2/
|–– seed3/
```

Now use the evaluation script `scripts/promptsrc/reproduce_xd.sh` and run the commands below to calculate the results for the food101 dataset over 3 seeds:
```bash
# Other possible dataset values for cross-dataset evaluation include [caltech101, food101, dtd, ucf101, oxford_flowers, oxford_pets, fgvc_aircraft, stanford_cars, sun397, eurosat]
# Possible dataset values for the domain generalization benchmark include [imagenetv2, imagenet_sketch, imagenet_a, imagenet_r]

# evaluate on given dataset for SEED1
bash scripts/promptsrc/reproduce_xd.sh food101 1 /path/to/imagenet/weights/folder
# evaluate on given dataset for SEED2
bash scripts/promptsrc/reproduce_xd.sh food101 2 /path/to/imagenet/weights/folder
# evaluate on given dataset for SEED3
bash scripts/promptsrc/reproduce_xd.sh food101 3 /path/to/imagenet/weights/folder
```

This should evaluate and save the log files in the `output/` directory. To obtain the results averaged over 3 seeds, run:

```bash
# prints averaged results for the food101 dataset
python parse_test_res.py output/evaluation/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx_cross_datasets_16shots/food101 --test-log
```

The same steps can be repeated for other individual datasets by providing the respective dataset name and checkpoint path.

#### (3) Few-shot setting
In this setting, PromptSRC is trained on all classes of individual datasets with different few-shot splits (K = 1, 2, 4, 8, 16). The PromptSRC config for the few-shot setting is available at: `configs/trainers/PromptSRC/vit_b16_c2_ep50_batch4_4+4ctx_few_shot.yaml`.
Follow the instructions below to reproduce PromptSRC few-shot results using our pre-trained models:

Now use the evaluation script `scripts/promptsrc/reproduce_few_shot.sh` and run the commands below to calculate the results for the food101 dataset over 3 seeds:
```bash
# reproduce_few_shot.sh calculates results for all 3 seeds for a given K
# Other possible dataset values include [caltech101, food101, dtd, ucf101, oxford_flowers, oxford_pets, fgvc_aircraft, stanford_cars, sun397, eurosat]

# evaluate on given dataset for K=1 shot
bash scripts/promptsrc/reproduce_few_shot.sh food101 1 /path/to/imagenet/weights/folder
# evaluate on given dataset for K=2 shots
bash scripts/promptsrc/reproduce_few_shot.sh food101 2 /path/to/imagenet/weights/folder
# evaluate on given dataset for K=4 shots
bash scripts/promptsrc/reproduce_few_shot.sh food101 4 /path/to/imagenet/weights/folder
# evaluate on given dataset for K=8 shots
bash scripts/promptsrc/reproduce_few_shot.sh food101 8 /path/to/imagenet/weights/folder
# evaluate on given dataset for K=16 shots
bash scripts/promptsrc/reproduce_few_shot.sh food101 16 /path/to/imagenet/weights/folder
```

This should evaluate and save the log files in the `output/` directory. To obtain the results averaged over 3 seeds for all shots, run:

```bash
# prints averaged results for the food101 dataset for K=1
python parse_test_res.py output/few_shot/food101/PromptSRC/vit_b16_c2_ep50_batch4_4+4ctx_few_shot_1shots/food101 --test-log
# prints averaged results for the food101 dataset for K=2
python parse_test_res.py output/few_shot/food101/PromptSRC/vit_b16_c2_ep50_batch4_4+4ctx_few_shot_2shots/food101 --test-log
# prints averaged results for the food101 dataset for K=4
python parse_test_res.py output/few_shot/food101/PromptSRC/vit_b16_c2_ep50_batch4_4+4ctx_few_shot_4shots/food101 --test-log
# prints averaged results for the food101 dataset for K=8
python parse_test_res.py output/few_shot/food101/PromptSRC/vit_b16_c2_ep50_batch4_4+4ctx_few_shot_8shots/food101 --test-log
# prints averaged results for the food101 dataset for K=16
python parse_test_res.py output/few_shot/food101/PromptSRC/vit_b16_c2_ep50_batch4_4+4ctx_few_shot_16shots/food101 --test-log
```

The same steps can be repeated for other individual datasets by providing the respective dataset name and checkpoint path.

<br>

## Training and Evaluating the independent V-L prompting baseline results

For the IVLP baseline method, we provide its corresponding default configs and evaluation scripts as follows.

```
configs
|–– datasets/
|–– trainers/
| |–– CoCoOp/
| |–– CoOp/
| |–– MaPLe/
| |–– IVLP/
| |–– PromptSRC/
```

```
scripts
|–– cocoop/
|–– coop/
|–– maple/
|–– independent-vlp/
|–– promptsrc/
```

Please use the corresponding config and script files and follow the same instructions as provided for PromptSRC in order to evaluate and reproduce results of the IVLP baseline approach. The pre-trained weights for the IVLP baseline are provided [at this link](https://mbzuaiac-my.sharepoint.com/:f:/g/personal/syed_wasim_mbzuai_ac_ae/EuIwh-yMh_JBqB2Y_o8Jl14BPDKDRHC0JBPE1BugIeZiSQ?e=oJnJwy).
This repository also supports using official [CoOp](CoOp.md) and [Co-CoOp](Co-CoOp.md) configs and models.
48 docs/INSTALL.md Normal file
@@ -0,0 +1,48 @@
# Installation

### Acknowledgement: This readme file for installation instructions is modified from [MaPLe's](https://github.com/muzairkhattak/multimodal-prompt-learning) official repository.

This codebase is tested on Ubuntu 20.04.2 LTS with Python 3.8. Follow the steps below to create the environment and install the dependencies.

* Set up a conda environment (recommended).
```bash
# Create a conda environment
conda create -y -n promptsrc python=3.8

# Activate the environment
conda activate promptsrc

# Install torch (requires version >= 1.8.1) and torchvision
# Please refer to https://pytorch.org/ if you need a different cuda version
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
```

* Install the dassl library.
```bash
# Instructions borrowed from https://github.com/KaiyangZhou/Dassl.pytorch#installation

# Clone this repo
git clone https://github.com/KaiyangZhou/Dassl.pytorch.git
cd Dassl.pytorch/

# Install dependencies
pip install -r requirements.txt

# Install this library (no need to re-build if the source code is modified)
python setup.py develop
cd ..
```

* Clone the PromptSRC code repository and install its requirements.
```bash
# Clone PromptSRC code base
git clone https://github.com/muzairkhattak/PromptSRC.git
cd PromptSRC/

# Install requirements
pip install -r requirements.txt

# Update setuptools package
pip install setuptools==59.5.0
```

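As an optional sanity check (not part of the original instructions), you can verify that PyTorch was installed with CUDA support:

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```
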
211 docs/MaPLe.md Normal file
@@ -0,0 +1,211 @@
# Training and Evaluation

We provide bash scripts in [scripts/](../scripts) for each prompting variant including MaPLe, vision, language and independent V-L prompting.
Make sure to configure the dataset paths in the environment variable `DATA` and run the commands from the main directory `multimodal-prompt-learning/`.
Below we provide training and evaluation instructions for MaPLe. The same instructions apply for all other variants including *Vision (VPT), Language and independent V-L prompting*.

### Training time and compute
We train MaPLe on each dataset with a batch size of 4 using a **single** NVIDIA A100 GPU.
Training MaPLe on ImageNet for 5 epochs takes 1 hour for a single seed, so results for 3 seeds take around 3 hours. All remaining 10 datasets together take around 4 hours (for all 3 seeds) on a single A100 GPU. To ease reproduction of MaPLe results, we have provided [training logs](https://drive.google.com/drive/folders/1EvuvgR8566bL0T7ucvAL3LFVwuUPMRas?usp=sharing) for all datasets.

## MaPLe

#### (1) Base-to-Novel class generalization setting
The default training settings are provided in the config file at `configs/trainers/MaPLe/vit_b16_c2_ep5_batch4_2ctx.yaml`. All hyper-parameters such as prompt length, prompt depth, etc., can be modified using this config file.

Below, we provide instructions to train MaPLe on ImageNet.

```bash
# Other possible dataset values include [caltech101, food101, dtd, ucf101, oxford_flowers, oxford_pets, fgvc_aircraft, stanford_cars, sun397, eurosat]

# seed=1
# trains and evaluates on base classes
bash scripts/maple/base2new_train_maple.sh imagenet 1
# evaluates on novel classes
bash scripts/maple/base2new_test_maple.sh imagenet 1

# seed=2
# trains and evaluates on base classes
bash scripts/maple/base2new_train_maple.sh imagenet 2
# evaluates on novel classes
bash scripts/maple/base2new_test_maple.sh imagenet 2

# seed=3
# trains and evaluates on base classes
bash scripts/maple/base2new_train_maple.sh imagenet 3
# evaluates on novel classes
bash scripts/maple/base2new_test_maple.sh imagenet 3
```

#### Averaging results over 3 seeds:
Once the above trainings and evaluations are completed, the `output/` directory should have the following structure:

```
output
|–– base2new/
| |–– test_new/
| | |–– imagenet/
| | | |–– shots_16/
| | | | |–– MaPLe/
| | | | | |–– vit_b16_c2_ep5_batch4_2ctx/
| | | | | | |–– seed1/
| | | | | | |–– seed2/
| | | | | | |–– seed3/
| |–– train_base/
| | |–– imagenet/
| | | |–– shots_16/
| | | | |–– MaPLe/
| | | | | |–– vit_b16_c2_ep5_batch4_2ctx/
| | | | | | |–– seed1/
| | | | | | |–– seed2/
| | | | | | |–– seed3/
```

Now use the script `parse_test_res.py` and run the commands below to calculate the averaged results:
```bash
# prints averaged results for base classes
python parse_test_res.py output/base2new/train_base/imagenet/shots_16/MaPLe/vit_b16_c2_ep5_batch4_2ctx
# prints averaged results for novel classes
python parse_test_res.py output/base2new/test_new/imagenet/shots_16/MaPLe/vit_b16_c2_ep5_batch4_2ctx --test-log
```

The above steps can be repeated for other individual datasets.

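For example, once several datasets have finished, the averaging can be wrapped in a loop (a sketch; the dataset list is illustrative and the paths follow the output layout shown above):

```bash
for DATASET in imagenet caltech101 oxford_pets; do
    # base classes
    python parse_test_res.py output/base2new/train_base/${DATASET}/shots_16/MaPLe/vit_b16_c2_ep5_batch4_2ctx
    # novel classes
    python parse_test_res.py output/base2new/test_new/${DATASET}/shots_16/MaPLe/vit_b16_c2_ep5_batch4_2ctx --test-log
done
```
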
#### Reproducing results using pre-trained weights for base-to-novel generalization setting

We show an example to reproduce results for imagenet. Follow the instructions below to reproduce results using our pre-trained model weights:
* Download the zipped folder containing pre-trained weights for a single dataset from this [link](https://drive.google.com/drive/folders/1-tB6BUDBzs9CXTOJ7p5hM4Svq1tL_mGz?usp=sharing). Additionally, we also provide the log files for both training and evaluation. After unzipping, the directory should look like this:

```
imagenet
|–– base/
| |–– seed1/
| |–– seed2/
| |–– seed3/
|–– novel/
| |–– seed1/
| |–– seed2/
| |–– seed3/
```

Now use the evaluation script `scripts/maple/reproduce_maple.sh` and run the commands below to calculate the averaged results:
```bash
# evaluate on base and novel classes for SEED1
bash scripts/maple/reproduce_maple.sh imagenet 1 /path/to/imagenet/weights/folder
# evaluate on base and novel classes for SEED2
bash scripts/maple/reproduce_maple.sh imagenet 2 /path/to/imagenet/weights/folder
# evaluate on base and novel classes for SEED3
bash scripts/maple/reproduce_maple.sh imagenet 3 /path/to/imagenet/weights/folder
```

This should evaluate and save the log files in the `output/` directory. To obtain the averaged results, run:

```bash
# prints averaged results for base classes
python parse_test_res.py output/base2new/train_base/imagenet/shots_16/MaPLe/vit_b16_c2_ep5_batch4_2ctx
# prints averaged results for novel classes
python parse_test_res.py output/base2new/test_new/imagenet/shots_16/MaPLe/vit_b16_c2_ep5_batch4_2ctx --test-log
```

#### (2) Cross-Dataset Transfer
We provide instructions to train MaPLe on ImageNet using all 1,000 classes and then evaluate it directly on new downstream datasets.
We provide the cross-dataset config for MaPLe at `configs/MaPLe/vit_b16_c2_ep5_batch4_2ctx_cross_datasets.yaml`.
* Firstly, train MaPLe on imagenet in a few-shot manner (for all 3 seeds).

```bash
# seed=1
bash scripts/maple/xd_train_maple.sh imagenet 1
# seed=2
bash scripts/maple/xd_train_maple.sh imagenet 2
# seed=3
bash scripts/maple/xd_train_maple.sh imagenet 3
```

* Now evaluate the ImageNet-trained model on downstream datasets.

```bash
for SEED in 1 2 3
do
    bash scripts/maple/xd_test_maple.sh caltech101 ${SEED}
    bash scripts/maple/xd_test_maple.sh oxford_pets ${SEED}
    bash scripts/maple/xd_test_maple.sh stanford_cars ${SEED}
done
```

#### (3) Domain Generalization
We use the ImageNet-trained MaPLe model for the domain generalization experiments. The steps are similar to the above cross-dataset experiments; however, the model is evaluated on ImageNet variants.
* Evaluate the ImageNet-trained model on variants of ImageNet (domain-shift datasets).

```bash
for SEED in 1 2 3
do
    bash scripts/maple/xd_test_maple.sh imagenetv2 ${SEED}
    bash scripts/maple/xd_test_maple.sh imagenet_sketch ${SEED}
    bash scripts/maple/xd_test_maple.sh imagenet_a ${SEED}
    bash scripts/maple/xd_test_maple.sh imagenet_r ${SEED}
done
```

You can obtain averaged results by using the script `parse_test_res.py` and following similar steps to those provided for the base-to-novel generalization experiments.
<br>

#### Reproducing official results for cross-dataset and domain generalization setting

We provide the instructions below to reproduce the domain generalization and cross-dataset results using our pre-trained ImageNet model weights for MaPLe:
* Download the zipped folder containing pre-trained weights for imagenet from this [link](https://drive.google.com/drive/folders/1bmhvmNZc13WJ5U71qt0t8k91wyuoemVF?usp=sharing). Additionally, we also provide the log files for both training and evaluation. After unzipping, the directory should look like this:

```
imagenet
|–– seed1/
|–– seed2/
|–– seed3/
```

Now use the evaluation script `scripts/maple/reproduce_maple_xd.sh` and run the commands below to calculate the averaged results:
```bash
# evaluate on given dataset for SEED1
bash scripts/maple/reproduce_maple_xd.sh food101 1 /path/to/imagenet/weights/folder
# evaluate on given dataset for SEED2
bash scripts/maple/reproduce_maple_xd.sh food101 2 /path/to/imagenet/weights/folder
# evaluate on given dataset for SEED3
bash scripts/maple/reproduce_maple_xd.sh food101 3 /path/to/imagenet/weights/folder
```

This should evaluate and save the log files in the `output/` directory. To obtain the averaged results, run:

```bash
# prints averaged results for the food101 dataset
python parse_test_res.py output/evaluation/MaPLe/vit_b16_c2_ep5_batch4_2ctx_cross_datasets_16shots/food101 --test-log
```

#### Training and Evaluating other variants

For the other variants, including vision, language and independent V-L prompting techniques, we provide their corresponding configs and scripts as follows.

```
configs
|–– datasets/
|–– trainers/
| |–– CoCoOp/
| |–– CoOp/
| |–– MaPLe/
| |–– IVLP/
| |–– VPT/
```

```
scripts
|–– cocoop/
|–– coop/
|–– language-prompting/
|–– maple/
|–– independent-vlp/
```

Please use the corresponding config and script files and follow the same instructions as provided for MaPLe in order to train and evaluate the other variants. The same instructions can be followed to reproduce results of the other variants using the provided pre-trained weights.
169 docs/TRAIN.md Normal file
@@ -0,0 +1,169 @@
# PromptSRC Training

We provide bash scripts in [scripts/](../scripts) for training PromptSRC and the independent V-L prompting baseline.
Make sure to update the `DATA` variable with the dataset path in the script file and run the commands from the main directory `PromptSRC/`.
Below we provide training and testing instructions for PromptSRC. The same instructions apply to the baseline *independent V-L prompting* approach, MaPLe, CoOp and CoCoOp.

### Training time and compute
We train PromptSRC on each dataset with a batch size of 4 using a **single** NVIDIA A100 GPU.
Training PromptSRC on ImageNet for 20 epochs takes around 6 hours for a single seed, so results for 3 seeds take around 18 hours. All remaining 10 datasets together take around 8 hours (for all 3 seeds) on a single A100 GPU.

## PromptSRC

#### (1) Base-to-Novel class generalization setting
The base-to-novel PromptSRC configuration is provided in the config file at `configs/trainers/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx.yaml`. All hyper-parameters, such as the GPA mean and STD, the SCL loss weight coefficients, prompt length and prompt depth, can be modified using this config file.

Run the commands below to train PromptSRC on ImageNet.

```bash
# Other possible dataset values include [caltech101, food101, dtd, ucf101, oxford_flowers, oxford_pets, fgvc_aircraft, stanford_cars, sun397, eurosat]

# seed=1
# trains and evaluates on base classes
bash scripts/promptsrc/base2new_train.sh imagenet 1
# evaluates on novel classes
bash scripts/promptsrc/base2new_test.sh imagenet 1

# seed=2
# trains and evaluates on base classes
bash scripts/promptsrc/base2new_train.sh imagenet 2
# evaluates on novel classes
bash scripts/promptsrc/base2new_test.sh imagenet 2

# seed=3
# trains and evaluates on base classes
bash scripts/promptsrc/base2new_train.sh imagenet 3
# evaluates on novel classes
bash scripts/promptsrc/base2new_test.sh imagenet 3
```

#### Averaging results over 3 seeds:
Once the above trainings and evaluations are completed, the `output/` directory should have the following structure:

```
output
|–– base2new/
| |–– test_new/
| | |–– imagenet/
| | | |–– shots_16/
| | | | |–– PromptSRC/
| | | | | |–– vit_b16_c2_ep20_batch4_4+4ctx/
| | | | | | |–– seed1/
| | | | | | |–– seed2/
| | | | | | |–– seed3/
| |–– train_base/
| | |–– imagenet/
| | | |–– shots_16/
| | | | |–– PromptSRC/
| | | | | |–– vit_b16_c2_ep20_batch4_4+4ctx/
| | | | | | |–– seed1/
| | | | | | |–– seed2/
| | | | | | |–– seed3/
```

Now use the script `parse_test_res.py` and run the commands below to calculate the averaged results:
```bash
# prints averaged results for base classes
python parse_test_res.py output/base2new/train_base/imagenet/shots_16/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx
# prints averaged results for novel classes
python parse_test_res.py output/base2new/test_new/imagenet/shots_16/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx --test-log
```

The above steps can be repeated for other individual datasets.

#### (2) Cross-Dataset Transfer setting
We provide instructions to train PromptSRC on ImageNet using all 1000 classes with 16 shots and then evaluate it directly on new downstream datasets.
The corresponding cross-dataset config for PromptSRC is available at: `configs/trainers/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx_cross_datasets.yaml`. All PromptSRC hyper-parameters can be modified in this config file.
* Firstly, train PromptSRC on imagenet in a few-shot manner (for all 3 seeds).

```bash
# seed=1
bash scripts/promptsrc/xd_train.sh imagenet 1
# seed=2
bash scripts/promptsrc/xd_train.sh imagenet 2
# seed=3
bash scripts/promptsrc/xd_train.sh imagenet 3
```

* Now directly evaluate the ImageNet-trained model on downstream cross-evaluation datasets.

```bash
# Other possible dataset values include [imagenet, food101, dtd, ucf101, oxford_flowers, fgvc_aircraft, sun397, eurosat]

for SEED in 1 2 3
do
    bash scripts/promptsrc/xd_test.sh caltech101 ${SEED}
    bash scripts/promptsrc/xd_test.sh oxford_pets ${SEED}
    bash scripts/promptsrc/xd_test.sh stanford_cars ${SEED}
done
```
You can obtain averaged results by using the script `parse_test_res.py` and following similar steps to those provided for the base-to-novel generalization experiments.

#### (3) Domain Generalization setting
We use the same ImageNet-trained PromptSRC model for the domain generalization experiments. The steps are similar to the above cross-dataset experiments; however, the trained model is now evaluated on ImageNet variants.
The corresponding domain generalization config for PromptSRC is available at: `configs/trainers/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx_cross_datasets.yaml`.
* Evaluate the ImageNet-trained model on different variants of ImageNet (datasets with domain shifts).

```bash
for SEED in 1 2 3
do
    bash scripts/promptsrc/xd_test.sh imagenetv2 ${SEED}
    bash scripts/promptsrc/xd_test.sh imagenet_sketch ${SEED}
    bash scripts/promptsrc/xd_test.sh imagenet_a ${SEED}
    bash scripts/promptsrc/xd_test.sh imagenet_r ${SEED}
done
```

You can obtain averaged results by using the script `parse_test_res.py` and following similar steps to those provided for the base-to-novel generalization experiments.

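For instance, the averaging step might be sketched as follows; note that the output sub-directory used here is an assumption (it depends on how `xd_test.sh` names its runs and mirrors the `output/evaluation/...` layout used in [EVAL.md](EVAL.md)), so adapt the path to match your logs:

```bash
for DATASET in imagenetv2 imagenet_sketch imagenet_a imagenet_r; do
    python parse_test_res.py output/evaluation/PromptSRC/vit_b16_c2_ep20_batch4_4+4ctx_cross_datasets_16shots/${DATASET} --test-log
done
```
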
#### (4) Few-shot setting
In this setting, PromptSRC is trained on all classes of individual datasets with different few-shot splits (K = 1, 2, 4, 8, 16). The corresponding few-shot setting config for PromptSRC is available at: `configs/trainers/PromptSRC/vit_b16_c2_ep50_batch4_4+4ctx_few_shot.yaml`.

Now use the training script `scripts/promptsrc/few_shot.sh` and run the commands below to calculate the results for the imagenet dataset for all shots over 3 seeds:

```bash
# Other possible dataset values include [caltech101, food101, dtd, ucf101, oxford_flowers, oxford_pets, fgvc_aircraft, stanford_cars, sun397, eurosat]

# train and test on given dataset for K=1 shot
bash scripts/promptsrc/few_shot.sh imagenet 1
# train and test on given dataset for K=2 shots
bash scripts/promptsrc/few_shot.sh imagenet 2
# train and test on given dataset for K=4 shots
bash scripts/promptsrc/few_shot.sh imagenet 4
# train and test on given dataset for K=8 shots
bash scripts/promptsrc/few_shot.sh imagenet 8
# train and test on given dataset for K=16 shots
bash scripts/promptsrc/few_shot.sh imagenet 16
```

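Equivalently, the five commands above can be collapsed into a small loop (a sketch using the same script and arguments):

```bash
for SHOTS in 1 2 4 8 16; do
    bash scripts/promptsrc/few_shot.sh imagenet ${SHOTS}
done
```
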
You can obtain averaged results by using the script `parse_test_res.py` and following similar steps to those provided for the base-to-novel generalization experiments.
<br>

#### Training and testing the independent V-L prompting baseline approach

For training the independent V-L prompting baseline approach, we provide its corresponding configs and scripts as follows.

```
configs
|–– datasets/
|–– trainers/
| |–– CoCoOp/
| |–– CoOp/
| |–– IVLP/
| |–– PromptSRC/
```

```
scripts
|–– cocoop/
|–– coop/
|–– promptsrc/
|–– independent-vlp/
```

Please use the corresponding config and script files and follow the same instructions as provided for PromptSRC for training and testing.
This repository also supports using the official [MaPLe](MaPLe.md), [CoOp](CoOp.md) and [Co-CoOp](Co-CoOp.md) configs and models.
BIN docs/main_figure.png Normal file
Binary file not shown. | After Width: | Height: | Size: 2.9 MiB |