We propose a novel approach that applies Parameter-Efficient Tuning to generAl-purpose vision-Language models, namely PETAL. PETAL enhances the semantic depth of instructions in two innovative ways: 1) by introducing an adaptive instruction mixture-of-experts (MoEs), and 2) by fortifying the score-based linkage between parameter-efficient tuning and mutual information.
The overall architecture of PETAL:
Case study and visualization output:
- First, clone the repository.
- Install dependencies.
First, create a new environment using conda (with Python >= 3.7). Then install PyTorch and the other dependencies as follows.
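The environment creation step above can be sketched as follows; the environment name `petal` and Python 3.8 are assumptions, not fixed by the repo:

```shell
# Create and activate a fresh conda environment.
# "petal" is a placeholder name; any Python >= 3.7 works.
conda create -n petal python=3.8 -y
conda activate petal
```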
Install PyTorch (replace "cu113" with the appropriate CUDA version; for example, CUDA 11.1 uses "cu111"):
pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/torch_stable.html
Install the remaining dependencies:
pip install -r requirements.txt
cd PETAL
bash run_scripts/blip2/train/train_aurora_mixture.sh
bash run_scripts/blip2/eval/eval_aurora_mixture.sh