Skip to content

Mergecraft is a simple library to streamline model merging operations, with seamless integration with HuggingFace🤗

Notifications You must be signed in to change notification settings

Mamiglia/mergecraft

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Mergecraft

Mergecraft is a Python library designed to simplify the process of merging machine learning models. Whether you're combining weights from different models or experimenting with novel merging strategies, Mergecraft provides the tools and decorators to make this process seamless and efficient.

Features

  • Easy-to-use interface for model merging
  • Support for various merging paradigms (TIES, SLERP, DARE, etc.)
  • Seamless integration with Hugging Face models
  • Extensible architecture for implementing custom merging methods
  • Efficient handling of model loading and conversion

Quick Start

Running an Evaluation

You can evaluate your merged model on tasks like the RTE task from the GLUE benchmark:

models = [
    'google-bert/bert-base-uncased',
    'textattack/bert-base-uncased-RTE',
    'yoshitomo-matsubara/bert-base-uncased-rte',
    'Ruizhou/bert-base-uncased-finetuned-rte',
    'howey/bert-base-uncased-rte',
    'anirudh21/bert-base-uncased-finetuned-rte'
]

# List of layers where DARE should be skipped
# because these layers are randomly initialized during finetuning
rnd_layers = ['classifier.weight', 'classifier.bias']

merged = mergecraft.dare(models, passthrough=rnd_layers, base_index=0)

Example: Median Merger

Below is an example of how to implement a median-based model merging strategy using Mergecraft:

import torch
from mergecraft import layer_merge

@layer_merge
def median_merge(models: List[torch.Tensor], **kwargs) -> torch.Tensor:
    return torch.median(torch.stack(models), dim=0).values

Contributing

Contributions are welcome! If you’d like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcomed.

About

Mergecraft is a simple library to streamline model merging operations, with seamless integration with HuggingFace🤗

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published