Any plan to release fine-tuning scripts?

by Mengyao00 - opened Jan 5, 2024

Discussion

Mengyao00

Jan 5, 2024

Great work! Are you going to open-source the fine-tuning scripts?

intfloat

Owner Jan 5, 2024

We will not release an "official" fine-tuning script. Instead, we recommend using an off-the-shelf embedding fine-tuning library such as Tevatron. You only need to change the tokenization and pooling parts.
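
For readers wondering what the tokenization change might look like, here is a minimal sketch. It assumes the change is to append an EOS token to every input after truncation, so that the last real token of each sequence is always EOS. The helper name `append_eos_and_pad` and the token ids are hypothetical, and plain Python lists stand in for a real tokenizer's output; this is not Tevatron's code.

```python
from typing import Dict, List


def append_eos_and_pad(
    token_ids: List[List[int]],
    eos_id: int,
    pad_id: int,
    max_length: int,
) -> Dict[str, List[List[int]]]:
    """Truncate to max_length - 1, append EOS, then right-pad the batch."""
    # Reserve one position so the EOS token is never truncated away.
    with_eos = [ids[: max_length - 1] + [eos_id] for ids in token_ids]
    width = max(len(ids) for ids in with_eos)

    input_ids, attention_mask = [], []
    for ids in with_eos:
        n_pad = width - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Hypothetical token ids: 2 is EOS, 0 is padding.
batch = append_eos_and_pad([[5, 6, 7, 8, 9], [5, 6]], eos_id=2, pad_id=0, max_length=4)
print(batch["input_ids"])
print(batch["attention_mask"])
```

With a real tokenizer, the same idea would amount to truncating to one token less than the maximum length, appending the tokenizer's EOS id, and only then padding.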

serialcoder

Jan 6, 2024

@intfloat It looks like the pooling part in Tevatron is the same as in the paper (it uses the hidden state of the EOS token). So only the tokenization needs to be changed, right?
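
To make "hidden state of the EOS token" concrete, here is a rough sketch of last-token pooling. It assumes EOS is the last non-padding token of each sequence and uses NumPy arrays in place of framework tensors. The function name `last_token_pool` and the toy shapes are illustrative, and this is not taken from Tevatron.

```python
import numpy as np


def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Return the hidden state of the last non-padding token of each sequence.

    hidden_states: (batch, seq_len, dim); attention_mask: (batch, seq_len) of 0/1.
    """
    batch_size = attention_mask.shape[0]
    # With left padding, the final position is a real token in every row.
    if attention_mask[:, -1].sum() == batch_size:
        return hidden_states[:, -1]
    # With right padding, the last real token sits at index (length - 1).
    last_index = attention_mask.sum(axis=1) - 1
    return hidden_states[np.arange(batch_size), last_index]


# Toy batch: 2 sequences, 3 positions, hidden size 2; the second row has one pad.
hidden = np.arange(2 * 3 * 2, dtype=np.float32).reshape(2, 3, 2)
mask = np.array([[1, 1, 1], [1, 1, 0]])
pooled = last_token_pool(hidden, mask)
print(pooled.shape)
```

The pooled vector is only the EOS hidden state if the tokenization step actually appends EOS, which is why the two changes go together.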

kamalkraj

Jan 6, 2024

@Mengyao00 @serialcoder
Here is an example of fine-tuning the model with Hugging Face PEFT and DeepSpeed:
https://github.com/kamalkraj/e5-mistral-7b-instruct/

@intfloat

ijkim

Feb 5, 2024

@serialcoder
In which files can I find the pooling and tokenization parts of Tevatron?
I can't find them.
