An extensible, convenient, and efficient toolbox for finetuning large machine learning models, designed to be user-friendly, speedy and reliable, and accessible to the entire community.
Our package has been tested on Linux (Ubuntu 20.04). Other platforms (macOS, Windows) are not fully tested, and you may encounter unexpected errors.
git clone -b v0.0.9 https://github.com/OptimalScale/LMFlow.git
cd LMFlow
git checkout data4elm
conda create -n lmflow python=3.9 -y
conda activate lmflow
conda install mpi4py
pip install -e .
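As a quick sanity check that the installation succeeded, you can try importing the package (a minimal check that only verifies lmflow is importable):
python -c "import lmflow"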
Tip
We use Wandb to track and visualize the training process by default. Before running the training scripts, users may need to log in to Wandb using the command:
wandb login
For detailed instructions, refer to the Wandb Quickstart Guide. Step 1 (registration) and Step 2 (login using your Wandb API key) should be sufficient to set up your environment.
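If you prefer a non-interactive setup (for example, on a remote machine), wandb also reads the API key from the WANDB_API_KEY environment variable; a minimal sketch, with the key value as a placeholder:
# Non-interactive login: wandb picks up the API key from the environment.
export WANDB_API_KEY=<your-wandb-api-key>
wandb login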
Disabling wandb
One can disable wandb by either:
- Adding the following environment variable before running the training command:
export WANDB_MODE=disabled
- Or specifying which integrations to report results and logs to. In the training script, add:
--report_to none \
As a sanity check, we provide a small dataset for you to test the finetuning process.
To process your own dataset, please refer to our doc.
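As a rough illustration of the expected input, LMFlow-style datasets are JSON files with a type field and a list of instances; the sketch below creates a tiny text_only dataset (the schema here is an assumption for illustration, so double-check it against the doc above):
# Create a minimal LMFlow-style text_only dataset for testing.
mkdir -p data/my_dataset
cat > data/my_dataset/train.json <<'EOF'
{
  "type": "text_only",
  "instances": [
    { "text": "The quick brown fox jumps over the lazy dog." },
    { "text": "Edge language models benefit from compact, high-quality data." }
  ]
}
EOF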
DoRA (Weight-Decomposed Low-Rank Adaptation) is a parameter-efficient finetuning algorithm that trains only a small set of adapter weights, making it far more efficient than full finetuning.
bash train.sh
Note: Please double-check that you have updated the training script with the correct arguments for your use case.
Note: To eliminate hyperparameters as a confounding factor, you must keep num_train_epochs at 1, learning_rate at 1e-5, and lora_r at 16.
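A quick way to confirm the training script still uses the required values is to grep for them (this assumes the flags appear literally in train.sh):
# Verify the fixed hyperparameters before launching training.
grep -E "num_train_epochs|learning_rate|lora_r" train.sh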
Tip
Merge DoRA Weights
Merge the DoRA weights and the base model into one model using:
bash ./scripts/run_merge_dora.sh \
--model_name_or_path Qwen/Qwen1.5-1.8B \
--lora_model_path output_models/dora \
--output_model_path output_models/dora_merged
Aligned with the objective of this challenge, we propose a new benchmark for evaluating edge LMs, named the Edge Language Model Benchmark (ELMB).
It includes the following tasks:
- Roleplay: Enhancing performance in interactive digital environments.
- Reasoning: Improving complex problem-solving for downstream applications like robotics.
- Function Calling: Optimizing models for mobile device interactions.
- Retrieval-Augmented Generation (RAG): Boosting capabilities in retrieval-augmented applications.
To evaluate your model on the ELMB, you can use the following commands:
cd LMFlow/lm-evaluation-harness
pip install -e .
lm_eval --model hf \
--model_args pretrained=[YOUR_MODEL_PATH],trust_remote_code=True,cache_dir=~/.cache \
--tasks elmb_roleplay,elmb_reasoning,elmb_functioncalling,elmb_chatrag \
--device cuda:0 \
--batch_size 1 \
--log_samples \
--output_path ./eval_results/test_elmb
Note that to evaluate your model, you can provide either a local model path or a Hugging Face model ID as [YOUR_MODEL_PATH].
If using a local path, first merge your DoRA weights into the base model, then pass the path to the merged model's folder as [YOUR_MODEL_PATH].
If using Hugging Face, [YOUR_MODEL_PATH] is the Hugging Face model ID (username/model_name) of your uploaded model obtained after merging the DoRA weights. You may reference example_upload_peft_model.py for a starter script on how to upload your DoRA-finetuned model.
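If you prefer the command line over the starter script, one alternative sketch is to push the merged model folder with the huggingface_hub CLI (the repository name is a placeholder; this assumes huggingface_hub is installed and you have a write-access token):
# Upload the merged model folder to the Hugging Face Hub.
huggingface-cli login
huggingface-cli upload <username>/<model_name> output_models/dora_merged .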
Below are some commonly asked questions from our Discord.
Q: Where can I go if I have questions about the challenge?
A: The main place to ask questions is the challenge-questions text channel in our Discord.
Q: How many tokens can I use to finetune my model?
A: As listed on the challenge website, you may use up to 10B tokens in total.
Q: How can I resume from a checkpoint?
A: Users can resume from checkpoints by adding the argument --resume_from_checkpoint to the training script, with the path to the latest checkpoint. For example, --resume_from_checkpoint [model-dir]/checkpoint-[checkpoint-index].
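If you are unsure which checkpoint is the latest, a small helper can list them (using output_models/dora from the merge example above as an illustrative output directory; adjust it to your own run):
# Print the highest-numbered checkpoint directory.
ls -d output_models/dora/checkpoint-* | sort -V | tail -n 1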
Q: How can I test using a validation split?
A: Users can view validation loss during training by adding the arguments --validation_split_percentage, --eval_strategy, and --eval_steps. For instance:
--validation_split_percentage 5 \
--eval_strategy steps \
--eval_steps 20
will show the validation loss every 20 steps using a validation split of 5 percent.
Q: How do I know if I am registered?
A: You will receive a confirmation email titled "PLEASE READ: Data Filtering Challenge - Confirmation of Registration" from an Outlook account named "data4elm". The names of the registered teams will also be listed on our Discord periodically.
Q: Where can I find the dataset?
A: You can find the starter dataset here.
Q: The starter dataset is provided as token IDs. How can I convert it into raw text?
A: You can use the unofficial detokenized dataset, or you may detokenize the dataset yourself using the script detokenize_climblab.py found here.
If you need any help, please submit a GitHub issue.
The code included in this project is licensed under the Apache 2.0 license. If you wish to use the code and models included in this project for commercial purposes, please sign this document to obtain authorization.
If you find this repository useful, please consider giving ⭐ and citing our paper:
@article{diao2023lmflow,
title={LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models},
author={Diao, Shizhe and Pan, Rui and Dong, Hanze and Shum, Ka Shun and Zhang, Jipeng and Xiong, Wei and Zhang, Tong},
journal={arXiv preprint arXiv:2306.12420},
year={2023}
}
@inproceedings{liu2024dora,
title={DoRA: Weight-Decomposed Low-Rank Adaptation},
author={Liu, Shih-Yang and Wang, Chien-Yi and Yin, Hongxu and Molchanov, Pavlo and Wang, Yu-Chiang Frank and Cheng, Kwang-Ting and Chen, Min-Hung},
booktitle={Forty-first International Conference on Machine Learning},
year={2024}
}