This repo contains a fully fine-tuned LLaMA-7B model, trained on the 🧬 protein-oriented instructions from the 🧪 Mol-Instructions dataset.
Instructions for running it can be found at https://github.com/zjunlp/Mol-Instructions.
Please refer to our paper for more details.
🧬 Tasks
📝 Demo
Our repository provides an example of how to perform generation.
For the model fine-tuned on protein-oriented instructions, you can recover the weights we trained with the following command.
Please download llama-7b-hf to obtain the pre-trained weights of LLaMA-7B, and set $BASE_MODEL_PATH to the location where those weights are saved.
Then replace $DIFF_WEIGHT_PATH with the path of our provided diff weights, and replace $RECOVER_WEIGHT_PATH with the path where the recovered weights should be saved. If the directory of recovered weights lacks required files (e.g., tokenizer configuration files), you can copy them from $DIFF_WEIGHT_PATH.
python weight_diff.py recover \
--path_raw $BASE_MODEL_PATH \
--path_diff $DIFF_WEIGHT_PATH \
--path_tuned $RECOVER_WEIGHT_PATH
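If the recovered directory turns out to be missing tokenizer files, they can be copied by hand from the diff directory. As a convenience, here is a minimal Python sketch of that step. It is not part of the Mol-Instructions repository: the helper name `copy_missing_files` and the list of file names are assumptions based on typical Hugging Face LLaMA checkpoints, so adjust them to whatever files your $DIFF_WEIGHT_PATH actually contains.

```python
import shutil
from pathlib import Path

# Assumed file names (typical Hugging Face tokenizer files).
# The exact set shipped with the diff weights may differ.
CANDIDATE_FILES = (
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
)


def copy_missing_files(diff_dir, recovered_dir, names=CANDIDATE_FILES):
    """Copy files that exist in diff_dir but are absent from recovered_dir.

    Files already present in recovered_dir are never overwritten.
    Returns the list of file names that were copied.
    """
    diff_dir, recovered_dir = Path(diff_dir), Path(recovered_dir)
    copied = []
    for name in names:
        src, dst = diff_dir / name, recovered_dir / name
        if src.is_file() and not dst.exists():
            shutil.copy2(src, dst)  # copy2 also preserves file metadata
            copied.append(name)
    return copied
```

For example, calling `copy_missing_files(os.environ["DIFF_WEIGHT_PATH"], os.environ["RECOVER_WEIGHT_PATH"])` after the recover command finishes would fill in any of the listed files that are absent, leaving existing ones untouched.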
After that, you can execute the following command to generate outputs with the fine-tuned LLaMA model.
python generate.py \
--CLI True \
--protein True \
--base_model $RECOVER_WEIGHT_PATH
🚨 Limitations
The current state of the model, obtained via instruction tuning, is a preliminary demonstration. Its capacity to handle real-world, production-grade tasks remains limited.
📚 References
If you use our repository, please cite the following related paper:
@inproceedings{fang2023mol,
author = {Yin Fang and
Xiaozhuan Liang and
Ningyu Zhang and
Kangwei Liu and
Rui Huang and
Zhuo Chen and
Xiaohui Fan and
Huajun Chen},
title = {Mol-Instructions: {A} Large-Scale Biomolecular Instruction Dataset
for Large Language Models},
booktitle = {{ICLR}},
publisher = {OpenReview.net},
year = {2024},
url = {https://openreview.net/pdf?id=Tlsdsb6l9n}
}
🫱🏻🫲 Acknowledgements
We appreciate LLaMA, Huggingface Transformers Llama, Alpaca, Alpaca-LoRA, Chatbot Service and many other related works for their open-source contributions.
