This repo contains a fully fine-tuned LLaMA-7b, trained on the biomolecule text instructions from the Mol-Instructions dataset.
Instructions for running it can be found at https://github.com/zjunlp/Mol-Instructions.
Please refer to our paper for more details.
Tasks
Demo
Our repository provides an example script for running generation:
>> python generate.py \
    --CLI True \
    --protein False \
    --load_8bit \
    --base_model $BASE_MODEL_PATH \
    --lora_weights $FINETUNED_MODEL_PATH
Please download llama-7b-hf to obtain the pre-trained weights of LLaMA-7B, then set --base_model to the location where the model weights are saved.
For the model fine-tuned on biomolecular text instructions, set $FINETUNED_MODEL_PATH to 'zjunlp/llama-molinst-biotext-7b'.
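The setup described above can be sketched as a small shell script. This is only an illustration: the local path `./llama-7b-hf` is a hypothetical placeholder for wherever the base weights were downloaded, and the script assumes it is run from the directory of the Mol-Instructions repository that contains `generate.py`. The flags are copied from the demo command and are not otherwise verified here.

```shell
# Hypothetical local directory holding the downloaded llama-7b-hf weights;
# override by exporting BASE_MODEL_PATH before running this script.
BASE_MODEL_PATH="${BASE_MODEL_PATH:-./llama-7b-hf}"

# Model identifier given above for the biomolecular-text model.
FINETUNED_MODEL_PATH="zjunlp/llama-molinst-biotext-7b"

# Assemble the demo command as positional parameters so each argument
# stays correctly quoted.
set -- python generate.py \
    --CLI True \
    --protein False \
    --load_8bit \
    --base_model "$BASE_MODEL_PATH" \
    --lora_weights "$FINETUNED_MODEL_PATH"

# Print the command for inspection, then run it only if generate.py
# is present in the current directory.
echo "$@"
if [ -f generate.py ]; then
    "$@"
fi
```

Quoting the two variables keeps the command intact if the base-model path contains spaces, and printing the command first makes it easy to check the resolved paths before the model is loaded.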
Limitations
The current state of the model, obtained via instruction tuning, is a preliminary demonstration. Its capacity to handle real-world, production-grade tasks remains limited.
References
If you use our repository, please cite the following related paper:
@inproceedings{fang2023mol,
author = {Yin Fang and
Xiaozhuan Liang and
Ningyu Zhang and
Kangwei Liu and
Rui Huang and
Zhuo Chen and
Xiaohui Fan and
Huajun Chen},
title = {Mol-Instructions: {A} Large-Scale Biomolecular Instruction Dataset
for Large Language Models},
booktitle = {{ICLR}},
publisher = {OpenReview.net},
year = {2024},
url = {https://openreview.net/pdf?id=Tlsdsb6l9n}
}
Acknowledgements
We thank LLaMA, Hugging Face Transformers Llama, Alpaca, Alpaca-LoRA, Chatbot Service, and many other related works for their open-source contributions.
