
This repo contains a fully fine-tuned LLaMA-7b, trained on the 🧬 protein-oriented instructions from the 🧪 Mol-Instructions dataset.

Instructions for running it can be found at https://github.com/zjunlp/Mol-Instructions.

Please refer to our paper for more details.


🧬 Tasks

📝 Demo

As illustrated in our repository, we provide an example of how to run generation.

For the model fine-tuned on protein-oriented instructions, you can recover the weights we trained with the following command.

Please download llama-7b-hf to obtain the pre-trained weights of LLaMA-7B, and set $BASE_MODEL_PATH (passed via --path_raw) to the location where those weights are saved.

Then replace $DIFF_WEIGHT_PATH with the path to our provided diff weights, and replace $RECOVER_WEIGHT_PATH with the path where the recovered weights should be saved. If the directory of recovered weights lacks required files (e.g., tokenizer configuration files), copy them from $DIFF_WEIGHT_PATH.

python weight_diff.py recover \
 --path_raw $BASE_MODEL_PATH \
 --path_diff $DIFF_WEIGHT_PATH \
 --path_tuned $RECOVER_WEIGHT_PATH
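
If the recovered directory turns out to be missing tokenizer files, the copy step mentioned above could be scripted roughly as follows. This is a minimal sketch, not part of the Mol-Instructions repository: the helper name `copy_missing_aux_files` and the filename prefixes it matches are assumptions chosen for illustration, so adjust them to whatever files your diff directory actually contains.

```python
import shutil
from pathlib import Path

# Assumed filename prefixes for tokenizer-related files (hypothetical;
# check the actual contents of your diff weights directory).
AUX_FILE_PREFIXES = ("tokenizer", "special_tokens_map")


def copy_missing_aux_files(diff_dir, recovered_dir, prefixes=AUX_FILE_PREFIXES):
    """Copy auxiliary files from diff_dir into recovered_dir if they are absent.

    Files already present in recovered_dir are never overwritten.
    Returns the names of the files that were copied.
    """
    diff_dir, recovered_dir = Path(diff_dir), Path(recovered_dir)
    copied = []
    for src in sorted(diff_dir.iterdir()):
        # Skip subdirectories and anything that is not an auxiliary file.
        if not src.is_file() or not src.name.startswith(tuple(prefixes)):
            continue
        dst = recovered_dir / src.name
        if not dst.exists():
            shutil.copy2(src, dst)
            copied.append(src.name)
    return copied
```

Calling `copy_missing_aux_files(diff_path, recovered_path)` with the directories used for $DIFF_WEIGHT_PATH and $RECOVER_WEIGHT_PATH returns the list of copied filenames, which makes it easy to see whether anything was missing.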

After that, you can execute the following command to generate outputs with the fine-tuned LLaMA model.

python generate.py \
 --CLI True \
 --protein True \
 --base_model $RECOVER_WEIGHT_PATH
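
For scripted use, the same invocation can be assembled from Python. The sketch below only wraps the command shown above; the function names `build_generate_command` and `run_generate` are hypothetical, and it assumes `generate.py` from the Mol-Instructions repository is in the current working directory.

```python
import subprocess
import sys


def build_generate_command(recovered_weights, script="generate.py"):
    """Build the argument list for the generation command shown above.

    Only the flags from the documented command are used.
    """
    return [
        sys.executable, script,
        "--CLI", "True",
        "--protein", "True",
        "--base_model", str(recovered_weights),
    ]


def run_generate(recovered_weights):
    """Run generate.py (assumed to be in the current directory)."""
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(build_generate_command(recovered_weights), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the weights path contains spaces.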

🚨 Limitations

The current state of the model, obtained via instruction tuning, is a preliminary demonstration. Its capacity to handle real-world, production-grade tasks remains limited.

📚 References

If you use our repository, please cite the following related paper:

@inproceedings{fang2023mol,
 author = {Yin Fang and
 Xiaozhuan Liang and
 Ningyu Zhang and
 Kangwei Liu and
 Rui Huang and
 Zhuo Chen and
 Xiaohui Fan and
 Huajun Chen},
 title = {Mol-Instructions: {A} Large-Scale Biomolecular Instruction Dataset
 for Large Language Models},
 booktitle = {{ICLR}},
 publisher = {OpenReview.net},
 year = {2024},
 url = {https://openreview.net/pdf?id=Tlsdsb6l9n}
}

🫱🏻‍🫲 Acknowledgements

We appreciate LLaMA, Huggingface Transformers Llama, Alpaca, Alpaca-LoRA, Chatbot Service and many other related works for their open-source contributions.

Paper: arXiv:2306.08018, published Jun 13, 2023

URL: https://huggingface.co/zjunlp/llama-molinst-protein-7b