rinna-neox-small-ja-it-adapter

LoRA adapter that instruction-tunes rinna/japanese-gpt-neox-small for Japanese instruction following.

Task: instruction-following / text generation
Language: Japanese
License: CC BY-SA 4.0
Base model: rinna/japanese-gpt-neox-small
Release type: LoRA adapter weights (base model required)

Usage

Load the base model and apply the LoRA adapter:

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_id = "rinna/japanese-gpt-neox-small"
adapter_id = "takehika/rinna-neox-small-ja-it-adapter"
tokenizer = AutoTokenizer.from_pretrained(base_id, use_fast=False)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id).eval()

def build_prompt(instruction, input_text=""):
    if input_text:
        return (
            "### Instruction:\n"
            f"{instruction}\n\n"
            "### Input:\n"
            f"{input_text}\n\n"
            "### Response:\n"
        )
    else:
        return (
            "### Instruction:\n"
            f"{instruction}\n\n"
            "### Response:\n"
        )


instruction = "毎日健康的に過ごすコツを5つ挙げてください。"
prompt = build_prompt(instruction)

inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

gen_ids = output_ids[0][inputs["input_ids"].shape[1]:]
generated = tokenizer.decode(gen_ids, skip_special_tokens=True)
print(generated)

Data

Datasets:
- llm-jp/llm-jp-instructions
- kunishou/databricks-dolly-15k-ja

Training

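This adapter is instruction-tuned with the following prompt-response format. The build_prompt function in Usage omits the Input section when no input text is given: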

### Instruction:
{instruction}

### Input:
{input}

### Response:
{response}
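
As a minimal sketch of how one training record could be rendered into this template: the helper name `format_example` is hypothetical, and dropping the Input section for empty inputs is an assumption that mirrors `build_prompt` in Usage, not a confirmed description of the author's preprocessing.

```python
def format_example(instruction, response, input_text=""):
    """Render one record in the prompt-response template (hypothetical helper)."""
    parts = [f"### Instruction:\n{instruction}\n\n"]
    # Assumption: the Input section is dropped when there is no input,
    # mirroring build_prompt in the Usage section.
    if input_text:
        parts.append(f"### Input:\n{input_text}\n\n")
    parts.append(f"### Response:\n{response}")
    return "".join(parts)


example = format_example(
    "次の文を英語に翻訳してください。",
    "Good morning.",
    "おはようございます。",
)
print(example)
```

At inference time the same layout is used with the response left empty, so the model continues from the `### Response:` line.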

Training updates LoRA adapter weights on top of the base model, and only those adapter weights are released. At inference time, load the base model and then apply the adapter, as shown in Usage.
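
For intuition about what adapter weights represent, the sketch below applies the standard LoRA update, W' = W + (alpha / r) * B A, to a tiny matrix in pure Python. The shapes, the values, and the names `matmul`, `lora_merge`, `alpha`, and `r` are illustrative assumptions; the rank and scaling actually used for this adapter are not stated in this card.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(w, lora_b, lora_a, alpha, r):
    """Return W + (alpha / r) * (B @ A), the standard LoRA merged weight."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as w
    return [
        [wv + scale * dv for wv, dv in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (illustrative values only).
w = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # shape (2, r)
lora_a = [[0.5, 0.5]]    # shape (r, 2)
merged = lora_merge(w, lora_b, lora_a, alpha=2, r=1)
print(merged)
```

Because B and A are much smaller than W, an adapter can be distributed separately from the base model, which is why the base model is still required at load time.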

Intended Use & Limitations

Intended for Japanese instruction-following generation.
Outputs may be verbose or may only partially follow the instruction.
The model can produce incorrect or misleading content; verify critical outputs.
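
Because outputs may run long or drift into a new prompt section, one possible mitigation is to cut the decoded text at the first template marker that reappears. This is only a sketch, assuming a stray `### ` header signals the end of the useful answer; `truncate_response` is a hypothetical helper, not part of this adapter or of any library.

```python
def truncate_response(
    text,
    markers=("### Instruction:", "### Input:", "### Response:"),
):
    """Cut generated text at the earliest reappearing template marker."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()


print(truncate_response("よく眠る。\n\n### Instruction:\n別の質問"))
```

Text without any marker is returned unchanged apart from surrounding whitespace, so the helper could be applied to every decoded output.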

Attribution & Licenses

Adapter license: CC BY-SA 4.0
Base model: rinna/japanese-gpt-neox-small — MIT License
- Model card: https://huggingface.co/rinna/japanese-gpt-neox-small
Dataset: llm-jp/llm-jp-instructions — CC BY 4.0
- Dataset card: https://huggingface.co/datasets/llm-jp/llm-jp-instructions
Dataset: kunishou/databricks-dolly-15k-ja — CC BY-SA 3.0
- Dataset card: https://huggingface.co/datasets/kunishou/databricks-dolly-15k-ja

This adapter modifies the base model by fine-tuning on the above datasets.

Base Model Citation

@misc{rinna-japanese-gpt-neox-small,
 title = {rinna/japanese-gpt-neox-small},
 author = {Zhao, Tianyu and Sawada, Kei},
 url = {https://huggingface.co/rinna/japanese-gpt-neox-small}
}

@inproceedings{sawada2024release,
 title = {Release of Pre-Trained Models for the {J}apanese Language},
 author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
 booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
 month = {5},
 year = {2024},
 pages = {13898--13905},
 url = {https://aclanthology.org/2024.lrec-main.1213},
 note = {\url{https://arxiv.org/abs/2404.01657}}
}

URL: https://huggingface.co/takehika/rinna-neox-small-ja-it-adapter