
URL: https://huggingface.co/zfj1998/A2Search-3B-Instruct

zfj1998/A2Search-3B-Instruct · Hugging Face


  • This repository contains the RL-trained model accompanying our paper, A^2Search: Ambiguity-Aware Question Answering with Reinforcement Learning. More details are available at https://github.com/zfj1998/A2Search
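A minimal sketch of loading this checkpoint with the Hugging Face `transformers` library, assuming it behaves like a standard causal language model (the page lists Safetensors weights in BF16 on a Qwen2.5-3B base). The helper names `load_model` and `generate` are hypothetical, and the sketch does not reproduce the search-augmented inference setup from the paper; see the GitHub repository for that.

```python
MODEL_ID = "zfj1998/A2Search-3B-Instruct"  # repository name taken from this page


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model weights in BF16 (hypothetical helper).

    Imports are deferred so this module can be imported without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a plain-text continuation of `prompt` (hypothetical helper)."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `load_model()` downloads the weights on first use, so it needs network access and enough memory for a 3B-parameter model.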
Downloads last month: 8
Format: Safetensors
Model size: 3B params
Tensor type: BF16
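As a rough worked example, the model size and tensor type listed here give a back-of-the-envelope estimate of the memory needed just to hold the weights. This sketch assumes exactly 3e9 parameters (the page only gives the rounded "3B") and 2 bytes per BF16 value; it ignores activations, the KV cache, and framework overhead. The function name `weight_memory_gib` is hypothetical.

```python
# Bytes per parameter for common tensor types.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4}


def weight_memory_gib(num_params: float, dtype: str = "bf16") -> float:
    """Approximate weight memory in GiB for a given parameter count and dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Assumed parameter count: 3e9 (rounded figure from the page).
    print(f"{weight_memory_gib(3e9):.2f} GiB")
```

Under these assumptions the BF16 weights come to a little under 6 GiB, and loading them in FP32 instead would double that.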

Model tree for zfj1998/A2Search-3B-Instruct

Base model: Qwen/Qwen2.5-3B
Finetuned (1367): this model
Quantizations: 2 models
