URL: https://huggingface.co/gplsi/Aitana-TA-2B-S

AitanaTA Model Card

AitanaTA-2b-Instruct is a translation LLM that has been instruction-tuned from SalamandraTA-2b-Instruct. This model results from instruction-tuning on parallel data from Salamandra-2b-base. AitanaTA-2b-Instruct has been specifically instruction-tuned for translation between Spanish and Valencian, with a focus on sentence-level translation.

Model Description

| Property | Value |
|---|---|
| Base Model | BSC-LT/salamandraTA-2b-instruct |
| Architecture | Transformer decoder-only |
| Parameters | ~2.25B |
| Languages | Valencian and Spanish |
| License | Apache 2.0 |

Finetuned from model: This model follows the same instruction pattern as SalamandraTA-2b-Instruct. The only adaptation was focusing the instructions on translation between Valencian and Spanish, in both directions.

Training Details

Training Data

The data comes from L'Associació de Mitjans d'Informació i Comunicació (AMIC), from which we built a parallel sentence-level dataset. This dataset was specifically created to align Spanish and Valencian sentences, ensuring high-quality parallel examples for training and evaluation.

Language pairs:

  • Valencian -> Spanish: 738777 (mean_src_len=165.6, mean_tgt_len=168.6)
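The statistics above report a pair count together with mean source and target lengths. A minimal sketch of how such a summary could be computed from a list of sentence pairs follows. It assumes lengths are counted in characters, which the card does not state; `pair_stats` and the toy sentences are hypothetical and are not taken from the AMIC data.

```python
def pair_stats(pairs):
    """Summarise (source, target) sentence pairs.

    Assumption: lengths are measured in characters; the card does not
    say which unit was used for mean_src_len and mean_tgt_len.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    src_total = sum(len(src) for src, _ in pairs)
    tgt_total = sum(len(tgt) for _, tgt in pairs)
    return {
        "count": len(pairs),
        "mean_src_len": round(src_total / len(pairs), 1),
        "mean_tgt_len": round(tgt_total / len(pairs), 1),
    }


# Toy Valencian -> Spanish pairs, for illustration only.
toy_pairs = [
    ("El gat dorm", "El gato duerme"),
    ("Tinc una casa", "Tengo una casa"),
]

stats = pair_stats(toy_pairs)
print(
    f"Valencian -> Spanish: {stats['count']} "
    f"(mean_src_len={stats['mean_src_len']}, mean_tgt_len={stats['mean_tgt_len']})"
)
```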

Training Hyperparameters

Training regime:

  • epochs: 1
  • learning_rate: 1e-5
  • beta1: 0.9
  • beta2: 0.99
  • weight_decay: 0
  • global_batch_size: 64
  • micro_batch_size: 2
  • log_interval: 5
  • save_interval: 5
  • lr_warmup_steps: 100
  • max_seq_length: 2048
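As a rough illustration of how the batch settings above relate to each other, the sketch below derives the gradient-accumulation steps and the approximate number of optimizer steps in one epoch. It rests on assumptions the card does not confirm: that global_batch_size equals micro_batch_size × number of GPUs × accumulation steps, that all four A100 GPUs listed under Compute Infrastructure took part in data-parallel training, and that each of the 738777 sentence pairs counts as one training example. The function names are hypothetical.

```python
import math

# Values taken from this card.
GLOBAL_BATCH_SIZE = 64
MICRO_BATCH_SIZE = 2
NUM_EXAMPLES = 738_777
EPOCHS = 1

# Assumption: all 4 GPUs participate in data-parallel training.
NUM_GPUS = 4


def accumulation_steps(global_batch, micro_batch, num_gpus):
    """Micro-batches each GPU accumulates before one optimizer step."""
    per_forward = micro_batch * num_gpus
    if global_batch % per_forward != 0:
        raise ValueError("global batch must be a multiple of micro_batch * num_gpus")
    return global_batch // per_forward


def optimizer_steps(num_examples, global_batch, epochs=1):
    """Optimizer steps needed to see every example `epochs` times."""
    return math.ceil(num_examples / global_batch) * epochs


accum = accumulation_steps(GLOBAL_BATCH_SIZE, MICRO_BATCH_SIZE, NUM_GPUS)
steps = optimizer_steps(NUM_EXAMPLES, GLOBAL_BATCH_SIZE, EPOCHS)
print(f"accumulation steps per GPU: {accum}")
print(f"approximate optimizer steps: {steps}")
```

Under these assumptions, the 100 warmup steps cover a little under 1% of the run.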

How to use

You can translate between Spanish and Valencian in either direction. The instruction-following model uses the commonly adopted ChatML template:

<|im_start|>system
{SYSTEM PROMPT}<|im_end|>
<|im_start|>user
{USER PROMPT}<|im_end|>
<|im_start|>assistant
{MODEL RESPONSE}<|im_end|>
<|im_start|>user
[...]

The easiest way to apply it is by using the tokenizer's built-in functions, as shown in the following snippet.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gplsi/Aitana-TA-2B-S"

# Translation direction and the sentence to translate.
source = 'Spanish'
target = 'Valencian'
sentence = "La inteligencia artificial está transformando el mundo."

# Instruction prompt in the pattern the model was tuned on.
text = f"Translate the following text from {source} into {target}.\n{source}: {sentence} \n{target}:"

tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype=torch.bfloat16,  # older transformers releases may expect torch_dtype instead
)

# Wrap the prompt as a single user turn and render it with the chat template.
message = [{"role": "user", "content": text}]

prompt = tokenizer.apply_chat_template(
    message,
    tokenize=False,
    add_generation_prompt=True,
)

# The rendered prompt already contains the special tokens, so none are added here.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
input_length = inputs.shape[1]
outputs = model.generate(
    input_ids=inputs.to(model.device),
    max_new_tokens=400,
    early_stopping=True,
    num_beams=5,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0, input_length:], skip_special_tokens=True))
# La intel·ligència artificial està transformant el món.

Using this template, each turn is preceded by a <|im_start|> delimiter and the role of the entity (either user, for content supplied by the user, or assistant for LLM responses), and finished with the <|im_end|> token.
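To make the turn layout concrete, here is a minimal sketch that builds the translation instruction for either direction and renders it as a ChatML string by hand. The helpers `build_instruction` and `render_chatml` are hypothetical. The model's real chat template may differ in whitespace or add a default system prompt, so `tokenizer.apply_chat_template` remains the reliable way to build prompts.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_instruction(source, target, sentence):
    """Instruction text in the pattern used by the usage snippet in this card.

    The space before the second newline is copied from that snippet.
    """
    return (
        f"Translate the following text from {source} into {target}."
        f"\n{source}: {sentence} \n{target}:"
    )


def render_chatml(messages, add_generation_prompt=True):
    """Hand-rolled ChatML rendering (an approximation of the real template)."""
    parts = [
        f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its answer.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


# Reverse direction of the snippet above: Valencian into Spanish.
text = build_instruction("Valencian", "Spanish", "El gat dorm.")
print(render_chatml([{"role": "user", "content": text}]))
```

Swapping the `source` and `target` arguments is all that is needed to change direction.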

Evaluation

Testing Data

Our objective was to evaluate translation between Spanish and Valencian. To test the model, we used the Phrases task from IberoBench.

Metrics

For evaluation, we relied on a set of standard translation metrics: COMET, BLEU, TER, and ChrF.

Results on Phrases tasks

| Task | Metric | Aitana-TA-2B-S | salamandraTA-2b-instruct |
|---|---|---|---|
| phrases_es-va | BLEU | 64.93 | 62.40 |
| phrases_va-es | BLEU | 81.40 | 75.49 |
| phrases_va-ca | BLEU | 81.19 | 82.07 |
| phrases_ca-va | BLEU | 80.22 | 76.53 |
| phrases_es-va | ChrF | 84.23 | 82.18 |
| phrases_va-es | ChrF | 91.22 | 88.15 |
| phrases_va-ca | ChrF | 92.00 | 91.43 |
| phrases_ca-va | ChrF | 91.77 | 89.57 |
| phrases_es-va | COMET | 0.93 | 0.93 |
| phrases_va-es | COMET | 0.95 | 0.95 |
| phrases_va-ca | COMET | 0.96 | 0.96 |
| phrases_ca-va | COMET | 0.96 | 0.95 |
| phrases_es-va | TER | 21.07 | 23.26 |
| phrases_va-es | TER | 11.03 | 15.06 |
| phrases_va-ca | TER | 9.03 | 10.68 |
| phrases_ca-va | TER | 10.30 | 13.67 |
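When reading these results, keep in mind that BLEU, ChrF and COMET are scores where higher is better, while TER is an error rate where lower is better. The sketch below, using the Phrases values copied from the table above, normalises the direction and counts on how many tasks Aitana-TA-2B-S beats salamandraTA-2b-instruct per metric. COMET is left out because its values are rounded to two decimals and mostly tie. The helper names are hypothetical.

```python
# Direction of each metric: True means a higher value is better.
HIGHER_IS_BETTER = {"BLEU": True, "ChrF": True, "COMET": True, "TER": False}

# (Aitana-TA-2B-S, salamandraTA-2b-instruct) scores from the Phrases table.
PHRASES = {
    "BLEU": {
        "es-va": (64.93, 62.40), "va-es": (81.40, 75.49),
        "va-ca": (81.19, 82.07), "ca-va": (80.22, 76.53),
    },
    "ChrF": {
        "es-va": (84.23, 82.18), "va-es": (91.22, 88.15),
        "va-ca": (92.00, 91.43), "ca-va": (91.77, 89.57),
    },
    "TER": {
        "es-va": (21.07, 23.26), "va-es": (11.03, 15.06),
        "va-ca": (9.03, 10.68), "ca-va": (10.30, 13.67),
    },
}


def gain(metric, aitana, baseline):
    """Positive when Aitana-TA-2B-S is better, whatever the metric's direction."""
    diff = aitana - baseline
    return round(diff if HIGHER_IS_BETTER[metric] else -diff, 2)


def wins(scores):
    """Number of tasks per metric on which Aitana-TA-2B-S comes out ahead."""
    return {
        metric: sum(gain(metric, a, b) > 0 for a, b in tasks.values())
        for metric, tasks in scores.items()
    }


print(wins(PHRASES))
```

On these figures the only task where the base model stays ahead is BLEU on phrases_va-ca.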

Results on FLORES tasks

| Task | Metric | Aitana-TA-2B-S | salamandraTA-2b-instruct |
|---|---|---|---|
| ca_en_flores | BLEU | 46.79 | 46.43 |
| ca_es_flores | BLEU | 26.97 | 24.20 |
| en_ca_flores | BLEU | 43.80 | 42.10 |
| en_es_flores | BLEU | 29.05 | 26.08 |
| es_ca_flores | BLEU | 26.90 | 22.14 |
| es_en_flores | BLEU | 32.10 | 28.64 |
| ca_en_flores | ChrF | 69.76 | 69.52 |
| ca_es_flores | ChrF | 54.37 | 52.94 |
| en_ca_flores | ChrF | 66.86 | 65.68 |
| en_es_flores | ChrF | 56.26 | 54.33 |
| es_ca_flores | ChrF | 56.00 | 52.56 |
| es_en_flores | ChrF | 60.44 | 58.36 |
| ca_en_flores | COMET | 0.88 | 0.88 |
| ca_es_flores | COMET | 0.86 | 0.85 |
| en_ca_flores | COMET | 0.87 | 0.87 |
| en_es_flores | COMET | 0.86 | 0.85 |
| es_ca_flores | COMET | 0.86 | 0.85 |
| es_en_flores | COMET | 0.87 | 0.86 |
| ca_en_flores | TER | 39.98 | 41.06 |
| ca_es_flores | TER | 59.12 | 61.24 |
| en_ca_flores | TER | 43.00 | 44.24 |
| en_es_flores | TER | 56.48 | 58.58 |
| es_ca_flores | TER | 64.29 | 68.49 |
| es_en_flores | TER | 56.14 | 60.99 |

Technical Specifications

Hardware and Software

For training, we used custom code built on Lightning Fabric (part of the PyTorch Lightning ecosystem), which shards the model across multiple GPUs through Fully Sharded Data Parallel (FSDP).

Compute Infrastructure

This model was trained on NVIDIA DGX systems equipped with A100 GPUs. Training used four A100 GPUs.

Additional Information

Author

The model has been developed by the Language and Information Systems Group (GPLSI) and the Centro de Inteligencia Digital (CENID), both part of the University of Alicante (UA), as part of their ongoing research in Natural Language Processing (NLP).

Funding

This work is funded by the Ministerio para la Transformación Digital y de la Función Pública, co-financed by the EU – NextGenerationEU, within the framework of the project Desarrollo de Modelos ALIA. This work has also been partially supported by Project HEART-NLP (PID2024-156263OB-C22).

Acknowledgments

We would like to express our gratitude to all individuals and institutions that have contributed to the development of this work.

Special thanks to:

We also acknowledge the financial, technical, and scientific support of the Ministerio para la Transformación Digital y de la Función Pública - Funded by EU – NextGenerationEU within the framework of the project Desarrollo de Modelos ALIA, whose contribution has been essential to the completion of this research.

License

Apache License, Version 2.0

Disclaimer

This model has been developed and instruction-tuned specifically for translation between Spanish and Valencian. Its use outside of translation tasks is not recommended, as performance and reliability have not been evaluated for other natural language processing applications. The authors are not responsible for potential errors, misinterpretations, or inappropriate use of the model beyond its intended purpose.

Reference

If you use this model in your research or work, please cite it as follows:

@misc{gplsi-aitana-ta-2b-s,
 author = {Sepúlveda-Torres, Robiert and Galeano, Santiago and Miró Maestre, María and Martínez-Murillo, Iván and Grande, Eduardo and Canal-Esteve, Miquel and Estevanell-Valladares, Ernesto L. and Yáñez-Romero, Fabio and Consuegra-Ayala, Juan Pablo and Bonora, Mar and Gutierrez, Yoan and Abreu Salas, José Ignacio and Lloret, Elena and Montoyo, Andrés and Muñoz-Guillena and Palomar, Manuel},
 title = {Aitana-TA-2b-S: Translation model for Spanish and Valencian},
 year = {2025},
 institution = {Language and Information Systems Group (GPLSI) and Centro de Inteligencia Digital (CENID), University of Alicante (UA)},
 howpublished = {\url{https://huggingface.co/gplsi/Aitana-TA-2B-S}},
 note = {Accessed: 2025-10-03}
}

Copyright © 2026 Language and Information Systems Group (GPLSI) and Centro de Inteligencia Digital (CENID), University of Alicante (UA). Distributed under the Apache License 2.0.
