Best models for persona based chats. โข 6 items โข Updated โข 2
๐ Image
Aether-12b
Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.
Model Details ๐
- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily), may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b
Model Architecture ๐๏ธ
- Base model: Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Transformer-based language model
Open LLM Leaderboard Evaluation Results
Coming Soon !
Training & Fine-tuning ๐
Aether-12b was fine-tuned on the following dataset:
- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL SFTTrainer with AdamW optimizer, cosine decay LR scheduler, bfloat16 precision.
The CleverBoi-Data-20k dataset improved the model in the following ways:
- Enhanced reasoning and problem-solving capabilities
- Broader knowledge across various topics
- Improved performance on specific tasks like writing, analysis, and problem-solving
- Better contextual understanding and response generation
Intended Use ๐ฏ
As an assistant or specific role bot.
Ethical Considerations ๐ค
As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.
Acknowledgments ๐
We acknowledge the contributions of:
- theprint for the amazing CleverBoi-Data-20k dataset
- Downloads last month
- 9
Safetensors
Model size
12B params
Tensor type
BF16
ยท
Model tree for aixonlab/Aether-12b
Base model
Xclbr7/Arcanum-12b