Aether-12b

Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.

Model Details 📊

Developed by: AIXON Lab
Model type: Causal Language Model
Language(s): English (primarily), may support other languages
License: apache-2.0
Repository: https://huggingface.co/aixonlab/Aether-12b

Model Architecture 🏗️

Base model: Arcanum-12b
Parameter count: ~12 billion
Architecture specifics: Transformer-based language model

Open LLM Leaderboard Evaluation Results

Coming soon.

Training & Fine-tuning 🔄

Aether-12b was fine-tuned on the following dataset:

Dataset: theprint/CleverBoi-Data-20k
Fine-tuning method: TRL SFTTrainer with AdamW optimizer, cosine decay LR scheduler, bfloat16 precision.
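
To make the "cosine decay LR scheduler" above concrete, here is a minimal sketch of what such a schedule computes. The peak learning rate, step count, and the absence of warmup are assumptions for illustration; this card does not state the actual hyperparameters. In TRL the schedule would normally be selected through the trainer configuration (for example `lr_scheduler_type="cosine"`) rather than written by hand.

```python
import math


def cosine_decay_lr(step: int, total_steps: int, peak_lr: float, min_lr: float = 0.0) -> float:
    """Learning rate at `step` when decaying from peak_lr to min_lr along a half cosine."""
    # Clamp the step so values outside [0, total_steps] stay on the curve's endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical values, not taken from the card.
PEAK_LR = 2e-5
TOTAL_STEPS = 1000

for s in (0, 250, 500, 750, 1000):
    print(s, cosine_decay_lr(s, TOTAL_STEPS, PEAK_LR))
```

The rate starts at the peak, falls slowly at first, fastest around the midpoint, and flattens out as it approaches the minimum.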

Fine-tuning on CleverBoi-Data-20k is intended to improve the model in the following ways (benchmark results are not yet available):

Enhanced reasoning and problem-solving capabilities
Broader knowledge across various topics
Improved performance on specific tasks like writing, analysis, and problem-solving
Better contextual understanding and response generation

Intended Use 🎯

Aether-12b is intended for use as a general-purpose assistant or as a role-specific (persona) chatbot.
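
A minimal sketch of how the model might be loaded for a role-specific chat with the Hugging Face `transformers` library follows. The prompt layout in `build_prompt` is hypothetical, since this card does not specify a prompt format, and `generate_reply` assumes `torch`, `transformers`, and `accelerate` are installed on hardware with enough memory for a 12B model.

```python
MODEL_ID = "aixonlab/Aether-12b"  # repository named in this card


def build_prompt(persona: str, user_message: str) -> str:
    """Hypothetical plain-text prompt layout; the card does not define one."""
    return f"{persona.strip()}\n\nUser: {user_message.strip()}\nAssistant:"


def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # the card lists bfloat16 precision
        device_map="auto",  # needs the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `generate_reply(build_prompt("You are a patient math tutor.", "Explain fractions."))` would download the weights on first use. If the tokenizer ships a chat template, that template is likely a better choice than this hand-written layout.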

Ethical Considerations 🤔

As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

Acknowledgments 🙏

We acknowledge the contributions of:

theprint for the amazing CleverBoi-Data-20k dataset

Downloads last month: 9

Model size: 12B params (Safetensors)
Tensor type: BF16
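
As a rough guide to what a 12B-parameter BF16 checkpoint implies for memory, the arithmetic can be sketched as below. It assumes exactly 12e9 parameters at 2 bytes each. The card only gives an approximate count, and the estimate covers the weights alone, not activations or the KV cache.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# 12e9 is the card's rounded parameter count; BF16 stores 2 bytes per parameter.
bf16_gib = weight_memory_gib(12e9)
print(f"BF16 weights: about {bf16_gib:.1f} GiB")
```

Under these assumptions the weights need a little over 22 GiB.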

Model tree for aixonlab/Aether-12b

Base model: Xclbr7/Arcanum-12b
Finetuned (2): this model
Finetunes: 1 model
Merges: 8 models
Quantizations: 3 models
