NB-sami
🌌 Borealis
Borealis is a new family of Norwegian-centric instruction-tuned language models from the National Library of Norway.
NB-ASR-BETA
Beta testing resources in the NB-ASR project
nb-embed datasets
NB-NoTraM-Llama 3.x
Llama 3.x models in various sizes.
NB-Whisper-verbatim
NB-Whisper models best suited for linguists and researchers. The output is lowercase and without punctuation.
NB-GPT-J
Models based on GPT-J from EleutherAI, and trained on data from various sources, including the digital collection at the National Library of Norway.
Speech datasets
Speech data for our speech-to-text models.
NB-Whisper Beta
Models based on Whisper from OpenAI, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
NB-TTS
🌌 Borealis Preview
Preview release of the Borealis family of instruction-tuned models by the National Library of Norway.
NB Embed
NB-Whisper
Models based on Whisper from OpenAI, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
NB-Llama 3.x Quant
Quantized versions of the NB-Llama 3.x models. Due to hardware issues, we have not yet been able to make quantized versions of the 70B models.
NB-Wav2Vec
Models based on Wav2Vec from Meta, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
NB-BERT
Models based on BERT from Google, and trained on data from various sources, including the digital collection at the National Library of Norway.
Newspaper Processing
Tools for processing newspapers: cropping (TBD) and front page detection.
Autocrop Workshop
