SWE-FastContext A family of code-search models powering the Explore subagent for coding agents. Text Generation • 4B • Updated 1 day ago • 957 • 202 Text Generation • 4B • Updated 1 day ago • 795 • 36
GridSFM Collection of datasets and models developed to support research in power grid modeling Updated 24 days ago • 557 • 4 Graph Machine Learning • Updated 23 days ago • 18
ChatBench ChatBench Datasets and Simulators (same prompt + fine-tuning set-up) from the ChatBench paper. Preview • Updated Apr 28, 2025 • 104 • 13 Text Generation • 81.9M • Updated Aug 23, 2025 • 38 • 4 Updated Aug 23, 2025 • 11 • 6 Updated Aug 23, 2025 • 12 • 5
MediPhi A collection of SLMs based on Phi3.5-mini-instruct adapted to clinical natural language processing tasks: https://arxiv.org/abs/2505.10717 Paper • 2505.10717 • Published May 15, 2025 • 5 Text Generation • 4B • Updated Dec 15, 2025 • 1.65k • 69 Text Generation • 4B • Updated Dec 15, 2025 • 847 • 23 Text Generation • 4B • Updated Dec 15, 2025 • 159 • 13
NextCoder NextCoder family of code-editing LMs developed with Selective Knowledge Transfer, along with their training data. Text Generation • 8B • Updated Jun 12, 2025 • 141 • 34 Text Generation • 15B • Updated Jun 12, 2025 • 284 • 19 Text Generation • 33B • Updated Jun 12, 2025 • 44 • 69 Viewer • Updated Jul 8, 2025 • 381k • 348 • 55
Phi-3 Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. Text Generation • 4B • Updated Dec 10, 2025 • 937k • 989 Text Generation • 42B • Updated Dec 10, 2025 • 118k • 575 Image-Text-to-Text • 4B • Updated Dec 10, 2025 • 1.72M • 736 Text Generation • 4B • Updated Dec 10, 2025 • 563k • 1.44k
Controllable Safety Alignment Artifacts for the paper "Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements" (https://arxiv.org/abs/2410.08968) Paper • 2410.08968 • Published Oct 11, 2024 • 14 Viewer • Updated Aug 1, 2025 • 200 • 328 • 3 Viewer • Updated May 5, 2025 • 3.2k • 87 • 3 Viewer • Updated Aug 1, 2025 • 125k • 127 • 4
MAI-DS-R1 MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team. Text Generation • 671B • Updated Dec 15, 2025 • 165 • 299 Text Generation • 671B • Updated Dec 15, 2025 • 61 • 27
SpeechT5 The SpeechT5 framework consists of a shared seq2seq and six modal-specific (speech/text) pre/post-nets that can address a few audio-related tasks. Paper • 2110.07205 • Published Oct 14, 2021 • 6 Text-to-Speech • Updated Nov 8, 2023 • 89.1k • 835 SpeechT5 Speech Synthesis Demo 220 Audio-to-Audio • Updated Mar 22, 2023 • 3.81k • 111
Table Transformer The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. Object Detection • 28.8M • Updated Sep 6, 2023 • 1.69M • 424 Object Detection • 28.8M • Updated Sep 6, 2023 • 1.22M • 219 Object Detection • 28.8M • Updated Nov 18, 2023 • 174k • 83 Object Detection • 28.8M • Updated Nov 27, 2023 • 1.11k • 2
Biomedical Models for biomedical research applications, such as radiology report generation and biomedical language understanding. Text Generation • 7B • Updated Aug 14, 2025 • 6.06k • 79 Image Feature Extraction • 86.6M • Updated Aug 22, 2024 • 23.3k • 24 Image Feature Extraction • 86.6M • Updated May 12 • 21.6k • 78 Updated Dec 8, 2025 • 31
UDOP UDOP is a general multimodal model for document AI Paper • 2212.02623 • Published Dec 5, 2022 • 12 Image-Text-to-Text • 0.7B • Updated Dec 2, 2025 • 90.5k • 124 Image-Text-to-Text • 0.7B • Updated Dec 2, 2025 • 36 • 6 Image-Text-to-Text • 0.7B • Updated Dec 2, 2025 • 204 • 34
Florence Paper • 2311.06242 • Published Nov 10, 2023 • 97 Image-Text-to-Text • 0.8B • Updated Aug 4, 2025 • 532k • 1.82k Image-Text-to-Text • 0.2B • Updated Aug 4, 2025 • 2.3M • 379 Image-Text-to-Text • 0.8B • Updated Aug 4, 2025 • 21.7k • 385
MoCapAct Locomotion policies for hundreds of simulated humanoid locomotion clips and demonstration data for training them. Updated Aug 17, 2024 • 10 Updated Aug 17, 2024 • 149 • 5 Paper • 2208.07363 • Published Aug 15, 2022 • 2
Froggy-Models Text Generation • Updated Jan 22 • 5.56k • 32 Text Generation • 425k • Updated Jan 15 • 243 • 62
Skala Accurate and scalable exchange-correlation with deep learning Updated Apr 23 • 96.9k • 10 Paper • 2506.14665 • Published Apr 21 • 5 Updated May 4 • 129 • 6 Updated Apr 23 • 25.8k • 5
VibeVoice Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ Text-to-Speech • 3B • Updated Jan 22 • 219k • 2.4k Text-to-Speech • 1B • Updated Dec 12, 2025 • 659k • 1.23k Paper • 2508.19205 • Published Aug 26, 2025 • 173 Automatic Speech Recognition • 9B • Updated Jan 27 • 637k • 1.18k
Dayhoff Atlas The models and datasets that comprise the Dayhoff Atlas Viewer • Updated Apr 2 • 1.77B • 3.75k • 12 Text Generation • 0.2B • Updated Jan 16 • 52 • 5 Text Generation • 0.2B • Updated Jan 26 • 442 • 1 Text Generation • 0.2B • Updated Jan 26 • 51 • 2
Paza Paza is a collection of speech models & benchmarks for low resource languages by the Microsoft Research Africa - Nairobi Lab PazaBench 20 ASR Leaderboard for low resource languages Automatic Speech Recognition • 6B • Updated Feb 4 • 148 • 4 Automatic Speech Recognition • 0.8B • Updated Feb 4 • 411 • 8
Phi-4 Phi-4 family of small language, multi-modal and reasoning models. Text Generation • 4B • Updated Dec 10, 2025 • 905 • 281 Text Generation • 4B • Updated Dec 10, 2025 • 58.9k • 237 Text Generation • 15B • Updated Nov 24, 2025 • 11.1k • 228 Text Generation • 15B • Updated Nov 24, 2025 • 24.3k • 343
Phi-1 Phi-1 family of small language models. Text Generation • 1B • Updated Nov 24, 2025 • 10.9k • 222 Text Generation • 1B • Updated Nov 24, 2025 • 51.8k • 1.36k Paper • 2306.11644 • Published Jun 20, 2023 • 158 Paper • 2309.05463 • Published Sep 11, 2023 • 92
BitNet BitNet family of large language models (1-bit LLMs). Text Generation • 0.8B • Updated Dec 17, 2025 • 8.73k • 1.46k Text Generation • 2B • Updated Dec 17, 2025 • 6.93k • 42 Text Generation • 2B • Updated Dec 17, 2025 • 21.4k • 279 Paper • 2504.12285 • Published Apr 16, 2025 • 87
LLM2CLIP LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. Zero-Shot Image Classification • Updated Nov 22, 2024 • 56 • 61 Zero-Shot Classification • 0.6B • Updated Nov 24, 2024 • 13.3k • 45 Updated Feb 8, 2025 • 12 • 11 Zero-Shot Classification • 0.4B • Updated Nov 24, 2024 • 489 • 19
TAPEX TAPEX is a family of state-of-the-art table pre-training models that can be used for table-based question answering and table-based fact verification. Paper • 2107.07653 • Published Jul 16, 2021 • 3 Table Question Answering • 0.4B • Updated Jan 12, 2024 • 1.47k • 79 Table Question Answering • Updated Jan 24, 2023 • 817k • 24 Table Question Answering • 0.4B • Updated Sep 15, 2023 • 325 • 19
LayoutLM The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification and DocVQA. 0.1B • Updated Apr 10, 2024 • 1.73M • 500 Updated Sep 16, 2022 • 559k • 68 0.1B • Updated Apr 16, 2024 • 199k • 62 Updated Sep 16, 2022 • 145k • 75
Orca The Orca family of LMs developed by Microsoft. Text Generation • Updated Nov 22, 2023 • 829 • 224 Text Generation • Updated Nov 22, 2023 • 1.14k • 668
GIT GIT (Generative Image-to-text Transformer) is a model useful for vision-language tasks such as image/video captioning and question answering. Paper • 2205.14100 • Published May 27, 2022 • 2 Image-to-Text • 0.2B • Updated Apr 24, 2023 • 10.3k • 110 Image-to-Text • Updated Feb 8, 2023 • 636 • 18 Visual Question Answering • 0.2B • Updated Mar 9, 2024 • 1.17k • 21
IFMs Industrial Foundation Models Text Generation • 7B • Updated Aug 12, 2024 • 24 • 10 Text Generation • 13B • Updated Aug 12, 2024 • 28 • 6