Krasis is a hybrid LLM runtime focused on efficiently running larger models on consumer-grade, VRAM-limited hardware.
An Android inference engine running 20B+-parameter LLMs on devices with 4-8 GB of RAM. Features proprietary Layer-by-Layer (LBL) streaming, zero-copy mmap loading, and a native C++/Kotlin architecture.
Convert and quantize LLM models.
A simple Gradio app for local translation using the GGUF versions of MADLAD-400
Privacy-first Local RAG Server: Chat with PDF & DOCX using GGUF models via llama.cpp and Qdrant. A lightweight, standalone FastAPI server with a clean HTML UI. High-performance, fully offline document intelligence. No Ollama, no cloud, no API keys.
Splinter is an atomic, lock-free, persistable shared-memory KV and vector store that serves LLM inference without socket, mutex, or memcpy() overhead; it ingests, stores, and optionally persists large amounts of data with minimal latency. Splinter fits in the size of most modern CPU instruction caches (766 ELOC) and ships with a CLI, tools, and tests.
Emotica AI is a compassionate and therapeutic virtual assistant designed to provide empathetic and supportive conversations. It integrates a local LLaMA model for text generation, a vision model for image captioning, a RAG system for information retrieval, and emotion detection to tailor its responses.
Containerized LLM for any use case, big or small.
Nectar-X-Studio is a powerful local AI inference application that lets users download and create agents and run large language models on their own machine. With no internet connection required, Nectar ensures privacy-first, high-performance inference using cutting-edge open-source models from Hugging Face, Ollama, and beyond.
AI tool to help users research using local LLMs and automated web search.
GGUF file format for dotnet