Here are 73 public repositories matching this topic.
LvLLM is a NUMA extension of vllm that makes full use of CPU and memory resources, reduces GPU memory requirements, and features an efficient GPU-parallel and NUMA-parallel architecture, supporting hybrid inference for large MoE models.
Python multi-process execution pool: a concurrent asynchronous execution pool with custom resource constraints (memory, timeouts, affinity, CPU cores, and caching), load balancing, and profiling of external apps on NUMA architectures.
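The core pattern of such a pool — running external apps concurrently while enforcing per-task timeouts — can be sketched with Python's standard library. This is an illustration of the pattern, not this repository's API; the commands, worker count, and timeout values are placeholders:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_external(cmd, timeout):
    """Run one external command, enforcing a wall-clock timeout.

    Returns (returncode, stdout), or (None, "") if the timeout expired.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        return None, ""

def run_pool(commands, timeout=5.0, workers=4):
    """Execute a batch of external commands concurrently with a shared timeout."""
    # Threads suffice here: the real work happens in the child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: run_external(c, timeout), commands))

# Example: run the same trivial command a few times in parallel.
results = run_pool([[sys.executable, "-c", "print('ok')"]] * 3)
```

A production pool would add the memory limits, CPU affinity, and caching the description mentions (e.g. via `resource.setrlimit` and `os.sched_setaffinity` on Linux).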
RAM Coffers: Conditional Memory via NUMA-Distributed Weight Banking - O(1) lookup routing for LLM inference (Dec 16, 2025 - predates DeepSeek Engram by 27 days)
A community-oriented list of useful NUMA-related libraries, tools, and other resources
Multi-core Window-Based Stream Processing Engine
AltiVec/VSX optimized llama.cpp for IBM POWER8
NUMAPROF is a NUMA memory profiler based on Pintool that tracks your remote memory accesses.
Lsglang is an extension of sglang that fully utilizes CPU and GPU computing resources with an efficient GPU-parallel + NUMA-parallel architecture, suitable for hybrid inference of MoE models.
Rust bindings to the Open MPI Portable Hardware Locality ("hwloc") library, covering versions 2.0 and above.
Data Plane Development Kit (DPDK) integration into OpenWrt
NumaMMA is a lightweight memory profiler for parallel applications
Go package providing information about the number of CPUs in the system
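A common subtlety packages like this address is that the number of CPUs installed in the system can differ from the number the process is actually allowed to use (e.g. under cpuset or affinity restrictions). Python's standard library exposes both views; a minimal sketch of the distinction:

```python
import os

# CPUs installed in the system (or None if it cannot be determined).
total = os.cpu_count()

# CPUs this process may run on -- on Linux this reflects
# sched_setaffinity/cpuset limits; not available on all platforms.
if hasattr(os, "sched_getaffinity"):
    usable = len(os.sched_getaffinity(0))
else:
    usable = total

print(total, usable)
```

On an unrestricted machine the two values match; inside a container pinned to a subset of cores, `usable` is the number that matters for sizing worker pools.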
👁 numanji
Local-affinity first NUMA-aware allocator with optional fallback.
NUMA-aware multi-CPU multi-GPU data transfer benchmarks
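The host-side half of such a benchmark suite boils down to timing buffer copies and reporting bandwidth. A pure-Python sketch of that measurement (it times a simple in-process buffer copy, not GPU or cross-NUMA-node transfers; the buffer size and repeat count are arbitrary choices):

```python
import time

def copy_bandwidth(n_bytes=64 * 1024 * 1024, repeats=5):
    """Time copies of an n_bytes buffer and return the best GB/s observed."""
    src = bytes(n_bytes)          # zero-filled source buffer
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dst = bytearray(src)      # one full copy of the buffer
        t1 = time.perf_counter()
        best = min(best, t1 - t0)
        del dst
    # Report the fastest run, the usual convention for bandwidth benchmarks.
    return n_bytes / best / 1e9

gbps = copy_bandwidth()
```

A NUMA-aware version would additionally pin the thread and the allocation to specific nodes (e.g. via libnuma) so that local and remote bandwidth can be compared.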
A non-Unix, custom-API hybrid OS kernel written in C++, which can be thought of as an emulated microkernel. The native API is almost fully asynchronous, and the kernel targets high-scaling, high-throughput multiprocessor workloads, with working SMP and NUMA support already implemented. Join the IRC channel, #zbz-dev on freenode!
cgroups-based cpuset isolator and resource estimator modules for Mesos
A repo for validating the performance results in the knor paper and providing a fast, scalable k-means implementation.