Open Source · POC Phase

KEIN
FILESYSTEM.


No Filesystem. No Problem.

Erasure-coded parallel object storage written directly to raw block devices — NVMe, SAS SSD, and yes, even HDD. No filesystem. No kernel modules. No POSIX. No distributed lock manager. Native clients talk directly and in parallel to storage nodes. Coordinators handle auth, placement, resolve, and commit. S3 is proxied only because S3 leaves no honest alternative, and that proxy runs on storage nodes, not on a fake "not really in the data path" middle tier. Purpose-built for AI/ML workloads at GPU-cluster scale. Runs on bare metal, KVM virtual machines, and Kubernetes.

Access Documentation Explore Architecture ↓
0 Kernel Modules
0 POSIX Overhead
0 Filesystem Journals
GPU Throughput

STORAGE IS STUCK
IN THE 1970s.

S3, parallel filesystems like Lustre or BeeGFS, Ceph, MinIO, re-invented pNFS, you name it: most major file and object storage systems in production today are, at their core, either POSIX storage with a distributed veneer or millennium-era object storage layered on top of a POSIX filesystem. They inherit the same foundational assumptions from the 1970s: inodes, journals, POSIX lock semantics, directory trees, page-cache coherency. These abstractions were designed for interactive multi-user Unix workstations. They were not designed for a world where a single training run reads 1 PiB across 2,500 GPU nodes.

Legacy Lock Managers

Distributed locks add milliseconds to every metadata operation

Designed for POSIX consistency that nobody in the AI/ML world asked for — but everybody pays the performance tax on. Your GPUs sit idle while the storage system argues with itself about file locks.

Kernel Module Hell

Recompile for every OS update, crash the whole node on bugs

Kernel-level filesystem clients are a maintenance nightmare at scale. One bad module update can take down hundreds of compute nodes simultaneously. Your storage vendor's release cycle now dictates your kernel upgrades.

Bolted-On Observability

eBPF probes and sidecar agents glued to the I/O path

"Why is my training run slow?" should not require deploying a separate monitoring stack, attaching kernel probes across every node, and correlating logs from six different systems to find a single bottleneck.

POSIX Tax

Compatibility shims that nobody asked for

Your AI pipeline doesn't need atomic rename, POSIX advisory locks, or extended attributes. But your storage system implements all of it — and charges you in latency, complexity, and ops burden on every single I/O operation.

FROM YOUR MACBOOK
TO 2,500 GPUs.

Storage shouldn't dictate your workflow. With libkeinfs, an AI researcher can prototype a training pipeline on a MacBook against a single-node KeInFS instance in a Docker container, then deploy that exact same code — unchanged — against a 2,500-node GPU cluster backed by a 3 TiB/s KeInFS storage backend. Same API. Same client library. Same data format. Zero refactoring.

Prototype
MacBook + single KeInFS node
libkeinfs / S3 / FUSE read+write
Validate
Dev cluster + multi-node KeInFS
Same code, real scale
Production
2,500 GPUs + 3 TiB/s KeInFS
Same code. Zero refactoring.

BUILT FOR AI.
NOT RETROFITTED.

KeInFS is not a general-purpose filesystem that someone added an S3 gateway to. It is an object storage system designed from the ground up for the access patterns of modern machine learning: massive sequential reads, large checkpoint writes, high-throughput parallel ingest, and low-latency model serving.

01

Raw Block Devices

Erasure-coded data written directly to NVMe via io_uring with O_DIRECT. No filesystem layer, no journal, no inode table. A policy-driven extent allocator handles large objects, while packed containers keep small-object overhead under control.
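As a rough illustration of that allocator policy, here is a minimal sketch. The 4 KiB block size, 64 KiB small-object threshold, 1 MiB container size, and all names are invented for illustration; they are not the actual KeInFS allocator.

```python
BLOCK = 4096  # O_DIRECT requires device offsets and lengths aligned to this

def align_up(n: int, a: int = BLOCK) -> int:
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a

class ExtentAllocator:
    """Toy bump allocator over a raw device: large objects get whole
    block-aligned extents, small objects are packed into shared containers."""

    def __init__(self, small_threshold: int = 64 * 1024):
        self.cursor = 0                    # next free byte on the device
        self.small_threshold = small_threshold
        self.container = None              # open packed container: [offset, used]

    def allocate(self, size: int):
        """Return (device_offset, reserved_length) for an object of `size` bytes."""
        if size >= self.small_threshold:
            off = self.cursor
            self.cursor += align_up(size)  # dedicated, block-aligned extent
            return off, align_up(size)
        # pack small objects back-to-back into a shared 1 MiB container,
        # so per-object alignment padding is paid once per container
        if self.container is None or self.container[1] + size > 1 << 20:
            self.container = [self.cursor, 0]
            self.cursor += 1 << 20
        off = self.container[0] + self.container[1]
        self.container[1] += size
        return off, size
```

The point of the packed-container branch is the small-object overhead mentioned above: aligning every 100-byte object to a 4 KiB block would waste ~97% of the space, so only extents, not packed objects, carry alignment padding.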

02

HTTP/2 Native Protocol

The wire protocol, KeInFS/2, runs HTTP/2 over TLS 1.3. Coordinators handle auth, resolve, and commit. Smart clients using libkeinfs read and write directly to storage nodes in parallel, with direct pull or direct push selected by client policy. A FUSE client built on libkeinfs supports both reads and writes on that same direct path.
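The control-plane/data-plane split can be sketched as a toy in-memory model. Every name below (Coordinator, StorageNode, client_write) is an illustrative assumption, not the KeInFS/2 wire API; the point is only that object bytes flow client-to-storage-node and never pass through the coordinator.

```python
import uuid

class Coordinator:
    """Control plane only: issues placement, records commits.
    No object bytes ever pass through it."""

    def __init__(self, storage_nodes):
        self.nodes = storage_nodes
        self.pending = {}    # txn_id -> (key, planned chunk placement)
        self.namespace = {}  # object key -> committed placement

    def initiate(self, key, n_chunks):
        txn = str(uuid.uuid4())
        placement = [self.nodes[i % len(self.nodes)] for i in range(n_chunks)]
        self.pending[txn] = (key, placement)
        return txn, placement

    def commit(self, txn):
        key, placement = self.pending.pop(txn)
        self.namespace[key] = placement  # object becomes visible atomically

class StorageNode:
    """Data plane: clients push chunks here directly, in parallel."""

    def __init__(self, name):
        self.name, self.chunks = name, {}

    def put_chunk(self, txn, idx, data):
        self.chunks[(txn, idx)] = data

def client_write(coord, key, chunks):
    txn, placement = coord.initiate(key, len(chunks))  # control plane
    for i, (node, data) in enumerate(zip(placement, chunks)):
        node.put_chunk(txn, i, data)                   # direct push, parallel in reality
    coord.commit(txn)                                  # publish
```

A real client would issue the pushes concurrently over HTTP/2 streams and attach the coordinator-issued capability to each one; the sketch keeps only the shape of the flow.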

03

S3 Backward Compatible

The entire AI/ML ecosystem — PyTorch, Hugging Face, DVC, MLflow, Spark — speaks S3. KeInFS provides S3 compatibility through storage-node proxy ingress, so adoption is zero-friction without pretending S3 is the high-performance path. Same data, same metadata, different honesty level.

04

Built-In Observability

Every KeInFS/2 request carries a signed attribution context: tenant, team, project, job, training rank. Metrics per-job, per-operation, in real time. "Why is my training run slow?" — one command, one answer.
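A signed attribution context could look something like the following stdlib-only sketch, assuming an HMAC-SHA256 scheme over a canonical JSON payload. The field names come from the text above; the token format and key handling are invented for illustration.

```python
import hashlib
import hmac
import json

SECRET = b"coordinator-issued-key"  # illustrative; real keys come from auth

def sign_context(ctx: dict) -> str:
    """Serialize the context canonically and append an HMAC tag."""
    payload = json.dumps(ctx, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{tag}"

def verify_context(token: str) -> dict:
    """Reject any token whose payload no longer matches its tag."""
    payload, tag = token.rsplit(".", 1)
    want = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, want):
        raise ValueError("attribution context tampered with")
    return json.loads(payload)

ctx = {"tenant": "acme", "team": "vision", "project": "llm-pretrain",
       "job": "run-0042", "rank": 17}
token = sign_context(ctx)
```

Because every request carries a verifiable tenant/team/project/job/rank tuple, per-job metrics fall out of the request stream itself rather than out of a bolted-on monitoring stack.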

05

ISA-L Erasure Coding

Reed-Solomon erasure coding via Intel ISA-L with AVX2/AVX-512 SIMD acceleration. Encode throughput exceeds network bandwidth on modern hardware. Profiles from "kamikaze" to "fortress" — you choose your redundancy vs. capacity tradeoff.
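To make the redundancy-versus-capacity tradeoff concrete, here is the simplest possible erasure-coding illustration: k data chunks plus a single XOR parity chunk, which is the m = 1 special case of Reed-Solomon. The real system uses ISA-L Reed-Solomon with configurable m and tolerates multiple losses; this sketch only shows why losing one chunk loses no data.

```python
from functools import reduce

def xor_all(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks))

def encode(data: bytes, k: int):
    """Split data into k equal (zero-padded) chunks and append one parity chunk."""
    size = -(-len(data) // k)  # ceil division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
    return chunks + [xor_all(chunks)]

def reconstruct(chunks):
    """Recover a single missing chunk (marked None) by XOR-ing the survivors:
    since parity = c0 ^ c1 ^ ... ^ ck-1, XOR of all survivors equals the lost chunk."""
    lost = chunks.index(None)
    chunks[lost] = xor_all([c for c in chunks if c is not None])
    return chunks
```

The capacity math follows directly: k data + m parity chunks store k units of data in k + m units of raw space while surviving m losses, which is exactly the dial the "kamikaze"-to-"fortress" profiles turn.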

06

Self-Healing Cluster

Drives fail. Nodes fail. KeInFS detects failures in seconds, reconstructs lost chunks from parity across surviving nodes, and restores full protection — automatically, without operator intervention, without impacting running workloads.
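Detection of a silent node can be sketched as a heartbeat tracker. The two-second silence threshold and all names below are illustrative assumptions, not the actual KeInFS mechanism; real detection would also distrust its own clock and confirm via peers before triggering a rebuild.

```python
class FailureDetector:
    """Flags any node whose last heartbeat is older than the timeout."""

    def __init__(self, timeout_s: float = 2.0):
        self.timeout = timeout_s
        self.last_seen = {}  # node name -> timestamp of last heartbeat

    def heartbeat(self, node: str, now: float):
        self.last_seen[node] = now

    def failed_nodes(self, now: float):
        """Nodes silent past the timeout; these would trigger parity rebuild."""
        return {n for n, t in self.last_seen.items() if now - t > self.timeout}
```

Once a node lands in this set, the rebuild path described above kicks in: lost chunks are reconstructed from parity on surviving nodes, and the operator swaps hardware whenever convenient.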

DUAL-PATH CLIENT MODEL

Native KeInFS and S3 serve different needs. The native path uses coordinators for control plane only and moves bytes directly between clients and storage nodes. S3 is the only proxy path, and that proxy runs on storage nodes behind ordinary load balancing.

Clients
libkeinfs (Smart Path) Initiate/Resolve at coordinators → EC encode local → direct pull or push with storage nodes → EC decode local.
FUSE Client (on libkeinfs) Full read/write POSIX emulation with direct chunk I/O, aggressive read-ahead, writeback, and a pinned hot core for low latency.
S3 SDK (Proxy Path) Standard S3 requests. A storage-node ingress proxy handles assembly, fan-out, and streaming on behalf of the client.
▼   HTTP/2 + TLS 1.3   ▼
Coordinators
Stateless Control Plane Auth · policy evaluation · initiate/resolve/commit · management API · quota orchestration · signed capability issuance
▼   mTLS   ▼
Data Plane
Metadata Plane Ordered namespace operations · object publish transactions · watches · leases · backend under active evaluation
EC Engine (ISA-L) Reed-Solomon · AVX2/AVX-512 SIMD · CRC32C via SSE 4.2 · Hardware-accelerated integrity
Storage Nodes direct chunk service · optional S3 ingress · policy allocator with extents and packed containers · Raw block devices

PERFORMANCE IS
NOT AN AFTERTHOUGHT.

Every design decision in KeInFS optimizes for the access patterns of AI/ML workloads. When accelerator vendors recommend 1.4 GB/s+ sustained read bandwidth per GPU, your storage system needs to deliver — not negotiate POSIX locks.
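That sizing rule is plain arithmetic, sketched below. The conversion assumes the storage figure is binary TiB/s (1024⁴ bytes) while vendor per-GPU figures are decimal GB/s (10⁹ bytes); the function name is illustrative.

```python
def supportable_gpus(storage_tib_per_s: float, gb_per_gpu: float = 1.4) -> int:
    """How many GPUs a measured storage throughput can feed at a
    sustained per-GPU read-bandwidth requirement."""
    gb_per_s = storage_tib_per_s * (1024 ** 4) / 1e9  # TiB/s -> decimal GB/s
    return int(gb_per_s // gb_per_gpu)
```

At 1.4 GB/s per GPU, a 3 TiB/s backend works out to roughly 2,350 GPUs, which is consistent with the 2,500-GPU scale this page is built around.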

0
Kernel Modules
Entirely userspace. Deploy as static binaries or containers with zero runtime dependencies. No recompilation on kernel upgrades.
≈LS
Line-Speed Transfer
Transports data at or near line speed, on par with traditional parallel filesystems — without the kernel modules, lock managers, or POSIX baggage.
1.4
GB/s per GPU
Designed to saturate modern accelerator bandwidth requirements — NVIDIA, AMD, Intel, or whatever comes next. Calculate GPU supportability directly from measured storage throughput.
<2s
Failure Detection
Automatic failure detection and self-healing rebuild from erasure coding parity. Operators replace hardware at their convenience.
Rust io_uring Metadata Contract Intel ISA-L HTTP/2 TLS 1.3 mTLS Reed-Solomon EC AVX-512 SIMD S3 Compatible FUSE Client Busy Poll CRC32C / SSE 4.2

GET UNDER
THE HOOD.

Full architecture documentation, design specifications, deployment guides, and API reference are available to project contributors. Authenticate with GitHub to access the complete technical documentation.

Sign in with GitHub

Requires membership in the storagebit GitHub organization. Access opens progressively as the project matures.