Hypervisor Decisions
Three decisions that define RumKV's minimal design philosophy.
See also: Rumpk vs. RumKV for a detailed comparison of the kernel and hypervisor.
H1: Type-1 Hypervisor
Status: Accepted
Context
Type-1 hypervisors run directly on hardware (Xen, VMware ESXi). Type-2 hypervisors run inside a host OS (VirtualBox, KVM on Linux). Nexus claims sovereignty — no hidden host OS underneath. If the system runs on Linux, it's a Linux system with Nexus characteristics, not a sovereign system.
Decision
RumKV is a bare-metal Type-1 hypervisor:
- Boots before any guest OS
- Holds highest hardware privilege (EL2 / Ring -1 / M-Mode)
- Assigns CPU cores, memory ranges, and devices to cells via page tables
- < 5 KB active code — minimal by any standard
- Then delegates and mostly sleeps
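The boot-time assignment step above can be modeled as a simple validity check: every core, memory range, and device belongs to exactly one cell. This is an illustrative sketch only, not RumKV source; the `Cell` type and `assign_cells` helper are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    name: str
    cores: tuple        # CPU cores owned exclusively by this cell
    mem_range: tuple    # (base, size) of the cell's physical memory window
    devices: frozenset  # devices mapped into the cell's Stage-2 tables

def assign_cells(cells):
    """Reject any boot plan in which two cells share a core or a device."""
    seen_cores, seen_devs = set(), set()
    for cell in cells:
        if seen_cores & set(cell.cores) or seen_devs & cell.devices:
            raise ValueError(f"overlapping assignment for {cell.name}")
        seen_cores |= set(cell.cores)
        seen_devs |= cell.devices
    return {c.name: c for c in cells}

plan = assign_cells([
    Cell("rumpk",  (0, 1, 2, 3), (0x8000_0000, 0x4000_0000), frozenset({"uart0"})),
    Cell("netbsd", (4, 5),       (0xC000_0000, 0x2000_0000), frozenset({"eth0"})),
])
```

The point of the check is that after boot the map is frozen: the hypervisor consults it on faults and hypercalls but never rewrites it.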
Alternatives Rejected
| Option | Why Not |
|---|---|
| Type-2 on Linux (KVM) | Requires Linux kernel as host; surrenders sovereignty to Linux security model |
| Bare-metal single OS (no hypervisor) | No isolation between services; one fault cascades to entire system |
| seL4 as hypervisor | Designed for Type-1 but heavier than RumKV's spatial-only model |
| Xen | Feature-rich but 1M+ LOC; too complex for formal reasoning |
Consequences
- Complete hardware control (no hidden host OS surface)
- Cells are truly isolated (no shared kernel state between guests)
- Can run air-gapped systems (no Internet-connected host needed)
- Must port RumKV to each hardware platform (RISC-V, ARM64, x86_64)
- Device sharing requires explicit passthrough configuration
H2: Dual-Pledge Enforcement
Status: Accepted
Context
Single-layer security enforcement creates a single point of failure. If the kernel is compromised, all bets are off. Defense-in-depth is a well-established principle, but most systems implement it as "multiple software checks" — which all fail if the software is compromised. Hardware-backed enforcement survives software compromise.
Decision
Two independent enforcement layers operating simultaneously:
Layer 1: Hard Pledges (RumKV — hardware)
- Mechanism: Stage-2 page tables + SMMU/IOMMU
- Enforcement: Immediate cell termination on violation
- Example: Cell pledges "compute only" → RumKV unmaps all network and disk controllers from Stage-2 tables. Even a compromised kernel inside that cell cannot address the network card.
Layer 2: Soft Pledges (Rumpk — software)
- Mechanism: Capability tokens, energy budgets, pledge declarations
- Enforcement: Fiber termination, resource throttling
- Example: Fiber pledges "stdio rw" → kernel denies network capability requests
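The two layers can be sketched as a toy model (not RumKV or Rumpk code; all names are hypothetical). Layer 1 behaves like hardware: a device absent from a cell's Stage-2 map cannot be addressed at all, and an attempt is fatal to the cell. Layer 2 behaves like a kernel capability check: requests are simply denied.

```python
class CellTerminated(Exception):
    """Layer-1 outcome: hardware fault, cell is killed."""

def hard_pledge_access(stage2_devices, device):
    """Layer 1 (hardware model): touching an unmapped device terminates the cell."""
    if device not in stage2_devices:
        raise CellTerminated(f"stage-2 fault on {device}")
    return True

def soft_pledge_request(capabilities, cap):
    """Layer 2 (kernel model): capability requests outside the pledge are denied."""
    return cap in capabilities

# A "compute only" cell: no devices exist in its Stage-2 tables,
# so even a compromised guest kernel has nothing to address.
compute_cell = frozenset()
```

Note the asymmetry: the soft layer returns a polite denial, while the hard layer's only answer to a violation is termination.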
The One-Way Ratchet
Both layers enforce a monotonic narrowing rule: a cell/fiber can only tighten its pledges via hypercall/syscall. It can never loosen them. A compromised workload that tries to re-enable network access gets terminated.
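The ratchet rule reduces to a subset check: a new pledge set is accepted only if it is contained in the current one. A minimal model, with hypothetical names (`PledgeSet`, `tighten`):

```python
class PledgeViolation(Exception):
    """Raised when a workload attempts to loosen its pledges."""

class PledgeSet:
    def __init__(self, pledges):
        self._pledges = frozenset(pledges)

    def tighten(self, new_pledges):
        """Accept only monotonic narrowing; any widening is a violation."""
        new = frozenset(new_pledges)
        if not new <= self._pledges:
            raise PledgeViolation("pledge loosening attempt -> terminate")
        self._pledges = new

    @property
    def pledges(self):
        return self._pledges

p = PledgeSet({"net", "disk", "stdio"})
p.tighten({"stdio"})   # narrowing: allowed
```

Because the check is a pure set comparison, it is the same rule at both layers; only the punishment differs (cell termination at EL2, fiber termination in the kernel).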
Alternatives Rejected
| Option | Why Not |
|---|---|
| Hypervisor-only enforcement | Kernel can circumvent software policies within the cell |
| Kernel-only enforcement | No protection if kernel is compromised |
| Single trusted layer | No defense-in-depth; one exploit = full access |
Consequences
- Kernel compromise doesn't enable cell breakout (hypervisor still enforces hardware boundaries)
- Individual cell compromise can't reach other cells
- Root-equivalent privilege inside a cell is still insufficient for cell escape
- Hypervisor debugging is harder (no kernel symbols, minimal logging at EL2)
- Pledges must be designed carefully to avoid false positives that terminate legitimate workloads
H3: Spatial-Only Partitioning
Status: Accepted
Context
Most hypervisors time-share CPU cores between guests (KVM, Xen). This maximizes utilization but introduces scheduling jitter — a guest may be preempted mid-operation, causing unpredictable latency. For deterministic systems (aerospace, industrial control, financial), jitter is unacceptable.
Decision
RumKV assigns cores statically at boot. No time-sharing at the hypervisor level.
Boot-time assignment (example: 8-core system):
Cell A (Rumpk primary): Cores 0-3
Cell B (NetBSD guest): Cores 4-5
Cell C (Compute cell): Cores 6-7
- Assignments are permanent for the system's lifetime
- Each cell's kernel schedules its own fibers on assigned cores
- RumKV never preempts a cell to give time to another
- Hypervisor is idle except for faults, interrupts, and hypercalls
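The scheduling rule above can be stated in a few lines: core ownership is a fixed map, and a cell may only run work on cores it was assigned at boot. This is a model of the policy, not hypervisor code; `CORE_MAP` mirrors the 8-core example above and the helper names are hypothetical.

```python
CORE_MAP = {                 # fixed at boot, never rewritten
    "cell_a": (0, 1, 2, 3),  # Rumpk primary
    "cell_b": (4, 5),        # NetBSD guest
    "cell_c": (6, 7),        # Compute cell
}

def owner_of(core):
    """Every core has exactly one owning cell for the system's lifetime."""
    for cell, cores in CORE_MAP.items():
        if core in cores:
            return cell
    raise KeyError(core)

def may_run(cell, core):
    """A cell schedules its own fibers, but only on its assigned cores."""
    return owner_of(core) == cell
```

Since `CORE_MAP` is never mutated, there is no hypervisor scheduler to preempt anyone; each cell's kernel handles time-sharing internally on its own cores.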
Alternatives Rejected
| Option | Why Not |
|---|---|
| Oversubscription (KVM-style) | Flexible but nondeterministic; context switch latency at hypervisor level |
| Dynamic load balancing | Requires hypervisor involvement every few ms; violates Silence Doctrine |
| Asymmetric (some shared, some static) | Hard to reason about; partial determinism is worse than none |
Consequences
- Deterministic scheduling: No hypervisor preemption jitter. Period.
- Cell A can't steal cores from Cell B (spatial isolation = temporal isolation)
- Hypervisor is mostly idle (minimal power drain, minimal attack surface)
- Unused cores in idle cells are wasted (cannot help overloaded cells)
- Core allocation must be planned at deployment time (no runtime adjustment)
- Total core count limits the number of cells (no overcommit)
When Spatial-Only Is Not Enough
For dynamic cloud workloads where utilization matters more than determinism, Nexus Fleet profiles use a different strategy: multiple Rumpk cells with RumKV spatial partitioning, but with the Kinetic Economy providing soft resource sharing within each cell's core allocation. This gives deterministic inter-cell boundaries with flexible intra-cell scheduling.