2026-06-23

Measuring energy at the silicon

RAPL on x86, NVML on NVIDIA, IOReport on Apple Silicon. How we get the joules, and where we estimate.

Joule pricing only works if the joules are real. Here's how we get them.

The three sources of truth

Modern silicon publishes its own energy. We read it. Three counters cover the silicon we run on:

Silicon	Counter	What it reports	Precision*
Intel x86, AMD x86	RAPL (Intel MSRs 0x611 and 0x619; AMD exposes its equivalents at different MSR addresses)	Per-package and per-DRAM energy at microjoule-scale granularity	~5%
NVIDIA GPU	NVML `nvmlDeviceGetPowerUsage`	Per-device instantaneous power in mW; integrated over time = joules	~10%
Apple Silicon	IOReport `IOReportCreateSubscription`	Per-subsystem (CPU, GPU, ANE) energy in nanojoules	~3%

*Precision = the typical disagreement between the counter and an external power meter at the wall, over a 60-second sustained workload. RAPL has been studied extensively; the literature (Khan et al. 2018, Desrochers et al. 2016) puts it at 1-7%. NVML is looser because it samples; we integrate over short windows to keep error bounded. IOReport is the tightest because Apple's hardware design is the most integrated.
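
To make the "integrated over time = joules" step concrete, here is a minimal Rust sketch of turning counter readings into joules. It is illustrative, not the middleware's code: `PowerSample`, `integrate_joules`, and `rapl_counts_to_joules` are hypothetical names, trapezoidal integration is one reasonable choice among several, and the RAPL conversion assumes the usual scheme in which one raw count equals 1/2^ESU joules, with ESU read from the power-unit MSR.

```rust
/// One power reading: a timestamp in seconds and power in milliwatts
/// (milliwatts is the unit `nvmlDeviceGetPowerUsage` reports).
/// Hypothetical type, for illustration only.
#[derive(Clone, Copy, Debug)]
pub struct PowerSample {
    pub t_secs: f64,
    pub power_mw: f64,
}

/// Trapezoidal integration of sampled power over time; returns joules.
/// Fewer than two samples yields 0.0, since no interval is covered.
pub fn integrate_joules(samples: &[PowerSample]) -> f64 {
    samples
        .windows(2)
        .map(|w| {
            let dt = w[1].t_secs - w[0].t_secs;
            // Mean power across the interval, converted from mW to W.
            let mean_watts = (w[0].power_mw + w[1].power_mw) / 2.0 / 1000.0;
            mean_watts * dt
        })
        .sum()
}

/// Convert a raw RAPL counter delta to joules.
/// `esu` is the "energy status units" exponent; one count is 1 / 2^esu J.
/// Assumes `esu` < 64, which holds for the 5-bit hardware field.
pub fn rapl_counts_to_joules(counts: u64, esu: u32) -> f64 {
    counts as f64 / (1u64 << esu) as f64
}
```

For example, samples of 100 W at t = 0 s followed by 200 W at t = 0.5 s and t = 1.0 s integrate to 175 J. Shorter sampling windows shrink the gap between the trapezoid and the true power curve, which is why the window length matters for sampled sources.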

The middleware

Every request hits a thin Rust middleware before reaching your handler. The middleware:

Reads the relevant counter(s) for the cores/devices the request will touch (cgroup boundary).
Records start values.
Invokes your handler.
Reads end values; delta = total energy consumed.
Multiplies the delta by the data centre's published PUE to account for cooling and networking overhead.
Writes the result into the response header and the receipt.
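
The steps above can be sketched in a few lines of Rust. This is a simplified illustration under stated assumptions, not the published middleware: the `EnergyCounter` trait, `delta_uj`, `measure`, and `Measurement` are hypothetical names, the counter is assumed to report microjoules and to wrap at most once per request, and writing the header and receipt is left out.

```rust
/// Hypothetical abstraction over one hardware energy counter.
pub trait EnergyCounter {
    /// Current counter value in microjoules.
    fn read_uj(&mut self) -> u64;
    /// The counter counts modulo this value (e.g. 2^32 for a 32-bit counter).
    fn wrap_modulus(&self) -> u64;
}

/// Energy consumed between two readings, tolerating a single wraparound.
/// Assumes both readings are below `modulus`.
pub fn delta_uj(start: u64, end: u64, modulus: u64) -> u64 {
    if end >= start {
        end - start
    } else {
        // The counter wrapped once between the two reads.
        modulus - start + end
    }
}

/// Result of measuring one request.
#[derive(Debug, Clone, Copy)]
pub struct Measurement {
    /// Energy read from the counter, in joules.
    pub it_joules: f64,
    /// Counter energy scaled by the data centre's PUE, in joules.
    pub facility_joules: f64,
}

/// Read start value, run the handler, read end value, scale by PUE.
pub fn measure<C, F, T>(counter: &mut C, pue: f64, handler: F) -> (T, Measurement)
where
    C: EnergyCounter,
    F: FnOnce() -> T,
{
    let start = counter.read_uj();
    let output = handler();
    let end = counter.read_uj();
    let it_joules = delta_uj(start, end, counter.wrap_modulus()) as f64 / 1e6;
    let measurement = Measurement {
        it_joules,
        facility_joules: it_joules * pue,
    };
    (output, measurement)
}
```

Taking the handler as a closure keeps the two reads as close to the handler's execution as possible, so little unrelated work falls inside the measured window.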

Per-request overhead is under 50 µs. The energy of the middleware itself (mostly memory loads) is accounted for separately as "platform overhead": it is visible in aggregate stats but not charged per request, so the customer doesn't pay for our measurement.

The two cases where we estimate

Hardware counters aren't universal. Two cases where we fall back:

Multi-tenant hosts where cgroup-level RAPL isn't reliable. Some kernels expose RAPL only at the socket level, not per-core. For these, we measure socket energy and multiply it by your request's share of the CPU-time on that socket. This is honest, but it assigns part of the socket's idle floor to your workload; we treat that as your share of the silicon's existence even when idle. The receipt marks these as `method_confidence: "measured-proportional"`.
Silicon where no counter exists. This covers some ARM SBCs and some older platforms. We estimate energy as published TDP × utilisation × duration and mark the receipt `method_confidence: "derived"`. These estimates can be off by roughly 20-30%, so we never charge derived-method requests at more than 80% of the measured-method rate, which keeps the customer net-positive on the error.
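
Both fallbacks reduce to short arithmetic, sketched here in Rust. The function names and parameters are hypothetical, the clamping is a defensive choice of this sketch, and the 80% cap is modelled as a ceiling on the per-joule rate, which is one reading of the rule above.

```rust
/// "measured-proportional": socket energy apportioned by CPU-time share.
/// `total_cpu_secs` is the CPU-time used by all tenants on the socket
/// during the measurement window (an assumption of this sketch).
pub fn proportional_joules(
    socket_joules: f64,
    request_cpu_secs: f64,
    total_cpu_secs: f64,
) -> f64 {
    if total_cpu_secs <= 0.0 {
        return 0.0;
    }
    let share = (request_cpu_secs / total_cpu_secs).clamp(0.0, 1.0);
    socket_joules * share
}

/// "derived": published TDP (watts) x utilisation (0..=1) x duration (s).
pub fn derived_joules(tdp_watts: f64, utilisation: f64, duration_secs: f64) -> f64 {
    tdp_watts * utilisation.clamp(0.0, 1.0) * duration_secs
}

/// Cap the per-joule rate for derived-method requests at 80% of the
/// measured-method rate.
pub fn capped_derived_rate(proposed_rate: f64, measured_rate: f64) -> f64 {
    proposed_rate.min(0.8 * measured_rate)
}
```

As a worked case, a socket that used 100 J while your request took 2 s of the 8 s of CPU-time consumed on it would be attributed 25 J. A 15 W TDP part at 50% utilisation for 2 s would be estimated at 15 J.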

You can read which method was used per request: response header `X-Energy-Method`, receipt field `energy.method`.
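
On the client side, the header value can be mapped to a small enum. The sketch below is hedged accordingly: `EnergyMethod` and its functions are hypothetical, the strings `"measured-proportional"` and `"derived"` are the labels quoted above, and both the `"measured"` label for the direct-counter case and the idea that the header carries these exact strings are assumptions.

```rust
/// How the energy figure for a request was obtained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnergyMethod {
    /// Read directly from a hardware counter (label assumed).
    Measured,
    /// Socket energy apportioned by CPU-time share.
    MeasuredProportional,
    /// Estimated from published TDP and utilisation.
    Derived,
}

impl EnergyMethod {
    /// Parse an `X-Energy-Method` header value; unknown values yield None.
    pub fn parse(value: &str) -> Option<Self> {
        match value.trim() {
            "measured" => Some(Self::Measured), // assumed label
            "measured-proportional" => Some(Self::MeasuredProportional),
            "derived" => Some(Self::Derived),
            _ => None,
        }
    }

    /// True when the figure came from one of the two fallback paths.
    pub fn is_fallback(self) -> bool {
        self != Self::Measured
    }
}
```

Returning `None` for unrecognised values lets a client treat a new or misspelled method as unknown rather than silently classing it as measured.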

What we don't measure

We're explicit about scope:

Network energy is not in the per-request joules. The data centre's PUE captures aggregate networking energy as overhead; the per-link joules of moving your packet through five switches are not split out.
Manufacturing (embodied) energy isn't in the per-request number. The lifecycle energy of the silicon (fab, transport, decommissioning) is amortised in the data centre's published PUE-equivalent if the operator reports it; otherwise it is excluded.
Your client device's energy isn't ours to count.

The receipt clearly labels what's measured vs. what's amortised. An auditor can challenge any line.

Why we make the methodology public

Because energy claims become political instantly. We'd rather publish our methodology and have it argued with than make claims people can't verify. The Rust middleware is source-available at git.openie.sh in `invisible-infrastructure/crates/energy-meter/`. The `jc receipt verify` CLI lets anyone, anywhere, validate that a signed receipt was produced by the public middleware against the published methodology.

If you've found a measurement bug, write to [email protected]. If you've found a methodology improvement, the source-available repo accepts PRs.