Writing

Gen 5 NVMe's are fast

EC2 disk speed benchmarks.

I wrote a quick benchmark over the weekend to piece the parts together (implementation details dont matter much for this sake) - grafana and a few .tf files to spin up and config the EC2's.

https://github.com/1rvyn/irvyn-puffer

Its quite self explanatory, my motivation behind it

Architecture

My laptop triggers the run, two "writers" stream live 8-channel "audio" (just ffmpeg generated audio), and the target EC2 instance does the ingest and flush work (100ms).

Benchmark architecture overview

That layout is what let me compare instance types without hiding the write path behind extra layers. The target instance and the writer nodes are doing the real work; everything else is just orchestration and observability.

32-conccurent audio streams run

Yellow = ? guess Green = ?

The p95 means added latency PER chunk of 100ms getting wrote to disk - in a realworld scenario its likely more total delay since you would often upload 100ms chunks to a WS then recieve text response from external STT service (so its quite crucial we keep the flush_p95 low)

At the end of the day this is a synthetic benchmark, not too accurate and just looks to measure the "latency" around writing 100ms chunks when conccurency is high. Single instance monoliths can stream a LOT of audio in reality...

32-channel benchmark dashboard

16 run

16-channel benchmark dashboard

The i7i and t3 are relatively close on lower runs (still a decent delta)

Local NVMe sanity check

Motivation for this (this is from the recently released apple M5 disks - the extremely expensive 8TB option has more chips on its NVMe so is the only variant capable - also a few thousand £ extra...)

MacBook Pro local NVMe result

AWS storage comparison

A post from a Planetscale employee who motivated me to run this benchmark:

R7i gp3 versus i7i NVMe throughput comparison

Instance details

i7i.large instance pricing and shape

i7i family storage limits and bandwidth table