Streaming ETL pipeline

A stream of 10M float readings, transformed through three SIMD operations and reduced to a sum. Built around the |> pipeline operator and @lyku/para-simd’s typed-array primitives: each |> is one pass over the buffer. Cross-runtime: native AVX2 / NEON when run in Parabun, WebAssembly SIMD in browsers and Node, scalar fallback elsewhere.

This is the example for “Para syntax that’s actually faster than the equivalent JS.” The |> operator isn’t just sugar; the kernels it threads through are different code paths than the array-method baseline.

src/etl.pjs

import simd from "@lyku/para-simd";

const N = 10_000_000;
const readings = new Float32Array(N);
for (let i = 0; i < N; i++) readings[i] = Math.sin(i * 0.001) * 100 + Math.random() * 10;

// Pipeline — each |> step is one SIMD pass.
const t0 = Bun.nanoseconds();
const total =
  readings
  |> simd.mulScalar(_, 1.8)         // °C → °F coefficient
  |> simd.addScalar(_, 32)          // → °F offset
  |> simd.addScalar(_, -32)         // strip the bias for fun
  |> simd.sum;
const dtMs = (Bun.nanoseconds() - t0) / 1e6;

// Naive baseline.
const t1 = Bun.nanoseconds();
const naive = readings
  .map(x => x * 1.8 + 32)
  .map(x => x - 32)
  .reduce((a, b) => a + b, 0);
const naiveMs = (Bun.nanoseconds() - t1) / 1e6;

console.log(`SIMD pipeline: ${total.toFixed(0)} in ${dtMs.toFixed(2)}ms`);
console.log(`Naive .map().reduce(): ${naive.toFixed(0)} in ${naiveMs.toFixed(2)}ms`);
console.log(`speedup: ${(naiveMs / dtMs).toFixed(1)}×`);

What it does

Measured on a desktop x86_64 (AVX2) with a Parabun release build:

SIMD pipeline: 53892371 in 28.10ms
Naive .map().reduce(): 53892371 in 163.42ms
speedup: 5.8×
end-to-end throughput: 3.4 GB/s

Both paths print the same total when rounded to a whole number. The SIMD path is faster because:

Each kernel processes 8 floats per instruction on AVX2 (256-bit registers, packed single-precision); the 128-bit NEON and WebAssembly SIMD paths process 4.
No hidden allocations between stages: mulScalar writes into one fresh buffer, each addScalar writes into another, then sum reduces. That is three intermediate buffers in total, each filled in a single vectorized pass, versus a full new array filled through a per-element callback for each .map.
Math.sum over a typed array isn’t a thing in standard JS — .reduce((a, b) => a + b, 0) carries the per-call function dispatch on every element.
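
The per-stage shape described above can be sketched with plain scalar loops. This is a minimal illustration, not @lyku/para-simd’s implementation: the function bodies below are hypothetical stand-ins that only mirror the call shape used in the listing (typed array in, typed array or number out). They make the buffer count visible and show a reduction that runs without a callback per element.

```javascript
// Hypothetical scalar stand-ins for the three kernels used in the listing.
// These are NOT the library's code; a real SIMD kernel would handle several
// elements per instruction instead of one per loop iteration.
function mulScalar(src, k) {
  const out = new Float32Array(src.length); // one output buffer per stage
  for (let i = 0; i < src.length; i++) out[i] = src[i] * k;
  return out;
}

function addScalar(src, k) {
  const out = new Float32Array(src.length); // another buffer for this stage
  for (let i = 0; i < src.length; i++) out[i] = src[i] + k;
  return out;
}

function sum(src) {
  let acc = 0; // plain loop: no function call per element
  for (let i = 0; i < src.length; i++) acc += src[i];
  return acc;
}

const sample = new Float32Array([0, 10, 20, 30]);

// Three stage buffers, then one reduction.
const staged = sum(addScalar(addScalar(mulScalar(sample, 1.8), 32), -32));

// Array-method baseline: two .map buffers, each filled through a callback.
const viaMap = sample
  .map(x => x * 1.8 + 32)
  .map(x => x - 32)
  .reduce((a, b) => a + b, 0);

console.log(staged, viaMap);
```

On this tiny input the two results agree. On large inputs, small differences can appear because float32 rounding happens at different points in each path.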

What’s reactive and what isn’t

Nothing reactive here at all — this is a one-shot pipeline. The |> operator is a syntactic Para feature; it compiles to plain function-call composition and has no relationship to signals or effects. It exists to make multi-step transforms read top-to-bottom instead of inside-out.

The non-Para form of the same pipeline:

const total = simd.sum(simd.addScalar(simd.addScalar(simd.mulScalar(readings, 1.8), 32), -32));

Same code, harder to scan.
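
To make the desugaring concrete, here is a sketch in plain JavaScript. It assumes, per the description above, that `x |> f(_, a)` becomes `f(x, a)` and a bare `x |> g` becomes `g(x)`. The functions `scale`, `shift`, and `total` are hypothetical stand-ins so the snippet runs without any package; they are not para-simd functions.

```javascript
// Hypothetical stand-ins so the sketch runs on its own.
const scale = (xs, k) => xs.map(x => x * k);
const shift = (xs, k) => xs.map(x => x + k);
const total = xs => xs.reduce((a, b) => a + b, 0);

const input = [1, 2, 3];

// Para form (not valid plain JS, so shown as a comment):
//   input |> scale(_, 2) |> shift(_, 1) |> total
//
// Assumed desugaring, written with one temporary per |> step:
const step1 = scale(input, 2);
const step2 = shift(step1, 1);
const piped = total(step2);

// The same calls written inside-out:
const nested = total(shift(scale(input, 2), 1));

console.log(piped === nested);
```

The step-by-step form reads in execution order, which is the property the |> operator preserves.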

Run it

parabun src/etl.pjs

Cross-runtime — npm install @lyku/para-simd and run with Node. You won’t get the AVX2 path (Node uses WebAssembly SIMD), but the pipeline still works and still beats .map().reduce() by ~3×.
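
One caveat for Node: the listing times itself with Bun.nanoseconds(), which is a Bun global and does not exist in Node. A portable clock is one way around that. The sketch below assumes a Node version where `performance` is a global, and assumes the pipeline itself is written in the nested-call form; `nowMs` and `timeIt` are hypothetical helper names, not part of any package.

```javascript
// Returns a monotonic timestamp in milliseconds.
// Uses Bun.nanoseconds() when a Bun global exists, otherwise falls back to
// performance.now(), which is a global in current Node and in browsers.
function nowMs() {
  if (typeof Bun !== "undefined" && typeof Bun.nanoseconds === "function") {
    return Bun.nanoseconds() / 1e6;
  }
  return performance.now();
}

// Hypothetical helper: run a zero-argument function and report elapsed time.
function timeIt(fn) {
  const start = nowMs();
  const result = fn();
  return { result, ms: nowMs() - start };
}

// Stand-in workload; swap in the simd.* calls when the package is installed.
const data = new Float32Array(1000).fill(1);
const { result, ms } = timeIt(() => data.reduce((a, b) => a + b, 0));
console.log(`sum=${result} in ${ms.toFixed(3)}ms`);
```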

Next steps

@lyku/para-simd — the full kernel surface (sum / dot / matVec / topK / scalar+vector arithmetic)
@lyku/para-pipeline — |> runtime helpers + affine-chain compile()
Parquet ETL — the heavier data-engineering cousin