
An Intro to Zero Copy Reads, Serialization
The first time I stumbled upon the term `Zero Copy Serialization` was when I started looking into Apache Arrow. The website said:
"The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead."
That’s when I started reading more about zero copy operations, and about why they would be beneficial in high-performance data systems. This article highlights zero copy IO and compute, which typically come up when serializing and deserializing data.
Often, systems need to serialize data, whether to move it from one memory location to another, write it to disk, or send it between machines in a distributed network.
Serialization involves transforming data structures such as classes, structs, and primitive types into actual bytes, which can later be stored on a disk or moved over a network.
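To make "transforming data structures into bytes" concrete, here is a minimal sketch using Python's standard `struct` module. The `Reading` type and its two-field layout are assumptions made up for this illustration; they do not come from Arrow or any other library.

```python
import struct
from dataclasses import dataclass


# Hypothetical record type, used only for illustration.
@dataclass
class Reading:
    sensor_id: int      # stored as an unsigned 32-bit integer
    temperature: float  # stored as a 64-bit float


# "<Id": little-endian, no padding, uint32 followed by float64 (12 bytes).
LAYOUT = struct.Struct("<Id")


def serialize(r: Reading) -> bytes:
    """Turn the in-memory object into a flat byte string."""
    return LAYOUT.pack(r.sensor_id, r.temperature)


def deserialize(buf: bytes) -> Reading:
    """Rebuild a new in-memory object from the bytes."""
    sensor_id, temperature = LAYOUT.unpack(buf)
    return Reading(sensor_id, temperature)


payload = serialize(Reading(7, 21.5))
print(len(payload))          # 12
print(deserialize(payload))  # Reading(sensor_id=7, temperature=21.5)
```

The resulting `payload` can be written to a file or sent over a socket as-is. Note that `deserialize` allocates a fresh object on every call, which is the kind of cost zero copy approaches try to avoid.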
This is where serialization formats come into the picture: the way data is serialized and stored on disk costs some CPU and memory. One may choose JSON, protobufs, FlatBuffers, or plain serialization without any format.
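A rough way to see how the choice of format affects CPU and memory is to encode the same record twice: once as JSON, once as a fixed binary layout. The sketch below uses only the Python standard library. The record and its layout are hypothetical, and the fixed layout is a stand-in for the general idea of offset-based access, not the actual wire format of protobufs or FlatBuffers.

```python
import json
import struct

# Hypothetical record: (sensor_id: uint32, temperature: float64).
record = {"sensor_id": 7, "temperature": 21.5}

# Text format: self-describing, but field names travel with every record.
as_json = json.dumps(record).encode("utf-8")

# Fixed binary layout: little-endian uint32 + float64, no padding.
LAYOUT = struct.Struct("<Id")
as_binary = LAYOUT.pack(record["sensor_id"], record["temperature"])

# Decoding JSON scans the whole payload and allocates a new dict.
decoded = json.loads(as_json)

# With a fixed layout, a single field can be read at a known offset.
# The memoryview wraps the existing buffer without copying it.
view = memoryview(as_binary)
(temperature,) = struct.unpack_from("<d", view, 4)  # skip the 4-byte id

print(len(as_json), len(as_binary))
print(decoded["temperature"], temperature)
```

The JSON payload is larger and has to be parsed in full before any field is usable, while the binary layout lets a reader jump straight to the bytes it needs. That offset-based access is the property zero copy formats build on.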
For instance, when a JSON API request is parsed into a Java POJO or a Go struct, the data bytes are read into memory and transformed into native structs. No matter how small the payload is, the process of reading a JSON string bytes…