Why It's Called Serialization

Remember serial ports? They could only send data one bit at a time. 1 or 0. Binary data.

Serialization is what you do to data before putting it on the wire: make it binary. Convert it to 1s and 0s. Deserialization does the reverse, taking binary data and reinterpreting it as whatever data types you are interested in.

For example, what does it mean when a serial port tells you '01000001'? Maybe you are expecting ASCII text, so you deserialize this message as ASCII: 'A'. Or maybe you are looking for a number in the form of an 8-bit integer: 65. Or maybe you are accepting a set of True or False survey question responses, so the data becomes False, True, False, False, False, and so on. Deserialization is the point where binary data is converted into what it represents.
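Here is a rough Python sketch of those three readings of the same byte. The function name interpret_byte is made up for illustration, and it assumes the input is exactly eight '0'/'1' characters with a value that fits in ASCII (0 to 127).

```python
def interpret_byte(bits: str) -> dict:
    """Deserialize one 8-bit string in three different ways (illustrative only)."""
    value = int(bits, 2)  # read the bit string as an unsigned integer
    return {
        # Reading 1: the byte is an ASCII character (fails for values above 127)
        "ascii": bytes([value]).decode("ascii"),
        # Reading 2: the byte is an 8-bit unsigned integer
        "uint8": value,
        # Reading 3: each bit is one True/False survey answer
        "flags": [bit == "1" for bit in bits],
    }


result = interpret_byte("01000001")
print(result["ascii"])  # A
print(result["uint8"])  # 65
print(result["flags"])  # [False, True, False, False, False, False, False, True]
```

Nothing in the byte itself says which reading is the right one. The receiver has to decide, and that decision is where things can go wrong.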

This process is ripe for security bugs because:

  1. There are practically infinite ways to interpret a string of binary data.
  2. There are practically infinite strings of binary data.
  3. Apps need to interpret binary data in order to talk over the wire.
  4. Apps interpret and handle data in different ways.
  5. Serialization is boring and complicated, and "if it works, it's good enough" tends to be the only requirement. (I've been there. Too many app features to implement, not enough time to pick a secure serialization library.)
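To make points 1 and 4 a bit more concrete, here is a small sketch using Python's standard struct module. The four-byte payload is invented for the example; the idea is only that three receivers with different assumptions get three different values from identical bytes.

```python
import struct

# Hypothetical 4-byte message, exactly as it would arrive off the wire
payload = bytes([0x00, 0x00, 0x80, 0x3F])

# Receiver A assumes a little-endian unsigned 32-bit integer
as_le_uint = struct.unpack("<I", payload)[0]

# Receiver B assumes a big-endian unsigned 32-bit integer
as_be_uint = struct.unpack(">I", payload)[0]

# Receiver C assumes a little-endian 32-bit float
as_le_float = struct.unpack("<f", payload)[0]

print(as_le_uint)   # 1065353216
print(as_be_uint)   # 32831
print(as_le_float)  # 1.0
```

If the sender and the receiver disagree about which of these readings is intended, neither side sees an error. They just end up working with different data.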

About Joey Rideout

I am an Application Security professional and UW CS grad currently based in Ottawa. Committing the crime of curiosity since 2008. Submit questions or ideas for the blog to: joey.rideout@owasp.org