Alex's blog

Separation of data and control

Athena Lilith Martin

Published on

One of the most fundamental lessons we seem to keep re-learning is that protecting a system from being hijacked depends heavily on separating data, which could be from untrustworthy sources, from control information, which has the power to make the system do specific things.

We have learned this lesson in a lot of ways: phone companies learned it from phreakers, and developed Signaling System 6 and later 7 to move the control signals off the lines customers spoke over; SQL users learned it from injection attacks, and spent a lot of effort convincing each other to use bound parameters for everything; and now developers of machine learning systems for natural language processing are learning it from people simply politely asking ML systems to ignore their safeguard instructions.
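To make the SQL case concrete, here is a minimal sketch using Python's standard sqlite3 module. The users table, the function names, and the hostile input are all made up for illustration. The first function splices untrusted data into the SQL text, which is the control channel; the second keeps the SQL text fixed and sends the value separately as a bound parameter.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # In-band: the untrusted value becomes part of the SQL text itself,
    # so a quote character inside it can change what the query means.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user_bound(conn, name):
    # Out-of-band: the SQL text is a constant, and the value is handed
    # to the database separately, where it is only ever treated as data.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = make_demo_db()
hostile = "nobody' OR '1'='1"

# The spliced query matches every row; the bound query matches none,
# because no user is literally named "nobody' OR '1'='1".
leaked = find_user_unsafe(conn, hostile)
safe = find_user_bound(conn, hostile)
```

Nothing about the hostile string is special to the bound version; it is simply a strange name that nobody has.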

The unfortunate thing is, in the domain of computing, where everything is built as tremendous piles of abstractions upon abstractions upon abstractions, it becomes practically impossible to solve this problem in a general way: each layer considers as mere data what is, to the layer above, control. So each layer has to implement its own, incompatible means of separating data from control. And, frequently, we want to give someone access to the control of only some of the layers; we want people sending a message to be able to control the recipients, but at the same time, of course, we don't want them to control the cryptography that prevents them from reading other people's messages.
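A small sketch of how this plays out across two layers, with everything in it invented for illustration: a length-prefixed framing layer, and on top of it a toy message format with "To:" and "Subject:" header lines. The framing layer does its job perfectly, since the length lives in a fixed-size prefix and no payload byte can be mistaken for a frame boundary. But the payload it carries as mere data is control to the message layer, and that layer has its own in-band hole.

```python
import struct


def frame(payload):
    # Lower layer: the length travels in a 4-byte prefix, outside the payload.
    return struct.pack(">I", len(payload)) + payload


def unframe(stream):
    # Split a byte stream back into payloads using only the length prefixes.
    payloads = []
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        payloads.append(stream[offset:offset + length])
        offset += length
    return payloads


def encode_message_naive(recipient, subject, body):
    # Upper layer (hypothetical format): header lines, a blank line, the body.
    # The subject is meant to be data, but it is spliced in among control lines.
    return f"To: {recipient}\nSubject: {subject}\n\n{body}".encode()


def parse_recipients(message):
    # Every header line starting with "To: " names a recipient.
    header, _, _body = message.decode().partition("\n\n")
    return [line[4:] for line in header.split("\n") if line.startswith("To: ")]


# A subject containing a newline smuggles in a second "To:" line.
message = encode_message_naive("alice", "hi\nTo: mallory", "see you at noon")

# The framing layer round-trips the message untouched...
received = unframe(frame(message))[0]

# ...and the message layer above it is hijacked anyway.
recipients = parse_recipients(received)
```

The length prefix protects frame boundaries and nothing else. The message layer would need its own separation, for example rejecting newlines in header values or carrying the headers as a structured list, and whatever sits on top of the message layer would need yet another.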

And so, perhaps we are doomed to in-band signaling attacks forever, with no real solution within reach other than to continue trying to build walls at each layer and cross our fingers that the next layer up uses them properly.

Maybe we should think about whether we really need all those layers, after all.