Towards a new model of computational expressiveness
Since joining the cult of LangSec, I've spent a great deal of time pondering what makes a programming language or environment "useful", and I believe that the current model, centered on Turing-completeness, is no longer sufficient to describe the nuances between varied environments. The root of the problem is the huge divergence between what computers were capable of in the infancy of computing and the current state of the art. Take, for example, two Turing-complete (TC) programs: a multi-threaded web server written in C, and a simple Brainfuck routine that increments numbers and prints them in an infinite loop. Both languages and runtimes are TC and have the same expressive power under that model of computation; however, it should be clear that one is more powerful than the other (hint: the server) and should be more carefully monitored to prevent compromise.
The divergence between theoretical Turing machines and the machines designed to emulate them has grown significantly; most of the "work" modern computers (including mobile devices, etc.) perform is not computation-oriented in the way a Turing machine's is. Today, networking, graphics/audio, and interaction with the physical world (SCADA, wearables, etc.) are the key features the industry is working to improve, not the emulation gap between modern CPUs and a theoretical Turing machine. As technology has evolved from the computers of the Church-Turing era into the dubiously-named "Internet of Things" or "fog", the ability of Turing-completeness to provide meaningful insight into a program has decreased. Much of the modern research body has shown Turing-completeness "popping up" in increasingly surprising places: the Intel MMU, ELF metadata, and the BGP routing system. At that point, programs fall off what is known as the "undecidability cliff", past which the Halting Problem rules out certain kinds of formal analysis and verification (in contrast to a restricted language such as regular expressions). Once every program is TC, TC becomes a meaningless descriptor; a new dimension of computational expressiveness is needed to keep pace with the state of the art.
To remedy this inability to measure and differentiate between TC programs of differing "practical power", a new model is needed to augment the current language hierarchies and more closely portray current computing systems. There has been some work tying computation to physics, such as work by Scott Aaronson and work on modeling search problems using phase transitions, but those contributions do not provide a "unit" for computational power or expressiveness. Below I offer my off-the-cuff idea for how such a model could be formed, knowing full well that I am nowhere near qualified to propose it and that my suggestion is almost certainly highly flawed.
As computers in all their physical implementations are designed to serve a purpose (even if that purpose is esoteric or useless), I am drawn to the concept of "work" from elementary physics. Work (W) is a core measure of the impact on a (physical) system of a force (F) applied over a distance (technically a displacement, s), or W = F · s. I think the notion of work carries over to computing nicely: instructions do work on a system by changing register values, printing information to a display, writing data to disk, and so on.
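As a strictly illustrative formalization of that analogy, the physics definition and my proposed computational analogue could be written as follows; the symbols f_i and d_i are my own shorthand for an instruction's "force" and the locality distance it reaches, not established notation:

```latex
% Physical work: force applied over a displacement
W = F \cdot s

% Proposed computational analogue (notation is mine and purely illustrative):
% total work of a program = sum, over executed instructions i, of the
% instruction's "force" f_i times the locality distance d_i it reaches
W_{\mathrm{prog}} = \sum_{i} f_i \cdot d_i
```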
Kludge alert! To replace displacement (distance) in the physics definition, I propose a model of locality for modern computer system implementations, measured as "hops" from the ALU (which I am treating as the Turing-machine kernel of a modern computer system). From closest to most distant (roughly following the orders-of-magnitude increases in access time; one possible numeric weighting is sketched after the list):
- registers
- cache (all levels, or broken out more exactly)
- main memory
- device memory
- disk drive
- remote network device(s)
With these "rings of locality/distance", it is now possible to model how far from the core a program can impact the computational system (think back to the Brainfuck program vs. the C server).
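One concrete (and entirely illustrative) way to turn these rings into numbers is to weight each ring by the rough order of magnitude of its access latency relative to a register access. The values below are ballpark figures for a typical modern machine, not measurements:

```python
# Illustrative "distance" weights for the locality rings, expressed as rough
# orders of magnitude of access latency relative to a register access.
# Ballpark figures only, not measurements of any particular system.
DISTANCE = {
    "register":      1,           # ~1 cycle
    "cache":         10,          # L1-L3: a few to tens of cycles
    "main_memory":   100,         # on the order of 100 cycles
    "device_memory": 1_000,       # PCIe round trip
    "disk":          100_000,     # SSD access; far worse for spinning disks
    "network":       10_000_000,  # LAN/WAN round trip
}
```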
Force is a much more difficult notion to translate (I welcome suggestions!). It is clear that a `NOP` instruction is less "forceful" than an `ADD` instruction, though capturing that in a clean fashion is challenging. Some ideas include tying force to clock cycles or to power usage relative to the `NOP` instruction. In this way, it is possible to show that an instruction that initiates a series of vector math operations in the SSE logic is more forceful than an `INC EAX`. The same definition can be used for more distant computations, whether on a device or on a remote system. It also becomes extremely difficult to model a single instruction with this definition, so modeling a whole program, or working at the basic-block level, is perhaps more tractable.
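To make the "force" idea concrete, here is one possible, very hand-wavy weighting: each instruction class gets a force proportional to its approximate cycle/energy cost relative to `NOP`. The relative values are illustrative guesses, not numbers for any real microarchitecture:

```python
# Hypothetical "force" weights, normalized so that NOP = 1.
# The relative values are illustrative guesses based on rough cycle/energy
# cost; they are not measurements of any particular microarchitecture.
FORCE = {
    "NOP":        1,    # baseline: occupies a slot, changes no architectural state
    "INC":        2,    # simple ALU op, but it does alter a register
    "ADD":        2,
    "MUL":        4,
    "DIV":        20,
    "SSE_VECTOR": 8,    # packed vector math across wide registers
    "SYSCALL":    100,  # crosses the kernel boundary
}

def block_force(mnemonics):
    """Sum the assumed 'force' of a basic block given as a list of mnemonics."""
    return sum(FORCE.get(op, 1) for op in mnemonics)

# e.g. block_force(["INC", "SSE_VECTOR"]) > block_force(["NOP", "INC"])
```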
Now, with this (preliminary) model of computational work, the two TC programs above clearly differ in their ability to perform work on the computational system. The C server can spawn numerous threads, each executing more complex instructions that exert greater force on the system and effect computation on remote systems (the greatest distance), whereas the Brainfuck program can perform only small-force instructions at a small distance from the ALU.
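Under the toy weights above, a back-of-the-envelope comparison of the two programs might look like the sketch below; the instruction traces and all numbers are invented purely for illustration:

```python
# Back-of-the-envelope "work" comparison under the toy model above.
# Work per instruction = assumed force * distance of the furthest ring it touches.
# All weights and traces are invented for illustration.
FORCE    = {"NOP": 1, "INC": 2, "ADD": 2, "SSE_VECTOR": 8, "SYSCALL": 100}
DISTANCE = {"register": 1, "main_memory": 100, "network": 10_000_000}

def work(trace):
    """trace: list of (mnemonic, furthest locality ring touched)."""
    return sum(FORCE[op] * DISTANCE[ring] for op, ring in trace)

# One Brainfuck-style loop iteration: increment a cell, write it to memory-backed stdout.
brainfuck_iteration = [("INC", "register"), ("ADD", "main_memory")]

# One (grossly simplified) request handled by the C server: compute, then send a response.
server_request = [("SSE_VECTOR", "main_memory"), ("SYSCALL", "network")]

print(work(brainfuck_iteration))  # small: the work stays near the ALU
print(work(server_request))       # many orders of magnitude larger: the work reaches the network
```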
In conclusion, I think that a new dimension of computational expressiveness has arisen in modern computing and is not adequately captured by the traditional Turing-completeness model. The model proposed above is merely a jumping-off point to spawn a discussion in this area, as I believe this is a needed development that will help bridge the gap between the theoretical models of Turing machines and their physical implementations. LangSec has shown that security problems arise from gaps between the theory/design and the implementation, so a model like this is needed to formally define what it is a program does or can do. Once every program (or the hardware executing that program) is TC, TC becomes a relic.
P.S.: To tie these musings back to LangSec and security: with such a model, exploits are merely doing work outside the a priori decided bounds, whereas a bug is a region of code (or a program) that is overly empowered to exert force at a distance. With all that said, I welcome feedback, pointers to papers that did this exact thing years ago, or well-mannered flames.
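One way to read "work outside the a priori decided bounds" in code: declare, per program, the furthest locality ring it is ever expected to reach, and flag any observed work beyond it. The sketch below is purely illustrative of that framing, not a real enforcement mechanism:

```python
# Toy policy check for the "work outside a priori bounds" framing.
# RING_ORDER and the example policies are illustrative only.
RING_ORDER = ["register", "cache", "main_memory", "device_memory", "disk", "network"]

def violates_bounds(observed_rings, allowed_furthest_ring):
    """True if the program did work at a ring beyond what its policy permits."""
    limit = RING_ORDER.index(allowed_furthest_ring)
    return any(RING_ORDER.index(r) > limit for r in observed_rings)

# The Brainfuck printer should never reach past main memory...
print(violates_bounds({"register", "main_memory"}, "main_memory"))             # False
# ...so an observed network write would be work outside its decided bounds.
print(violates_bounds({"register", "main_memory", "network"}, "main_memory"))  # True
```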