Ramblings & ephemera

A very brief history of programming

From Brian Hayes’ “The Post-OOP Paradigm”:

The architects of the earliest computer systems gave little thought to software. (The very word was still a decade in the future.) Building the machine itself was the serious intellectual challenge; converting mathematical formulas into program statements looked like a routine clerical task. The awful truth came out soon enough. Maurice V. Wilkes, who wrote what may have been the first working computer program, had his personal epiphany in 1949, when “the realization came over me with full force that a good part of the remainder of my life was going to be spent in finding errors in my own programs.” Half a century later, we’re still debugging.

The very first programs were written in pure binary notation: Both data and instructions had to be encoded in long, featureless strings of 1s and 0s. Moreover, it was up to the programmer to keep track of where everything was stored in the machine’s memory. Before you could call a subroutine, you had to calculate its address.

The technology that lifted these burdens from the programmer was assembly language, in which raw binary codes were replaced by symbols such as load, store, add, sub. The symbols were translated into binary by a program called an assembler, which also calculated addresses. This was the first of many instances in which the computer was recruited to help with its own programming.
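To make the assembler's two jobs concrete, here is a minimal sketch in Python. The machine is hypothetical: the 4-bit opcodes, the 8-bit word size, the `data` directive and the `label:` syntax are all assumptions made for illustration, not the encoding of any real computer. The first pass calculates the address of each label; the second replaces symbols with binary.

```python
# Toy two-pass assembler for a hypothetical 8-bit machine:
# each instruction is a 4-bit opcode followed by a 4-bit operand address.
OPCODES = {"halt": 0b0000, "load": 0b0001, "store": 0b0010,
           "add": 0b0011, "sub": 0b0100}


def assemble(source):
    """Translate symbolic source lines into a list of 8-bit binary strings."""
    lines = [ln.split() for ln in source.splitlines() if ln.strip()]

    # Pass 1: record the address of every label (a leading token ending in ':').
    labels = {}
    instructions = []
    for tokens in lines:
        if tokens[0].endswith(":"):
            labels[tokens[0][:-1]] = len(instructions)
            tokens = tokens[1:]
        instructions.append(tokens)

    # Pass 2: replace mnemonics with opcodes and label names with addresses.
    words = []
    for tokens in instructions:
        if tokens[0] == "data":
            words.append(format(int(tokens[1]), "08b"))
        else:
            opcode = OPCODES[tokens[0]]
            operand = labels[tokens[1]] if len(tokens) > 1 else 0
            words.append(format(opcode << 4 | operand, "08b"))
    return words


PROGRAM = """
    load x
    add y
    store total
    halt
    x: data 5
    y: data 7
    total: data 0
"""

if __name__ == "__main__":
    for address, word in enumerate(assemble(PROGRAM)):
        print(address, word)
```

The programmer writes `load x` and never needs to know that `x` ends up at address 4; the assembler works that out and emits `00010100`.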

Assembly language was a crucial early advance, but still the programmer had to keep in mind all the minutiae in the instruction set of a specific computer. Evaluating a short mathematical expression such as x²+y² might require dozens of assembly-language instructions. Higher-level languages freed the programmer to think in terms of variables and equations rather than registers and addresses. In Fortran, for example, x²+y² would be written simply as X**2+Y**2. Expressions of this kind are translated into binary form by a program called a compiler.
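As a rough illustration of what a compiler does with such an expression, the sketch below flattens it into a list of low-level instructions. It borrows Python's standard `ast` module as the parser, and the instruction names (`load`, `push`, `pow`, `add` and so on) belong to a hypothetical stack machine invented for this example; a real compiler would go on to emit binary code for real hardware.

```python
import ast

# Instruction names for a hypothetical stack machine (assumed for illustration).
OPERATIONS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul", ast.Pow: "pow"}


def compile_expr(source):
    """Flatten an arithmetic expression into stack-machine instructions."""
    code = []

    def emit(node):
        if isinstance(node, ast.BinOp):
            # Operands first, then the operator (postfix order).
            emit(node.left)
            emit(node.right)
            code.append(OPERATIONS[type(node.op)])
        elif isinstance(node, ast.Name):
            code.append(f"load {node.id}")
        elif isinstance(node, ast.Constant):
            code.append(f"push {node.value}")
        else:
            raise ValueError(f"unsupported syntax: {type(node).__name__}")

    emit(ast.parse(source, mode="eval").body)
    return code


if __name__ == "__main__":
    for instruction in compile_expr("x**2 + y**2"):
        print(instruction)
```

One line of source becomes seven instructions even on this idealized machine, which hints at why hand-written assembly for the same expression was so tedious.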

… By the 1960s large software projects were notorious for being late, overbudget and buggy; soon came the appalling news that the cost of software was overtaking that of hardware. Frederick P. Brooks, Jr., who managed the OS/360 software program at IBM, called large-system programming a “tar pit” and remarked, “Everyone seems to have been surprised by the stickiness of the problem.”

One response to this crisis was structured programming, a reform movement whose manifesto was Edsger W. Dijkstra’s brief letter to the editor titled “Go to statement considered harmful.” Structured programs were to be built out of subunits that have a single entrance point and a single exit (eschewing the goto command, which allows jumps into or out of the middle of a routine). Three such constructs were recommended: sequencing (do A, then B, then C), alternation (either do A or do B) and iteration (repeat A until some condition is satisfied). Corrado Böhm and Giuseppe Jacopini proved that these three idioms are sufficient to express essentially all programs.
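The three constructs can be seen together in a few lines. The function below is an arbitrary example chosen for illustration (counting the steps of the Collatz procedure), not one taken from the article; what matters is that it uses only sequencing, alternation and iteration, and that control enters at the top and leaves at the bottom.

```python
def collatz_steps(n):
    """Count how many steps the Collatz procedure takes to reach 1."""
    steps = 0                 # sequencing: statements run one after another
    while n != 1:             # iteration: repeat until the condition is met
        if n % 2 == 0:        # alternation: do one thing or the other
            n = n // 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps              # the single exit point


if __name__ == "__main__":
    print(collatz_steps(6))
```

Starting from 6 the values run 3, 10, 5, 16, 8, 4, 2, 1, so the function returns 8.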

Structured programming came packaged with a number of related principles and imperatives. Top-down design and stepwise refinement urged the programmer to set forth the broad outlines of a procedure first and only later fill in the details. Modularity called for self-contained units with simple interfaces between them. Encapsulation, or data hiding, required that the internal workings of a module be kept private, so that later changes to the module would not affect other areas of the program. All of these ideas have proved their worth and remain a part of software practice today. But they did not rescue programmers from the tar pit.

Object-oriented programming addresses these issues by packing both data and procedures—both nouns and verbs—into a single object. An object named triangle would have inside it some data structure representing a three-sided shape, but it would also include the procedures (called methods in this context) for acting on the data. To rotate a triangle, you send a message to the triangle object, telling it to rotate itself. Sending and receiving messages is the only way objects communicate with one another; outsiders are not allowed direct access to the data. Because only the object’s own methods know about the internal data structures, it’s easier to keep them in sync.
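A minimal sketch of such a triangle object in Python follows. The details are assumptions made for illustration: the vertices are stored as (x, y) pairs, rotation is about the origin, and the leading underscore is only Python's convention for marking data as private, since the language does not enforce it.

```python
import math


class Triangle:
    """A three-sided shape that carries its own data and methods."""

    def __init__(self, a, b, c):
        # Internal data; by convention outsiders do not touch this directly.
        self._vertices = [a, b, c]

    def rotate(self, degrees):
        """Rotate the triangle counterclockwise about the origin."""
        angle = math.radians(degrees)
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        self._vertices = [
            (x * cos_a - y * sin_a, x * sin_a + y * cos_a)
            for x, y in self._vertices
        ]

    def vertices(self):
        """Return a copy of the vertices, so callers cannot alter the original."""
        return tuple(self._vertices)


if __name__ == "__main__":
    triangle = Triangle((1, 0), (0, 1), (0, 0))
    triangle.rotate(90)  # the "message" telling the triangle to rotate itself
    print(triangle.vertices())
```

The call `triangle.rotate(90)` plays the role of the message: the caller says what should happen, and only the object's own method touches the stored coordinates.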

You define the class triangle just once; individual triangles are created as instances of the class. A mechanism called inheritance takes this idea a step further. You might define a more-general class polygon, which would have triangle as a subclass, along with other subclasses such as quadrilateral, pentagon and hexagon. Some methods would be common to all polygons; one example is the calculation of perimeter, which can be done by adding the lengths of the sides, no matter how many sides there are. If you define the method calculate-perimeter in the class polygon, all the subclasses inherit this code.
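The polygon example might look like this in Python. This is a sketch under assumptions: vertices are (x, y) pairs listed in order around the shape, and the method is spelled `calculate_perimeter` because Python identifiers cannot contain hyphens.

```python
import math


class Polygon:
    """General class: any closed shape described by its vertices in order."""

    def __init__(self, *vertices):
        self.vertices = list(vertices)

    def calculate_perimeter(self):
        """Add the lengths of the sides, however many sides there are."""
        count = len(self.vertices)
        return sum(
            math.dist(self.vertices[i], self.vertices[(i + 1) % count])
            for i in range(count)
        )


class Triangle(Polygon):
    """Subclass: inherits calculate_perimeter without redefining it."""


class Quadrilateral(Polygon):
    """Subclass: inherits the same method."""


if __name__ == "__main__":
    print(Triangle((0, 0), (3, 0), (0, 4)).calculate_perimeter())
    print(Quadrilateral((0, 0), (1, 0), (1, 1), (0, 1)).calculate_perimeter())
```

Neither subclass contains any perimeter code of its own, yet both calls work: the 3-4-5 triangle reports 12.0 and the unit square 4.0.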
