Graphical Programming: I Really Hope This Is the Future
Since March 2013, your author has been working on a software project with a fervor not previously experienced by said author. This work has not been a never-ending drive in a certain direction with a clear vision; rather, I’ve been engaging in copious exploratory programming. Over spring break, I got a working pre-alpha, and I thought the basic architecture was fairly solid and wouldn’t change much. Since then, every aspect of the stack has been swapped out for something new at least once. All the data structures are different, the important program logic is solving a completely different problem than it was, and the interface has gone through iteration after iteration of paradigm shift.
One thing, however, has remained.
The purpose of this project is to help the coder develop more complex code faster and more easily by clearly exposing the structure of the code.
To explain the solution I propose, I will first explain the particular problem I see.
ASCII Is Awesome, Just Not That Awesome
In the Good Old Days(tm), people wrote programs in binary, and the great masters of the era were those who achieved the greatest oneness with the computer. By which I mean, the interface between humans and computers followed a very simple rule: make something the computers can understand, and then teach the humans to speak the language. This system worked rather well because humans have an amazing ability to learn even the most abstract skills remarkably well.
In time, though, with the advent of assemblers, and later compilers and interpreters, we moved the frontier between humans and computers. Now, the computer does much more work to try to understand what we mean; likewise, humans do much less work to try to explain what we mean. As abstraction is layered on abstraction, humans can use simpler statements to describe more complex programs than ever before.
However, and this is the meat of my criticism of the state of programming today, our interface for programming is still based on text.
Text is awesome. I much prefer the terminal over GUIs. I write documents in LaTeX using vim and avoid Microsoft Word and LibreOffice Writer like the plague because they’re just such a pain to work with. Nearly without exception, if I’m given the option to write something in plain text rather than “visually” or in “formatted” or “rich” text, then I’ll write in plain text, even if that means throwing in a little code. For a combinatorics problem I’ve been working on recently I needed to draw a series of graphs, so what did I use? Tikz in LaTeX. I find that almost without exception, GUIs (1) do not allow the kind of manipulation I want, (2) hide important information, and/or (3) are simply frustrating to use.
Is this a problem inherent in graphical user interfaces, though? Is text really the best interface possible to interact and communicate with computers?
I submit that it is not.
GUIs have been misused.
A Defense of Graphical User Interfaces
Why do I use text programs rather than graphical programs? As outlined above, there are essentially three reasons.
The first reason, that GUIs do not allow the kind of manipulation I want, is the most important issue. The terminal is awesome because of its composability. Basically, the Unix pipe and backticks allow the output of any command to be used as the input to any other command, and this allows the user to take relatively simple commands and mold them together into a one-liner that solves exactly his or her problem. When a new command is added, it integrates perfectly with the other commands, so it immediately gains all the power contained in the other commands. This allows the user to learn one set of commands and everything else just plugs right in.
I use vim for everything from configuring programs to note taking to todo lists to writing math papers to programming. I even use vim macros as a sort of scripting language if I’m trying to do something that’s a little too complicated for sed. Regular expressions make manipulating many different kinds of data — anything from file lists to code to csv files — surprisingly easy.
I like LaTeX better than Word not because I can use a plain text interface. I like it better because I can program in it (either directly or by writing a program that spits out LaTeX code). And being able to program in it gives the interface more expressive power. It is easier to tell the computer what I want it to do.
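To make "writing a program that spits out LaTeX code" concrete, here is a minimal sketch in Python. It is not code from my project; the function name `cycle_graph_tikz`, the node style, and the choice of a cycle graph are all assumptions made for illustration. It prints a TikZ picture that could be pasted into a LaTeX document that loads the tikz package.

```python
import math


def cycle_graph_tikz(n, radius=2.0):
    """Return TikZ source for a cycle graph on n vertices placed on a circle.

    Hypothetical helper for illustration; assumes n >= 3.
    """
    lines = [r"\begin{tikzpicture}"]
    # One node per vertex, evenly spaced around a circle.
    for i in range(n):
        angle = 2 * math.pi * i / n
        x = radius * math.cos(angle)
        y = radius * math.sin(angle)
        lines.append(rf"  \node[circle, draw] (v{i}) at ({x:.2f}, {y:.2f}) {{{i}}};")
    # One edge from each vertex to the next, wrapping around at the end.
    for i in range(n):
        lines.append(rf"  \draw (v{i}) -- (v{(i + 1) % n});")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)


print(cycle_graph_tikz(5))
```

Changing the argument from 5 to 50 costs nothing, which is exactly the kind of leverage a point-and-click drawing tool does not give me.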
Text interfaces in general usually exhibit this kind of programmability. GUIs, as a rule, do not.
This problem, however, is solvable. Inherently, I don’t think it’s any easier to make a text interface programmable than it is to make a graphical interface programmable. However, since the text tools are already available, it is easier in practice to create a programmable text program. The solution for GUIs, then, is to create the tools necessary to program them. That’s a really vague description, but that’s because this is the hardest problem. I will describe my solution in detail in another post.
Show Me Everything
WYSIWYG editors are basically evil. I don’t want you to hide all the complicated formatting stuff in the background. I don’t trust you that much. One of the most frustrating experiences is, in Microsoft Word, to try to put a picture exactly where you want it. There usually is a way to make it work, but it is so needlessly complicated that I get insanely frustrated. When I drag this picture right here, why do you put it over there? The problem is not so much the putting it over there as it is the not telling me why. Please, if I’m doing something that doesn’t work in your framework, then let me know.
In LaTeX, every little bit of formatting is eminently visible. If there’s a problem, then at least I know exactly what the inputs are that give me the incorrect results, so I can know what to try to change to fix it.
However, this is clearly just a misuse of GUIs. Is there any inherent reason why GUIs must show less information than text interfaces? No. In fact, I would argue that they ought to be able to show more. This is a problem with current GUIs, but this is very solvable.
The third reason is that GUIs are simply frustrating to use. GUIs generally depend on the mouse for interaction, and that’s generally a bad thing. What is the difference between the mouse and the keyboard? Basically, the mouse gives you pseudo-analog interaction, and the keyboard gives you a lot more buttons. Thus, the mouse should be used in cases where you want analog interaction, and the keyboard should be used for making discrete choices. That’s why manipulating menus with a mouse grates on me so much: it’s absolutely the wrong choice of utensil. It’s like eating soup with a fork.
There are good uses for mice. For example, viewing 3D models should definitely be done with a mouse since the interaction is fundamentally analog (although there should be discrete buttons to, for example, snap to an axis). Flight simulators are a lot more fun with a joystick than just using the arrow keys, and that’s because flying a plane is an analog interaction.
GUIs don’t need to be based on the mouse, though. It’s perfectly possible to create a GUI that is primarily controlled by the keyboard, and in many cases that would be a good thing. Again, this is a solvable problem.
So, yeah, text is great, and modern GUIs are, if not completely awful, at least significantly worse than text interfaces. I’m trying to make a GUI for programming that is an actual improvement over a text interface.
How could programming be helped by seeing the code graphically? Basically, in text, there are a lot of connections behind the scenes that the programmer must keep in his or her head. For example, an identifier is not merely a string of characters; it is connected with every other instance of that identifier, and in particular with its definition. It may also be connected in some ways to other identifiers — it may have been included from some module, it may have various functions that apply to it, or it may be the inverse of some other identifier. All these connections must be kept track of by the programmer in his or her mind (assisted by documentation).
However, the computer knows all these connections as well. Why can’t the computer just show me these connections? In a text interface, this is difficult (although automatic indentation and syntax highlighting are helpful).
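As a rough illustration of how much the computer already knows, the sketch below uses Python's standard `ast` module to link each identifier to the lines where it is defined and the lines where it is used. This is not how my project works; the function name `identifier_links` and the sample source are made up for the example, and the sketch ignores scoping entirely, so two unrelated variables with the same name would be merged.

```python
import ast
from collections import defaultdict


def identifier_links(source):
    """Map each name to the line numbers where it is bound and where it is read.

    Hypothetical helper for illustration; it does not track scopes.
    """
    links = defaultdict(lambda: {"defined": set(), "used": set()})
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            links[node.name]["defined"].add(node.lineno)
        elif isinstance(node, ast.arg):
            # Function parameters count as definitions.
            links[node.arg]["defined"].add(node.lineno)
        elif isinstance(node, ast.Name):
            # A Store context is an assignment; anything else is treated as a use.
            kind = "defined" if isinstance(node.ctx, ast.Store) else "used"
            links[node.id][kind].add(node.lineno)
    return {
        name: {kind: sorted(lines) for kind, lines in entry.items()}
        for name, entry in links.items()
    }


sample = """\
def area(radius):
    pi = 3.14159
    return pi * radius * radius

print(area(2))
"""

for name, entry in sorted(identifier_links(sample).items()):
    print(name, entry)
```

The information is all there in the parse tree; the open question is how to put it in front of the programmer without burying the code.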
Thus, this isn’t really about graphical programming. The point is not that stuff is arranged nonlinearly. The point is that we show more of the structure of the code. We show several different aspects of the code at once, and we can have different graphical views of the same code. Maybe one view emphasizes the data flow; another view emphasizes the control flow, and another view emphasizes the human-meaningful divisions in the code (between, for example, view logic, controller logic, and model logic).
This applies especially well to Lisps. A Lisp program is just a tree, and every expression in the program is just a subtree of the program. Text, however, being line-oriented, is good for displaying linear processes, as in assembly or C or Python. It is difficult to naturally represent a tree in text. The most common solutions are indentation, which is too vertically verbose for Lisp’s needs, and parentheses.
Using parentheses to display a tree is a decent system. It is unambiguous, simple, and easy to type. However, it is clear to me that it is not the best system. When we draw a tree on a blackboard or on paper, we don’t use the linear parenthesis system. No, we draw them graphically. Why? Because it’s easier for humans to understand something drawn out visually. Why don’t we use that when programming Lisp, then? Because it has historically been hard to draw that kind of thing quickly and easily with a computer. Basically, it’s just easier to use parentheses.
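To show the two representations side by side, here is a small sketch that reads a parenthesized expression into a tree and then draws that tree the way one might on a blackboard, only with ASCII characters. It is a toy, not my project's renderer: the names `parse` and `render` are invented for the example, and it assumes a single well-formed expression in which every list is non-empty and starts with an atom.

```python
def parse(text):
    """Parse one S-expression into nested lists of atom strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # Return the expression starting at tokens[pos] and the position after it.
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


def render(tree, prefix="", connector=""):
    """Return the lines of a drawing, using the head of each list as the node label."""
    if isinstance(tree, str):
        return [prefix + connector + tree]
    head, *children = tree
    lines = [prefix + connector + head]
    # Below a last child we indent with spaces; below any other child we keep a bar.
    if connector == "`-- ":
        child_prefix = prefix + "    "
    elif connector:
        child_prefix = prefix + "|   "
    else:
        child_prefix = prefix
    for i, child in enumerate(children):
        last = i == len(children) - 1
        lines.extend(render(child, child_prefix, "`-- " if last else "|-- "))
    return lines


print("\n".join(render(parse("(+ 1 (* 2 3))"))))
```

For the expression `(+ 1 (* 2 3))` this prints:

```
+
|-- 1
`-- *
    |-- 2
    `-- 3
```

Note that even this drawing is still text, one node per line, which is the vertical verbosity complained about above. A real graphical view would not be bound to lines at all.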
However, I believe this can change. Thus, I propose that graphical programming should be the future. I haven’t seen a graphical programming system that I thought was better than (or even close to as good as) a text system, but that’s just because people are going at it the wrong way.
In time, I’ll write about what my solution is to the problem of graphical programming. My solution is still in pre-alpha, although I think I’m tantalizingly close to an alpha version. So, until then, I’d love to hear anyone’s opinions on the subject.
After working on my project for a couple of months, I found Bret Victor’s work, and I’ve since incorporated some of his ideas into my view of programming (as you may notice above). I particularly recommend Learnable Programming and The Future of Programming.
For an overview of many of the studied tree visualization techniques, see treevis.net. I’ve implemented several of these visualizations in my project, and I plan to implement many more.