Why Don’t We Have a General-Purpose Tree Editor?

We have excellent tools to create and edit text (vim, emacs, sublime, etc.). We have pretty good tools to create and edit tabular data (excel, other spreadsheet software). We even have pretty good tools to create and edit diagrams, pictures, and video.

Why don’t we have good tools to create and edit trees and graphs?

Trees and graphs (in the sense of connections between data) are the underpinnings of structured data. Virtually all data can be described in terms of vertices with (possibly directed, labeled, and/or weighted) edges between them. That may not be the best way of presenting it, but it works. For a variety of different types of data, it is indeed best to think in terms of the connections between the data.

Most often this is manifested in trees because our data is often hierarchical in nature. Thus it makes sense to focus on trees. Additionally, many graph structures can be viewed best by analyzing them through the “cross-section” that a tree view provides. This is all fairly abstract, so I think before I go any further I should give an example.

An Example: Constructing Proofs

I write math proofs on a fairly regular basis, and I read them even more often. Naturally, my mind has wandered to thoughts of how a computer could be used to write them more effectively, or at least make them easier to read and understand. The natural first step is to define exactly what a proof is.

For the purposes of this discussion, a proof is the link between a series of hypotheses and a conclusion, such that if the hypotheses are true then the conclusion is true. I realize that not all proofs are simple implications (if-then), but most (all?) can be reduced to a series of implications. Thus, this is a reasonable characterization. A proof (in the abstract) looks something like this:

Theorem: If A, B, and C, then Z.

Proof: By Theorem T1, A and B implies D. By definition, C and D implies E. By Theorem T2, E and B implies F. Clearly, D and F implies Z.

This looks like a graph structure to me. Here is the structure (in various syntaxes):

Z
- D
-- A
-- B
- F
-- B
-- E
--- C
--- D
---- A
---- B

This is a fairly self-explanatory syntax. Note that there is some duplication since we are trying to project a graph onto a tree.

((A B) (B (C (A B))))

This “lisp-style” representation is compact and emphasizes the fact that the result only depends on A, B, and C, but it obscures some of the intermediate information.

                        A && B
                        ------  (T1)
                   C &&   D
                   -----------  (def)
      A && B  B &&      E
(T1)  ------  ----------------  (T2)
        D   &&       F
      ------------------------
                 Z

This gives more explicit information than the others, and gives it in a more visual way. Thus, it exposes the structure of the data.
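
All of these views can be generated from one underlying graph. As a minimal sketch (in Python purely for illustration; the `PROOF` dictionary and the function names are assumptions made up for this example), the dash outline and the parenthesized form both fall out of a simple recursive walk:

```python
# Hypothetical encoding of the example proof: each derived statement maps to
# the statements it follows from. A, B, and C are hypotheses (no entry).
PROOF = {
    "Z": ["D", "F"],
    "D": ["A", "B"],
    "F": ["B", "E"],
    "E": ["C", "D"],
}


def outline(node, depth=0):
    """Return the dash-indented outline lines for the tree rooted at `node`."""
    prefix = "-" * depth + (" " if depth else "")
    lines = [prefix + node]
    for premise in PROOF.get(node, []):
        lines.extend(outline(premise, depth + 1))
    return lines


def sexpr(node):
    """Return the parenthesized form, keeping only hypotheses as leaves."""
    premises = PROOF.get(node)
    if premises is None:
        return node
    return "(" + " ".join(sexpr(p) for p in premises) + ")"


print("\n".join(outline("Z")))
print(sexpr("Z"))
```

Because D is stored once but reached from both Z and E, the walk repeats it, which is exactly the duplication that comes from projecting a graph onto a tree.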

There are countless other ways of giving the structure, and we haven’t yet even stepped outside the realm of ASCII. See this site for some more innovative visualizations.

Okay, so if we accept for the moment the hypothesis that proofs are structured data, and that in particular they can be represented well as a tree (or a graph), then it seems like we’re halfway there. After all, computer programs have been manipulating structured data for decades. The question I have is, what program would you use to write out a proof in such a way that it exposes the structure of the proof?

The usual way of writing a proof is in a linear, text-based format. This works well, but it doesn’t expose the structure of the proof, so it can be difficult to understand a proof on first read-through and even harder to get a general feel for the reasoning involved. If we want to write a proof in a tree-based or graph-based format, we need software that can help us. A minute ago I used a text editor to draw several views of a tree. There should be a piece of software that is more suited to drawing trees than a text editor.

Unfortunately, no such software exists. Or at least, not that I can find. If it does exist, please tell me. Seriously, I’d love to find out about it.

I want to be able to construct a proof by starting with “A && B && C” at the top of the screen and “Z” at the bottom. As I start finding things I can prove from A, B, and C, I can start adding nodes. Working backwards from Z, I can start creating different sets of sufficient conditions for Z. Thus, I have a tree growing from the top and a tree growing from the bottom. When they meet, the theorem is proved.
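
To make "when they meet" concrete, here is a rough sketch in Python. The editor itself would not need any of this logic; the `RULES` list and the function names are hypothetical and only model the example theorem from earlier.

```python
# Hypothetical rule set for the example theorem: (premises, conclusion, reason).
RULES = [
    ({"A", "B"}, "D", "T1"),
    ({"C", "D"}, "E", "def"),
    ({"E", "B"}, "F", "T2"),
    ({"D", "F"}, "Z", "clearly"),
]


def forward(known):
    """Grow the top tree: everything derivable from the known statements."""
    known = set(known)
    changed = True
    while changed:
        changed = False
        for premises, conclusion, _ in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward(goal):
    """Grow the bottom tree: every statement some step toward the goal uses."""
    needed, stack = set(), [goal]
    while stack:
        statement = stack.pop()
        if statement in needed:
            continue
        needed.add(statement)
        for premises, conclusion, _ in RULES:
            if conclusion == statement:
                stack.extend(premises)
    return needed


def proved(hypotheses, goal):
    """The two trees have met once the goal is reachable from the hypotheses."""
    return goal in forward(hypotheses)


print(proved({"A", "B", "C"}, "Z"))
print(proved({"A", "B"}, "Z"))
```

With only A and B the top tree stops at D, so it never reaches the bottom tree growing up from Z.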

This doesn’t require the software to know anything about proofs or logic. I just need a sufficiently general tree/graph editor.

Another Example: Programming

If proofs aren’t your thing, let’s look at programming. Virtually all programming languages can be reduced to an abstract syntax tree (AST). In the Lisp family, the language is essentially a bare AST. The AST is represented using the parenthesis notation for trees, although it could just as easily be represented in other ways. So, if we had a general tree editor, we could write Lisp code directly in it. If it was a general graph editor then we could make connections between parts of the program that are not syntactically related (for example, connecting all uses of a particular variable or function).
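
As a small illustration of both ideas (a sketch in Python, not how any particular editor does it; `parse` and `uses` are names invented here), parenthesized code can be read into a nested tree, and the non-syntactic connections between uses of a variable are then just a list of paths into that tree:

```python
def parse(source):
    """Parse one s-expression into nested Python lists of symbol strings."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # Returns (subtree, position just past the subtree).
        if tokens[pos] == "(":
            children, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                children.append(child)
            return children, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


def uses(tree, symbol, path=()):
    """Return the path (child indices from the root) of every use of `symbol`."""
    if isinstance(tree, str):
        return [path] if tree == symbol else []
    found = []
    for index, child in enumerate(tree):
        found.extend(uses(child, symbol, path + (index,)))
    return found


tree = parse("(define (square x) (* x x))")
print(tree)
print(uses(tree, "x"))
```

A graph editor could draw an edge between each of the returned paths, which is the kind of connection the tree structure alone does not show.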

Again, the major functionality here is simply a tree editor. For sure, to actually use this to write a program we would want to have a series of plugins that know that we’re writing Lisp and help us, just like a text editor. But the major functionality is the same as proof-writing — we’re just editing trees.

What I’m Looking For

The software I’m looking for is general purpose software. I’m not looking for a proof editor, nor am I looking for a tree-based programming editor. I’m looking for a platform.

In text processing we have text editors combined with plugins and syntax files and whatnot to create an environment suitable for the task at hand, be that writing a to-do list, writing proofs in LaTeX, writing a blog post, or programming. But the main functionality is that of simply editing text.

Analogously, I envision many plugins and whatnot to help with proof-writing or programming or whatever other tree manipulation we might want to do. But we first need the tree editor. We need a solid, simple program that can simply edit trees.

What would this look like? This is mostly pure speculation at this point, but here are some ideas I have.

The program consists of a series of frames, each of which displays a (part of a) tree. There are many ways to visualize trees, so each frame may use a different visualization, or they may all use the same one. When a change is made to one frame the others immediately are aware of the change. Navigating through the tree(s) should be easy and quick. A good keyboard-focused mouse-optional interface would be important.
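
One plausible way to keep frames in sync is a shared model that notifies every attached frame after each edit. The following is a minimal sketch in Python under that assumption; `Node`, `TreeModel`, `Frame`, and the two renderers are hypothetical names, not part of any existing program.

```python
class Node:
    """One vertex of the tree being edited."""

    def __init__(self, label):
        self.label = label
        self.children = []


class TreeModel:
    """The single shared tree; frames attach to it and are told about edits."""

    def __init__(self, root):
        self.root = root
        self._frames = []

    def attach(self, frame):
        self._frames.append(frame)
        frame.refresh(self.root)

    def add_child(self, parent, label):
        child = Node(label)
        parent.children.append(child)
        for frame in self._frames:  # every frame sees the change immediately
            frame.refresh(self.root)
        return child


class Frame:
    """A view of the tree that uses its own visualization function."""

    def __init__(self, render):
        self.render = render
        self.text = ""

    def refresh(self, root):
        self.text = self.render(root)


def as_outline(node, depth=0):
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.append(as_outline(child, depth + 1))
    return "\n".join(lines)


def as_sexpr(node):
    if not node.children:
        return node.label
    return "(" + " ".join([node.label] + [as_sexpr(c) for c in node.children]) + ")"


model = TreeModel(Node("Z"))
left, right = Frame(as_outline), Frame(as_sexpr)
model.attach(left)
model.attach(right)
d = model.add_child(model.root, "D")
model.add_child(d, "A")
print(left.text)
print(right.text)
```

Each frame holds only a rendering function, so two frames can show the same tree in different ways without knowing about each other.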

Plugins should be able to define additional visualizations and additional commands. These should include domain-specific ways of interacting with the data, just as commands and syntax highlighting do in text editors.
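
A plugin interface along these lines could be as small as two registries. This is only a sketch in Python of one possible design; the registry names, the decorators, and the `(label, children)` tuple encoding of a tree are all assumptions for illustration.

```python
VISUALIZATIONS = {}  # name -> function that renders a tree as text
COMMANDS = {}        # key  -> function that returns an edited tree


def visualization(name):
    """Decorator a plugin would use to register a tree renderer."""
    def register(render):
        VISUALIZATIONS[name] = render
        return render
    return register


def command(key):
    """Decorator a plugin would use to bind an editing command to a key."""
    def register(action):
        COMMANDS[key] = action
        return action
    return register


# --- a hypothetical plugin; trees are (label, [children]) tuples ---

@visualization("sexpr")
def render_sexpr(tree):
    label, children = tree
    if not children:
        return label
    return "(" + " ".join([label] + [render_sexpr(c) for c in children]) + ")"


@command("d")
def delete_last_child(tree):
    label, children = tree
    return (label, children[:-1])


tree = ("Z", [("D", []), ("F", [])])
print(VISUALIZATIONS["sexpr"](tree))
print(VISUALIZATIONS["sexpr"](COMMANDS["d"](tree)))
```

The core editor only ever looks things up in the two dictionaries, so domain-specific behavior stays entirely inside plugins.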

These ideas come to some extent out of a side project of mine. Phlisped is a graphical programming editor experiment that is on some level just a tree editor with a few extra things slapped on to make it work for programming. Some of my writings on the subject include a brief philosophy post, an announcement (includes screenshots) when I released the code, and a short video demo of a few of its features. In some sense, this would be a re-write, but with a more general purpose.

Conclusion

I haven’t written any code for this. This is really just a statement of a perceived lack. I plan to investigate the issue more, do some design, and see if I can write up a prototype, but I can’t give any guarantees. If only I had more free time…

I am interested, however, in any feedback. In particular, any examples of already-existing programs that fit part or all of this description would be useful. Most tree editing software is domain-specific, but if there are good examples of domain-specific tree editors, I’d love to hear about those as well. Any other feedback is also appreciated.

Phlisped: An Experiment in Graphical Programming

I know, I know. I promised to release the code to my graphical programming project months ago, and I didn’t. I’m an awful person. At any rate, without further ado, the code is now available on GitHub: https://github.com/philipcmonk/phlisped

Here are a few screenshots of it in action. Some of these are from a little while ago. Yes, that’s xkcd.com/1190.

[Screenshots: screenshot, screenshot-normal, screenshot, screenshot-disk]

A few notes:

  • I wanted to get some things working before I released it, but life caught up with me, and I just didn’t. So here’s a (semi-)stable version. If you want help getting it to work, let me know. I’d love to help you.
  • It depends mainly on Racket and OpenGL. I believe it also depends on FTGL, but it shouldn’t be hard to remove that dependency. If you have a problem with that, just let me know.
  • I’m still interested in graphical programming, but I’m not sure my approach is the right way to go forward. I would love to hear any positive or negative feedback on it. This project is not under active development, so I’m mostly throwing it out here to generate discussion.
  • I don’t really have docs written up, but go ahead and ask me if you have any questions. One hint: try vim-style keys (h,j,k,l,i,…) and to switch visualizations. Go ahead and experiment. There are features for some kinds of intelligent autocompletion, autorenaming, and some other bonus features that don’t have standard names.
  • See my other blog posts for some ideas of my philosophy in creating this. Also read the comments.

If anyone wants to contact me, you can go ahead and comment here or create an issue on GitHub. Alternatively, send me an email at pcmonk (at) asu.edu

Why Lisp is Right for Graphical Programming

In my graphical programming project (which I’m really hoping to open source by the end of the month), the programmer uses Racket. Using a Lisp for graphical programming makes a lot of sense, and I’ll try to explain why here. To do so, I’ll have to go off on a short tangent about the opportunities for improvement that graphical programming offers, and then I’ll show why Lisp is an ideal language to edit in this manner.

Separation of Entry and View

A critical aspect of this is the decoupling of the method of editing code from the manner of viewing code. It is useful to add visual structure to code, so we’ve evolved standard indentation patterns. However, it’s frustrating and time-consuming to manually indent code, so we built editors to do it automatically. This separates the act of typing in the code from the way it’s presented, and that’s a good thing.

I contend that this principle can and should be extended much further. We should greatly increase the ratio of information added to number of key strokes. Why do I think this is possible? When I look at code, my understanding of the code is much more in-depth than what I see. There is much more structure to the code than is easily visible. Since there is that much structure, the editor should display as much of that structure as possible. Indentation and syntax highlighting are good, but there’s much more we could do.

The question is, then, what can we do to improve the experience by programming graphically? Obviously, because of the tree-like structure of Lisp code, we can simply visualize the tree in various ways. Most likely, though, instead of simply displaying a tree with no understanding of the symbols, the program should have an understanding of a function like map and display it in a particular way. But how? What would give a better intuitive understanding of the map function than simply the word map? I propose showing more or less what a map actually is: the function applied to several of the elements. Thus, instead of this:

(map f l)

We have something more like this:


(f (list-ref l 0))
(f (list-ref l 1))
(f (list-ref l 2))
...
(f (list-ref l (- (length l) 1)))

To me, this is significantly more intuitive. I’m very comfortable using map because I’ve used it so often for so long, but I still understand the second faster on first glance. This is the same reason why the ellipses notation is often used in math when first explaining a sequence. Only after you understand the sequence do you write it in sequence notation. Humans are very good at pattern matching, so when looking at the second example, it is very obvious what exactly is happening from one line to the next.

The main problem with the first example is not that it’s hard to understand. It’s easy to understand — the problem is that it requires being understood at all. It requires you to mentally imagine the second example without seeing it. We have these powerful machines right in front of us that are already going to create the second example anyway when it runs the code, so why must we duplicate the computer’s work and imagine it in our mind? Why not just have the computer show you what it’s thinking?

There are two main reasons why we don’t do the second example when we’re writing code textually. First, it’s more verbose so it takes more time to type. Second, it’s hard to maintain and modify — one must edit each line to make a change. Having the computer generate the second example from the first example seems like the best solution. That way, we must only tell the program “apply f to every element in l”, and it creates the second example. If we want to change something in it, we should be able to change it on one line and it will be reflected on every other line.
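
Generating the expanded view from the concise one is mechanical. Here is a minimal sketch in Python (the function names and the `shown` parameter are made up for illustration, and the output uses Racket’s 0-based `list-ref`). Because every line comes from one template, an edit to `f` shows up on all of them.

```python
def expand_map(f, lst, shown=3):
    """Return display lines for (map f lst), eliding the middle with '...'."""
    lines = [f"({f} (list-ref {lst} {i}))" for i in range(shown)]
    lines.append("...")
    lines.append(f"({f} (list-ref {lst} (- (length {lst}) 1)))")
    return lines


def collapse_map(f, lst):
    """Return the concise form that the expanded view stands for."""
    return f"(map {f} {lst})"


print(collapse_map("f", "l"))
print("\n".join(expand_map("f", "l")))
print("\n".join(expand_map("g", "l")))  # one edit, reflected on every line
```

The stored program stays `(map f l)`; the expansion is purely a matter of display.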

This may remind you of spreadsheet software. I think that’s a good thing — it’s easy to see what’s happening in a spreadsheet because all the data and patterns are easily recognizable. To change a formula in a spreadsheet, you just need to change it in one cell and then tell it to put an analogous formula in the entire column. This is a very intuitive and easy way to model computation.

I should note that spreadsheet software is only good for particular types of programming, and its model falls apart very quickly for general programming. The last thing I want to do is bring Lisp down to the level of a spreadsheet. That’s why a graphical programming editor should understand map and display it in this way, and it should understand fold and display it in a similar way, but it should understand something like a struct definition or a conditional or a macro in a very different way, and it should display it in a way that’s specifically tailored for that operation.

Incidentally, for fold, I’d imagine it displaying this:

(foldl f i l)

As something like this:


(set! res (f (list-ref l 0) i))
(set! res (f (list-ref l 1) res))
(set! res (f (list-ref l 2) res))
(set! res (f (list-ref l 3) res))
...
(set! res (f (list-ref l (- (length l) 1)) res))

Obviously, when you’ve “zoomed out” on your code, this should collapse into the more concise version.
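
The relationship between the two views can be sketched in a few lines of Python (an illustration only; `foldl_steps` and `foldl` are names chosen here, mirroring the argument order of Racket’s foldl, which passes the element first and the accumulator second). The expanded view corresponds to the sequence of intermediate values of res, and the collapsed view is just the last one.

```python
from functools import reduce


def foldl_steps(f, init, items):
    """Yield the value of `res` after each step of (foldl f init items)."""
    res = init
    for item in items:
        res = f(item, res)  # element first, accumulator second
        yield res


def foldl(f, init, items):
    """Collapsed view: only the final result."""
    return reduce(lambda acc, item: f(item, acc), items, init)


def cons(item, rest):
    return [item] + rest


print(list(foldl_steps(cons, [], [1, 2, 3])))
print(foldl(cons, [], [1, 2, 3]))
```

Zooming in would show every intermediate value; zooming out would show only what `foldl` returns.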

Lisp’s Homoiconicity

A Lisp file is essentially a bare AST. Thus, the structure of the program is very clear to the programmer. This is critical for graphical programming since the major advantage of graphical programming is that it can be used to very clearly expose the structure of the program. This allows the programmer to more naturally reason about the logic involved.

Creating a program to graphically display and edit an AST is not particularly hard — it’s simply a question of visualizing and editing a tree. There are many different ways to accomplish that, but a minimal functional program can be created fairly simply. This, however, does not likely give a compelling reason to change away from textual programming — the additional clarity gained by the tree visualization is likely negated by the additional difficulty in editing the tree.

Thus, we add more understanding to the program. This is easy to do with Lisp since there is basically no syntax in the code. Thus, all of the “syntax” is controlled by the editor. Note that for a graphical editor, there is a difference between the syntax of typing code and the syntax of displaying the code. Behind the scenes, though, it should simply be an AST.

I realize that Lisp has more features than merely its homoiconicity. However, the rest of the features are intimately related to its homoiconicity. For example, macros are awesome because they’re easy to think about: you just need to manipulate a tree. The ease of quickly abstracting and the functional nature of Lisp are both also very natural once you have homoiconicity. A graphical programming editor should be able to support these other features very easily. I haven’t talked about these features simply because there will be very little change in them from textual to graphical programming.

In Conclusion

Eventually, I believe that languages will be created specifically for graphical programming editors. Right now, it’s important for languages to be backwards-compatible with textual editors, at least in a pinch. This allows you to use not only vim but also grep, sed, diff, and so forth. Until we have good enough graphical tools to replace these, the various Lisps are the best candidates for graphical programming.

To Open Source or Not To Open Source

I’ve been trying to wrestle the code for my graphical programming project into some kind of releasable form. This brings up the obvious question: in what form do I want to release this? Do I want to release it open source or closed source? Is there any way I can make money off it? Would it compromise the quality of the program to keep it closed source?

There are two basic strategies, and many permutations of each. On the one hand, I could open source it and try to build a community; on the other hand, I could keep it closed and try to do something like a startup.

To come to any kind of reasonable answer to this question, I need to first determine my priorities in writing this software.

Goals

I have five main goals (in order): to learn, to make something worth using, to make something that people use, to advance my career, and to make money.

I’ve started many projects in the past, and I’ve made significant progress on many, but I’ve never tried to turn them into something releasable. This is because my primary goal has always been to learn to be a better programmer. I’m a student, so this is my time to learn as much as possible before I’m thrown into the world of needing to make money. This is my time to dream, and my time to learn. Thus, my first priority is to learn.

My second priority is to make something worth using. That is, I want to make something that I and others would want to use. Thus, in the context of this project, I want to make a program that makes programmers more productive. I believe that if programmers can be made more productive, then not only can problems be solved faster, but problems that were previously prohibitively hard can be solved. Thus, I don’t want to do anything to compromise the power and usefulness of my program.

My third priority is to make something that people use. This is distinct from the above in that this is not measured by how useful this program would be to people if they used it; rather, it is measured by how useful this program is to people since they use it. Thus, I would like for this program to be available to as many people as possible with as little barrier to adoption as possible.

My fourth priority is to advance my career. This is pretty self-explanatory, and it could be accomplished in several different ways. Indeed, the learning I accomplish will be invaluable here.

My fifth priority is to make money. Aside from all the normal benefits of money, if this could turn into something that actually pays me, then I could devote more time and effort to it. If this doesn’t happen, then this will remain a side project while I get a day job, take an internship, or do a startup. At any rate, this project will suffer if I can’t find a way to monetize it.

Ways to Accomplish Said Goals

Learning

Whichever way I release this, I’ll learn a tremendous amount. Part of the learning will be about software projects in general, and some will be about whatever method I choose to release the project. At any rate, there probably won’t be enough distinction here to make a difference in my choice.

Something Worth Using

For making something worth using, the most important distinction is probably going to be between open and closed source. There are two issues at work here.

The architecture of the program is such that nearly every aspect is extensible without too much difficulty. Thus, users should be able to add and remove visualizations, key commands, and even language backends by just dropping the code into particular folders. Additionally, the actual program itself exposes a Racket interpreter that evals in the environment of the program. Thus, this should be eminently hackable. That makes it much more valuable to users. I’m not sure if this sort of open architecture would be possible in a closed source model. It seems like it would compromise this aspect of the system.

The other issue is that this project is too big to solve myself. I’m going to need other people to work on it with me. In an open source model, this means other open source programmers, and in a closed source money-making model, this means employees (and maybe a cofounder or two…). Both of these approaches carry risks. If the open source project does not gain momentum, then there may not be enough developers to complete the project. If the closed source company does not earn enough money, then there will not be enough money to hire developers. At any rate, my focus here will be to create a program that is minimally complete and useful (by which I mean something that is complete enough to be useful) and then start iterating and improving. Then, more developers will mean faster improvement, and a lack of developers will mean slower improvement rather than outright failure.

Something People Use

For people to use a piece of software, it needs to (1) be worth using (above), and (2) have low barriers to adoption. I’ve attempted to design the program in such a way that it will be familiar to at least a certain subset of programmers. It is possible to write pure Racket in it (and I’m in the middle of making the process for this smoother), which will be familiar to Racket programmers (and Lisp programmers in general) and allow the use of the usual text tools. If you need to ssh into somewhere and edit code, vim will work just fine. Speaking of vim, many of the keyboard shortcuts are vim-inspired, such as h,j,k,l,i,d,p,q,@,:. If you prefer emacs-style shortcuts, you’ll be able to change them without too much trouble. Incidentally, I think the environment will be more familiar to people who use text tools than graphical tools.

In releasing the program, though, people will likely use the program significantly more if it is free, or if it at least has a free version. Obviously, being completely free is better than having a “community version”, but even that’s better than a purely proprietary program.

Advance My Career

Obviously, if this turns into a multi-billion dollar company, then I think my career will be fine. Assuming it doesn’t, then, this has a couple of ways that it can advance my career.

If this becomes a startup, then, even if it fails, I’ll have gained a lot of valuable experience, which could help my career in three potential directions: (1) this could be something I could put on my job application to a good company, so maybe I get a better job; (2) the experience might be able to land me a good job at a startup; (3) the experience would be invaluable were I to do another startup.

I wonder, though, how much an open source project might help with those same three things. On a job application to either a big company or a startup, I could point potential employers to this project, and they could look through the code, and, hopefully, they could see how I worked with other developers. This would highlight my love for programming even without a direct path to monetization. The experience would also be useful in doing a startup since I would be working on a project with actual users and whatnot. It would not be as useful as actually doing a startup would, but it also wouldn’t be as much of a risk and time investment.

Make Money

This one is pretty heavily weighted to one side. I don’t know of any good way to make money off of an open source project of this type. I could ask for donations, but unless this becomes huge that’s not likely to be a sustainable source of income. Some have suggested Kickstarter, and that might be a good idea, but I’m not sure how I would make a Kickstarter campaign work for this. Maybe I’m just not familiar with the Kickstarter model.

If anyone has any ideas on how to monetize this, particularly in an open source manner, I’d love to hear them.

In Which the Author Does Not Come To a Conclusion

So, that’s where I’m at. I’m leaning toward open sourcing, but I’m still not completely convinced. The crux of the issue seems to be that closed source would compromise the quality of the program, and I don’t think I’m willing to do that.

Clarification Regarding Graphical Programming’s Potential

After posting my previous post, I submitted it to Hacker News, and, while I slept, it hit the front page. As is to be expected after about 10,000 views, there was a lot of feedback, mostly constructive. A couple of points recurred, and I’d like to address them here.

Show Me the Code

This is the easiest: I’m working on it. I’ve got a few thousand lines of Racket to show that I’m serious about this. I plan to release this, but I’ll warn you, it’ll probably be a few months before that happens. I plan to continue blogging and laying out my philosophy, and I may post some screenshots, but I’m going to spend most of my time wrestling the code into something useful.

<Insert Graphical Programming System> Is Awful, Ergo Graphical Programming Is Awful

I acknowledge that graphical programming, in its present state, is not good. In particular, it’s significantly worse than textual programming. That’s exactly my point. I think there’s a lot of improvement that can be made here. Graphical programming has (with a few exceptions) failed. My goal is to change that.

The argument that “textual programming is better than graphical programming in their present forms” is not an argument that textual programming is better than graphical programming. When deciding to work on something like this, you don’t ask whether TP is better than GP right now, you ask whether GP has the potential to be better than TP. My hypothesis is that it does.

The Mouse Is Awful, Ergo Graphical Programming Is Awful

I agree that the mouse is used way too much in user-oriented software. But graphical programming does not have to mean using the mouse. It’s true that those have been used together for much of their combined history, but it’s not because the one inherently depends on the other. In my system, the mouse is purely optional; indeed, most of the important interactions can only be accessed by the keyboard.

On the flip side, some may argue that this makes it not very user-friendly. I’m not trying to make it easier to learn programming, and I’m not trying to get non-programmers to start programming. I’m trying to make the everyday life of the developer easier. If a system is such that many of the interactions will be brief, it may be important to make it intuitive and obvious. If a system is such that many users will spend hours every day working with it, the “activation energy” is not terribly important, since it will be dwarfed by the later productivity. Think asymptotic complexity. When n is small, the constants are important; when n is large, the constants don’t make that much of a difference. That’s why vim is not “user-friendly”, but is still commonly used by programmers. For people for whom editing text is a significant part of their life, it’s a lot more important that the text editor be powerful and efficient than that it be easy to learn.
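
The asymptotic argument can be made concrete with a deliberately simplified model. Assume (and this linear form is only an assumption for illustration) that tool i costs a one-time learning effort c_i plus a per-use effort a_i, so that after n uses:

```latex
T_i(n) = c_i + a_i\, n, \qquad i \in \{1, 2\}

% Tool 2 is harder to learn (c_2 > c_1) but cheaper per use (a_2 < a_1):
T_2(n) < T_1(n) \iff n > \frac{c_2 - c_1}{a_1 - a_2}
```

Past that break-even point the learning cost is a constant that matters less and less, which is the sense in which the “activation energy” is dwarfed for someone who uses the tool for hours every day.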

I Want a Neural Interface (Or at Least Natural Language)

I hope we move beyond both textual and graphical programming. But it’s going to take a while. I think graphical programming is within the limits of our technology, so I’m going to try to make that. Waiting for the technology to get good enough for a neural interface is not productive. If I had the scientific background required to help with the research required to implement a neural interface, I would. But I don’t, so I figure I’ll do the best I can with what we’ve got.

Incidentally, regarding natural language, I disagree with (what appears to be) the general consensus that this is a better way to interact with the computer than what we have presently. There are certainly some cases where it would be useful (typing this blog post, for example). But English is not a technical or precise language. English has been optimized for interpersonal communication and relations, and it does a pretty decent job at that. English is not good for describing stuff on a technical level. If I were able to speak to a computer, I would want to be able to use a language specifically designed for being able to speak technically. It might take some time for me to learn, but I think it’d be better than the machine trying to assign meaning to my English statements. Most of what we say in English is not just hard to assign a technical meaning to — it may not have a technical meaning. When telling a computer what to do, I want to actually be able to communicate exactly what I want.

Conclusion

Thank you guys for the constructive criticism and the links to other projects. I do want to be able to give a well-reasoned response to possible issues with my hypothesis, so any criticism is welcome. If I missed an important point, let me know.

For the near future, I’ll continue blogging about my ideas and working on the code. Eventually, I’ll release, and then it’ll probably make a lot more sense.

Graphical Programming: I Really Hope This Is the Future

Since March 2013, your author has been working on a software project with a fervor not previously experienced by said author. This work has not been a never-ending drive in a certain direction with a clear vision; rather, I’ve been engaging in copious exploratory programming. Over spring break, I got a working pre-alpha, and I thought the basic architecture was fairly solid and wouldn’t change much. Since then, every aspect of the stack has been swapped out for something new at least once. All the data structures are different, the important program logic is solving a completely different problem than it was, and the interface has gone through iteration after iteration of paradigm shift.

One thing, however, has remained.

Purpose

The purpose of this project is to assist the coder in developing more complex code faster and easier by clearly exposing the structure of the code.

To explain the solution I propose, I will first explain the particular problem I see.

ASCII Is Awesome, Just Not That Awesome

In the Good Old Days(tm), people wrote programs in binary, and the great masters of the era were those who achieved the greatest oneness with the computer. By which I mean, the interface between humans and computers followed a very simple rule: make something the computers can understand, and then teach the humans to speak the language. This system worked rather well because humans have an amazing ability to learn even the most abstract skills remarkably well.

In time, though, with the advent of assemblers, and later compilers and interpreters, we moved the frontier between humans and computers. Now, the computer does much more work to try to understand what we mean; likewise, humans do much less work to try to explain what we mean. As abstraction is layered on abstraction, humans can use simpler statements to describe more complex programs than ever before.

However, and this is the meat of my criticism of the state of programming today, our interface for programming is still based on text.

Text is awesome. I much prefer the terminal over GUIs. I write documents in LaTeX using vim and avoid Microsoft Word and LibreOffice Writer like the plague because they’re just such a pain to work with. Nearly without exception, if I’m given the option to write something in plain text rather than “visually” or in “formatted” or “rich” text, then I’ll write in plain text, even if that means throwing in a little code. For a combinatorics problem I’ve been working on recently I needed to draw a series of graphs, so what did I use? TikZ in LaTeX. I find that GUIs nearly always (1) do not allow the kind of manipulation I want, (2) hide important information, and/or (3) are simply frustrating to use.

Is this a problem inherent in graphical user interfaces, though? Is text really the best interface possible to interact and communicate with computers?

I submit that it is not.

GUIs have been misused.

A Defense of Graphical User Interfaces

Why do I use text programs rather than graphical programs? As outlined above, there are essentially three reasons.

Programmability

This is the most important issue. The terminal is awesome because of its composability. Basically, the Unix pipe and backticks allow the output of any command to be used as the input to any other command, and this allows the user to take relatively simple commands and mold them together into a one-liner that solves exactly his or her problem. When a new command is added, it integrates perfectly with the other commands, so it immediately gains all the power contained in the other commands. This allows the user to learn one set of commands and everything else just plugs right in.
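To make the idea of composability concrete, here is a minimal sketch in Python rather than in the shell. The stage names (grep, sort_lines, uniq) and the pipe helper are hypothetical stand-ins that I made up for illustration; they only imitate the spirit of the Unix tools and are not real library functions.

```python
from functools import reduce

def grep(pattern):
    """Return a stage that keeps only lines containing `pattern`."""
    return lambda lines: (line for line in lines if pattern in line)

def sort_lines(lines):
    """Stage that yields the lines in sorted order."""
    return iter(sorted(lines))

def uniq(lines):
    """Stage that drops adjacent duplicate lines."""
    previous = object()  # sentinel that never equals a real line
    for line in lines:
        if line != previous:
            yield line
        previous = line

def pipe(source, *stages):
    """Feed `source` through each stage in order, like `a | b | c`."""
    return reduce(lambda stream, stage: stage(stream), stages, source)

# Made-up sample data, standing in for the output of some earlier command.
files = ["notes.txt", "proof.tex", "todo.txt", "proof.tex", "paper.tex"]
tex_files = list(pipe(files, grep(".tex"), sort_lines, uniq))
print(tex_files)
```

Every stage takes a stream of lines and returns a stream of lines, so any new stage written to that convention can be dropped into pipe without touching the others. That shared convention is what makes the terminal composable.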

I use vim for everything from configuring programs to note taking to todo lists to writing math papers to programming. I even use vim macros as a sort of scripting language if I’m trying to do something that’s a little too complicated for sed. Regular expressions make manipulating many different kinds of data — anything from file lists to code to csv files — surprisingly easy.

I like LaTeX better than Word not because I can use a plain text interface. I like it better because I can program in it (either directly or by writing a program that spits out LaTeX code). And being able to program in it gives the interface more expressive power. It is easier to tell the computer what I want it to do.

Text interfaces in general usually exhibit this kind of programmability. GUIs, as a rule, do not.

This problem, however, is solvable. Inherently, I don’t think it’s any easier to make a text interface programmable than it is to make a graphical interface programmable. However, since the text tools are already available, it is easier in practice to create a programmable text program. The solution for GUIs, then, is to create the tools necessary to program them. That’s a really vague description, but that’s because this is the hardest problem. I will describe my solution in detail in another post.

Show Me Everything

WYSIWYG editors are basically evil. I don’t want you to hide all the complicated formatting stuff in the background. I don’t trust you that much. One of the most frustrating experiences is, in Microsoft Word, to try to put a picture exactly where you want it. There usually is a way to make it work, but it is so needlessly complicated that I get insanely frustrated. When I drag this picture right here, why do you put it over there? The problem is not so much the putting it over there as it is the not telling me why. Please, if I’m doing something that doesn’t work in your framework, then let me know.

In LaTeX, every little bit of formatting is eminently visible. If there’s a problem, then at least I know exactly what the inputs are that give me the incorrect results, so I can know what to try to change to fix it.

However, this is clearly just a misuse of GUIs. Is there any inherent reason why GUIs must show less information than text interfaces? No. In fact, I would argue that they ought to be able to show more. This is a problem with current GUIs, but this is very solvable.

Physical Interface

GUIs generally depend on the mouse for interaction, and that’s generally a bad thing. What is the difference between the mouse and the keyboard? Basically, the mouse gives you pseudo-analog interaction, and the keyboard gives you a lot more buttons. Thus, the mouse should be used in cases where you want analog interaction, and the keyboard should be used for making discrete choices. That’s why manipulating menus with a mouse grates on me so much: it’s absolutely the wrong choice of utensil. It’s like eating soup with a fork.

There are good uses for mice. For example, viewing 3D models should definitely be done with a mouse since the interaction is fundamentally analog (although there should be discrete buttons to, for example, snap to an axis). Flight simulators are a lot more fun with a joystick than just using the arrow keys, and that’s because flying a plane is an analog interaction.

GUIs don’t need to be based on the mouse, though. It’s perfectly possible to create a GUI that is primarily controlled by the keyboard, and in many cases that would be a good thing. Again, this is a solvable problem.

Graphical Programming

So, yeah, text is great, and modern GUIs are, if not completely awful, at least significantly worse than text interfaces. I’m trying to make a GUI for programming that is an actual improvement over a text interface.

How could programming be helped by seeing the code graphically? Basically, in text, there are a lot of connections behind the scenes that the programmer must keep in his or her head. For example, an identifier is not merely a string of characters; it is connected with every other instance of that identifier, and in particular with its definition. It may also be connected in some ways to other identifiers — it may have been included from some module, it may have various functions that apply to it, or it may be the inverse of some other identifier. All these connections must be kept track of by the programmer in his or her mind (assisted by documentation).
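As a small illustration that the computer already has this information, the following sketch uses Python’s standard ast module to link each identifier to the lines where it is assigned and the lines where it is read. The sample program and the identifier_links helper are made up for this example, and the sketch covers only one kind of connection (definition and use by name), ignoring scopes, imports, and attributes.

```python
import ast
from collections import defaultdict

# A tiny made-up program to analyze. The leading newline means the
# first statement sits on line 2.
source = """
total = 0
for item in [1, 2, 3]:
    total = total + item
print(total)
"""

def identifier_links(code):
    """Map each name to the line numbers where it is assigned and read."""
    links = defaultdict(lambda: {"defined": [], "used": []})
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Name):
            # Store context means assignment; anything else counts as a use.
            kind = "defined" if isinstance(node.ctx, ast.Store) else "used"
            links[node.id][kind].append(node.lineno)
    for entry in links.values():
        entry["defined"].sort()
        entry["used"].sort()
    return dict(links)

links = identifier_links(source)
for name, entry in sorted(links.items()):
    print(name, entry)
```

A text editor shows none of this unless the programmer goes looking for it. A graphical view could draw these links directly between the definition and each use.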

However, the computer knows all these connections as well. Why can’t the computer just show me these connections? In a text interface, this is difficult (although automatic indentation and syntax highlighting are helpful).

Thus, this isn’t really about graphical programming. The point is not that stuff is arranged nonlinearly. The point is that we show more of the structure of the code. We show several different aspects of the code at once, and we can have different graphical views of the same code. Maybe one view emphasizes the data flow, another emphasizes the control flow, and another emphasizes the human-meaningful divisions in the code (between, for example, view logic, controller logic, and model logic).

This applies especially well to Lisps. A Lisp program is just a tree, and every expression in the program is just a subtree of the program. Text, however, being line-oriented, is good for displaying linear processes, as in assembly or C or Python. It is difficult to naturally represent a tree in text. The most common solutions are indentation, which is too vertically verbose for Lisp’s needs, and parentheses.
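Here is a minimal sketch of both textual representations, assuming a toy Lisp with only atoms and parenthesized lists (no strings, quoting, or comments). The parse and render helpers are hypothetical names invented for this example: parse turns the parenthesized form into a nested list, and render prints the same tree with one node per line, indented by depth.

```python
def parse(text):
    """Parse one parenthesized expression into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # Return (subtree, position of the next unread token).
        if tokens[pos] == "(":
            children, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                children.append(child)
            return children, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree

def render(tree, depth=0):
    """Return the tree as indented lines, one node per line."""
    if isinstance(tree, str):
        return ["  " * depth + tree]
    if not tree:
        return ["  " * depth + "()"]
    head, *rest = tree
    lines = render(head, depth)  # the operator labels the subtree
    for child in rest:
        lines.extend(render(child, depth + 1))
    return lines

tree = parse("(define (square x) (* x x))")
print(tree)
print("\n".join(render(tree)))
```

The one-line expression becomes six lines once it is indented, which is the vertical verbosity mentioned above. Both forms encode exactly the same nested list.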

Using parentheses to display a tree is a decent system. It is unambiguous, simple, and easy to type. However, it is clear to me that it is not the best system. When we draw a tree on a blackboard or on paper, we don’t use the linear parenthesis system. No, we draw them graphically. Why? Because it’s easier for humans to understand something drawn out visually. Why don’t we use that when programming Lisp, then? Because it has historically been hard to draw that kind of thing quickly and easily with a computer. Basically, it’s just easier to use parentheses.

However, I believe this can change. Thus, I propose that graphical programming should be the future. I haven’t seen a graphical programming system that I thought was better than (or even close to as good as) a text system, but that’s just because people are going at it the wrong way.

In time, I’ll write about what my solution is to the problem of graphical programming. My solution is still in pre-alpha, although I think I’m tantalizingly close to an alpha version. So, until then, I’d love to hear anyone’s opinions on the subject.

References

After working on my project for a couple of months, I found Bret Victor’s work, and I’ve since incorporated some of his ideas into my view of programming (as you may notice above). I particularly recommend Learnable Programming and The Future of Programming:

http://worrydream.com/#!/LearnableProgramming
http://worrydream.com/#!/TheFutureOfProgramming

For an overview of many of the studied tree visualization techniques, see treevis.net. I’ve implemented several of these visualizations in my project, and I plan to implement many more:

http://vcg.informatik.uni-rostock.de/~hs162/treeposter/poster.html