Archive for the ‘computer’ Category

The search for simplicity

Sunday, September 28th, 2008

There are several ways in which computer programming and physics are very similar. Possibly the most important is that both disciplines are, fundamentally, a search for simplicity.

In physics, we have a big pile of experimental results and we want to find the simplest theory that satisfies them all. Just listing all the experiments and their results gives you a theory of physics, but not a particularly useful one since it's not very simple and doesn't predict the results of future experiments (only the past ones). Rather than just listing the results, we would like to find a general theory, an equation, a straight line through the points of data which allows for interpolation and extrapolation. This is a much more difficult thing to do as it requires insight and imagination.

In computer programming, we generally have a big pile of specifications about what a program should do - maybe a list of possible interactions with the user (what they input and what they should expect to see as output). These might be encapsulated as testcases. To write a program that satisfies all the testcases, we could just go through them all one by one, write code to detect that particular testcase and hard-code the output for that particular input. That wouldn't be very useful though, as the program would fail as soon as the user tried to do something that wasn't exactly one of the scenarios that the designers had anticipated. Instead we want to write programs for the general case - programs that do the right thing no matter what the input is. When the "right thing" isn't precisely specified, we get to choose the output that makes the most sense according to our internal model of how the program should act.

I think a number of software companies in recent years (Microsoft in particular but others as well) have started to fall into the trap of writing software that concentrates too much on what the behavior of the software should be for particular (sometimes quite specific) scenarios, at the expense of doing the right thing in the most general case. Windows is chock full of "special case" code ("epicycles" if you will) to work around particular problems when the right thing to do would have been to fix the general problem, or sometimes even to explain that this is how we should expect it to work. Here is one example of this kind of band-aiding. I discovered another the other day - I was running some older Windows software in Vista and accessed the "Help" functionality, which was implemented an old-style .hlp file. Vista told me that it no longer includes the .hlp viewer by default (I guess it was a piece of the OS that doesn't get a lot of use these days, and they had just dropped it from the default distribution to avoid having to bring it up to the latest coding standards). I was pointed to the download location (where I had to install an ActiveX control to verify that my copy of Windows was genuine before I was allowed to download the viewer).

Part of the problem is that (at Microsoft at least) it's very difficult to make big changes. Rewriting some core piece of functionality, even if the programming itself is easy, would involve months of planning, scheduling, designing, specification writing, testcase writing, test-plan reviewing, management sign off meetings, threat modelling, localization planning, documentation planning, API reviewing, performance testing, static analysis, political correctness checking, code reviewing and integrating. And of course everyone whose code might possibly be affected by the change needs to sign off on it and put in their two cents about the correct design. And it must comply with the coding standards du jour, which change every year or two (so delay too long and you'll probably have to start all over again.) When you come to understand all this, the long gap between XP and Vista becomes less surprising (in fact, it's quite a surprise to me that it only took a little over 5 years, considering how many pieces were completely rewritten). All this process exists for a reason (mostly the politician's fallacy) but is rigorously justified and widely accepted.

Because it's difficult to make big changes, people tend to make little changes instead ("hey, we can work around this by just doing x in case y - it's just one extra line of code") - these don't require much process (usually just a code review - most of the rest of the processes for such small changes is automated). All these small changes add up to a great deal of extra code complexity which makes it very difficult for newcomers to understand the code, and even more difficult to rewrite it in the future because people will have come to depend on these edge cases.

Ray tracing in GR

Saturday, September 27th, 2008

Following on from this post, a natural generalization is that to non-Euclidean spaces. This is important for simulating gravity, for example rendering a scientifically accurate trip through a wormhole (something I have long wanted to do but never got to work). The main difference is that ones rays are curved in general, which makes the equations much more difficult (really they need to be numerically integrated, making it orders of magnitude slower than normal ray-tracing). One complication of this is that generally the rays will also curve between the eye point and the screen. But the rays between your screen and your eye in real life do not curve, so it would look wrong!

I think the way out of this is to make the virtual screen very small and close to the eye. This doesn't affect the rendering in flat space (since only the directions of the rays matter) and effectively eliminates the need to take into account curvature between the screen and the eye (essentially it makes the observer into a locally Euclidean reference frame).

Another complications of simulated relativity is the inability to simulate time dilation. Well, you can simulate it perfectly well if you're the only observer in the simulated universe but this would be a big problem for anyone who wanted to make a relativistically-accurate multiplayer game - as soon as the players are moving fast enough with respect to each other to have different reference frames, they will disagree about their expected relative time dilations.

The hobgoblin of little minds

Thursday, September 25th, 2008

Most software projects with more than one programmer seem to enforce some kind of formatting style for the code - brace positions, indent width, use of tabs - that sort of thing.

At Microsoft, we didn't spend a lot of time talking about the style but we did have one rule - you should try to make your code consistent with the code around it. (If you were in the fortunate position of starting a brand new project from scratch, you got to choose the style yourself.) At least until the tyrannical StyleCop showed up. I left before using it for very long but I hated having to placate it (especially when I disagreed with its rules - for example, it wouldn't let me insert extra blank lines to group related functions, or arrange my functions in a more logical order than the standard one).

The GNU coding standards are similarly strict. I haven't disagreed with them very much (though I dislike the convention of having two spaces after a full stop).

I suppose having style guidelines (provided they are good ones) does make the source code look prettier and more consistent. However, I'm not convinced that it is worth the effort, especially since any programmer will have to be able to read code written in any style anyway (lest they start making assumptions and get fooled by a malevolent patch). In fact, there may be certain benefits to allowing every programmer to adopt their own personal favorite style. For one thing you'd be able to tell at a glance just who wrote any particular piece of code in your program (assuming that there were a small number of contributors and you're familiar with all their work, which I don't think are particularly bad assumptions in many cases).

Programmers' personal style also changes with time, so this can also be a good gauge for how old a particular piece of code is.

However one chooses to format their code, it is important that is readable - not having indentation at all, or having inconsistent indentation in a given class or file, or having indentation that misleads you about which "if" an "else" is paired with should not be acceptable.

One might get the impression from reading the above that my own preferred coding style is not particularly consistent. Nothing could be further from the truth - I've spent many hours (probably far too many) reformatting code to my own personal taste (K&R style with the exception of putting opening braces of global functions and classes on column 1) to make it look prettier. I used to prefer indenting by 2 spaces but now I prefer 4 (a habit I picked up at Microsoft). As I like to avoid very deeply nested constructs (and like to be able to see how deeply nested I am easily), I may even increase that further in the future.

What makes a game addictive?

Wednesday, September 24th, 2008

I have an unfortunate habit of getting addicted to computer games - this is one reason why I don't play them as much as I used to, and why when I do now play games, I usually pick one with a definite ending so that there's a natural place to stop.

But occasionally I do slip into excess playing of Tet4, Freecell or Spider Solitaire. I think what makes these games particularly addictive is that restarting them is a move which seems to bring you closer to winning. In all these games, the state of the game starts out as quite simple and the moves are obvious, but as you play things get more complexified and constrained until either you lose or (if possible) winning becomes inevitable. Restarting decomplexifies so when you lose your brain (which is still thinking in terms of the game) naturally reaches for the "restart" key.

Tet4 implemented in JavaScript

Tuesday, September 23rd, 2008

If you want to just skip to the game, the link is here.

Many years ago, I decided to write a Tetris game, just to see if I could. I succeeded, but the play area was very wide. So I added controls to make it adjustable. I figured the minimum sensible width was 4 (otherwise not all orientations of all pieces can be used). Playing Tetris on a board of width 4 (or Tet4 for short), I discovered, is very different to normal Tetris. Things change much more quickly, every piece has a profound effect on the state of the board and the normal Tetris strategies don't apply because it's impossible to play without leaving some gaps.

After playing for a while, I discovered that Tet4 was even more addictive than normal Tetris, and that it was much easier to get to the state of mental concentration known as "the zone" - where the entire rest of the world seems to melt away and there is nothing left except you and the falling blocks. When you notice this (and can do so without leaving the zone), it becomes almost an "out of body" experience - your conscious mind almost seems to observe from outside as your unconscious mind plays the game as if on some kind of automatic pilot.

Once I experienced this, I wanted to intensify it. I realized that a gradual increase in speed is intrinsic to the zone experience (otherwise it's just repetitive). But at a particular speed, the limiting factor becomes how fast your fingers can move to maneuvre the piece into place - this can take as many as 4 or 5 keystrokes with the normal Tetris controls. I realized that Tet4 only had at most 10 possible combinations of orientation and position for each block. Most people have 10 fingers so let's just assign one combination to each of 10 keys and have a finger corresponding to each key. This is the control mechanism that Tet4 uses, and it worked exactly as I had hoped.

This version is a rewritten version in JavaScript (because I wanted to learn the language, and because by making it run in a web browser more people would be able to play it). I've tested it on IE7, FireFox 3 and Chrome and it seems to work fine but there may be bugs with other browsers. Let me know if you find any. I've made some fairly substantial changes with this rewrite, which makes this a rather unusual and minimalist version of the game:

  • A display which shows the position and orientation for each key (on my original version I learned the combinations by trial and error). This display is really just to reduce the game's learning curve a bit - to truly get into the zone you'll need to commit all the combinations to muscle memory.
  • In my original version, the 10 keys just set the position and orientation of the tetroid - you had to press the spacebar (or wait for gravity) to drop the piece to the bottom and get the next piece. This meant a decrease in zone at the point where it gets too fast to use the space bar and you switch from 11-key mode to 10-key mode. In this version, the 10 keys drop the piece as well, so you always press 1 key per piece.
  • In my original version, the game ends rather suddenly when your tower gets so high that there isn't time to think before your active tetroid locks. In this version, there is a "curtain" which falls behind the playfield, and you always have until it gets all the way to the bottom before your tetroid drops. This means that the amount of time you have isn't dependent on the height of your tower. Also, the piece doesn't enter the playfield until it is dropped, so you can take care even with the endgame (you lose when you drop a piece and it would protrude from the top of the screen).
  • The single key control mechanism is rather unforgiving of misdrops, so I added an undo feature - at any time you can press Q to undo the last drop. This helps in learning the keys, but for getting the best scores it is best avoided as your score is also undone but the time speedup is not. This also doesn't allow you to "look into the future" (further than you can anyway with the next piece indicator) - a new tetroid is chosen at random for the new next piece whenever a piece is placed.
  • I added a persistant, global high score table which includes a facility for replaying the top 10 games.
  • I modified the colours and key to position correspondances to better take advantage of the symmetries. This puts me at a bit of a disadvantage (since I'm used to the original keys) but should be easier to learn.

One other thing I'd like to do in the future is make an ActionScript/Flash version so that it can be embedded into other pages. I started this but it turns out that without the official Adobe development kit, Flash is really hard to learn.

Editor comment shading

Monday, September 22nd, 2008

Syntax highlighting is an indespensible feature of a programmer's editor - I don't know how I managed without it and always miss it if for some reason it doesn't activate.

One aspect of syntax highlighting that editors don't seem to do very well, however, is multiple levels of meaning. For example, if I have some code that's commented out the "top level" of meaning of this text is that it's a comment. The next level underneath that is that it is code. Syntax highlighting just treats it as a comment, however, and makes it a single colour. It would be neat if, within a comment (or other section that can be interpreted on multiple levels) the editor would try to parse the text and tweak the colour based on that. Commented out code would then be syntax highlighted, but would also be tinted green to show that it was within a comment. Similar techniques can be used for "#ifdef 0"ed out code, code within macros and code within literal strings. This would make it much easier to work with this sort of "multi-levelled" code.

You should only ever need one GUID

Sunday, September 21st, 2008

Some people like to use GUIDs for everything - every interface, every class, every type library (I'm looking at you COM) - sometimes even every record in a database. This is ridiculous. GUIDs are overly verbose and difficult to work with, and far more unique than they need to be. We should be using hierarchical identifiers instead. There is exactly one circumstance I can think of in which a GUID would be necessary - you want to establish your own sub-namespace in some global registry and want to avoid colliding with anyone else who might happen to choose the same name as you. In the absence of some universal arbitrator who can dole out names, a GUID is an acceptable alternative. Once you have your GUID you can use it in any such situation - it only needs to be unique within that particular registry - it doesn't need to be universally unique.

No person or organization needs more than one GUID because a person or organization can make up their own namespace under that GUID.

If you follow the Java path and use domain names as the top level of your hierarchy, you don't even need a GUID. That's not a perfect solution, though, because sometimes domain names expire and fall into the wrong hands.

Vista annoyances

Saturday, September 20th, 2008
  1. The intensity of the shadows around the windows darkens in proportion to the number of shadows. When you have a pile of windows of the same size in the same place, it looks very weird.
  2. Sometimes remote desktop sessions from an XP machine freeze for no apparent reason - often it is necessary to close the session and reconnect to regain control. Curiously, rebooting seems to help but then it starts happening again at the next reboot - some parameter seems to be set at random on boot which affects how often this happens. Googling suggests a few solutions, none of which made any difference for me.
  3. Windows Mail works well enough but it's search facility is much slower than Outlook Express's was because mail is stored one message per file instead of one folder per file. I guess this would be less of a problem if I hadn't turned off the Windows search service, but this service is a resource hog.
  4. The Aero glass theme is okay but I think on the whole I still prefer the classic Windows 98 look. It's easier to tell when a window has focus. Glass is really just a gimmick - there's no usability advantage. And taking screenshots of windows leaks information through the translucent title bar. I would switch back to the classic look but certain pieces of Windows functionality look terrible on Vista classic. Also you can't do this without turning off compositing. I hope they fix that in a future version.
  5. You still get an annoying warning message when the number of icons on your start menu is too large. Why can't you just make the menu take up the entire vertical height of the screen and put as many icons as you can on it?
  6. You can't type a full path into the "Save As" dialog anymore! It says that "\" is not a valid character to have in a name. You have to navigate to the directory you want to save in and then just type the name.

Some might say that having worked at Microsoft throughout the entire Vista development cycle I should have been trying it out regularly and filing bugs. But they weren't paying me for that, and for most of Vista's development it was too buggy to be usable for my day to day work - I would have had to have dedicated a machine to it (which I probably would then have used rarely). Also, I tried filing bugs against Windows a few times and they always got resolved as "By design", "Won't fix" or "Postponed". It's not that the developers don't know about these things, they just don't have the resources to fix them.

More Linux usability problems

Friday, September 19th, 2008

Tab completion: If there are multiple possible completions, you have to type the next character and press tab. With Windows you can just press tab repeatedly until you get to the completion you want. Both systems are reasonable but it's annoying to use one when you're used to another. To be fair the Linux system has better worst-case behavior (O(n) keystrokes instead of O(exp(n))) and I think it's generally better for navigating hierarchies that you know well, but the Windows system is better for navigating unfamiliar hierarchies.

Shell glob expansion: while it makes some sense for the shell to expand wildcards and does ensure consistent wildcard expansion for different programs, I think this design causes more problems than it solves. Wildcards mean different things depending on what you're doing, so I think on balance it makes more sense for individual programs to do their own wildcard expansion as in Windows. For example in Windows I can type "findstr /sip foo *.txt" to find the string foo in all files ending in .txt in all subdirectories. With Linux this is significantly more complicated. This also leads to:

Too much escaping: with so much intelligence in the shell, many characters have shell meanings and aren't just passed along to programs on their command lines unless they are escaped. This leads to tons of escaping. Escape sequences aren't hard to write (at least if you know what the escape character is) but they are very hard to read - they look like line noise.

I made my bash prompt a different colour from the rest of the terminal and now it gets confused if press up when the path is so long it takes more than one line of the terminal window. It's just a bug (in readline I think) but this software is so mature there's no excuse for it.

It's annoying that closing a terminal window closes any programs that I've opened using it, even background ones. I don't want any processes to go away without me explicitly killing them.

Double-clicking on the top-left corner of a window should close it like it does on Windows, but it doesn't.

Adding programs to the start menu seems to be much more difficult than it is on Windows.

Emacs gripes

Thursday, September 18th, 2008
  1. Copy and Paste in CUA mode (Ctrl+C, Ctrl+V) doesn't interact with the desktop clipboard. If I want to copy and paste to another application I have to use the mouse. I know this isn't really the fault of Emacs, and might even work if I used it in windowed mode, except for:
  2. Windowed mode takes too long to start up if I'm using Emacs on a remote machine (as I often am). Sometimes even terminal mode takes a long time to start up if the machine is busy. I guess this has improved in recent years but it's still annoying when I have to wait for it. I'm sure Emacs gurus would tell me that I just need to leave it running continuously and use it as my desktop environment, but I would rather bend my software to support my way of working than the other way around.
  3. CUA mode doesn't work in PuTTY because the shift+cursor keys don't send the right sequences. Again, I know this isn't really the fault of Emacs.
  4. PgUp and PgDn don't work right. Specifically, if I'm in the middle of a long document and I press PgUp followed by PgDn, it should put the cursor back where it was. It doesn't. This drives me crazy as I lose track of where the cursor is in the window, where the cursor is in the document and where the document is relative to the window. It always takes me a few seconds to get my bearings again. I found some code which fixes this, but shifted PgUp and PgDn don't work - guess I'll have to learn some elisp.
  5. Long lines are wrapped instead of disappearing on the right by default. This seemed quite sensible at first (since you can see the whole document without horizontal scrolling) but it makes vertical cursor movement is unpredictable. In general I think I prefer horizontal scrolling. Fortunately this is easy to change.
  6. Spaces aren't removed from the ends of lines. This mostly trips me up when I copy and paste using the desktop clipboard from another emacs session in a different terminal window - all the lines that end partway through a /* comment */ get spaces at the end of them for some reason.
  7. It indents with tabs instead of spaces. I know how to turn this off but it means that most GNU source code has tabs. This means that the indentation of these files only looks correct when the tab stops are every 8 characters. Also it means that when I move the cursor left or right I can't predict whether it will move 1 space, 8 or something in between, and often end up overshooting with my cursor movement. Also the fact that pressing Tab doesn't insert a tab character has tripped me up a few times.
  8. I couldn't get visible whitespace mode to work - it seems to be incompatible with my terminal. I think this might be fixed with Unicode support in the next version of Emacs.
  9. There don't seem to be any good colour schemes for terminal mode, only for windowed mode. Again I guess this is the fault of the terminal for having only a fixed palette of 16 colours. I tweaked the default colours a bit to get something reasonable but it's still a bit garish, and writing a colour scheme from scratch is daunting.
  10. Sometimes font-lock mode parses things incorrectly. Most editors have similar problems, but when it happens in Emacs it's more confusing because Emacs tries to be more intelligent.
  11. Not all navigation can be done with the shift keys for selection. I particularly miss shift-home and shift-end - combinations like these are much more intuitive and discoverable than special purpose commands like "delete to end of line".
  12. Undo doesn't work very intuitively - if I undo too far it undoes the undoing. I like the Windows Ctrl+Z to undo, Ctrl+Y to redo system more.
  13. Backspace in incremental search goes to the previous hit. I often want to search for something and then when I find it search for something else from that point. This operation always seems to trip me up. Okay it's easy to work around but still annoying.

Many of these can probably be fixed if I spend enough time tinkering with it.

I suppose to be fair I should list the things I like about Emacs too:

  1. Extensibility. Nothing else even comes close.
  2. Community. I don't think any other editor has had so much written about it online.
  3. C programming features. It's terrific that I can make my code follow the GNU style guidelines just by pressing Tab somewhere on the line.
  4. It highlights extraneous spaces at the end of lines.
  5. Apart from point 13 above, incremental search works great.
  6. I like the way tab completion and some other things work the same way as in bash. Consistency is nice. Nothing outside the Free Software world is this well-factored.