Tet4 is a game of luck

October 3rd, 2009

I used to think it might be possible to come up with some kind of general algorithm for playing Tet4 indefinitely, but I now suspect that no matter how tall the pit is and how many pieces you can look ahead, there are always sequences of pieces that will guarantee a loss - in particular, I don't think there is any way to place the sequence SSZ without adding a row every 6 pieces. This means that ultimately, Tet4 is a game of luck. However, most sequences that occur in practice do seem to be solvable - many years ago I wrote a program to play Tet4 automatically, and it seems to be able to play indefinitely. So in practice speed and accuracy are more important than luck for winning the game.

Side note: On the most recent occasion that I tried to figure this out, I attempted to resurrect my old Tet4-auto-player program (not trivial, since it's a graphical DOS program and hence doesn't work on Vista) and modify it to see if it could solve the SSZ repeated sequence. Once I'd got it running, I was amused to see that it was already set up to solve the SSZ repeated sequence - I'd done the exact same thing before and forgotten all about it!

Talking of Tet4, it's had a couple of updates since I last mentioned it here. The original version has had a bug fix so that playback of games in which the curtain reaches the bottom works properly. Also, a second version has been added, which is exactly the same but with no undo (and which is therefore less forgiving). This may be one of very few occasions on which the second version of a piece of software is identical to the first version apart from the removal of a significant feature.

When I first wrote the Javascript version of Tet4, I took the opportunity to change the keys from the DOS version a bit, to make them more logical and faster (so that if there were 5 or fewer possible positions for a piece, either hand could be used, for example). This meant that I kept making mistakes because (although I hadn't played Tet4 for years) my muscle memory was still trained with the old keys. Hence I added the undo feature. But in making the game more forgiving, I changed its nature significantly.

There's two ways I could have added undo to Tet4. One way is to undo the random number generator one step when undoing. The other is not to. Undoing the random number generator essentially gives you unlimited lookahead capability. To look ahead n pieces, drop those n pieces (anywhere) and then undo n times. I didn't want to grant that power, so I wrote it in such a way that dropping n pieces and undoing them changes the random number generator state. However, this gives an even stronger ability - now you can essentially choose which piece comes next by dropping and undoing until you get the piece you want. I think this makes the game too easy (at least until it gets really fast), hence the new version.

Text editor as shell

October 2nd, 2009

The text editor interface I talked about yesterday is so good I find myself wanting to use for things other than text editing. In particular, I'd love to be able to use it as my command line interface. Most command line shells are annoying because you can't use the keyboard to move the cursor up into a program's output to copy it - you generally have to resort to the mouse. I think it would be far better to treat the entire command-line session as a document. One way to do this would be to have a text editor with a special key combination (say Ctrl+Enter) which takes the contents of the current line, executes it as a command, and pastes the resulting output into the document immediately below, followed by a prompt, followed by the cursor.

One useful aspect of command line interfaces is tab completion - it would be nice if there were a way to make this continue to work in the editor interface. Perhaps tab could be interpreted to mean "tab completion" if the line above the cursor was a prompt (maybe prompt lines are "special" in some way, though that counteracts the principle that there should be no "hidden" information). If prompt lines are special, then maybe pressing Enter on the line below would suffice for running commands, instead of a special key combination for this.

A variation on this idea would be to allow editing of multiple files in a single buffer. Suppose you're using the editor interface and "cat" (Unix) or "type" (Windows) the contents of a file, causing it to be inserted into the buffer. I think it would be tremendously useful if that file was then "live" and could be edited by just moving the cursor up into its contents, changing things, and then hitting some other key (perhaps Alt+S?) to save it. Again this goes against the "no hidden information" principle, though. Perhaps one of the ASCII control codes that aren't usually found in text files could be repurposed to have special meaning to the editor, to signify a line that is a prompt or contains the path to an embedded file.

Commandments of text editing

October 1st, 2009

As a programmer, one of the main user interfaces I work with day-to-day is a text editor. I find this interface to be extremely natural and efficient, when it's done right. A text editor isn't the same thing as a wordprocessor - text is all monospaced and there are no fonts or embellishments like bold, italic and underline. Colour may be present for syntax highlighting purposes, but you can't set the colour of individual characters. An important aspect of a text editor is that what you see should be exactly what you get - there should be no "hidden" information. For this reason I eschew spaces at the ends of lines and tab characters (though unfortunately documents with such things sometimes have to be handled for compatibility reasons).

I think that a find that a text editor is best thought of as a (very large) 2D grid, with one character in each cell of the grid. Moving around this grid should be predictable - the left arrow key should always move the cursor one space to the left (unless it's at the beginning of the line, in which case it should do nothing). Similarly for the other arrow keys. Home should always move the cursor to column 1 of the current line, End should move it to the cell after the last character in the current line. If you move it the cursor from the end of a long line up or down to a shorter line, the cursor shouldn't jump to the end of the line, it should remain in the same column, and spaces should be inserted as necessary if you type in such non-existent locations.

To get around the document more quickly, the PgUp and PgDn keys should scroll the document up and down a screenful at a time (leaving the cursor in the same location on the screen). Pressing the arrow keys with Ctrl held down should move the cursor a predictable amount (say 5 characters/lines at a time). Many editors take Ctrl+Left and Ctrl+Right to mean "move left a word" and "move right a word" but I think consistency in number of characters moved is more useful.

Ctrl+PgUp and Ctrl+PgDn should take one to the beginning or end of the document respectively.

Moving through the document while keeping the cursor in the same place is useful too - let's use the Alt key for this. If Alt is held down the same move is performed relative to the document but the cursor is kept in the same place on the screen.

The shift key should be used to select text (i.e. if you move the cursor with the shift key held down, the text from the point where you pressed shift to the current cursor location should be selected. One should also be able to use the mouse for selecting text (though it's rare that I find myself doing this in a good text editor). Selecting rectangular areas of text is also occasionally useful, though sufficiently rare not to need a modifier key dedicated to it. Typing in a selection should delete it, as should the Del key when text is selected. When text is not selected, Del should delete the character or line break to the right of the cursor.

It's very important that all the CUA keyboard shortcuts work:
Ctrl+C for copy
Ctrl+X for cut
Ctrl+V for paste
Ctrl+Z for undo
Ctrl+Y for redo
Ctrl+S for save
Ctrl+A for select entire document
Ctrl+F for search

The insert key should toggle insert mode (very useful when working with fixed-width data).

A few other things I miss if they aren't there: F3 to load a new file. Alt+W to save the current selection as a new file. Alt+R to read a file and insert it into the document at the cursor position. Esc to switch to a menu for accessing less often used functions such as macros and window splitting. Alt+F6 for switching to the next file in the ring.

The text editor I have found that follows most of these ideals is TSE, and it's always one of the first things I install on a new machine. With some tweaking I could probably get it even closer to my ideal. Unfortunately it doesn't run well on Linux at the moment. For this reason (and others) one of these days I may write my own text editor to replace it.

Half-size Lego

September 30th, 2009

I've been playing with Lego again recently, since Alexander enjoys it and wants me to help. He has three sorts - the normal Lego bricks, the double-size Duplo bricks and the quadruple-size Quattro bricks. The Lego bricks are not directly compatible with the Quattro bricks but the Duplo bricks are compatible with both and can be used as an interface. This allows one to create models that are very large (using the Quattro bricks) but also detailed (with the Duplo and Lego bricks).

This leads one to wonder whether a fourth size of brick would be possible, with a size half that of normal Lego bricks. It would not be directly compatible with Duplo or Quattro, but would be compatible with Lego bricks in just the same way that Lego bricks are compatible with Duplo bricks.

This is not always possible, since the "bumps" in Duplo and Quattro bricks have indentations in them which are necessary for compatibility. Most Lego bricks don't have these. But some (particularly Technic Lego) do. This leads me to wonder if the Lego company at one point planned to make such half-size bricks. I'm guessing the reason they didn't was that the pieces would be too fiddly for most fingers, too easily lost/swallowed and too difficult to manufacture with the required accuracy (the edges might also have to be dangerously sharp in order to join properly).

The Lego company did actually make a smaller version of Lego called Modulex but the bricks are 5/8 size rather than half size and are therefore not compatible.

Welcome Penelope

September 3rd, 2009

Penelope Beatrice Fern Jenner was born at 4:22pm on the 2nd of September 2009, weighing 7 pounds 14.3 ounces. Mother and baby are both doing very well. Lots of pictures below - click to embiggen.

Yes, I'm metablogging again

August 24th, 2009

Recently, I got a piece of blogspam emploring me to turn off "no follow" on my comment links in order to increase my blog's readership. The "rel='nofollow'" attribute on a link (for those who don't know) means "I do not endorse this link - it was added by a user of the site, and may be spammy in nature", and is used by search engines (Google in particular) to mean that it should not contribute to the linked page's pagerank.

My first reaction was revulsion that spammers would use such a down-right dirty tactic. But then I realised that if spammers are doing this, it means that nofollow is working - it's getting harder to get non-nofollow links to your site via spam, and therefore harder to fraudulently increase your pagerank. This is a very good thing for bloggers because if spamming blogs is no longer effective, it will stop. So (in order to counteract any damage that this "disable nofollow" spam might do, I encourage bloggers and operators of other sites where users can add links to ensure that the "rel='nofollow'" attribute is placed on all links that you haven't personally checked and endorsed. It has no effect either way on your readership (only producing quality content that people want to read can do that) but it makes the efforts of spammers useless. Also beware of anyone coming out of the blue and wanting you to link to their site (even if they're offering reciprocal linking) - chances are they're SEO spammers.

Another Argulator feature that never was

August 23rd, 2009

One Argulator feature I was, at one point, absolutely convinced that I needed was a web page cache/archive. The trouble is that one often wants to link to other sources of information when making statements (for example to cite a source or backup some assertion). On the web the natural way to do that is to use a link to a URL elsewhere on the web. But URLs are fickle things - the resource they point to can change with time or disappear altogether (something I've always tried to avoid happening with this site - I believe pretty much every reenigne.org URL that was ever valid is still valid today and points to the same information, even if it has been tweaked a bit over the years).

If you're referring to a URL in an Argulator statement, though, that's a problem because a statement's meaning is never supposed to change once it's been created. If it could change, you're putting words into the mouths of all the people who have expressed an opinion on that statement. So I wanted a way to make sure that web pages mentioned in Argulator statements never changed or disappeared, which meant bringing them under the control of the Argulator server. The idea was that when someone mentioned a URL, the Argulator server would fetch that URL, along with the graphics and CSS required to make the page display properly, just as if it was a web browser. Then it would store these files on the server and show it's cached copy (much like Google's cache or the Internet Archive does) when a link is clicked.

But that turns out to be surprisingly difficult, and opens up a whole cannery of worms. Security is a big problem. To ensure nothing bad is ever served from the Argulator site we need to strip out all executable content (script, Java, ActiveX, flash, XUL, Silverlight etc.) from the pages. Which means parsing all the CSS and tag attributes. After that treatment the page might not render at all (especially if it contains things like protection from ad-blockers). Such a cache should also respect the directive and refuse to cache pages whose authors do not want them cached.

Then what do we do if someone requests we take down a cached page (maybe it contains illegal content, or the content owner complains on copyright grounds)? We don't want to create work for ourselves, but on the other hand we don't want to make it too easy for people to sabotage statements they disagree with by getting that statement's sources removed.

I considered the possibility of just linking to pages on the Internet Archive but that may stifle discussions of current events since the IA doesn't serve any pages less than 6 months old. But the idea of linking to an external site instead of keeping the pages on Argulator got me thinking. Such a cache may be useful for other things besides Argulator, so maybe the cache should be a separate (but associated) site. Realizing that such a site would be useful in its own right got me to wondering why nobody had done that before, which got me to realizing that maybe they had. A few google searches later I found several sites which did exactly that. I then realized that Argulator should stick to its core competency and leave the web caching to the experts. Statements can link to any URL anywhere on the internet, but it is recommended for the sake of stability to link to one of these caching services. I hope Argulatists will realize that URLs pointing elsewhere might not have the same contents as they had when the statement was originally created, and take such statements with the appropriate amount of salt.

Why Argulator requires users to create an account before creating statements

August 22nd, 2009

One thing I wanted to do with Argulator but didn't quite manage was to eliminate the login barrier - to allow people to use any part of the site without creating an account first. The idea was to essentially put all the site's data under multi-headed version control, and then put each temporary user's changes in its own branch, so that they wouldn't affect logged-in users or other temporary users. When a temporary user did decide to create an account, then those changes would be merged into the main branch.

The trouble with that is that arbitrary merges are very difficult. What happens if two temporary users happen to create identical statements (or statements which are similar enough that they would be duplicates if they had both been created by logged-in users)? What happens if a statement is deleted in one branch and created in another? Adding UI for resolving arbitrary merge conflicts would be difficult enough, but forcing users to go through it when they create an account would be downright user-hostile.

I've therefore come to the conclusion that the principle of "you can do anything without creating an account" only works for sites where users' actions don't really affect other users (such as a shopping site).

Error handling in Argulator

August 21st, 2009

One interesting thing I had to think about for Argulator was what to do with error messages. There are three types of errors that can occur in applications such as these when classified by the action that needs to be taken.

The first type of errors are things that real users can run into, and which are not problems with the program itself. For example, visiting a statement's page and finding that it has been deleted. These deserve user-friendly error messages and are usually pretty straightforward.

The second type are validation errors, for example someone trying to attempt SQL injection. There's no point giving a user-friendly error message for these as long as we're sure real users can't run into it, for example if we're doing the same validation on the client side, so these are generally handled just with a "die" call.

The third type are programming errors on my part (assertion failures, essentially). These also deserve user-friendly error messages (without details, just a generic "uh oh, something went wrong, we're on it") but they also send me email with some details so that I can try to figure out what's going wrong.

The trouble is it's not always easy to tell what is the second type and what is the third type. Some things seem like they would have to be programming errors but maybe there's some validation missing which means that they could be triggered by invalid input. Some things seem like they could only be triggered by invalid input but perhaps there's a programming error which could cause them to be exposed to real users.

I hope I haven't made too many mistakes classifying these error messages. When choosing whether a particular error is a type 2 or a type 3, I've tried to err on the side of type 3 because the consequences of getting it wrong are less severe (a type 2 misclassified as a type 3 just results in me getting spam, a type 3 misclassified as a type 2 results in me not getting notified, and the user getting an unfriendly error message). Please let me know if you see any error messages in Argulator that don't seem particularly friendly (for example, just text on an empty page instead of a popup div with a button on top of a normal page).

Other principles and innovations in Argulator

August 20th, 2009

Some other little bits of Argulator that I'm quite proud of:

  • Pieces of UI that would be represented as a modal dialog in a desktop application (such as the search results, or the create user dialog) use an HTML div with absolute positioning, that "floats" above the rest of the page. Behind this is a window-filling div with a semi-transparent background, which causes the rest of the page to darken, focussing the user's attention on the dialog. The popup "window" can be dragged with the mouse and scrolling still works, so the rest of the page can still be read if one needs to refer to it during the dialog interaction.
  • When there is a script error, it is caught and transmitted to the server with an AJAX request. The server then emails it to me so that I can fix the bug. If this fails, we do nothing in order to avoid an infinite loop. I wish more sites would do this - there is so much broken Javascript out there it's pretty much impossible to surf normally with Javascript debugging switched on. This means I either need to switch my browser between "debugging mode" and "surfing mode" when starting/stopping work on Argulator, or use a different browser for debugging than I use for surfing.
  • Argulator uses 96-bit random character strings (encoded as 16 characters of 6 bits each) all over the place, for cookies, pseudo-cookies (for preventing cross-site request forgery attacks), salts and email authentication. These use few enough characters that a URL containing one easily fits on one line of an email message, but have enough entropy to make them unguessable for security purposes. They are generated using /dev/urandom if available, otherwise they are generated with a cryptographic hash, an entropy pool, and the current time.

Speaking of cryptography, Argulator's main cryptographic shortcoming is that the user's password is sent across the wire in plaintext (in a POST request, so it doesn't show up in server logs or caches). I thought seriously about hashing it on the client side (implementing a cryptographic hash function in Javascript). But in the end I decided that this was too much effort for too little gain. The danger of having your password stolen comes mostly from malware and keystroke loggers on the client side, which this does nothing to protect, or having your password stored in plain text in the database (it isn't - we salt it and hash it before storing it). Lots of other sites also send passwords in plaintext. And even if I did implement this, most users would not even be able to tell - really the way to solve this is to use https instead of http for authentication, which I will do if Argulator gets sufficiently popular that I can justify the expense. Then you'll get the little padlock icon in your web browser and you know you're secure. Until then, just don't use the same password for Argulator that you use for important things like your bank account.