The ALFE type system

October 12th, 2012

I want to write a series of posts about the rich type system I plan to implement for ALFE. This post will serve as an index to them.

I recently tried doing a bit of programming in Haskell, and although I didn't produce anything worth showing off, I did get a bit of a feel for the language and how it works. One of my favorite things about Haskell is the rich type system, and I want to use some of its ideas in ALFE. Some of the ALFE types below are inspired by Haskell types.

Given types Foo, Bar and Baz I would like ALFE to have the following types:

  • Foo* - a pointer type.
  • Foo(Bar) (aka Foo -> Bar) - a function type.
  • Foo | Bar - a "sum" type.
  • Foo? - a "Maybe" type.
  • (Foo, Bar) - a "product" type.
  • [Foo] - a "sequence" type.
  • Foo[2] - a fixed length array type.
  • Bottom - the "bottom" type.
  • Variant - the "variant" type.
  • Label - the "label" type.
  • Complex<Foo> - a complex number type.
  • Rational<Foo> - a rational type.
  • String - the string type (usually represented in memory as UTF-8 in some form).
  • CodePoint - a type capable of holding any Unicode code point.
  • Fixed<Foo, Bar> - a fixed point fractional type.
  • Boolean - the boolean type.
  • Int1, Int2, Int4, Int8, Byte, Int16, Int32, Int64, Int128, Int256, Int512, UInt1, UInt2, UInt4, UInt8, UByte, UInt16, UInt32, UInt64, UInt128, UInt256, UInt512, NInt1, NInt2, NInt4, NInt8, NByte, NInt16, NInt32, NInt64, NInt128, NInt256, NInt512 - fixed-width integer types.
  • Word, Int, HalfWord, QuarterWord, DoubleWord, QuadWord, UWord, UInt, UHalfWord, UQuarterWord, UDoubleWord, UQuadWord, NWord, NInt, NHalfWord, NQuarterWord, NDoubleWord, NQuadWord, FastInt1, FastInt2, FastInt4, FastInt8, FastByte, FastInt16, FastInt32, FastInt64, FastInt128, FastInt256, FastInt512, FastUInt1, FastUInt2, FastUInt4, FastUInt8, FastUByte, FastUInt16, FastUInt32, FastUInt64, FastUInt128, FastUInt256, FastUInt512, FastNInt1, FastNInt2, FastNInt4, FastNInt8, FastNByte, FastNInt16, FastNInt32, FastNInt64, FastNInt128, FastNInt256, FastNInt512 - machine-dependent integer types.
  • WordString, Integer, Unsigned - arbitrary-precision integer types.
  • Float16, Float32, Float64, Float128 - fixed-length floating-point types.
  • Float - the machine-dependent floating point type.
  • Floating - arbitrary-precision floating point type.
  • Concrete (possibly also specializations like Length, Time, Area, MagneticFluxDensity etc.)

There's still a lot about this that isn't finalized. For example, I might use some templates to avoid having all those fixed-width and machine-dependent types.
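For comparison, many of these have rough analogues in C++'s standard library. The following is purely a sketch of those correspondences, not anything about ALFE's semantics; in particular, my reading of Foo(Bar) as "takes a Bar, returns a Foo" is an assumption:

```cpp
#include <array>
#include <optional>
#include <tuple>
#include <variant>
#include <vector>

struct Foo {}; struct Bar {};

using PointerType  = Foo*;                   // Foo*       - a pointer type
using FunctionType = Foo (*)(Bar);           // Foo(Bar)   - a function type
using SumType      = std::variant<Foo, Bar>; // Foo | Bar  - a "sum" type
using MaybeType    = std::optional<Foo>;     // Foo?       - a "Maybe" type
using ProductType  = std::tuple<Foo, Bar>;   // (Foo, Bar) - a "product" type
using SequenceType = std::vector<Foo>;       // [Foo]      - a "sequence" type
using ArrayType    = std::array<Foo, 2>;     // Foo[2]     - a fixed-length array

int main() { return 0; }
```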

Government as singleton

October 11th, 2012

Sometimes I read things on the internet written by libertarians. I used to read reddit a lot and that has always been a bit of a libertarian stronghold. Occasionally I read things on reason.com. I think I'm pretty squarely on the left side of most political systems but libertarianism is one part of conservatism that I do have some sympathy for. I like the idea of legalizing all drugs, for example, not getting into wars without a really good reason, and generally not imposing unnecessary rules on people or companies.

Those who describe themselves as libertarians often seem to take this principle to extremes, though. One sentiment in particular that I've noticed being expressed repeatedly is that all taxation is theft. I disagree with this - theft is illegal while taxes are legal. Also, one has a say in taxes (via voting and other forms of participation in the political process) but no say in getting robbed. Finally, one doesn't get anything back from getting robbed but taxes pay for useful government services and infrastructure. To me the accusation seems like an instance of the worst argument in the world.

Another thing that libertarians say is that government has a monopoly on (legal) aggression and violence. That sounds like a bad thing (because monopolies are bad, and violence and aggression are bad). But in this case two bads make a good - you don't want multiple violent organizations competing, as that would not tend to minimize the amount of violence. Since we can't eliminate legalized violence altogether (otherwise we would have no way to arrest an uncooperative murder suspect), it's best that a single (accountable) organization has that monopoly.

There are other things that government has a natural monopoly over - things that benefit society as a whole but won't get done by the markets because there isn't any profit in being the one to do them - things like making sure the poor have enough to eat and access to life-saving and preventative medical care. Another case is where having multiple competing organizations would cause practical difficulties: I don't want my house to be connected to six different electrical networks, six different water supplies, six different sewers, six different telephone networks, have driveways connecting to six different road networks, have six different garbage collectors coming by and I don't want my town to be served by six incompatible rail networks. The basic provision of utilities like these is best done by monopoly - even if building, servicing and billing can have competing providers (here in the UK I can choose from many different electricity companies to bill me each month, but they all have the same number to call if there is a power cut). Similarly, the military is probably best done centrally, otherwise different militaries representing different interests of the same country might end up fighting each other!

It makes sense to have one single organization take care of all the things that need to be done that can't or won't be taken care of by markets for one reason or another, and that organization is what we call government.

There's a similar concept in software engineering called a singleton object - a single chunk of memory where you put all the things that there is only one of in the program. It's bad engineering practice to put things in the singleton that don't really need to be there, because it leads to inflexible programs that allow you to have only one at a time of something that you might want to have multiple instances of. Similarly, it's bad practice for the government to do stuff that the market can do better - I wouldn't want government to get into the business of designing and making laptops, for example, since the market does a great job at that.
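For readers who haven't come across the pattern, a minimal C++ singleton looks something like this - the point being that anything you move inside it becomes something the program can only ever have one of:

```cpp
// A minimal singleton: one instance, globally accessible, impossible
// to construct or copy from outside.
class Government
{
public:
    static Government& instance()
    {
        static Government theOne;  // constructed once, on first use
        return theOne;
    }
private:
    Government() {}                          // nobody else can construct one
    Government(const Government&) = delete;  // or copy it
};

int main()
{
    Government& g = Government::instance();
    (void)g;
}
```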

XT reset switch

October 10th, 2012

I've mentioned before that I have a system for remotely running programs on my XT. Of course, since I'm running native code (mostly written in assembly code) on a machine without any kind of memory protection (and especially since I'm sometimes doing crazy things like disabling DRAM refresh), these programs will inevitably sometimes crash the entire machine. I don't want to have to go and reset it manually every time this happens, so I needed a remote reset switch.

I found this post on the Vintage Computer Forums describing exactly the modification I wanted to make, except that it's connected to a manual switch. It's trivial to go from a switch that grounds a line to programmatically grounding a line, though, so I just hooked it up to the Arduino-based keyboard port manufacturing test device hack I already had, and now sending a particular byte over the serial line causes the Arduino to ground the power good line for 250ms and thereby reset the XT.
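The Arduino side of this is only a few lines. Here's a sketch of the idea - the pin number and trigger byte below are placeholders, not the actual values from my setup:

```cpp
const int POWER_GOOD_PIN = 2;    // hypothetical pin wired to the power good line
const byte RESET_COMMAND = 0x7f; // hypothetical trigger byte

void setup()
{
    Serial.begin(9600);
    pinMode(POWER_GOOD_PIN, INPUT);  // high impedance - the XT runs normally
}

void loop()
{
    if (Serial.available() > 0 && Serial.read() == RESET_COMMAND) {
        digitalWrite(POWER_GOOD_PIN, LOW);
        pinMode(POWER_GOOD_PIN, OUTPUT);  // ground the power good line...
        delay(250);                       // ...for 250ms...
        pinMode(POWER_GOOD_PIN, INPUT);   // ...then release it, resetting the XT
    }
}
```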

Computer-controlled TV

October 9th, 2012

Before I bought my XT (but when I was planning on getting one) I knew I'd need a monitor for it. Fortunately for me, the region I was living in had recently transitioned from analog to digital TV signals, so a lot of people were disposing of TVs that were perfectly good apart from not having a built-in ATSC decoder. So I didn't have to walk past very many yard sales before I found an ideal little 13" Orion TV1334A. The only downside was that it didn't have a remote control, which I discovered I needed in order to change the brightness and contrast (without adjusting those, the old-style CGA's rather poor composite output is hard to watch). I tried to make a fake remote using an Arduino, an IR LED that I pulled out of an old DVD remote, IRremote and LIRC's remotes database, but unfortunately I found that the latter did not contain codes for the TV1334A.

Fortunately I was able to find a suitable remote on eBay for a reasonable price (albeit still slightly more than I paid for the TV, which annoys me slightly). If I had just done that to start with that would probably be the end of it, but now that I've gone to the trouble of writing some IR code for the Arduino I want to be able to use it! This isn't as useless as it might sound, because it would be nice to be able to control the TV from the computer - to turn it on from another room, run a CGA program remotely, view the result over a webcam and then turn it off again.
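For what it's worth, the sending side is straightforward with the IRremote library. Something like the following - the protocol and code here are stand-ins purely for illustration, since the TV1334A's real codes would have to be captured from the actual remote:

```cpp
#include <IRremote.h>  // classic IRremote 2.x API

IRsend irsend;  // transmits on PWM pin 3 on an Uno

// Hypothetical code - LIRC's database lacks the TV1334A, so the real
// values would come from capturing the physical remote.
const unsigned long POWER_CODE = 0x20DF10EF;

void setup()
{
    Serial.begin(9600);
}

void loop()
{
    // Send the power code when the controlling PC asks for it.
    if (Serial.available() > 0 && Serial.read() == 'p')
        irsend.sendNEC(POWER_CODE, 32);  // 32-bit NEC frame
}
```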

Comparison of CGA card versions

October 8th, 2012

Over at the Vintage Computer Forums I asked about the differences between CGA card versions.

The main change that occurred during the CGA's lifetime seems to be to do with composite output. In particular, the old CGA (part numbers 1804472 and 1501486) has the following formula for the composite output voltage: COMPOSITE = 0.72*CHROMA + 0.28*I. The new CGA (part numbers 1504910 and 1501981) has the formula COMPOSITE = 0.29*CHROMA + 0.1*R + 0.22*G + 0.07*B + 0.32*I. The consequences of this are more obvious on a monochrome monitor, since there CHROMA only makes 3 different shades of grey (0 for colours 0 and 8, 1 for colours 7 and 15 and 0.5 for all the others). So an old CGA will only yield 6 different shades of grey on a monochrome monitor, while on a new CGA the 16 different colours will yield 16 (theoretically) different shades of grey (though some of them may be very similar).
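As a quick sanity check of those formulas, here's a small program that computes the composite level of each of the 16 colours under both, with CHROMA approximated as described above and R, G, B, I taken from the bits of the colour index:

```cpp
#include <cstdio>

int main()
{
    for (int c = 0; c < 16; ++c) {
        double i = (c >> 3) & 1, r = (c >> 2) & 1;
        double g = (c >> 1) & 1, b = c & 1;
        // CHROMA: 0 for colours 0 and 8, 1 for 7 and 15, 0.5 for the rest.
        double chroma = (c == 0 || c == 8) ? 0.0
                      : (c == 7 || c == 15) ? 1.0 : 0.5;
        double oldCga = 0.72*chroma + 0.28*i;
        double newCga = 0.29*chroma + 0.10*r + 0.22*g + 0.07*b + 0.32*i;
        printf("colour %2d: old %.2f  new %.2f\n", c, oldCga, newCga);
    }
    return 0;
}
```

Running this shows the old formula collapsing the 16 colours to just 6 distinct levels, while the new one gives 16 (some of them, like colours 7 and 10, very close together) - matching the shade counts above.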

On a colour monitor, a new CGA will give a lower saturation than an old CGA, but the brightnesses of the different colours will seem more appropriate. On an old CGA, blue seems lighter than it should be while yellow seems darker - the new CGA fixes that.

The other thing about new CGA is that its composite output is a better match to standard NTSC than old CGA's, which means that the results will be more consistent between different monitors. The old CGA's color burst has both too high of an amplitude and too high of a DC offset, which causes many NTSC output devices to reduce the gain, making the resulting image too dark (I have to turn the brightness and contrast right up to get a decent image from my 1501486 card on the TV I connect it to).

That's the theory - to check how well it works in practice, I'd like to do some side-by-side comparisons. Rather than trying to buy a new CGA card (which is likely to be expensive, might be unsuccessful and probably wouldn't work too well side-by-side with another CGA card in the same machine anyway), I want to make an add-on card for my old CGA card which adds a second composite output, with new-style colours. The differences between the two cards are localized to a small part of the circuit, so there isn't too much to duplicate.

Improved composite mode support for DOSBox

October 7th, 2012

I recently contributed to a DOSBox patch to make composite output work in all CGA graphics modes. Most CGA composite games use BIOS mode 6 (port 0x3d8 value 0x1a, aka 640x200 1bpp mode) which gives a nice range of colours. However, there are a rare few games which use a 2bpp mode but have 3.57MHz vertical lines which are quite obviously designed to yield a different palette on the composite output. DOSBox currently shows the output of such games as they would appear on a digital (aka TTL or RGBI) monitor, which isn't always what the game author intended - the example that started the thread was Fooblitzky.

I had already written code to simulate composite CGA in 2bpp mode (and indeed all modes) but there is a complication - DOSBox is based around a 256 colour palette and my code assumes 24-bit colour. The first 16 colours are also reserved for the digital CGA colours so that the palette entries don't have to be reloaded when switching between composite and digital, so only 240 colours can be used for composite CGA. The current DOSBox CGA composite implementation uses 80 palette entries - 16 colours (one for each bit pattern) times 5 brightness levels (0, 1, 2, 3 or 4 pixels lit). I realized that actually only 16 of these palette entries are really needed, since the same information is being used to dereference both the "bit pattern" table and the "brightness" table. Also, DOSBox's current implementation isn't quite right since some of the colour fringes are desaturated - look at the top Tapper screenshot (the one from DOSBox) here and compare it to the more correct version below - look in particular at the middle of the "SODA" sign in the window, the right edge of the D and the left edge of the O, which is grey in DOSBox (missing its fringing entirely). Since DOSBox uses a (1, 1, 1, 1) kernel for its NTSC filter, there are only actually 16 possible colour combinations (though they are permuted depending on the colour carrier phase).

I realized that a similar technique might be possible for the 2bpp modes. Each output composite pixel depends on the colours of four consecutive pixels of hdot width. These pixels cover at most 3 consecutive ldots, so any given pixel position depends on at most 6 bits of video data. It also depends on 2 bits of x position, so we need 8 bits of palette entries, or 256 colours - just slightly too many. There's an easy way to reduce the number of palette entries though - half of the output hdots depend on only 2 consecutive ldots (since the sampled hdots exactly cover 2 ldots). So even hdots have 16 possible colours and odd hdots have 64, for a total of 16+64+16+64 = 160 colours - plenty to spare. What's more, the "render" part of the new algorithm is even faster than it used to be (although the mode setting part is probably much slower - however it's run sufficiently rarely that there's no need to optimize it).
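To make that layout concrete, here's a sketch of the indexing scheme - not the actual patch code; the function and the base-offset table are mine:

```cpp
#include <cstdint>
#include <cstdio>

// Map an output hdot to a palette index. Phases 0 and 2 (even hdots)
// need only 4 bits of video data (2 ldots at 2bpp, 16 entries each);
// phases 1 and 3 (odd hdots) need 6 bits (3 ldots, 64 entries each),
// giving the 16+64+16+64 = 160 entry layout described above.
uint8_t compositePaletteIndex(int hdot, uint8_t videoBits)
{
    static const uint8_t base[4] = {0, 16, 80, 96};
    int phase = hdot & 3;                      // 2 bits of x position
    uint8_t mask = (phase & 1) ? 0x3f : 0x0f;  // 6 or 4 bits of video data
    return 16 + base[phase] + (videoBits & mask);  // first 16 stay digital
}

int main()
{
    printf("%d\n", compositePaletteIndex(3, 0x2a));  // 16 + 96 + 42 = 154
}
```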

ISA bus sniffer

October 5th, 2012

One of the trickiest parts of writing my cycle-exact 8088 emulator is going to be figuring out exactly when each part of each instruction is executed - in particular, at what point during each instruction's execution is each of its bytes removed from the prefetch queue? And (for instructions which do IO) at which points during the execution are those IO requests sent from the Execution Unit to the Bus Interface Unit?

I was originally thinking that I would have to devise a clever set of experiments to find out - make a hypothesis about the timings, devise an experiment which behaved differently depending on whether that hypothesis was true or not (existence proof: if such an experiment were not possible I wouldn't care about the result for emulation purposes), rinse and repeat until the observed behavior of the emulator stops deviating from the observed behavior of the actual machine.

However, I have learned that there is an easier way to go about it. It turns out that the CPU outputs a couple of bits of information concerning the state of the prefetch queue on two of its pins (QS0 and QS1), allowing us to distinguish between 4 possible operations which can occur on each cycle: first byte of opcode removed, subsequent byte of opcode removed, queue emptied and no change. Being able to read that information (along with exactly what the bus is doing) would make figuring all this out much easier. I didn't want to use a logic probe to do this because (among other reasons) I wanted to be able to set up a large number of experiments and run them all automatically. So instead I have designed an ISA card which (completely transparently to the PC or XT it's plugged into) uses a microcontroller to sample the state of many lines and transmit the results to another PC over a serial connection.
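Decoding those two pins is trivial; the QS1:QS0 encoding below is the one given in the 8088 datasheet:

```cpp
#include <cstdio>

// Decode the 8088's queue status pins into the four operations.
const char* decodeQueueStatus(int qs1, int qs0)
{
    static const char* ops[4] = {
        "no change",                     // 00
        "first byte of opcode removed",  // 01
        "queue emptied",                 // 10
        "subsequent byte removed"        // 11
    };
    return ops[((qs1 & 1) << 1) | (qs0 & 1)];
}

int main()
{
    printf("%s\n", decodeQueueStatus(1, 0));  // prints "queue emptied"
}
```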

Compared to a real logic probe we can only sample a few lines at once, only gather a couple of KB of samples at a time and can't sample very fast (I think 4.77MHz should be possible), but the experiments I care about are all deterministic so we can just repeat the experiment enough times to gather all the data I want. Here's the schematic for the bus sniffer and here's what the board layout looks like:

I've ordered a PCB from BatchPCB (the first time I've actually had a PCB professionally made) so we'll see how it works!

LED strobe and spinning disc

October 4th, 2012

My son's Elenco Snap Circuits Jr. SC-100 toy came with a spinning disc that demonstrates stroboscopic behavior when accelerating or decelerating under a fluorescent light. I built my own LED strobe with adjustable frequency and duty cycle to make the effect a bit more visible. In the following video you can see what happens as the duty cycle is adjusted:

The circuit is too simple to show a schematic for - just an LED with a current-limiting resistor and three potentiometers connected to an Arduino. The clever bit is the Arduino program. Also, because I needed to be able to tune the frequency quite precisely, I used two potentiometers for the frequency - one for coarse adjustments and one for fine. Only the top 6 bits of the coarse potentiometer's measured value are used (otherwise the fine potentiometer would be useless due to the jitter). The remaining 10 bits come from the fine potentiometer, yielding a theoretical 16 bits of frequency resolution, giving a range of about 3.8Hz to 250kHz.
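A sketch of the pot-combining logic is below. The pin assignments and the mapping from the 16-bit value to a period are stand-ins, and the real program presumably drives the LED from a hardware timer, since busy-waiting with analogRead in the loop wouldn't get anywhere near 250kHz:

```cpp
const int LED_PIN = 13;
const int COARSE_PIN = A0, FINE_PIN = A1, DUTY_PIN = A2;

void setup()
{
    pinMode(LED_PIN, OUTPUT);
}

void loop()
{
    unsigned int coarse = analogRead(COARSE_PIN) >> 4;  // top 6 of 10 bits
    unsigned int fine = analogRead(FINE_PIN);           // all 10 bits
    unsigned long value = ((unsigned long)coarse << 10) | fine;  // 16 bits
    unsigned int duty = analogRead(DUTY_PIN);           // 0..1023

    // Hypothetical mapping: treat the combined value as a period in
    // microseconds, and split it according to the duty-cycle pot.
    unsigned long period = value + 4;
    unsigned long onTime = period * duty / 1024;

    unsigned long start = micros();
    digitalWrite(LED_PIN, HIGH);
    while (micros() - start < onTime) {}   // on phase
    digitalWrite(LED_PIN, LOW);
    while (micros() - start < period) {}   // off phase
}
```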

The video would probably come out better if I connected multiple LEDs to the output, but I only had one white LED to hand.

Zoomable family tree

October 3rd, 2012

There seem to be lots of people on the internet who don't really understand evolution. You can tell who they are because they ask things like "If people evolved from monkeys, why are there still monkeys?" and claim that there are things called microevolution and macroevolution which are qualitatively different. Sometimes this happens because people perceive there to be a conflict between their preconceived beliefs and the theory of evolution. Sometimes though it's just that they're taught by people who have such perceptions, which is a bit of a tragedy because a poor quality of teaching will lead to these students not being able to contribute to science in the future.

Without pretending to have any answers for the more fundamental problem of bad teaching, I do have an idea which could help people get an intuitive understanding of how evolution works.

Really what evolution says is that ultimately there is a single family tree - we're all related if you go back far enough. Also, the blueprints for our bodies are not engraved in some unalterable stone tablet sealed in a cryogenic vacuum somewhere - they're stored as atomic-scale patterns inside our warm, wet cells and therefore change from time to time (albeit very slowly by the timescales we're used to).

So what I'm imagining is a phylogenetic tree viewer with a twist - you can zoom into a phylogenetic tree, and if you zoom in far enough you can see individuals and the parent-child relationships between them, like a normal family tree. Individuals would actually be shown as what a typical individual of that lineage would have looked like (possibly with typical random variations). But these pictures would be stored in some kind of vector representation so that one could be morphed into another gradually. The point is that no matter where you zoom in, there's no place you can find where the children don't resemble their parents. You'd also be able to get an intuitive understanding of the process of speciation by seeing how a single species lineage splits into two, but on the level of individuals there's no one individual that you can point to that marks the point at which the species diverge.
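The underlying data might look something like this - purely illustrative, with a single parent link as a simplification (sexual species would of course need two):

```cpp
#include <vector>

struct VectorArt { /* control points for a morphable drawing */ };

// One node of the zoomable tree: family links, a morphable appearance,
// and a flag recording whether this is actual data or synthesized.
struct Individual
{
    Individual* parent;                 // nullptr at the root (abiogenesis)
    std::vector<Individual*> children;
    VectorArt appearance;               // morphable between relatives
    bool synthesized;                   // made up vs. known from real data
};

int main()
{
    Individual root{nullptr, {}, {}, true};  // a synthesized root of the tree
    (void)root;
}
```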

By zooming in right at the top of the tree you'd be able to see (a theory of) abiogenesis (the first point at which information was passed on from parent to child).

A simple version of this program (just including a tiny minority of species) would be relatively easy to create and could still fulfill the basic educational possibilities. But the more biologists collaborated and the more data the program was fed, the more interesting things could be done with the result. Ideally it would be made as accurate as possible according to current best theories. Of course, a huge amount of the data is currently unknown and would have to be synthesized (at high zoom-in levels, almost all of the details would need to be made up). So if the program is used for scientific purposes it would be good to have some way to visualize which aspects of the image are accurate according to current best theories and which aspects were synthesized.

The data that's added to the program need not all be biological in nature - people could add their own family trees (and even those of their pets) as well (though reconciling low-level actual data with synthesized data could be tricky).

British TV is so civilized

October 2nd, 2012

I tried out the iPlayer for the first time last night (at the time of writing) to watch the first episode of the new series of Doctor Who, which I had missed the beginning of due to having to put the children to bed. On going to the BBC website it was just three clicks - one to start "watch live", one to restart the current program and one to maximize the window. And there was negligible buffering time, and it was higher definition than standard definition TV! I was very impressed. It was a great episode too - most of the Dalek episodes suffer from massive unexplained plot holes, but this one was quite well thought out.

Some of the US networks have streaming sites, but they all have adverts, the quality is worse than standard definition TV and new shows usually aren't available for a day to a week after they air. Plus a year's basic cable subscription costs more than a year's UK TV license.