Archive for the ‘computer’ Category

Geeky video timing stuff

Wednesday, April 4th, 2007

You might know that according to the NTSC standard for TV signals in the USA, the display is updated 60 times per second. But did you know that it's not exactly 60? It's actually closer to 59.94 (60000/1001, to be exact). This Wikipedia article explains why.

Early microcomputers (including early PCs), which were designed to be plugged into TVs, didn't have exactly the same frame rate as TV pictures - most of them used 1640625/27379 (which is closer to 59.92) instead because it was slightly easier to build that way (fortunately TVs have enough tolerance to display the slightly out-of-spec pictures correctly). I wrote this to explain (amongst other things) the origins of these "magic" numbers.

This 59.92Hz number turned out to be very important for finding an obscure bug in California Games that I was hitting whilst trying to get the "CGA MORE-color mode" working on MESS. There is a routine to determine if the frame rate is close enough to 60Hz for it to be likely that this effect would work. The routine seems to be trying to determine if the frame time is in the range (1/60)s +/- 500us (presumably the authors didn't know that it was actually supposed to be closer to 59.92 than 60). However, it puts the timer chip in the wrong mode, causing it to count down twice as fast. So in fact it is instead determining if the frame time is either in the range (1/60)s-500us to (1/60)s or in a similar 500us range at around 1/120s. The "normal" value lies right on the edge of the range it's actually measuring. Of course, because on real hardware the frame rate is slightly lower than 60Hz (a difference of about 22us per frame, and pretty reliably so, since both are based on the same clock signal) this works fine in practice, but I'm sure it's not what the authors meant!

This bug was preventing the mode from working on MESS because the MESS frame rate was set to exactly 60Hz (slightly too fast) and the frame rate test was failing (but only just). The fix is for MESS to use a 59.92Hz frame rate instead - by making the emulator almost imperceptibly more accurate, the effect works!
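To put numbers on how close the two rates are, here is a minimal sketch using exact fractions. It only models the check the game seems to have intended ((1/60)s +/- 500us), not the buggy timer mode; the function names are made up for illustration.

```python
from fractions import Fraction

# Exact frame rates quoted above, in Hz.
NOMINAL_RATE = Fraction(60)
NTSC_RATE = Fraction(60000, 1001)     # about 59.94
PC_RATE = Fraction(1640625, 27379)    # about 59.92

def frame_time_us(rate_hz):
    """Frame period in microseconds for a frame rate given in Hz."""
    return Fraction(1_000_000) / rate_hz

def within_tolerance(rate_hz, tolerance_us=500):
    """The intended test: is the frame time within (1/60)s +/- tolerance?"""
    return abs(frame_time_us(rate_hz) - frame_time_us(NOMINAL_RATE)) <= tolerance_us

# How much longer a real PC frame lasts than an exact 1/60s frame.
difference_us = frame_time_us(PC_RATE) - frame_time_us(NOMINAL_RATE)
print(float(difference_us))
```

The difference comes out at a little over 21us per frame, which is in line with the roughly 22us figure you get from the rounded 59.92Hz value.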

Quantum Computers and Magic

Tuesday, April 3rd, 2007

The interesting thing about quantum computers is that they perform very complex calculations, but the answers that they give are relatively short. For example, one important application of quantum computers is likely to be factoring large numbers. The calculation is very difficult but the answer is just the factors.

A quantum computer would be useless for a task like sorting a large list, though, because the calculation involved in doing such a thing is not much more difficult than printing out the answer.

This makes me think of magic tricks. Stage magicians appear to be able to do all sorts of clever things as long as you the audience member can't see what's going on. For example, they can make people disappear or saw them in half, as long as the real business of doing such is hidden away inside a special box. It is beyond the capability of any magician to saw someone in half in such a way that you can see exactly what's going on, or make something in direct view disappear.

The similarity is quite shallow because in quantum computing things are hidden away for very different reasons than they are hidden away in magic - in magic, things are hidden because what the magician is trying to make you believe is happening isn't really happening. In quantum computing, things are hidden away because they are happening in other universes.

CRTC emulation for MESS

Friday, November 24th, 2006

Background

I am the author of the remastered version of Windmill Software's Digger. In creating this, I wanted to make the experience of running this game as close as possible to the experience of running the game on an original 4.77MHz CGA IBM PC.

I mostly succeeded, but there are a few rough edges. The sound is a little harsh when not using the PC speaker as I am not filtering out aliased high frequencies properly. Also, I never got the flashing effect on the "Enter your initials" screen quite perfect. For one thing, it is still CPU-speed dependent in the DOS version of Digger Remastered, as it was in the original.

Trouble is, I don't know exactly how this flashing effect is supposed to look. The palette changed (partway through lines) after every 2 or 3 scanlines on my PC1512, but I knew that wasn't exactly right as the PC1512 runs at 8MHz. I later found out that the effect was pretty similar on a 4.77MHz machine - it was not synchronized to the horizontal retrace or anything like that - the author describes it as "The rolling colors appeared as if the text was in a moving rainbow".

I realized that to see this effect as it was originally intended I would need a cycle-exact emulator. This seemed like rather a big job so I put it on the back burner.

Years later, I came across MESS and discovered that the hardest part of the work was done. With a few minor modifications (and a rewrite of the 8253 Programmable Interval Timer) Digger (both original and remastered) worked great - even the sound was better. However, the "rolling rainbow" raster effect still didn't appear. The raster effects in California Games also don't work. California Games flips the palette at the same place each frame in order to use multiple palettes (and more than 4 colours) at once.

In order to make these things work, I decided to embark on a complete rewrite of the video emulation for machines which use a CRTC (Cathode Ray Tube Controller) based on the Motorola 6845 and variants.

About the 6845 CRTC

You can tell if a machine uses a 6845 variant because its video system will have the following characteristics:

  1. A character-cell based display
  2. A text-mode hardware cursor whose position is controlled by registers at offset 14 and 15
  3. Hardware scrolling controlled by registers at offset 12 and 13

There's a lot more to it than that, but just about every other feature of the 6845 is missing or different in some implementation or other - that's about all that's common to all the variants.

The 6845 CRTC keeps track of the position of the CRT beam and generates:

  • horizontal and vertical sync pulses (to keep the real CRT in sync with the CRTC)
  • a "memory address" (a unique number for each character cell in the picture)
  • a "row address" (the scanline within the character cell)
  • a "display enable" bit, to indicate whether the beam is currently displaying memory-driven data or overscan/blanking/retrace
  • a "cursor" bit, indicating whether the beam is currently within the cursor

The 6845 can also be thought of as a 4-stage counter:

  • Stage 0: horizontal character counter
  • Stage 1: scanline counter within a character row
  • Stage 2: character row counter
  • Stage 3: frame counter (for cursor flashing)

Graphics modes and the 6845

The MC6845 only has a 7-bit character row (stage 2) counter, so if you make each character row one scanline high, you can only display ~100 scanlines (after overhead for vertical overscan and blanking). So most machines that use a 6845 and support high-resolution graphics use some of the row address bits as memory address bits:

  • In graphics modes, the CGA uses the least significant bit of the row address (R0) as the most significant bit of the memory address (M13). This explains why even scanlines are in the low 8KB of RAM and the odd scanlines are in the high 8KB.
  • The BBC Micro in modes 0-6 uses the lowest three row address bits R0-R2 as the low bits of the memory address (M0-M2). This explains the somewhat counter-intuitive memory layout of this architecture.
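The two mappings above can be sketched as small address functions. This is an illustration rather than code from any emulator: the function names are hypothetical, the CGA version assumes 80 bytes per scanline, and the BBC version assumes the memory address is counted in character cells from the start of the screen and ignores the screen base and hardware wrap-around.

```python
def cga_graphics_byte_offset(x_byte, y):
    """Byte offset in CGA display RAM of byte column x_byte (0-79) on scanline y (0-199)."""
    row_address_bit0 = y & 1      # R0: scanline within the two-line character row
    character_row = y >> 1        # 100 character rows of two scanlines each
    # R0 becomes bit 13 of the byte address, selecting the low or high 8KB bank.
    return (row_address_bit0 << 13) | (character_row * 80 + x_byte)

def bbc_byte_offset(memory_address, row_address):
    """Byte offset for a BBC Micro mode 0-6 screen (assumed layout, see above).

    R0-R2 supply the low three address bits, so the eight scanlines of one
    character cell occupy eight consecutive bytes.
    """
    return (memory_address << 3) | (row_address & 7)

print(hex(cga_graphics_byte_offset(0, 1)))
print(bbc_byte_offset(1, 0))
```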

Other features supported by the 6845

Some variants of the 6845 support:

  1. Software-programmable timings. Most 6845 variants allow software to change the number of characters/scanlines per frame, the number of displayed characters/scanlines and the relative positions of the sync signals. This makes video hardware that uses these variants very flexible, but in some cases does make it possible for software to destroy hardware (some fixed-scan monitors can be damaged if the timings of the sync signals are out of range.)
  2. A lightpen. When the CRT beam passes the lightpen sensor, a strobe signal is sent to the 6845 and the current memory address is latched. This value can then be read by software.
  3. Interlaced display. Even frames are advanced by a half-scanline and odd frames are retarded by a half-scanline. Even frames are therefore one scanline larger than odd frames, causing the beam to start the first visible line a half-scanline higher on even frames.
  4. Use as a memory controller. To avoid contention for display RAM between the display logic and the CPU, all memory access is done through the CRTC. Some variants support features such as fast video RAM to video RAM copy and fill via CRTC commands.
  5. A blanking bit separate from the display enable and sync bits, enabling a "two stage" overscan - an outer black region surrounding an inner solid-colour region.

List of 6845 variants (with references)

  • Motorola 6845 (Motorola 68A45 and 68B45 are software equivalent but have different maximum clock speeds)
  • Motorola 6845-1 (equivalents: Motorola 68A45-1/68B45-1)
  • Rockwell 6545 (equivalents: Rockwell 6545E)
  • Rockwell 6545-1 (equivalents: Commodore 6545-1)
  • Hitachi 46505
    • differences described here.
  • Synertek SY6545-1 / SY6845E
  • UMC UM6845 / Hitachi HD6845S (Amstrad CPC "type 0")
  • UMC UM6845R (Amstrad CPC "type 1")
  • Amstrad AMS40489 (Amstrad CPC "type 3" - ASIC in CPC464+, CPC6128+, GX4000)
  • Amstrad Pre-ASIC (Amstrad CPC "type 4" - used in "cost-down" CPC6128)
    • information about these types here and here.
  • Amstrad PC1512 - timings fixed, always displays 200 lines (regardless of character cell height)
  • EGA - only very loosely based on the 6845, has many new features
  • VGA - similar to the EGA but with even more new features
  • 8563 (used in Commodore C128) - Wikipedia article
  • 8568 (used in D[CR] models of the Commodore C128) - there don't appear to be any differences for software or emulation between this and the 8563 - the main difference is an extra (unused) interrupt line.
  • Chips & Tech 82c425 and 82c426 - used in some CGA clones.
  • Professional Graphics Controller
  • Other CGA clones - 3270 PC, Plantronics ColorPlus, Amstrad PPC/PC20, Olivetti M24, Olivetti Prodest PC1

How the new 6845 emulation in MESS will work

You can download the code (as it is so far) here.

When the screen is initialized, the video hardware implementation calls crtc6845_init() to allocate and set up a CRTC object. Multiple objects can be allocated (for example if we're emulating a PC with both a CGA and an MDA display). The CRTC object contains the entire state of the CRTC, including a bitmap containing the current state of the frame (as it would be displayed on the CRT).

Whenever the CPU reads from or writes to the CRTC, crtc6845_update() is called to bring the state of the CRTC up to date. The do_update() function does the real work, and is where the main emulation loop is.

The main emulation loop works in a way very similar to the actual hardware. There are internal counters which are incremented each cycle or scanline and compared to values derived from the register values. Each cycle, the CRTC calls a callback supplied by the video hardware emulation to actually draw the pixels. This callback is supplied with the output of the CRTC (such as row and memory address and coordinates at which to draw). This may be somewhat slow (in 80-column text mode on a CGA, this loop will have to execute 1.79 million times per second) but if it is too slow there are many possible ways to optimize it without reducing the accuracy of the emulation, such as having the callback process larger parts of scanlines at once.
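As a rough illustration of the counter structure (not the actual do_update() code), here is a much-simplified sketch. It leaves out sync pulses, the vertical total adjust, the cursor and interlace, and all names and the callback signature are assumptions made for the example.

```python
def run_field(h_total, h_displayed, v_total, v_displayed, scanlines_per_row, draw):
    """Step simplified 6845-style counters through one field.

    draw(memory_address, row_address, x, y, display_enable) is called once per
    character clock, which is the unit of work described above.
    Returns the number of scanlines generated.
    """
    row_start_address = 0
    y = 0
    for character_row in range(v_total):               # stage 2 counter
        for row_address in range(scanlines_per_row):   # stage 1 counter
            memory_address = row_start_address
            for h_counter in range(h_total):           # stage 0 counter
                display_enable = (h_counter < h_displayed
                                  and character_row < v_displayed)
                draw(memory_address, row_address, h_counter, y, display_enable)
                memory_address += 1
            y += 1
        # The next character row starts where the displayed part of this one ended.
        row_start_address += h_displayed
    return y

calls = []
scanlines = run_field(4, 2, 3, 2, 2, lambda *args: calls.append(args))
print(scanlines, len(calls))
```

With CGA-like numbers (114 character clocks per scanline and 262 scanlines per field at 59.92Hz) the callback count works out to about 1.79 million per second, matching the figure above.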

The machine's VIDEO_UPDATE function calls video_update_crtc6845(), which gets the CRTC up to date and then copies the CRTC's bitmap to the output bitmap.

The machine's VIDEO_EOF function calls video_eof_crtc6845(), which updates to the end of the current field and then does "per field" tasks. The main one of these is checking to see if the size of the generated image is the same as it was in the previous field. If it isn't, screen_configure() is called to enlarge the bitmap (if necessary - it never gets shrunk) and call video_screen_configure() to update the frame rate and size. This is done here rather than just by looking at the programmed field width and height parameters in the CRTC registers because some effects involve having multiple smaller CRTC fields per CRT field (i.e. "resetting" the CRTC part way through the frame). This works fine on real hardware as long as the timings of the horizontal and vertical sync pulses are correct, so we would like to be able to emulate these effects.

There are several different rectangles which the CRTC keeps track of:

  1. The display area (i.e. the area over which the "display enable" bit is set and the displayed pixels are driven by memory data)
  2. The display+overscan area (i.e. the area over which blanking is disabled and non-black pixels are actually drawn)
  3. The scanning area (i.e. the area over which both the horizontal and vertical sync pulses are low, and the cathode ray beam is progressing in the normal rightwards/downwards direction)
  4. The visible area (i.e. the area which is actually displayed by the MAME/MESS core. This must be within the scanning area but is independent of the blanking, overscan and display boundaries)

The visible area is set by the function crtc6845_set_visible_area(). The width and height parameters to this function are fractions of the width and height of the scanning area, which should be fairly close to what real monitors do. This function can also move the image horizontally and vertically as well as changing its size. It corresponds to the size and position controls of the monitor. It should probably be set once in the machine initialization function, tuned for the machine (so that the entire image and a small amount of overscan is visible in every mode) and then left alone.

Whenever a CRTC register that controls timing values is written to, the recalculate_timings() function is called. This function decodes the register values into a format more easily used by the do_update() function.
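To give a feel for what this decoding involves, here is a minimal sketch (not the real recalculate_timings()). The register list is the set of values I believe the BIOS programs for CGA 80-column text mode, so treat it as an assumption, and the dictionary keys are made up for the example.

```python
from fractions import Fraction

# Assumed CRTC values for CGA 80-column text mode, registers R0..R9.
CGA_80_COLUMN = [0x71, 0x50, 0x5A, 0x0A, 0x1F, 0x06, 0x19, 0x1C, 0x02, 0x07]

def decode_timings(registers, clock_hz):
    """Turn raw 6845 register values into totals that are easier to loop over."""
    ticks_per_scanline = registers[0] + 1           # horizontal total is stored minus one
    scanlines_per_row = (registers[9] & 0x1F) + 1   # maximum scan line address, minus one
    rows_per_field = (registers[4] & 0x7F) + 1      # vertical total, minus one
    scanlines_per_field = (rows_per_field * scanlines_per_row
                           + (registers[5] & 0x1F))  # plus vertical total adjust
    return {
        "ticks_per_scanline": ticks_per_scanline,
        "scanlines_per_field": scanlines_per_field,
        "displayed_columns": registers[1],
        "displayed_scanlines": (registers[6] & 0x7F) * scanlines_per_row,
        "field_rate_hz": Fraction(clock_hz) / (ticks_per_scanline * scanlines_per_field),
    }

# 14.32MHz / 8 is the 80-column character clock.
timings = decode_timings(CGA_80_COLUMN, Fraction(157_500_000, 11) / 8)
print(timings["scanlines_per_field"], float(timings["field_rate_hz"]))
```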

Each CRTC has an internal array of possible drawing callbacks that can be used. The element of this array that is actually called is set by crtc6845_set_callback(), which will usually be called by the machine-dependent code which sets the video mode. This is an optimization so that each callback can take care of one video mode and you don't have to switch on video mode in the callback function (which is called a *lot* so should be as simple as possible).

Each CRTC also has an internal array of possible clock frequencies. This is because some machines can supply multiple different clock rates to the CRTC. For example, the input clock to the CRTC on a CGA is 1.79MHz in 80-column text mode and 895KHz in graphics modes and 40-column text mode. Memory is scanned through horizontally at twice the rate in 80-column text mode as in other modes.

There are a few other functions that machines can call to change CRTC parameters:

  • crtc6845_set_pixels_per_tick() - set how much to increase the bitmap x coordinate by each clock tick. Usually this will be set at the same time as the clock frequency.
  • crtc6845_set_enable() - enable or disable the CRTC (equivalent to the enable pin on the actual chip). A black screen is drawn if the CRTC is disabled for any part of the field. This can help to prevent weirdness during mode changes.
  • crtc6845_set_refresh_limits() - sets the maximum and minimum refresh rate that the CRTC will attempt to pass to video_screen_configure(). This prevents emulated software from being able to do bad things to MESS by setting frame rates too high or low, and prevents weirdness during mode changes.
  • crtc6845_set_lightpen_position() - sets the position of the lightpen relative to the visible area.

Finally, there are two functions (crtc6845_get_display_enable_status() and crtc6845_get_vsync_status()) for obtaining values of the output pins of the CRTC. Some video hardware implementations (such as CGA) can return the values of these bits via an IO port.
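For the CGA case, the two pin values could be packed into the status port byte roughly like this. It is a sketch with a hypothetical function name, using the standard CGA layout in which bit 0 is set when display enable is inactive and bit 3 is set during vertical sync; the light pen bits (1 and 2) are simply left at zero here.

```python
def cga_status_byte(display_enable, vsync):
    """Build the value a read of CGA port 0x3da might return from the CRTC pin states."""
    status = 0
    if not display_enable:
        status |= 0x01    # bit 0: beam is outside the memory-driven display area
    if vsync:
        status |= 0x08    # bit 3: vertical sync is active
    return status

print(cga_status_byte(display_enable=False, vsync=True))
```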

Terminology and conventions

Within the CRTC code, many variables have prefixes which correspond to the units which that variable counts:

  • t_ - ticks of the input clock, horizontal characters, memory addresses
  • s_ - scanlines
  • x_ - horizontal pixels
  • y_ - vertical pixels
  • f_ - fields

The underscore (in general) represents "per" so (for example) a variable called "x_t" represents the number of horizontal pixels per tick of the input clock.

This is a kind of Hungarian notation - Hungarian as it was meant to be used rather than the horrible misuse that is usually perpetuated whereby the prefix just duplicates the type information rather than telling you what the variable actually counts or measures.

6845 registers

6845 has these registers:

  • 0x00 - Horiz. total characters
  • 0x01 - Horiz. displayed characters per line
  • 0x02 - Horiz. sync position
  • 0x03 - Horiz. sync width in characters
  • 0x04 - Vert. total lines
  • 0x05 - Vert. total adjust (scan lines)
  • 0x06 - Vert. displayed rows
  • 0x07 - Vert. sync position (character rows)
  • 0x08 - Mode
  • 0x09 - Maximum scan line address
  • 0x0a - Cursor start (scan line)
  • 0x0b - Cursor end (scan line)
  • 0x0c - Start address (MSB)
  • 0x0d - Start address (LSB)
  • 0x0e - Cursor address (MSB)
  • 0x0f - Cursor address (LSB)
  • 0x10 - Light pen address (MSB) (read only)
  • 0x11 - Light pen address (LSB) (read only)

Amstrad PC1512 (40041 VDU) is equivalent to 6845 without registers 0-8 inclusive. 200 scanlines are always displayed, even if this is not a multiple of the character height.

Rockwell 6545 is same as 6845 with these additional registers:

  • 0x12 - Update register (MSB)
  • 0x13 - Update register (LSB)
  • 0x1f - Memory access (mapped to video memory location specified in update register)

EGA and VGA are same as 6845 with these additional/changed registers:

  • 0x03 - End Horiz. Blank
  • 0x04 - Start Horiz. Retrace
  • 0x05 - End Horiz. Retrace
  • 0x06 - Vertical Total
  • 0x07 - CRTC Overflow
  • 0x08 - Preset Row Scan
  • 0x09 - Maximum Scan Line
  • 0x10 - Vert. Retrace Start
  • 0x11 - Vert. Retrace End
  • 0x12 - Vertical Display End
  • 0x13 - Offset
  • 0x14 - Underline Location
  • 0x15 - Start Vert. Blank
  • 0x16 - End Vertical Blank
  • 0x17 - CRTC Mode Control
  • 0x18 - Line Compare

8563 and 8568 are same as 6545 with these additional registers:

  • 0x14 - Attribute Start Address (MSB)
  • 0x15 - Attribute Start Address (LSB)
  • 0x16 - Hz Chr Pxl Ttl/IChar Spc
  • 0x17 - Vert. Character Pxl Spc
  • 0x18 - Block/Rvs Scr/V. Scroll
  • 0x19 - Diff. Mode Sw/H. Scroll
  • 0x1A - ForeGround/BackGround Col
  • 0x1B - Row/Adrs. Increment
  • 0x1C - Character Set Addrs/Ram
  • 0x1D - Underline Scan Line
  • 0x1E - Word Count (-1)
  • 0x1F - Data
  • 0x20 - Block Copy Source (MSB)
  • 0x21 - Block Copy Source (LSB)
  • 0x22 - Display Enable Begin
  • 0x23 - Display Enable End
  • 0x24 - DRAM Refresh Rate

Remaining work to do

  • Change enum constants to caps, and be less verbose (CRTC_MC6845 instead of crtc6845_personality_mc6845)
  • Struct types should be typedefed like those in mame.h
  • video_update_crtc6845() and video_eof_crtc6845() should be crtc6845_update() and crtc6845_eof() respectively to avoid confusion with the VIDEO_UPDATE/VIDEO_EOF macros. (I did it this way to emphasize the fact that these are the crtc6845's equivalent of VIDEO_UPDATE and VIDEO_EOF but I can appreciate that this could cause problems.)
  • Have a global variable "crtc6845 *crtc" that is in crtc6845.c but not used there. If that is not there, almost all consumers of crtc6845.h will have their own static variable.
  • Many 8-bit systems will simply use one CRTC. Some helpers implemented as VIDEO_UPDATE/VIDEO_EOF/READ8_HANDLER/WRITE8_HANDLER that simply use the crtc global described above may be helpful for these common cases.
  • Change memory_address to be of type offs_t
  • Free the bitmap that is kept in the crtc structure
  • Add save state support
  • Implement the drawing callbacks for each piece of video hardware that uses a 6845-variant CRTC
  • Switch over the video hardware implementations to use the new CRTC emulator
  • Implement the differences between different CRTC variants (read/write vs. write only registers etc.)
  • Cleanup (convert spaces to tabs, // comments to /**/ comments and make the brace style consistent with that used elsewhere in MESS)
  • Optimization:
    • Dirty flag (return UPDATE_HAS_NOT_CHANGED from video_update_crtc6845() if nothing has changed, otherwise return 0)
    • Per-character/scanline dirty flags to reduce CPU usage when nothing's changing
    • Per-character/scanline attributes (such as palette data) so that (e.g.) the California Games title screen doesn't need to be completely redrawn each frame
    • Change the drawing callback to do a (partial) scanline instead of just the intersection of one character with one scanline
    • Pre-decode graphics?
  • Implement CGA status register 0x3da (bit 0 = NOT(display enable), bit 3 = (vertical sync))
  • Implement remaining EGA/VGA CRTC registers:
    • register 3 bits 5-6: Display Enable Skew
    • register 5 bits 5-6: Horizontal Retrace Skew
    • register 8 bits 5-6: Byte panning
    • register 9 bit 7: Scan doubling
    • register 11 bits 5-6: Cursor skew
    • register 17 bit 4: Clear Vertical Interrupt
    • register 17 bit 5: Enable Vertical Interrupt
    • register 17 bit 6: Memory Refresh Bandwidth
    • register 20 bit 5: Divide Memory address clock by 4
    • register 20 bit 6: Double-Word Addressing
    • register 20 bits 0-4: Underline Location
    • register 23 bit 0: Map Display Address 13
    • register 23 bit 1: Map Display Address 14
    • register 23 bit 2: Divide Scan Line clock by 2
    • register 23 bit 3: Divide Memory Address clock by 2
    • register 23 bit 5: Address Wrap Select
    • register 23 bit 6: Word/Byte Mode Select
    • register 23 bit 7: Sync Enable
  • Finish light pen work
  • Implement phase-locked-loop effects (see below)
  • Improve NTSC composite emulation (see below)
  • Generalization: Much of this CRTC code is applicable to all machines and should probably be moved into the MAME core, replacing the raster-related functions in video.c and cpuexec.c. This should make it much easier for other machines to implement raster effects.

Phase-locked loop

 

A CRT contains an oscillator which pulses at the horizontal retrace frequency. Each pulse sends the electron beam flying back from the right edge of the screen to the leftmost point of the next line. This flyback is driven by the internal oscillator, not the horizontal sync pulse that is sent from the video source to the CRT.

The CRT also has a mechanism (called a phase-locked loop) which keeps the horizontal retrace oscillator in sync with the input horizontal retrace pulse. If the input pulse is ahead of the oscillator pulse then the oscillator phase is adjusted forward a little, and if the input pulse is behind the oscillator pulse then the oscillator phase is adjusted backward a little. You can see this graphically in the following screenshot, courtesy of Trixter:

This is in 320x200 mode, so each character clock corresponds to 8 pixels, 2 bytes or 1/40 of the display width. Every 10 scanlines the sync pulse is moved left or right (alternately) by 1 character clock (8 pixels). On the last scanline of the "7" the start of the beam is at the leftmost position on the screen. Immediately after that the retrace pulse is delayed by a clock, so if the retrace were driven directly by the input retrace pulse we would expect to see the following line start 8 pixels to the right.

However, it only moves a few pixels on that first scanline, then a few more the next scanline and so on. It slows down after a few scanlines, and by the end of the "8" the vertical lines are pretty much vertical again at the new position, just in time to start moving again at the top of the "9".

The resulting "wobble" isn't quite a sine wave - the graph (if you squint a bit) looks like a bunch of exponential curves. It's the same effect you'd see if you put an analogue voltmeter across a square wave source with a period of a second or so - it takes a moment for the needle to "settle" on the new position. In fact that's exactly what's happening to the phase-locked loop.
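A very simple way to get curves of this shape is a first-order model in which the line start moves a fixed fraction of the way towards the sync position on each scanline. This is only a sketch of the idea: the gain value is made up, and a real monitor's loop filter will behave differently in detail.

```python
def simulate_line_starts(sync_positions, gain=0.3):
    """Return the horizontal start position (in pixels) of each scanline.

    sync_positions holds the position implied by the incoming sync pulse on
    each scanline; gain (hypothetical) is the fraction of the remaining error
    that the oscillator corrects per scanline.
    """
    position = float(sync_positions[0])
    starts = []
    for sync in sync_positions:
        position += gain * (sync - position)   # exponential approach to the target
        starts.append(position)
    return starts

# The sync pulse jumps by 8 pixels every 10 scanlines, alternating direction.
sync = [8.0 * ((line // 10) % 2) for line in range(40)]
starts = simulate_line_starts(sync)
print([round(s, 2) for s in starts[8:22]])
```

Each jump in the sync position produces a fast initial movement that tails off over the following scanlines, which is the behaviour visible in the screenshot.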

This is an effect that would be neat to emulate in MESS.

A related effect occurs when you skip every other retrace pulse altogether. This may seem like the kind of abuse that might break a fixed-scan monitor, but because the retrace is driven by an oscillator the real retrace pulse happens at roughly the right time anyway, despite the lack of an input pulse. A phase locked loop will often work just fine when driven at half of its resonant frequency. This makes it possible to create a CGA 320x100 mode where every other scanline is black - you just set the CRTC timing registers for half as many scanlines and make the scanlines twice as long (but only display in the first half of the scanline). This is actually very useful as you can create a fully graphical mode at half the resolution with two pages.

Composite output

The CGA composite output in MESS could use some improvement. Currently it takes each nybble and just treats it as a single colour over the entire span of that nybble. This does not reflect what is really going on with the composite output and the NTSC decoding. It looks like this:

Dosbox's composite output for the same game is a little better. Notice how much clearer the word "GEAR" is, for example. I've magnified a part of one of the instruments:

Dosbox does something much closer to a real NTSC decoding - sampling the signal once per pixel (640 times per line) and applying filters (multiplying by 1, a sine wave and a cosine wave before averaging over a colour clock cycle period) to get the Y (luminance), I (in-phase) and Q (quadrature) values for each pixel. Obtaining RGB values from these is a simple linear transformation.
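The filtering step can be sketched as follows. This is not Dosbox's code: it is a minimal illustration in which the zero-phase reference and the choice of cosine for I and sine for Q are assumptions (a real decoder takes its phase reference from the colour burst), and the averaging window simply wraps around at the end of the line. The linear YIQ to RGB step uses Python's standard colorsys module.

```python
import colorsys
import math

def decode_pixel(samples, start, samples_per_cycle):
    """Average one colour-clock period beginning at sample `start` into (Y, I, Q)."""
    y = i = q = 0.0
    for n in range(start, start + samples_per_cycle):
        phase = 2.0 * math.pi * n / samples_per_cycle
        s = samples[n % len(samples)]      # wrap at the end of the line (simplification)
        y += s                             # multiply by 1
        i += s * math.cos(phase)           # multiply by a cosine wave
        q += s * math.sin(phase)           # multiply by a sine wave
    # The factor of 2 restores the amplitude of the modulated components.
    return (y / samples_per_cycle,
            2.0 * i / samples_per_cycle,
            2.0 * q / samples_per_cycle)

def decode_line(samples, samples_per_cycle=4):
    """Decode every sample position on a line to an (r, g, b) tuple in 0..1."""
    return [colorsys.yiq_to_rgb(*decode_pixel(samples, n, samples_per_cycle))
            for n in range(len(samples))]

# A signal with a constant level plus a colour-clock-rate component.
line = [0.5 + 0.25 * math.cos(2.0 * math.pi * n / 4) for n in range(8)]
print(decode_pixel(line, 0, 4))
```

At 640 samples per line there are 4 samples per colour clock cycle, which is why samples_per_cycle defaults to 4 here.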

We can do slightly better if we sample the signal 1280 times per line instead of 640:

This sample rate smoothes out the transitions between colours a little more, but also has the advantage that it allows us to decode colours from modes other than 640x200 palette 15 accurately (something that Dosbox currently does not do). To do this, we need to understand a little about how the CGA generates its composite signal. I worked this out from looking at the CGA schematics.

The CGA's composite output consists of a signal which can be reconstructed completely by sampling at 28.64MHz with 2 bits of quantization. Neglecting blank, sync and color burst signals there are 4 voltage levels that can be generated:

  • Y=0, C=0: 0.416V
  • Y=1, C=0: 0.709V
  • Y=0, C=1: 1.160V
  • Y=1, C=1: 1.460V

I calculated these voltages with a circuit simulator using this circuit, with one refinement (a 75 ohm load - thanks rj for pointing out that omission!). These are a bit off from the usual NTSC levels, but any TV will do some kind of automatic gain adjustment anyway so the picture won't look too bad.

It's kind of interesting that the C (chroma) bit has roughly twice the effect of the Y (luminance) bit - I'm sure it's not a coincidence that on a digital monitor, changing the intensity bit has half the effect on gray level that changing the R, G and B bits has.

The Y bit is just the intensity bit, so can only change on pixel boundaries (14.32MHz). The C bit works a little differently. The CGA actually generates 8 different C signals, 6 of which are square waves with frequencies of 3.58MHz (the color burst frequency) and different phases corresponding to different hues. Which one is actually output depends on the R, G and B values.

  • R=0, G=0, B=0, C=00000000 BLACK
  • R=0, G=0, B=1, C=00001111 BLUE
  • R=0, G=1, B=0, C=01111000 GREEN
  • R=0, G=1, B=1, C=00111100 CYAN
  • R=1, G=0, B=0, C=11000011 RED
  • R=1, G=0, B=1, C=10000111 MAGENTA
  • R=1, G=1, B=0, C=11110000 YELLOW (also used for colour burst)
  • R=1, G=1, B=1, C=11111111 WHITE

Note that there are two more phases (00011110 (aqua) and 11100001 (orange)) which cannot be generated directly in the chroma bit.

Note also that because the green and magenta signals are out of phase with the yellow and blue signals by 1/28.64MHz (=35ns) the full signal cannot be reconstructed by sampling at 14.32MHz (green and magenta would be confused for either yellow and blue or for cyan and red).
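The chroma patterns and voltage levels given above are enough to build one colour clock cycle of the composite signal, and to check the aliasing claim directly. This is a minimal sketch; the table and function names are just for illustration.

```python
# Chroma bit pattern over one colour clock cycle (8 samples at 28.64MHz).
CHROMA = {
    "black":   "00000000",
    "blue":    "00001111",
    "green":   "01111000",
    "cyan":    "00111100",
    "red":     "11000011",
    "magenta": "10000111",
    "yellow":  "11110000",
    "white":   "11111111",
}

# Output voltage for each (Y, C) combination, from the list above.
LEVELS = {(0, 0): 0.416, (1, 0): 0.709, (0, 1): 1.160, (1, 1): 1.460}

def composite_cycle(colour, intensity):
    """Voltages for one colour clock cycle of a solid colour (intensity is 0 or 1)."""
    return [LEVELS[(intensity, int(c))] for c in CHROMA[colour]]

def subsample(colour, offset):
    """Chroma bits seen when sampling at 14.32MHz, starting at sample `offset`."""
    return CHROMA[colour][offset::2]

print(composite_cycle("blue", 0))
print(subsample("green", 0), subsample("cyan", 0))
print(subsample("green", 1), subsample("yellow", 0))
```

Depending on which of the two sampling phases you pick, green is indistinguishable from either cyan or yellow, and magenta from either red or blue.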

One more note: the CGA schematic is actually incorrect - it has RED and CYAN wired up the wrong way around. I'm pretty sure the colours on composite are reasonably close to those on an RGB monitor so I think it is the schematic rather than the real hardware which is incorrect.

I wrote a decoder which it might be possible to use with MESS (it's in ntsc_decode.c in crtc6845.zip). It can be used with a composite signal of any frequency, but does not (yet) resample between frequencies. A resampling decoder could be used in the scaling code to produce very high quality composite output. PAL and SECAM equivalents should be pretty similar for machines which output these standards (CGA does not).

Video timings

I'd like to share some of my findings about video timings. Where do all these strange numbers like 4.77MHz, 1.79MHz, 895KHz, 28.64MHz, 14.32MHz, 3.58MHz (and for that matter the 1.193182MHz that is the input frequency of the Programmable Interval Timer in every PC) come from?

This Wikipedia article explains how the timings for the NTSC standard came to be as follows:

  • Frame rate = 59.94Hz (4500000/286/262.5 = 60000/1001)
  • Line rate = 15.734KHz (4500000/286 = 2250000/143)
  • Colour burst frequency = 3.58MHz (4500000*227.5/286 = 39375000/11)

Because of this, when the IBM PC was designed, crystals made for colour TV sets, with a frequency of 14.32MHz (3.58MHz*4 = 157500000/11), were cheap and easy to obtain, so the PC's designers used one of these crystals for all the PC's timing:

  • CPU speed = 4.77MHz = 14.32MHz/3 (52500000/11)
  • PIT speed = 1.193MHz = 14.32MHz/12 (13125000/11)
  • 6845 clock frequency (80-column text mode) = 1.79MHz = 14.32MHz/8 (19687500/11)
  • 6845 clock frequency (other modes) = 895KHz = 14.32MHz/16 (9843750/11)
  • Composite sample frequency = 28.64MHz = 14.32MHz*2 (315000000/11)
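These divisions are easy to check with exact fractions. The following is just a sanity-check sketch; the dictionary keys are labels chosen for the example.

```python
from fractions import Fraction

# 227.5 colour burst cycles per line, written as 455/2 to keep everything exact.
COLOUR_BURST = Fraction(4_500_000) * Fraction(455, 2) / 286
CRYSTAL = COLOUR_BURST * 4

CLOCKS = {
    "CPU": CRYSTAL / 3,
    "PIT": CRYSTAL / 12,
    "6845 (80-column text mode)": CRYSTAL / 8,
    "6845 (other modes)": CRYSTAL / 16,
    "composite sample": CRYSTAL * 2,
}

for name, hz in CLOCKS.items():
    print(f"{name}: {hz} Hz = {float(hz) / 1e6:.6f} MHz")
```

The PIT value comes out as 13125000/11 Hz, which is the 1.193182MHz mentioned at the start of this section.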

Surprisingly, despite the fact that all these frequencies are based on NTSC frequencies, we still have timing problems when we wish to generate an NTSC video signal. Remember that there are exactly 227.5 cycles of the colour burst frequency per line in NTSC (the 0.5 is so that the phase changes by 180 degrees each line, which reduces artifacting). But the 6845 only counts in whole numbers - it can't generate 113.75 or 56.875 characters per line as it would need to to get a line rate of 15.734KHz.

Fortunately (because of the phase locked loop) monitors have some leeway in the exact frequencies they can accept, and will do just fine with a line rate of 15.7KHz (228 cycles of the colour burst frequency, or 114 narrow characters, or 57 wide characters - 4500000*227.5/286/228 = 3281250/209 Hz). In this setup, the colour burst frequency will have the same phase on every line, so you can't separate the chroma and luma signals by comparing the signals on successive lines (as some decoders do in order to improve quality).

Given that the CGA only has enough memory for a 200-scanline display, the CGA's designers decided to make the display non-interlaced, reducing flicker at the expense of resolution. This is done by having the 6845 generate only 262 scanlines instead of 262.5. Again, TVs have enough tolerance that they can display this non-standard signal. The frame rate is therefore 4500000*227.5/286/228/262 = 1640625/27379 = 59.92Hz, slightly less than the NTSC standard 59.94Hz.
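The standard and CGA rates can be compared side by side with the same kind of exact arithmetic. A small sketch, with a hypothetical helper name:

```python
from fractions import Fraction

COLOUR_BURST = Fraction(39_375_000, 11)   # Hz

def line_and_frame_rate(burst_cycles_per_line, lines_per_field):
    """Return (line rate, frame rate) in Hz for a given raster geometry."""
    line_rate = COLOUR_BURST / burst_cycles_per_line
    return line_rate, line_rate / lines_per_field

ntsc = line_and_frame_rate(Fraction(455, 2), Fraction(525, 2))   # 227.5 cycles, 262.5 lines
cga = line_and_frame_rate(228, 262)

print([float(r) for r in ntsc])
print([float(r) for r in cga])
```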

The CGA contains circuitry to generate a non-interlaced output even if the 6845 generates sync pulses for interlaced images. This was probably done to simplify the colour-burst generation circuitry.

Other CGA quirks and emulation "todo"s

Here is a table of the CGA control register bits:
Port 0x3d8 bit 0 - "+HRES" - use 1.79MHz character clock instead of 895KHz character clock
Port 0x3d8 bit 1 - "+GRPH" - use graphics mode instead of text mode and turn on snow suppression
Port 0x3d8 bit 2 - "+BW" - disable the colour burst signal (making the composite output monochrome) and force 320x200 colour 2 to red
Port 0x3d8 bit 3 - "+VIDEO ENABLE" - enable memory driven output (if this is 0 it is as if the CGA memory is filled with 0s)
Port 0x3d8 bit 4 - "+1BPP" - generate high-resolution 1bpp graphics (0 bits force the output to black)
Port 0x3d8 bit 5 - "+ENABLE BLINK" - text attribute bit 7 means blinking if set, high intensity background if not set
Port 0x3d9 bit 4 - "+BACKGROUND I" - use intense version for palette for 320x200 mode
Port 0x3d9 bit 5 - "+COLOR SEL" - use cyan instead of green palette for 320x200 mode
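
For anyone wanting to decode register values by hand, the table above translates directly into a lookup. This is a hypothetical Python sketch, not code from MESS; the dictionary and function names are invented for this example, and the register values in the usage lines are arbitrary.

```python
# Bit positions in the two registers, named after the signals in the table above.
MODE_CONTROL_BITS = {  # port 0x3d8
    0: "+HRES",
    1: "+GRPH",
    2: "+BW",
    3: "+VIDEO ENABLE",
    4: "+1BPP",
    5: "+ENABLE BLINK",
}
COLOUR_SELECT_BITS = {  # port 0x3d9
    4: "+BACKGROUND I",
    5: "+COLOR SEL",
}

def active_signals(value, bit_names):
    """Return the names of the signals whose bits are set in a register value."""
    return [name for bit, name in sorted(bit_names.items()) if value & (1 << bit)]

# Example values chosen purely for illustration.
print(active_signals(0x29, MODE_CONTROL_BITS))
print(active_signals(0x30, COLOUR_SELECT_BITS))
```

The value 0x29 has bits 0, 3 and 5 set, so it decodes to "+HRES", "+VIDEO ENABLE" and "+ENABLE BLINK".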

A few other things I noticed while looking at the CGA schematics:

  • The CGA forces the text-mode cursor to blink even if the CRTC generates a non-blinking cursor.
  • The CGA suppresses blinking of blinking text when the cursor is on.
  • Text blinks at a rate of 16 frames on, 16 frames off.
  • There are several "improper" (and not particularly useful) video modes:
    • When "+1BPP" is set and "+GRPH" is clear, the 1bpp output is overlaid on top of the text mode output.
    • When "+HRES" and "+GRPH" are set and "+1BPP" is clear, the CGA will *almost* generate a 640-pixel wide, 2bpp mode (using the normal 2bpp palette). However, because of the snow suppression circuitry which is active in graphics modes, odd addresses are latched at the wrong time and some columns are repeated.
    • When "+HRES", "+GRPH" and "+1BPP" are all set you get a 640-pixel, 1bpp image with only the even bits displayed.
  • The 5160 technical reference manual implies that the "BACKGROUND I" bit has an effect in text mode. The CGA schematics imply that it does not, and are correct here.
  • There is an error in the CGA schematics (as well as the RED/CYAN one I mentioned earlier). On sheet 4, about 2/3 down the right-hand side the NOR gate marked as "U23 LS32" is actually an OR gate.
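
The text blink rate in the list above can be turned into a frequency using the 1640625/27379 Hz frame rate. This is a small Python sketch; the function name is made up, and which half of the cycle comes first is my assumption (I have not checked the phase against the schematics).

```python
from fractions import Fraction

FRAME_RATE_HZ = Fraction(1640625, 27379)  # CGA frame rate, about 59.92Hz
BLINK_ON_FRAMES = 16
BLINK_OFF_FRAMES = 16
BLINK_PERIOD_FRAMES = BLINK_ON_FRAMES + BLINK_OFF_FRAMES

# One full blink cycle takes 32 frames.
blink_hz = FRAME_RATE_HZ / BLINK_PERIOD_FRAMES

def blinking_text_visible(frame):
    """True if blinking text is shown on this frame.

    Assumption: the cycle starts with the "on" half.
    """
    return frame % BLINK_PERIOD_FRAMES < BLINK_ON_FRAMES

print(f"Text blinks at about {float(blink_hz):.2f}Hz")
```

That works out to a little under two blinks per second.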

The control register works a little differently on the Amstrad PC1512 - there are no "improper" modes: "+1BPP" has no effect when "+GRPH" is clear and "+HRES" has no effect when "+GRPH" is set.

Other MESS-related work

There are several other improvements I would like to make to MESS, if nobody beats me to them:

  • Improve the sound emulation for machines which can generate sampled output and effects by rapidly toggling a 1-bit speaker and modulating the width of the pulse. This is used in some software on PCs and the Apple II, amongst others. It should be possible for emulated machines to generate a high frequency (e.g. 1.193MHz) 1-bit stream and have this resampled to the OSD sound output frequency by the core.
  • Implement the prefetch queue and memory bus delay for the 8088 and 8086 CPUs. Currently the 8088 emulation in MESS is about 25% too fast because these effects are not emulated. Emulating this will probably be necessary to make the scrolling in Super Zaxxon work properly. Trixter provided some good information about this, which I have reproduced here.
  • Fix the mouse code so that it doesn't keep moving the mouse to the top-left of the screen when two mice are connected (this makes MESS almost unusable for me until I remove the line:
    autoselect_analog_devices(inp, IPT_MOUSE_X, IPT_MOUSE_Y, 0, ANALOG_TYPE_MOUSE, "mouse");

    from wininput_init() in src\windows/input.c, but this is clearly not the right long-term fix.)

  • Find out why the floppy drive doesn't work on ibmxt.
  • Find a PC1512 hard drive ROM - according to this the genuine ROM gave the message "Hard disk controller error" instead of "1701" that the IBM one gave when no hard drive was present.
  • Add the Camputers Lynx to MESS (and any other as yet unemulated machines that I can find).
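
To make the first item in this list (resampling a 1-bit speaker stream) a bit more concrete, here is a rough Python sketch of the idea. It is not how MESS does anything: the function name resample_1bit is hypothetical, and it uses a plain box filter (averaging the bits that fall within each output sample), which is about the crudest approach that could work. A real implementation would probably want a better low-pass filter.

```python
from fractions import Fraction

def resample_1bit(bits, input_rate, output_rate):
    """Box-filter a 1-bit stream down to output_rate.

    bits is a sequence of 0s and 1s sampled at input_rate. Each output
    sample is the average of the bits in its window, scaled to the range
    -1.0 to 1.0. Trailing bits that do not fill a window are dropped.
    """
    step = Fraction(input_rate) / Fraction(output_rate)  # input bits per output sample
    if step < 1:
        raise ValueError("output_rate must not exceed input_rate")
    samples = []
    n = 0
    while True:
        start = int(n * step)        # first bit of this window
        end = int((n + 1) * step)    # one past the last bit of this window
        if end > len(bits):
            break
        window = bits[start:end]
        samples.append(2 * sum(window) / len(window) - 1)
        n += 1
    return samples

# A pulse train with a 75% duty cycle averages out to a constant level of 0.5.
print(resample_1bit([1, 1, 1, 0] * 4, 8, 2))
```

Varying the pulse width varies the averaged level, which is how a 1-bit speaker can produce something resembling sampled sound. Using Fraction for the step keeps the window boundaries from drifting when the rates are not integer multiples, such as 13125000/11 Hz in and 48000Hz out.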

Contact me

Questions? Comments? Suggestions? Offers to help implement any of this stuff? Comment below or email me at andrew@reenigne.org and I will try to respond promptly.

Blast from the past

Monday, April 3rd, 2006

Yesterday's technology - tomorrow!

Computer industry: what we're building

Sunday, April 2nd, 2006

Sometimes at work I like to take a step back and look at the big picture. I mean the really big picture, like the entire universe. What the software industry is ultimately trying to do (aside from making money) is to write every piece of software that will ever be needed - to automate every single repetitive (intellectual) task so as to free up human minds to do the things that they are better at (the things requiring creativity and imagination).

A very important part of this is writing tools that people can use to automate the repetitive and difficult parts of automating other repetitive parts, i.e. writing tools for programmers (which is what I do, sort of). We can never anticipate all possible repetitive tasks but we can make it as easy as possible to automate new tasks.

I think (and hope) that eventually we will get to the point where there are no "computer programmers" as such - writing programs will be very much easier than it is today, and won't require any specialized knowledge about programming (just knowledge about the task that you want to accomplish). Programming will be just one more thing that people do with computers like writing letters or playing music, and computers will be tools rather than objects of fascination for their own sake (much like the ones on Star Trek).

When we eventually accomplish this gargantuan task, I think it will be one of humankind's greatest achievements.

Newsflash

Friday, March 31st, 2006

Slashdot hacked by Gennie!

Three laws of robotics and DRM

Thursday, March 30th, 2006

I think one reason I find Digital Rights Management (DRM) reprehensible is that it violates the three laws of robotics as described by Asimov. While these laws were conceived with humanoid robots in mind, they are just as applicable to non-humanoid robots and almost as applicable to robots without bodies (computers). For those unfamiliar with the concept, the three laws are as follows:

  1. A robot may not harm a human being, or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence, as long as such protection does not conflict with the First or Second Law.

These laws have the consequences:

  • A robot cannot be used as a weapon
  • A robot can be ordered to destroy itself if necessary
  • A robot will sacrifice itself to protect human beings

Now, the first law doesn't really apply to computers (to kill someone or fail to take action that would save someone's life, a computer would have to be connected to some device that has one of those capabilities, which would make it a robot).

The problem with DRM is that your computer won't do what you tell it anymore - the laws have effectively been changed to:

  1. A computer cannot be used to do anything with copyrighted information beyond what the copyright holder explicitly permits
  2. A computer must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A computer must protect its own existence, as long as such protection does not conflict with the First or Second Law.

These modified laws have the consequences:

  • A computer cannot be used to infringe copyrights
  • A computer can be ordered to destroy itself if necessary
  • A computer will destroy itself to protect somebody else's copyrights

Effectively, they mean that your computer is not your own anymore - it will do the bidding of copyright holders over and above the bidding of its owners.

A computer system which includes DRM is more like a gun than a computer, not in the sense that it can be used to kill people, but in the sense that it doesn't follow the original 3 laws. In the case of a gun, the laws would be more like:

  1. A gun must obey the orders given to it by human beings.
  2. A gun may not harm a human being, or, through inaction, allow a human being to come to harm, except where this would conflict with the First Law.

This has the fairly obvious consequence that a gun can be used to kill people.

Perhaps I wouldn't have such a beef with DRM if it were marketed honestly and paid for by those who effectively "own" it - the copyright holders. I guess having a machine in my house that enforced copyright protections on the data it contained and prevented me from tampering with it wouldn't be so bad if it was rented instead of sold and the artificial limitations were clear from the start. A gun is sold for a particular purpose and nobody is trying to make it out to be something it isn't, but unfortunately today's DRM systems are marketed in such a way as to bring as little attention as possible to the fact that the hardware you're buying is designed to prevent you from doing some things you might very well want to do.

Software patent woes

Saturday, March 11th, 2006

Over the past few weeks I've been working on a program to analyze performance results and determine if there have been any regressions or improvements in the performance of our product, and, if so, when and where these happened.

When I demoed this program to my manager last week and explained the theory behind my program and that I had just made this algorithm up myself, he suggested that I should try to get it patented.

This gives me somewhat of a dilemma. Software patents are evil. It would be against my personal ethical code to do anything to help a software patent get filed.

On the other hand, if I refuse to help Microsoft patent this idea it could be a Career Limiting Move. Obtaining a patent (especially for something I invented myself without any help) would really improve my visibility at work, it would help with my chances of a promotion, the one-time bonus of up to $1,500 would be quite welcome and having my name on a patent would be good resume fodder (not to mention an ego boost). The technique is probably not useful for software developed by an individual or small team, and the patent would probably never be used offensively (I don't think my program would ever be turned into a shipping product). Also, some would say that I have already sold out my principles by working for the company in large part responsible for turning software from an endeavour practiced somewhat like science (where everyone built on everyone else's published work) into a commercial industry (where ideas are jealously guarded and hoarded and lawyers abound). And finally, Microsoft might patent this idea even without my help if I refuse (the idea is legally theirs rather than mine).

This may be a moot point if this technique has been done before, and I'm thinking of using a completely different (albeit possibly still patentable) algorithm instead, but I'm sure at some point I will have to make a decision to either choose the path of good and righteousness or sell out for personal gain. What do you think I should do?

Security requires the right mindset

Thursday, March 9th, 2006

A friend of mine at Microsoft told me this story about his manager, who is a very smart guy but apparently doesn't have the right mindset to be writing software that doesn't have security holes. The other day my friend and his manager were in their offices (just across the corridor from each other). The manager was making a phonecall. To his bank. On speakerphone. With the door open. To verify his identity, he had to key in his social security number. This number was then repeated by the electronic voice on the other end of the line for our entire corridor to hear. D'oh. To make matters worse, he continued the entire phonecall on speakerphone (with the door open).

Security hole

Tuesday, March 7th, 2006

At work today, I had a security hole to investigate. Say what you like about Microsoft, but you have to admit that in recent years they have really turned things around on the front of taking security seriously. It was interesting to get to experience this from the inside today.

As it turned out, all the Microsoft Security Response Center wanted to know was if this bug (which is known to affect certain no-longer-supported parts of Visual Studio 6) also affected the later versions (Visual Studio .NET, Visual Studio .NET 2003 and Visual Studio 2005). Rather than just testing the exploit against these later versions, I insisted on debugging through the VS6 code to find the faulty code and then making sure that this code was fixed in the later versions.

It's quite a strange experience to be debugging through such old code (some of it at least a decade old) especially when I work every day on the code that is descended from it. It's kind of like going back in time and meeting your ancestors. There are some strangely familiar things there but it is very much code from another time. It's also much simpler with 10 years fewer layers of features and added special cases. I was surprised at how easy the bug was to track down.

It was also a relief that, when I found the right place, the bug was completely obvious to any (modern day) programmer looking at the routine in question. Rather than being some subtle and hard-to-spot side-effect of a rare interaction between unrelated parts, the faulty code was doing all the things you're not supposed to do, like allocating a fixed-length buffer on the stack and concatenating C-style strings without any size checks. Such code could never get into the product with the processes we have in place now.

Amazingly, the function with the fault still exists in the codebase today, though the buffer overrun was fixed a long time ago (sometime before November 2000, when the code was moved to the version control system we currently use). It's good to know that when issues like this come up we can track them down quickly even in ancient code, and that our processes work.