Thursday, December 25, 2008

RIP, VHS

Distribution Video Audio, Inc. is the last big supplier of VHS tapes, and they are done with the product. The last major Hollywood movie to be released on VHS was "A History of Violence" in 2006. The format was introduced in 1976, which gives it an impressive run of about 30 years.

We grew up in the Philippines, where the dominant video format for many years was actually Sony's Betamax, even after VHS conquered the rest of the world. The two formats were similar in terms of features and faults: you had to rewind the tape after viewing; the player might eat the tape; magnetic tape was fragile; the cassettes were bulky; and the video quality was poor (and degraded with age).

Good times.

Sunday, December 14, 2008

Secret Ballot, Public Count

A cooperative election official in California allowed the re-scanning of ballots, and provided the images to the public for analysis. Add some donated programmer time, and what we find is that the official election machines dropped 197 of 64,161 ballots (about 0.3%) because of a software bug. (If that doesn't sound like a lot, remember that it would extrapolate to 304,356 votes in the 2000 presidential election, when Al Gore's popular vote advantage was merely 539,947.)
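For anyone checking the numbers, here is a small sketch of the arithmetic. The ballot counts and the 304,356 figure come from the paragraph above; the variable names are mine, and the national total is simply backed out from those figures rather than taken from official returns.

```python
# Figures quoted in the post.
dropped_ballots = 197
scanned_ballots = 64_161
extrapolated_votes = 304_356

# Fraction of ballots the official machines missed.
drop_rate = dropped_ballots / scanned_ballots
print(f"drop rate: {drop_rate:.2%}")

# National vote total implied by the post's extrapolation
# (derived from the numbers above, not an official count).
implied_total = extrapolated_votes / drop_rate
print(f"implied national total: {implied_total:,.0f} votes")
```

The rate works out to roughly three ballots in every thousand, which is why the extrapolated number lands in the same order of magnitude as the 2000 popular vote margin.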

I don't have intimate knowledge of these machines. If you scanned the ballot into a giant image and then tried to sort out which boxes were filled, then it would be a challenging programming task. However, if you have specialized equipment that could detect if a box at a given position is filled or not, as I suspect the official machines do, then counting votes should be a pretty trivial exercise. Computers have been adding one to a running total correctly for many decades now.

"Secret ballot, public count" really should be the way to go.

...

An unrelated note from the article gave me a chuckle:
Trachtenberg said before the launch they had trouble getting the scanner to work with their Linux scanning program, but contacted M. Allen Noah, administrator of the SANE Project (the open scanning protocol known as Scanner Access Now Easy that works with Linux), who advised them on how to make it work.
Ah, good ol' Linux. You make it so easy.

Saturday, December 13, 2008

On Confrontation

One thing that's always impressed me about the American System is its confrontational nature. There are two big political parties, neither able to destroy the other any time soon. There are three branches of government, and not even a President Bush could really say no to the Supreme Court. There is a prosecutor like any other country, but also a defense lawyer, whose job - get this - is to side completely with the accused whether guilty or innocent. These confrontations would not be possible if they were taken personally, instead of understood to be your opponent merely "doing his or her job." In fact, I think underlying these systems is the belief that it's better to be correct than nice. (Not being a nice person, I really like that.)

One facet of this system that has failed miserably, however, is in the financial regulation area. It's pretty clear why: the bankers are far more powerful than the regulators. The former fly in personal and corporate jets and dine with politicians; the latter wear beige overcoats and shiver in the wind. Okay, maybe my stereotypes are not so good, but it's time the balance of power was restored to this very important confrontation.

Backing Up

I've sometimes thought about printing out digital data in some sort of bar code onto paper as a long-term solution for backing up personal data:

  • Good quality paper lasts a long time in proper storage.
  • Modern laser printers can produce crisp images cheaply. An ink-jet printout would be less crisp, and also more susceptible to moisture damage.
  • Error-correcting codes can be embedded, to help recovery in case of damage to the papers.
  • Redundancy is tedious, but possible. You can photocopy the entire stack of paper.
  • Retrieval is slow, but the technology required is not likely to disappear. The best case is a scanner with a sheet feeder; the worst case is a digital camera.
  • No reliance on computer interfaces that quickly become obsolete.
Add to this a cleanly written, open source implementation of an encoder and particularly a decoder, and I think the data can last for decades. Open source is critical because any future attempt at retrieval should not be dependent on old computers or operating systems. Hell, print the source code along with the data!
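To give a feel for how small such an encoder and decoder could be, here is a toy sketch. Everything in it is an assumption of mine: the function names `encode` and `decode`, the '#'/'.' cell characters, and the 64-cell row width. It also uses a single even-parity cell per row, which only detects damage; a real implementation would want a proper error-correcting code such as Reed-Solomon so that damage can be repaired.

```python
def encode(data: bytes, width: int = 64) -> list[str]:
    """Turn bytes into rows of '#' (1) and '.' (0) cells.

    Each row carries `width` data cells plus one even-parity cell.
    """
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % width)  # pad the last row with zeros
    rows = []
    for start in range(0, len(bits), width):
        chunk = bits[start:start + width]
        parity = str(chunk.count("1") % 2)
        rows.append((chunk + parity).replace("1", "#").replace("0", "."))
    return rows


def decode(rows: list[str], length: int) -> bytes:
    """Rebuild `length` bytes from rows, rejecting rows that fail parity."""
    stream = []
    for number, row in enumerate(rows):
        raw = row.replace("#", "1").replace(".", "0")
        chunk, parity = raw[:-1], raw[-1]
        if str(chunk.count("1") % 2) != parity:
            raise ValueError(f"row {number} failed its parity check")
        stream.append(chunk)
    bits = "".join(stream)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))


message = b"Secret ballot, public count"
page = encode(message, width=32)
print("\n".join(page))
assert decode(page, len(message)) == message
```

The decoder needs the original length because the last row is padded; a printed page would have to record that somewhere, for instance in a header row.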

This is of course not an original idea. Old computer magazines used to come with pages and pages of game source code for eager young readers to type in, but later also came with these bar code strips to ease data entry.

The big problem is probably data density. If we print one bit per dot at 50 dpi, then a letter-size sheet with half an inch of margin on all sides holds only 187,500 bits; storing 187,500 raw bytes per sheet takes about 141 dpi, which is still far below a laser printer's native resolution. At that density, a gigabyte of data would require 5,727 sheets of paper, or less than 12 reams (which should cost about US$60). Printing them out on a Lexmark T642DTN today (45 pages per minute) would take just over two hours. My printer can only go 20 ppm, so it'll take nearly 5 hours to print. The toner supposedly prints 2,500 pages, but that's probably text pages, so let's say 1,000 pages instead, which adds about US$180 of toner expenses. You might be able to get error correction and compression to cancel each other out, so I won't bother computing that.
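The density arithmetic is easy to play with in code. This is a rough sketch under my own assumptions: one bit per printed dot, a gigabyte taken as 2^30 bytes, and helper names (`sheet_capacity_bits`, `sheets_needed`) invented for illustration. The 187,500 bytes-per-sheet figure and the printer speeds are the ones quoted above.

```python
import math


def sheet_capacity_bits(dpi, width_in=8.5, height_in=11.0, margin_in=0.5):
    """Dots that fit in the printable area, assuming one bit per dot."""
    cols = math.floor((width_in - 2 * margin_in) * dpi)
    rows = math.floor((height_in - 2 * margin_in) * dpi)
    return cols * rows


def sheets_needed(total_bytes, bytes_per_sheet):
    """Whole sheets required to hold total_bytes."""
    return math.ceil(total_bytes / bytes_per_sheet)


GIB = 2 ** 30  # assumed definition of a gigabyte

print(sheet_capacity_bits(50))        # dots on a sheet at 50 dpi
print(sheet_capacity_bits(141) // 8)  # bytes on a sheet at about 141 dpi

sheets = sheets_needed(GIB, 187_500)
print(sheets, "sheets per gigabyte")

# Printing time at the two printer speeds mentioned in the post.
for pages_per_minute in (45, 20):
    print(f"{pages_per_minute} ppm: {sheets / pages_per_minute / 60:.1f} hours")
```

Changing the `dpi` argument shows how quickly the page count drops as the dots get smaller, which is the knob that matters most here.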

But if permanence is really important, this may not be an entirely bad idea. Our photo library, last time I checked, weighs in at about 6 GB. However, if we excluded some of the redundant pictures and scaled the resolution down a bit, this solution could be in the ballpark.

Why not just print out the pictures on archival photo paper, you say?