The Inquirer-Home

Distributed Proofing site goes through the roof

One Page Per Day
Mon Nov 11 2002, 08:23
A WEBSERVER ENDURES Slashdotting and the publishing industry trembles. Two years after launching his Distributed Proofreaders site, Charles Franks, a programmer from Las Vegas, hit the traffic jackpot, shattering records for pages done, increasing his site's users five-fold, and editing nearly 100 books, all in a wild weekend.

So what is DP? It's a website whose sole purpose is to generate cleanly edited public domain ASCII texts for Project Gutenberg. Franks and a few other experienced volunteers scan in the books via a $3,500 sheet-fed gadget and enter the raw OCR'd text into a database along with images of each page. Visitors to Charles' site then edit the text to correspond exactly with the page image, saving the corrections before moving on to additional files. In a second round of proofing, the original edits are double-checked by another person for misses.

At DP, you don't commit to proofing an entire book, just editing one page at a time, though there's a ranking system for who's done the most. When all pages are complete, the experienced volunteer combines them into one large file, sending the completed work to Gutenberg's main internet server; shortly thereafter, the book files are replicated to literally hundreds of mirror sites across the globe. All PG titles are free under terms of the Gutenberg licence.
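For the programmers in the audience, the page-at-a-time workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DP's actual code: the names Page, proof, assemble and ROUNDS_REQUIRED are all hypothetical, the real site keeps its pages in a database alongside the scanned images, and rankings and user accounts are left out entirely.

```python
from dataclasses import dataclass, field

ROUNDS_REQUIRED = 2  # assumption: a first pass plus one double-check by another person


@dataclass
class Page:
    number: int
    ocr_text: str  # raw OCR output stored alongside the page image
    rounds: list = field(default_factory=list)  # corrected text from each finished round

    @property
    def current_text(self) -> str:
        # Latest proofed text, or the raw OCR text if nobody has proofed the page yet.
        return self.rounds[-1] if self.rounds else self.ocr_text


def proof(page: Page, corrected_text: str) -> None:
    """Record one volunteer's corrected text for a single page."""
    if len(page.rounds) >= ROUNDS_REQUIRED:
        raise ValueError(f"page {page.number} has already finished proofing")
    page.rounds.append(corrected_text)


def assemble(pages: list) -> str:
    """Join fully proofed pages, in page order, into one large text."""
    unfinished = [p.number for p in pages if len(p.rounds) < ROUNDS_REQUIRED]
    if unfinished:
        raise ValueError(f"pages still in proofing: {unfinished}")
    ordered = sorted(pages, key=lambda p: p.number)
    return "\n".join(p.current_text for p in ordered)
```

The point of the design is that no volunteer ever holds a whole book: each call to proof touches exactly one page, and assemble refuses to produce the combined file until every page has been through both rounds.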

Project Gutenberg has long attracted gifted programmers to its cause. Back in '93, World Wide Web founder Tim Berners-Lee pondered how best to get PG's ASCII wares out to the public. Others have written scripts to better handle proofing issues, formatting, FTPing, etc. Charles Franks' effort is the first to apply technology in a manner that completely redefines the editing process.

Gutenberg was launched in '71 by Michael Hart, the first regular citizen to get on what became the Internet. Beginning with an e-mailed Declaration of Independence, the sending of which nearly crashed the network of that time, PG grew slowly over the next 30 years into what is now the most significant group literary undertaking since James Murray teamed up with family, friends and casual passersby to create the Oxford English Dictionary. But while Murray died before the OED was complete, Hart has a real good shot at seeing his original objectives realized. It's likely PG's 10,000th book will be added sometime in 2003 (they've got over 6,000 currently).

Hart has a new goal: putting 1,000,000 books on PG. At first it seems impossible, but with people like Charles Franks on his side, and a new means by which anybody can help create ebooks, one page per day, it could well happen in the PG founder's lifetime. Winning a Supreme Court case on copyrights wouldn't hurt either.

Oh, and for those of you who don't believe in ebooks, well: since its inception, where PG was one of the top destinations on the proto-Internet, to the late '90s, where traffic numbers on PG equaled a whole heck of a lot, to now, where with all the mirrors PG's stats can comfortably be estimated as God Knows, but likely still one of the most-trafficked sites around, ebooks have been popular. Other sites offering free titles in quantity have millions of people coming by each month.

Ebook hardware has been a debacle. The former CEO of Gemstar, Inc. gets an early nod for worst marketing decision of the century: he bought the relatively successful Rocket eBook company, alienated his heavily male user base by shutting down a free library for specious reasons, went on the Oprah Winfrey Show to pitch a new ebook device that didn't allow one to add Gutenberg content, and then pretended sales were good while store after store dropped his devices.

However, if you look over to China, where low-cost ebook readers are rapidly being adopted, you do get a real good idea of what AMD means when they say that particular nation "is where the opportunity is." Westerners may have to wait a little longer for the perfect cheap tablet, but they're a-coming, and fast-growing educational products company LeapFrog could surprise us all and start shipping an ebook device next week.

While you're waiting for that reader, why don't you take a minute to visit the Distributed Proofreaders and edit a page today? Won't take much of your time and you'll help advance the cause of literacy.

Big publishers aren't sitting still during all this proofreading. They have a plan for dealing with all the free content. They're gonna team up with authors and print fewer books. µ

David Moynihan is webmaster at Blackmask Online, and publishes an almost-daily newsletter with ebook news and free ebooks.

