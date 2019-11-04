THERE ARE SOME THINGS in life that just go together. Fish and chips, shirt and tie, the INQUIRER and snark are just a few. But there's one that seems so logical, you have to wonder why it didn't happen sooner.

Wikipedia and The Internet Archive are two not-for-profits whose work complements each other perfectly.

Little wonder then that the two are joining forces to enhance Wikipedia entries with extracts from books stored in the Internet Archive vaults.

The value of this is significant. It means that if your starting point is lazily citing Wikipedia in an essay (or article), it becomes a lot easier to back it up with some actual citable sources.

So far, the Archive has 3.8 million books scanned and is adding more at the rate of 10,000 per day. The BHAG for the project is four million books scanned and online in the near future.

The Wikipedia citations will take a bit longer though. So far there are around 130,000 citations from around 50,000 books.

It's not a simple process, and Wikipedia editors will have their work cut out ensuring that each entry is formatted correctly with the code for the API, to allow the book to flank the entry.

It's also a largely manual process. With some books predating ISBNs and the ongoing risk that something has been misattributed, and therefore will be again, a lot of care needs to be taken to preserve the accuracy of Wikipedia.

Fortunately, there's already a process in place. As we reported last year, the Internet Archive can already screen-scrape Wikipedia entries and test the links in a record. If any of them are dead, they are automagically (!) replaced by the newest version held by The Internet Archive's mighty Wayback Machine.
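For the curious, the dead-link swap described above can be sketched in a few lines of Python. This is a rough illustration only, not how InternetArchiveBot is actually built: it assumes the Wayback Machine's public availability endpoint, the function names are made up for the example, and the caller is assumed to have already worked out whether a link is dead. The endpoint returns the closest snapshot it holds for a URL.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint (assumed here for illustration).
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def availability_url(link):
    """Build the lookup URL for one cited link (hypothetical helper)."""
    return WAYBACK_AVAILABILITY + "?" + urlencode({"url": link})


def pick_replacement(original, is_dead, availability_json):
    """Return the archived snapshot URL for a dead link, else the original.

    availability_json is the raw JSON text returned by the lookup URL.
    """
    if not is_dead:
        return original
    data = json.loads(availability_json)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    # No snapshot held, so leave the citation untouched.
    return original
```

In practice you would fetch `availability_url(link)` over HTTP and pass the response body to `pick_replacement`. Keeping the network call out of the function makes the decision logic easy to check by hand.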

Since that side of the collaboration began, it's estimated that InternetArchiveBot has 'fixed' links in almost six million entries and counting. μ