
Communications of the ACM

ACM TechNews

AI Cracking Open the Vatican's Secret Archives


Untangling handwritten texts in one of the world's largest historical collections.

A new project uses a combination of artificial intelligence and optical character recognition software to scour neglected texts in the Vatican Secret Archives and make transcripts of them available for the very first time.

Credit: Alessandra Benedetti/Corbis/Getty

The In Codice Ratio project uses artificial intelligence and optical character recognition (OCR) software to mine the Vatican Secret Archives and make its documents available for the first time.

Traditional OCR breaks words into letter-images by looking for the spaces between letters, then compares each letter-image to the bank of letters in its memory. After deciding which letter best matches an image, the software encodes that letter as machine-readable text, which makes the document searchable.
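The two steps described above, splitting at gaps and matching against a letter bank, can be sketched in a few lines of Python. This is only a toy illustration, not the software used by any real OCR product: the three-letter `TEMPLATES` bank, the `#`/`.` image encoding, and the function names are all assumptions made for the example.

```python
# Toy letter bank (hypothetical): '#' is ink, '.' is blank paper.
TEMPLATES = {
    "I": ["#", "#", "#"],
    "L": ["#.", "#.", "##"],
    "T": ["###", ".#.", ".#."],
}

def split_on_gaps(rows):
    """Cut the image into letter-images wherever a column has no ink."""
    width = len(rows[0])
    glyphs, start = [], None
    for x in range(width + 1):
        blank = x == width or all(row[x] == "." for row in rows)
        if not blank and start is None:
            start = x                      # a letter begins here
        elif blank and start is not None:
            glyphs.append([row[start:x] for row in rows])
            start = None                   # the letter ended at the gap
    return glyphs

def score(glyph, template):
    """Count matching pixels; templates of a different width never match."""
    if len(glyph[0]) != len(template[0]):
        return -1
    return sum(g == t
               for grow, trow in zip(glyph, template)
               for g, t in zip(grow, trow))

def recognize(rows):
    """Replace each letter-image with its best-matching template letter."""
    return "".join(
        max(TEMPLATES, key=lambda ch: score(glyph, TEMPLATES[ch]))
        for glyph in split_on_gaps(rows)
    )

IMAGE = [
    "#..#.###",
    "#..#..#.",
    "##.#..#.",
]
print(recognize(IMAGE))  # prints "LIT"
```

The sketch depends entirely on blank columns separating the letters, which is the assumption that handwriting breaks.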

This approach works poorly on handwritten text, where letters often run together without clear gaps. In Codice Ratio circumvents the problem with jigsaw segmentation, which breaks words down into something closer to individual pen strokes. The OCR splits each word into a series of vertical and horizontal bands and looks for local minima, then cuts the word at these joints and regroups the resulting pieces into candidate letters.
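A rough sketch of the idea follows. It is not the project's actual code: it assumes the "local minima" are low points in the amount of ink per column, it looks only at vertical bands, and every function name and the `max_join` parameter are hypothetical.

```python
def ink_profile(rows):
    """Amount of ink ('#') in each vertical band (here, each column)."""
    return [sum(row[x] == "#" for row in rows) for x in range(len(rows[0]))]

def local_minima(profile):
    """Interior columns with less ink than the left neighbour and
    no more ink than the right neighbour: the candidate joints."""
    return [x for x in range(1, len(profile) - 1)
            if profile[x] < profile[x - 1] and profile[x] <= profile[x + 1]]

def cut_segments(width, cuts):
    """Turn cut positions into (start, end) column ranges."""
    bounds = [0] + cuts + [width]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]

def candidate_letters(segments, max_join=3):
    """Every run of up to max_join adjacent pieces is a possible letter."""
    candidates = []
    for i in range(len(segments)):
        for j in range(i, min(i + max_join, len(segments))):
            candidates.append((segments[i][0], segments[j][1]))
    return candidates

# A connected "word": there is no blank column, so gap-based splitting fails.
WORD = [
    "#.##.#",
    "#.##.#",
    "######",
]
profile = ink_profile(WORD)                 # [3, 1, 3, 3, 1, 3]
cuts = local_minima(profile)                # [1, 4]
segments = cut_segments(len(WORD[0]), cuts)
print(candidate_letters(segments, max_join=2))
```

In a fuller system each candidate range would then be passed to a letter classifier, and the most plausible sequence of letters kept.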

Applying common-sense training to the OCR helped further refine the software's deciphering ability.

From The Atlantic
Abstracts Copyright © 2018 Information Inc., Bethesda, Maryland, USA


 
