Scan and Convert to Text: OCR at Falk Library

Scanning is a form of digitization that reproduces a physical item, such as a book, paper, or image, as a digital image file. Once you have digitized a document, you can convert that static image into searchable, editable text. Optical Character Recognition (OCR) is a technology that interprets the letters in an image and turns them into computer text that can be searched or edited.

OCR works best on documents that were originally typed. Forms, papers, handouts, and screenshots are all common types of documents that can be converted to editable and searchable text. Handwritten documents cannot be converted to text using standard OCR technology.

Falk Library offers equipment and software to scan and convert documents to text. The Scannx Book ScanCenter is an all-in-one station that allows you to quickly scan and save documents. To use OCR capabilities, select Searchable PDF or Word as your file format. Scan multiple pages into a single document to have all of the editable text combined into one file.

Adobe Acrobat Pro offers several tools for working with text in PDF documents. Some PDFs do not have searchable text; these can be converted with OCR using the Recognize Text tool. You can then use the Edit Text tool to make changes directly in your PDF. A PDF with recognized text can also be saved as a Word or Excel file for further editing. The Create Form tool is another option in Adobe Acrobat Pro; it turns boxes and blanks into form entry fields.

The Scannx Book ScanCenter and Adobe Acrobat Pro software are available for use on the upper floor of Falk Library in the Technology Services area.

For more information about OCR, contact the Technology Services Help Desk at 412-648-9109.

~Julia Dahm

Posted in the May 2017 Issue