
How to OCR Documents for Free in Google Drive

This post is part of a series called Going Paperless.

Google Drive makes it painless to go paperless. Its collaborative documents, spreadsheets, and presentations already help curtail paper usage, and its OCR feature goes further in curbing the paper mess.

OCR, or Optical Character Recognition, is the most important tech to help you go paperless. Scanned documents on their own are only glorified pictures of your documents, but let your computer recognize the text and they instantly become a ton more useful. We've already looked at how to OCR documents in Adobe Acrobat and even turn them into documents you can edit in Word, but if you don't have a copy of Acrobat or Word, there's an even better option: Google Drive. It includes a little-known free OCR tool, and in this tutorial I'll show you how to use it to turn images and PDF documents into editable text files online with Google Drive.

Scanning the Documents

The first and most important step in OCR is finding the images or PDFs that you want to convert to text files. Google Drive currently supports OCR for .jpeg, .gif, .png, and PDF files up to 2 MB in size.
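If you have a whole folder of scans, it can save time to check them against those limits before uploading. Here is a minimal Python sketch of that check. The function names are hypothetical, the limits are simply the ones quoted above (Google may change them), and I'm assuming that .jpg counts the same as .jpeg and that 2 MB means 2 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Limits as quoted in this article; treat them as assumptions that may change.
# ".jpg" is included on the assumption that it is handled the same as ".jpeg".
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png", ".pdf"}
MAX_SIZE_BYTES = 2 * 1024 * 1024  # 2 MB, read here as 2 * 1024 * 1024 bytes


def ocr_problems(name, size_bytes):
    """Return a list of reasons a file may not be OCRed (empty list = looks fine)."""
    problems = []
    ext = Path(name).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"too large: {size_bytes} bytes")
    return problems


def check_file(path):
    """Run the same check against a real file on disk."""
    p = Path(path)
    return ocr_problems(p.name, p.stat().st_size)
```

Calling check_file("receipt.pdf") on a hypothetical scan returns an empty list when the file passes both checks, or a short list of reasons when it doesn't.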

Tip: If you want to convert multiple pages to text, PDF format is the most efficient as all pages can be uploaded in one batch. However, text will only be extracted from the first 10 pages of the PDF.
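If your PDF runs longer than 10 pages, one workaround (my own assumption, not something Google documents) is to split it into 10-page chunks and upload each chunk separately. The hypothetical helper below only works out the page ranges; splitting the PDF itself is left to whatever PDF tool you prefer.

```python
MAX_OCR_PAGES = 10  # page limit quoted in the tip above; may change


def page_batches(total_pages, batch_size=MAX_OCR_PAGES):
    """Split pages 1..total_pages into inclusive (first, last) page ranges."""
    if total_pages < 0 or batch_size < 1:
        raise ValueError("total_pages must be >= 0 and batch_size >= 1")
    return [
        (start, min(start + batch_size - 1, total_pages))
        for start in range(1, total_pages + 1, batch_size)
    ]


print(page_batches(25))  # [(1, 10), (11, 20), (21, 25)]
```

A 25-page scan would therefore go up as three separate PDFs, each of which still has to stay under the file size limit.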

Preparing Files for Upload

There are a few rules of thumb to produce optimal results from OCR in Google Drive. Make sure your file is high resolution with clear contrasts and even lighting—these are by far the most important factors in ensuring a successful conversion.

To ensure best results, use a standard font like Helvetica and stay away from overly stylized fonts.

Additionally, make sure that the text being scanned is horizontal and reads from left to right. Standard typefaces, such as Helvetica and Times New Roman, will produce better results than more obscure typefaces. If you have a document that's hard to read, you can still attempt to OCR it, but the results will likely be poor.

While Google Drive supports OCR for many different languages, OCR on languages that use non-Latin character sets is still buggy and may not produce desirable results.

Gathering the Files

If the files are on your computer, make sure that their file format is one of those supported by Google Drive. If the document or photo you want to use is physical, scan it onto your computer. If you own a scanner, you can use Doxie's software, or whatever software came with your scanner, to digitize it for upload into Google Drive.

If you don’t have a scanner, your phone can be used in place of one. There are a number of apps in the App Store that “scan” physical documents into clean PDFs using a phone’s camera. I use the app Scanner Pro on a daily basis for this purpose.

Scanner Pro uses the iPhone's camera to create PDF versions of physical documents.

Using Scanner Pro is as simple as taking a picture. The app will identify the edges of the paper in the photo (given the proper lighting and contrast between the paper and the background), then convert each photo into an easy-to-read PDF page. Once you have finished scanning your whole document, email it to yourself or upload it to Dropbox so you can access the file on the computer.

Tip: Scanner Pro has in-app support for Google Drive. However, Google Drive does not OCR documents uploaded directly from the app, so the extra step of emailing the PDF to yourself or uploading it to Dropbox is necessary.

Uploading to Google Drive

To add your documents to Google Drive and get them OCRed, sign in to your Google Drive account. On the My Drive page, click the Settings button on the right side of the page. Under Upload Settings, check both the Convert text from uploaded PDF and image files option and the Confirm settings before each upload option.

Make sure the proper settings are enabled before proceeding to upload the document.

Then, click the Upload button next to the Create button on the left side of the page. Select Files, and find the file that you want to convert to text in your file browser (Finder on a Mac). Click Open.

Upload the files to Google Drive.

An Upload Settings panel will appear. Select the proper language from the Document language drop-down menu, and leave the Confirm settings before each upload box checked. 

Select the proper Document language from the drop-down and keep Confirm settings checked.

Click Start upload once you have confirmed your settings. You can monitor upload progress in the bottom right corner of the page. Uploads typically take around 30 seconds for image files and up to a minute for multipage PDFs.

The uploaded document is converted to a Google Doc carrying the same file name.

Once the upload is complete, a new Google Doc appears in your Drive with the same name as the original file. Open the new document, and rename it if need be. The original file is displayed at the top of the document, and the text extracted from it is displayed directly below.

The original document is displayed above the extracted text.

Google's OCR technology does its best to retain the original formatting of the scanned document; however, some of it may be lost in the conversion process. Luckily, the extracted text is editable, so it's easy to fix anything that went wrong.

Managing OCR Documents

One huge advantage of using OCR in Google Drive is that you can easily share the new document with anyone. To do so, go to File > Share, then add collaborators by sharing a link or sending an invitation via email.

The OCR document can be exported as a different format.

Additionally, the OCR document may be exported as an editable text document, such as a Word Document or a Plain Text document, by going to File > Download As and selecting the format you want. 

Now Get Working!

Google Drive provides a quick and easy way to turn image and PDF files into editable text for free. These files can be easily shared or exported to different file formats for use at any time, and they're far more useful than your original scanned documents since you can now search through and copy text from everything you've scanned.

If you have any trouble OCRing your documents in Google Drive, have any other OCR tools you love, or anything else you'd like to share about going paperless, let us know in the comments!
