Understanding and Creating Full-Text Searchable Documents

Overview

Members of SmartVault Professional plans (i.e., Professional, Accountants, or ProAdvisor) can use the SmartVault full-text search capabilities to search for words within uploaded Microsoft Office documents, such as Word, Excel, or PowerPoint files.

You can also search for a PDF document by words in its contents, provided that the PDF was created as a searchable PDF.

Feature

If you want to search for a PDF document by using words contained within the body text, the PDF document must be created as a searchable PDF document.

Most PDF documents created from applications, such as Microsoft Word or Adobe Acrobat, are automatically created as searchable PDF documents.

However, a PDF document created by a scanner may or may not be searchable. If the PDF document was generated using the scanner’s Optical Character Recognition (OCR) software, then it is created as a searchable PDF document.

If the PDF document was simply scanned as an image without using the scanner’s OCR software, then the PDF document is not searchable, and you cannot search for the document using keywords contained in the body of the document.
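
If you are unsure whether an existing PDF was scanned with OCR, one rough way to check before uploading is to see whether any text can be extracted from it. The sketch below is an illustration only and is not part of SmartVault. It assumes the third-party pypdf library is installed; the function names and the 20-character threshold are hypothetical choices made for this example. A file that yields no extractable text is probably an image-only scan.

```python
from typing import Iterable, Optional


def has_text_layer(page_texts: Iterable[Optional[str]], min_chars: int = 20) -> bool:
    """Return True once the pages contain at least min_chars non-whitespace characters.

    The threshold is an arbitrary assumption for this sketch; it only guards
    against stray characters being mistaken for real body text.
    """
    total = 0
    for text in page_texts:
        # Count only visible characters; treat a missing result as empty text.
        total += sum(1 for ch in (text or "") if not ch.isspace())
        if total >= min_chars:
            return True
    return False


def pdf_is_probably_searchable(path: str, min_chars: int = 20) -> bool:
    """Heuristic check: does the PDF at `path` expose extractable text?"""
    # Imported here so has_text_layer can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() returns the text found on a page (empty for image-only pages).
    return has_text_layer((page.extract_text() for page in reader.pages), min_chars)
```

For example, calling pdf_is_probably_searchable("scan.pdf") on a hypothetical file scanned without OCR would be expected to return False. This is a heuristic rather than a guarantee, because text extraction can fail on unusual fonts or encodings even when the PDF is searchable.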

If you are creating PDF documents by scanning documents with a scanner and want to be able to search for them using the full-text search capabilities of SmartVault, be sure to use the OCR capabilities of the scanner software when scanning documents.

If you use a TWAIN-compliant scanner to scan documents and create PDF files, consult your scanner documentation for more information about how to create a searchable PDF file using your scanner’s OCR capabilities.

If you use a Fujitsu ScanSnap scanner to scan documents and create PDF files, ensure that you have configured the SmartVault ScanSnap profiles for your Fujitsu ScanSnap scanner to include OCR capabilities. For more information about how to configure SmartVault ScanSnap profiles to include OCR capabilities, see Creating SmartVault ScanSnap Profiles for Fujitsu ScanSnap Scanners.

Benefits

Any Microsoft Office document, such as a Microsoft Word, Excel, or PowerPoint file, is automatically indexed after it is uploaded into an account with a Professional, Accountants, or ProAdvisor plan. After the indexing process completes, you can search for the document by using keywords contained within its body as your search term.

Considerations

First, test creating OCR (searchable) documents in your workflow to determine whether you want to enable full-text searching, given the additional time it takes to complete an OCR scan.

...