Extracting Plain Text for Indexing

Keyword search requires an index, unless you are willing to scan every document at query time.

An index requires plain text, but many common formats are not plain text, PDF in particular.
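
To make the role of plain text concrete, here is a minimal sketch of a keyword index (an inverted index) built over text that has already been extracted. The names tokenize, build_index and search, and the two sample documents, are made up for illustration; a real indexer would also handle stemming, stop words and persistence.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, keyword):
    """Return the sorted names of the documents containing the keyword."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical sample data: file name -> already extracted plain text.
documents = {
    "a.txt": "Extracting plain text for indexing",
    "b.txt": "An index requires plain text",
}
index = build_index(documents)
print(search(index, "index"))  # ['b.txt'] ("indexing" is a different token)
print(search(index, "plain"))  # ['a.txt', 'b.txt']
```

A lookup is a single dictionary access, which is the advantage over scanning every document for each query. The cost is that every document must first be reduced to plain text.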

Here are some ways to extract plain (possibly formatted) text from a PDF document:
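
One freely available option is the pdftotext command-line tool that ships with Xpdf and Poppler. The sketch below wraps it from Python and assumes pdftotext is installed and on the PATH; the helper names build_pdftotext_command and extract_text are illustrative and not part of any library. The -layout option asks pdftotext to preserve the physical layout of the page, and passing "-" as the output file sends the text to standard output.

```python
import subprocess


def build_pdftotext_command(pdf_path, keep_layout=False):
    """Build the pdftotext argument list; "-" writes the text to stdout."""
    command = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # Preserve the physical layout of the page (columns, indentation).
        command.append("-layout")
    command += [pdf_path, "-"]
    return command


def extract_text(pdf_path, keep_layout=False):
    """Run pdftotext and return the extracted text as a string.

    Raises subprocess.CalledProcessError if pdftotext exits with an error,
    and FileNotFoundError if the tool is not installed.
    """
    result = subprocess.run(
        build_pdftotext_command(pdf_path, keep_layout),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


# Example use, with a hypothetical file name:
#   text = extract_text("report.pdf", keep_layout=True)
```

The extracted text can then be fed to whatever indexer is in use. Scanned PDFs that contain only page images yield little or no text this way and need OCR instead.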

There are also plenty of shareware and strictly commercial products out there.