Lars Nielsen's Discoveries

January 20, 2010

Search doesn’t find PDF documents with Adobe PDF Ifilter

Filed under: SharePoint,Troubleshooting — Lars Nielsen @ 6:15 pm

I recently installed the 64-bit Adobe PDF IFilter (knowing that the Foxit one is better, but Adobe is free and it’s hard to convince budget-holders to pay for Foxit).  Once the files were installed on the server, I found the configuration instructions rather misleading.  They start you off editing the registry on the server, whereas I simply went to Shared Services Provider – Search Administration – File Types and added a new file type for PDF.  It should then show up in the list of file types as AcroExch.Document.  This updates all the registry keys specified in the configuration instructions; the only difference is the GUID in the Default value at this location:

HKEY_LOCAL_MACHINE\Software\Microsoft\Office Server\12\Search\Setup\ContentIndexCommon\Filters\Extension\pdf

The GUID was different but this didn’t seem to be a problem. 
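If you want to compare the GUID without opening regedit, something like the following might do. It is only a rough sketch: it assumes Python is available on the server, the key path is copied from this post exactly as written (check it against your own registry), and the function names are just ones I made up for illustration.

```python
# Key path copied from the post as written; verify it in regedit on your server.
FILTER_EXTENSION_ROOT = (
    r"Software\Microsoft\Office Server\12\Search\Setup"
    r"\ContentIndexCommon\Filters\Extension"
)


def filter_key_path(extension):
    """Build the registry subkey path for a file extension such as 'pdf'."""
    return FILTER_EXTENSION_ROOT + "\\" + extension


def read_filter_guid(extension="pdf"):
    """Return the Default value (the IFilter GUID) registered for the extension."""
    import winreg  # Windows-only standard library module, imported on demand

    # Ask for the 64-bit registry view in case a 32-bit Python is in use.
    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, filter_key_path(extension), 0, access
    ) as key:
        # An empty value name reads the key's Default value.
        value, _value_type = winreg.QueryValueEx(key, "")
        return value
```

On the server itself, calling read_filter_guid() should return whatever GUID the File Types page registered, which you can then compare with the one in Adobe's instructions.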

It’s also important to update the docicon.xml file and add a small icon for PDFs to the images folder.  There are lots of blog entries that cover this.  It would be cleaner to do this “no-touch”, via a feature activation, rather than editing the files on the server.  But in fact you only need to make this change in one place, on the Index Server (of which there’s only one in a SharePoint farm anyway), so it’s not a big problem to repeat the change if you need to recover the farm.
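For what it’s worth, the docicon.xml change amounts to one extra Mapping element inside the ByExtension section. Here is a minimal sketch of that edit, working on an in-memory copy of the XML rather than the live file. The icon file name pdf16.gif and the function name are my own assumptions; use whatever icon you actually copied into the images folder.

```python
import xml.etree.ElementTree as ET


def add_pdf_mapping(docicon_xml, icon_file="pdf16.gif"):
    """Return docicon.xml text with a pdf entry under <ByExtension>.

    icon_file is a hypothetical name; it must match the image you added.
    """
    root = ET.fromstring(docicon_xml)
    by_extension = root.find("ByExtension")
    if by_extension is None:
        raise ValueError("no <ByExtension> element found")

    # Update an existing pdf mapping if there is one, otherwise add a new one.
    for mapping in by_extension.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            mapping.set("Value", icon_file)
            break
    else:
        ET.SubElement(by_extension, "Mapping", {"Key": "pdf", "Value": icon_file})

    return ET.tostring(root, encoding="unicode")


# Cut-down stand-in for the real file, just to show the shape of the change.
sample = (
    "<DocIcons><ByProgID/>"
    '<ByExtension><Mapping Key="doc" Value="icdoc.gif"/></ByExtension>'
    '<Default><Mapping Value="icgen.gif"/></Default></DocIcons>'
)
updated = add_pdf_mapping(sample)
```

Running it twice on the same text leaves a single pdf entry, so it should be safe to repeat after a farm recovery. Take a backup of the real docicon.xml before overwriting it.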

I did have a problem when I tried crawling a test PDF document.  The document showed up OK in the crawl log, with no errors.  When I searched for the document by name, I could find it.  But if I searched for keywords within the document, it didn’t show up in the search results.  Eventually I found there’s one more step you need to do on the server, which is to add the folder containing the Adobe PDF IFilter binaries to the PATH variable. To do this:

  1. Click Start
  2. Right-click My Computer (in Windows Server 2008, it’s Computer) and choose Properties
  3. In the Advanced tab (Advanced system settings in Server 2008) click Environment Variables.
  4. In the lower list (System variables) select the Path variable, click Edit, and append the path of the folder containing the IFilter DLL to the end of the value (put a semicolon in front as a delimiter).  In my case the folder is C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\
  5. Click OK twice to close the dialog boxes
  6. Restart the Office Server Search service
  7. At a command prompt, type IISReset to reset IIS
  8. Run a full crawl again to re-crawl the PDF documents.

Here’s what it looks like at step 4:

[Screenshot: Adding a new path entry]

Following this change, I could find text inside PDF documents.
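The PATH edit in step 4 is easy to get slightly wrong (a missing semicolon, or the same folder added twice), so here is a small sketch of the string handling involved. It only builds the new value; it does not write anything to the system. The function name is made up for illustration, and the folder is the one from my server, so yours may differ.

```python
# Folder from the post; adjust to wherever the IFilter was installed.
IFILTER_BIN = "C:\\Program Files\\Adobe\\Adobe PDF iFilter 9 for 64-bit platforms\\bin\\"


def append_to_path(path_value, new_dir):
    """Return path_value with new_dir appended after a semicolon.

    If the folder is already listed (ignoring case, quotes and a trailing
    backslash), the original value is returned unchanged.
    """
    def normalise(entry):
        return entry.strip().strip('"').rstrip("\\").lower()

    entries = [e for e in path_value.split(";") if e.strip()]
    if any(normalise(e) == normalise(new_dir) for e in entries):
        return path_value
    if not entries:
        return new_dir
    # Avoid a doubled delimiter if the existing value already ends with one.
    return path_value.rstrip(";") + ";" + new_dir


current = "C:\\Windows\\system32;C:\\Windows"
new_value = append_to_path(current, IFILTER_BIN)
```

The resulting new_value is what you would paste into the Edit dialog. Passing it back through append_to_path returns it unchanged, so running the check again does no harm.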


1 Comment »

  1. […] ArcoExch.Document as a description of PDF file type on central administration : https://discoverlars.wordpress.com/2010/01/20/adobe-pdf-ifilter/ <- in nut shell this name is […]

    Pingback by Adobe PDF iFilter 9 for 64-bit and MOSS 2007 « Gorgo.Live.ToString() — April 20, 2011 @ 8:00 am

