Me: I live in Silicon Valley with my wife, child and cat. I have worked at Microsoft since I graduated from college, both in the Macintosh Business Unit on products such as Outlook Express, Entourage, IE, and Virtual PC, and in Windows Live on Hotmail, Calendar and People. I am currently a Principal Lead Program Manager on the Windows Live Social Networking team, where I manage a team of Program Managers responsible for delivering features to support our web and client applications. I've been blogging since 2001 and like to play around with .NET in my spare time, working on projects such as dasBlog (the blog that powers this site) and Send to SmugMug (an application for uploading photos to SmugMug). I blog about a number of technology- and productivity-related topics.
© Copyright 2010, Omar Shahine
Why am I writing this post? I wish I didn't have to. I spent a few days trying to figure out why PDFs were not being indexed on Vista, and why none of my TIFF files that were created with Microsoft Office Document Imaging (and have embedded OCR text) were showing up in search results. This was happening on Vista and on XP with Office 2007 and Windows Desktop Search.
After a bit of digging around and a couple of emails, I got the answer.
Vista and Windows Desktop Search 3 (which share the same technology) do not support IFilters that only implement IPersistFile. For the contents of a file to be indexed, its IFilter must support IPersistStream.
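To make the rule concrete, here is a minimal sketch in Python. It is only a model of the decision described above, not a real Windows API: the `can_index` function and the two example filter descriptions are hypothetical, and each filter is represented simply by the set of persistence interfaces it claims to implement.

```python
# Hypothetical model of the rule described in the post. Nothing here calls
# into Windows; a "filter" is just a set of interface names.

REQUIRED_INTERFACE = "IPersistStream"


def can_index(implemented_interfaces):
    """Return True if a filter exposes IPersistStream.

    Per the post, a filter that only implements IPersistFile is not used
    by Vista or Windows Desktop Search 3.
    """
    return REQUIRED_INTERFACE in implemented_interfaces


# Made-up example filters, for illustration only.
example_filters = {
    "file-only filter": {"IPersistFile"},
    "stream-capable filter": {"IPersistFile", "IPersistStream"},
}

for name, interfaces in example_filters.items():
    status = "indexed" if can_index(interfaces) else "skipped"
    print(f"{name}: {status}")
```

The point of the sketch is that supporting IPersistFile neither helps nor hurts. What matters is whether IPersistStream is present.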
If you want an IFilter for PDF files, download Adobe Acrobat Reader 8 for Vista (only the Vista version has the IFilter). The previous IFilter does not work. While you are at it, read this post for instructions on how to make your own very well-behaved Adobe Reader installer.
If you want an IFilter for your TIFF or MDI files, you are out of luck for now.
I would like to add that Microsoft Office Document Imaging is one of the best values in the Office suite. It's completely ignored, and it is no longer installed by default with Office 2007. If you do install it, you can use it with any TWAIN-compatible scanner to scan all your legacy paper. The text on that paper is OCR'ed (recognized as text), and you can save the result as a TIFF file, which can be viewed on almost any computer. That text can also be indexed and searched on your computer, which is a handy way to find things like the receipt for the TV you bought two years ago. You can also create these files from any application that can print. I've been using this for years to save any web receipts that I need.