Sciencemadness Discussion Board
Author: Subject: The FineReader 9 user experience
Polverone
Now celebrating 21 years of madness
*********




Posts: 3186
Registered: 19-5-2002
Location: The Sunny Pacific Northwest
Member Is Offline

Mood: Waiting for spring

posted on 19-12-2007 at 10:43
The FineReader 9 user experience


I have long been a satisfied user of FineReader OCR software, but the recently released version 9 seems to be a step in the wrong direction in many respects.

First, the benefits of the new program:
-Multi-processor or multi-core systems will get a speed boost from use of multiple cores
-Recognition accuracy is claimed to be improved (though it was excellent before)
-Layout recognition is more capable
-Custom quality settings are again available in PDF export (this was missing in 8)
-Recognition can proceed simultaneously with opening, so you don't have to completely finish loading large files before recognition starts

The downsides:
-Much slower, both in recognition and in reviewing recognized pages
-Uses more RAM
-Thumbnail page view can no longer be sorted by anything other than page number
-Page-by-page view of recognition process with text highlighting no longer appears when recognizing all pages
-Buggier (had to manually kill a recognition process that locked some critical resource, re-recognizing one page in a 2000 page batch seemed to freeze the program, and there is no batch recovery after unexpected shutdown like in 8)

I wonder if I am being unfair to the program, since I am using a cracked version. It's possible that the bugginess, slowness, and malformed features are subtle anti-piracy measures that the crack authors did not catch and patch. It's also possible that I'm seeing the real face of FR 9 but that later point releases will fix some of these warts. Right now it feels like a downgrade overall. I haven't read any software reviews that mention these problems, but many software reviews are practically regurgitated product brochures and don't really stress a program.

Have others had experience with FR 9? I have several hundred thousand pages that need OCR. This will be a multi-month project no matter what software I use. I'm going to try running several thousand pages through the latest Omnipage and see how it compares with FR 8 and FR 9.

Edit: Omnipage 16 crashed with an unhandled exception when I tried to load and recognize the same test data I'm using with FineReader. It looks like FR 8 may be my only option due to bugs in other programs, never mind feature sets.

[Edited on 12-19-2007 by Polverone]




chemrox
International Hazard
*****




Posts: 2961
Registered: 18-1-2007
Location: UTM
Member Is Offline

Mood: LaGrangian

posted on 19-12-2007 at 15:03


It seems like a lot of software manufacturers put out new versions for the sake of sales without making real improvements. Witness how MS builds on top of older systems without cleaning out the garbage and makes memory hogs out of everything as a result. Memory gets cheaper, hard drives get bigger, and the software developers demand more and more of the space. It's unending. I've been trying to scan a book for solo and others using Microtek software. The OCR is ABBYY. It has been difficult, a start-and-stop, learn-and-start-again kind of thing. Would you suggest I try FineReader 7 or 8 instead?



"When you let the dumbasses vote you end up with populism followed by autocracy and getting back is a bitch." Plato (sort of)
Polverone
posted on 19-12-2007 at 15:22


I've always scanned through bundled scanner software or a Photoshop TWAIN plugin, so I can't comment on the experience of actually scanning through any of the OCR packages.

I don't mind newer software requiring more hardware resources if it does things better. The worst part about FR 9 is the bugginess, followed by the missing thumbnail sort options. The slowness is minor in comparison and would be worth it, IMO, if it didn't have the regressions compared to FR 8. FR 9 does genuinely offer some new and improved features, but I can't really enjoy them because of other shortcomings.

Part of my problem may be that my use is atypical. I use scripts and programs to take whole nested folders of PDF files, extract all the images, and turn them into one enormous multipage TIFF file. That way I can run OCR on all the page images and review them at the end rather than reviewing one file at a time. After I've reviewed the data I save it as PDF and split the PDF to recreate the individual articles. In the review stage, viewing thumbnails sorted by error/warning messages is tremendously useful because it lets me quickly spot wrongly oriented pages.
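For anyone wanting to try a similar batch workflow, here is a rough sketch of just the bookkeeping side: walking the nested folders and recording which page range of the combined batch belongs to which source PDF, so the final PDF can be split back into articles. This is not the actual set of scripts described above. The names `find_pdfs`, `build_manifest`, and `ArticleSpan` are made up for illustration, and the `page_count` callable is assumed to come from whatever PDF tool you use to extract the images.

```python
from pathlib import Path
from typing import Callable, Iterable, NamedTuple


class ArticleSpan(NamedTuple):
    """Where one source PDF's pages sit inside the combined batch (hypothetical type)."""
    source: Path     # original PDF the pages came from
    first_page: int  # 1-based page index in the combined batch
    last_page: int   # inclusive


def find_pdfs(root: Path) -> list[Path]:
    """Return every PDF under root (any nesting depth) in a stable sorted order."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def build_manifest(
    pdfs: Iterable[Path],
    page_count: Callable[[Path], int],
) -> list[ArticleSpan]:
    """Assign each PDF a contiguous page range in the combined batch.

    page_count is an assumption: supply a function that returns the number
    of page images extracted from a given PDF.
    """
    manifest = []
    next_page = 1
    for pdf in pdfs:
        n = page_count(pdf)
        if n <= 0:
            continue  # nothing extracted from this file, so it gets no range
        manifest.append(ArticleSpan(pdf, next_page, next_page + n - 1))
        next_page += n
    return manifest
```

The page images have to be appended to the big TIFF in the same order as the manifest. After OCR and review, each `ArticleSpan` gives the page range to cut out of the exported PDF to recreate one article.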

My initial test for this project used a bundle of 2800 page images. I didn't have any problems using FR 9 on documents of just a few hundred pages, though it was still slow.

It turns out that Omnipage isn't totally buggy like I thought. For some reason it needed more space on my C drive even though it was loading data from a different, much larger drive. In the past I've experienced more rejected page images with it than with FR, so we'll see.

FR 8 has been rock solid for me. I don't think you can go wrong using it for recognition of the book you're scanning. Like I said earlier, I've never actually used it for image acquisition from a scanner, only for doing the OCR.



