Editing Files after/while verification

Lennart Hagemann posted this 5 weeks ago

Hello,

Is it possible to edit the recognized files during or after verification? For example: read a PDF, put a specific stamp (date, company, signature) on it, then export.

Thanks a lot

AlexeyEfremov posted this 2 weeks ago

Hello Lennart,

Yes, you can do it. Access the page's Picture property (an IPictureObject), get the HBITMAP handle of the image, modify the image, and then replace the page image with the result:

 

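// Get the HBITMAP handle of the first page's image.
var handle = Document.Pages[0].Picture.Handle;
// Image.FromHbitmap expects an IntPtr, so convert the handle explicitly.
System.Drawing.Bitmap bitmap = System.Drawing.Image.FromHbitmap( (System.IntPtr)handle );

// <do something with bitmap>

// Wrap the modified bitmap in a picture object (300 is presumably the resolution in DPI).
IPictureObject FinalPicture = FCTools.PictureFromHBitmap( bitmap.GetHbitmap().ToInt32(), 300 );
// Put the modified image back on the page.
Document.Pages[0].ReplaceImage(FinalPicture);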

 

You can do this in the export script or create a custom document processing stage.
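
As a rough illustration of the "do something with bitmap" step, here is a minimal sketch that draws a text stamp in the bottom-right corner of a bitmap. It uses only standard System.Drawing calls and nothing from the FlexiCapture API. The names StampSketch and DrawStamp, as well as the font, color, and margin, are assumptions chosen for the example and not part of any ABBYY product.

```csharp
using System;
using System.Drawing;

// Hypothetical helper for illustration only; not part of FlexiCapture.
static class StampSketch
{
    // Draws a text stamp near the bottom-right corner of the bitmap, in place.
    // Note: Graphics.FromImage throws for indexed pixel formats, so such a
    // bitmap would have to be converted to a non-indexed format first.
    public static void DrawStamp(Bitmap bitmap, string text)
    {
        using (Graphics g = Graphics.FromImage(bitmap))
        using (Font font = new Font("Arial", 24f, FontStyle.Bold, GraphicsUnit.Pixel))
        using (Brush brush = new SolidBrush(Color.FromArgb(200, Color.DarkRed)))
        {
            // Measure the text so it can be aligned to the corner.
            SizeF size = g.MeasureString(text, font);
            const float margin = 20f; // assumed distance from the page edge, in pixels

            // Clamp to zero so the stamp stays on the image if the text is wider than the page.
            float x = Math.Max(0f, bitmap.Width - size.Width - margin);
            float y = Math.Max(0f, bitmap.Height - size.Height - margin);

            g.DrawString(text, font, brush, x, y);
        }
    }
}
```

In the snippet above, a call such as StampSketch.DrawStamp(bitmap, "Received " + DateTime.Today.ToShortDateString()) would go in place of the placeholder comment, before the bitmap is handed back to the page. A company logo or scanned signature could be placed the same way with Graphics.DrawImage instead of DrawString.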

 

Hope this helps.

Alexey

 

Fritz posted this 2 weeks ago

1. After registration I want to come back to the discussion page I had opened before. Please fix that.

2. Now the question. I want to edit the OCR result behind a PDF's text layer without changing the visible document. I scan old Fraktur pages and would like to edit only the OCR results to make machine search easier, for example replacing the original long ſ with a round s (buſt becomes bust). The original must remain untouched. How can I do that? Thank you.

AlexeyEfremov posted this 2 weeks ago

Hello Fritz,

Unfortunately, FlexiCapture does not have such capabilities by design.

Alexey.

 

Fritz posted this 2 weeks ago

But I vaguely remember that ABBYY has some software that can do that. It’s a frequent problem. Fritz

AlexeyEfremov posted this 2 weeks ago

The ABBYY policy is that the software functions as a "black box": you put in an image and get text as output.

The closest you can get to changing the text is to use the Remove(fromPos, toPos) and Insert(position, insertString, charParams) methods of the Paragraph object in ABBYY FineReader Engine. However, these methods are not meant for wide use because of the processing time they need (negligible for a single document, but noticeable at scale).

We recommend using third-party tools for the tasks you describe.

Alexey

Fritz posted this 2 weeks ago

See https://stackoverflow.com/questions/32914609/how-can-i-edit-the-search-text-of-a-searchable-pdf
   “I'm using ABBYY FineReader 12 Professional. (not open source). Just open a scanned image or scanned pdf and press Verify Text (or Ctrl + F7), then you go over all the spelling errors or low-confidence charachters and fix them.
   The program is very good, it shows you the exact place in image/pdf to correct and the OCR guessing side by side for convenience. It iterates all of them.
   [By the way, I'm using the shortcuts to speed up things: Alt+Enter to add the unrecognized word to dictionary. Ctrl+Delete to skip word or confirm in case you fixed it.]
   Then save the document as a pdf file, Menu: File>Save Document As> PDF File, and you can search it on every pdf reader. The saved file looks the same as the scanned one, but 'behind' it there [is the corrected] text.
   It's weird you tried ABBYY with no luck... it's working great for me. Maybe you didn’t try the Professional version.”

Might that work, Alexey? I don't have FineReader; I'm asking out of keen interest as a tech journalist, see www.Joern.De/Presseausweis (could you give me a free version to try? Fritz@Joern.De).

AlexeyEfremov posted this 2 weeks ago

I thought you were asking about the business/industrial solutions.

Yes, the desktop products have this capability. You can request a trial here:

https://www.abbyy.com/en-eu/download/finereader/

 
