Ok!

Thanks Guillaume!

I will look into Tesseract!

Best Regards!

Eduardo

2012/5/23 Guillaume Lazzara <lazzara@lrde.epita.fr>
Dear Eduardo.

On 05/14/2012 07:53 PM, Eduardo Basterrechea wrote:
> I read the website, but I can't find any information on whether you use
> linguistic data in OCR, or whether you can OCR Spanish texts.
>
> Thanks for the project, it seems to be a great product!

Thanks for your interest in our project!
Regarding the Scribo module, its main task is to detect and extract
structure and data in documents.

We apply image processing to the input images and then try to OCR the
detected text regions. For OCR, we use the open source project Tesseract,
which supports many languages, including Spanish.

In Scribo, the functions that call the OCR let the user choose which
language to use for recognition. For Spanish, you have to pass "spa" as
the argument.
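
As a rough illustration of the same language choice outside Scribo, the
sketch below builds a call to the standalone Tesseract command-line tool,
whose -l option takes the language code. It does not show Scribo's own
functions. The helper names (build_tesseract_command, run_ocr) and the
file names are hypothetical, and it assumes the Spanish language data for
Tesseract is installed.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="spa"):
    """Return the argument list for the Tesseract command-line tool.

    Hypothetical helper for illustration; not part of Scribo.
    """
    # "-l" selects the recognition language; "spa" is the code for Spanish.
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="spa"):
    """Run Tesseract if it is installed.

    Returns the path of the text file Tesseract writes, or None when the
    tesseract executable is not found on PATH.
    """
    if shutil.which("tesseract") is None:
        return None
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,
    )
    # Tesseract appends ".txt" to the output base name.
    return output_base + ".txt"
```

For example, run_ocr("page.png", "page") would recognize a hypothetical
page.png as Spanish text and write the result to page.txt.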

For the moment, we do not use any other linguistic data in OCR, such as
a dictionary or semantic post-processing, to improve results.

Let us know if you need more information.

Best regards,

--
Guillaume



--

Eduardo Basterrechea Molina
@ebaste
Founder and CEO
Nanclares de Oca 1F Bajo F
28022 Madrid
Tel: +34 91 3292318
ebaste@molinodeideas.es

Molino de ideas
Onoma
Molinolabs
Gominolabs
Molinarium
Refranario
Dictio
Fonemolabs
Face-Molino
@Molinodeideas