Dear Eduardo,
On 05/14/2012 07:53 PM, Eduardo Basterrechea wrote:
> I read the web, but I can't find info about if you use linguistics data
> in OCR, and if you can OCR spanish texts.
>
> Thanks for the project it seems to be a great product !
Thanks for your interest in our project!
Regarding the Scribo module, its main task is to detect and extract
structure and data in documents.
We perform image processing treatments on images and try to OCR detected
text regions. For OCR, we use the open source project Tesseract which
supports many languages, including Spanish.
In Scribo, the functions that call the OCR let the user choose which language
to use for recognition. For Spanish, pass "spa" as the language argument.
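As a rough illustration of how the language code is passed down, here is a small sketch that builds the equivalent command line for the standalone tesseract executable (usage: tesseract imagename outputbase -l lang). It is not Scribo code: the helper name tesseract_command and the file names are hypothetical, and no shell quoting is done.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Scribo or Tesseract): build the command
// line for the tesseract executable, which takes the language code after -l.
// File names are passed through unquoted, so avoid spaces in this sketch.
std::string tesseract_command(const std::string& image,
                              const std::string& output_base,
                              const std::string& lang)
{
  assert(!lang.empty());  // a language code such as "spa" or "eng" is required
  return "tesseract " + image + " " + output_base + " -l " + lang;
}
```

For a Spanish page, tesseract_command("page.png", "page", "spa") yields a command that writes the recognized text to page.txt.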
For the moment, we do not use any other linguistic data in OCR, such as a
dictionary or semantic post-processing, to improve results.
Let us know if you need more information.
Best regards,
--
Guillaume
Dear Cristián,
On 11/30/2011 01:57 AM, Cristián Canivell Gutiérrez (UPM) wrote:
> The definition of geom::min/max_row/col its that it gives you the image
> limits, so I should get lena's image in a chess pattern.
> Strangely as it may look, it happens to not fill the right and bottom
> edges as you can see in the picture:
>
> image.png
>
> I think there is something wrong in the tutorial as the loop should have
> as upper limit row/col *<=* geom::max_row/col instead of *=*
Sorry for the late answer but thanks for your report!
Indeed, this is an obvious mistake... '=' should be replaced by '<=' in
the loop as you noticed.
It will be fixed in the next release of the documentation.
The geom::max_{row,col}() functions return the maximum {row,col} index,
which must be included in the iteration.
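To make the inclusive bound concrete, here is a minimal, self-contained sketch. It does not use Milena: the box struct and its fields are hypothetical stand-ins for the values returned by geom::min_row(), geom::max_row(), and so on.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an image domain; not a Milena type.
struct box
{
  int min_row, max_row, min_col, max_col;
};

// Fill a chessboard pattern over the whole domain. Cells start at -1 so
// that any cell the loops fail to visit is easy to spot.
std::vector<std::vector<int> > chessboard(const box& b)
{
  assert(b.min_row <= b.max_row && b.min_col <= b.max_col);
  const int nrows = b.max_row - b.min_row + 1;
  const int ncols = b.max_col - b.min_col + 1;
  std::vector<std::vector<int> > img(nrows, std::vector<int>(ncols, -1));

  // Both upper bounds are inclusive: max_row and max_col are valid indices.
  for (int row = b.min_row; row <= b.max_row; ++row)
    for (int col = b.min_col; col <= b.max_col; ++col)
      img[row - b.min_row][col - b.min_col] = (row + col) & 1;
  return img;
}
```

With a strict comparison in place of '<=', the last row and the last column would keep their initial value, which matches the unfilled right and bottom edges in the report.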
Thanks again.
Best regards,
-- Guillaume
Hello,
Unfortunately your package "olena" was rejected because of the following
reason:
You are not uploading to one of those Debian distributions: oldstable stable unstable experimental stable-backports oldstable-backports oldstable-backports-sloppy oldstable-security stable-security testing-security stable-proposed-updates testing-proposed-updates sid wheezy squeeze lenny squeeze-backports lenny-backports lenny-security lenny-backports-sloppy lenny-volatile squeeze-security squeeze-updates wheezy-security unreleased
Please try to fix it and re-upload. Thanks,
--
mentors.debian.net
Hello,
Unfortunately your package "olena" was rejected because of the following
reason:
Couldn't find user olena(a)lrde.epita.fr. Exiting.
Please try to fix it and re-upload. Thanks,
--
mentors.debian.net
On 08/03/2012 00:22, Mathias Gaunard wrote:
> Hi,
Hi,
> Following my invitation at LRDE today from Alexandre Borghi, I was
> curious about SCOOP v2.
>
> I've only been able to find this presentation
> <http://www.lrde.epita.fr/dload/techrep/techrep0601.pdf>
In the Publications section of our web site:
http://www.lrde.epita.fr/cgi-bin/twiki/view/Publications/InConference
just search for "scoop".
> I was surprised to see that the mechanism being described at the end of
> the presentation was similar to Boost.Dispatch, a library that I have
> developed with Joel Falcou as part of the NT² library and that I plan to
> submit to the Boost C++ Libraries.
>
I do not know Boost.Dispatch. FYI, we have been using static dispatch in
our library since about 2001.
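For readers unfamiliar with the term, here is a generic sketch of static dispatch in C++, where a base class is parameterized by its exact (most derived) type so that calls are resolved at compile time. It only illustrates the general idea and is not SCOOP itself; the names Image, image2d, image3d and describe are hypothetical.

```cpp
#include <cassert>
#include <string>

// Base class parameterized by the exact (most derived) type.
template <typename Exact>
struct Image
{
  std::string name() const
  {
    // The downcast is resolved at compile time: no virtual call is involved.
    return static_cast<const Exact&>(*this).name_impl();
  }
};

struct image2d : Image<image2d>
{
  std::string name_impl() const { return "image2d"; }
};

struct image3d : Image<image3d>
{
  std::string name_impl() const { return "image3d"; }
};

// Generic routine: accepts any Image<E> and dispatches statically on E.
template <typename E>
std::string describe(const Image<E>& ima)
{
  return ima.name();
}
```

Calling describe(image2d()) selects image2d::name_impl at compile time, without any virtual table.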
> Unfortunately I wasn't able to understand what it exactly does and how
> it works due to the pseudo-code being used.
First read the 2003 paper about SCOOP, I'm sure it will help ;-)
> Could someone tell me where to find a more technical example of what it
> exactly does and whether evolutions have been made?
Unfortunately, we do not have any material yet on how we have
simplified SCOOP 2. We'll let you know when such information
becomes available.
> Thank you.
You're welcome.
Cheers,
théo
Hi,
Following my invitation at LRDE today from Alexandre Borghi, I was
curious about SCOOP v2.
I've only been able to find this presentation
<http://www.lrde.epita.fr/dload/techrep/techrep0601.pdf>
I was surprised to see that the mechanism being described at the end of
the presentation was similar to Boost.Dispatch, a library that I have
developed with Joel Falcou as part of the NT² library and that I plan to
submit to the Boost C++ Libraries.
Unfortunately I wasn't able to understand what it exactly does and how
it works due to the pseudo-code being used.
Could someone tell me where to find a more technical example of what it
exactly does and whether evolutions have been made?
Thank you.
On 30/11/2011 10:28, Mikado Mikado wrote:
> Hi thanks for reply,
>
> I have installed this:
>
> user@nuxeo:~$ tesseract -v
> tesseract 3.01
>
> And I have installed the tesseract-ocr-dev packages too.
> I have Ubuntu 11.10 with g++ 4.6.1.
>
> I get this error:
>
> In file included from
> ../../../../../scribo/scribo/draw/bounding_box_links.hh:37:0,
> from
> ../../../../../scribo/scribo/debug/linked_bboxes_image.hh:47,
> from
> ../../../../../scribo/scribo/toolchain/internal/content_in_hdoc_functor.hh:85,
> from
> ../../../../../scribo/scribo/toolchain/content_in_hdoc.hh:33,
> from
> ../../../../../scribo/src/contest/DAE-2011/content_in_hdoc_dae.cc:38:
> ../../../../../milena/mln/canvas/browsing/depth_first_search.hh:92:34:
> warning: uninitialized const ‘mln::canvas::browsing::depth_first_search’
> [-fpermissive]
> ../../../../../milena/mln/canvas/browsing/depth_first_search.hh:80:14:
> note: ‘const struct mln::canvas::browsing::depth_first_search_t’ has
> no user-provided default constructor
Since you are using -fpermissive, this is no longer an error. GCC
still warns you about the issue but otherwise ignores it.
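For reference, the diagnostic comes from default-initializing a const object whose class has no user-provided default constructor. The sketch below reproduces the situation with hypothetical types (plain_t and with_ctor_t are not Milena names) and shows two forms that compile without -fpermissive.

```cpp
#include <cassert>

// No user-provided default constructor.
struct plain_t
{
  int id() const { return 1; }
};

// User-provided default constructor.
struct with_ctor_t
{
  with_ctor_t() {}
  int id() const { return 2; }
};

// const plain_t rejected;  // the form g++ 4.6 diagnoses unless -fpermissive is given

// Accepted: the const object has an explicit initializer.
const plain_t value_initialized = plain_t();

// Accepted: the class has a user-provided default constructor.
const with_ctor_t default_initialized;
```

Either adding an initializer or giving the type a user-provided default constructor removes the need for -fpermissive in this situation.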
>
> And this:
>
> In file included from
> ../../../../scribo/src/text/pbm_lines_recognition.cc:36:0:
> ../../../../scribo/scribo/text/recognition.hh: In function ‘void
> scribo::text::recognition(scribo::line_set<L>&, const char*)’:
> ../../../../scribo/scribo/text/recognition.hh:112:7: error:
> ‘TessBaseAPI’ has not been declared
> ../../../../scribo/scribo/text/recognition.hh:174:12: error:
> ‘TessBaseAPI’ has not been declared
> ../../../../scribo/scribo/text/recognition.hh: In function ‘void
> scribo::text::recognition(const mln::Image<I>&, const char*, const
> string&)’:
> ../../../../scribo/scribo/text/recognition.hh:220:7: error:
> ‘TessBaseAPI’ has not been declared
> ../../../../scribo/scribo/text/recognition.hh:248:17: error:
> ‘TessBaseAPI’ has not been declared
> make[1]: *** [pbm_lines_recognition-pbm_lines_recognition.o] Error 1
> make[1]: Leaving directory
> ‘/home/master/Descargas/olena-2.0/_build/scribo/src/text’
> make: *** [all-recursive] Error 1
>
According to your config.log and what you are telling me, it seems that
you have two versions of Tesseract installed: 2.04 and 3.01. Is that right?
I have tried to compile Olena on a fresh install of Ubuntu 11.10 and
everything went fine (with -fpermissive). Tesseract 2.04 was used
though, since, AFAIK, this is the only version officially provided by
Ubuntu for now.
* Did you install both versions of Tesseract in the standard
directories (/usr/ but not /usr/local)? Our configure script might be
getting confused by the two installed versions.
* What does this command print:
$ which tesseract
* You will definitely get the best results with Tesseract 3.01. I advise you
to install it manually in a separate location by doing the following:
#Download files
sudo apt-get install libleptonica-dev
cd /tmp
wget http://tesseract-ocr.googlecode.com/files/tesseract-3.01.tar.gz http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.01.eng.tar.gz
#Install Tesseract core
tar zxvf tesseract-3.01.tar.gz
cd tesseract-3.01
./autogen.sh
./configure --prefix=/usr/local
sudo make install
#Install English support
cd ..
tar zxvf tesseract-ocr-3.01.eng.tar.gz
sudo mv tesseract-ocr/tessdata/* /usr/local/share/tessdata/
Now you have Tesseract 3.01 installed in /usr/local
Go back to your olena directory and set the location of Tesseract during
configure:
cd <olena_dir>
./configure --with-tesseract=/usr/local
make
Now Olena should use Tesseract 3.01 and compile without any issue.
--
Guillaume Lazzara
EPITA Research and Development Laboratory (LRDE)
14-16 rue Voltaire F-94276 Le Kremlin-Bicetre France
phone +33 1 53 14 59 39 - fax +33 1 53 14 59 22