On 14 Jan 2008, at 21:19, Matthieu Garrigues wrote:
URL: https://svn.lrde.epita.fr/svn/oln/trunk/milena
ChangeLog:
2008-01-14 Matthieu Garrigues <garrigues(a)lrde.epita.fr>
Begin a process to extract arrays.
* sandbox/garrigues/factures/array_global.cc: New.
* sandbox/garrigues/factures/facture.pgm: An receipt image.
(A "New" is missing here, but it's no big deal.)
This is a detail, and just for general culture (for society galas!),
but in this case one says "invoice" rather than "receipt".
[...]
---
array_global.cc | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 101 insertions(+)
Index: trunk/milena/sandbox/garrigues/factures/array_global.cc
===================================================================
--- trunk/milena/sandbox/garrigues/factures/array_global.cc (revision 0)
+++ trunk/milena/sandbox/garrigues/factures/array_global.cc (revision 1661)
@@ -0,0 +1,101 @@
+// Copyright (C) 2008 EPITA Research and Development Laboratory
+//
+// This file is part of the Olena Library. This library is free
+// software; you can redistribute it and/or modify it under the terms
+// of the GNU General Public License version 2 as published by the
+// Free Software Foundation.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+// General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with this library; see the file COPYING. If not, write to
+// the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+// Boston, MA 02111-1307, USA.
+//
+// As a special exception, you may use this file as part of a free
+// software library without restriction. Specifically, if other files
+// instantiate templates or use macros or inline functions from this
+// file, or you compile this file and link it with other files to
+// produce an executable, this file does not by itself cause the
+// resulting executable to be covered by the GNU General Public
+// License. This exception does not however invalidate any other
+// reasons why the executable file might be covered by the GNU General
+// Public License.
+
+#include <mln/core/image2d.hh>
+#include <mln/io/pgm/all.hh>
+#include <mln/io/pbm/all.hh>
+#include <mln/geom/resize.hh>
+#include <mln/display/all.hh>
+#include <mln/value/int_u8.hh>
+#include <mln/morpho/closing.hh>
+#include <mln/win/rectangle2d.hh>
+
+#include <mln/pw/all.hh>
+#include <mln/core/inplace.hh>
+#include <mln/level/stretch.hh>
+#include <mln/labeling/level.hh>
+#include <mln/core/neighb2d.hh>
+
+#include <mln/accu/all.hh>
+
+int main()
+{
+ using namespace mln;
+ using namespace mln;
+
+ typedef image2d<bool> ima2d_bool;
+ typedef image2d<unsigned> ima2d_unsigned;
+
+ image2d<value::int_u8> in;
+
+ io::pgm::load(in , "facture.pgm");
There is one space too many around the comma (the kind of coding-style
detail to fix along the way in another patch).
+
+ // Resize 50%.
Resize by 50%.
+ image2d<value::int_u8> small = in; //geom::resize(in, 0.5);
+
+ // Binarisation.
Binarization.
+ ima2d_bool bin(small.domain());
+ level::paste(inplace(pw::value(small) > pw::cst(50) | small.domain()), bin);
+
+ // Labeling.
+ unsigned nlabels;
+ image2d<unsigned> labels = labeling::level(bin, true, c4(), nlabels);
+
+ // Get the caracteristics of the connected components.
"characteristics" (with an "h"), or alternatively "features" (correction
to be propagated further down).
+ std::vector< accu::pair_< accu::bbox<point2d>, accu::count_<point2d> > > caracteristics(nlabels);
+ mln_fwd_piter_(ima2d_unsigned) p(labels.domain());
+ for_all(p)
+ caracteristics[labels(p)].take(p);
+
+ // Filter the connected components.
+ std::vector<bool> is_array(nlabels);
+ for (int i = 0; i < nlabels; i++)
+ {
+ box_<point2d> b = caracteristics[i].to_result().first;
+ unsigned n = caracteristics[i].to_result().second;
+ float ratio;
+ if (b.pmax() == b.pmin() || n < 10)
+ {
+ is_array[i] = false;
+ continue;
+ }
+
+ if (0 != (b.pmax()[1] - b.pmin()[1]))
+ ratio = float(b.pmax()[0] - b.pmin()[0]) / (b.pmax()[1] - b.pmin()[1]);
+ else
+ ratio = float(b.pmax()[1] - b.pmin()[1]) / (b.pmax()[0] - b.pmin()[0]);
+
+ int area = (b.pmax()[0] - b.pmin()[0]) * (b.pmax()[1] - b.pmin()[1]);
+ is_array[i] = ratio < 0.1 || ratio > 10 || (float(n) / area) < 0.2;
+ }
+
+ // Clean the image.
+ for_all(p)
+ bin(p) = is_array[labels(p)];
+
+ io::pbm::save(bin, "array.pbm");
+}
This all looks pretty cool!
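For readers without Milena at hand, the component filter from the patch can be approximated in plain C++. This is only a sketch under stated assumptions: the `Component` struct and the functions `is_array_like` and `filter_components` are hypothetical names, not part of Olena. The thresholds (10 pixels, ratios 0.1 and 10, density 0.2) are the ones that appear in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the (bounding box, pixel count) pair that the
// patch accumulates for each label; this is not a Milena type.
struct Component
{
  int row_min, col_min;  // top-left corner of the bounding box
  int row_max, col_max;  // bottom-right corner of the bounding box
  unsigned npixels;      // number of pixels carrying this label
};

// Same decision rule as in the patch: keep a component when its bounding
// box is very elongated or when it is sparsely filled.
bool is_array_like(const Component& c)
{
  assert(c.row_max >= c.row_min && c.col_max >= c.col_min);
  const int height = c.row_max - c.row_min;
  const int width = c.col_max - c.col_min;

  // Single-point boxes and tiny components are rejected.
  if ((height == 0 && width == 0) || c.npixels < 10)
    return false;

  // A flat box (one extent is zero) gives ratio == 0 in the patch,
  // hence it is always kept.
  if (height == 0 || width == 0)
    return true;

  const float ratio = float(height) / float(width);
  const float density = float(c.npixels) / float(height * width);
  return ratio < 0.1 || ratio > 10 || density < 0.2;
}

// Apply the rule to every component; the index of each entry plays the
// role of the label.
std::vector<bool> filter_components(const std::vector<Component>& comps)
{
  std::vector<bool> keep(comps.size());
  for (std::size_t i = 0; i < comps.size(); ++i)
    keep[i] = is_array_like(comps[i]);
  return keep;
}
```

Handling the flat-box case explicitly keeps the division away from a zero denominator, and indexing with `std::size_t` avoids the signed/unsigned comparison that `int i < nlabels` produces in the patch.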
Index: trunk/milena/sandbox/garrigues/factures/facture.pgm
===================================================================
Cannot display: file marked as a binary type.
svn:mime-type = application/octet-stream
Property changes on: trunk/milena/sandbox/garrigues/factures/facture.pgm
___________________________________________________________________
Name: svn:mime-type
+ application/octet-stream
Careful, this document does not belong to us! We therefore do not have
the right to distribute it (and hence to store it in the Subversion
repository, which is public!). I know this is inconvenient, but it
has to be removed.
We could possibly build ourselves a database of invoices (which could
lawfully be stored in the repository) from LRDE documents digitized
with our scanner. Remind me about it so that I think it over again.