OCR-D-compliant page segmentation

Last update: Sep 10, 2022

Overview

ocrd_segment

This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation.

Installation

In your virtual environment, run:

pip install .

Usage

exporting page images (including results from preprocessing like cropping/masking, deskewing, dewarping or binarization) along with region polygon coordinates and metadata, also MS-COCO:
- ocrd-segment-extract-pages
exporting region images (including results from preprocessing like cropping/masking, deskewing, dewarping or binarization) along with region polygon coordinates and metadata:
- ocrd-segment-extract-regions
exporting line images (including results from preprocessing like cropping/masking, deskewing, dewarping or binarization) along with line polygon coordinates and metadata:
- ocrd-segment-extract-lines
importing layout segmentations from other formats (mask images, MS-COCO JSON annotation):
- ocrd-segment-from-masks
- ocrd-segment-from-coco
repairing layout segmentations (input file groups N >= 1, based on heuristics implemented using Shapely):
- ocrd-segment-repair 🚧 (much to be done)
comparing different layout segmentations (input file groups N = 2, compute the distance between two segmentations, e.g. automatic vs. manual):
- ocrd-segment-evaluate 🚧 (very early stage)
pattern-based segmentation (input file groups N=1, based on a PAGE template, e.g. from Aletheia, and some XSLT or Python to apply it to the input file group):
- ocrd-segment-via-template 🚧 (unpublished)
data-driven segmentation (input file groups N=1, based on a statistical model, e.g. Neural Network):
- ocrd-segment-via-model 🚧 (unpublished)

For a detailed description of input/output and parameters, see ocrd-tool.json.

Testing

None yet.

Comments

Processor segment-repair ends with exception

The processor 'segment-repair' ends with "Exception: ocrd-segment-repair exited with non-zero return value 1" if it comes after processor 'cis-ocropy-segment' in the workflow.

In a modified workflow, where processor 'cis-ocropy-segment' is replaced by processor 'tesserocr-segment-line', the processing runs.

opened by j-panzer 7

Conversion Error

When using the ocrd-segment-repair, I've encountered the following Error

10:30:48.758 INFO processor.RepairSegmentation - Sanitizing region "region0071"
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/bin/ocrd-segment-repair", line 8, in <module>
    sys.exit(ocrd_segment_repair())
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/cli.py", line 13, in ocrd_segment_repair
    return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd/decorators.py", line 60, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd/processor/base.py", line 57, in run_processor
    processor.process()
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/repair.py", line 88, in process
    self.sanitize_page(page, page_id)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/repair.py", line 202, in sanitize_page
    scale = int(np.median(np.array(heights)))
ValueError: cannot convert float NaN to integer

The region in question comes from a newspaper digitization.

It is possible to work around this at line 202 in repair.py (see the traceback above) with a check like

            _median = np.median(np.array(heights))
            if not np.isnan(_median):
                scale = int(_median)
            else:
                scale = 1

which finally yields, at the same place:

10:48:47.496 INFO processor.RepairSegmentation - Sanitizing region "region0071"
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
10:48:47.699 WARNING processor.RepairSegmentation - Zero contour area in region "region0071"

but this way the processing continues.
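
The same guard can also be placed before NumPy is called at all, which avoids the "Mean of empty slice" warning as well as the NaN conversion. This is a minimal sketch only; the helper name `median_scale` and the fallback value are assumptions for illustration and are not part of ocrd_segment:

```python
import numpy as np

def median_scale(heights, default=1):
    """Return the integer median of `heights`, or `default` if there are none."""
    heights = np.asarray(heights, dtype=float)
    if heights.size == 0:
        # np.median of an empty array is NaN, and int(NaN) raises ValueError
        return default
    return int(np.median(heights))
```

With an empty list this returns the fallback instead of raising; with real heights it behaves like the original expression.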

opened by M3ssman 5

change default output filegroup for `ocrd-segment-replace-original`

The default output filegroup for ocrd-segment-replace-original is set to OCR-D-IMG-CROP which already exists in the majority of METS-files. Would be great to change the default value, so a user is not forced to specify it on his own (as for the other processors this is purely optional). @kba suggested, that changing the default value might actually not be necessary, if we could drop the rule for two output filegroups also for this processor.

opened by EEngl52 4
sanitize: stay on page image/array

Fixes #21 – it was not correct to use the region image/array here, because that depends on the bounding box of the region, which can be too small.

Something not covered by this is when TextLine coordinates even extend beyond the page Border.

opened by bertsky 4
Expand regions via repair/sanitize
Before sanitization:

Regions are often too small and do not span the lines they (should) contain.

After sanitization:

Situation is not much better, although

$ ocrd-segment-repair -J ... "sanitize": { "type": "boolean", "default": false, "description": "Shrink and/or expand a region in such a way that it coordinates include those of all its lines" } ...

Expansion does not work, is not complete.
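
In its crudest form, the behaviour that the `sanitize` description asks for (a region whose coordinates include those of all its lines) could be illustrated by taking the bounding box of all line points. This is a sketch under strong assumptions: it uses axis-aligned boxes rather than polygons, and `enclosing_box` is a hypothetical helper, not the processor's implementation:

```python
def enclosing_box(line_polygons):
    """Return the axis-aligned box (4 corner points) containing all line points."""
    xs = [x for polygon in line_polygons for x, _ in polygon]
    ys = [y for polygon in line_polygons for _, y in polygon]
    if not xs:
        # no lines, so there is nothing to enclose
        return []
    return [(min(xs), min(ys)), (max(xs), min(ys)),
            (max(xs), max(ys)), (min(xs), max(ys))]
```

A region expanded this way spans its lines by construction, at the cost of possibly overlapping neighbouring regions.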
bug
opened by wrznr 4
Add the basic project layout and minimal functionality

This is supposed to be an OCR-D processor which someday will give plausibility feedback on a page's segmentation. It uses https://pypi.org/project/Shapely/ as proposed by @bertsky.

opened by wrznr 4
ocrd-segment-extract-lines ignores lines with "\n" in
I have used ocrd-segment-extract-lines with a PAGE file in which some <TextLine> elements have a "\n" in the <Unicode> content. Unfortunately, these lines are not extracted.

This example works ok:

<pc:TextEquiv> <pc:Unicode>1889</pc:Unicode> </pc:TextEquiv>

This example does not work:

<pc:TextEquiv> <pc:Unicode>1889 </pc:Unicode> </pc:TextEquiv>

==> please clarify ...
opened by stefanCCS 3

Processor ocrd-segment-repair exits with exception

Log output:

12:06:01.982 INFO ocrd.task_sequence.run_tasks - Start processing task 'segment-repair -I OCR-D-SEG-REG -O OCR-D-SEG-REPAIR -p '{"plausibilize": true, "sanitize": false, "plausibilize_merge_min_overlap": 0.9}''
Traceback (most recent call last):
  File "/venv-20200919/bin/ocrd", line 8, in <module>
    sys.exit(cli())
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/venv-20200919/lib/python3.7/site-packages/ocrd/cli/process.py", line 28, in process_cli
    run_tasks(mets, log_level, page_id, tasks, overwrite)
  File "/venv-20200919/lib/python3.7/site-packages/ocrd/task_sequence.py", line 149, in run_tasks
    raise Exception("%s exited with non-zero return value %s. STDOUT:\n%s\nSTDERR:\n%s" % (task.executable, returncode, out, err))
Exception: ocrd-segment-repair exited with non-zero return value 1. STDOUT:

STDERR:
12:06:02.420 INFO processor.RepairSegmentation - INPUT FILE 0 / PHYS_0001
12:06:02.423 INFO ocrd.page_validator - Validating input file 'FILE_0001_OCR-D-SEG-REG'
12:06:02.439 INFO processor.RepairSegmentation - INPUT FILE 1 / PHYS_0002
12:06:02.440 INFO ocrd.page_validator - Validating input file 'FILE_0002_OCR-D-SEG-REG'
Traceback (most recent call last):
  File "/venv-20200919/local/sub-venv/headless-tf1/bin/ocrd-segment-repair", line 8, in <module>
    sys.exit(ocrd_segment_repair())
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/cli.py", line 16, in ocrd_segment_repair
    return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/decorators.py", line 102, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
    processor.process()
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 94, in process
    parents = list(set([region.parent_object_ for region in page.get_AllRegions(classes=['Text'])]))
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_models/ocrd_page_generateds.py", line 2905, in __hash__
    return hash(self.id)
AttributeError: 'PageType' object has no attribute 'id'
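
The traceback shows `set(...)` calling `__hash__` on a parent object that has no `id` attribute. One way to deduplicate such objects without relying on `__hash__` is to key them by object identity. The following is only a sketch of that idea; `unique_by_identity` and the stand-in class `Unhashable` are hypothetical, and whether ocrd_segment resolved the problem this way is not stated here:

```python
def unique_by_identity(objects):
    """Deduplicate objects by identity, without calling __hash__ or __eq__ on them."""
    seen = {}
    for obj in objects:
        # id(obj) is a plain int, so the object's own __hash__ is never used
        seen.setdefault(id(obj), obj)
    return list(seen.values())

class Unhashable:
    """Stand-in for an object whose __hash__ needs a missing attribute."""
    def __hash__(self):
        return hash(self.id)  # raises AttributeError: there is no `id`
```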

opened by stweil 3

Repair fix coords
This attempts to fix problems caused by invalid polygons from ocrd-segment-repair (both in sanitize and plausibilize mode).

This taught me another lesson about what can go wrong with Shapely / numpy / PAGE interaction. To sum up:

1. Ensuring valid polygons on the input side (e.g. from OpenCV) is always necessary. The only generic way I can think of is to feed them through simplify with ever increasing tolerance until valid. EDIT2 The problem is that the result of the algorithm implemented in Shapely/GEOS depends on the starting point it picked. In pathological cases, no simplification whatsoever can be achieved. (The only thing that then helps is re-ordering...)

2. Operations like union or intersection can create collections of shapes. EDIT There are actually 2 cases here:

   - homogeneous (MultiPolygon) – a discontiguous collection of Polygon – in which case one needs the convex hull.

   - heterogeneous (GeometryCollection) – a collection of Polygon with Point or LineString – in which case one needs to filter out those shapes which have no intrinsic area (and then check again for the other cases)

3. Operations like union or intersection can create non-integer points, which when rounded for PAGE serialization can become invalid paths. Unfortunately, Shapely always calculates in floating point internally. So all we can do is rounding and then ensuring validity (as in 1).
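
The two collection cases and the rounding step described above could be sketched with Shapely roughly as follows. The function names `as_single_polygon` and `round_coords` are hypothetical and this is not the code of the pull request; it only illustrates the rules as stated (filter out area-less parts, take the convex hull of discontiguous polygons, round and then re-check validity):

```python
from shapely.geometry import GeometryCollection, LineString, Point, Polygon
from shapely.ops import unary_union

def as_single_polygon(geom):
    """Reduce the result of a union/intersection to a single polygon."""
    if geom.geom_type == 'GeometryCollection':
        # heterogeneous case: drop points and lines, which have no area
        parts = [part for part in geom.geoms if part.area > 0]
        geom = unary_union(parts)
    if geom.geom_type == 'MultiPolygon':
        # homogeneous case: bridge the discontiguous parts
        geom = geom.convex_hull
    return geom

def round_coords(polygon):
    """Round exterior coordinates to integers; the caller must re-check is_valid."""
    return Polygon([(round(x), round(y)) for x, y in polygon.exterior.coords])
```

Note that `round_coords` deliberately does not repair anything: as the text says, rounding can itself produce an invalid path, so validity has to be ensured again afterwards.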

Related:

https://github.com/cisocrgroup/ocrd_cis/issues/67

https://github.com/cisocrgroup/ocrd_cis/issues/62

https://github.com/OCR-D/ocrd_tesserocr/issues/149

https://github.com/OCR-D/ocrd_tesserocr/issues/151
opened by bertsky 3
Update README, only announce features that are actually provided

README announces ocrd-segment-via-template and ocrd-segment-via-model – none of which are actually provided by this package. It does provide some ocrd-segment-extract-* features; these do not do any segmentation though (or I could not find out how).

opened by dariok 3
cannot upload to pypi anymore
The change https://github.com/OCR-D/ocrd_segment/commit/c8756272caf900febe7166f8bed5d20713f002cf depends on https://github.com/ppwwyyxx/cocoapi/pull/7, an addition of mine to the current pycocotools version 2.0.3 on PyPI. Such git URL references are allowed in requirements.txt / setuptools, but the PyPI server refuses taking such builds:

Invalid value for requires_dist. Error: Can't have direct dependency: 'pycocotools @ git+https://github.com/bertsky/pycocotools#subdirectory=PythonAPI'

@kba, do you know what to do under such circumstances?
opened by bertsky 2

Build fails for MacOS (ocrd-fork-pycocotools)

Running make all for ocrd_all or pip install . for ocrd_segment fails on MacOS with Homebrew:

      Compiling pycocotools/_mask.pyx because it changed.
      [1/1] Cythonizing pycocotools/_mask.pyx
      /private/var/folders/wf/g2hmm5bd72v2r_p0r1smct_00000gn/T/pip-install-0xp4jh31/ocrd-fork-pycocotools_7b0159a305264f708a622a0e4daa80bd/.eggs/Cython-3.0.0a11-py3.9.egg/Cython/Compiler/Main.py:345: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /private/var/folders/wf/g2hmm5bd72v2r_p0r1smct_00000gn/T/pip-install-0xp4jh31/ocrd-fork-pycocotools_7b0159a305264f708a622a0e4daa80bd/pycocotools/_mask.pyx
        tree = Parsing.p_module(s, pxd, full_module_name)
      building 'pycocotools._mask' extension
      creating build/common
      creating build/temp.macosx-12-arm64-cpython-39
      creating build/temp.macosx-12-arm64-cpython-39/common
      creating build/temp.macosx-12-arm64-cpython-39/pycocotools
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -I/OCR-D/venv-20221112/lib/python3.9/site-packages/numpy/core/include -I./common -I/OCR-D/venv-20221112/include -I/opt/homebrew/opt/python@3.9/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c ../common/maskApi.c -o build/temp.macosx-12-arm64-cpython-39/../common/maskApi.o -Wno-cpp -Wno-unused-function -std=c99
      clang: error: no such file or directory: '../common/maskApi.c'
      clang: error: no input files
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for ocrd-fork-pycocotools

opened by stweil 7

Error in shapely/ocrd_segment

The segment-repair processor in the following workflow:

ocrd process \
"olena-binarize -I OCR-D-IMG -O OCR-D-BIN -P impl sauvola" \
"anybaseocr-crop -I OCR-D-BIN -O OCR-D-CROP" \
"olena-binarize -I OCR-D-CROP -O OCR-D-BIN2 -P impl kim" \
"cis-ocropy-denoise -I OCR-D-BIN2 -O OCR-D-BIN-DENOISE -P level-of-operation page" \
"cis-ocropy-deskew -I OCR-D-BIN-DENOISE -O OCR-D-BIN-DENOISE-DESKEW -P level-of-operation page" \
"tesserocr-segment-region -I OCR-D-BIN-DENOISE-DESKEW -O OCR-D-SEG-REG" \
"segment-repair -I OCR-D-SEG-REG -O OCR-D-SEG-REPAIR -P plausibilize true" \
"cis-ocropy-deskew -I OCR-D-SEG-REPAIR -O OCR-D-SEG-REG-DESKEW -P level-of-operation region" \
"cis-ocropy-clip -I OCR-D-SEG-REG-DESKEW -O OCR-D-SEG-REG-DESKEW-CLIP -P level-of-operation region" \
"tesserocr-segment-line -I OCR-D-SEG-REG-DESKEW-CLIP -O OCR-D-SEG-LINE" \
"segment-repair -I OCR-D-SEG-LINE -O OCR-D-SEG-REPAIR-LINE -P sanitize true" \
"cis-ocropy-dewarp -I OCR-D-SEG-REPAIR-LINE -O OCR-D-SEG-LINE-RESEG-DEWARP" \
"calamari-recognize -I OCR-D-SEG-LINE-RESEG-DEWARP -O OCR-D-OCR -P checkpoint_dir qurator-gt4histocr-1.0"

executed on the DEFAULT file group inside this workspace: https://content.staatsbibliothek-berlin.de/dc/PPN631277528.mets.xml

produces the following error:

  12:45:52.522 INFO processor.RepairSegmentation - INPUT FILE 0 / PHYS_0001
  12:45:52.524 INFO ocrd.page_validator.validate - Validating input file 'FILE_0001_OCR-D-SEG-LINE'
  12:45:52.652 INFO processor.RepairSegmentation - INPUT FILE 1 / PHYS_0002
  12:45:52.654 INFO ocrd.page_validator.validate - Validating input file 'FILE_0002_OCR-D-SEG-LINE'
  12:45:52.776 INFO processor.RepairSegmentation - INPUT FILE 2 / PHYS_0003
  12:45:52.777 INFO ocrd.page_validator.validate - Validating input file 'FILE_0003_OCR-D-SEG-LINE'
  12:45:52.912 INFO processor.RepairSegmentation - INPUT FILE 3 / PHYS_0004
  12:45:52.914 INFO ocrd.page_validator.validate - Validating input file 'FILE_0004_OCR-D-SEG-LINE'
  12:45:53.017 INFO processor.RepairSegmentation - INPUT FILE 4 / PHYS_0005
  12:45:53.019 INFO ocrd.page_validator.validate - Validating input file 'FILE_0005_OCR-D-SEG-LINE'
  12:45:53.026 WARNING processor.RepairSegmentation - Fixed CoordinateValidityError for SeparatorRegion 'region0011'
  12:45:53.027 WARNING processor.RepairSegmentation - Fixed CoordinateValidityError for SeparatorRegion 'region0012'
  12:45:53.119 WARNING processor.RepairSegmentation - Zero contour area in region "region0000"
  12:45:53.730 WARNING processor.RepairSegmentation - Zero contour area in region "region0011"
  12:45:53.734 WARNING processor.RepairSegmentation - Zero contour area in region "region0012"
  12:45:54.609 INFO processor.RepairSegmentation - INPUT FILE 5 / PHYS_0006
  12:45:54.610 INFO ocrd.page_validator.validate - Validating input file 'FILE_0006_OCR-D-SEG-LINE'
  12:45:54.708 INFO processor.RepairSegmentation - INPUT FILE 6 / PHYS_0007
  12:45:54.710 INFO ocrd.page_validator.validate - Validating input file 'FILE_0007_OCR-D-SEG-LINE'
  12:45:54.812 WARNING processor.RepairSegmentation - Zero contour area in region "region0003"
  12:45:55.186 ERROR shapely.geos - TopologyException: side location conflict at 262 1071. This can occur if the input geometry is invalid.
  Traceback (most recent call last):
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/bin/ocrd-segment-repair", line 8, in <module>
      sys.exit(ocrd_segment_repair())
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
      return self.main(*args, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1055, in main
      rv = self.invoke(ctx)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
      return ctx.invoke(self.callback, **ctx.params)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 760, in invoke
      return __callback(*args, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/cli.py", line 21, in ocrd_segment_repair
      return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/decorators/__init__.py", line 108, in ocrd_cli_wrap_processor
      run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 88, in run_processor
      processor.process()
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 188, in process
      padding=self.parameter['sanitize_padding'])
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 559, in shrink_regions
      if len(contour) >= 3], scale=scale)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/project.py", line 179, in join_polygons
      jointp = unary_union(polygons)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/shapely/ops.py", line 161, in unary_union
      return geom_factory(lgeos.methods['unary_union'](collection))
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/shapely/geometry/base.py", line 73, in geom_factory
      raise ValueError("No Shapely geometry can be created from null value")
  ValueError: No Shapely geometry can be created from null value

This is the input image: FILE_0007_DEFAULT

opened by MehmedGIT 0

ocrd-segment-extract-lines - Lines are not extracted, in case they are in an area of other lines

Hi, I think I have found a bug in ocrd-segment-extract-lines. I cannot prove it 100%, but in my environment I see that lines are not extracted (no images are created) when a line lies graphically (in terms of its coordinates) within another line of the same region. I extract only images in this case, using this command:

ocrd-segment-extract-lines -I $infolder -O $extractLineImagesFolder  -P  output-types '[]' -P min-line-length 0 -P min-line-width 5 -P min-line-height 5

PAGE excerpt: here the line TR-15_line0002 was not extracted:

    <pc:TextRegion id="TR-15" orientation="0.">
      <pc:AlternativeImage filename="OCR-D-REG-VL-BL/OCR-D-REG-VL-BL_4749_007817786_00183_TR-15.IMG-DESKEW.png" comments=",binarized,deskewed,verticallinesremoved" />
      <pc:Coords points="237,383 237,438 443,438 443,383" />
      <pc:TextLine id="TR-15_line0001">
        <pc:Coords points="237,438 237,383 239,383 253,391 311,391 320,383 349,383 357,390 365,383 384,383 402,391 419,383 427,383 430,418 428,438 302,438 298,435 289,435 284,438" />
        <pc:Baseline points="227,415 430,418" />
      </pc:TextLine>
      <pc:TextLine id="TR-15_line0003">
        <pc:Coords points="261,438 269,433 274,433 295,438" />
        <pc:Baseline points="254,475 295,475" />
      </pc:TextLine>
      <pc:TextLine id="TR-15_line0002">
        <pc:Coords points="385,438 388,435 388,434 409,434 409,438" />
        <pc:Baseline points="343,478 412,475" />
      </pc:TextLine>
    </pc:TextRegion>

Logfile content for this case:

2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.189 WARNING processor.ExtractLines - Line 'TR-14_line0001' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.201 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-14_TR-14_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-14_TR-14_line0001.bin.png
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.242 WARNING processor.ExtractLines - Line 'TR-15_line0001' contains no text content
2022-08-11_14-21-13-extractlines.log:2022-08-11 14:21:31.255 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0001.bin.png
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.256 WARNING processor.ExtractLines - Line 'TR-15_line0003' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.267 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0003.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0003.bin.png
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.268 WARNING processor.ExtractLines - Line 'TR-15_line0002' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.311 WARNING processor.ExtractLines - Line 'TR-16_line0001' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.348 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-16_TR-16_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-16_TR-16_line0001.bin.png
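
One thing that may be worth checking in a case like this is the size of each line polygon against the `min-line-height 5` and `min-line-width 5` values passed on the command line. The sketch below computes bounding-box sizes from the `points` strings in the PAGE excerpt above. `bbox_size` is a hypothetical helper, and measuring size as max minus min is an assumption about how the processor applies these parameters:

```python
def bbox_size(points):
    """Return (width, height) of the bounding box of a PAGE `points` string."""
    xy = [tuple(int(v) for v in pair.split(',')) for pair in points.split()]
    xs, ys = zip(*xy)
    return max(xs) - min(xs), max(ys) - min(ys)

# Coords copied from the TextLine elements of region TR-15
lines = {
    'TR-15_line0001': '237,438 237,383 239,383 253,391 311,391 320,383 349,383 '
                      '357,390 365,383 384,383 402,391 419,383 427,383 430,418 '
                      '428,438 302,438 298,435 289,435 284,438',
    'TR-15_line0003': '261,438 269,433 274,433 295,438',
    'TR-15_line0002': '385,438 388,435 388,434 409,434 409,438',
}
for line_id, points in lines.items():
    print(line_id, bbox_size(points))
```

Under this measure TR-15_line0002 is only 4 pixels high, while the two lines that were extracted are 55 and 5 pixels high. Whether this is the actual reason for the missing image is not confirmed by the report.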

opened by stefanCCS 5

ocrd-segment-repair: handle case where points is empty

Version 0.1.20, ocrd/core 2.33.0

I have a PAGE file, which does not have any real content - like this:

    <pc:Page imageFilename="OCR-D-IMG/0038_IMAGE000918_00001.tif" imageWidth="1420" imageHeight="2313" orientation="0.">
        <pc:AlternativeImage filename="OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png" comments=",binarized"/>
        <pc:TextRegion id="TR-1" orientation="0.">
            <pc:Coords points=""/>
        </pc:TextRegion>
    </pc:Page>

If I call ocrd-segment-extract-lines, I get an exception like this:

09:19:19.733 DEBUG ocrd.workspace.image_from_page - page 'P_0038_IMAGE000918_00001' has  orientation=0 skew=0.00
09:19:19.733 DEBUG ocrd.workspace.image_from_page - Using AlternativeImage 1 {'', 'binarized'} for page 'P_0038_IMAGE000918_00001'
09:19:19.734 DEBUG ocrd.workspace.download_file - download_file <OcrdFile fileGrp=OCR-D-BIN ID=OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN, mimetype=image/png, url=OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png, local_filename=OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png]/>  [_recursion_count=0]
09:19:19.735 DEBUG PIL.PngImagePlugin - STREAM b'IHDR' 16 13
09:19:19.735 DEBUG PIL.PngImagePlugin - STREAM b'IDAT' 41 65536
Traceback (most recent call last):
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/bin/ocrd-segment-extract-lines", line 8, in <module>
    sys.exit(ocrd_segment_extract_lines())
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_segment/cli.py", line 65, in ocrd_segment_extract_lines
    return ocrd_cli_wrap_processor(ExtractLines, *args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/decorators/__init__.py", line 88, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/processor/helpers.py", line 88, in run_processor
    processor.process()
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_segment/extract_lines.py", line 171, in process
    transparency=self.parameter['transparency'])
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/workspace.py", line 829, in image_from_segment
    fill=fill, transparency=transparency)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/workspace.py", line 1012, in _crop
    segment_polygon = coordinates_of_segment(segment, parent_image, parent_coords)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_utils/image.py", line 136, in coordinates_of_segment
    polygon = np.array(polygon_from_points(segment.get_Coords().points))
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_utils/image.py", line 148, in polygon_from_points
    polygon.append([float(x_y[0]), float(x_y[1])])
ValueError: could not convert string to float:

My expectation would be that this PAGE file is simply ignored. --> please clarify ...
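
The requested behaviour (skip a segment whose `points` attribute is empty or otherwise unusable) could look roughly like this at the parsing step. This is a sketch only: `polygon_from_points_or_none` is a hypothetical stand-in written for illustration, not the `polygon_from_points` function from ocrd_utils that appears in the traceback:

```python
def polygon_from_points_or_none(points):
    """Parse 'x1,y1 x2,y2 ...' into [[x, y], ...]; return None if unusable."""
    pairs = points.split()
    if len(pairs) < 3:
        # empty string or fewer than 3 points: not a polygon
        return None
    polygon = []
    for pair in pairs:
        x, _, y = pair.partition(',')
        try:
            polygon.append([float(x), float(y)])
        except ValueError:
            # malformed coordinate pair
            return None
    return polygon
```

A caller could then log a warning and continue with the next segment whenever `None` is returned, instead of failing on the whole page.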

opened by stefanCCS 6

evaluate: explain/document metrics

If I understand correctly, the idea behind these metrics is taken from the "Rethinking Semantic Segmentation Evaluation" paper, but could you explain to me how I could obtain AP, TPs, FPs and FNs for an instance segmentation task?

Originally posted by @andreaceruti in https://github.com/cocodataset/cocoapi/issues/564#issuecomment-1064223428
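
As a rough illustration of how TP, FP and FN counts can be derived for instance segmentation, the sketch below matches predictions to ground truth one-to-one by intersection over union (IoU) at a fixed threshold. It uses boxes instead of polygons or masks, the names `iou` and `count_matches` are hypothetical, and it is not the implementation of ocrd-segment-evaluate. COCO-style AP additionally ranks predictions by confidence and averages over several IoU thresholds, which is omitted here:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def count_matches(predictions, ground_truth, threshold=0.5):
    """Greedy one-to-one matching; returns (tp, fp, fn)."""
    unmatched = list(ground_truth)
    tp = 0
    for pred in predictions:
        # pick the still unmatched ground-truth box with the highest IoU
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= threshold:
            unmatched.remove(best)
            tp += 1
    fp = len(predictions) - tp   # predictions without a partner
    fn = len(unmatched)          # ground truth without a partner
    return tp, fp, fn
```

Precision and recall then follow as tp / (tp + fp) and tp / (tp + fn).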

opened by bertsky 1
evaluate: false redundant matches if overlaps occur on any side already

The multi-match overlap algorithm (necessary to calculate over- and undersegmentation) still has a glitch: it will create fake/redundant pairings if either side has a segmentation that already overlaps locally. For example, take a page with a GraphicRegion overlapping multiple TextRegions, and evaluate that against itself: the matching will not only produce the 1:1 pairs, but also other matches. That's probably not what we want.
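
The effect can be reproduced in a simplified setting. The sketch below pairs every region with every region it overlaps and evaluates a segmentation against itself; the box representation, the any-overlap criterion and the function names are assumptions for illustration, not the actual matching algorithm:

```python
def overlaps(a, b):
    """True if boxes (x0, y0, x1, y1) share any area."""
    return (min(a[2], b[2]) > max(a[0], b[0]) and
            min(a[3], b[3]) > max(a[1], b[1]))

def all_overlap_pairs(left, right):
    """Index pairs (i, j) of all overlapping regions between two segmentations."""
    return [(i, j)
            for i, a in enumerate(left)
            for j, b in enumerate(right)
            if overlaps(a, b)]

# one graphic region (index 0) overlapping two separate text regions (1 and 2)
regions = [(0, 0, 100, 100), (10, 10, 40, 30), (10, 50, 40, 70)]
pairs = all_overlap_pairs(regions, regions)
extra = [(i, j) for i, j in pairs if i != j]
```

Besides the three 1:1 pairs, `pairs` also contains four off-diagonal pairs between the graphic region and each text region, which is the kind of redundant match described above.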

opened by bertsky 0

Releases(v0.1.21)

v0.1.21(May 27, 2022)
Changed:

extract-regions: add parameter classes and output COCO, too

repair/project: join polygons directly instead of alphashape

v0.1.20(May 27, 2022)
Fixed:

extract-pages: fix extraction of region JSON

repair/project: make alpha shape more robust

v0.1.19(May 27, 2022)
Changed:

repair (sanitize): run on all region types

repair (sanitize): add parameter sanitize_padding

repair (sanitize): use binary foreground instead of text line coordinates

repair (plausibilize): use true alpha shape instead of convex hull

project: add level-of-operation=table

repair: add option simplify

ensure compatibility with Shapely 1.8

v0.1.18(Mar 30, 2022)
extract-lines/words: move extra parameters where they belong

extract-lines: fix regressions in v0.1.15

v0.1.17(Mar 30, 2022)
Changed:

project: use true alpha shape instead of convex hull

v0.1.16(Feb 21, 2022)
Fixed:

repair: fix plausibilize scope of apply-list

Changed:

project: new processor for convex hull resegmentation

v0.1.15(Feb 17, 2022)
Changed:

repair: plausibilize: both analyse & apply iff enabled

extract-lines: add parameters for output types and conditions for line extraction

extract-lines: add xlsx output option for GT editing

v0.1.14(Feb 17, 2022)
Changed:

repair: for non-trivial region overlaps, recurse to line level

repair: for non-trivial line overlaps, merge (if centric) or subtract

v0.1.13(Dec 10, 2021)
Fixed:

evaluate: multi-matching (without pycocotools)

Changed:

evaluate: improved report format (hierarchy and names)

Added:

evaluate: over-/undersegmentation metrics, pixel-wise metrics

v0.1.12(Dec 2, 2021)
Changed:

evaluate: basic IoU matching, Pr/Rc and mAP/mAR stats via pycocotools

v0.1.11(Mar 23, 2021)
Fixed:

extract-pages: Border has no id

v0.1.10(Feb 26, 2021)
Fixed:

extract-regions: apply feature_filter param

Changed:

extract-pages: add feature_filter param

extract-pages: add order choice for plot_segmasks

v0.1.9(Feb 26, 2021)
Changed:

extract-regions/lines/words/glyphs: add feature_filter param

v0.1.8(Feb 8, 2021)
Fixed:

replace-page: getLogger context

Changed:

extract-words: new

extract-glyphs: new

extract-pages: expose colordict parameter (w/ same default)

extract-pages: multi-level mask output via plot_segmasks

v0.1.7(Jan 7, 2021)
Fixed:

repair: also ensure polygons have at least 3 points

replace-page: allow non-PAGE input files, too
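
The coordinate fixes in repair (at least 3 points here, negative coordinates in v0.1.6) can be pictured with a small sketch. This is an assumption-laden illustration in plain Python, not the processor's implementation: the helper name `sanitize_polygon`, its signature, and the choice to clamp points to the page frame and to return `None` for degenerate outlines are all hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def sanitize_polygon(points: List[Point], width: int,
                     height: int) -> Optional[List[Point]]:
    """Clamp points into the page frame and reject degenerate outlines.

    Returns the cleaned point list, or None if fewer than 3 distinct
    consecutive points remain.
    """
    # Move negative or out-of-page coordinates onto the page border.
    clamped = [(min(max(x, 0), width), min(max(y, 0), height))
               for x, y in points]
    # Clamping can collapse neighbours onto each other; drop such repeats.
    cleaned: List[Point] = []
    for point in clamped:
        if not cleaned or cleaned[-1] != point:
            cleaned.append(point)
    # The outline is implicitly closed, so first and last must differ too.
    if len(cleaned) > 1 and cleaned[0] == cleaned[-1]:
        cleaned.pop()
    return cleaned if len(cleaned) >= 3 else None


# Hypothetical region reaching beyond the top-left page corner.
fixed = sanitize_polygon([(-5, -3), (50, -3), (50, 40), (-5, 40)], 100, 100)
# Hypothetical sliver that collapses to a single point after clamping.
rejected = sanitize_polygon([(-5, 10), (-2, 10), (-1, 10)], 100, 100)
print(fixed, rejected)
```

A count of 3 points does not by itself guarantee a valid polygon (the points may be collinear or the outline self-intersecting), so a real repair step would still need a geometric validity check on top.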

v0.1.6(Nov 25, 2020)
Fixed:

repair: also fix negative coords, also on page level

replace-original: also remove page border/@orientation

replace-original: add new original as derived image, too

v0.1.5(Nov 4, 2020)
Fixed:

evaluate: adapt to zip_input_files in core

Changed:

replace-original: delegate to repair.ensure_consistent

replace-page: new CLI (inverse of replace-original)

v0.1.4(Nov 4, 2020)
Changed:

repair: fix coordinate consistency/validity errors

v0.1.3(Sep 24, 2020)
Changed:

logging according to OCR-D/core#599

v0.1.2(Sep 24, 2020)
Fixed:

repair: traverse all text regions recursively (typo)

v0.1.1(Sep 24, 2020)
Changed:

repair: traverse all text regions recursively

Fixed:

repair: be robust against invalid input polygons

repair: be careful to make valid output polygons

v0.1.0(Aug 21, 2020)
Changed:

adapt to 1-output-file-group convention, use make_file_id and assert_file_grp_cardinality, #41

Fixed:

typo in extract_lines, #40

v0.0.2(Dec 2, 2019)


Owner

OCR-D

DFG coordination project for the further development of Optical Character Recognition methods

