diff --git a/.gitignore b/.gitignore
index 394b9648..8fc7b143 100644
--- a/.gitignore
+++ b/.gitignore
@@ -167,4 +167,8 @@ cython_debug/
.idea/
*.sh
-_testbed/cleaner/
+
+.DS_Store
+_testbed/media/**
+_testbed/cleaner/**
+_testbed/model/**
diff --git a/_testbed/README.md b/_testbed/README.md
new file mode 100644
index 00000000..702df030
--- /dev/null
+++ b/_testbed/README.md
@@ -0,0 +1,55 @@
+# PanelCleaner Testbed
+
+## Overview
+The **PanelCleaner** testbed is a dedicated area for experimenting with and testing new ideas for *PanelCleaner* using Jupyter Notebooks. It currently focuses on **OCR**, primarily with the **Tesseract** and **IDefics** models. The testbed also starts the development of an evaluation framework to support future experiments. The project uses the `nbdev` literate programming environment.
+
+## Installation
+To get started with the notebooks, you'll need Jupyter Lab/Notebook or any environment that supports Jupyter notebooks, such as *VSCode* or *Google Colab*.
+The setup shares most of its requirements with PanelCleaner and its CLI, plus a few additional dependencies.
+Here’s how to set up your environment:
+1. Activate a virtual environment.
+2. Navigate to the `_testbed` directory:
+ ```bash
+ cd _testbed
+ ```
+3. Install the required dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+Note: Each notebook may require the installation of additional dependencies.
+
+## Google Colab Support
+The notebooks can be run directly on Google Colab, without any extra setup or a local GPU.
+Instructions for using Google Colab will be included in the notebooks (TBD).
+
+## Install Test Images
+The test images are not included in the repository but can be downloaded from the following link:
+- [Test images](https://drive.google.com/drive/folders/101_1_20240229)
+
+After downloading, place the test images in the [media](media) directory. If you want to use your own images, each one needs a corresponding text file with the same name and the `.txt` extension, containing the ground truth data, one line per box (as calculated by PanelCleaner). Optionally, you can also include a `.json` file with the same name that specifies the language of the page:
+```json
+{
+ "lang": "Spanish"
+}
+```
+If no language file is found, English will be used by default. In the near future, language detection will be automated.
+
+## Introduction to nbdev
+[nbdev](https://nbdev.fast.ai/) is a **literate programming** environment that allows you to develop a Python library in Jupyter Notebooks, integrating exploratory programming, code, tests, and documentation into a single cohesive workflow. Inspired by **Donald Knuth**'s concept of literate programming, this approach not only makes the development process more intuitive but also eases the maintenance and understanding of the codebase.
+
+## Notebooks (WIP)
+
+#### [helpers.ipynb](helpers.ipynb)
+This notebook includes utility functions and helpers that support the experiments in other notebooks, streamlining repetitive tasks and data manipulation.
+
+#### [ocr_metric.ipynb](ocr_metric.ipynb)
+This notebook defines and implements metrics for evaluating the accuracy of OCR engines. It currently develops a basic metric for evaluating OCR models. In the near future, additional metrics will be added, such as precision and recall using Levenshtein distance (edit distance). More importantly, it will introduce a metric tailored to the unique characteristics of Comics/Manga OCR, a topic that appears largely unexplored in the technical literature.
+
+#### [experiments.ipynb](experiments.ipynb)
+This notebook details the development of the evaluation framework used in the other notebooks, with Tesseract as a case study to illustrate the evaluation process. It's a work in progress and will be updated continuously. If you're only interested in visualizing the results of the experiments, go directly to `test_tesseract.ipynb` or `test_idefics.ipynb`, which are much shorter and more to the point.
+
+#### [test_tesseract.ipynb](test_tesseract.ipynb)
+This notebook is dedicated to testing the Tesseract OCR engine, offering insights into its capabilities and limitations through hands-on experiments.
+
+#### [test_idefics.ipynb](test_idefics.ipynb)
+Similar to `test_tesseract.ipynb`, this notebook focuses on the IDefics LVM model, evaluating its performance and accuracy under different conditions. Here you can see how the Tesseract OCR engine and the IDefics LVM model compare in terms of accuracy and performance.
diff --git a/_testbed/experiments.ipynb b/_testbed/experiments.ipynb
new file mode 100644
index 00000000..6c8d3c81
--- /dev/null
+++ b/_testbed/experiments.ipynb
@@ -0,0 +1,6810 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| default_exp experiments"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| hide\n",
+ "# %reload_ext autoreload\n",
+ "# %autoreload 0\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# install (Colab)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# try: \n",
+ "# import fastcore as FC\n",
+ "# except ImportError: \n",
+ "# !pip install -q fastcore\n",
+ "# try:\n",
+ "# import rich\n",
+ "# except ImportError:\n",
+ "# !pip install -q rich\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Note: we're using the `testbed` branch of PanelCleaner.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# !pip install -q git+https://github.com/civvic/PanelCleaner.git@testbed"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Testing `Tesseract` OCR for Comics\n",
+ "> Accuracy Enhancements for OCR in `PanelCleaner`\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Prologue"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "from __future__ import annotations\n",
+ "\n",
+ "import dataclasses\n",
+ "import difflib\n",
+ "import functools\n",
+ "import json\n",
+ "import shutil\n",
+ "from collections import defaultdict\n",
+ "from enum import Enum\n",
+ "from pathlib import Path\n",
+ "from typing import Any\n",
+ "from typing import Callable\n",
+ "from typing import cast\n",
+ "from typing import Mapping\n",
+ "from typing import Self\n",
+ "from typing import TypeAlias\n",
+ "\n",
+ "import fastcore.all as FC\n",
+ "import ipywidgets as W\n",
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "import pcleaner.config as cfg\n",
+ "import pcleaner.ctd_interface as ctm\n",
+ "import pcleaner.image_ops as ops\n",
+ "import pcleaner.ocr.ocr as ocr\n",
+ "import pcleaner.structures as st\n",
+ "import torch\n",
+ "from IPython.display import clear_output\n",
+ "from IPython.display import display\n",
+ "from IPython.display import HTML\n",
+ "from ipywidgets.widgets.interaction import show_inline_matplotlib_plots\n",
+ "from loguru import logger\n",
+ "from pcleaner.ocr.ocr_tesseract import TesseractOcr\n",
+ "from PIL import Image\n",
+ "from PIL import ImageFilter\n",
+ "from rich.console import Console\n",
+ "from tqdm.notebook import tqdm\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "\n",
+ "from helpers import *\n",
+ "from ocr_metric import *\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import copy\n",
+ "\n",
+ "import fastcore.xtras # patch Path with some utils\n",
+ "import pcleaner.cli_utils as cli\n",
+ "import pcleaner.preprocessor as pp\n",
+ "import rich\n",
+ "from fastcore.test import * # type: ignore\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Helpers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# pretty print by default\n",
+ "# %load_ext rich"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| exporti\n",
+ "console = Console(width=104, tab_size=4, force_jupyter=True)\n",
+ "cprint = console.print\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tesseract installation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['tesseract 5.3.4',\n",
+ " ' leptonica-1.84.1',\n",
+ " ' libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 3.0.0) : libpng 1.6.43 : libtiff 4.6.0 : zlib 1.2.11 : libwebp 1.4.0 : libopenjp2 2.5.2',\n",
+ " ' Found NEON',\n",
+ " ' Found libarchive 3.7.2 zlib/1.2.11 liblzma/5.4.6 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.6',\n",
+ " ' Found libcurl/8.4.0 SecureTransport (LibreSSL/3.3.6) zlib/1.2.11 nghttp2/1.51.0']"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "out = !tesseract --version\n",
+ "out\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Install the jpn_vert Tesseract language\n",
+ "> It has much better results than the default `jpn` language model.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "\n",
+ "Download the model from [tessdata_best](https://github.com/tesseract-ocr/tessdata_best), or use the one posted [here](https://groups.google.com/g/tesseract-ocr/c/FwjSZzoVgeg/m/u-zyFYQiBgAJ), which is trained for vertical Japanese text as found in manga.\n",
+ "\n",
+ "Note: I haven't played much with this one. `manga-ocr` is surely a much better fit, but comparing the two can be educational.\n",
+ "\n",
+ "I have copied the models to my GDrive and installed them as follows (on Ubuntu; the steps are similar on Mac):\n",
+ "```bash\n",
+ "cd model\n",
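+ "# Use an absolute source path: a relative symlink target is resolved from the link's own directory\n",
+ "ln -s \"$PWD/jpn_vert_tessdata_best.traineddata\" /usr/share/tesseract-ocr/5/tessdata/jpn_vert.traineddata\n",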
+ "```"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(Path('/opt/homebrew/share/tessdata'),\n",
+ " ['afr, amh, ara, asm, aze, aze_cyrl, bel, ben, bod, bos, bre, bul, cat, ceb, ces',\n",
+ " 'chi_sim, chi_sim_vert, chi_tra, chi_tra_vert, chr, cos, cym, dan, deu, div, dzo, ell, eng, enm, epo',\n",
+ " 'equ, est, eus, fao, fas, fil, fin, fra, frk, frm, fry, gla, gle, glg, grc',\n",
+ " 'guj, hat, heb, hin, hrv, hun, hye, iku, ind, isl, ita, ita_old, jav, jpn, jpn_vert',\n",
+ " 'kan, kat, kat_old, kaz, khm, kir, kmr, kor, kor_vert, lao, lat, lav, lit, ltz, mal',\n",
+ " 'mar, mkd, mlt, mon, mri, msa, mya, nep, nld, nor, oci, ori, osd, pan, pol',\n",
+ " 'por, pus, que, ron, rus, san, script/Arabic, script/Armenian, script/Bengali, script/Canadian_Aboriginal, script/Cherokee, script/Cyrillic, script/Devanagari, script/Ethiopic, script/Fraktur',\n",
+ " 'script/Georgian, script/Greek, script/Gujarati, script/Gurmukhi, script/HanS, script/HanS_vert, script/HanT, script/HanT_vert, script/Hangul, script/Hangul_vert, script/Hebrew, script/Japanese, script/Japanese_vert, script/Kannada, script/Khmer',\n",
+ " 'script/Lao, script/Latin, script/Malayalam, script/Myanmar, script/Oriya, script/Sinhala, script/Syriac, script/Tamil, script/Telugu, script/Thaana, script/Thai, script/Tibetan, script/Vietnamese, sin, slk',\n",
+ " 'slv, snd, snum, spa, spa_old, sqi, srp, srp_latn, sun, swa, swe, syr, tam, tat, tel',\n",
+ " 'tgk, tha, tir, ton, tur, uig, ukr, urd, uzb, uzb_cyrl, vie, yid, yor'])"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "out = !tesseract --list-langs\n",
+ "tessdata = Path(out[0].split('\"')[1])\n",
+ "tessdata, [', '.join(sub) for sub in [out[i:i + 15] for i in range(1, len(out), 15)]]\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<pre>Width x Height: 804 x 1241 pixels\n",
+ " PIL Info DPI: None\n",
+ " Print Size 300 DPI: 2.680 x 4.137 in / 6.807 x 10.507 cm\n",
+ "Required DPI Modern Age format: 121.216 dpi (6.625 x 10.250 in)\n",
+ "</pre>\n"
+ ],
+ "text/plain": [
+ " Width x Height: \u001b[1;36m804\u001b[0m x \u001b[1;36m1241\u001b[0m pixels\n",
+ " PIL Info DPI: \u001b[3;35mNone\u001b[0m\n",
+ " Print Size \u001b[1;36m300\u001b[0m DPI: \u001b[1;36m2.680\u001b[0m x \u001b[1;36m4.137\u001b[0m in \u001b[35m/\u001b[0m \u001b[1;36m6.807\u001b[0m x \u001b[1;36m10.507\u001b[0m cm\n",
+ "Required DPI Modern Age format: \u001b[1;36m121.216\u001b[0m dpi \u001b[1m(\u001b[0m\u001b[1;36m6.625\u001b[0m x \u001b[1;36m10.250\u001b[0m in\u001b[1m)\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "display(IMAGE_CONTEXT.image_info)\n",
+ "img_visor.image_info()\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Balloons and Captions Ground Truth\n",
+ "> The ground truth for the balloons and captions is read from a `.txt` file.\n",
+ "\n",
+ "The file is named `.gt.txt` and contains one entry per line, corresponding to each balloon or caption in the order found in the PanelCleaner page data.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 63,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.',\n",
+ " 'The house and the old man are alike in many ways; tall, proud, patient, contented always to wait until their master comes home--',\n",
+ " 'And one in need of some help, it would appear.',\n",
+ " 'Bambu-- we have a guest.',\n",
+ " '--and tonight, he comes most urgently, slamming open the oaken front doors!',\n",
+ " 'Tell me, master-- how may Bambu serve?',\n",
+ " 'Some blankets to keep her warm, Bambu-- and perhaps some dry clothes',\n",
+ " \"The echo of the old man's footsteps fades down the hall as...\",\n",
+ " 'How curious the whims of fate. Had I not chanced to stroll along the river tonight--',\n",
+ " 'As quickly as I can, master',\n",
+ " '--the girl would most surely be dead by now.',\n",
+ " 'Ghede has been generous. the Death God has given the girl a second chance at--',\n",
+ " \"Easy, girl-- there's nothing to scream about anymore.\",\n",
+ " \"You're among friends now, you're safe!\",\n",
+ " 'Continued after next page']"
+ ]
+ },
+ "execution_count": 63,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "IMAGE_CONTEXT.gts\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Experiment\n",
+ "> Use `ExperimentOCR` to perform OCR on the page balloons given a `CropMethod` and a model (i.e., `'Tesseract'` or `'Idefics'`)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 64,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| exporti\n",
+ "\n",
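+ "def trimmed_mean(data, trim_percent):\n",
+ "    \"Mean of `data` after dropping a `trim_percent` fraction of the values from each end\"\n",
+ "    sorted_data = np.sort(data)\n",
+ "    n = len(sorted_data)\n",
+ "    trim_count = int(trim_percent * n)\n",
+ "    # Use an explicit end index: `[trim_count:-trim_count]` is empty when trim_count == 0\n",
+ "    trimmed_data = sorted_data[trim_count:n - trim_count]\n",
+ "    return np.mean(trimmed_data)\n",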
+ "\n",
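+ "def mad_based_outlier(points, threshold=3.5):\n",
+ "    \"Keep the points whose modified z-score (based on the median absolute deviation) is below `threshold`\"\n",
+ "    points = np.asarray(points)\n",
+ "    median = np.median(points)\n",
+ "    diff = np.abs(points - median)\n",
+ "    mad = np.median(diff)\n",
+ "    if mad == 0:  # at least half of the points equal the median; avoid dividing by zero\n",
+ "        return points\n",
+ "    modified_z_score = 0.6745 * diff / mad\n",
+ "    return points[modified_z_score < threshold]\n",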
+ "\n",
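+ "def iqr_outlier_removal(data):\n",
+ "    \"Keep the values within 1.5 * IQR of the first and third quartiles\"\n",
+ "    data = np.asarray(data)  # boolean-mask indexing below needs an array, not a list\n",
+ "    q1 = np.percentile(data, 25)\n",
+ "    q3 = np.percentile(data, 75)\n",
+ "    iqr = q3 - q1\n",
+ "    lower_bound = q1 - 1.5 * iqr\n",
+ "    upper_bound = q3 + 1.5 * iqr\n",
+ "    return data[(data >= lower_bound) & (data <= upper_bound)]\n",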
+ " "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 65,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "\n",
+ "@dataclasses.dataclass\n",
+ "class Experiment:\n",
+ " ctx: ExperimentContext\n",
+ "\n",
+ "\n",
+ "@dataclasses.dataclass\n",
+ "class ExperimentOCR(Experiment):\n",
+ " ctx: ImageContext\n",
+ " ocr_model: str\n",
+ "\n",
+ " @property\n",
+ " def img_ctx(self): return self.ctx\n",
+ " @property\n",
+ " def ctxs(self):\n",
+ " img_ctx = self.img_ctx\n",
+ " return cast(OCRExperimentContext, img_ctx.exp), img_ctx\n",
+ "\n",
+ " @classmethod\n",
+ " def file_path_of(cls, page_data: st.PageData, ocr_model: str):\n",
+ " return f\"{Path(page_data.original_path).stem}_{ocr_model}.json\"\n",
+ " \n",
+ " def file_path(self):\n",
+ " img_ctx = self.img_ctx\n",
+ " return type(self).file_path_of(img_ctx.page_data, self.ocr_model)\n",
+ " \n",
+ " def to_dict(self):\n",
+ " \"JSON serializable dict of the experiment\"\n",
+ " img_ctx = self.img_ctx\n",
+ " img_idx = img_ctx.image_idx\n",
+ " results = results_to_dict(self.results())\n",
+ " return {\n",
+ " 'image_name': img_ctx.image_name,\n",
+ " 'ocr_model': self.ocr_model, \n",
+ " 'results': results,\n",
+ " }\n",
+ "\n",
+ " def to_json(self, out_dir: Path | None = None):\n",
+ " img_ctx = self.img_ctx\n",
+ " fp = (out_dir or img_ctx.cache_dir_image) / self.file_path()\n",
+ " data = self.to_dict()\n",
+ " with open(fp, 'w') as f:\n",
+ " json.dump(data, f, indent=2)\n",
+ " return fp, data\n",
+ "\n",
+ " @classmethod\n",
+ " def from_json(cls, experiment: OCRExperimentContext, json_path: Path) -> Self:\n",
+ " try:\n",
+ " with open(json_path, 'r') as f:\n",
+ " data = json.load(f)\n",
+ " except Exception as e:\n",
+ " logger.error(f\"Error loading {json_path}: {e}\")\n",
+ " raise e\n",
+ " ocr_model = data['ocr_model']\n",
+ " img_ctx = ImageContext(experiment, data['image_name'])\n",
+ " results: ResultSetDefault = dict_to_results(\n",
+ " img_ctx.image_idx, \n",
+ " data['results'], \n",
+ " result_factory=experiment._result_from)\n",
+ " experiment.update_results(ocr_model, img_ctx.image_idx, results)\n",
+ " return cls(img_ctx, ocr_model)\n",
+ "\n",
+ " @classmethod\n",
+ " def from_image(cls, \n",
+ " ctx: OCRExperimentContext, \n",
+ " ocr_model: str, \n",
+ " image_idx: ImgSpecT):\n",
+ " idx = cast(ImgIdT, ctx.normalize_idx(image_idx))\n",
+ " img_ctx = ImageContext(ctx, idx)\n",
+ " if img_ctx is None:\n",
+ " raise ValueError(f\"Image {image_idx} not found in experiment context\")\n",
+ " fp = img_ctx.cache_dir / cls.file_path_of(img_ctx.page_data, ocr_model)\n",
+ " if fp.exists(): \n",
+ " return cast(Self, cls.from_json(cast(OCRExperimentContext, img_ctx.exp), fp))\n",
+ " return cls(img_ctx, ocr_model)\n",
+ "\n",
+ " @classmethod\n",
+ " def from_method(cls, \n",
+ " ctx: OCRExperimentContext, \n",
+ " ocr_model: str, \n",
+ " image_idx: ImgIdT | str | Path, \n",
+ " method: CropMethod):\n",
+ " experiment = cls.from_image(ctx, ocr_model, image_idx)\n",
+ " if experiment is None:\n",
+ " return None\n",
+ " return experiment.method_experiment(method)\n",
+ "\n",
+ " @classmethod\n",
+ " def saved_experiment(cls, \n",
+ " ctx: OCRExperimentContext, ocr_model: str, image_idx: ImgIdT | str | Path):\n",
+ " idx = ctx.normalize_idx(image_idx)\n",
+ " if idx is None: \n",
+ " logger.warning(f\"Image {image_idx} not found in experiment context\")\n",
+ " return None\n",
+ " return cls.from_image(ctx, ocr_model, idx)\n",
+ "\n",
+ " @classmethod\n",
+ " def saved_experiments(cls, ctx: OCRExperimentContext, ocr_model: str) -> list[Self]:\n",
+ " return [exp for i in range(len(ctx.image_paths))\n",
+ " if (exp := cls.from_image(ctx, ocr_model, i)) is not None]\n",
+ " \n",
+ "\n",
+ " def result(self, box_idx: BoxIdT, method: CropMethod, ocr: bool=True, rebuild: bool=False):\n",
+ " ctx, img_ctx = self.ctxs\n",
+ " return ctx.result(self.ocr_model, img_ctx.image_idx, box_idx, method, ocr, rebuild)\n",
+ "\n",
+ " def results(self):\n",
+ " ctx, img_ctx = self.ctxs\n",
+ " return cast(ResultSet, ctx.results(self.ocr_model, img_ctx.image_idx))\n",
+ "\n",
+ " def has_run(self):\n",
+ " \"at least one method has run\"\n",
+ " img_ctx = self.img_ctx\n",
+ " return len(self.results()) == len(img_ctx.page_data.boxes)\n",
+ " \n",
+ " def best_results(self):\n",
+ " img_ctx = self.img_ctx\n",
+ " results = self.results()\n",
+ " if len(results) < len(img_ctx.page_data.boxes): # at least one method has run\n",
+ " return None\n",
+ " best = []\n",
+ " for box_idx in results:\n",
+ " methods = results[box_idx]\n",
+ " best_method = max(methods, key=lambda m: methods[m].acc) # type: ignore\n",
+ " best.append((best_method, methods[best_method]))\n",
+ " return best\n",
+ "\n",
+ " def save_results_as_ground_truth(self, overwrite=False):\n",
+ " img_ctx = self.img_ctx\n",
+ " gts_path = ground_truth_path(img_ctx.page_data)\n",
+ " if overwrite or not gts_path.exists():\n",
+ " best_results = self.best_results()\n",
+ " if best_results:\n",
+ " tt = [r.ocr for m,r in best_results]\n",
+ " gts_path.write_text('\\n'.join(tt), encoding=\"utf-8\")\n",
+ " img_ctx.setup_ground_truth()\n",
+ " logger.info(f\"Ground truth data saved successfully to {gts_path}\")\n",
+ " return True\n",
+ " else:\n",
+ " logger.info(\"No best results available to save.\")\n",
+ " return False\n",
+ " else:\n",
+ " return False\n",
+ "\n",
+ " @property\n",
+ " def experiments(self):\n",
+ " if not hasattr(self, '_experiments'):\n",
+ " self._experiments = {}\n",
+ " return self._experiments\n",
+ " def method_experiment(self, method: CropMethod) -> ExperimentOCRMethod:\n",
+ " if method not in self.experiments:\n",
+ " self.experiments[method] = ExperimentOCRMethod(self, method)\n",
+ " return self.experiments[method]\n",
+ " \n",
+ "\n",
+ " def to_dataframe(self):\n",
+ " \"Dataframe with crop methods as columns and box ids as rows\"\n",
+ " methods = list(CropMethod.__members__.values())\n",
+ " experiments = [self.method_experiment(m) for m in methods]\n",
+ " accuracies = [[result.acc for result in exp.results()] for exp in experiments]\n",
+ " # transpose accuracies\n",
+ " accuracies = list(zip(*accuracies))\n",
+ " return pd.DataFrame(accuracies, columns=CropMethod.__display_names__())\n",
+ "\n",
+ " def plot_accuracies(self, \n",
+ " methods: list[CropMethod] | None = None, \n",
+ " ):\n",
+ " \"Plots a horizontal bar chart of the accuracies for a list of method experiments.\"\n",
+ " methods = methods or list(CropMethod.__members__.values())\n",
+ " experiments = [self.method_experiment(m) for m in methods]\n",
+ " if not experiments: return\n",
+ "\n",
+ " ctx, img_ctx = self.ctxs\n",
+ " page_data = img_ctx.page_data\n",
+ " model = self.ocr_model\n",
+ " accuracies = [[result.acc for result in exp.results()] for exp in experiments]\n",
+ " accuracies = [np.mean(a) for a in accuracies]\n",
+ " # accuracies = [np.mean([result.acc for result in exp.results()]) for exp in experiments]\n",
+ "\n",
+ " _, ax = plt.subplots(figsize=(10, 5))\n",
+ " \n",
+ " # Normalize the accuracies for color mapping\n",
+ " norm = plt.Normalize(min(accuracies), max(accuracies))\n",
+ " # Color map from red to green\n",
+ " cmap = plt.get_cmap('RdYlGn')\n",
+ " colors = cmap(norm(accuracies))\n",
+ "\n",
+ " ax.barh([m.value for m in methods], accuracies, color=colors)\n",
+ "\n",
+ " ax.set_xscale('log') # Set the x-axis to a logarithmic scale\n",
+ " ax.set_xlabel('Average Accuracy (log scale)', fontsize=12, fontweight='bold')\n",
+ "\n",
+ " ax.set_ylabel('Method', fontsize=12, fontweight='bold')\n",
+ " ax.set_yticks(range(len(methods)))\n",
+ " ax.set_yticklabels([f'{method.value} ({acc:.2f})' \n",
+ " for method, acc in zip(methods, accuracies)], fontsize=12)\n",
+ " max_acc_index = np.argmax(accuracies)\n",
+ " ax.get_yticklabels()[max_acc_index].set(color='blue', fontweight='bold')\n",
+ "\n",
+ " title_text = (f\"{page_data.original_path} - OCR model: {model}\")\n",
+ " ax.set_title(title_text, fontsize=12, fontweight='bold')\n",
+ "\n",
+ " plt.tight_layout()\n",
+ " plt.show()\n",
+ "\n",
+ "\n",
+ " def summary_box(self, box_idx: int):\n",
+ " results: list[tuple[CropMethod, ResultOCR]] = []\n",
+ " pb = tqdm(CropMethod.__members__.values(), leave=False, desc=f\"Box #{box_idx+1}\")\n",
+ " for m in pb:\n",
+ " r = cast(ResultOCR, self.result(box_idx, m))\n",
+ " results.append((m, r))\n",
+ " methods, images, ocrs, accs = zip(\n",
+ " *map(\n",
+ " lambda t: (t[0].value, t[1].cache_image(), t[1].diff_tagged(), acc_as_html(t[1].acc)), \n",
+ " results))\n",
+ " display_columns([methods, images, accs, ocrs], \n",
+ " headers=[\"Method\", f\"Box #{box_idx+1}\", \"Accuracy\", \"OCR\"])\n",
+ "\n",
+ "\n",
+ " def summary_method(self, method: CropMethod):\n",
+ " results = self.method_experiment(method).results()\n",
+ " methods, images, ocrs, accs = zip(\n",
+ " *map(\n",
+ " lambda r: (r.block_idx+1, r.cache_image(), r.diff_tagged(), acc_as_html(r.acc)), \n",
+ " results))\n",
+ " display_columns([methods, images, accs, ocrs], \n",
+ " headers=[\"Box #\", \"Box\", \"Accuracy\", f\"{method.value} OCR\"])\n",
+ "\n",
+ "\n",
+ " def display(self):\n",
+ " out = []\n",
+ " for method in CropMethod:\n",
+ " out.append(f\"---------- {method.value} ----------\")\n",
+ " results = self.method_experiment(method).results()\n",
+ " out.extend(results)\n",
+ " out.append('\\n')\n",
+ " cprint(*out, soft_wrap=True)\n",
+ "\n",
+ "\n",
+ " def reset(self, box_idx: int | None = None, method: CropMethod | None = None):\n",
+ " ctx, img_ctx = self.ctxs\n",
+ " ctx.reset_results(None, img_ctx.image_idx, box_idx, method)\n",
+ "\n",
+ " def perform_methods(self, \n",
+ " methods: CropMethod | list[CropMethod] | None = None, \n",
+ " box_idxs: BoxIdT | list[BoxIdT] | None = None,\n",
+ " rebuild: bool = False,\n",
+ " plot_acc: bool = False\n",
+ " ):\n",
+ " if methods is None:\n",
+ " methods = [*CropMethod.__members__.values()]\n",
+ " elif isinstance(methods, CropMethod):\n",
+ " methods = [methods]\n",
+ " if rebuild:\n",
+ " _methods = tqdm(methods, desc=\"Methods\")\n",
+ " else:\n",
+ " _methods = methods\n",
+ " for method in _methods:\n",
+ " method_exp = self.method_experiment(method)\n",
+ " if method_exp: \n",
+ " if rebuild:\n",
+ " method_exp(box_idxs, rebuild=rebuild)\n",
+ " if plot_acc:\n",
+ " self.plot_accuracies()\n",
+ "\n",
+ " def __call__(self, \n",
+ " box_idxs: BoxIdT | list[BoxIdT] | None = None,\n",
+ " methods: CropMethod | list[CropMethod] | None = None, \n",
+ " save: bool = True,\n",
+ " display=False, \n",
+ " rebuild: bool=False, \n",
+ " save_as_ground_truth=False):\n",
+ " self.perform_methods(methods, box_idxs, rebuild=rebuild)\n",
+ " if save_as_ground_truth:\n",
+ " self.save_results_as_ground_truth(overwrite=True)\n",
+ " if save:\n",
+ " self.to_json()\n",
+ " if display:\n",
+ " self.display()\n",
+ " \n",
+ "\n",
+ "@dataclasses.dataclass\n",
+ "class ExperimentOCRMethod:\n",
+ " ctx: ExperimentOCR\n",
+ " method: CropMethod\n",
+ "\n",
+ " @property\n",
+ " def exp_ctx(self): return self.ctx\n",
+ " @property\n",
+ " def img_ctx(self): return self.ctx.ctx\n",
+ " @property\n",
+ " def ctxs(self):\n",
+ " img_ctx = self.img_ctx\n",
+ " return cast(OCRExperimentContext, img_ctx.exp), img_ctx, self.ctx\n",
+ " \n",
+ " def result(self, box_idx: BoxIdT, ocr: bool=True, rebuild: bool=False) -> ResultOCR | None:\n",
+ " ctx, img_ctx, exp_ctx = self.ctxs\n",
+ " return ctx.result(exp_ctx.ocr_model, img_ctx.image_idx, box_idx, self.method, ocr, rebuild)\n",
+ "\n",
+ " def results(self, \n",
+ " box_idxs: BoxIdT | list[BoxIdT] | None = None, \n",
+ " ocr: bool=True, rebuild: bool=False) -> list[ResultOCR]:\n",
+ " ctx, img_ctx, exp_ctx = self.ctxs\n",
+ " if box_idxs is None:\n",
+ " box_idxs = list(range(len(img_ctx.boxes)))\n",
+ " elif isinstance(box_idxs, int):\n",
+ " box_idxs = [box_idxs]\n",
+ " model = exp_ctx.ocr_model\n",
+ " results = ctx.method_results(model, img_ctx.image_idx, self.method)\n",
+ " results = {i:results[i] if i in results else None for i in box_idxs}\n",
+ " pb = rebuild or not results or any(r is None for r in results.values())\n",
+ " if pb and len(results) > 2:\n",
+ " progress_bar = tqdm(list(results.keys()), desc=f\"{self.method.value} - {model}\")\n",
+ " else:\n",
+ " progress_bar = list(results.keys())\n",
+ " results = []\n",
+ " for i in progress_bar:\n",
+ " results.append(self.result(i, ocr, rebuild=rebuild))\n",
+ " return results\n",
+ "\n",
+ "\n",
+ " def get_results_html(self, \n",
+ " box_idxs: BoxIdT | list[BoxIdT] | None = None,\n",
+ " max_image_width: int | None = None): \n",
+ " _, img_ctx, exp_ctx = self.ctxs\n",
+ " results: list[ResultOCR] = self.results(box_idxs)\n",
+ " accs = np.array([r.acc for r in results])\n",
+ " mean_accuracy = np.mean(accs)\n",
+ " mean_trimmed = trimmed_mean(accs, 0.1)\n",
+ " # filtered_data = mad_based_outlier(accs)\n",
+ " # mean_mad = np.mean(filtered_data)\n",
+ " # filtered_data = iqr_outlier_removal(accs)\n",
+ " # mean_iqr = np.mean(filtered_data)\n",
+ " \n",
+ " descriptions, images, ocrs, accs = zip(*map(\n",
+ " lambda r: (\n",
+ " r.block_idx+1, \n",
+ " r.cache_image(), \n",
+ " r.diff_tagged(), \n",
+ " acc_as_html(r.acc)\n",
+ " ), results))\n",
+ " non_breakin_space = u'\\u00A0'\n",
+ " tmpl = \"{}\"\n",
+ " padded_s = lambda s,n: tmpl.format(s.rjust(n))\n",
+ " acc_fmt = f\"{mean_accuracy:.2f}/{mean_trimmed:.2f}\"\n",
+ " w, h = img_ctx.base_image.size\n",
+ " dim, _dpi = size(w, h), dpi(w, h)\n",
+ " dim_fmt = f\"{w}x{h} px: {dim[0]:.2f} x {dim[1]:.2f} in @ {_dpi:.2f} dpi\"\n",
+ " return '\\n \\n'.join([\n",
+ " (\"
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3 0.90
\n",
+ " \n",
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.⎕⎕
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3
Eneowered by great gnarled cypress jrfes, the ancient manor ! alone on the eit of mew rce: eans, kept tipy by a white-haired ao han known only as 0.85
\n",
+ " \n",
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orl⎕⎕⎕eans, kept tidy by a white-haired ⎕old man known only as Bambu.
Eneowered by great gnarled cypress jrfes, the ancient manor !⎕⎕⎕⎕⎕ alone on the eit⎕⎕⎕⎕⎕⎕ of mew ⎕rce: eans, kept tipy by a white-haired ao⎕⎕han known only as⎕⎕⎕⎕⎕⎕⎕
Enbowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a white-haired old man known only as 0.95
\n",
+ " \n",
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.
Enbowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a white-haired old man known only as⎕⎕⎕⎕⎕⎕⎕
Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit of mew rce: eans, kept tipy by a white-haired ao lo man known only as 0.88
\n",
+ " \n",
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orl⎕⎕⎕eans, kept tidy by a white-haired ⎕o⎕ld man known only as Bambu.
Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit⎕⎕⎕⎕⎕⎕ of mew ⎕rce: eans, kept tipy by a white-haired ao⎕lo man known only as⎕⎕⎕⎕⎕⎕⎕
Enbonered by great gnarled cypress trees, the ancient manor stands alone on the pag hile of new orleans, kept tipy by a white-haired ao lo man known omy as 0.88
\n",
+ " \n",
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired ⎕o⎕ld man known only as Bambu.
Enbonered by great gnarled cypress trees, the ancient manor stands alone on the pag hile⎕ of new orleans, kept tipy by a white-haired ao⎕lo man known om⎕y as⎕⎕⎕⎕⎕⎕⎕
Embowered by great gnarled cypress trees, the ancient ⎕manor stands alone on the outskirts of New Orleans, kept tidy by a whi⎕te-haired old man known only as Bambu⎕⎕.
Fhbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans. kept tipy by a whi⎕te-haire⎕ old man known only as bambi] .
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a whi⎕te-⎕haired old man known only as B⎕ambu⎕.
Enbonsred by great sha⎕le⎕ cypress trees, the anci⎕⎕⎕ manor stands alone on the [⎕tskirts of new orleans, kept tidy by a whi⎕te-⎕haired old man known only as b8ambl .
Embowered by great gnarled cypress trees, the ancient ⎕manor stands alone on the outskirts of New Orleans, kept tidy by a whi⎕te-haired old man known only as Bambu⎕⎕.
Enbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans, kept tipy by a whi⎕te-haired old man known only as 8ambli .
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.
O⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕utskirts of new orleans, kept tipy by a white-haired old man known only as sams .
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a whi⎕te-haired old man known only as B⎕ambu⎕.
Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi⎕te-haired old man known only as b8ambl .
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a whi⎕te-haired old man known only as B⎕ambu⎕.
Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi⎕te-haired old man known only as b8ambl .
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3
Default
0.85
Eneowered by great gnarled cypress jrfes, the ancient manor !⎕⎕⎕⎕⎕ alone on the eit⎕⎕⎕⎕⎕⎕ of mew ⎕rce: eans, kept tipy by a white-haired ao⎕⎕han known only as⎕⎕⎕⎕⎕⎕⎕
Default, grey pad
0.95
Enbowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a white-haired old man known only as⎕⎕⎕⎕⎕⎕⎕
Padded 4px
0.88
Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit⎕⎕⎕⎕⎕⎕ of mew ⎕rce: eans, kept tipy by a white-haired ao⎕lo man known only as⎕⎕⎕⎕⎕⎕⎕
Padded 8px
0.88
Enbonered by great gnarled cypress trees, the ancient manor stands alone on the pag hile⎕ of new orleans, kept tipy by a white-haired ao⎕lo man known om⎕y as⎕⎕⎕⎕⎕⎕⎕
Extracted, init box
0.92
Fhbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans. kept tipy by a whi⎕te-haire⎕ old man known only as bambi] .
Padded 4, extracted
0.91
Enbonsred by great sha⎕le⎕ cypress trees, the anci⎕⎕⎕ manor stands alone on the [⎕tskirts of new orleans, kept tidy by a whi⎕te-⎕haired old man known only as b8ambl .
Padded 8, extracted
0.94
Enbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans, kept tipy by a whi⎕te-haired old man known only as 8ambli .
Padded 8, dilation 1
0.61
O⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕utskirts of new orleans, kept tipy by a white-haired old man known only as sams .
Pad 8, fract. 0.5
0.94
Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi⎕te-haired old man known only as b8ambl .
Pad 8, fract. 0.2
0.94
Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi⎕te-haired old man known only as b8ambl .
Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit⎕⎕⎕⎕⎕⎕ of mew ⎕rce: eans, kept tipy by a white-haired ao⎕lo man known only as⎕⎕⎕⎕⎕⎕⎕
2
0.93
The house and the old⎕man are alike in many ways; tall, proud, patient, contented aways 0⎕ wait until their.⎕aster comes home ~~ | }
3
0.69
F and one in ⎕ee⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ would appear.
4
0.77
" bambli-— we have a gliest.
5
0.55
P⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ comes⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ slamming open the caken⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
6
0.57
Tel⎕oe⎕⎕⎕⎕⎕er-- 5 ow ⎕a⎕=⎕⎕⎕⎕7⎕⎕⎕⎕⎕
7
0.38
W⎕⎕e⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ and perhaps c o⎕e /⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
8
0.75
The⎕⎕⎕⎕⎕⎕⎕⎕ the old man⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕s fades down the hall sra⎕⎕⎕⎕
9
0.92
How curious the a⎕whims of fate⎕. - had i not chanced to stroll along the river yl⎕tonight~-⎕
10
0.79
A⎕⎕⎕ulckly as t can, ‘masrer.
11
0.92
<⎕the girl wolld⎕- most surely be dead by now.
12
0.88
Ghede has been generous. the oeath gop has given -⎕the girl. a second chance ye,⎕alem
13
0.67
Soe⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕er⎕eke⎕othing to scream ay⎕⎕⎕ anymore.
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3 0.90
\n",
+ " \n",
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.⎕⎕
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3
2
0.93
The house and the old man are alike in many ways; tall, prolid, patient, contented always 0⎕ wait until. their. master cones mome ~~
3
0.70
“and one in ⎕ee⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ would appear.
4
0.62
Re bambli-~ we have a⎕⎕⎕⎕⎕⎕⎕
5
0.70
T⎕⎕⎕⎕⎕⎕onight, he comes noost⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ slamming open the caken⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
6
0.82
Tell me⎕naster.⎕ how may bambli serve 7
7
0.56
£7⎕⎕»⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ and perhaps some dry clothes...⎕7⎕/
8
0.81
The⎕⎕⎕⎕⎕⎕⎕⎕ the old man'⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕s fades down the hall as...⎕7
9
0.85
How curious the 4⎕fate.⎕whims of h⎕⎕⎕⎕⎕⎕ad t not chanced to stroll along the river yl⎕tonight ==
10
0.80
Fas oulckly as t ca⎕, master.
11
0.91
<⎕the girl would -⎕most slirely be dead by now.
12
0.47
A⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕th⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ the girl. a second chance ge⎕a ee yg adil
13
0.84
Ah⎕⎕⎕ girl--⎕there's ⎕othing to scream nt⎕⎕t anymore⎕.
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3
2
0.93
The house and the old man are alike in many ways; tall, prolid, patient, contented always 0⎕ wait until. their. master cones mome ~~
7
0.56
£7⎕⎕»⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ and perhaps some dry clothes...⎕7⎕/
Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski) 2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3
2
0.93
The house and the old man are alike in many ways; tall, prolid, patient, contented always 0⎕ wait until. their. master cones mome ~~
3
0.70
“and one in ⎕ee⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ would appear.
4
0.62
Re bambli-~ we have a⎕⎕⎕⎕⎕⎕⎕
5
0.70
T⎕⎕⎕⎕⎕⎕onight, he comes noost⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ slamming open the caken⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
6
0.82
Tell me⎕naster.⎕ how may bambli serve 7
7
0.56
£7⎕⎕»⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ and perhaps some dry clothes...⎕7⎕/
8
0.81
The⎕⎕⎕⎕⎕⎕⎕⎕ the old man'⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕s fades down the hall as...⎕7
9
0.85
How curious the 4⎕fate.⎕whims of h⎕⎕⎕⎕⎕⎕ad t not chanced to stroll along the river yl⎕tonight ==
10
0.80
Fas oulckly as t ca⎕, master.
11
0.91
<⎕the girl would -⎕most slirely be dead by now.
12
0.47
A⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕th⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ the girl. a second chance ge⎕a ee yg adil
13
0.84
Ah⎕⎕⎕ girl--⎕there's ⎕othing to scream nt⎕⎕t anymore⎕.
14
0.93
You're among friends now. you're sale⎕
15
1.00
Continued after next page
"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Pick the crop method to inspect; uncomment one of the alternatives to switch.\n",
+ "method = CropMethod.INITIAL_BOX\n",
+ "# method = CropMethod.DEFAULT\n",
+ "# method = CropMethod.PADDED_4\n",
+ "# method = CropMethod.PADDED_8\n",
+ "# method = CropMethod.EXTRACTED_INIT_BOX\n",
+ "# method = CropMethod.PAD_8_FRACT_0_5\n",
+ "# method = CropMethod.PAD_8_FRACT_0_2\n",
+ "\n",
+ "# Run the experiment for the selected crop method.\n",
+ "initial_box_exp = image_experiment.method_experiment(method)\n",
+ "initial_box_exp(display=True)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Results for another crop method, for comparison:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 102,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[ResultOCR#block 00: 0.85||Eneowered by great gnarled cypress jrfes, the ancient manor ! alone on the eit of mew rce: eans, kept tipy by a white-haired ao han known only as,\n",
+ " ResultOCR#block 01: 0.96||The house and the old man are alike in many ways; tall, proud, patient, conten tel always to wait until their. master cones home ~~,\n",
+ " ResultOCR#block 02: 0.74||And one in ee would appear.,\n",
+ " ResultOCR#block 03: 0.41||Rir guest.,\n",
+ " ResultOCR#block 04: 0.59||=~and tonight, he comes host sane oo,\n",
+ " ResultOCR#block 05: 0.78||Tell me masts - how may bambli . serve 7 _,\n",
+ " ResultOCR#block 06: 0.48||R warm, bambli-~ and perhaps,\n",
+ " ResultOCR#block 07: 0.76||The the old mans fades down the hall s.00,\n",
+ " ResultOCR#block 08: 0.92||How curious the a whims of fate . had t not chanced to stroll along the river tonight~~ >,\n",
+ " ResultOCR#block 09: 0.50||Aulckly “master as t can,,\n",
+ " ResultOCR#block 10: 0.94||"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "image_experiment.perform_methods(plot_acc=True)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 101,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "
---------- Initial box ----------\n",
+ "ResultOCR#block 00: 0.90||Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski)2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3\n",
+ "ResultOCR#block 01: 0.93||The house and the old man are alike in many ways; tall, prolid, patient, contented always 0 wait until. their. master cones mome ~~\n",
+ "ResultOCR#block 02: 0.70||“and one in ee would appear.\n",
+ "ResultOCR#block 03: 0.62||Re bambli-~ we have a\n",
+ "ResultOCR#block 04: 0.70||Tonight, he comes noost slamming open the caken\n",
+ "ResultOCR#block 05: 0.82||Tell me naster. how may bambli serve 7\n",
+ "ResultOCR#block 06: 0.56||£7 » and perhaps some dry clothes...7/\n",
+ "ResultOCR#block 07: 0.81||The the old man's fades down the hall as...7\n",
+ "ResultOCR#block 08: 0.85||How curious the 4 fate. whims of had t not chanced to stroll along the river yl tonight ==\n",
+ "ResultOCR#block 09: 0.80||Fas oulckly as t ca, master.\n",
+ "ResultOCR#block 10: 0.91||<the girl would - most slirely be dead by now.\n",
+ "ResultOCR#block 11: 0.47||Ath the girl. a second chance ge a ee yg adil\n",
+ "ResultOCR#block 12: 0.84||Ah girl--there's othing to scream ntt anymore .\n",
+ "ResultOCR#block 13: 0.93||You're among friends now. you're sale\n",
+ "ResultOCR#block 14: 1.00||Continued after next page\n",
+ "\n",
+ " ---------- Default ----------\n",
+ "ResultOCR#block 00: 0.85||Eneowered by great gnarled cypress jrfes, the ancient manor ! alone on the eit of mew rce: eans, kept tipy by a white-haired ao han known only as\n",
+ "ResultOCR#block 01: 0.96||The house and the old man are alike in many ways; tall, proud, patient, conten tel always to wait until their. master cones home ~~\n",
+ "ResultOCR#block 02: 0.74||And one in ee would appear.\n",
+ "ResultOCR#block 03: 0.41||Rir guest.\n",
+ "ResultOCR#block 04: 0.59||=~and tonight, he comes host sane oo\n",
+ "ResultOCR#block 05: 0.78||Tell me masts - how may bambli . serve 7 _\n",
+ "ResultOCR#block 06: 0.48||R warm, bambli-~ and perhaps\n",
+ "ResultOCR#block 07: 0.76||The the old mans fades down the hall s.00\n",
+ "ResultOCR#block 08: 0.92||How curious the a whims of fate . had t not chanced to stroll along the river tonight~~ >\n",
+ "ResultOCR#block 09: 0.50||Aulckly “master as t can,\n",
+ "ResultOCR#block 10: 0.94||<the girl would - most surely be dead by now.\n",
+ "ResultOCR#block 11: 0.50||Ath - the girl. a second chance ee oo tr tt\n",
+ "ResultOCR#block 12: 0.84||Oe girl--there's othing to scream nt anymore. 4\n",
+ "ResultOCR#block 13: 0.96||You're among friends now. youre safe!\n",
+ "ResultOCR#block 14: 1.00||Continued after next page\n",
+ "\n",
+ " ---------- Default, grey pad ----------\n",
+ "ResultOCR#block 00: 0.95||Enbowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a white-haired old man known only as\n",
+ "ResultOCR#block 01: 0.96||The house and the old man are alike in many ways; tall, prolid, patient, contented always to wait until their. * master cones home ~-\n",
+ "ResultOCR#block 02: 0.94||“and one in > need of some help, it would appear .\n",
+ "ResultOCR#block 03: 0.88||\" bambl-- we have a guest.\n",
+ "ResultOCR#block 04: 0.72||~~and tonight, he comes urgently, slanming open\n",
+ "ResultOCR#block 05: 0.86||Tell me, master: how may bambli serve 7\n",
+ "ResultOCR#block 06: 0.90||Some blankets to keep her. warm, bambli-- and perhaps some dry \\ clothes--7/.\n",
+ "ResultOCR#block 07: 0.06||As.\n",
+ "ResultOCR#block 08: 0.91||How curious the d 7e . whims of fa; had i not chanced to stroll along the river tonight--\n",
+ "ResultOCR#block 09: 0.55||Ickl as t can\n",
+ "ResultOCR#block 10: 0.00||\n",
+ "ResultOCR#block 11: 0.85||Ghede has been generous. the death god has gen + the girl. a second chance te oe ato\" pd ate\n",
+ "ResultOCR#block 12: 0.95||Easy, girl--there's | nothing to scream about anyaore.\n",
+ "ResultOCR#block 13: 0.97||You're among friends now. you're safe!\n",
+ "ResultOCR#block 14: 0.54||“continued a\n",
+ "\n",
+ " ---------- Padded 4px ----------\n",
+ "ResultOCR#block 00: 0.88||Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit of mew rce: eans, kept tipy by a white-haired ao lo man known only as\n",
+ "ResultOCR#block 01: 0.93||The house and the oldman are alike in many ways; tall, proud, patient, contented a ways 0 wait until their. aster comes home ~~ | }\n",
+ "ResultOCR#block 02: 0.69||F and one in ee would appear.\n",
+ "ResultOCR#block 03: 0.77||\" bambli-— we have a gliest.\n",
+ "ResultOCR#block 04: 0.55||P comes slamming open the caken\n",
+ "ResultOCR#block 05: 0.57||Tel oe er-- 5 ow a = 7\n",
+ "ResultOCR#block 06: 0.38||We and perhaps c oe /\n",
+ "ResultOCR#block 07: 0.75||The the old mans fades down the hall sra\n",
+ "ResultOCR#block 08: 0.92||How curious the a whims of fate . - had i not chanced to stroll along the river yl tonight~-\n",
+ "ResultOCR#block 09: 0.79||Aulckly as t can, ‘masrer.\n",
+ "ResultOCR#block 10: 0.92||<the girl wolld - most surely be dead by now.\n",
+ "ResultOCR#block 11: 0.88||Ghede has been generous. the oeath gop has given - the girl. a second chance ye, alem\n",
+ "ResultOCR#block 12: 0.67||Soe er eke othing to scream ay anymore.\n",
+ "ResultOCR#block 13: 0.94||\"you're among friends now. you're safe!\n",
+ "ResultOCR#block 14: 1.00||Continued after next page\n",
+ "\n",
+ " ---------- Padded 8px ----------\n",
+ "ResultOCR#block 00: 0.88||Enbonered by great gnarled cypress trees, the ancient manor stands alone on the pag hile of new orleans, kept tipy by a white-haired ao lo man known omy as\n",
+ "ResultOCR#block 01: 0.99||The house and the old man are alike in many ways; tall, proud, patient, contented always to wait until their. master comes home\n",
+ "ResultOCR#block 02: 0.67||7 and one in ee would appear,\n",
+ "ResultOCR#block 03: 0.76||Zf mbl == we have a guest.\n",
+ "ResultOCR#block 04: 0.61||Tonight, he comes host slamming open\n",
+ "ResultOCR#block 05: 0.75||Yy i tell me master - how may bambli serve 7 _,\n",
+ "ResultOCR#block 06: 0.89||Some. blankets to keep he arm , bambli-= and perhaps some dry clothes. 2s\n",
+ "ResultOCR#block 07: 0.72||The the old mans fades down the hal. srl see\n",
+ "ResultOCR#block 08: 0.88||* how curious the p whims of fate . - had i not chanced . to stroll along _ the river 3 tonight-~\n",
+ "ResultOCR#block 09: 0.65||Tiie as t can, \\ master ,\n",
+ "ResultOCR#block 10: 0.86||The girl wolld - most slirely be - dead by now.\n",
+ "ResultOCR#block 11: 0.62||Ghede has been generous. : the crn son ue;\n",
+ "ResultOCR#block 12: 0.62||Soe er eke othing to scream hbolt anhore hr\n",
+ "ResultOCR#block 13: 0.92||” you're among friends now. you're safe!\n",
+ "ResultOCR#block 14: 0.94||“continued after next page\n",
+ "\n",
+ " ---------- Extracted, init box ----------\n",
+ "ResultOCRExtracted#block 00: 0.92||Fhbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans. kept tipy by a whi te-haire old man known only as bambi] .\n",
+ "ResultOCRExtracted#block 01: 0.93||Ee house and the old man por alike in many ways; tall, proud, patient, contented always t° wait until their. master cones home ~~\n",
+ "ResultOCRExtracted#block 02: 0.73||And one in fee would appear.\n",
+ "ResultOCRExtracted#block 03: 0.67||— we have a i=s7t.\n",
+ "ResultOCRExtracted#block 04: 0.74||~and tonight, he comes urgently, slamming open\n",
+ "ResultOCRExtracted#block 05: 0.85||Tell me master how may bambli serve 7\n",
+ "ResultOCRExtracted#block 06: 0.93||Some blankets to keep her warm, banbli-- and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.77||The the old man's fades down the hall s.,00\n",
+ "ResultOCRExtracted#block 08: 0.77||Hin ef fare” had i not chanced to stroll along the river. tonigmt=~\n",
+ "ResultOCRExtracted#block 09: 0.85||Aulckly as t can, master,\n",
+ "ResultOCRExtracted#block 10: 1.00||--the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.51||Ath the girl a second chance ro\n",
+ "ResultOCRExtracted#block 12: 0.56||Cas ee, othing to scream pls aa .\n",
+ "ResultOCRExtracted#block 13: 0.95||You're among friends now. you're sale!\n",
+ "ResultOCRExtracted#block 14: 0.91||Continued af ext page\n",
+ "\n",
+ " ---------- Padded 4, extracted ----------\n",
+ "ResultOCRExtracted#block 00: 0.91||Enbonsred by great shale cypress trees, the anci manor stands alone on the [tskirts of new orleans, kept tidy by a whi te- haired old man known only as b8ambl .\n",
+ "ResultOCRExtracted#block 01: 0.98||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until their. master cones home --\n",
+ "ResultOCRExtracted#block 02: 0.73||And one in fee would appear.\n",
+ "ResultOCRExtracted#block 03: 0.83||Bambli we have a gliest.\n",
+ "ResultOCRExtracted#block 04: 0.74||=~and tonight, he comes urgently, slamming open\n",
+ "ResultOCRExtracted#block 05: 0.84||Tell me master. how may bambli serve 7\n",
+ "ResultOCRExtracted#block 06: 0.53||Warm, bambli-- and perhaps. som\n",
+ "ResultOCRExtracted#block 07: 0.75||The the old mans fades down the hall sra\n",
+ "ResultOCRExtracted#block 08: 0.76||We sea had i not chanced to stroll along the river. tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.76||Alley as t can, master,\n",
+ "ResultOCRExtracted#block 10: 0.98||~-the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.92||Chepe has been generous. the peath god has given the girl a second chance amr\n",
+ "ResultOCRExtracted#block 12: 0.73||Cas gr theres othing to scream pissy tore .\n",
+ "ResultOCRExtracted#block 13: 0.95||You're among eriends now. you're safe!\n",
+ "ResultOCRExtracted#block 14: 0.91||Continued af ext page\n",
+ "\n",
+ " ---------- Padded 8, extracted ----------\n",
+ "ResultOCRExtracted#block 00: 0.94||Enbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans, kept tipy by a whi te-haired old man known only as 8ambli .\n",
+ "ResultOCRExtracted#block 01: 0.98||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until their. master comes home --\n",
+ "ResultOCRExtracted#block 02: 0.70||And one in fee wolld appear.\n",
+ "ResultOCRExtracted#block 03: 0.86||Bambl ~~ we have a guest.\n",
+ "ResultOCRExtracted#block 04: 0.64||=~and tonight, he comes slamming open urgently,\n",
+ "ResultOCRExtracted#block 05: 0.82||Tell me master... how may bambli serve't\n",
+ "ResultOCRExtracted#block 06: 0.91||Some blankets to keep her warm, banbli-~ and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.75||The the old mans fades down the hall sire\n",
+ "ResultOCRExtracted#block 08: 0.96||How curious the whims of fate . had i not chanced to stroll along the river tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.77||Allckry as t can, master.\n",
+ "ResultOCRExtracted#block 10: 0.98||~-the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.94||Ghede has been generous. the peath god has given the girl a second chance po\n",
+ "ResultOCRExtracted#block 12: 0.74||Cas gr theres othing to scream pps hore .\n",
+ "ResultOCRExtracted#block 13: 0.42||You're safe § r\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ " ---------- Padded 8, dilation 1 ----------\n",
+ "ResultOCRExtracted#block 00: 0.61||Outskirts of new orleans, kept tipy by a white-haired old man known only as sams .\n",
+ "ResultOCRExtracted#block 01: 0.88||The house and the old man are alike in many ways, tall, proud, nt, contented live walt gtie their, master comes home -=\n",
+ "ResultOCRExtracted#block 02: 0.97||And one in need of some help, it wolld appear .\n",
+ "ResultOCRExtracted#block 03: 0.78||Bambli ~~ we have a gliest.\n",
+ "ResultOCRExtracted#block 04: 0.79||=and tonight, he comes most slamming open the front\n",
+ "ResultOCRExtracted#block 05: 0.86||Tell me, master: how may bambli serve 7\n",
+ "ResultOCRExtracted#block 06: 0.85||Gone blankets to keep her. warm, bambli-~ and perhaps some dry\n",
+ "ResultOCRExtracted#block 07: 0.73||The old man's footsteps the hall as.re\n",
+ "ResultOCRExtracted#block 08: 0.94||How curious the whims of fate . had i not chanced to stroll along the r/| ver tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.68||Aulckly as t can,\n",
+ "ResultOCRExtracted#block 10: 0.95||~<the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.75||Ee lh boe ene. the death gop has the girl. a second chance\n",
+ "ResultOCRExtracted#block 12: 0.92||Easy, girl--there's nothing to scream abolit anymo!\n",
+ "ResultOCRExtracted#block 13: 0.97||You're among friends now. you're safe!\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ " ---------- Pad 8, fract. 0.5 ----------\n",
+ "ResultOCRExtracted#block 00: 0.94||Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi te-haired old man known only as b8ambl .\n",
+ "ResultOCRExtracted#block 01: 0.97||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until. their. master comes home ~~\n",
+ "ResultOCRExtracted#block 02: 0.78||And one in eee pe would appear.\n",
+ "ResultOCRExtracted#block 03: 0.86||Bambl ~~ we have a guest.\n",
+ "ResultOCRExtracted#block 04: 0.64||=~and tonight, he comes slamming open urgently,\n",
+ "ResultOCRExtracted#block 05: 0.82||Tell me master... how may bambli serve'7\n",
+ "ResultOCRExtracted#block 06: 0.94||Some blankets to keep her. warm, bambli-~ and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.73||The the old mans fades donn the hall sire\n",
+ "ResultOCRExtracted#block 08: 0.96||How curious the whims of fate . had i not chanced to stroll along the river tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.81||Aulckry as t can, master.\n",
+ "ResultOCRExtracted#block 10: 0.98||~-the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.92||Ghede hag been generous. the peath god has given the girl a second chance po\n",
+ "ResultOCRExtracted#block 12: 0.76||Cas srl theres othing to scream seoit hore .\n",
+ "ResultOCRExtracted#block 13: 0.42||You're safe 4 ’\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ " ---------- Pad 8, fract. 0.2 ----------\n",
+ "ResultOCRExtracted#block 00: 0.94||Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi te-haired old man known only as b8ambl .\n",
+ "ResultOCRExtracted#block 01: 0.97||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until their. master comes home ~~\n",
+ "ResultOCRExtracted#block 02: 0.77||And one in eet sve would appear.\n",
+ "ResultOCRExtracted#block 03: 0.86||Bambl ~~ we have a guest.\n",
+ "ResultOCRExtracted#block 04: 0.64||=~and tonight, he comes slamming open urgently,\n",
+ "ResultOCRExtracted#block 05: 0.82||Tell me master... how may bambli serve'7\n",
+ "ResultOCRExtracted#block 06: 0.94||Some blankets to keep her, warm, bambli-~ and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.75||The the old mans fades down the hall sere\n",
+ "ResultOCRExtracted#block 08: 0.96||How curious the whims of fate . had i not chanced to stroll along the river tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.81||Aulckry as t can, master.\n",
+ "ResultOCRExtracted#block 10: 0.95||~<the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.94||Ghede has been generous. the ceath god has given the girl a second chance po\n",
+ "ResultOCRExtracted#block 12: 0.67||Yi renee othing to scream seat anhore .\n",
+ "ResultOCRExtracted#block 13: 0.93||Youre among eriends now. you're safe!\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ "\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "---------- Initial box ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.90\u001b[0m||Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski\u001b[1m)\u001b[0m \u001b[1;36m2\u001b[0m of mew ce eans, kept tidy by a white-haired old man known only as bambs, \u001b[1;36m3\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.93\u001b[0m||The house and the old man are alike in many ways; tall, prolid, patient, contented always \u001b[1;36m0\u001b[0m wait until. their. master cones mome ~~\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.70\u001b[0m||“and one in ee would appear.\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.62\u001b[0m||Re bambli-~ we have a\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.70\u001b[0m||Tonight, he comes noost slamming open the caken\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.82\u001b[0m||Tell me naster. how may bambli serve \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.56\u001b[0m||£\u001b[1;36m7\u001b[0m » and perhaps some dry clothes\u001b[33m...\u001b[0m \u001b[1;36m7\u001b[0m \u001b[35m/\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.81\u001b[0m||The the old man's fades down the hall as\u001b[33m...\u001b[0m \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.85\u001b[0m||How curious the \u001b[1;36m4\u001b[0m fate. whims of had t not chanced to stroll along the river yl tonight ==\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.80\u001b[0m||Fas oulckly as t ca, master.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.91\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.50\u001b[0m||Aulckly “master as t can,\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.94\u001b[0m|| need of some help, it would appear .\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.88\u001b[0m||\" bambl-- we have a guest.\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.72\u001b[0m||~~and tonight, he comes urgently, slanming open\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.86\u001b[0m||Tell me, master: how may bambli serve \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.90\u001b[0m||Some blankets to keep her. warm, bambli-- and perhaps some dry \\ clothes-\u001b[1;36m-7\u001b[0m \u001b[35m/\u001b[0m\u001b[95m.\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.06\u001b[0m||As.\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.91\u001b[0m||How curious the d 7e . whims of fa; had i not chanced to stroll along the river tonight--\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.55\u001b[0m||Ickl as t can\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.00\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.85\u001b[0m||Ghede has been generous. the death god has gen + the girl. a second chance te oe ato\" pd ate\n",
+ "ResultOCR#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.95\u001b[0m||Easy, girl--there's | nothing to scream about anyaore.\n",
+ "ResultOCR#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.97\u001b[0m||You're among friends now. you're safe!\n",
+ "ResultOCR#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.54\u001b[0m||“continued a\n",
+ "\n",
+ " ---------- Padded 4px ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.88\u001b[0m||Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit of mew rce: eans, kept tipy by a white-haired ao lo man known only as\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.93\u001b[0m||The house and the oldman are alike in many ways; tall, proud, patient, contented a ways \u001b[1;36m0\u001b[0m wait until their. aster comes home ~~ | \u001b[1m}\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.69\u001b[0m||F and one in ee would appear.\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.77\u001b[0m||\" bambli-— we have a gliest.\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.55\u001b[0m||P comes slamming open the caken\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.57\u001b[0m||Tel oe er-- \u001b[1;36m5\u001b[0m ow a = \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.38\u001b[0m||We and perhaps c oe \u001b[35m/\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.75\u001b[0m||The the old mans fades down the hall sra\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.92\u001b[0m||How curious the a whims of fate . - had i not chanced to stroll along the river yl tonight~-\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.79\u001b[0m||Aulckly as t can, ‘masrer.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.92\u001b[0m||"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "image_experiment.plot_accuracies()\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 103,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "
[\n",
+ " (\n",
+ " 'Default, grey pad',\n",
+ " '0.953',\n",
+ " 'Enbowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of \n",
+ "new orleans, kept tipy by a white-haired old man known only as'\n",
+ " ),\n",
+ " (\n",
+ " 'Padded 8px',\n",
+ " '0.988',\n",
+ " 'The house and the old man are alike in many ways; tall, proud, patient, contented always to \n",
+ "wait until their. master comes home'\n",
+ " ),\n",
+ " ('Padded 8, dilation 1', '0.968', 'And one in need of some help, it wolld appear .'),\n",
+ " ('Default, grey pad', '0.880', '\" bambl-- we have a guest.'),\n",
+ " ('Padded 8, dilation 1', '0.794', '=and tonight, he comes most slamming open the front'),\n",
+ " ('Default, grey pad', '0.857', 'Tell me, master: how may bambli serve 7'),\n",
+ " (\n",
+ " 'Pad 8, fract. 0.5',\n",
+ " '0.935',\n",
+ " 'Some blankets to keep her. warm, bambli-~ and perhaps. some dry clothes'\n",
+ " ),\n",
+ " ('Initial box', '0.811', \"The the old man's fades down the hall as... 7\"),\n",
+ " (\n",
+ " 'Padded 8, extracted',\n",
+ " '0.959',\n",
+ " 'How curious the whims of fate . had i not chanced to stroll along the river tonight-~'\n",
+ " ),\n",
+ " ('Extracted, init box', '0.846', 'Aulckly as t can, master,'),\n",
+ " ('Extracted, init box', '1.000', '--the girl would most surely be dead by now.'),\n",
+ " (\n",
+ " 'Padded 8, extracted',\n",
+ " '0.935',\n",
+ " 'Ghede has been generous. the peath god has given the girl a second chance po'\n",
+ " ),\n",
+ " ('Default, grey pad', '0.953', \"Easy, girl--there's | nothing to scream about anyaore.\"),\n",
+ " ('Default, grey pad', '0.974', \"You're among friends now. you're safe!\"),\n",
+ " ('Initial box', '1.000', 'Continued after next page')\n",
+ "]\n",
+ "\n"
+ ],
+ "text/plain": [
+ "\u001b[1m[\u001b[0m\n",
+ " \u001b[1m(\u001b[0m\n",
+ " \u001b[32m'Default, grey pad'\u001b[0m,\n",
+ " \u001b[32m'0.953'\u001b[0m,\n",
+ " \u001b[32m'Enbowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of \u001b[0m\n",
+ "\u001b[32mnew orleans, kept tipy by a white-haired old man known only as'\u001b[0m\n",
+ " \u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\n",
+ " \u001b[32m'Padded 8px'\u001b[0m,\n",
+ " \u001b[32m'0.988'\u001b[0m,\n",
+ " \u001b[32m'The house and the old man are alike in many ways; tall, proud, patient, contented always to \u001b[0m\n",
+ "\u001b[32mwait until their. master comes home'\u001b[0m\n",
+ " \u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Padded 8, dilation 1'\u001b[0m, \u001b[32m'0.968'\u001b[0m, \u001b[32m'And one in need of some help, it wolld appear .'\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Default, grey pad'\u001b[0m, \u001b[32m'0.880'\u001b[0m, \u001b[32m'\" bambl-- we have a guest.'\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Padded 8, dilation 1'\u001b[0m, \u001b[32m'0.794'\u001b[0m, \u001b[32m'=and tonight, he comes most slamming open the front'\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Default, grey pad'\u001b[0m, \u001b[32m'0.857'\u001b[0m, \u001b[32m'Tell me, master: how may bambli serve 7'\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\n",
+ " \u001b[32m'Pad 8, fract. 0.5'\u001b[0m,\n",
+ " \u001b[32m'0.935'\u001b[0m,\n",
+ " \u001b[32m'Some blankets to keep her. warm, bambli-~ and perhaps. some dry clothes'\u001b[0m\n",
+ " \u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Initial box'\u001b[0m, \u001b[32m'0.811'\u001b[0m, \u001b[32m\"The the old man's fades down the hall as... 7\"\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\n",
+ " \u001b[32m'Padded 8, extracted'\u001b[0m,\n",
+ " \u001b[32m'0.959'\u001b[0m,\n",
+ " \u001b[32m'How curious the whims of fate . had i not chanced to stroll along the river tonight-~'\u001b[0m\n",
+ " \u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Extracted, init box'\u001b[0m, \u001b[32m'0.846'\u001b[0m, \u001b[32m'Aulckly as t can, master,'\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Extracted, init box'\u001b[0m, \u001b[32m'1.000'\u001b[0m, \u001b[32m'--the girl would most surely be dead by now.'\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\n",
+ " \u001b[32m'Padded 8, extracted'\u001b[0m,\n",
+ " \u001b[32m'0.935'\u001b[0m,\n",
+ " \u001b[32m'Ghede has been generous. the peath god has given the girl a second chance po'\u001b[0m\n",
+ " \u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Default, grey pad'\u001b[0m, \u001b[32m'0.953'\u001b[0m, \u001b[32m\"Easy, girl--there's | nothing to scream about anyaore.\"\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Default, grey pad'\u001b[0m, \u001b[32m'0.974'\u001b[0m, \u001b[32m\"You're among friends now. you're safe!\"\u001b[0m\u001b[1m)\u001b[0m,\n",
+ " \u001b[1m(\u001b[0m\u001b[32m'Initial box'\u001b[0m, \u001b[32m'1.000'\u001b[0m, \u001b[32m'Continued after next page'\u001b[0m\u001b[1m)\u001b[0m\n",
+ "\u001b[1m]\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "ll = image_experiment.best_results()\n",
+ "if ll:\n",
+ " cprint([(m.value, f\"{r.acc:.3f}\", r.ocr) for m,r in ll])\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Perform experiments given a list of `CropMethod`s\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 105,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# methods = [*CropMethod.__members__.values()]\n",
+ "methods = [CropMethod.INITIAL_BOX, CropMethod.DEFAULT]\n",
+ "image_experiment.perform_methods(methods)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Plot the performance of the experiments"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 106,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "\n",
+ " \n",
+ "Box # | Image | Accuracy | OCR\n",
+ "1 | | 0.94 | Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi⎕te-haired old man known only as b8ambl .\n",
+ "2 | | 0.97 | The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until their. master comes home ~~\n",
+ "3 | | 0.77 | And one in ⎕eet⎕⎕⎕ sv⎕e⎕⎕⎕⎕⎕⎕⎕⎕⎕ would appear.\n",
+ "4 | | 0.86 | Bambl ~~ we have a guest.\n",
+ "5 | | 0.64 | =~and tonight, he comes ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕slamming open urg⎕⎕⎕⎕en⎕⎕⎕⎕⎕tly,⎕⎕⎕⎕\n",
+ "6 | | 0.82 | Tell me⎕ master... how may bambli serve'7\n",
+ "7 | | 0.94 | Some blankets to keep her, warm, bambli-~ and perhaps. some dry clothes\n",
+ "8 | | 0.75 | The⎕⎕⎕⎕⎕⎕⎕⎕ the old man⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕s fades down the hall ⎕sere\n",
+ "9 | | 0.96 | How curious the whims of fate⎕. had i not chanced to stroll along the river tonight-~\n",
+ "10 | | 0.81 | A⎕⎕⎕ulckry as t can, master.\n",
+ "11 | | 0.95 | ~<the girl would most surely be dead by now.\n",
+ "12 | | 0.94 | Ghede has been generous. the ceath god has given the girl a second chance po⎕⎕\n",
+ "13 | | 0.67 | Yi⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕renee⎕othing to scream sea⎕⎕⎕t anh⎕ore⎕.\n",
+ "---------- Initial box ----------\n",
+ "ResultOCR#block 00: 0.90||Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski)2 of mew ce eans, kept tidy by a white-haired old man known only as bambs, 3\n",
+ "ResultOCR#block 01: 0.93||The house and the old man are alike in many ways; tall, prolid, patient, contented always 0 wait until. their. master cones mome ~~\n",
+ "ResultOCR#block 02: 0.70||“and one in ee would appear.\n",
+ "ResultOCR#block 03: 0.62||Re bambli-~ we have a\n",
+ "ResultOCR#block 04: 0.70||Tonight, he comes noost slamming open the caken\n",
+ "ResultOCR#block 05: 0.82||Tell me naster. how may bambli serve 7\n",
+ "ResultOCR#block 06: 0.56||£7 » and perhaps some dry clothes...7/\n",
+ "ResultOCR#block 07: 0.81||The the old man's fades down the hall as...7\n",
+ "ResultOCR#block 08: 0.85||How curious the 4 fate. whims of had t not chanced to stroll along the river yl tonight ==\n",
+ "ResultOCR#block 09: 0.80||Fas oulckly as t ca, master.\n",
+ "ResultOCR#block 10: 0.91||<the girl would - most slirely be dead by now.\n",
+ "ResultOCR#block 11: 0.47||Ath the girl. a second chance ge a ee yg adil\n",
+ "ResultOCR#block 12: 0.84||Ah girl--there's othing to scream ntt anymore .\n",
+ "ResultOCR#block 13: 0.93||You're among friends now. you're sale\n",
+ "ResultOCR#block 14: 1.00||Continued after next page\n",
+ "\n",
+ " ---------- Default ----------\n",
+ "ResultOCR#block 00: 0.85||Eneowered by great gnarled cypress jrfes, the ancient manor ! alone on the eit of mew rce: eans, kept tipy by a white-haired ao han known only as\n",
+ "ResultOCR#block 01: 0.96||The house and the old man are alike in many ways; tall, proud, patient, conten tel always to wait until their. master cones home ~~\n",
+ "ResultOCR#block 02: 0.74||And one in ee would appear.\n",
+ "ResultOCR#block 03: 0.41||Rir guest.\n",
+ "ResultOCR#block 04: 0.59||=~and tonight, he comes host sane oo\n",
+ "ResultOCR#block 05: 0.78||Tell me masts - how may bambli . serve 7 _\n",
+ "ResultOCR#block 06: 0.48||R warm, bambli-~ and perhaps\n",
+ "ResultOCR#block 07: 0.76||The the old mans fades down the hall s.00\n",
+ "ResultOCR#block 08: 0.92||How curious the a whims of fate . had t not chanced to stroll along the river tonight~~ >\n",
+ "ResultOCR#block 09: 0.50||Aulckly “master as t can,\n",
+ "ResultOCR#block 10: 0.94||<the girl would - most surely be dead by now.\n",
+ "ResultOCR#block 11: 0.50||Ath - the girl. a second chance ee oo tr tt\n",
+ "ResultOCR#block 12: 0.84||Oe girl--there's othing to scream nt anymore. 4\n",
+ "ResultOCR#block 13: 0.96||You're among friends now. youre safe!\n",
+ "ResultOCR#block 14: 1.00||Continued after next page\n",
+ "\n",
+ " ---------- Default, grey pad ----------\n",
+ "ResultOCR#block 00: 0.95||Enbowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a white-haired old man known only as\n",
+ "ResultOCR#block 01: 0.96||The house and the old man are alike in many ways; tall, prolid, patient, contented always to wait until their. * master cones home ~-\n",
+ "ResultOCR#block 02: 0.94||“and one in > need of some help, it would appear .\n",
+ "ResultOCR#block 03: 0.88||\" bambl-- we have a guest.\n",
+ "ResultOCR#block 04: 0.72||~~and tonight, he comes urgently, slanming open\n",
+ "ResultOCR#block 05: 0.86||Tell me, master: how may bambli serve 7\n",
+ "ResultOCR#block 06: 0.90||Some blankets to keep her. warm, bambli-- and perhaps some dry \\ clothes--7/.\n",
+ "ResultOCR#block 07: 0.06||As.\n",
+ "ResultOCR#block 08: 0.91||How curious the d 7e . whims of fa; had i not chanced to stroll along the river tonight--\n",
+ "ResultOCR#block 09: 0.55||Ickl as t can\n",
+ "ResultOCR#block 10: 0.00||\n",
+ "ResultOCR#block 11: 0.85||Ghede has been generous. the death god has gen + the girl. a second chance te oe ato\" pd ate\n",
+ "ResultOCR#block 12: 0.95||Easy, girl--there's | nothing to scream about anyaore.\n",
+ "ResultOCR#block 13: 0.97||You're among friends now. you're safe!\n",
+ "ResultOCR#block 14: 0.54||“continued a\n",
+ "\n",
+ " ---------- Padded 4px ----------\n",
+ "ResultOCR#block 00: 0.88||Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit of mew rce: eans, kept tipy by a white-haired ao lo man known only as\n",
+ "ResultOCR#block 01: 0.93||The house and the oldman are alike in many ways; tall, proud, patient, contented a ways 0 wait until their. aster comes home ~~ | }\n",
+ "ResultOCR#block 02: 0.69||F and one in ee would appear.\n",
+ "ResultOCR#block 03: 0.77||\" bambli-— we have a gliest.\n",
+ "ResultOCR#block 04: 0.55||P comes slamming open the caken\n",
+ "ResultOCR#block 05: 0.57||Tel oe er-- 5 ow a = 7\n",
+ "ResultOCR#block 06: 0.38||We and perhaps c oe /\n",
+ "ResultOCR#block 07: 0.75||The the old mans fades down the hall sra\n",
+ "ResultOCR#block 08: 0.92||How curious the a whims of fate . - had i not chanced to stroll along the river yl tonight~-\n",
+ "ResultOCR#block 09: 0.79||Aulckly as t can, ‘masrer.\n",
+ "ResultOCR#block 10: 0.92||<the girl wolld - most surely be dead by now.\n",
+ "ResultOCR#block 11: 0.88||Ghede has been generous. the oeath gop has given - the girl. a second chance ye, alem\n",
+ "ResultOCR#block 12: 0.67||Soe er eke othing to scream ay anymore.\n",
+ "ResultOCR#block 13: 0.94||\"you're among friends now. you're safe!\n",
+ "ResultOCR#block 14: 1.00||Continued after next page\n",
+ "\n",
+ " ---------- Padded 8px ----------\n",
+ "ResultOCR#block 00: 0.88||Enbonered by great gnarled cypress trees, the ancient manor stands alone on the pag hile of new orleans, kept tipy by a white-haired ao lo man known omy as\n",
+ "ResultOCR#block 01: 0.99||The house and the old man are alike in many ways; tall, proud, patient, contented always to wait until their. master comes home\n",
+ "ResultOCR#block 02: 0.67||7 and one in ee would appear,\n",
+ "ResultOCR#block 03: 0.76||Zf mbl == we have a guest.\n",
+ "ResultOCR#block 04: 0.61||Tonight, he comes host slamming open\n",
+ "ResultOCR#block 05: 0.75||Yy i tell me master - how may bambli serve 7 _,\n",
+ "ResultOCR#block 06: 0.89||Some. blankets to keep he arm , bambli-= and perhaps some dry clothes. 2s\n",
+ "ResultOCR#block 07: 0.72||The the old mans fades down the hal. srl see\n",
+ "ResultOCR#block 08: 0.88||* how curious the p whims of fate . - had i not chanced . to stroll along _ the river 3 tonight-~\n",
+ "ResultOCR#block 09: 0.65||Tiie as t can, \\ master ,\n",
+ "ResultOCR#block 10: 0.86||The girl wolld - most slirely be - dead by now.\n",
+ "ResultOCR#block 11: 0.62||Ghede has been generous. : the crn son ue;\n",
+ "ResultOCR#block 12: 0.62||Soe er eke othing to scream hbolt anhore hr\n",
+ "ResultOCR#block 13: 0.92||” you're among friends now. you're safe!\n",
+ "ResultOCR#block 14: 0.94||“continued after next page\n",
+ "\n",
+ " ---------- Extracted, init box ----------\n",
+ "ResultOCRExtracted#block 00: 0.92||Fhbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans. kept tipy by a whi te-haire old man known only as bambi] .\n",
+ "ResultOCRExtracted#block 01: 0.93||Ee house and the old man por alike in many ways; tall, proud, patient, contented always t° wait until their. master cones home ~~\n",
+ "ResultOCRExtracted#block 02: 0.73||And one in fee would appear.\n",
+ "ResultOCRExtracted#block 03: 0.67||— we have a i=s7t.\n",
+ "ResultOCRExtracted#block 04: 0.74||~and tonight, he comes urgently, slamming open\n",
+ "ResultOCRExtracted#block 05: 0.85||Tell me master how may bambli serve 7\n",
+ "ResultOCRExtracted#block 06: 0.93||Some blankets to keep her warm, banbli-- and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.77||The the old man's fades down the hall s.,00\n",
+ "ResultOCRExtracted#block 08: 0.77||Hin ef fare” had i not chanced to stroll along the river. tonigmt=~\n",
+ "ResultOCRExtracted#block 09: 0.85||Aulckly as t can, master,\n",
+ "ResultOCRExtracted#block 10: 1.00||--the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.51||Ath the girl a second chance ro\n",
+ "ResultOCRExtracted#block 12: 0.56||Cas ee, othing to scream pls aa .\n",
+ "ResultOCRExtracted#block 13: 0.95||You're among friends now. you're sale!\n",
+ "ResultOCRExtracted#block 14: 0.91||Continued af ext page\n",
+ "\n",
+ " ---------- Padded 4, extracted ----------\n",
+ "ResultOCRExtracted#block 00: 0.91||Enbonsred by great shale cypress trees, the anci manor stands alone on the [tskirts of new orleans, kept tidy by a whi te- haired old man known only as b8ambl .\n",
+ "ResultOCRExtracted#block 01: 0.98||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until their. master cones home --\n",
+ "ResultOCRExtracted#block 02: 0.73||And one in fee would appear.\n",
+ "ResultOCRExtracted#block 03: 0.83||Bambli we have a gliest.\n",
+ "ResultOCRExtracted#block 04: 0.74||=~and tonight, he comes urgently, slamming open\n",
+ "ResultOCRExtracted#block 05: 0.84||Tell me master. how may bambli serve 7\n",
+ "ResultOCRExtracted#block 06: 0.53||Warm, bambli-- and perhaps. som\n",
+ "ResultOCRExtracted#block 07: 0.75||The the old mans fades down the hall sra\n",
+ "ResultOCRExtracted#block 08: 0.76||We sea had i not chanced to stroll along the river. tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.76||Alley as t can, master,\n",
+ "ResultOCRExtracted#block 10: 0.98||~-the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.92||Chepe has been generous. the peath god has given the girl a second chance amr\n",
+ "ResultOCRExtracted#block 12: 0.73||Cas gr theres othing to scream pissy tore .\n",
+ "ResultOCRExtracted#block 13: 0.95||You're among eriends now. you're safe!\n",
+ "ResultOCRExtracted#block 14: 0.91||Continued af ext page\n",
+ "\n",
+ " ---------- Padded 8, extracted ----------\n",
+ "ResultOCRExtracted#block 00: 0.94||Enbonered by great snarled cypress trees, the ancient nmanor stands alone on the outskirts of new orleans, kept tipy by a whi te-haired old man known only as 8ambli .\n",
+ "ResultOCRExtracted#block 01: 0.98||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until their. master comes home --\n",
+ "ResultOCRExtracted#block 02: 0.70||And one in fee wolld appear.\n",
+ "ResultOCRExtracted#block 03: 0.86||Bambl ~~ we have a guest.\n",
+ "ResultOCRExtracted#block 04: 0.64||=~and tonight, he comes slamming open urgently,\n",
+ "ResultOCRExtracted#block 05: 0.82||Tell me master... how may bambli serve't\n",
+ "ResultOCRExtracted#block 06: 0.91||Some blankets to keep her warm, banbli-~ and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.75||The the old mans fades down the hall sire\n",
+ "ResultOCRExtracted#block 08: 0.96||How curious the whims of fate . had i not chanced to stroll along the river tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.77||Allckry as t can, master.\n",
+ "ResultOCRExtracted#block 10: 0.98||~-the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.94||Ghede has been generous. the peath god has given the girl a second chance po\n",
+ "ResultOCRExtracted#block 12: 0.74||Cas gr theres othing to scream pps hore .\n",
+ "ResultOCRExtracted#block 13: 0.42||You're safe § r\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ " ---------- Padded 8, dilation 1 ----------\n",
+ "ResultOCRExtracted#block 00: 0.61||Outskirts of new orleans, kept tipy by a white-haired old man known only as sams .\n",
+ "ResultOCRExtracted#block 01: 0.88||The house and the old man are alike in many ways, tall, proud, nt, contented live walt gtie their, master comes home -=\n",
+ "ResultOCRExtracted#block 02: 0.97||And one in need of some help, it wolld appear .\n",
+ "ResultOCRExtracted#block 03: 0.78||Bambli ~~ we have a gliest.\n",
+ "ResultOCRExtracted#block 04: 0.79||=and tonight, he comes most slamming open the front\n",
+ "ResultOCRExtracted#block 05: 0.86||Tell me, master: how may bambli serve 7\n",
+ "ResultOCRExtracted#block 06: 0.85||Gone blankets to keep her. warm, bambli-~ and perhaps some dry\n",
+ "ResultOCRExtracted#block 07: 0.73||The old man's footsteps the hall as.re\n",
+ "ResultOCRExtracted#block 08: 0.94||How curious the whims of fate . had i not chanced to stroll along the r/| ver tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.68||Aulckly as t can,\n",
+ "ResultOCRExtracted#block 10: 0.95||~<the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.75||Ee lh boe ene. the death gop has the girl. a second chance\n",
+ "ResultOCRExtracted#block 12: 0.92||Easy, girl--there's nothing to scream abolit anymo!\n",
+ "ResultOCRExtracted#block 13: 0.97||You're among friends now. you're safe!\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ " ---------- Pad 8, fract. 0.5 ----------\n",
+ "ResultOCRExtracted#block 00: 0.94||Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi te-haired old man known only as b8ambl .\n",
+ "ResultOCRExtracted#block 01: 0.97||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until. their. master comes home ~~\n",
+ "ResultOCRExtracted#block 02: 0.78||And one in eee pe would appear.\n",
+ "ResultOCRExtracted#block 03: 0.86||Bambl ~~ we have a guest.\n",
+ "ResultOCRExtracted#block 04: 0.64||=~and tonight, he comes slamming open urgently,\n",
+ "ResultOCRExtracted#block 05: 0.82||Tell me master... how may bambli serve'7\n",
+ "ResultOCRExtracted#block 06: 0.94||Some blankets to keep her. warm, bambli-~ and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.73||The the old mans fades donn the hall sire\n",
+ "ResultOCRExtracted#block 08: 0.96||How curious the whims of fate . had i not chanced to stroll along the river tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.81||Aulckry as t can, master.\n",
+ "ResultOCRExtracted#block 10: 0.98||~-the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.92||Ghede hag been generous. the peath god has given the girl a second chance po\n",
+ "ResultOCRExtracted#block 12: 0.76||Cas srl theres othing to scream seoit hore .\n",
+ "ResultOCRExtracted#block 13: 0.42||You're safe 4 ’\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ " ---------- Pad 8, fract. 0.2 ----------\n",
+ "ResultOCRExtracted#block 00: 0.94||Enbonered by great snarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tipy by a whi te-haired old man known only as b8ambl .\n",
+ "ResultOCRExtracted#block 01: 0.97||The house and the old man are alike in many ways; tall, proud, patient, contented always t° wait until their. master comes home ~~\n",
+ "ResultOCRExtracted#block 02: 0.77||And one in eet sve would appear.\n",
+ "ResultOCRExtracted#block 03: 0.86||Bambl ~~ we have a guest.\n",
+ "ResultOCRExtracted#block 04: 0.64||=~and tonight, he comes slamming open urgently,\n",
+ "ResultOCRExtracted#block 05: 0.82||Tell me master... how may bambli serve'7\n",
+ "ResultOCRExtracted#block 06: 0.94||Some blankets to keep her, warm, bambli-~ and perhaps. some dry clothes\n",
+ "ResultOCRExtracted#block 07: 0.75||The the old mans fades down the hall sere\n",
+ "ResultOCRExtracted#block 08: 0.96||How curious the whims of fate . had i not chanced to stroll along the river tonight-~\n",
+ "ResultOCRExtracted#block 09: 0.81||Aulckry as t can, master.\n",
+ "ResultOCRExtracted#block 10: 0.95||~<the girl would most surely be dead by now.\n",
+ "ResultOCRExtracted#block 11: 0.94||Ghede has been generous. the ceath god has given the girl a second chance po\n",
+ "ResultOCRExtracted#block 12: 0.67||Yi renee othing to scream seat anhore .\n",
+ "ResultOCRExtracted#block 13: 0.93||Youre among eriends now. you're safe!\n",
+ "ResultOCRExtracted#block 14: 0.83||| continued af ext page\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
+ "text/plain": [
+ "---------- Initial box ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.90\u001b[0m||Eneonered by great gnarled cypress jrfes, the ancient manor stands alone on the outski\u001b[1m)\u001b[0m \u001b[1;36m2\u001b[0m of mew ce eans, kept tidy by a white-haired old man known only as bambs, \u001b[1;36m3\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.93\u001b[0m||The house and the old man are alike in many ways; tall, prolid, patient, contented always \u001b[1;36m0\u001b[0m wait until. their. master cones mome ~~\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.70\u001b[0m||“and one in ee would appear.\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.62\u001b[0m||Re bambli-~ we have a\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.70\u001b[0m||Tonight, he comes noost slamming open the caken\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.82\u001b[0m||Tell me naster. how may bambli serve \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.56\u001b[0m||£\u001b[1;36m7\u001b[0m » and perhaps some dry clothes\u001b[33m...\u001b[0m \u001b[1;36m7\u001b[0m \u001b[35m/\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.81\u001b[0m||The the old man's fades down the hall as\u001b[33m...\u001b[0m \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.85\u001b[0m||How curious the \u001b[1;36m4\u001b[0m fate. whims of had t not chanced to stroll along the river yl tonight ==\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.80\u001b[0m||Fas oulckly as t ca, master.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.91\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.50\u001b[0m||Aulckly “master as t can,\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.94\u001b[0m|| need of some help, it would appear .\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.88\u001b[0m||\" bambl-- we have a guest.\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.72\u001b[0m||~~and tonight, he comes urgently, slanming open\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.86\u001b[0m||Tell me, master: how may bambli serve \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.90\u001b[0m||Some blankets to keep her. warm, bambli-- and perhaps some dry \\ clothes-\u001b[1;36m-7\u001b[0m \u001b[35m/\u001b[0m\u001b[95m.\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.06\u001b[0m||As.\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.91\u001b[0m||How curious the d 7e . whims of fa; had i not chanced to stroll along the river tonight--\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.55\u001b[0m||Ickl as t can\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.00\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.85\u001b[0m||Ghede has been generous. the death god has gen + the girl. a second chance te oe ato\" pd ate\n",
+ "ResultOCR#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.95\u001b[0m||Easy, girl--there's | nothing to scream about anyaore.\n",
+ "ResultOCR#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.97\u001b[0m||You're among friends now. you're safe!\n",
+ "ResultOCR#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.54\u001b[0m||“continued a\n",
+ "\n",
+ " ---------- Padded 4px ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.88\u001b[0m||Enbonered by great gnarled cypress jrfes, the ancient manor stands alone on the eit of mew rce: eans, kept tipy by a white-haired ao lo man known only as\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.93\u001b[0m||The house and the oldman are alike in many ways; tall, proud, patient, contented a ways \u001b[1;36m0\u001b[0m wait until their. aster comes home ~~ | \u001b[1m}\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.69\u001b[0m||F and one in ee would appear.\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.77\u001b[0m||\" bambli-— we have a gliest.\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.55\u001b[0m||P comes slamming open the caken\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.57\u001b[0m||Tel oe er-- \u001b[1;36m5\u001b[0m ow a = \u001b[1;36m7\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.38\u001b[0m||We and perhaps c oe \u001b[35m/\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.75\u001b[0m||The the old mans fades down the hall sra\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.92\u001b[0m||How curious the a whims of fate . - had i not chanced to stroll along the river yl tonight~-\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.79\u001b[0m||Aulckly as t can, ‘masrer.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.92\u001b[0m||---------- Initial box ----------\n",
+ "ResultOCR#block 00: 0.90||Suddenly.\n",
+ "ResultOCR#block 01: 0.81||>gasp!z everything's w- whirling around me!t can't stand sr rea\n",
+ "ResultOCR#block 02: 0.81||Clark!i'm falling' “help! help! —\n",
+ "ResultOCR#block 03: 0.67||I-i'm bs \"passing ohhh\n",
+ "ResultOCR#block 04: 0.92||Action comics\n",
+ "ResultOCR#block 05: 1.00||Then, seconds later...\n",
+ "ResultOCR#block 06: 0.89||Great caesars ghost! { this /s black magic! we've been transportel to the weirdest world, tit ever saw!\n",
+ "ResultOCR#block 07: 0.91||...|t certainly isn't our earth, perry. look at the size of those bees.\n",
+ "ResultOCR#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCR#block 09: 0.73||Owwww.\n",
+ "ResultOCR#block 10: 0.85||Yet the bee's stinger went ws. right through my uniform and penetrated my skin! that means. the fabric of pe ay costume has become dj} satie fued 1,\n",
+ "ResultOCR#block 11: 0.87||Hurry. let's beat it before we get stung, foo = aii\n",
+ "ResultOCR#block 12: 0.86||Ggreat guns!...2g, .pa/n i feel n pain! as superman, i should be > invulnerable! 1 have unbreakabl ‘skin! under my clark kent clothes, im wearing an woestructisle superman uniform ! =\n",
+ "ResultOCR#block 13: 1.00||Abruptly...\n",
+ "ResultOCR#block 14: 0.77||Great caesar's ghost. he's spinning a web of g/ant, sr strands --\n",
+ "ResultOCR#block 15: 0.89||I-i feel the heat of the sun...the pain of the bee-sting ... the heavy weight of my pack! every human discomfort... good grief! i've lost all my super-powers! i've become an ordinary mortal in this world. /\n",
+ "ResultOCR#block 16: 0.84||Enormous spider- like creature is going berserk, as if the sight of us excited him (into mad spinning get back! that 2\n",
+ "\n",
+ " ---------- Default ----------\n",
+ "ResultOCR#block 00: 1.00||Suddenly...\n",
+ "ResultOCR#block 01: 0.80||Gasp! everything's w- whi rns atound me!i can't stand we a\n",
+ "ResultOCR#block 02: 0.87||Clark!i'm falling! help! help\n",
+ "ResultOCR#block 03: 0.45||I-i'm se ou\n",
+ "ResultOCR#block 04: 0.92||Action comics\n",
+ "ResultOCR#block 05: 0.95||Then, seconds later.\n",
+ "ResultOCR#block 06: 0.92||Great caesar's ghost! this /s black magic! we've been transported, to the weirdest world, i ever saw!\n",
+ "ResultOCR#block 07: 0.93||...|t certainly isn't our earth, perry look at the size of those bees!\n",
+ "ResultOCR#block 08: 0.88||Watch out, clark)\n",
+ "ResultOCR#block 09: 0.67||Owwww.,\n",
+ "ResultOCR#block 10: 0.97||Yet the bee's stinger went = right through my uniform ando penetrated my skin! that mean the fabric of my superman costume has become ordinary, cloth! £\n",
+ "ResultOCR#block 11: 0.89||Hurry. let's beat it before we get stung, tod yi\n",
+ "ResultOCR#block 12: 0.88||Ggreat guns!...2gasp/z...pain i feel n fan! as superman, i should be invulnerable! 1 have unbreakable skin! under my clark kent clothes, i'm wearing an /noestruct/ele superman uniform !\n",
+ "ResultOCR#block 13: 1.00||Abruptly...\n",
+ "ResultOCR#block 14: 0.76||Great caesars ° ghost! he's spinning & web of g/a, silk strands\n",
+ "ResultOCR#block 15: 0.89||I-i feel the heat of the sum...the pain of the bee-sting... the heavy weight of my { pack! every human discomfort... good grief! i've lost all my super -powersx ve become an ordinary mortal in this world! j\n",
+ "ResultOCR#block 16: 0.70||Like creature is going berserk, as if the sight of us excited him into mad spinning, get back! that \\ enormous spider-\n",
+ "\n",
+ " ---------- Default, grey pad ----------\n",
+ "ResultOCR#block 00: 0.78||Sudden.\n",
+ "ResultOCR#block 01: 0.82||5gasp!z everything's | w- whirling around] me!t can't stand ue a\n",
+ "ResultOCR#block 02: 0.57||Ark!i'm falling\n",
+ "ResultOCR#block 03: 0.85||I-t'm eq \"passing out... ohhh.\n",
+ "ResultOCR#block 04: 0.92||Action comics\n",
+ "ResultOCR#block 05: 0.00||\n",
+ "ResultOCR#block 06: 0.88||‘great caesar's ghost! || this /§ black magic! we've been transportel to the weirdest world, i ever saw.\n",
+ "ResultOCR#block 07: 0.87||\"...it certainly isn't our | earth, perry look at the| \\s|ze of those bees.\n",
+ "ResultOCR#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCR#block 09: 0.73||Owwww.\n",
+ "ResultOCR#block 10: 0.90||Yet the bee's stinger went ~ right through my uniform and penetrated my skin! that means. the fabric of sera ry costume vis become din a s clots! a\n",
+ "ResultOCR#block 11: 0.80||Hurry. let's ) beat it before] we get stung, k 00! __j\n",
+ "ResultOCR#block 12: 0.86||Ggreat guns!...3gasp/=... rain t feel fan! as superman, i should be invulnerable® 1 have unbreakabli skin! under my clark kent clothes, ia wearing an /noestryct/iele supe iperman uniform !\n",
+ "ResultOCR#block 13: 0.90||Abruptly.\n",
+ "ResultOCR#block 14: 0.92||Great caesar's | ghost! he's spinning k web of giant, silk strands -- as tough as steel! ne]\n",
+ "ResultOCR#block 15: 0.89||\\i-t feel the heat of the sun...the pain of the bee-sting... the heavy weight of my & pack! every human discomfort... good grief! i've lost all my super-powers i've become an ordinary mortal in this world. /.\n",
+ "ResultOCR#block 16: 0.94||(get back! that enormous spider- like creature 1 § going berserk, as if the sight of us excited him into mad spinning]\n",
+ "\n",
+ " ---------- Padded 4px ----------\n",
+ "ResultOCR#block 00: 0.90||Suddenly.\n",
+ "ResultOCR#block 01: 0.83||Fgasp/= everything's whirling around me!i can't stand ea\n",
+ "ResultOCR#block 02: 0.68||Ie lottie eto clark!i'm falling help! help!\n",
+ "ResultOCR#block 03: 0.55||I-i'm yr a ohh hh.\n",
+ "ResultOCR#block 04: 0.92||Action comics\n",
+ "ResultOCR#block 05: 0.74||Then, seconds\n",
+ "ResultOCR#block 06: 0.62||Great caesar's ghost! \\ this /§ black magic! i ever saw!\n",
+ "ResultOCR#block 07: 0.87||T certainly isn't our earth, perry! look at the \\s|ze of those bees.\n",
+ "ResultOCR#block 08: 0.76||#3 watch out, clark!\n",
+ "ResultOCR#block 09: 0.73||Owwww.\n",
+ "ResultOCR#block 10: 0.95||\"yet the bee's stinger went ~~ right through my uniform and enetrated my skin! that means. the fabric of my superman costume has become ordinary, clot! &\n",
+ "ResultOCR#block 11: 0.86||Hurry! let's beat it b=fore we get stung, ¢ tool: puma\n",
+ "ResultOCR#block 12: 0.86||‘ggreat guns!...3gasp/5... pain i feel pain? as superman, i should be invilnerable | t ne unbreakasle gkin! under my clark kent clothes, tih wearing an indestructible superman uniform\n",
+ "ResultOCR#block 13: 1.00||Abruptly...\n",
+ "ResultOCR#block 14: 0.91||Great caesars ghost. he's spinning a web of g/ant, | silk strands as tough as steel!\n",
+ "ResultOCR#block 15: 0.89||I-i feel the heat of the sun...the pain of the bee-sting... the heavy weight of my { pack! every human discomfort... good grief! i've lost all \\my super-powersx ve become an ordinary mortal in this world.\n",
+ "ResultOCR#block 16: 0.81||(get back! that enormous spider- as if the sight of us excited him [into mad spinning\n",
+ "\n",
+ " ---------- Padded 8px ----------\n",
+ "ResultOCR#block 00: 0.00||\n",
+ "ResultOCR#block 01: 0.76||Ega5p/% everything s\\ w- whirling around me! t can't stand ue... slee\n",
+ "ResultOCR#block 02: 0.41||\\ help! help’\n",
+ "ResultOCR#block 03: 0.86||-i'm passing qut... ohh hh.\n",
+ "ResultOCR#block 04: 0.92||Action comics\n",
+ "ResultOCR#block 05: 0.00||\n",
+ "ResultOCR#block 06: 0.18||I ever saw/\n",
+ "ResultOCR#block 07: 0.86||It certainly isn't our earth, perry! poor a the size of thos!\n",
+ "ResultOCR#block 08: 0.62||“watch out” ial ae clark!\n",
+ "ResultOCR#block 09: 0.73||Owwww.\n",
+ "ResultOCR#block 10: 0.98||Yet the bee's stinger went right through my uniform and \\penetrated my skin! that means. the fabric of my superman costume has become ordinar! cloth!\n",
+ "ResultOCR#block 11: 0.93||Hurry. let's beat it before we get stung, too!\n",
+ "ResultOCR#block 12: 0.88||Ggreat guns!...2gasp/z...pain i feel fan! as superman, i should be invulnerable! 1 have unbreakabl skin! under my clark kent clothes, i'm wearing an /noestruct/ele $ superman uniform ! &\n",
+ "ResultOCR#block 13: 0.00||\n",
+ "ResultOCR#block 14: 0.91||Great caesai ghost. he's spinning a web of g/ant, silk strands -- as tough as steel!\n",
+ "ResultOCR#block 15: 0.88||I-i feel the heat of the sun...the pain of the bee-sting... the heavy weight of my ! every human discomfort... good grief! i've lost all \\my super-powers! ‘ve become an ordinary mortal in this world. x\n",
+ "ResultOCR#block 16: 0.60||As if the sight of us excited him {into mad spinning\n",
+ "\n",
+ " ---------- Extracted, init box ----------\n",
+ "ResultOCRExtracted#block 00: 0.90||Suddenly.\n",
+ "ResultOCRExtracted#block 01: 0.83||Fgasp!z everything § w- whirling around me!i can't stand ue.\n",
+ "ResultOCRExtracted#block 02: 0.89||Clark!i'm falling! help! help!\n",
+ "ResultOCRExtracted#block 03: 0.71||I-i'm passing ohhh?\n",
+ "ResultOCRExtracted#block 04: 0.92||Action comics\n",
+ "ResultOCRExtracted#block 05: 0.98||Then, seconds later..\n",
+ "ResultOCRExtracted#block 06: 0.91||Great caesar's ghost! this /s black magic! we've been transportel to the weirdest world tit ever saw.\n",
+ "ResultOCRExtracted#block 07: 0.90||...it certainly isnt our earth, perry. look at the size of those bees.\n",
+ "ResultOCRExtracted#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCRExtracted#block 09: 0.62||Oowwwww.\n",
+ "ResultOCRExtracted#block 10: 0.99||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block 11: 0.93||Hurry. let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block 12: 0.88||Ggreat guns!...2g/ pain t feel pain? as superman, i should be invulnerable! 1 have unbreakabl skin! under my clark kent clothes, i'm wearing an /noestruct/ble superman uniform *\n",
+ "ResultOCRExtracted#block 13: 0.96||Abruptly. ..\n",
+ "ResultOCRExtracted#block 14: 0.80||Great caesar's ghost. he's spinning a web of g/ant, silk strands --\n",
+ "ResultOCRExtracted#block 15: 0.89||I-i feel the heat of the sun. the pain of the bee-sting... the heavy weight of my pack! every human discomfort... good grief! t've lost all my super-powers. ve become an ordinary mortal im this world.\n",
+ "ResultOCRExtracted#block 16: 0.97||Get back. that enormous spider- like creature is going berserk, as if the sight of us excited him into mad spinning\n",
+ "\n",
+ " ---------- Padded 4, extracted ----------\n",
+ "ResultOCRExtracted#block 00: 1.00||Suddenly...\n",
+ "ResultOCRExtracted#block 01: 0.83||Sgasp/z everything s w- whirling around me!i can't stand ue.\n",
+ "ResultOCRExtracted#block 02: 0.87||Clark! i'm falling’ help! help!\n",
+ "ResultOCRExtracted#block 03: 0.78||I-i'm passing ohhhh.\n",
+ "ResultOCRExtracted#block 04: 0.92||Action comics\n",
+ "ResultOCRExtracted#block 05: 0.98||Then, seconds later..\n",
+ "ResultOCRExtracted#block 06: 0.93||Great caesar's ghost! this as black magic! we've been transported to the weirdest world i ever saw!\n",
+ "ResultOCRExtracted#block 07: 0.90||...it certainly isnt our earth, perry. look at the size of those bees.\n",
+ "ResultOCRExtracted#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCRExtracted#block 09: 0.67||Owwww.,\n",
+ "ResultOCRExtracted#block 10: 0.99||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block 11: 0.96||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block 12: 0.89||Ggreat guns!...2gasp/:... pain t feel fan! as superman, i should be invulnerable! 1 have unbreakabl skin! under my clark kent clothes, i'm wearing an /noestruct/ble superman uniform !\n",
+ "ResultOCRExtracted#block 13: 1.00||Abruptly...\n",
+ "ResultOCRExtracted#block 14: 0.94||Great caesar's ghost! he's spinning a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block 15: 0.89||I-i feel the heat of the sun., the pain of the bee-sting... the heavy weight of my pack! every human discomfort... good grief! t've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block 16: 0.97||Get back! that enormous spider- like creature is going berserk, 25 if the sight of us excited him into mad spinning\n",
+ "\n",
+ " ---------- Padded 8, extracted ----------\n",
+ "ResultOCRExtracted#block 00: 0.91||Suddemly...\n",
+ "ResultOCRExtracted#block 01: 0.85||Sgasp/z everything s w- whirling around me!i can't stand up.\n",
+ "ResultOCRExtracted#block 02: 0.84||Clark! i'm falling’ _ help! help!\n",
+ "ResultOCRExtracted#block 03: 0.73||I-i'm passing ohhha.\n",
+ "ResultOCRExtracted#block 04: 0.92||Action comics\n",
+ "ResultOCRExtracted#block 05: 1.00||Then, seconds later...\n",
+ "ResultOCRExtracted#block 06: 0.90||Great caesar's ghost! f this /§ black magic! : we've been transported to the weirdest world i ever saw.\n",
+ "ResultOCRExtracted#block 07: 0.90||...it certainly isnt our earth, perry. look at the size of those bees.\n",
+ "ResultOCRExtracted#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCRExtracted#block 09: 0.67||Owwww.,\n",
+ "ResultOCRExtracted#block 10: 0.99||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block 11: 0.96||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block 12: 0.89||Ggreat guns!...2gasp/:... pain t feel fan! as superman, i should be invulnerable! 1 have unbreakabl skin! under my clark kent clothes, i'm wearing an /noestruct/ble superman uniform !\n",
+ "ResultOCRExtracted#block 13: 0.96||Abruptly. ..\n",
+ "ResultOCRExtracted#block 14: 0.93||Great caesar's ghost! he's spinning _ a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block 15: 0.89||I-t feel the heat of the sun., the pain of the bee-sting... the heavy weight of my pack! every human discomfort... good grief! t've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block 16: 0.95||Get back! that enormous spider- like creature is going berserk, 25 if the sight of us excited him into mad spinning g [\n",
+ "\n",
+ " ---------- Padded 8, dilation 1 ----------\n",
+ "ResultOCRExtracted#block 00: 1.00||Suddenly...\n",
+ "ResultOCRExtracted#block 01: 0.83||Gasp! everything s w-whirling around wel] i can't stand\n",
+ "ResultOCRExtracted#block 02: 0.86||Clark!i'm falling! . help! help!\n",
+ "ResultOCRExtracted#block 03: 0.73||I-i'm passing ohhha.\n",
+ "ResultOCRExtracted#block 04: 0.92||Action comics\n",
+ "ResultOCRExtracted#block 05: 0.95||Then, seconds later.\n",
+ "ResultOCRExtracted#block 06: 0.91||Great caesar's ghost! e this /5 black magic! we've been transported to the weirdest world i ever saw/\n",
+ "ResultOCRExtracted#block 07: 0.91||...|it certainly isnt our earth, perry” look at the size of those bees!\n",
+ "ResultOCRExtracted#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCRExtracted#block 09: 0.73||Owwww.\n",
+ "ResultOCRExtracted#block 10: 0.94||Yet the bee's stinger we right through my tform and penetrated my skin! that means. the fabric of my superman othe has become ordinary clot?\n",
+ "ResultOCRExtracted#block 11: 0.96||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block 12: 0.82||Fain! as superman, i should be invulnerable! 1 have unbreakabl skin! under my clark kent clothes, i'm wearing an indestructible superman uniform !\n",
+ "ResultOCRExtracted#block 13: 0.00||\n",
+ "ResultOCRExtracted#block 14: 0.67||Great caesars ghost! he's spinning _ a web of g/ant,\n",
+ "ResultOCRExtracted#block 15: 0.86||I-i feel the heat of the sun. the pain of the bee-sting... the heavy weight of my pack! every human discomfort... good grief! i've lost all my super-powers. ve ie an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block 16: 0.96||Get back! that enormous spider- like creature is going berserk, as if the sight of us excited him into mad spinning’ g [\n",
+ "\n",
+ " ---------- Pad 8, fract. 0.5 ----------\n",
+ "ResultOCRExtracted#block 00: 0.91||Suddemly...\n",
+ "ResultOCRExtracted#block 01: 0.87||Gasp/z everything s w- whirling around me!i can't stand up.\n",
+ "ResultOCRExtracted#block 02: 0.84||Clark! i'm falling’ _ help! help!\n",
+ "ResultOCRExtracted#block 03: 0.78||I-i'm passing ohhhh.\n",
+ "ResultOCRExtracted#block 04: 0.92||Action comics\n",
+ "ResultOCRExtracted#block 05: 1.00||Then, seconds later...\n",
+ "ResultOCRExtracted#block 06: 0.90||Great caesar's ghost! f this /§ black magic! : we've been transported to the weirdest world i ever saw.\n",
+ "ResultOCRExtracted#block 07: 0.92||...it certainly isnt our earth, perry! look at the size of those bees.\n",
+ "ResultOCRExtracted#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCRExtracted#block 09: 0.67||Owwww.,\n",
+ "ResultOCRExtracted#block 10: 0.99||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block 11: 0.96||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block 12: 0.90||Ggreat guns!...2gasp/:...pain i feel fan! as superman, i should be invulnerable! 1 have unbreakabl skin! under my clark kent clothes, i'm wearing an /nodestruct/ible superman uniform !\n",
+ "ResultOCRExtracted#block 13: 0.96||Abruptly. ..\n",
+ "ResultOCRExtracted#block 14: 0.93||Great caesar's ghost! he's spinning _ a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block 15: 0.89||I-i feel the heat of the sun. the pain of the bee-sting... the heavy weight of my pack! every human discomfort... good grief! i've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block 16: 0.95||Get back! that enormous spider- like creature is going berserk, 25 if the sight of us excited him into mad spinning g [\n",
+ "\n",
+ " ---------- Pad 8, fract. 0.2 ----------\n",
+ "ResultOCRExtracted#block 00: 1.00||Suddenly...\n",
+ "ResultOCRExtracted#block 01: 0.87||Gasp/z everything s w- whirling around me!i can't stand up.\n",
+ "ResultOCRExtracted#block 02: 0.84||Clark! i'm falling’ _ help! help!\n",
+ "ResultOCRExtracted#block 03: 0.73||I-i'm passing ohhha.\n",
+ "ResultOCRExtracted#block 04: 0.92||Action comics\n",
+ "ResultOCRExtracted#block 05: 1.00||Then, seconds later...\n",
+ "ResultOCRExtracted#block 06: 0.90||Great caesar's ghost! f this /§ black magic! : we've been transported to the weirdest world i ever saw!\n",
+ "ResultOCRExtracted#block 07: 0.92||...it certainly isnt our earth, perry! look at the size of those bees.\n",
+ "ResultOCRExtracted#block 08: 0.88||Watch out, clark!\n",
+ "ResultOCRExtracted#block 09: 0.67||Owwww.,\n",
+ "ResultOCRExtracted#block 10: 0.99||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block 11: 0.96||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block 12: 0.91||Ggreat guns!...2gasp/:... pain i feel fan! as superman, i should be invulnerable! 1 have unbreakabl skin! under my clark kent clothes, i'm wearing an indestructible superman uniform !\n",
+ "ResultOCRExtracted#block 13: 0.96||Abruptly....\n",
+ "ResultOCRExtracted#block 14: 0.93||Great caesar's ghost! he's spinning _ a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block 15: 0.89||I-i feel the heat of the sun., the pain of the bee-sting... the heavy weight of my pack! every human discomfort... good grief! i've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block 16: 0.96||Get back! that enormous spider- like creature is going berserk, as if the sight of us excited him into mad spinning gs [\n",
+ "\n",
+ "\n",
+ "\n"
+ ],
+ "text/plain": [
+ "---------- Initial box ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.90\u001b[0m||Suddenly.\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.81\u001b[0m||>gasp!z everything's w- whirling around me!t can't stand sr rea\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.81\u001b[0m||Clark!i'm falling' “help! help! —\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.67\u001b[0m||I-i'm bs \"passing ohhh\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m1.00\u001b[0m||Then, seconds later\u001b[33m...\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.89\u001b[0m||Great caesars ghost! \u001b[1m{\u001b[0m this \u001b[35m/\u001b[0m\u001b[95ms\u001b[0m black magic! we've been transportel to the weirdest world, tit ever saw!\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.91\u001b[0m||\u001b[33m...\u001b[0m|t certainly isn't our earth, perry. look at the size of those bees.\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.73\u001b[0m||Owwww.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.85\u001b[0m||Yet the bee's stinger went ws. right through my uniform and penetrated my skin! that means. the fabric of pe ay costume has become dj\u001b[1m}\u001b[0m satie fued \u001b[1;36m1\u001b[0m,\n",
+ "ResultOCR#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.87\u001b[0m||Hurry. let's beat it before we get stung, foo = aii\n",
+ "ResultOCR#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.86\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2g, .pa/n i feel n pain! as superman, i should be > invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl ‘skin! under my clark kent clothes, im wearing an woestructisle superman uniform ! =\n",
+ "ResultOCR#block \u001b[1;36m13\u001b[0m: \u001b[1;36m1.00\u001b[0m||Abruptly\u001b[33m...\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.77\u001b[0m||Great caesar's ghost. he's spinning a web of g/ant, sr strands --\n",
+ "ResultOCR#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-i feel the heat of the sun\u001b[33m...\u001b[0mthe pain of the bee-sting \u001b[33m...\u001b[0m the heavy weight of my pack! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all my super-powers! i've become an ordinary mortal in this world. \u001b[35m/\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.84\u001b[0m||Enormous spider- like creature is going berserk, as if the sight of us excited him \u001b[1m(\u001b[0minto mad spinning get back! that \u001b[1;36m2\u001b[0m\n",
+ "\n",
+ " ---------- Default ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m1.00\u001b[0m||Suddenly\u001b[33m...\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.80\u001b[0m||Gasp! everything's w- whi rns atound me!i can't stand we a\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.87\u001b[0m||Clark!i'm falling! help! help\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.45\u001b[0m||I-i'm se ou\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.95\u001b[0m||Then, seconds later.\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.92\u001b[0m||Great caesar's ghost! this \u001b[35m/\u001b[0m\u001b[95ms\u001b[0m black magic! we've been transported, to the weirdest world, i ever saw!\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.93\u001b[0m||\u001b[33m...\u001b[0m|t certainly isn't our earth, perry look at the size of those bees!\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark\u001b[1m)\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.67\u001b[0m||Owwww.,\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.97\u001b[0m||Yet the bee's stinger went = right through my uniform ando penetrated my skin! that mean the fabric of my superman costume has become ordinary, cloth! £\n",
+ "ResultOCR#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.89\u001b[0m||Hurry. let's beat it before we get stung, tod yi\n",
+ "ResultOCR#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.88\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2gasp/z\u001b[33m...\u001b[0mpain i feel n fan! as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakable skin! under my clark kent clothes, i'm wearing an \u001b[35m/noestruct/\u001b[0m\u001b[95mele\u001b[0m superman uniform !\n",
+ "ResultOCR#block \u001b[1;36m13\u001b[0m: \u001b[1;36m1.00\u001b[0m||Abruptly\u001b[33m...\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.76\u001b[0m||Great caesars ° ghost! he's spinning & web of g/a, silk strands\n",
+ "ResultOCR#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-i feel the heat of the sum\u001b[33m...\u001b[0mthe pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my \u001b[1m{\u001b[0m pack! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all my super -powersx ve become an ordinary mortal in this world! j\n",
+ "ResultOCR#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.70\u001b[0m||Like creature is going berserk, as if the sight of us excited him into mad spinning, get back! that \\ enormous spider-\n",
+ "\n",
+ " ---------- Default, grey pad ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.78\u001b[0m||Sudden.\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.82\u001b[0m||5gasp!z everything's | w- whirling around\u001b[1m]\u001b[0m me!t can't stand ue a\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.57\u001b[0m||Ark!i'm falling\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.85\u001b[0m||I-t'm eq \"passing out\u001b[33m...\u001b[0m ohhh.\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.00\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.88\u001b[0m||‘great caesar's ghost! || this \u001b[35m/\u001b[0m§ black magic! we've been transportel to the weirdest world, i ever saw.\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.87\u001b[0m||\"\u001b[33m...\u001b[0mit certainly isn't our | earth, perry look at the| \\s|ze of those bees.\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.73\u001b[0m||Owwww.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.90\u001b[0m||Yet the bee's stinger went ~ right through my uniform and penetrated my skin! that means. the fabric of sera ry costume vis become din a s clots! a\n",
+ "ResultOCR#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.80\u001b[0m||Hurry. let's \u001b[1m)\u001b[0m beat it before\u001b[1m]\u001b[0m we get stung, k \u001b[1;36m00\u001b[0m! __j\n",
+ "ResultOCR#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.86\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m3gasp/=\u001b[33m...\u001b[0m rain t feel fan! as superman, i should be invulnerable® \u001b[1;36m1\u001b[0m have unbreakabli skin! under my clark kent clothes, ia wearing an \u001b[35m/noestryct/\u001b[0m\u001b[95miele\u001b[0m supe iperman uniform !\n",
+ "ResultOCR#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.90\u001b[0m||Abruptly.\n",
+ "ResultOCR#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.92\u001b[0m||Great caesar's | ghost! he's spinning k web of giant, silk strands -- as tough as steel! ne\u001b[1m]\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||\\i-t feel the heat of the sun\u001b[33m...\u001b[0mthe pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my & pack! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all my super-powers i've become an ordinary mortal in this world. \u001b[35m/\u001b[0m\u001b[95m.\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.94\u001b[0m||\u001b[1m(\u001b[0mget back! that enormous spider- like creature \u001b[1;36m1\u001b[0m § going berserk, as if the sight of us excited him into mad spinning\u001b[1m]\u001b[0m\n",
+ "\n",
+ " ---------- Padded 4px ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.90\u001b[0m||Suddenly.\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.83\u001b[0m||Fgasp/= everything's whirling around me!i can't stand ea\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.68\u001b[0m||Ie lottie eto clark!i'm falling help! help!\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.55\u001b[0m||I-i'm yr a ohh hh.\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.74\u001b[0m||Then, seconds\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.62\u001b[0m||Great caesar's ghost! \\ this \u001b[35m/\u001b[0m§ black magic! i ever saw!\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.87\u001b[0m||T certainly isn't our earth, perry! look at the \\s|ze of those bees.\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.76\u001b[0m||#\u001b[1;36m3\u001b[0m watch out, clark!\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.73\u001b[0m||Owwww.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.95\u001b[0m||\"yet the bee's stinger went ~~ right through my uniform and enetrated my skin! that means. the fabric of my superman costume has become ordinary, clot! &\n",
+ "ResultOCR#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.86\u001b[0m||Hurry! let's beat it \u001b[33mb\u001b[0m=\u001b[35mfore\u001b[0m we get stung, ¢ tool: puma\n",
+ "ResultOCR#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.86\u001b[0m||‘ggreat guns!\u001b[33m...\u001b[0m3gasp/\u001b[1;36m5\u001b[0m\u001b[33m...\u001b[0m pain i feel pain? as superman, i should be invilnerable | t ne unbreakasle gkin! under my clark kent clothes, tih wearing an indestructible superman uniform\n",
+ "ResultOCR#block \u001b[1;36m13\u001b[0m: \u001b[1;36m1.00\u001b[0m||Abruptly\u001b[33m...\u001b[0m\n",
+ "ResultOCR#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.91\u001b[0m||Great caesars ghost. he's spinning a web of g/ant, | silk strands as tough as steel!\n",
+ "ResultOCR#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-i feel the heat of the sun\u001b[33m...\u001b[0mthe pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my \u001b[1m{\u001b[0m pack! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all \\my super-powersx ve become an ordinary mortal in this world.\n",
+ "ResultOCR#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.81\u001b[0m||\u001b[1m(\u001b[0mget back! that enormous spider- as if the sight of us excited him \u001b[1m[\u001b[0minto mad spinning\n",
+ "\n",
+ " ---------- Padded 8px ----------\n",
+ "ResultOCR#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.00\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.76\u001b[0m||Ega5p/% everything s\\ w- whirling around me! t can't stand ue\u001b[33m...\u001b[0m slee\n",
+ "ResultOCR#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.41\u001b[0m||\\ help! help’\n",
+ "ResultOCR#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.86\u001b[0m||-i'm passing qut\u001b[33m...\u001b[0m ohh hh.\n",
+ "ResultOCR#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCR#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.00\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.18\u001b[0m||I ever saw/\n",
+ "ResultOCR#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.86\u001b[0m||It certainly isn't our earth, perry! poor a the size of thos!\n",
+ "ResultOCR#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.62\u001b[0m||“watch out” ial ae clark!\n",
+ "ResultOCR#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.73\u001b[0m||Owwww.\n",
+ "ResultOCR#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.98\u001b[0m||Yet the bee's stinger went right through my uniform and \\penetrated my skin! that means. the fabric of my superman costume has become ordinar! cloth!\n",
+ "ResultOCR#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.93\u001b[0m||Hurry. let's beat it before we get stung, too!\n",
+ "ResultOCR#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.88\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2gasp/z\u001b[33m...\u001b[0mpain i feel fan! as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl skin! under my clark kent clothes, i'm wearing an \u001b[35m/noestruct/\u001b[0m\u001b[95mele\u001b[0m $ superman uniform ! &\n",
+ "ResultOCR#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.00\u001b[0m||\n",
+ "ResultOCR#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.91\u001b[0m||Great caesai ghost. he's spinning a web of g/ant, silk strands -- as tough as steel!\n",
+ "ResultOCR#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.88\u001b[0m||I-i feel the heat of the sun\u001b[33m...\u001b[0mthe pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my ! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all \\my super-powers! ‘ve become an ordinary mortal in this world. x\n",
+ "ResultOCR#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.60\u001b[0m||As if the sight of us excited him \u001b[1m{\u001b[0minto mad spinning\n",
+ "\n",
+ " ---------- Extracted, init box ----------\n",
+ "ResultOCRExtracted#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.90\u001b[0m||Suddenly.\n",
+ "ResultOCRExtracted#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.83\u001b[0m||Fgasp!z everything § w- whirling around me!i can't stand ue.\n",
+ "ResultOCRExtracted#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.89\u001b[0m||Clark!i'm falling! help! help!\n",
+ "ResultOCRExtracted#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.71\u001b[0m||I-i'm passing ohhh?\n",
+ "ResultOCRExtracted#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCRExtracted#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.98\u001b[0m||Then, seconds later..\n",
+ "ResultOCRExtracted#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.91\u001b[0m||Great caesar's ghost! this \u001b[35m/\u001b[0m\u001b[95ms\u001b[0m black magic! we've been transportel to the weirdest world tit ever saw.\n",
+ "ResultOCRExtracted#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.90\u001b[0m||\u001b[33m...\u001b[0mit certainly isnt our earth, perry. look at the size of those bees.\n",
+ "ResultOCRExtracted#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCRExtracted#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.62\u001b[0m||Oowwwww.\n",
+ "ResultOCRExtracted#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.99\u001b[0m||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.93\u001b[0m||Hurry. let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.88\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2g/ pain t feel pain? as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl skin! under my clark kent clothes, i'm wearing an \u001b[35m/noestruct/\u001b[0m\u001b[95mble\u001b[0m superman uniform *\n",
+ "ResultOCRExtracted#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.96\u001b[0m||Abruptly. ..\n",
+ "ResultOCRExtracted#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.80\u001b[0m||Great caesar's ghost. he's spinning a web of g/ant, silk strands --\n",
+ "ResultOCRExtracted#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-i feel the heat of the sun. the pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my pack! every human discomfort\u001b[33m...\u001b[0m good grief! t've lost all my super-powers. ve become an ordinary mortal im this world.\n",
+ "ResultOCRExtracted#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.97\u001b[0m||Get back. that enormous spider- like creature is going berserk, as if the sight of us excited him into mad spinning\n",
+ "\n",
+ " ---------- Padded \u001b[1;36m4\u001b[0m, extracted ----------\n",
+ "ResultOCRExtracted#block \u001b[1;36m00\u001b[0m: \u001b[1;36m1.00\u001b[0m||Suddenly\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.83\u001b[0m||Sgasp/z everything s w- whirling around me!i can't stand ue.\n",
+ "ResultOCRExtracted#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.87\u001b[0m||Clark! i'm falling’ help! help!\n",
+ "ResultOCRExtracted#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.78\u001b[0m||I-i'm passing ohhhh.\n",
+ "ResultOCRExtracted#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCRExtracted#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.98\u001b[0m||Then, seconds later..\n",
+ "ResultOCRExtracted#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.93\u001b[0m||Great caesar's ghost! this as black magic! we've been transported to the weirdest world i ever saw!\n",
+ "ResultOCRExtracted#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.90\u001b[0m||\u001b[33m...\u001b[0mit certainly isnt our earth, perry. look at the size of those bees.\n",
+ "ResultOCRExtracted#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCRExtracted#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.67\u001b[0m||Owwww.,\n",
+ "ResultOCRExtracted#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.99\u001b[0m||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.96\u001b[0m||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.89\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2gasp/:\u001b[33m...\u001b[0m pain t feel fan! as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl skin! under my clark kent clothes, i'm wearing an \u001b[35m/noestruct/\u001b[0m\u001b[95mble\u001b[0m superman uniform !\n",
+ "ResultOCRExtracted#block \u001b[1;36m13\u001b[0m: \u001b[1;36m1.00\u001b[0m||Abruptly\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.94\u001b[0m||Great caesar's ghost! he's spinning a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-i feel the heat of the sun., the pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my pack! every human discomfort\u001b[33m...\u001b[0m good grief! t've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.97\u001b[0m||Get back! that enormous spider- like creature is going berserk, \u001b[1;36m25\u001b[0m if the sight of us excited him into mad spinning\n",
+ "\n",
+ " ---------- Padded \u001b[1;36m8\u001b[0m, extracted ----------\n",
+ "ResultOCRExtracted#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.91\u001b[0m||Suddemly\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.85\u001b[0m||Sgasp/z everything s w- whirling around me!i can't stand up.\n",
+ "ResultOCRExtracted#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.84\u001b[0m||Clark! i'm falling’ _ help! help!\n",
+ "ResultOCRExtracted#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.73\u001b[0m||I-i'm passing ohhha.\n",
+ "ResultOCRExtracted#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCRExtracted#block \u001b[1;36m05\u001b[0m: \u001b[1;36m1.00\u001b[0m||Then, seconds later\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.90\u001b[0m||Great caesar's ghost! f this \u001b[35m/\u001b[0m§ black magic! : we've been transported to the weirdest world i ever saw.\n",
+ "ResultOCRExtracted#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.90\u001b[0m||\u001b[33m...\u001b[0mit certainly isnt our earth, perry. look at the size of those bees.\n",
+ "ResultOCRExtracted#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCRExtracted#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.67\u001b[0m||Owwww.,\n",
+ "ResultOCRExtracted#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.99\u001b[0m||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.96\u001b[0m||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.89\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2gasp/:\u001b[33m...\u001b[0m pain t feel fan! as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl skin! under my clark kent clothes, i'm wearing an \u001b[35m/noestruct/\u001b[0m\u001b[95mble\u001b[0m superman uniform !\n",
+ "ResultOCRExtracted#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.96\u001b[0m||Abruptly. ..\n",
+ "ResultOCRExtracted#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.93\u001b[0m||Great caesar's ghost! he's spinning _ a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-t feel the heat of the sun., the pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my pack! every human discomfort\u001b[33m...\u001b[0m good grief! t've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.95\u001b[0m||Get back! that enormous spider- like creature is going berserk, \u001b[1;36m25\u001b[0m if the sight of us excited him into mad spinning g \u001b[1m[\u001b[0m\n",
+ "\n",
+ " ---------- Padded \u001b[1;36m8\u001b[0m, dilation \u001b[1;36m1\u001b[0m ----------\n",
+ "ResultOCRExtracted#block \u001b[1;36m00\u001b[0m: \u001b[1;36m1.00\u001b[0m||Suddenly\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.83\u001b[0m||Gasp! everything s w-whirling around wel\u001b[1m]\u001b[0m i can't stand\n",
+ "ResultOCRExtracted#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.86\u001b[0m||Clark!i'm falling! . help! help!\n",
+ "ResultOCRExtracted#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.73\u001b[0m||I-i'm passing ohhha.\n",
+ "ResultOCRExtracted#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCRExtracted#block \u001b[1;36m05\u001b[0m: \u001b[1;36m0.95\u001b[0m||Then, seconds later.\n",
+ "ResultOCRExtracted#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.91\u001b[0m||Great caesar's ghost! e this \u001b[35m/\u001b[0m\u001b[95m5\u001b[0m black magic! we've been transported to the weirdest world i ever saw/\n",
+ "ResultOCRExtracted#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.91\u001b[0m||\u001b[33m...\u001b[0m|it certainly isnt our earth, perry” look at the size of those bees!\n",
+ "ResultOCRExtracted#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCRExtracted#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.73\u001b[0m||Owwww.\n",
+ "ResultOCRExtracted#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.94\u001b[0m||Yet the bee's stinger we right through my tform and penetrated my skin! that means. the fabric of my superman othe has become ordinary clot?\n",
+ "ResultOCRExtracted#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.96\u001b[0m||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.82\u001b[0m||Fain! as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl skin! under my clark kent clothes, i'm wearing an indestructible superman uniform !\n",
+ "ResultOCRExtracted#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.00\u001b[0m||\n",
+ "ResultOCRExtracted#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.67\u001b[0m||Great caesars ghost! he's spinning _ a web of g/ant,\n",
+ "ResultOCRExtracted#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.86\u001b[0m||I-i feel the heat of the sun. the pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my pack! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all my super-powers. ve ie an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.96\u001b[0m||Get back! that enormous spider- like creature is going berserk, as if the sight of us excited him into mad spinning’ g \u001b[1m[\u001b[0m\n",
+ "\n",
+ " ---------- Pad \u001b[1;36m8\u001b[0m, fract. \u001b[1;36m0.5\u001b[0m ----------\n",
+ "ResultOCRExtracted#block \u001b[1;36m00\u001b[0m: \u001b[1;36m0.91\u001b[0m||Suddemly\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.87\u001b[0m||Gasp/z everything s w- whirling around me!i can't stand up.\n",
+ "ResultOCRExtracted#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.84\u001b[0m||Clark! i'm falling’ _ help! help!\n",
+ "ResultOCRExtracted#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.78\u001b[0m||I-i'm passing ohhhh.\n",
+ "ResultOCRExtracted#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCRExtracted#block \u001b[1;36m05\u001b[0m: \u001b[1;36m1.00\u001b[0m||Then, seconds later\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.90\u001b[0m||Great caesar's ghost! f this \u001b[35m/\u001b[0m§ black magic! : we've been transported to the weirdest world i ever saw.\n",
+ "ResultOCRExtracted#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.92\u001b[0m||\u001b[33m...\u001b[0mit certainly isnt our earth, perry! look at the size of those bees.\n",
+ "ResultOCRExtracted#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCRExtracted#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.67\u001b[0m||Owwww.,\n",
+ "ResultOCRExtracted#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.99\u001b[0m||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.96\u001b[0m||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.90\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2gasp/:\u001b[33m...\u001b[0mpain i feel fan! as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl skin! under my clark kent clothes, i'm wearing an \u001b[35m/nodestruct/\u001b[0m\u001b[95mible\u001b[0m superman uniform !\n",
+ "ResultOCRExtracted#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.96\u001b[0m||Abruptly. ..\n",
+ "ResultOCRExtracted#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.93\u001b[0m||Great caesar's ghost! he's spinning _ a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-i feel the heat of the sun. the pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my pack! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.95\u001b[0m||Get back! that enormous spider- like creature is going berserk, \u001b[1;36m25\u001b[0m if the sight of us excited him into mad spinning g \u001b[1m[\u001b[0m\n",
+ "\n",
+ " ---------- Pad \u001b[1;36m8\u001b[0m, fract. \u001b[1;36m0.2\u001b[0m ----------\n",
+ "ResultOCRExtracted#block \u001b[1;36m00\u001b[0m: \u001b[1;36m1.00\u001b[0m||Suddenly\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m01\u001b[0m: \u001b[1;36m0.87\u001b[0m||Gasp/z everything s w- whirling around me!i can't stand up.\n",
+ "ResultOCRExtracted#block \u001b[1;36m02\u001b[0m: \u001b[1;36m0.84\u001b[0m||Clark! i'm falling’ _ help! help!\n",
+ "ResultOCRExtracted#block \u001b[1;36m03\u001b[0m: \u001b[1;36m0.73\u001b[0m||I-i'm passing ohhha.\n",
+ "ResultOCRExtracted#block \u001b[1;36m04\u001b[0m: \u001b[1;36m0.92\u001b[0m||Action comics\n",
+ "ResultOCRExtracted#block \u001b[1;36m05\u001b[0m: \u001b[1;36m1.00\u001b[0m||Then, seconds later\u001b[33m...\u001b[0m\n",
+ "ResultOCRExtracted#block \u001b[1;36m06\u001b[0m: \u001b[1;36m0.90\u001b[0m||Great caesar's ghost! f this \u001b[35m/\u001b[0m§ black magic! : we've been transported to the weirdest world i ever saw!\n",
+ "ResultOCRExtracted#block \u001b[1;36m07\u001b[0m: \u001b[1;36m0.92\u001b[0m||\u001b[33m...\u001b[0mit certainly isnt our earth, perry! look at the size of those bees.\n",
+ "ResultOCRExtracted#block \u001b[1;36m08\u001b[0m: \u001b[1;36m0.88\u001b[0m||Watch out, clark!\n",
+ "ResultOCRExtracted#block \u001b[1;36m09\u001b[0m: \u001b[1;36m0.67\u001b[0m||Owwww.,\n",
+ "ResultOCRExtracted#block \u001b[1;36m10\u001b[0m: \u001b[1;36m0.99\u001b[0m||Yet the bee's stinger went right through my uniform and penetrated my skin! that means. the fabric of my superman costume has become ordinary cloth!\n",
+ "ResultOCRExtracted#block \u001b[1;36m11\u001b[0m: \u001b[1;36m0.96\u001b[0m||Hurry! let's beat it before we get stung, too!\n",
+ "ResultOCRExtracted#block \u001b[1;36m12\u001b[0m: \u001b[1;36m0.91\u001b[0m||Ggreat guns!\u001b[33m...\u001b[0m2gasp/:\u001b[33m...\u001b[0m pain i feel fan! as superman, i should be invulnerable! \u001b[1;36m1\u001b[0m have unbreakabl skin! under my clark kent clothes, i'm wearing an indestructible superman uniform !\n",
+ "ResultOCRExtracted#block \u001b[1;36m13\u001b[0m: \u001b[1;36m0.96\u001b[0m||Abruptly\u001b[33m...\u001b[0m.\n",
+ "ResultOCRExtracted#block \u001b[1;36m14\u001b[0m: \u001b[1;36m0.93\u001b[0m||Great caesar's ghost! he's spinning _ a web of g/ant, \"silk strands -- as tough as steel!\n",
+ "ResultOCRExtracted#block \u001b[1;36m15\u001b[0m: \u001b[1;36m0.89\u001b[0m||I-i feel the heat of the sun., the pain of the bee-sting\u001b[33m...\u001b[0m the heavy weight of my pack! every human discomfort\u001b[33m...\u001b[0m good grief! i've lost all my super-powers. ve become an ordinary mortal in this world.\n",
+ "ResultOCRExtracted#block \u001b[1;36m16\u001b[0m: \u001b[1;36m0.96\u001b[0m||Get back! that enormous spider- like creature is going berserk, as if the sight of us excited him into mad spinning gs \u001b[1m[\u001b[0m\n",
+ "\n",
+ "\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "image_action_experiment: ExperimentOCR = cast(ExperimentOCR, ExperimentOCR.from_image(\n",
+ " CONTEXT, 'Tesseract', 'Action_Comics_1960-01-00_(262).JPG'))\n",
+ "image_action_experiment.display()\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 114,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ "
\"\n",
+ " for width in max_widths:\n",
+ " html_str += f\"
\"\n",
+ " html_str += \"
\"\n",
+ "\n",
+ " if headers:\n",
+ " if len(headers) != len(columns):\n",
+ " raise ValueError(\"Headers list must match the number of columns.\")\n",
+ " html_str += (\n",
+ " \"
\"\n",
+ " + \"\".join(\n",
+ " f\"
{header}
\"\n",
+ " for header in headers\n",
+ " )\n",
+ " + \"
\"\n",
+ " )\n",
+ "\n",
+ " for row_items in zip(*columns):\n",
+ " html_str += \"
\"\n",
+ " for i, item in enumerate(row_items):\n",
+ " if isinstance(item, (Image.Image, Path)):\n",
+ " img_html = get_image_html(item, max_width=max_image_width)\n",
+ " html_str += f\"
{img_html}
\"\n",
+ " else: # Assume the item is a string\n",
+ " style = \"font-weight: bold;\" if i == 0 else \"\"\n",
+ " html_str += f\"
{item}
\"\n",
+ " html_str += \"
\"\n",
+ "\n",
+ " html_str += \"
\"\n",
+ " return html_str\n",
+ "\n",
+ "\n",
+ "def display_columns(\n",
+ " columns: list[list], max_image_width: int | None = None, headers: list[str] | None = None\n",
+ "):\n",
+ " \"\"\"\n",
+ " Displays a table with any combination of columns, which can be lists of strings or lists \n",
+ " of PIL Image objects, within a Jupyter notebook cell.\n",
+ "\n",
+ " :param columns: A list of lists, where each sublist represents a column in the table. \n",
+ " Each sublist can contain either strings or PIL Image objects.\n",
+ " :param max_image_width: The maximum size of the images in pixels. This controls the max-height \n",
+ " of the images.\n",
+ " :param headers: A list of header labels for the table. If None, no headers are displayed.\n",
+ " \"\"\"\n",
+ " return display(HTML(get_columns_html(columns, max_image_width, headers)))\n",
+ "\n",
+ "\n",
+ "def get_image_grid_html(\n",
+ " images: list[Image.Image | Path | str],\n",
+ " rows: int,\n",
+ " columns: int,\n",
+ " titles: list[str] | None = None,\n",
+ " max_image_width: int | None = None,\n",
+ " caption: str | None = None\n",
+ "):\n",
+ " if titles and len(titles) != len(images):\n",
+ " raise ValueError(\"Titles list must match the number of images if provided.\")\n",
+ "\n",
+ " html_str = \"
\" # Empty cell if no more images\n",
+ " image_index += 1\n",
+ " html_str += \"
\"\n",
+ "\n",
+ " html_str += \"
\"\n",
+ " return html_str\n",
+ "\n",
+ "\n",
+ "def display_image_grid(\n",
+ " images: list[Image.Image | Path | str],\n",
+ " rows: int,\n",
+ " columns: int,\n",
+ " titles: list[str] | None = None,\n",
+ " max_image_width: int | None = None,\n",
+ " caption: str | None = None,\n",
+ "):\n",
+ " \"\"\"\n",
+ " Displays a grid of images in a HTML table within a Jupyter notebook cell.\n",
+ "\n",
+ " :param images: A list of PIL Image objects to be displayed.\n",
+ " :param rows: The number of rows in the grid.\n",
+ " :param columns: The number of columns in the grid.\n",
+ " :param titles: An optional list of titles for each image. If provided, it must match the length \n",
+ " of the images list.\n",
+ " :param max_image_width: The maximum width of the images in pixels.\n",
+ " \"\"\"\n",
+ " display(HTML(get_image_grid_html(images, rows, columns, titles, max_image_width, caption)))\n",
+ "\n",
+ "\n",
+ "def acc_as_html(acc):\n",
+ " return f\"
+ """
+ display(HTML(html_content))
+
+ def _ipython_display_(self):
+ self.display()
+
+# %% helpers.ipynb 29
+def page_boxes(self: st.PageData, out_dir: Path | None = None) -> tuple[Image.Image, Path]:
+ """
+ Visualize the boxes on an image.
+ Typically, this would be used to check where on the original image the
+ boxes are located.
+
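+ :param out_dir: Optional directory to save the annotated image in. If None, it is saved
+ next to the original image with "_boxes" appended to the file stem.
+ :return: The annotated image and the path it was saved to.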
+ """
+ image_path = Path(self.image_path)
+ image = Image.open(image_path)
+ draw = ImageDraw.Draw(image)
+ data_path = resources.files(pcleaner.data)
+ font_path = str(data_path / "LiberationSans-Regular.ttf")
+ # Figure out the optimal font size based on the image size. E.g. 30 for a 1600px image.
+ font_size = int(image.size[0] / 50) + 5
+
+ for index, box in enumerate(self.boxes):
+ draw.rectangle(box.as_tuple, outline="green")
+ # Draw the box number with a white outline (stroke), respecting font size.
+ draw.text(
+ (box.x1 + 4, box.y1),
+ str(index + 1),
+ fill="green",
+ font=ImageFont.truetype(font_path, font_size),
+ stroke_fill="white",
+ stroke_width=3,
+ )
+
+ for box in self.extended_boxes:
+ draw.rectangle(box.as_tuple, outline="red")
+ for box in self.merged_extended_boxes:
+ draw.rectangle(box.as_tuple, outline="purple")
+ for box in self.reference_boxes:
+ draw.rectangle(box.as_tuple, outline="blue")
+
+ # Save the image.
+ extension = "_boxes"
+ out_path = image_path.with_stem(image_path.stem + extension)
+ if out_dir is not None:
+ out_dir.mkdir(parents=True, exist_ok=True)
+ out_path = out_dir / image_path.name
+ image.save(out_path)
+
+ return image, out_path
+
+# %% helpers.ipynb 31
+def crop_box(box: st.Box, image: Image.Image) -> Image.Image:
+ return image.crop(box.as_tuple)
+
+# %% helpers.ipynb 33
+PRINT_FORMATS = {
+ 'Golden Age': (7.75, 10.5), # (1930s-40s)
+ 'Silver Age': (7, 10.375), # (1950s-60s)
+ 'Modern Age': (6.625,10.25), # North American comic books
+ 'Magazine': (8.5, 11),
+ 'Digest': (5.5, 8.5),
+ 'Manga': (5.0, 7.5),
+}
+
+
+def size(w: int, h: int, unit: str = 'in', dpi: float = 300.) -> tuple:
+ """
+ Calculate the print size of an image in inches or centimeters.
+
+ Args:
+ w (int): Width of the image in pixels.
+ h (int): Height of the image in pixels.
+ unit (str): Unit of measurement ('in' for inches, 'cm' for centimeters).
+ dpi (float): Dots per inch (resolution).
+
+ Returns:
+ tuple: Width and height of the image in the specified unit.
+ """
+ if unit == 'cm':
+ return (w / dpi * 2.54, h / dpi * 2.54)
+ else: # default to inches
+ return (w / dpi, h / dpi)
+
+
+def dpi(w: int, h: int, print_format: str = 'Modern Age') -> float:
+ """
+ Calculate the dpi (dots per inch) needed to print an image at a specified format size.
+
+ Args:
+ w (int): Width of the image in pixels.
+ h (int): Height of the image in pixels.
+ print_format (str): Print format, one of the keys of the PRINT_FORMATS dictionary.
+
+ Returns:
+ float: Required dpi to achieve the desired print format size.
+ """
+ # Default to 'Modern Age' if format not found
+ format_size = PRINT_FORMATS.get(print_format, PRINT_FORMATS['Modern Age'])
+ width_inch, height_inch = format_size
+ dpi_w = w / width_inch
+ dpi_h = h / height_inch
+ return (dpi_w + dpi_h) / 2 # Average dpi for width and height
+
+
+# %% helpers.ipynb 35
+def get_image_html(image: Image.Image | Path | str, max_width: int | None = None):
+    """
+    Converts a PIL image to an HTML image tag containing the image as a base64 blob.
+
+    :param image: A PIL Image object, or a path to an image file.
+    :param max_width: The maximum display width of the image in pixels.
+    :return: A string containing an HTML tag with the image.
+    """
+    style = f' style="max-width: {max_width}px;"' if max_width is not None else ''
+    if isinstance(image, (Path, str)):
+        return f'<img src="{image}"{style}>'
+    else:
+        buffered = BytesIO()
+        image.save(buffered, format='PNG')
+        img_str = base64.b64encode(buffered.getvalue()).decode()
+        return f'<img src="data:image/png;base64,{img_str}"{style}>'
+
+
+def get_columns_html(
+    columns: list[list], max_image_width: int | None = None, headers: list[str] | None = None
+):
+    if not all(len(col) == len(columns[0]) for col in columns):
+        raise ValueError("All columns must have the same length.")
+
+    # Calculate the maximum width of images in each column
+    max_widths = []
+    for col_index in range(len(columns)):
+        max_col_width = 0
+        for item in columns[col_index]:
+            if isinstance(item, (Image.Image, Path)):
+                if isinstance(item, (Path, str)):
+                    item = Image.open(item)
+                width, _ = item.size
+                max_col_width = max(max_col_width, width)
+        if max_col_width > 0:
+            max_widths.append(
+                f"{min(max_col_width, max_image_width)}px"
+                if max_image_width is not None else
+                f"{max_col_width}px"
+            )
+        else:
+            max_widths.append('auto')
+
+    html_str = "<table>"
+
+    # Apply calculated column widths using <colgroup> and <col> elements
+    html_str += "<colgroup>"
+    for width in max_widths:
+        html_str += f"<col style='width: {width};'>"
+    html_str += "</colgroup>"
+
+    if headers:
+        if len(headers) != len(columns):
+            raise ValueError("Headers list must match the number of columns.")
+        html_str += (
+            "<tr>"
+            + "".join(
+                f"<th>{header}</th>"
+                for header in headers
+            )
+            + "</tr>"
+        )
+
+    for row_items in zip(*columns):
+        html_str += "<tr>"
+        for i, item in enumerate(row_items):
+            if isinstance(item, (Image.Image, Path)):
+                img_html = get_image_html(item, max_width=max_image_width)
+                html_str += f"<td>{img_html}</td>"
+            else:  # Assume the item is a string
+                style = "font-weight: bold;" if i == 0 else ""
+                html_str += f"<td style='{style}'>{item}</td>"
+        html_str += "</tr>"
+
+    html_str += "</table>"
+    return html_str
+
+
+def display_columns(
+ columns: list[list], max_image_width: int | None = None, headers: list[str] | None = None
+):
+ """
+ Displays a table with any combination of columns, which can be lists of strings or lists
+ of PIL Image objects, within a Jupyter notebook cell.
+
+ :param columns: A list of lists, where each sublist represents a column in the table.
+ Each sublist can contain either strings or PIL Image objects.
+ :param max_image_width: The maximum width of the images in pixels. Wider images are
+ scaled down to this width.
+ :param headers: A list of header labels for the table. If None, no headers are displayed.
+ """
+ return display(HTML(get_columns_html(columns, max_image_width, headers)))
+
+
+def get_image_grid_html(
+    images: list[Image.Image | Path | str],
+    rows: int,
+    columns: int,
+    titles: list[str] | None = None,
+    max_image_width: int | None = None,
+    caption: str | None = None
+):
+    if titles and len(titles) != len(images):
+        raise ValueError("Titles list must match the number of images if provided.")
+
+    html_str = "<table>"
+
+    if caption:
+        html_str += (f"<caption>{caption}</caption>")
+
+    image_index = 0
+    for row in range(rows):
+        html_str += "<tr>"
+        for col in range(columns):
+            if image_index < len(images):
+                img_html = get_image_html(images[image_index], max_width=max_image_width)
+                title_html = (
+                    f"<div>{titles[image_index]}</div>"
+                    if titles
+                    else ""
+                )
+                html_str += f"<td>{title_html}{img_html}</td>"
+            else:
+                html_str += "<td></td>"  # Empty cell if no more images
+            image_index += 1
+        html_str += "</tr>"
+
+    html_str += "</table>"
+    return html_str
+
+
+def display_image_grid(
+ images: list[Image.Image | Path | str],
+ rows: int,
+ columns: int,
+ titles: list[str] | None = None,
+ max_image_width: int | None = None,
+ caption: str | None = None,
+):
+ """
+ Displays a grid of images in a HTML table within a Jupyter notebook cell.
+
+ :param images: A list of PIL Image objects to be displayed.
+ :param rows: The number of rows in the grid.
+ :param columns: The number of columns in the grid.
+ :param titles: An optional list of titles for each image. If provided, it must match the length
+ of the images list.
+ :param max_image_width: The maximum width of the images in pixels.
+ :param caption: An optional caption displayed with the grid.
+ """
+ display(HTML(get_image_grid_html(images, rows, columns, titles, max_image_width, caption)))
+
+
+def acc_as_html(acc):
+ return f"
{acc:.2f}"
+
+
+# %% helpers.ipynb 37
+def strip_uuid(p: Path | str):
+ _p: Path = p if isinstance(p, Path) else Path(p)
+ new_stem = re.sub(r'(?i)[a-f0-9]{8}-([a-f0-9]{4}-){3}[a-f0-9]{12}', '', _p.stem).strip('_')
+ return _p.with_stem(new_stem)
+
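Note on `size` and `dpi` in `helpers.py`: the two helpers are inverses of each other for a page that matches the print format exactly, which gives a quick sanity check. The sketch below uses simplified standalone copies of the helpers (not the originals) and a hypothetical 1325x2050 px scan chosen to be exactly 200 dpi at the 'Modern Age' trim size; the names `scan_w`, `scan_h`, `estimated`, and `printed` are illustrative only.

```python
# Simplified standalone copies of the helpers, for illustration only.
PRINT_FORMATS = {"Modern Age": (6.625, 10.25)}  # trim size in inches


def size(w: int, h: int, unit: str = "in", dpi: float = 300.0) -> tuple:
    """Print size of a w x h pixel image at the given dpi."""
    scale = 2.54 if unit == "cm" else 1.0
    return (w / dpi * scale, h / dpi * scale)


def dpi(w: int, h: int, print_format: str = "Modern Age") -> float:
    """Average of the horizontal and vertical dpi needed for the format."""
    width_in, height_in = PRINT_FORMATS.get(print_format, PRINT_FORMATS["Modern Age"])
    return (w / width_in + h / height_in) / 2


# Hypothetical scan dimensions: 6.625 in * 200 and 10.25 in * 200.
scan_w, scan_h = 1325, 2050
estimated = dpi(scan_w, scan_h)
printed = size(scan_w, scan_h, dpi=estimated)
print(estimated, printed)
```

Because `dpi` averages the two axes, a scan whose aspect ratio differs from the chosen format will not round-trip exactly; the averaged value is then only an approximation.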
diff --git a/_testbed/ocr_idefics.py b/_testbed/ocr_idefics.py
new file mode 100644
index 00000000..67a50c42
--- /dev/null
+++ b/_testbed/ocr_idefics.py
@@ -0,0 +1,184 @@
+# AUTOGENERATED! DO NOT EDIT! File to edit: test_idefics.ipynb.
+
+# %% test_idefics.ipynb 12
+from __future__ import annotations
+
+import functools
+from pathlib import Path
+
+import pcleaner.ocr.ocr as ocr
+import torch
+import transformers
+from pcleaner.ocr.ocr_tesseract import TesseractOcr
+from PIL import Image
+from rich.console import Console
+from transformers import AutoProcessor
+from transformers import Idefics2ForConditionalGeneration
+from transformers import PreTrainedModel
+
+
+# %% auto 0
+__all__ = ['IdeficsOCR', 'IdeficsExperimentContext']
+
+# %% test_idefics.ipynb 17
+console = Console(width=104, tab_size=4, force_jupyter=True)
+cprint = console.print
+
+
+# %% test_idefics.ipynb 20
+from experiments import *
+from helpers import *
+from ocr_metric import *
+
+
+# %% test_idefics.ipynb 21
+def load_image(img_or_path) -> Image.Image:
+ if isinstance(img_or_path, (str, Path)):
+ return Image.open(img_or_path)
+ elif isinstance(img_or_path, Image.Image):
+ return img_or_path
+ else:
+ raise ValueError(f"img_or_path must be a path or PIL.Image, got: {type(img_or_path)}")
+
+
+# %% test_idefics.ipynb 36
+processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")
+
+# %% test_idefics.ipynb 37
+device = "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu"
+
+model = Idefics2ForConditionalGeneration.from_pretrained(
+ "HuggingFaceM4/idefics2-8b",
+ torch_dtype=torch.bfloat16,
+ _attn_implementation="flash_attention_2",
+ ).to(device) # type: ignore
+
+
+# %% test_idefics.ipynb 39
+prompt_text_tmpl = (
+ "Please perform optical character recognition (OCR) on this image, which displays "
+ "speech balloons from a comic book. The text is in {}. Extract the text and "
+ "format it as follows: transcribe in standard sentence case, avoid using all capital "
+ "letters. Provide the transcribed text clearly and double check the sentence is not all capital letters.")
+
+# prompt_text_tmpl = ("Please perform optical character recognition (OCR) on this image, which displays "
+# "speech balloons from a manga comic. The text is in {}. Extract the text and "
+# "format it without newlines. Provide the transcribed text clearly.")
+
+# prompt_text_tmpl = ("Please perform optical character recognition (OCR) on this image, which displays "
+# "speech balloons from a comic book. The text is in {}. Extract the text and "
+# "format it as follows: transcribe in standard sentence case (avoid using all capital "
+# "letters) and use asterisks to denote any words that appear in bold within the image. "
+# "Provide the transcribed text clearly.")
+
+# prompt_text_tmpl = ("Please perform optical character recognition (OCR) on this image, which displays "
+# "speech balloons from a comic book. The text is in {}. Extract the text and "
+# "format it as follows: transcribe in standard sentence case, capitalized. Avoid using "
+# "all capital letters. In comics, it is common to use two hyphens '--' to interrupt a sentence. "
+# "Retain any hyphens as they appear in the original text. Provide the transcribed text "
+# "clearly, ensuring it is capitalized where appropriate, including proper nouns.")
+
+prompt_text_tmpl = (
+ "Please perform optical character recognition (OCR) on this image, which displays "
+ "speech balloons from a comic book. The text is in {}. Extract the text and "
+ "format it as follows: transcribe in standard sentence case, capitalized. Avoid using "
+ "all capital letters, but ensure it is capitalized where appropriate, including proper nouns. "
+ "Provide the transcribed text clearly. Double check the text is not all capital letters.")
+
+default_prompt_text_tmpl = prompt_text_tmpl
+
+# %% test_idefics.ipynb 41
+class IdeficsOCR:
+ prompt_text_tmpl: str = default_prompt_text_tmpl
+
+ def __init__(self,
+ lang: str | None = None,
+ prompt_text_tmpl: str|None = None,
+ device: str | None = None
+ ):
+ self.lang = lang
+ self.prompt_text_tmpl = prompt_text_tmpl or self.prompt_text_tmpl
+ self.device = device or (
+ "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu")
+
+ @staticmethod
+ def is_idefics_available() -> bool:
+ return True
+
+ def _generation_args(self, image: Image.Image, resulting_messages: list[dict]):
+ prompt = processor.apply_chat_template(resulting_messages, add_generation_prompt=True)
+ inputs = processor(text=prompt, images=[image], return_tensors="pt")
+ inputs = {k: v.to(self.device) for k, v in inputs.items()}
+
+ max_new_tokens = 512
+ repetition_penalty = 1.2
+ decoding_strategy = "Greedy"
+ temperature = 0.4
+ top_p = 0.8
+
+ generation_args = {
+ "max_new_tokens": max_new_tokens,
+ "repetition_penalty": repetition_penalty,
+ }
+
+ assert decoding_strategy in [
+ "Greedy",
+ "Top P Sampling",
+ ]
+
+ if decoding_strategy == "Greedy":
+ generation_args["do_sample"] = False
+ elif decoding_strategy == "Top P Sampling":
+ generation_args["temperature"] = temperature
+ generation_args["do_sample"] = True
+ generation_args["top_p"] = top_p
+
+ generation_args.update(inputs)
+ return prompt, generation_args
+
+ def __call__(
+ self,
+ img_or_path: Image.Image | Path | str,
+ prompt_text: str | None = None,
+ lang: str | None = None,
+ config: str | None = None,
+ show_prompt: bool = False,
+ **kwargs,
+ ) -> str:
+ if not self.is_idefics_available():
+ raise RuntimeError("Idefics is not installed or not found.")
+ resulting_messages = [
+ {
+ "role": "user",
+ "content": [{"type": "image"}] + [
+ {"type": "text", "text": prompt_text or self.prompt_text_tmpl.format(lang or self.lang)}
+ ]
+ }
+ ]
+ image = load_image(img_or_path)
+ prompt, generation_args = self._generation_args(image, resulting_messages)
+ generated_ids = model.generate(**generation_args)
+ generated_texts = processor.batch_decode(
+ generated_ids[:, generation_args["input_ids"].size(1):], skip_special_tokens=True)
+ if show_prompt:
+ cprint("INPUT:", prompt, "|OUTPUT:", generated_texts)
+ return generated_texts[0]#.strip('"')
+
+ def postprocess_ocr(self, text):
+ return ' '.join(remove_multiple_whitespaces(text).splitlines())
+
+
+# %% test_idefics.ipynb 43
+class IdeficsExperimentContext(OCRExperimentContext):
+ @functools.lru_cache()
+ def mocr(self, ocr_model: str, lang: str):
+ if ocr_model == 'Idefics':
+ proc = IdeficsOCR(lang)
+ else:
+ engine = self.engines[ocr_model]
+ ocr_processor = ocr.get_ocr_processor(True, engine)
+ proc = ocr_processor[lang2pcleaner(lang)]
+ if isinstance(proc, TesseractOcr):
+ proc.lang = lang2tesseract(lang)
+ return proc
+
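Review note (commentary, not part of the patch): in Python a conditional expression binds more loosely than `or`, so a device-selection expression needs its fallback chain parenthesized if an explicit `device` argument is meant to take priority. The sketch below illustrates this with a hypothetical helper, `pick_device`, whose boolean flags stand in for the `torch.backends.mps.is_available()` and `torch.cuda.is_available()` checks. It is an illustration under those assumptions, not code from the patch.

```python
def pick_device(device=None, mps_available=False, cuda_available=False):
    """Return the explicit device if given, otherwise the first available backend.

    `pick_device` and its boolean flags are hypothetical stand-ins for the
    torch availability checks used in the patch.
    """
    # Without the parentheses, `device or "mps" if mps_available else ...` parses as
    # `(device or "mps") if mps_available else ...`, so an explicit `device` is
    # silently dropped whenever MPS is unavailable.
    return device or ("mps" if mps_available else "cuda" if cuda_available else "cpu")


print(pick_device("cpu", cuda_available=True))  # cpu
print(pick_device(None, cuda_available=True))   # cuda
```

Keeping the auto-detection chain inside one parenthesized group also makes it easy to move into a module-level helper later.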
diff --git a/_testbed/ocr_metric.ipynb b/_testbed/ocr_metric.ipynb
new file mode 100644
index 00000000..41f2fad4
--- /dev/null
+++ b/_testbed/ocr_metric.ipynb
@@ -0,0 +1,276 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| default_exp ocr_metric"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| hide\n",
+ "# %reload_ext autoreload\n",
+ "# %autoreload 0\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# install (Colab)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# try: \n",
+ "# import fastcore as FC\n",
+ "# except ImportError: \n",
+ "# !pip install -q fastcore\n",
+ "# try:\n",
+ "# import rich\n",
+ "# except ImportError:\n",
+ "# !pip install -q rich\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Developing a metric for OCR of Comics/Manga texts\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Prologue"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "from __future__ import annotations\n",
+ "\n",
+ "import difflib\n",
+ "import html\n",
+ "\n",
+ "from IPython.display import display\n",
+ "from IPython.display import HTML\n",
+ "from rich.console import Console\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import fastcore.all as FC\n",
+ "import fastcore.xtras # patch Path with some utils\n",
+ "import rich\n",
+ "from fastcore.test import * # type: ignore\n",
+ "from loguru import logger\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Helpers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# pretty print by default\n",
+ "# %load_ext rich"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| exporti\n",
+ "console = Console(width=104, tab_size=4, force_jupyter=True)\n",
+ "cprint = console.print\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## OCR metric\n",
+ "> Some basic ways to compare OCR results"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "\n",
+ "def get_text_diffs_html(str1, str2, ignore_align: bool = False):\n",
+ " matcher = difflib.SequenceMatcher(None, str1, str2)\n",
+ " html_str1, html_str2 = \"\", \"\"\n",
+ " _ch ='⎕' # ▿\n",
+ "    ch = f'&#x{ord(_ch):x};'\n",
+ " span1_g = lambda l: f\"{ch*l}\" if l > 0 else \"\"\n",
+ " span1_r = lambda l: f\"{ch*l}\" if l > 0 else \"\"\n",
+ " span2 = lambda s: f\"{html.escape(s)}\" if s else \"\"\n",
+ "\n",
+ " for opcode in matcher.get_opcodes():\n",
+ " tag, i1, i2, j1, j2 = opcode\n",
+ " if tag == \"equal\":\n",
+ " html_str1 += html.escape(str1[i1:i2])\n",
+ " html_str2 += html.escape(str2[j1:j2])\n",
+ " elif tag == \"replace\":\n",
+ " max_span = max(i2 - i1, j2 - j1)\n",
+ " # str1_segment = str1[i1:i2].ljust(max_span)\n",
+ " html_str1 += html.escape(str1[i1:i2]) + span1_g(max_span - (i2 - i1))\n",
+ " html_str2 += span2(str2[j1:j2]) + (span1_r(max_span - (j2 - j1)) if not ignore_align else '')\n",
+ " elif tag == \"delete\":\n",
+ " deleted_segment = str1[i1:i2]\n",
+ " html_str1 += html.escape(deleted_segment)\n",
+ " if not ignore_align: html_str2 += span1_r(len(deleted_segment))\n",
+ " elif tag == \"insert\":\n",
+ " inserted_segment = str2[j1:j2].replace(\" \", _ch)\n",
+ " html_str1 += span1_g(len(inserted_segment))\n",
+ " html_str2 += span2(inserted_segment)\n",
+ "    html_str1 = f\"{html_str1}\"\n",
+ "    html_str2 = f\"{html_str2}\"\n",
+ " return html_str1, html_str2\n",
+ "\n",
+ "def display_text_diffs(str1, str2):\n",
+ " \"\"\"\n",
+ " Displays two strings one above the other, with differing characters highlighted in red in the \n",
+ " second string only, using difflib.SequenceMatcher to align the strings and ensure matching \n",
+ " sequences are vertically aligned.\n",
+ "\n",
+ " :param str1: The first string to compare.\n",
+ " :param str2: The second string to compare.\n",
+ " \"\"\"\n",
+ " html_str1, html_str2 = get_text_diffs_html(str1, str2)\n",
+ " display(HTML(f\"
"
+    return html_str1, html_str2
+
+def display_text_diffs(str1, str2):
+    """
+    Displays two strings one above the other, with differing characters highlighted in red in the
+    second string only, using difflib.SequenceMatcher to align the strings and ensure matching
+    sequences are vertically aligned.
+
+    :param str1: The first string to compare.
+    :param str2: The second string to compare.
+    """
+    html_str1, html_str2 = get_text_diffs_html(str1, str2)
+    display(HTML(f"{html_str1} {html_str2}"))
diff --git a/_testbed/requirements.txt b/_testbed/requirements.txt
new file mode 100644
index 00000000..4e3d1c68
--- /dev/null
+++ b/_testbed/requirements.txt
@@ -0,0 +1,5 @@
+matplotlib
+rich
+fastcore
+nbdev
+ipywidgets
diff --git a/_testbed/test_idefics.ipynb b/_testbed/test_idefics.ipynb
new file mode 100644
index 00000000..a8978d12
--- /dev/null
+++ b/_testbed/test_idefics.ipynb
@@ -0,0 +1,1497 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| default_exp ocr_idefics"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| hide\n",
+ "# %reload_ext autoreload\n",
+ "# %autoreload 0\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# install (Colab)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# try: \n",
+ "# import fastcore as FC\n",
+ "# except ImportError: \n",
+ "# !pip install -q fastcore\n",
+ "# try:\n",
+ "# import rich\n",
+ "# except ImportError:\n",
+ "# !pip install -q rich\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# !pip install -q git+https://github.com/civvic/PanelCleaner.git@basic-tesseract"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Requires a `transformers` version newer than 4.40."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# %pip install git+https://github.com/huggingface/transformers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Flash Attention doesn't support Metal [#412](https://github.com/Dao-AILab/flash-attention/issues/412) (but see [metal-flash-attention](https://github.com/philipturner/metal-flash-attention)).\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# %env FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE\n",
+ "# %pip install flash-attn --no-build-isolation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "notebookRunGroups": {
+ "groupValue": ""
+ }
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Fri May 10 18:32:10 2024 \n",
+ "+---------------------------------------------------------------------------------------+\n",
+ "| NVIDIA-SMI 535.161.08 Driver Version: 535.161.08 CUDA Version: 12.2 |\n",
+ "|-----------------------------------------+----------------------+----------------------+\n",
+ "| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |\n",
+ "| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |\n",
+ "| | | MIG M. |\n",
+ "|=========================================+======================+======================|\n",
+ "| 0 NVIDIA GeForce RTX 3090 Ti On | 00000000:65:00.0 Off | Off |\n",
+ "| 0% 50C P8 33W / 480W | 1MiB / 24564MiB | 0% Default |\n",
+ "| | | N/A |\n",
+ "+-----------------------------------------+----------------------+----------------------+\n",
+ " \n",
+ "+---------------------------------------------------------------------------------------+\n",
+ "| Processes: |\n",
+ "| GPU GI CI PID Type Process name GPU Memory |\n",
+ "| ID ID Usage |\n",
+ "|=======================================================================================|\n",
+ "| No running processes found |\n",
+ "+---------------------------------------------------------------------------------------+\n"
+ ]
+ }
+ ],
+ "source": [
+ "!nvidia-smi"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Testing `Idefics` OCR for Comics\n",
+ "> Accuracy Enhancements for OCR in `PanelCleaner`\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Prologue"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "from __future__ import annotations\n",
+ "\n",
+ "import functools\n",
+ "from pathlib import Path\n",
+ "\n",
+ "import pcleaner.ocr.ocr as ocr\n",
+ "import torch\n",
+ "import transformers\n",
+ "from pcleaner.ocr.ocr_tesseract import TesseractOcr\n",
+ "from PIL import Image\n",
+ "from rich.console import Console\n",
+ "from transformers import AutoProcessor\n",
+ "from transformers import Idefics2ForConditionalGeneration\n",
+ "from transformers import PreTrainedModel\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "from typing import cast\n",
+ "\n",
+ "import fastcore.xtras # patch Path with some utils\n",
+ "import pcleaner.config as cfg\n",
+ "from fastcore.test import * # type: ignore\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "notebookRunGroups": {
+ "groupValue": ""
+ }
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'4.41.0.dev0'"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "transformers.__version__"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Helpers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "notebookRunGroups": {
+ "groupValue": ""
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# pretty print by default\n",
+ "# %load_ext rich"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "notebookRunGroups": {
+ "groupValue": ""
+ }
+ },
+ "outputs": [],
+ "source": [
+ "#| exporti\n",
+ "\n",
+ "console = Console(width=104, tab_size=4, force_jupyter=True)\n",
+ "cprint = console.print\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Force reload of `experiments` module"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "if 'experiments' in sys.modules:\n",
+ " import importlib; importlib.reload(experiments) # type: ignore\n",
+ "else:\n",
+ " import experiments\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "\n",
+ "from experiments import *\n",
+ "from helpers import *\n",
+ "from ocr_metric import *\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| exporti\n",
+ "\n",
+ "def load_image(img_or_path) -> Image.Image:\n",
+ " if isinstance(img_or_path, (str, Path)):\n",
+ " return Image.open(img_or_path)\n",
+ " elif isinstance(img_or_path, Image.Image):\n",
+ " return img_or_path\n",
+ " else:\n",
+ " raise ValueError(f\"img_or_path must be a path or PIL.Image, got: {type(img_or_path)}\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "----\n",
+ "# Idefics basic usage\n",
+ "\n",
+ "Not working: fails with a CUDA memory error."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# # Note that passing the image urls (instead of the actual pil images) to the processor is also possible\n",
+ "# # image1 = load_image(\"https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg\")\n",
+ "# # image2 = load_image(\"https://cdn.britannica.com/59/94459-050-DBA42467/Skyline-Chicago.jpg\")\n",
+ "# # image3 = load_image(\"https://cdn.britannica.com/68/170868-050-8DDE8263/Golden-Gate-Bridge-San-Francisco.jpg\")\n",
+ "\n",
+ "# image1 = Image.open(\"media/Statue-of-Liberty-Island-New-York-Bay.webp\")\n",
+ "# image2 = Image.open(\"media/Skyline-Chicago.webp\")\n",
+ "# image3 = Image.open(\"media/Golden-Gate-Bridge-San-Francisco.webp\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# processor = AutoProcessor.from_pretrained(\"HuggingFaceM4/idefics2-8b\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# model = Idefics2ForConditionalGeneration.from_pretrained(\n",
+ "# \"HuggingFaceM4/idefics2-8b\",\n",
+ "# torch_dtype=torch.bfloat16,\n",
+ "# #_attn_implementation=\"flash_attention_2\",\n",
+ "# )\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# assert isinstance(model, PreTrainedModel)\n",
+ "# model.to(DEVICE)\n",
+ "# type(model), model.device\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Create inputs"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# messages = [\n",
+ "# {\n",
+ "# \"role\": \"user\",\n",
+ "# \"content\": [\n",
+ "# {\"type\": \"image\"},\n",
+ "# {\"type\": \"text\", \"text\": \"What do we see in this image?\"},\n",
+ "# ]\n",
+ "# },\n",
+ "# {\n",
+ "# \"role\": \"assistant\",\n",
+ "# \"content\": [\n",
+ "# {\"type\": \"text\", \"text\": \"In this image, we can see the city of New York, and more specifically the Statue of Liberty.\"},\n",
+ "# ]\n",
+ "# },\n",
+ "# {\n",
+ "# \"role\": \"user\",\n",
+ "# \"content\": [\n",
+ "# {\"type\": \"image\"},\n",
+ "# {\"type\": \"text\", \"text\": \"And how about this image?\"},\n",
+ "# ]\n",
+ "# }, \n",
+ "# ]\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# prompt = processor.apply_chat_template(messages, add_generation_prompt=True)\n",
+ "# inputs = processor(text=prompt, images=[image1, image2], return_tensors=\"pt\")\n",
+ "# inputs = {k: v.to(DEVICE) for k, v in inputs.items()}\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Generate"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# generated_ids = model.generate(**inputs, max_new_tokens=500)\n",
+ "# generated_texts = processor.batch_decode(generated_ids, skip_special_tokens=True)\n",
+ "\n",
+ "# print(generated_texts)\n",
+ "# # ['User: What do we see in this image? \\nAssistant: In this image, we can see the city of New York, and more specifically the Statue of Liberty. \\nUser: And how about this image? \\nAssistant: In this image we can see buildings, trees, lights, water and sky.']\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# [\n",
+ "# 'User: What do we see in this image? '\n",
+ "# 'Assistant: In this image, we can see the city of New York, and more specifically the Statue of Liberty. '\n",
+ "# 'User: And how about this image? '\n",
+ "# 'Assistant: In this image we can see buildings, trees, lights, water and sky.'\n",
+ "# ]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "----\n",
+ "# Idefics experiments\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Idefics"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Idefics initialization"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n"
+ ]
+ }
+ ],
+ "source": [
+ "#| exporti\n",
+ "\n",
+ "processor = AutoProcessor.from_pretrained(\"HuggingFaceM4/idefics2-8b\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.\n"
+ ]
+ },
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "5a21b61268674a159f210694841c2149",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Loading checkpoint shards: 0%| | 0/7 [00:00, ?it/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "#| exporti\n",
+ "\n",
+ "device = \"mps\" if torch.backends.mps.is_available() else \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
+ "\n",
+ "model = Idefics2ForConditionalGeneration.from_pretrained(\n",
+ " \"HuggingFaceM4/idefics2-8b\",\n",
+ " torch_dtype=torch.bfloat16,\n",
+ " _attn_implementation=\"flash_attention_2\",\n",
+ " ).to(device) # type: ignore\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "(transformers.models.idefics2.modeling_idefics2.Idefics2ForConditionalGeneration,\n",
+ " device(type='cuda', index=0))"
+ ]
+ },
+ "execution_count": 26,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "type(model), model.device\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| exporti\n",
+ "\n",
+ "prompt_text_tmpl = (\n",
+ " \"Please perform optical character recognition (OCR) on this image, which displays \"\n",
+ " \"speech balloons from a comic book. The text is in {}. Extract the text and \"\n",
+ " \"format it as follows: transcribe in standard sentence case, avoid using all capital \"\n",
+ " \"letters. Provide the transcribed text clearly and double check the sentence is not all capital letters.\")\n",
+ "\n",
+ "# prompt_text_tmpl = (\"Please perform optical character recognition (OCR) on this image, which displays \"\n",
+ "# f\"speech balloons from a manga comic. The text is in {}. Extract the text and \"\n",
+ "# \"format it without newlines. Provide the transcribed text clearly.\")\n",
+ "\n",
+ "# prompt_text_tmpl = (\"Please perform optical character recognition (OCR) on this image, which displays \"\n",
+ "# \"speech balloons from a comic book. The text is in {}. Extract the text and \"\n",
+ "# \"format it as follows: transcribe in standard sentence case (avoid using all capital \"\n",
+ "# \"letters) and use asterisks to denote any words that appear in bold within the image. \"\n",
+ "# \"Provide the transcribed text clearly.\")\n",
+ "\n",
+ "# prompt_text_tmpl = (\"Please perform optical character recognition (OCR) on this image, which displays \"\n",
+ "# \"speech balloons from a comic book. The text is in {}. Extract the text and \"\n",
+ "# \"format it as follows: transcribe in standard sentence case, capitalized. Avoid using \"\n",
+ "# \"all capital letters. In comics, it is common to use two hyphens '--' to interrupt a sentence. \"\n",
+ "# \"Retain any hyphens as they appear in the original text. Provide the transcribed text \"\n",
+ "# \"clearly, ensuring it is capitalized where appropriate, including proper nouns.\")\n",
+ "\n",
+ "prompt_text_tmpl = (\n",
+ " \"Please perform optical character recognition (OCR) on this image, which displays \"\n",
+ " \"speech balloons from a comic book. The text is in {}. Extract the text and \"\n",
+ " \"format it as follows: transcribe in standard sentence case, capitalized. Avoid using \"\n",
+ " \"all capital letters, but ensure it is capitalized where appropriate, including proper nouns. \"\n",
+ " \"Provide the transcribed text clearly. Double check the text is not all capital letters.\")\n",
+ "\n",
+ "default_prompt_text_tmpl = prompt_text_tmpl"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## IdeficsOCR"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "\n",
+ "class IdeficsOCR:\n",
+ " prompt_text_tmpl: str = default_prompt_text_tmpl\n",
+ "\n",
+ " def __init__(self, \n",
+ " lang: str | None = None, \n",
+ " prompt_text_tmpl: str|None = None, \n",
+ " device: str | None = None\n",
+ " ):\n",
+ " self.lang = lang\n",
+ " self.prompt_text_tmpl = prompt_text_tmpl or self.prompt_text_tmpl\n",
+ "        self.device = device or (  # an explicit `device` takes priority over auto-detection\n",
+ "            \"mps\" if torch.backends.mps.is_available() else \"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
+ "\n",
+ " @staticmethod\n",
+ " def is_idefics_available() -> bool:\n",
+ " return True\n",
+ "\n",
+ " def _generation_args(self, image: Image.Image, resulting_messages: list[dict]):\n",
+ " prompt = processor.apply_chat_template(resulting_messages, add_generation_prompt=True)\n",
+ " inputs = processor(text=prompt, images=[image], return_tensors=\"pt\")\n",
+ " inputs = {k: v.to(self.device) for k, v in inputs.items()}\n",
+ " \n",
+ " max_new_tokens = 512\n",
+ " repetition_penalty = 1.2\n",
+ " decoding_strategy = \"Greedy\"\n",
+ " temperature = 0.4\n",
+ " top_p = 0.8\n",
+ "\n",
+ " generation_args = {\n",
+ " \"max_new_tokens\": max_new_tokens,\n",
+ " \"repetition_penalty\": repetition_penalty,\n",
+ " }\n",
+ "\n",
+ " assert decoding_strategy in [\n",
+ " \"Greedy\",\n",
+ " \"Top P Sampling\",\n",
+ " ]\n",
+ "\n",
+ " if decoding_strategy == \"Greedy\":\n",
+ " generation_args[\"do_sample\"] = False\n",
+ " elif decoding_strategy == \"Top P Sampling\":\n",
+ " generation_args[\"temperature\"] = temperature\n",
+ " generation_args[\"do_sample\"] = True\n",
+ " generation_args[\"top_p\"] = top_p\n",
+ "\n",
+ " generation_args.update(inputs)\n",
+ " return prompt, generation_args\n",
+ "\n",
+ " def __call__(\n",
+ " self,\n",
+ " img_or_path: Image.Image | Path | str,\n",
+ " prompt_text: str | None = None,\n",
+ " lang: str | None = None,\n",
+ " config: str | None = None,\n",
+ " show_prompt: bool = False,\n",
+ " **kwargs,\n",
+ " ) -> str:\n",
+ " if not self.is_idefics_available():\n",
+ " raise RuntimeError(\"Idefics is not installed or not found.\")\n",
+ " resulting_messages = [\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": [{\"type\": \"image\"}] + [\n",
+ " {\"type\": \"text\", \"text\": prompt_text or self.prompt_text_tmpl.format(lang or self.lang)}\n",
+ " ]\n",
+ " }\n",
+ " ]\n",
+ " image = load_image(img_or_path)\n",
+ " prompt, generation_args = self._generation_args(image, resulting_messages)\n",
+ " generated_ids = model.generate(**generation_args)\n",
+ " generated_texts = processor.batch_decode(\n",
+ " generated_ids[:, generation_args[\"input_ids\"].size(1):], skip_special_tokens=True)\n",
+ " if show_prompt:\n",
+ " cprint(\"INPUT:\", prompt, \"|OUTPUT:\", generated_texts)\n",
+ " return generated_texts[0]#.strip('\"')\n",
+ "\n",
+ " def postprocess_ocr(self, text):\n",
+ " return ' '.join(remove_multiple_whitespaces(text).splitlines())\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## IdeficsExperimentContext"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 29,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#| export\n",
+ "\n",
+ "class IdeficsExperimentContext(OCRExperimentContext):\n",
+ " @functools.lru_cache()\n",
+ " def mocr(self, ocr_model: str, lang: str):\n",
+ " if ocr_model == 'Idefics':\n",
+ " proc = IdeficsOCR(lang)\n",
+ " else:\n",
+ " engine = self.engines[ocr_model]\n",
+ " ocr_processor = ocr.get_ocr_processor(True, engine)\n",
+ " proc = ocr_processor[lang2pcleaner(lang)]\n",
+ " if isinstance(proc, TesseractOcr):\n",
+ " proc.lang = lang2tesseract(lang)\n",
+ " return proc\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# PanelCleaner Configuration\n",
+ "> Adapt the current `PanelCleaner` `Config` to this notebook.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 30,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "config = cfg.load_config()\n",
+ "config.cache_dir = Path(\".\")\n",
+ "\n",
+ "cache_dir = config.get_cleaner_cache_dir()\n",
+ "\n",
+ "profile = config.current_profile\n",
+ "preprocessor_conf = profile.preprocessor\n",
+ "# Modify the profile to OCR all boxes.\n",
+ "# Make sure OCR is enabled.\n",
+ "preprocessor_conf.ocr_enabled = True\n",
+ "# Make sure the max size is infinite, so no boxes are skipped in the OCR process.\n",
+ "preprocessor_conf.ocr_max_size = 10**10\n",
+ "# Make sure the suspicious box min size is infinite, so all boxes with \"unknown\" language are skipped.\n",
+ "preprocessor_conf.suspicious_box_min_size = 10**10\n",
+ "# Set the OCR blacklist pattern to match everything, so all text gets reported in the analytics.\n",
+ "preprocessor_conf.ocr_blacklist_pattern = \".*\"\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Test images\n",
+ "> `IMAGE_PATHS` is a list of image file paths that are used as input for testing the OCR methods."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 31,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['00: Action_Comics_1960-01-00_(262).JPG',\n",
+ " '01: Adolf_Cap_01_008.jpg',\n",
+ " '02: Barnaby_v1-028.png',\n",
+ " '03: Barnaby_v1-029.png',\n",
+ " '04: Buck_Danny_-_12_-_Avions_Sans_Pilotes_-_013.jpg',\n",
+ " '05: Cannon-292.jpg',\n",
+ " '06: Contrato_con_Dios_028.jpg',\n",
+ " '07: Erase_una_vez_en_Francia_02_88.jpg',\n",
+ " '08: FOX_CHILLINTALES_T17_012.jpg',\n",
+ " '09: Furari_-_Jiro_Taniguchi_selma_056.jpg',\n",
+ " '10: Galactus_12.jpg',\n",
+ " '11: INOUE_KYOUMEN_002.png',\n",
+ " '12: MCCALL_ROBINHOOD_T31_010.jpg',\n",
+ " '13: MCCAY_LITTLENEMO_090.jpg',\n",
+ " '14: Mary_Perkins_On_Stage_v2006_1_-_P00068.jpg',\n",
+ " '15: PIKE_BOYLOVEGIRLS_T41_012.jpg',\n",
+ " '16: Sal_Buscema_Spaceknights_&_Superheroes_Ocular_Edition_1_1.png',\n",
+ " '17: Sal_Buscema_Spaceknights_&_Superheroes_Ocular_Edition_1_1_K.png',\n",
+ " '18: Sal_Buscema_Spaceknights_&_Superheroes_Ocular_Edition_1_2.png',\n",
+ " '19: Spirou_Et_Fantasio_Integrale_06_1958_1959_0025_0024.jpg',\n",
+ " '20: Strange_Tales_172005.jpg',\n",
+ " '21: Strange_Tales_172021.jpg',\n",
+ " '22: Tarzan_014-21.JPG',\n",
+ " '23: Tintin_21_Les_Bijoux_de_la_Castafiore_page_39.jpg',\n",
+ " '24: Transformers_-_Unicron_000-004.jpg',\n",
+ " '25: Transformers_-_Unicron_000-016.jpg',\n",
+ " '26: WARE_ACME_024.jpg',\n",
+ " '27: Yoko_Tsuno_T01_1972-10.jpg',\n",
+ " '28: Your_Name_Another_Side_Earthbound_T02_084.jpg',\n",
+ " '29: manga_0033.jpg',\n",
+ " '30: ronson-031.jpg',\n",
+ " '31: 哀心迷図のバベル 第01巻 - 22002_00_059.jpg']"
+ ]
+ },
+ "execution_count": 31,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "media_path = Path(\"media/\")\n",
+ "\n",
+ "IMAGE_PATHS = sorted(\n",
+ " [_ for _ in media_path.glob(\"*\") if _.is_file() and _.suffix.lower() in [\".jpg\", \".png\", \".jpeg\"]])\n",
+ "\n",
+ "[f\"{i:02}: {_.name}\" for i,_ in enumerate(IMAGE_PATHS)]\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# CONTEXT\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 32,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Current Configuration:\n",
+ "\n",
+ "Locale: System default\n",
+ "Default Profile: Built-in\n",
+ "Saved Profiles:\n",
+ "- victess: /home/vic/dev/repo/DL-mac/cleaned/victess.conf\n",
+ "- victmang: /home/vic/dev/repo/DL-mac/cleaned/vicmang.conf\n",
+ "\n",
+ "Profile Editor: System default\n",
+ "Cache Directory: .\n",
+ "Default Torch Model Path: /home/vic/.cache/pcleaner/model/comictextdetector.pt\n",
+ "Default CV2 Model Path: /home/vic/.cache/pcleaner/model/comictextdetector.pt.onnx\n",
+ "GUI Theme: System default\n",
+ "\n",
+ "--------------------\n",
+ "\n",
+ "Config file located at: /home/vic/.config/pcleaner/pcleanerrc\n",
+ "System default cache directory: /home/vic/.cache/pcleaner\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "
INPUT: User:<image>Please perform optical character recognition (OCR) on this image, which displays \n",
+ "speech balloons from a comic book. The text is in English. Extract the text and format it as follows: \n",
+ "transcribe in standard sentence case, capitalized. Avoid using all capital letters, but ensure it is \n",
+ "capitalized where appropriate, including proper nouns. Provide the transcribed text clearly. Double \n",
+ "check the text is not all capital letters.<end_of_utterance>\n",
+ "Assistant: |OUTPUT:\n",
+ "[\n",
+ " 'Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New \n",
+ "Orleans, kept tidy by a white-haired old man known only as Bambu.'\n",
+ "]\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "INPUT: User:\u001b[1m<\u001b[0m\u001b[1;95mimage\u001b[0m\u001b[39m>Please perform optical character recognition \u001b[0m\u001b[1;39m(\u001b[0m\u001b[39mOCR\u001b[0m\u001b[1;39m)\u001b[0m\u001b[39m on this image, which displays \u001b[0m\n",
+ "\u001b[39mspeech balloons from a comic book. The text is in English. Extract the text and format it as follows: \u001b[0m\n",
+ "\u001b[39mtranscribe in standard sentence case, capitalized. Avoid using all capital letters, but ensure it is \u001b[0m\n",
+ "\u001b[39mcapitalized where appropriate, including proper nouns. Provide the transcribed text clearly. Double \u001b[0m\n",
+ "\u001b[39mcheck the text is not all capital letters.\u001b[0m\n",
+ "Assistant: |OUTPUT:\n",
+ "\u001b[1m[\u001b[0m\n",
+ " \u001b[32m'Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New \u001b[0m\n",
+ "\u001b[32mOrleans, kept tidy by a white-haired old man known only as Bambu.'\u001b[0m\n",
+ "\u001b[1m]\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "prompt, generation_args = idefics_generation_args(image, resulting_messages)\n",
+ "generated_ids = model.generate(**generation_args)\n",
+ "\n",
+ "generated_texts = processor.batch_decode(\n",
+ " generated_ids[:, generation_args[\"input_ids\"].size(1):], skip_special_tokens=True)\n",
+ "cprint(\"INPUT:\", prompt, \"|OUTPUT:\", generated_texts)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 42,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu. 1.00
\n",
+ " \n",
+ "Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.\n",
+ "Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.\n",
+ "INPUT: User:<image>Please perform optical character recognition (OCR) on this image, which displays \n",
+ "speech balloons from a comic book. The text is in English. Extract the text and format it as follows: \n",
+ "transcribe in standard sentence case, capitalized. Avoid using all capital letters, but ensure it is \n",
+ "capitalized where appropriate, including proper nouns. Provide the transcribed text clearly. Double \n",
+ "check the text is not all capital letters.<end_of_utterance>\n",
+ "Assistant: |OUTPUT:\n",
+ "[\n",
+ " 'Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New \n",
+ "Orleans, kept tidy by a white-haired old man known only as Bambu.'\n",
+ "]\n",
+ "\n"
+ ],
+ "text/plain": [
+ "INPUT: User:\u001b[1m<\u001b[0m\u001b[1;95mimage\u001b[0m\u001b[39m>Please perform optical character recognition \u001b[0m\u001b[1;39m(\u001b[0m\u001b[39mOCR\u001b[0m\u001b[1;39m)\u001b[0m\u001b[39m on this image, which displays \u001b[0m\n",
+ "\u001b[39mspeech balloons from a comic book. The text is in English. Extract the text and format it as follows: \u001b[0m\n",
+ "\u001b[39mtranscribe in standard sentence case, capitalized. Avoid using all capital letters, but ensure it is \u001b[0m\n",
+ "\u001b[39mcapitalized where appropriate, including proper nouns. Provide the transcribed text clearly. Double \u001b[0m\n",
+ "\u001b[39mcheck the text is not all capital letters.\u001b[0m\n",
+ "Assistant: |OUTPUT:\n",
+ "\u001b[1m[\u001b[0m\n",
+ " \u001b[32m'Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New \u001b[0m\n",
+ "\u001b[32mOrleans, kept tidy by a white-haired old man known only as Bambu.'\u001b[0m\n",
+ "\u001b[1m]\u001b[0m\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu. 1.00\n",
+ " \n",
+ "Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.\n",
+ "Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.\n",
+ "Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tidy by a white-haired old man known only as bambu. 0.98\n",
+ " \n",
+ "
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.
Embowered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tidy by a white-haired old man known only as bambu.
Embow⎕⎕ered by great gnarled cypress trees, the ancient manor stands alone on the outskirts of New Orleans, kept tidy by a white-haired old man known only as Bambu.
Encountered by great charles cypress trees, the ancient manor stands alone on the outskirts of new orleans, kept tidy by a white-haired old man known only as bambu.