Deep learning model to detect and recognize text from images.
Deep learning package
Item created: May 18, 2023 Item updated: Jan 2, 2025 Number of downloads: 4,273
Description
Text labels are an integral part of cadastral maps and floor plans. Text is also prevalent in natural scenes, in the form of road signs, billboards, house numbers, and place names. Extracting this text provides additional context and detail about the places it describes and the information it conveys. Digitizing documents and extracting their text also helps in retrieving and archiving important information.
This deep learning model is based on the MMOCR model and uses optical character recognition (OCR) technology to detect and recognize text in images. It was trained on a large dataset covering different types and styles of text against diverse backgrounds and contexts, allowing for precise text extraction. It can be applied to tasks such as automatically detecting and reading text from documents, sign boards, and scanned maps, thereby converting images containing text into actionable data.
Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
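As a quick pre-flight check before running the model, the sketch below reports which Python modules cannot be found in the active environment. It is only an illustration: the module list is an assumption (only `mmocr` is named on this page; `arcgis` and `torch` are guesses at what the installer provides), and the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Assumed module list: "mmocr" is named on this page,
# "arcgis" and "torch" are guesses and may not match the installer's contents.
REQUIRED_MODULES = ["arcgis", "torch", "mmocr"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup failures the same as "not found".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED_MODULES)
    if absent:
        print("Missing modules:", ", ".join(absent))
    else:
        print("All listed modules were found.")
```

Run it from the Python environment that ArcGIS uses; an empty result only shows that the modules are importable, not that their versions are supported.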
Fine-tuning the model
This model cannot be fine-tuned using ArcGIS tools.
Input
High-resolution, 3-band street-level imagery/oriented imagery, scanned maps, or documents, with medium to large size text.
Output
A feature layer containing the recognized text and a bounding box around each text instance.
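To make the shape of such an output concrete, here is a minimal sketch that turns one recognized text instance into a GeoJSON-style polygon feature. The actual schema of the output feature layer is not described on this page, so the function name `detection_to_feature`, the `text` attribute name, and the sample coordinates are all hypothetical.

```python
import json


def detection_to_feature(text, xmin, ymin, xmax, ymax):
    """Build a GeoJSON-style polygon feature for one recognized text instance."""
    if xmin > xmax or ymin > ymax:
        raise ValueError("bounding box minimums must not exceed maximums")
    # Rectangle ring, counterclockwise, closed by repeating the first vertex.
    ring = [
        [xmin, ymin],
        [xmax, ymin],
        [xmax, ymax],
        [xmin, ymax],
        [xmin, ymin],
    ]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"text": text},  # hypothetical attribute name
    }


# Made-up sample values, for illustration only.
feature = detection_to_feature("Main St", 10, 20, 110, 45)
print(json.dumps(feature["properties"]))
```

Keeping the text as an attribute of the box geometry is what lets the recognized strings be queried, filtered, or joined like any other feature attribute.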
Model architecture
This model is based on the open-source MMOCR model by MMLab.
Sample results
Here are a few results from the model.
Terms of Use
No special restrictions or limitations on using the item's content have been provided.
Details
Size: 534.589 MB
ID: 8b56ed53e34b4304a5b8b826a7512ab0
Credits (Attribution)
Esri
Comments (8)
I have downloaded the version from May 22 2024. It detects much less text with the same settings as for the version before. I am trying to digitize numbers (with comma for decimal separator). Any suggestions why that is the case or how to get more text recognized?
Any advice for training this model with our own data? I'm happy to dig into the MMOCR side of things if needed, but a starting point would be very welcome. Thanks.
I just realized that I downloaded a different version, https://gis-au.maps.arcgis.com/home/item.html?id=901eac288e39420bb667e52a233d8195, so my earlier comment relates to that version. I am not sure why these two versions exist and get updated in parallel.
@ptuteja_geosaurus Do you know if the mmocr in the model can simply be replaced with a newer version? I can see that the model uses v0.6.3, and v1.0.0 is already published. My problem is mainly with points and commas in my sources, and with a language other than English. Otherwise it worked very well on the first attempt.
You could explore the open source mmocr https://github.com/open-mmlab/mmocr for finetuning. The ArcGIS API for Python does not support finetuning an OCR-based model at this time.
Getting an error "AccessDenied, request has expired" when trying to access the guide link in the "Using the model" section. Can you please update the hyperlink?
The link has been updated. Apologies for the inconvenience.
Same.