Deep learning model to detect objects using text prompts.
Deep learning package
Item created: Feb 2, 2024 Item updated: Feb 19, 2025 Number of downloads: 124,925
Description
Text SAM is an open-source sample model that can be prompted with free-form text to extract features of various kinds. It combines Grounding DINO and the Segment Anything Model (SAM). Grounding DINO is an open-set object detector that finds objects given a text prompt. SAM segments any object within a region of interest represented by a bounding box or a point. This deep learning package calls the two models sequentially: the bounding boxes of the objects detected by Grounding DINO are fed into SAM as prompts to generate masks for those objects. Finally, the masks are converted to polygons and returned as GIS features. These features, which are described by the input text prompts, can be any object of interest, such as vehicles, swimming pools, ships, airplanes, or solar panels.
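The sequential flow described above can be sketched in a few lines. This is only an illustration of the control flow, not the package's actual source: the function name `text_prompted_features` and the three callables `detect`, `segment`, and `vectorize` are hypothetical stand-ins for Grounding DINO, SAM, and the mask-to-polygon step.

```python
from typing import Any, Callable, List, Sequence, Tuple

# (xmin, ymin, xmax, ymax) in pixel coordinates; an assumed convention.
Box = Tuple[float, float, float, float]


def text_prompted_features(
    image: Any,
    text_prompt: str,
    detect: Callable[[Any, str], Sequence[Box]],
    segment: Callable[[Any, Box], Any],
    vectorize: Callable[[Any], Any],
) -> List[Any]:
    """Chain a detector, a segmenter, and a vectorizer.

    detect    -> hypothetical stand-in for Grounding DINO (text -> boxes)
    segment   -> hypothetical stand-in for SAM (box prompt -> mask)
    vectorize -> hypothetical stand-in for mask-to-polygon conversion
    """
    features = []
    # Step 1: the text prompt yields one bounding box per detected object.
    for box in detect(image, text_prompt):
        # Step 2: each box is used as the prompt for the segmenter.
        mask = segment(image, box)
        # Step 3: each mask becomes one polygon feature.
        features.append(vectorize(mask))
    return features
```

Passing the three stages in as callables keeps the sketch independent of any particular library; in the real package these stages are fixed.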
Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
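The tool runs reported in the comments pass model arguments as a single string of `name value` pairs separated by semicolons (for example `text_prompt boat;padding 256`). A minimal sketch of building and parsing such a string follows; the helper names `build_arguments` and `parse_arguments` are hypothetical, and the argument names are taken from those user reports rather than from official documentation.

```python
from typing import Dict


def build_arguments(**kwargs) -> str:
    """Join keyword arguments into a 'name value;name value' string."""
    parts = []
    for name, value in kwargs.items():
        text = str(value)
        # A semicolon inside a value would break the pair separator.
        if ";" in text:
            raise ValueError(f"value for {name!r} must not contain ';'")
        parts.append(f"{name} {text}")
    return ";".join(parts)


def parse_arguments(arguments: str) -> Dict[str, str]:
    """Split a 'name value;name value' string back into a dictionary."""
    parsed = {}
    for pair in arguments.split(";"):
        if not pair.strip():
            continue
        # Split on the first space only, so values may contain spaces.
        name, _, value = pair.strip().partition(" ")
        parsed[name] = value
    return parsed


# Values copied from one of the user reports in the comments.
args = build_arguments(
    text_prompt="boat",
    padding=256,
    batch_size=4,
    box_threshold=0.2,
    text_threshold=0.2,
    tta_scales=1,
    nms_overlap=0.7,
)
print(args)
```

Building the string from keyword arguments avoids typos in separators when the same run is repeated with different prompts or thresholds.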
Fine-tuning the model
This model cannot be fine-tuned using ArcGIS tools.
Input
8-bit, 3-band RGB imagery.
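A raster can be checked against this input requirement before running the tool. The sketch below is a plain Python check on two values, band count and pixel type; the function name `input_problems` is hypothetical, and the `"U8"` code for unsigned 8-bit follows the pixel type strings that ArcGIS reports (assumed here, and not read from a real raster in this example).

```python
from typing import List

SUPPORTED_BAND_COUNT = 3    # RGB
SUPPORTED_PIXEL_TYPE = "U8"  # unsigned 8-bit, as reported by ArcGIS (assumption)


def input_problems(band_count: int, pixel_type: str) -> List[str]:
    """Return a list of reasons the raster does not match the stated input.

    An empty list means the raster matches '8-bit, 3-band RGB imagery'.
    In ArcGIS Pro the two values could presumably be read from a raster
    object's bandCount and pixelType properties; that step is not shown.
    """
    problems = []
    if band_count != SUPPORTED_BAND_COUNT:
        problems.append(
            f"expected {SUPPORTED_BAND_COUNT} bands, found {band_count}"
        )
    if pixel_type.strip().upper() != SUPPORTED_PIXEL_TYPE:
        problems.append(
            f"expected pixel type {SUPPORTED_PIXEL_TYPE}, found {pixel_type}"
        )
    return problems


print(input_problems(3, "U8"))   # matches the stated input
print(input_problems(4, "U16"))  # e.g. 4-band, 16-bit imagery
```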
Output
Feature class containing masks of various objects in the image.
Applicable geographies
The model is expected to work globally.
Model architecture
This model is based on the open-source Grounding DINO by IDEA-Research and Segment Anything Model (SAM) by Meta. The source code of this sample deep learning package (DLPK) is available here.
Sample results
Here are a few results from the model.
Terms of Use
No special restrictions or limitations on using the item's content have been provided.
Details
Applicable: 2d
Size: 1,572.611 MB
ID: 8df3bf4167bc4c7b967f677f8b362ec3
Tags
GroundingDINO, SAM, text, prompt, zeroshot, Object Detection, segmentation, DINO, LivingAtlasDLPK
Credits (Attribution)
Meta, IDEA-Research
Comments (31)
Once the objects are detected, how do I get the feature class to have the Shape_Area and Shape_Length fields in the attribute table, as shown in the tutorial for this data (https://learn.arcgis.com/en/projects/detect-objects-with-text-sam/)?
Update to my last post. I made three changes and then it worked successfully on the tutorial raster image: restarting ArcGIS Pro, manually setting the map field of view rather than using the bookmarked Detection area, and then using the new extent setting in the environment. The restart itself may have been sufficient; I am not sure.
Hi, I experienced the same cell size issue described by another user while going through the example tutorial Detect objects with Text SAM (https://learn.arcgis.com/en/projects/detect-objects-with-text-sam/). I have downloaded and installed the Deep Learning Libraries Installer for ArcGIS Pro 3.3. The ArcGIS Pro version I am using is ArcGIS Pro 3.3.0. Here are the parameters, environment, and messages from the run in the Detect Objects tool (IA):
Detect Objects Using Deep Learning
=====================
Parameters
Input Raster Tuborg_Havn.tif
Output Detected Objects D:\arcgis_pro_examples\deep_learning_obj_det\Detected_Boats.shp
Model Definition D:\ArcGisPro_Tools\GeoAI\TextSAM.dlpk
Arguments text_prompt boat;padding 256;batch_size 4;box_threshold 0.2;text_threshold 0.2;tta_scales 1;nms_overlap 0.7
Non Maximum Suppression NMS
Confidence Score Field Confidence
Class Value Field Class
Max Overlap Ratio 0
Processing Mode PROCESS_AS_MOSAICKED_IMAGE
Output Classified Raster
Use pixel space NO_PIXELSPACE
=====================
Environments
GPU ID 0
Extent 724962.122334114 6181327.81085789 725211.77267448 6181572.98316525 PROJCS["ETRS_1989_UTM_Zone_32N",GEOGCS["GCS_ETRS_1989",DATUM["D_ETRS_1989",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",9.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]
Processor Type GPU
=====================
Messages
Start Time: Wednesday, January 22, 2025 12:09:35 AM
A raster error has occurred. The messages that follow will provide more detail.
TypeError: 'NoneType' object is not subscriptable
Working with cell size 0.200000 (less than 1.00) meter.
Failed to execute (DetectObjectsUsingDeepLearning).
Failed at Wednesday, January 22, 2025 12:11:33 AM (Elapsed Time: 1 minutes 58 seconds)
I am trying to extract data from drone imagery. It throws the following error: A raster error has occurred. The messages that follow will provide more detail. TypeError: 'tuple' object does not support item assignment Working with cell size 0.100000 (less than 1.00) meter. Failed to execute (DetectObjectsUsingDeepLearning). Could you please help me?
Hi Neha, this issue seems to be caused by an older version of the DLPK. We fixed the issue and updated the DLPK. Can you download the new version of the model, restart ArcGIS Pro, and then run the model again? If the issue still persists, can you share more information regarding the ArcGIS Pro version and data?
Could you try restarting Pro and rerunning the tool? There may be a caching-related issue. If you still face this error, could you reach out to me at ptuteja@esri.com?
Hi @mazzouzi, I installed ArcGIS Pro 3.3 with Deep Learning Libraries Installer and I am not able to reproduce the error. Is it possible for you to share the raster?
ArcGIS Pro 3.3.5, Deep Learning Libraries Installer for ArcGIS Pro 3.3
I have the same problem. With all the DLPK models I get the same error:
Detect Objects Using Deep Learning
=====================
Parameters
Input Raster c85202203256635LA930M20E08.tif
Output Detected Objects C:\Users\M’barekAzzouziAvineo\Documents\ArcGIS\Projects\Projet_Test_IA_ArcGISPro\Modele_MaskRCNN\MaskRCNN_V1_Test_20241029.gdb\TextSAM_DetectObjects3
Model Definition C:\Users\M’barekAzzouziAvineo\Downloads\TextSAM.dlpk
Arguments text_prompt panels;padding 100;batch_size 4;box_threshold 0.2;text_threshold 0.2;tta_scales 0.5;nms_overlap 0.1
Non Maximum Suppression NMS
Confidence Score Field Confidence
Class Value Field Class
Max Overlap Ratio 0.1
Processing Mode PROCESS_AS_MOSAICKED_IMAGE
Output Classified Raster
Use pixel space NO_PIXELSPACE
=====================
Environments
GPU ID 1
Cell Size 0.3
Processor Type GPU
=====================
Messages
Start Time: Tuesday, October 29, 2024 9:59:52 AM
A raster error has occurred. The messages that follow will provide more detail.
Traceback (most recent call last):
Working with cell size 0.300000 (less than 1.00) meter.
Failed to execute (DetectObjectsUsingDeepLearning).
Failed at Tuesday, October 29, 2024 10:00:20 AM (Elapsed Time: 27.50 seconds)
Hi, can you share more information regarding the ArcGIS Pro version and data? On which ArcGIS Pro version are you running this tool? Please check your raster bit depth, as the model is compatible with 8-bit unsigned imagery.
Hi, what ArcGIS Pro license level do you need to run this tool? Can it be run on a Standard license with Image Analyst and 3D Analyst?
Never mind, I have it working on a Basic ArcGIS Pro license with 3D Analyst and Image Analyst.