GroundingDINO: Bridging Language and Vision for Open-Set Object Detection

Grounding DINO can detect arbitrary objects described by human inputs such as category names or referring expressions. The key idea behind this open-set object detection is to introduce language into a closed-set detector, DINO, enabling open-set concept generalization.

You can find the official GitHub repository here:

https://github.com/IDEA-Research/GroundingDINO

Steps in this Tutorial

In this tutorial, we are going to cover:

Before you start

Install Grounding DINO 🦕

Download Grounding DINO Weights 🏋️

Download Example Data

Load Grounding DINO Model

Grounding DINO Demo

Let’s begin!

Before you start

We are using Google Colab for this demo, so if you are following along, first make sure you have access to a GPU. You can verify this with the nvidia-smi command. If you run into problems, navigate to Edit -> Notebook settings -> Hardware accelerator, set it to GPU, and then click Save.

!nvidia-smi
# ----install libraries
!pip -q install transformers scipy 
# ------install GroundingDINO, DINO weights
!git clone https://github.com/IDEA-Research/GroundingDINO.git
%cd GroundingDINO
!pip -q install -e .
%mkdir weights
%cd weights
!wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
%cd ..
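After the download finishes, it can be worth confirming that the checkpoint actually landed on disk before loading the model, since a truncated download is a common cause of cryptic checkpoint-loading errors. A minimal sketch, assuming the path from the commands above and a rough size threshold (the helper name and the 100 MB floor are our own choices, not part of the official repo):

```python
from pathlib import Path

def checkpoint_ok(path, min_mb=100):
    """Return True if the file exists and is at least `min_mb` megabytes."""
    p = Path(path)
    return p.is_file() and p.stat().st_size >= min_mb * 1024 * 1024

# the Swin-T checkpoint is a few hundred MB, so a tiny file signals a failed download
ready = checkpoint_ok("weights/groundingdino_swint_ogc.pth")
```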
Installation of GroundingDINO and DINO Weights:

  • The code clones the GitHub repository of the GroundingDINO machine learning project using git clone.
  • It then navigates into the “GroundingDINO” directory using %cd GroundingDINO.
  • It installs the project as an editable Python package using pip -q install -e .. This allows the code in this repository to be imported and used in other Python scripts or notebooks.
  • A “weights” directory is created using %mkdir weights; this is where the pre-trained model weights will be stored.
  • The code moves into the “weights” directory using %cd weights.
  • The pre-trained model weights are downloaded using wget -q.
  • The weight file is “groundingdino_swint_ogc.pth”; these weights are loaded by the GroundingDINO model later in the tutorial.
The code above sets up the environment by installing the necessary libraries and downloading the pre-trained model weights for the GroundingDINO project, which the rest of this tutorial depends on.

Note that this code is meant to be executed in a Python environment, such as a Jupyter or Colab notebook, that supports magic commands (prefixed with %) and shell commands (prefixed with !).
This is the source image we are using:

# ----GroundingDINO
from groundingdino.util.inference import load_model, load_image, predict, annotate
from groundingdino.util import box_ops  # package name is lowercase after pip install -e .
# ----Extra Libraries
from PIL import Image
import torch
import cv2
import matplotlib.pyplot as plt
import numpy as np

device = "cuda"
img_path = '/content/image2.jpg'
src, img = load_image(img_path)
# ----Grounding DINO
groundingdino_model = load_model("groundingdino/config/GroundingDINO_SwinT_OGC.py", "weights/groundingdino_swint_ogc.pth", device=device)
# ---- the text prompt identifies the target objects in the image
TEXT_PROMPT = "train and car and person"
BOX_THRESHOLD = 0.3    # minimum box confidence for a detection to be kept
TEXT_THRESHOLD = 0.25  # minimum token score for a phrase to be attached to a box

boxes, logits, phrases = predict(
    model=groundingdino_model,
    image=img,
    caption=TEXT_PROMPT,
    box_threshold=BOX_THRESHOLD,
    text_threshold=TEXT_THRESHOLD
)
# annotate() returns a BGR image; reverse the channel order to RGB for matplotlib
img_annotated = annotate(image_source=src, boxes=boxes, logits=logits, phrases=phrases)[..., ::-1]

fig, axes = plt.subplots(1, 2, figsize=(30, 20))
axes[0].imshow(src)
axes[0].set_title("Source Image", fontsize=30)
axes[0].axis('off')
axes[1].imshow(img_annotated)
axes[1].set_title("Annotated Image with Text", fontsize=30)
axes[1].axis('off')

plt.show()
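The boxes returned by predict() are in normalized (cx, cy, w, h) format, which is why box_ops is imported above: box_ops.box_cxcywh_to_xyxy converts them to corner coordinates, which you then scale by the image size to get pixel boxes. A pure-Python sketch of that conversion (the helper below only illustrates what the library call does; it is not the library function itself):

```python
def cxcywh_to_xyxy(box, width, height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2) corners."""
    cx, cy, w, h = box
    x1 = (cx - w / 2) * width
    y1 = (cy - h / 2) * height
    x2 = (cx + w / 2) * width
    y2 = (cy + h / 2) * height
    return x1, y1, x2, y2

# a box centred in a 640x480 image, covering 20% of the width and 40% of the height
corners = cxcywh_to_xyxy((0.5, 0.5, 0.2, 0.4), width=640, height=480)
# → approximately (256.0, 144.0, 384.0, 336.0)
```

The same scaling is what you would apply to each row of the `boxes` tensor before cropping or drawing with OpenCV.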
