In this blog post, we'll learn how to use the RetinaNet object detection framework to detect and localize a logo in images, and how to build a Python Flask REST API app on SAP Cloud Foundry. The result will be visualized in an HTML file.


Prerequisites



High Level Steps


Here are the high-level steps we are going to perform: annotate the images, split them into train and test sets, build the dataset CSV files, train RetinaNet, export and evaluate the model, build the Python Flask REST API, and visualize the result in HTML.


Image Logo Dataset


I have prepared a dataset of 185 logo image files with only one class, since we will detect only one logo in the images. The images were collected manually from Google search results.
Extract the dataset to a folder.

Annotate Images


Install and open the LabelImg tool.

  1. Click Open Dir and select the logo dataset folder. The list of images will appear on the File List.

  2. Change the save format to PascalVOC.

  3. Find the logo in the image and draw the bounding box.

  4. Create a new box label "pfe" if it doesn't exist.

  5. Save the label.

  6. Repeat steps 3 to 5 for all images in the File List.


Once you save the label, an XML file is created containing the bounding box information, bndbox. We will extract this information and convert it into the format required by Keras-RetinaNet, as sketched after the XML example below.
<annotation>
<folder>images</folder>
<filename>00000003.jpg</filename>
<path>C:\FD\Py\DownloadImg\images\00000003.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>0</width>
<height>0</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>pfe</name>
<pose>Unspecified</pose>
<truncated>1</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>1</xmin>
<ymin>1</ymin>
<xmax>1200</xmax>
<ymax>693</ymax>
</bndbox>
</object>
</annotation>
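As a rough illustration, the bounding box can be pulled out of each annotation file with a few lines of Python. This is only a sketch; the folder and file names are assumptions, not the exact extraction code used later.

import xml.etree.ElementTree as ET

def parse_annotation(xml_path):
    # Parse a PascalVOC annotation produced by LabelImg
    root = ET.parse(xml_path).getroot()
    filename = root.find("filename").text
    boxes = []
    for obj in root.findall("object"):
        name = obj.find("name").text            # class label, e.g. "pfe"
        bndbox = obj.find("bndbox")
        xmin = int(bndbox.find("xmin").text)
        ymin = int(bndbox.find("ymin").text)
        xmax = int(bndbox.find("xmax").text)
        ymax = int(bndbox.find("ymax").text)
        boxes.append((filename, xmin, ymin, xmax, ymax, name))
    return boxes

print(parse_annotation("annotations/00000003.xml"))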

Split into Train and Test


We have 185 data points, so we shuffle them and split 80% for training and 20% for testing. I have generated train.txt and test.txt, which contain the unique image IDs derived from the image filenames. Take a look at those files.
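A minimal sketch of such a split, assuming the image IDs are simply the filenames without the .jpg extension (folder and file names here are illustrative):

import os
import random

image_dir = "images"                      # folder with the 185 logo images
ids = [os.path.splitext(f)[0] for f in os.listdir(image_dir) if f.endswith(".jpg")]

random.seed(42)                           # reproducible shuffle
random.shuffle(ids)

split = int(0.8 * len(ids))               # 80% train / 20% test
with open("train.txt", "w") as f:
    f.write("\n".join(ids[:split]))
with open("test.txt", "w") as f:
    f.write("\n".join(ids[split:]))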

Build Dataset


We will also need to prepare three CSV files that are required by Keras-RetinaNet framework library.

  • retinanet_classes.csv
    Contains the class name to integer ID mapping:
    pfe,0


  • retinanet_test.csv and retinanet_train.csv
    Contain the image path, the bounding box annotation and the human-readable class label:
    /content/keras-retinanet/images/00000096.jpg,37,21,380,224,pfe
    /content/keras-retinanet/images/00000596.jpg,179,20,578,317,pfe
    /content/keras-retinanet/images/00000250.jpg,27,11,207,119,pfe
    ...

    The first entry is the path to the image, followed by the bounding box coordinates in the order start x, start y, end x, end y, and the last entry is the human-readable class label.


Run the Python script below to generate those CSV files:
python build_logos.py
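I won't reproduce the whole script here, but conceptually it combines the ID lists with the parsed annotations and writes one CSV row per bounding box. The following is a hedged, self-contained sketch of that core loop (paths and the helper name are assumptions, not the actual build_logos.py):

import csv
import xml.etree.ElementTree as ET

def write_split(split_file, csv_file, image_dir="/content/keras-retinanet/images"):
    # One CSV row per bounding box: path,xmin,ymin,xmax,ymax,class
    with open(split_file) as ids, open(csv_file, "w", newline="") as out:
        writer = csv.writer(out)
        for image_id in ids.read().split():
            root = ET.parse("annotations/%s.xml" % image_id).getroot()
            for obj in root.findall("object"):
                box = obj.find("bndbox")
                writer.writerow(["%s/%s.jpg" % (image_dir, image_id),
                                 box.find("xmin").text, box.find("ymin").text,
                                 box.find("xmax").text, box.find("ymax").text,
                                 obj.find("name").text])

write_split("train.txt", "retinanet_train.csv")
write_split("test.txt", "retinanet_test.csv")

# Class name to integer ID mapping
with open("retinanet_classes.csv", "w") as f:
    f.write("pfe,0\n")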


Train RetinaNet to Detect Logos


Now we have all the required files to perform the training. We will use Google Colab with GPU.

Open Google Colab and, in the notebook settings, set the runtime type to Python 3 and the hardware accelerator to GPU.



 

I have prepared a Python Jupyter notebook for this purpose. Run the notebook and let the network train for a total of 50 epochs.
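For reference, the training in the notebook boils down to invoking the keras-retinanet training script on the two CSV files. A hedged example of what that call can look like (the flag values here are illustrative and may differ from the notebook):

retinanet-train --epochs 50 --steps 500 csv retinanet_train.csv retinanet_classes.csv --val-annotations retinanet_test.csv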


Export Model


Once the training is completed, we need to export the model before we can evaluate it or use it to predict objects in our own images.

Download resnet50_csv_50.h5 from Google Colab.





Run the following command to convert the training snapshot into an inference model:
retinanet-convert-model resnet50_csv_50.h5 output.h5

This gives us the ready-to-use inference model output.h5.
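As a quick sanity check, the converted file can be loaded with the keras-retinanet helper, roughly like this (a sketch, assuming the standard keras-retinanet inference API):

from keras_retinanet import models

# Load the converted file as an inference model
model = models.load_model("output.h5", backbone_name="resnet50")
print(model.outputs)   # expect three outputs: boxes, scores, labels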

Evaluate Model


To evaluate the model on a testing set, use the following command:
retinanet-evaluate csv retinanet_test.csv retinanet_classes.csv output.h5



From the evaluation we obtain a mean average precision (mAP) of 96%.

Python Flask REST API 


We will create a Python Flask app to detect logos in images and deploy it to SAP Cloud Foundry.
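A minimal sketch of such an endpoint is shown below. The route name matches the test URL later in this post, but the score threshold, port, and overall structure are assumptions for illustration, not the exact app that gets deployed.

import io
import numpy as np
import requests
from PIL import Image
from flask import Flask, jsonify, request
from keras_retinanet import models
from keras_retinanet.utils.image import preprocess_image, resize_image

app = Flask(__name__)
model = models.load_model("output.h5", backbone_name="resnet50")
labels_to_names = {0: "pfe"}

@app.route("/img")
def detect():
    # Download the image passed via the ?url= query parameter and convert to BGR
    url = request.args.get("url")
    image = np.asarray(Image.open(io.BytesIO(requests.get(url).content)).convert("RGB"))[:, :, ::-1]

    image = preprocess_image(image)
    image, scale = resize_image(image)
    boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
    boxes /= scale                         # map boxes back to the original image size

    results = []
    for box, score, label in zip(boxes[0], scores[0], labels[0]):
        if score < 0.5:                    # results are sorted by score; stop at low scores
            break
        results.append(["%s: %.2f" % (labels_to_names[label], score)]
                       + [str(int(v)) for v in box])
    return jsonify(results)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)     # assumed port for local testing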



To test the app with an image, go to the SAP Cloud Foundry app URL and pass the link to the image file in the url parameter:
https://retinanet_tf.cfapps.eu10.hana.ondemand.com/img?url=https://c8.alamy.com/comp/BX8FGF/differen...

The service will return a JSON response that includes the score and bounding box coordinates for each identified object:
[
[
"pfe: 1.00",
"87",
"106",
"190",
"178"
],
[
"pfe: 1.00",
"219",
"480",
"295",
"530"
],
[
"pfe: 1.00",
"397",
"547",
"476",
"600"
],
[
"pfe: 1.00",
"1060",
"67",
"1129",
"172"
],
[
"pfe: 0.84",
"770",
"710",
"895",
"782"
],
[
"pfe: 0.58",
"585",
"350",
"703",
"421"
]
]

Visualize in HTML


To visualize the result, create a simple HTML page and populate the bounding boxes, classes and scores.
var name = "https://c8.alamy.com/comp/BX8FGF/different-strengths-of-atorvastatin-trade-name-lipitor-made-by-pfizer-BX8FGF.jpg";
var response = {
"detection_boxes": [
[
87,
106,
190,
178
],
[
219,
480,
295,
530
],
[
397,
547,
476,
600
],
[
1060,
67,
1129,
172
],
[
770,
710,
895,
782
],
[
585,
350,
703,
421
]
],
"detection_classes": [
"pfe",
"pfe",
"pfe",
"pfe",
"pfe",
"pfe"
],
"detection_scores": [
1.00,
1.00,
1.00,
1.00,
0.84,
0.58
]
};



Full source code can be found on my Git repo and the generated model can be found here.
