Do you know Cozmo? The friendly robot from Anki? Well...here he is...
Cozmo is a programmable robot that has many features...and one of those is a camera...so you can have Cozmo take a picture of something...and then do something with that picture...
To code for Cozmo you need to use Python...actually...Python 3 😉
For this blog, we're going to need a couple of things...so let's install them...
pip3 install 'cozmo[camera]'
This will install the Cozmo SDK...and you will need to install the Cozmo app on your phone as well...
If you have the SDK installed already, you may want to upgrade it because if you don't have the latest version it might not work...
If by any chance, something is not available on your system, simply remove it from the list and try again...unless you're like me and want to spend hours trying to get everything...
Now, we need to download the OpenCV source code so we can build it...from the source...
Then, we need to download the contributions because there are some things not bundled in OpenCV by default...and you might need them for any other project...
Pay extra attention to passing the correct path to your opencv_contrib folder...it's better to pass the full path to avoid errors...
And yes...that's a pretty long command for a build...and it took me a long time to make it work...as you need to figure out all the parameters...
Once we're done, we need to make it...as cmake will prepare the recipe...
make -j2
If there's any mistake, simply do this...
make clean
make
Then, we can finally install OpenCV by doing this...
sudo make install
sudo ldconfig
To test that it's working properly...simply do this...
python3
>>> import cv2
If you don't have any errors...then we're good to go -;)
That was quite a lot of work...anyway...we need an extra tool to make sure our image gets nicely processed...
Download textcleaner and put it in the same folder as your Python script...
And...just in case you're wondering...yes...we're going to have Cozmo take a picture...we're going to process it...use SAP Leonardo's OCR API and then have Cozmo read it back to us...cool, huh?
SAP Leonardo's OCR API is still on version 2Alpha1...but regardless of that...it works amazingly well -;)
Although keep in mind that the result is not always accurate...that's because of the lighting, the position of the image, your handwriting and the fact that the OCR API is still in Alpha...
Ok...so first things first...we need a white board...
And yes...my handwriting is far from being good... -:(
Now, let's jump into the source code...
import cozmo
from cozmo.util import degrees
import PIL
import cv2
import numpy as np
import os
import requests
import json
import re
import time
import pygame
import _thread
We're going to use threads, as we need one window where we can see what Cozmo is looking at and another one with Pygame where we can press "Enter" as the command to have Cozmo take a picture.
Basically, when we run the application, Cozmo will move his head and get into picture mode...then, if we press "Enter" (on the terminal screen) he will take a picture and send it to our OpenCV processing function.
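To make the threading idea a bit more concrete, here is a minimal sketch...it is not the actual script. It assumes the standard `threading` module instead of `_thread`, uses plain `input()` as a stand-in for the Pygame key handling, and the names `wait_for_enter`, `run_preview_loop`, `show_frame` and `take_picture` are made up for illustration...

```python
import threading


def wait_for_enter(capture_requested, read_line=input):
    """Block until a line is read, then signal that a picture was requested."""
    read_line()
    capture_requested.set()


def run_preview_loop(show_frame, take_picture, capture_requested, max_frames=None):
    """Keep showing frames until a capture is requested, then take the picture.

    Returns whatever take_picture() returns, or None if max_frames
    frames were shown before any capture was requested.
    """
    frames_shown = 0
    while not capture_requested.is_set():
        if max_frames is not None and frames_shown >= max_frames:
            return None
        show_frame()
        frames_shown += 1
    return take_picture()


def main():
    # Hypothetical wiring: the listener thread waits for Enter while the
    # main thread keeps the preview running.
    capture_requested = threading.Event()
    listener = threading.Thread(
        target=wait_for_enter, args=(capture_requested,), daemon=True
    )
    listener.start()
    return run_preview_loop(lambda: None, lambda: "picture", capture_requested)
```

In a real program, `show_frame` would draw Cozmo's latest camera image and `take_picture` would hand that image over to the processing function...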
This function will simply grab the image, scale it, make it grayscale and do a GaussianBlur to remove noise and reduce detail. Then we're going to apply denoising to get rid of dust and fireflies...apply a threshold to separate the white and black pixels, and apply a couple more blurs...
Finally, we're going to call textcleaner to further remove noise and make the image cleaner...
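Calling textcleaner from Python could be sketched like this...it assumes the script sits next to your Python file as `./textcleaner`, that ImageMagick is installed, and the option values are just an illustrative starting point that you will probably need to tune. The helper names are made up...

```python
import subprocess


def build_textcleaner_command(src, dst, script="./textcleaner"):
    """Build the argument list for a textcleaner run (options are illustrative)."""
    return [
        script,
        "-g",              # convert to grayscale
        "-e", "stretch",   # enhance contrast by stretching
        "-f", "25",        # filter size used to estimate the background
        "-o", "10",        # offset used when removing the background
        "-s", "1",         # sharpening amount
        src,
        dst,
    ]


def run_textcleaner(src, dst):
    """Run textcleaner and raise CalledProcessError if it fails."""
    subprocess.run(build_textcleaner_command(src, dst), check=True)
```

Passing the arguments as a list avoids shell quoting problems if the file names contain spaces...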
So, here is the original picture taken by Cozmo...
This is the picture after our OpenCV post-processing...
And finally, this is our image after using textcleaner...
Finally, once we have the image the way we wanted, we can call the OCR API which is pretty straightforward...
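A call like that could be sketched as follows...this is not the official client. The `APIKey` header, the `files` form field and the `predictions` key in the response are assumptions about how the service behaves, and the URL and API key are left as parameters that you need to fill in from your own account...

```python
import json


def extract_text(response_body):
    """Pull the first recognized text out of a JSON body.

    Assumes a body shaped like {"predictions": ["some text"]}.
    """
    payload = json.loads(response_body)
    predictions = payload.get("predictions", [])
    return predictions[0] if predictions else ""


def call_ocr(image_path, url, api_key):
    """Upload an image to an OCR endpoint and return the recognized text."""
    # Third-party package; imported here so extract_text has no dependencies.
    import requests

    headers = {"APIKey": api_key, "Accept": "application/json"}
    with open(image_path, "rb") as image_file:
        response = requests.post(url, files={"files": image_file},
                                 headers=headers, timeout=30)
    # Fail loudly on HTTP errors instead of parsing an error page.
    response.raise_for_status()
    return extract_text(response.text)
```

Keeping the parsing in its own function makes it easy to try it out with a saved response, without hitting the API every time...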
Once we have the response back from the API, we can do some Regular Expressions cleanup just to make sure some characters don't get wrongly recognized...
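The cleanup depends a lot on what your OCR results look like...so take this as a sketch. The function name `clean_ocr_text` and the specific rules (turning a lone `|` into `I`, dropping unexpected symbols, collapsing whitespace) are hypothetical examples, not the rules from the original script...

```python
import re


def clean_ocr_text(raw):
    """Tidy OCR output so it can be read out loud."""
    # Hypothetical fix: a vertical bar is often a misread capital I.
    text = raw.replace("|", "I")
    # Replace anything that is not a letter, digit, space or basic
    # punctuation with a space (this includes newlines and tabs).
    text = re.sub(r"[^A-Za-z0-9 .,!?']", " ", text)
    # Collapse runs of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


print(clean_ocr_text("Hello\n  W0rld ~~ | am\tCozmo!"))
# prints: Hello W0rld I am Cozmo!
```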
Finally, we can have Cozmo read the message out loud -;) And just for demonstration purposes...
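The read-back step could be sketched like this, using the SDK's `say_text` action...the helper `read_aloud` and the `main` wrapper are made-up names, and `main` only works with the Cozmo SDK installed and a robot connected...

```python
def read_aloud(robot, text):
    """Have the robot speak the text; skip empty results.

    Returns True if something was spoken, False otherwise.
    """
    if not text.strip():
        return False
    # say_text returns an action; wait until Cozmo finishes talking.
    robot.say_text(text).wait_for_completed()
    return True


def main():
    # Requires the Cozmo SDK and a connected robot, so the import
    # lives here instead of at module level.
    import cozmo

    def program(robot):
        read_aloud(robot, "Hello from the white board")

    cozmo.run_program(program)
```

Skipping empty text avoids having Cozmo stay silent for no visible reason when the OCR returns nothing...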
Here, I was lucky enough that the lighting and everything was perfectly set up...so it was a pretty clean response...further tests were pretty bad -:( But again...it's important to have good lighting...
Of course...you want to see a video of the process in action, right? Well...funny enough...my first try was perfect! Even better than this one...but I didn't shoot the video -:( Further tries were pretty crappy until I could get something acceptable...and this is what you're going to watch now...the sun coming through the window didn't help me...but it's pretty good anyway...