Someone asked me how to scrape images from Google. I took it as a challenge.
Note that I did an image search and took the Base64 versions of the images from the search page, which is how Google displays them. These are small, maybe 300 pixels square. If you wanted larger images, you'd extend the automation to click on each image to open the side panel, and use the right-click menu to save/download the image (my next challenge 😺).
I then created a process with a trigger form with 2 inputs: the keyword for the Google search and the local path where the pictures are stored. But I could almost as easily create an Automation Launcher and User Task so I could trigger it as an attended automation whenever I wanted.
The above makes it look like I knew what I was doing. But it took a little while, especially the following parts:
I made a recording by opening Google in a new browser, selecting Create → Application, and choosing Recorder (not manual capture). It created 2 screens: the Google home page and the results page.
On the results page, I created 2 captures: the main results page, and the results page for images.
I also declared some elements I needed, like the search box and button. Most importantly, I declared the list of images, using the class as a criterion and setting the element to be a collection.
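For readers who think in code, here is a rough sketch of what "declare the images by class, as a collection" amounts to. It is not how SAP Build Process Automation does it internally; it is just the same idea in plain Python using the standard library's HTML parser. The function name `collect_image_sources` and the class name `thumb` in the usage note are made up for illustration; the real class name is whatever you see on the image elements when you capture the page.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects the src of every <img> that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class  # hypothetical class used as the criterion
        self.sources = []           # the "collection" of matching images

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An element can have several classes, so split before comparing.
        classes = (attributes.get("class") or "").split()
        if self.css_class in classes and attributes.get("src"):
            self.sources.append(attributes["src"])


def collect_image_sources(html, css_class):
    """Return the src values of all <img> elements with the given class."""
    collector = ImageCollector(css_class)
    collector.feed(html)
    collector.close()
    return collector.sources
```

Calling `collect_image_sources(page_html, "thumb")` would return a list of `src` values, which on the Google image results page are mostly Base64 data URIs.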
The automation is created for you, based on the recording, but I found there were all kinds of extra artifacts that I needed to delete or combine. Not a big deal, just something you have to do.
Then I had to add the part that iterates through the images, decodes the Base64 strings, and saves them to files, with a condition to take only the first 10 images, all using simple activities. I also created an input parameter for the keyword, instead of the hard-coded "cat" I used in the recording.
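If it helps to see that loop written out, here is a minimal Python sketch of the same logic: iterate, decode, save, stop at 10. This is not the automation itself, which is built from activities in the designer. It assumes each image source is a data URI of the form `data:image/<type>;base64,<data>`; the function name `save_base64_images` and the file naming scheme are my own choices for illustration.

```python
import base64
import re
from pathlib import Path

# Assumed shape of each source: "data:image/jpeg;base64,/9j/4AAQ..."
DATA_URI = re.compile(
    r"^data:image/(?P<ext>[a-zA-Z0-9.+-]+);base64,(?P<data>.+)$", re.DOTALL
)


def save_base64_images(sources, folder, keyword, limit=10):
    """Decode up to `limit` Base64 data URIs and write them into `folder`."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for src in sources:
        if len(saved) >= limit:
            break  # the "first 10 images" condition
        match = DATA_URI.match(src or "")
        if match is None:
            continue  # skip plain URLs and empty entries
        ext = "jpg" if match["ext"] == "jpeg" else match["ext"]
        # Hypothetical naming scheme: <keyword>_<n>.<ext>
        path = out_dir / f"{keyword}_{len(saved) + 1}.{ext}"
        path.write_bytes(base64.b64decode(match["data"]))
        saved.append(path)
    return saved
```

The two arguments `keyword` and `folder` correspond to the two inputs on the trigger form. Counting saved files rather than loop iterations means entries that are not Base64 do not use up one of the 10 slots.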
I also created the artifacts for enabling this as an attended automation:
The automation launcher, which you would have to register in the Control Tower for your environment.
And the user task, which you would have to add to your automation. Here is an example from my SAP CodeJam demo of sending emails using an automation.