Introduction
Have you ever wanted to run a small piece of Python code in a notebook to check whether a solution is viable, for example to run a few sample data science tests using ML libraries? How would you go about doing this locally on your system (assuming you are not already subscribed to a hosted Jupyter notebook service, which generally incurs some cost)?
These are the steps that you would probably follow:
- Google "Running Python notebooks" and read through a few posts on how to do it.
- You will see that there are two main routes you can take: either install a Python distribution like Anaconda or install a bare-bones Python runtime.
- But this only provides the basic Python runtime, and you don't have that particular Python library that you really love.
- You do some more research and see that it is possible to manually install your beloved library via "pip".
- You are happy to find this solution and proceed without expecting any further hiccups. But that's where you are wrong.
- As you start installing these libraries, you see build-time or run-time errors creeping in, like version mismatches, library dependency conflicts, and the list goes on.
- As you spend more and more time resolving these issues one by one, frustration starts to kick in and you think to yourself, "Do I really have to go through all these hurdles just to run one data science test? If I spend all this time setting up my environment, when will I ever start my POC and deliver the results?"
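The version mismatches described in the steps above are easier to diagnose once you can see what is actually installed. As a minimal sketch (the function name and the package names are only examples, not something from this blog's image), the standard library's importlib.metadata, available from Python 3.8, can report the installed version of each package:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Example package names only; replace them with the libraries you care about.
    for name, version in installed_versions(["numpy", "pandas", "scikit-learn"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this before and after a "pip install" gives a quick picture of what changed.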
I believe everyone goes through similar hurdles. The main reason I decided to write this blog is that I have faced this situation numerous times in the past: I had to replace my corporate laptop once, re-image my Windows OS, and so on, and each time I was left with nothing but a fresh Windows OS without any apps installed.
Then, as expected, I had to repeat steps 2-7 mentioned above. What a hassle!
I was looking for ways to spend less time on environment setup, application installation, etc., which are not productive tasks in themselves but rather prerequisites to starting your actual "work".
Like a fated encounter, I happened to come across the concept of containerization and Docker containers. This opened my mind to a slew of ideas that could potentially help resolve the user/developer pain points I mentioned above.
Now, let's get to the topic at hand: how to leverage the power of containers to ease your daily work, in this case to run a Jupyter notebook preloaded with all your favorite ML libraries with zero or minimal effort.
Containers to the rescue
Our problem can be solved with the help of containers. In this blog, we will concentrate on Docker containers.
One of the major reasons I prefer Docker is the simplicity with which it can be installed, be it on Linux, macOS or Windows.
To install Docker, you can follow the official links below:
Windows :
https://docs.docker.com/v17.09/docker-for-windows/install/
Mac :
https://docs.docker.com/v17.09/docker-for-mac/install/
Ubuntu :
https://docs.docker.com/v17.09/engine/installation/linux/docker-ce/ubuntu/
Note: If you are running Docker for Windows, you need to set the container type to "Linux containers" during installation, as the images I use for reference in this blog are all Linux images.
Once the installation is complete (a system restart is mandatory for the installation to succeed), make sure that the Docker daemon is running. To validate the installation, open a Command Prompt (cmd) and run the following command:
docker run --rm hello-world
The output should include something like this:
"Hello from Docker!
This message shows that your installation appears to be working correctly."
Hurray! Your Docker installation is complete and successful. Now you may be thinking, "This was way too easy... How can an installation be performed so easily? There must be some catch." Well, the good news is that there is no catch. This is all it takes to install Docker.
Now that Docker is running, let us pull (download) our prebuilt Jupyter image. To download the image to your local system, run the following command:
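If you would rather perform this check from a script than by reading the output by eye, here is a minimal Python sketch. The function name docker_ready is hypothetical, and the sketch assumes that "docker info" exits with a non-zero status when the daemon cannot be reached, which is worth confirming on your Docker version.

```python
import shutil
import subprocess


def docker_ready(executable="docker", timeout=30):
    """Return True if the CLI is on PATH and `<executable> info` exits with status 0."""
    # Step 1: is the command-line client installed and visible on PATH?
    if shutil.which(executable) is None:
        return False
    # Step 2: ask the client to talk to the daemon.
    # Assumption: a non-zero exit status means the daemon is unreachable.
    try:
        result = subprocess.run(
            [executable, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Calling docker_ready() returns a plain True or False, so it can guard the later steps of a setup script.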
docker pull ashomega/jupyter_all_libs
This is a Jupyter image that has already been built and is ready to be run as a container. The above command downloads it from the Docker Hub repository "ashomega/jupyter_all_libs".
Once the pull is complete, you can run the Docker container with this command:
docker run --rm -p 8888:8888 ashomega/jupyter_all_libs
In case you want to mount a shared directory on your local system into the container, you can use the -v option as shown below:
docker run --rm -p 8888:8888 -v C:\Shared_directory:/home/docker/images/jupyter ashomega/jupyter_all_libs
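If you start the container from a script rather than typing the command, the same invocation can be assembled programmatically. The following is a minimal Python sketch and not part of the original image or repository: the function name build_run_command and its parameters are hypothetical. It resolves the host directory to an absolute path because Docker treats a non-absolute source in -v as a named volume rather than a bind mount.

```python
from pathlib import Path

IMAGE = "ashomega/jupyter_all_libs"
# Container-side mount point, taken from the command shown in this blog.
CONTAINER_DIR = "/home/docker/images/jupyter"


def build_run_command(host_dir=None, host_port=8888):
    """Build the `docker run` argument list, optionally with a shared directory."""
    cmd = ["docker", "run", "--rm", "-p", f"{host_port}:8888"]
    if host_dir is not None:
        # Use an absolute host path so Docker creates a bind mount.
        cmd += ["-v", f"{Path(host_dir).resolve()}:{CONTAINER_DIR}"]
    cmd.append(IMAGE)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_run_command("notebooks")))
```

To actually launch the container, the returned list can be passed to subprocess.run(..., check=True).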
Once this step is complete, you can launch your Jupyter Notebook by opening the following URL in your browser and start your work.
http://localhost:8888
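The server inside the container can take a few seconds to start, so the URL may not respond immediately. As a rough sketch (the function name wait_for_port and its defaults are hypothetical), the following Python snippet polls the port until it accepts a TCP connection. It only confirms that something is listening on the port, not that Jupyter has finished loading.

```python
import socket
import time


def wait_for_port(host="localhost", port=8888, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a server is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, a script could call wait_for_port() after starting the container and open the browser only when it returns True.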
Conclusion
You can see above how easily a Jupyter notebook environment can be set up with the help of containerization.
The original Docker image can be found at the URL:
https://hub.docker.com/r/ashomega/jupyter_all_libs/
GitHub link to the Dockerfile source:
https://github.com/AshOmega/Jupyter_notebook
SSH : git@github.com:AshOmega/Jupyter_notebook.git
Feel free to provide your comments/feedback on this blog.
🙂