How to Install Libraries Permanently in Google Colab?
--
Many developers use Google Colab to run prototypes quickly, since it provides access to GPUs such as the T4 and P100. But sometimes your program requires libraries, or specific versions of libraries, that are not preinstalled in Colab. Installing these dependencies on every run of a notebook is tedious and time-consuming, because installed dependencies are no longer available once the runtime disconnects.
Recently I got stuck on a similar problem. I had to install a few large libraries to run a program in Colab, and the installation took too long. I wanted to run my program many times, so I needed to avoid reinstalling the libraries on each run. That is not possible without installing the dependencies permanently. So, in this blog, I will describe how I successfully installed all the dependencies permanently in Google Colab.
First, mount Google Drive using the following two lines of code.
from google.colab import drive
drive.mount("/content/drive")
After successfully mounting Google Drive, let's create a virtual environment using the virtualenv library. One thing to keep in mind: create the virtual environment inside the Google Drive you mounted above, so that it is still there after the runtime disconnects.
!virtualenv /content/drive/MyDrive/colab_env
Here you can see that a virtual environment named colab_env has been created in Google Drive.
Now let's install a library named pypdf in the virtual environment colab_env. To install a library in the virtual environment, we should activate the environment first and install the library in the same command of the same cell.
!source /content/drive/MyDrive/colab_env/bin/activate; pip install Pypdf
In the above line of code, source /content/drive/MyDrive/colab_env/bin/activate activates our environment colab_env, and pip install Pypdf installs the pypdf library within the environment.
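The activation and the install have to share one line because each ! command in a notebook cell runs in its own shell process, so whatever source sets up is gone by the time the next ! line starts. The sketch below imitates that behaviour outside Colab with Python's subprocess module. The helper run_in_fresh_shell and the variable COLAB_ENV_DEMO are made up for the illustration, and an exported variable stands in for the activated environment.

```python
import subprocess

def run_in_fresh_shell(command: str) -> str:
    """Run a command in a new shell process and return its stripped stdout."""
    result = subprocess.run(["sh", "-c", command], capture_output=True, text=True)
    return result.stdout.strip()

# Two separate shells: the variable exported in the first one
# no longer exists when the second one starts.
run_in_fresh_shell("export COLAB_ENV_DEMO=active")
separate = run_in_fresh_shell('echo "${COLAB_ENV_DEMO:-unset}"')

# One shell, two commands joined by ';': the variable is still set
# when the second command runs.
combined = run_in_fresh_shell('export COLAB_ENV_DEMO=active; echo "${COLAB_ENV_DEMO:-unset}"')

print("separate shells:", separate)
print("same shell:", combined)
```

Running pip on a separate ! line would, by the same logic, install into the Colab runtime rather than into colab_env.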
from pypdf import PdfReader
reader = PdfReader("/content/Data science journey 3.pdf")
number_of_pages = len(reader.pages)
print(number_of_pages)
Here, I tried to use our newly installed pypdf library (a free and open-source pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files). But the above lines of code throw an error: ModuleNotFoundError: No module named 'pypdf'.
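If you would rather check for the package without triggering the exception, the standard library's importlib.util.find_spec reports whether the current interpreter can locate a module. This is only a small diagnostic sketch, and is_importable is a hypothetical helper name, not part of Colab or pypdf.

```python
import importlib.util

def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None

# json ships with Python, so the interpreter always finds it.
print("json:", is_importable("json"))

# In the situation described above, pypdf would not be found, because
# it lives in the virtual environment and not in the Colab runtime.
print("pypdf:", is_importable("pypdf"))
```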
So, why can't we import our newly installed library pypdf? We installed pypdf in our virtual environment, but the import statement only searches the paths known to the Python interpreter of the Colab runtime. For Python to find the library installed in the virtual environment, we should add the path of the virtual environment's site-packages to the Colab system path (sys.path).
import sys
sys.path.append("/content/drive/MyDrive/colab_env/lib/python3.8/site-packages")
The above lines of code add the path of the virtual environment's packages to the system path. Now, let's run the following script again to test whether our newly installed dependency works.
from pypdf import PdfReader
reader = PdfReader("/content/Data science journey 3.pdf")
number_of_pages = len(reader.pages)
print(number_of_pages)
Wow, we fixed it. Now it works as expected. PdfReader reads the PDF, and the length of its .pages attribute gives the number of pages. The PDF I tested has 41 pages.
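To see why appending one directory to sys.path was enough, here is a self-contained sketch that reproduces the same before-and-after behaviour without Colab or Drive. A temporary directory stands in for the environment's site-packages, and demo_pkg_colab is a made-up one-line module created just for this illustration.

```python
import importlib
import os
import sys
import tempfile

# A throwaway directory standing in for the environment's site-packages.
fake_site_packages = tempfile.mkdtemp()
with open(os.path.join(fake_site_packages, "demo_pkg_colab.py"), "w") as f:
    f.write("VALUE = 41\n")

# Before the directory is on sys.path, the import fails.
try:
    import demo_pkg_colab
    found_before = True
except ModuleNotFoundError:
    found_before = False

# After appending the directory, the same import succeeds.
sys.path.append(fake_site_packages)
importlib.invalidate_caches()
import demo_pkg_colab

found_after = demo_pkg_colab.VALUE == 41
print("importable before append:", found_before)
print("importable after append:", found_after)
```

Since the directory is appended at the end of sys.path, packages already present in the runtime are found first if the same name exists in both places.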
So far I have tested the installation in the first run only, but my requirement is to avoid installing the dependencies in future runs. To use the packages previously installed in the virtual environment colab_env, you must mount your drive and add the path of the colab_env site-packages to the Colab system path using sys.path.append("/content/drive/MyDrive/colab_env/lib/python3.8/site-packages"). The python3.8 part of the path must match the Python version the environment was created with. After adding the path, just import and use the packages that were installed in the virtual environment colab_env.
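For later sessions, the path step can be wrapped in a small helper so that it is safe to re-run and fails loudly if the folder is missing. This is a sketch under two assumptions: the environment uses the usual lib/pythonX.Y/site-packages layout that virtualenv creates on Linux, and the current runtime has the same Python minor version as the one that created colab_env. The name add_env_site_packages is hypothetical, and Drive must already be mounted before the helper is called.

```python
import os
import sys

def add_env_site_packages(env_dir: str) -> str:
    """Append env_dir's site-packages to sys.path (once) and return the path."""
    # Build the version folder name from the running interpreter (assumption:
    # the environment was created with this same Python minor version).
    version_dir = f"python{sys.version_info.major}.{sys.version_info.minor}"
    site_packages = os.path.join(env_dir, "lib", version_dir, "site-packages")
    if not os.path.isdir(site_packages):
        raise FileNotFoundError(f"No site-packages found at {site_packages}")
    # Avoid duplicate entries when the cell is executed more than once.
    if site_packages not in sys.path:
        sys.path.append(site_packages)
    return site_packages

ENV_DIR = "/content/drive/MyDrive/colab_env"
# Only attempt this where the environment folder exists,
# that is, on Colab with Drive already mounted.
if os.path.isdir(ENV_DIR):
    print(add_env_site_packages(ENV_DIR))
```

If the Colab runtime has moved to a newer Python version since the environment was created, the helper raises FileNotFoundError rather than silently adding a path that does not exist. In that case the packages would need to be installed again for the new version.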
In conclusion, it is possible to install dependencies permanently in Google Colab using a virtual environment stored in Google Drive. One thing to take care of: don't forget to add the path of the virtual environment's site-packages to the Colab system path. I hope that this tutorial is helpful for you. Thank you!