How to Install Libraries Permanently in Google Colab?

Netra Prasad Neupane
4 min read · Jan 18, 2023

Many developers use Google Colab to prototype quickly because it provides access to GPUs such as the T4 and P100. Sometimes, however, a program needs libraries, or specific versions of them, that are not preinstalled in Colab. Because installed dependencies are lost once the runtime disconnects, reinstalling them on every run of a notebook becomes tedious and time-consuming.

Recently I ran into exactly this problem. I had to install a few large libraries to run a program in Colab, and the installation took a long time. I wanted to run the program many times, so I needed to avoid reinstalling the libraries on each run, which is not possible unless the dependencies are installed somewhere persistent. In this blog, I describe how I installed all the dependencies permanently in Google Colab.

First, mount Google Drive using the following two lines of code.

from google.colab import drive
drive.mount("/content/drive")

After mounting Google Drive, let’s create a virtual environment using the virtualenv library. As of 2024, virtualenv is not preinstalled in Colab, so install it first with pip install virtualenv. Keep in mind that the virtual environment must be created inside the Google Drive folder mounted above, so that it survives after the runtime disconnects.

!virtualenv /content/drive/MyDrive/colab_env

Here you can see that a virtual environment named colab_env has been created in Google Drive.
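On later runs the environment already exists on Drive, so the creation step can be skipped. The following is a minimal sketch of such a check, assuming that a virtualenv on Linux always contains a bin/activate script; the helper name env_exists is my own and is not part of Colab or virtualenv.

```python
import os


def env_exists(env_dir: str) -> bool:
    """Return True if env_dir looks like a virtualenv (it has bin/activate)."""
    return os.path.isfile(os.path.join(env_dir, "bin", "activate"))


# Path taken from the article; adjust it if your environment lives elsewhere.
ENV_DIR = "/content/drive/MyDrive/colab_env"

if env_exists(ENV_DIR):
    print("Environment found, no need to create it again.")
else:
    print("Environment not found, create it with virtualenv first.")
```

Run this after mounting Drive; if the environment is reported as missing, run the virtualenv command above.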

Now let’s install a library named pypdf in the virtual environment colab_env. To install a library into the virtual environment, we must activate the environment and install the library in the same command line.

!source /content/drive/MyDrive/colab_env/bin/activate; pip install Pypdf

In the above line of code, !source /content/drive/MyDrive/colab_env/bin/activate activates our environment colab_env, and pip install Pypdf installs the pypdf library inside it.
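Why must activation and installation share one line? As far as I understand, each ! command in a notebook runs in its own shell process, so anything a shell sets up (such as an activated environment) is gone by the next command. The sketch below illustrates the same behaviour with plain subprocess calls; DEMO_ENV_MARKER is a hypothetical variable name used only for this demonstration, and the sketch assumes a POSIX sh is available.

```python
import subprocess


def run_shell(command: str) -> str:
    """Run a command in a fresh shell process and return its stripped stdout."""
    result = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Set and read the variable inside ONE shell: the value is visible.
same_shell = run_shell("DEMO_ENV_MARKER=active; echo ${DEMO_ENV_MARKER:-unset}")

# Set it in one shell and read it in ANOTHER: the value is lost.
run_shell("export DEMO_ENV_MARKER=active")
separate_shell = run_shell("echo ${DEMO_ENV_MARKER:-unset}")

print(same_shell)      # value set and read in the same shell
print(separate_shell)  # value read in a different shell
```

The same reasoning applies to source .../bin/activate: it only affects the shell that runs it, which is why pip install follows it after a semicolon.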

from pypdf import PdfReader

reader = PdfReader("/content/Data science journey 3.pdf")
number_of_pages = len(reader.pages)
print(number_of_pages)

Here, I have tried out our newly installed pypdf library (a free and open-source pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files).

But the above lines of code throw an error: ModuleNotFoundError: No module named ‘pypdf’.

So, why couldn’t we import our newly installed library? We installed pypdf in our virtual environment colab_env, but the import statement searches only the Colab runtime’s module search path. To find libraries installed in the virtual environment, we must add the path of the virtual environment’s site-packages directory to the Colab system path (sys.path).

import sys
sys.path.append("/content/drive/MyDrive/colab_env/lib/python3.8/site-packages")

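The above lines of code add the virtual environment’s site-packages directory to the system path. The python3.8 part of the path must match the Python version that created the environment, so adjust it if your runtime uses a different version. Now, let’s run the following script again to test whether our newly installed dependency works.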

from pypdf import PdfReader

reader = PdfReader("/content/Data science journey 3.pdf")
number_of_pages = len(reader.pages)
print(number_of_pages)

Wow, we fixed it. Now it works as expected. PdfReader reads the PDF, and the number of pages is the length of its .pages attribute. The PDF I tested has 41 pages.
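If you want to confirm that a module is really being picked up from the Drive environment rather than from the runtime, you can ask Python where it would load the module from. This is a small sketch using the standard importlib.util.find_spec; the helper name module_origin is my own, and it is meant for top-level module names only.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# After the sys.path.append call, this should point inside colab_env;
# before it (or if pypdf is not installed anywhere), it prints None.
print(module_origin("pypdf"))
```

A path that starts with /content/drive/MyDrive/colab_env tells you the import is served from the virtual environment.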

So far I have tested the installation only in the first run, but my requirement is to avoid installing the dependencies in future runs. To use the packages previously installed in the virtual environment colab_env, you must mount your drive and add the path of the colab_env site-packages directory to the Colab system path using sys.path.append("/content/drive/MyDrive/colab_env/lib/python3.8/site-packages").

# step 1: Mount the drive first
from google.colab import drive
drive.mount("/content/drive/")

# step 2: Add the path of virtual environment (colab_env) site-packages
# to colaboratory system path
import sys
sys.path.append("/content/drive/MyDrive/colab_env/lib/python3.8/site-packages")

After adding the path, just import and use the packages which were installed in the virtual environment colab_env.
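The hard-coded python3.8 in the path is easy to get wrong when the runtime's Python version differs. The sketch below is one possible way to build the path from the running interpreter's version and to add it only when the directory actually exists; the helper names site_packages_dir and add_env_to_path are my own, and the sketch assumes the environment was created with the same Python version as the current runtime.

```python
import os
import sys


def site_packages_dir(env_dir: str) -> str:
    """Build the site-packages path for the Python version currently running."""
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return os.path.join(env_dir, "lib", version, "site-packages")


def add_env_to_path(env_dir: str) -> bool:
    """Append the environment's site-packages to sys.path if the directory exists.

    Returns True when the path is available on sys.path, False otherwise.
    """
    path = site_packages_dir(env_dir)
    if not os.path.isdir(path):
        return False
    if path not in sys.path:  # avoid duplicate entries when the cell is re-run
        sys.path.append(path)
    return True


# Path taken from the article; run this after mounting Drive.
ENV_DIR = "/content/drive/MyDrive/colab_env"
if not add_env_to_path(ENV_DIR):
    print("site-packages not found; check that Drive is mounted and the Python version matches.")
```

If the helper reports that the directory is missing even though Drive is mounted, the environment was probably created under a different Python version, and the packages may need to be reinstalled.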

In conclusion, it is possible to install dependencies permanently in Google Colab using a virtual environment stored on Google Drive. Just don’t forget to add the path of the virtual environment’s site-packages directory to the Colab system path. I tested this method in 2024 and it works fine for me. Find all the above code snippets here.

I hope this tutorial helps you install dependencies permanently in Google Colab. If you are a regular Colaboratory user, it will spare you tedious and time-consuming dependency installation. Now you can reuse a single virtual environment from any notebook just by adding its path to the Colab system path. Finally, if you like my work, please don’t forget to clap and share it with your friends. See you in the next blog…

Netra Prasad Neupane

Machine Learning Engineer with expertise in Computer Vision, Deep Learning, NLP and Generative AI. https://www.linkedin.com/in/netraneupane/