How to Run Any LLM Model in Google Colab
Introduction
Google Colab provides an easy and efficient way to run large language models (LLMs) without needing powerful local hardware. In this guide, we’ll walk through the steps to set up and run LLM models in Colab using Ollama.
Setting Up Google Colab
Step 1: Create a New Notebook
- Go to Google Colab.
- Create a new notebook and name it First.ipynb.
- Connect to a GPU instance.
Step 2: Enable GPU
- Click on Runtime in the menu.
- Select Change Runtime Type.
- Choose T4 GPU under Hardware accelerator.
- Save and hit Connect.
To confirm that the GPU is available, run the following command in a cell. A T4 provides roughly 15 GB of VRAM:
!nvidia-smi
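If you prefer to check from Python, the short sketch below uses the torch package (preinstalled in Colab) to confirm that CUDA is visible and report the GPU's name and memory:

import torch

# Verify that the notebook can see the GPU and report its specs.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}")
    print(f"VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No GPU detected - check the runtime type.")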
Installing Dependencies
Step 3: Install colab-xterm
To enable terminal access in Colab:
!pip install colab-xterm
%load_ext colabxterm
%xterm
Step 4: Install and Set Up Ollama
Ollama is a tool that makes it easy to download and run LLMs locally. Install it by running the following command inside the terminal you just opened:
curl -fsSL https://ollama.com/install.sh | sh
Step 5: Start Ollama and Download a Model
Inside the terminal, start the Ollama server in the background and download a model:
ollama serve &
ollama pull llama3.2
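If you prefer to stay in notebook cells rather than the terminal, you can start the server from Python instead; this is a minimal sketch that assumes the ollama binary is already on the PATH from the install step:

import subprocess
import time

# Launch the Ollama server as a background process.
server = subprocess.Popen(["ollama", "serve"])

# Give the server a moment to come up before pulling a model.
time.sleep(5)

After that, !ollama pull llama3.2 works from a regular cell as well.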
You can verify the installation and downloaded models using:
ollama list
ollama show llama3.2
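To double-check from Python that the server is actually reachable, you can query its root endpoint; this sketch assumes Ollama's default port, 11434:

import urllib.request

# The root endpoint replies with a short status message when the server is up.
with urllib.request.urlopen("http://localhost:11434") as resp:
    print(resp.read().decode())  # expect something like "Ollama is running"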
Step 6: Install the Ollama Python Package
!pip install ollama
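As a quick sanity check that the package can talk to the server, you can list the models it knows about (a minimal sketch; the exact shape of the returned object varies across package versions):

import ollama

# Ask the local Ollama server which models are available.
print(ollama.list())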
Running LLM Models in Colab
Now, let's use Python to interact with the model.
import ollama

# Send a single-turn chat request to the locally running Ollama server.
prompt = "What is a pandas DataFrame?"
response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": prompt}]
)

# The reply text is nested under the 'message' key.
print(response['message']['content'])
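For longer answers, the package also supports streaming, so tokens print as they arrive instead of after the whole response is generated; a short sketch:

import ollama

# Stream the response chunk by chunk instead of waiting for the full reply.
stream = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Explain what a GPU is in one paragraph."}],
    stream=True,
)
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)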
Conclusion
By following these steps, you can set up and run LLM models in Google Colab using Ollama with minimal effort. This approach lets you leverage cloud GPUs for model inference, making LLMs accessible even from low-end devices.
Happy coding!