Kee's log

Have a GPU, become 100x

Kee Nam

Evaluating an open model for coding on your own GPU is easy. I recently got an NVIDIA GPU and started testing with Ollama.

Follow the steps on this page to start the engine.

Prepare to launch LLM

  • Install the CUDA toolkit from https://developer.nvidia.com/cuda-downloads
  • Install Ollama from https://ollama.com/

Check your GPU memory and choose a model.

As stated on the Ollama GitHub page:

You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.

A good starting point is qwen2.5-coder: https://ollama.com/library/qwen2.5-coder

Remember to select parameters that fit your GPU memory.
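As a back-of-the-envelope sketch of why those numbers line up with model sizes: a 4-bit quantized model needs roughly half a byte per parameter, plus some runtime overhead. The 0.5 bytes-per-parameter figure and the overhead constant below are my own assumptions, not an official Ollama formula.

```python
def estimate_vram_gb(params_billions: float,
                     bytes_per_param: float = 0.5,  # assumes 4-bit quantization
                     overhead_gb: float = 1.5) -> float:  # rough guess for KV cache + runtime
    """Estimate GPU memory needed for a quantized model, in GB."""
    return params_billions * bytes_per_param + overhead_gb

# Rough estimates for the three sizes mentioned above
for size in (7, 13, 33):
    print(f"{size}B at 4-bit: ~{estimate_vram_gb(size):.1f} GB")
```

By this estimate a 7B model fits comfortably in 8 GB, while a 33B model needs a much larger card, which matches the guidance quoted above.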

Launch!

Open Terminal

ollama run qwen2.5-coder
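Behind the scenes, `ollama run` talks to a local server that Ollama starts on port 11434, and you can script that same API yourself. Here is a minimal sketch using only the standard library; the endpoint and payload follow Ollama's documented `/api/generate` API, and actually sending a request requires the Ollama server to be running.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> bytes:
    """Build the JSON body that Ollama's /api/generate endpoint expects."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
    }).encode("utf-8")

def generate(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the completion text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires the Ollama server to be running locally:
# print(generate("qwen2.5-coder", "Write a Python hello world."))
print(build_payload("qwen2.5-coder", "hi").decode())
```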

At the prompt, you can start chatting with the model. The next step is to integrate it with Visual Studio Code via the Continue.dev extension.
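To point Continue at the local model, you add an entry with the `ollama` provider to its config file. The sketch below shows the general shape of a `config.json` entry; the exact file location and schema vary between Continue versions, so check the Continue.dev docs for your install.

```json
{
  "models": [
    {
      "title": "Qwen2.5 Coder (local)",
      "provider": "ollama",
      "model": "qwen2.5-coder"
    }
  ]
}
```

With this in place, Continue routes completions and chat to the model served by Ollama instead of a cloud API.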