Have a GPU? Become 100x
Kee Nam
Evaluating an open model for code with a GPU is easy. I recently got an NVIDIA GPU and started testing models with Ollama.
Follow the steps on this page to start the engine.
Prepare to launch LLM
- Install the CUDA toolkit from https://developer.nvidia.com/cuda-downloads
- Install Ollama from https://ollama.com/
Check your GPU memory (for example with `nvidia-smi`) and choose a model accordingly.
As stated on the Ollama GitHub page:

> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
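To sanity-check whether a model will fit, you can estimate weight memory from parameter count and quantization width. Here is a rough Python sketch; the 4-bit default matches the quantization Ollama commonly ships, but the flat overhead constant is my own guess for KV cache and runtime buffers, not an official figure:

```python
def estimate_vram_gb(params_billions: float, bits_per_weight: int = 4,
                     overhead_gb: float = 1.0) -> float:
    """Approximate GPU memory needed to load a model.

    Assumes the weights dominate memory use; KV cache and runtime
    overhead are folded into a flat, assumed `overhead_gb` term.
    """
    weight_gb = params_billions * bits_per_weight / 8  # GB for weights alone
    return weight_gb + overhead_gb

# A 7B model at 4-bit quantization comes out to roughly 4.5 GB,
# comfortably inside the 8 GB guideline quoted above.
print(f"{estimate_vram_gb(7):.1f} GB")
```

By the same arithmetic, a 33B model at 4 bits needs on the order of 17-18 GB, consistent with the 32 GB guideline once you leave headroom for context.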
A good starting point is qwen2.5-coder: https://ollama.com/library/qwen2.5-coder. Remember to select a parameter size that fits your GPU memory; a tag such as `qwen2.5-coder:7b` picks a specific size.
Launch!
Open a terminal and run:

ollama run qwen2.5-coder
At the prompt, you can start chatting with the model. The next step is to integrate it into Visual Studio Code with the Continue.dev extension.
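Integration amounts to pointing Continue at the local Ollama server. As a sketch, a JSON entry along these lines registers the model (at the time of writing Continue reads `~/.continue/config.json`; the schema has changed between versions, so check the extension's docs rather than treating this as definitive):

```json
{
  "models": [
    {
      "title": "Qwen2.5 Coder (local)",
      "provider": "ollama",
      "model": "qwen2.5-coder"
    }
  ]
}
```

With that in place, Continue's chat and autocomplete run entirely against your own GPU rather than a hosted API.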