
llm's Introduction

[TOC]

This page describes the steps to run a coding assistant in VSCode/VSCodium.

First try

Steps

  • download a Llama 2 model
  • convert it to GGUF format (for CPU execution with llama.cpp)
  • run llama.cpp as a service
  • run the VSCode/VSCodium Continue plugin

1 - Prepare your model

To save time, you can find an already-prepared model here: https://huggingface.co/TheBloke/Llama-2-7B-GGUF.

However, I'll describe how to do the conversion yourself if you need it.

For Llama 2 models, go to https://ai.meta.com/llama/ and follow the instructions.

Example:

From /home/user/llm:

download.sh
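
For reference, download.sh is the script shipped with Meta's llama repository. A minimal sketch of this step (assuming you clone that repository under /home/user/llm; the signed download URL is the one Meta sends by email once your request is approved):

cd /home/user/llm
git clone https://github.com/facebookresearch/llama.git
cd llama
./download.sh   # paste the signed URL from the email, then select the 7B model

You may then need to move or rename the downloaded files to match the layout shown below.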

After downloading your model, you should have a folder containing a consolidated.00.pth file. Its parent directory should also contain a tokenizer.model file. Example:

/home/user/llm/models/llama2/tokenizer.model
/home/user/llm/models/llama2/7B/consolidated.00.pth

Then, let's convert it to GGUF format so that llama.cpp can use it.

From /home/user/llm:

docker run \
  -v ./models/llama2:/models \
  ghcr.io/ggerganov/llama.cpp:full \
  --convert /models/7B

Or in a docker-compose.yml file:

  llama-cpp:
    image: ghcr.io/ggerganov/llama.cpp:full
    volumes:
      - ./models/llama2:/models
    command: --convert /models/7B

... and run: docker compose up llama-cpp
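
If the conversion succeeds, a GGUF file should appear next to the original weights. A quick sanity check (the exact file name can vary with the llama.cpp version):

ls -lh ./models/llama2/7B
# you should see a file such as ggml-model-f16.gguf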

2 - Run llama.cpp service

docker run \
  -d \
  -p 64256:64256 \
  -v ./models/llama2:/models \
  ghcr.io/ggerganov/llama.cpp:full \
  --server --host 0.0.0.0 --port 64256 -m /models/7B/ggml-model-f16.gguf -c 2048

Or in a docker-compose.yml file:

  llama-cpp:
    image: ghcr.io/ggerganov/llama.cpp:full
    volumes:
      - ./models/llama2:/models
    command: --server --host 0.0.0.0 --port 64256 -m /models/7B/ggml-model-f16.gguf -c 2048
    ports:
      - 64256:64256

... and run: docker compose up -d llama-cpp

To check your logs:

docker compose logs -f llama-cpp
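
To check that the service actually answers, you can call the /completion endpoint exposed by the llama.cpp HTTP server (adjust host and port if you changed them):

curl http://localhost:64256/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "def fibonacci(n):", "n_predict": 64}'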

3 - Run Continue plugin

Here are the steps to link the VSCode/VSCodium Continue plugin to your llama.cpp service.

Install Continue plugin

It can be downloaded from https://marketplace.visualstudio.com/

Direct link I used (installed through VSCodium remote SSH, in a Linux x64 VM): https://marketplace.visualstudio.com/_apis/public/gallery/publishers/Continue/vsextensions/continue/0.7.0/vspackage?targetPlatform=linux-x64

Then manually load your .vsix file: [email protected].
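
If you prefer the command line over the extensions panel, VSCodium (like VSCode) can install a .vsix directly with --install-extension. The file name below is only a placeholder for whatever the download gave you:

# placeholder file name; use the .vsix you actually downloaded
codium --install-extension continue-linux-x64-0.7.0.vsix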

Set up your Continue plugin to connect to your llama.cpp service

  • Server URL: http://172.16.3.63:65432

  • ~/.continue/config.py setup:

from continuedev.libs.llm.llamacpp import LlamaCpp

(...)

config = ContinueConfig(
    allow_anonymous_telemetry=False,
    models=Models(
        default=LlamaCpp(
            max_context_length=2048,
            server_url="http://172.16.3.63:64256")
    ),

(...)

Then run it with Docker.

Here is the relevant extract of my docker-compose.yml file:

  continue:
    image: python:3.10.13-bookworm
    working_dir: /continue
    volumes:
      - ./continue/config.py:/root/.continue/config.py
    command:
      - /bin/bash
      - -c
      - |
        pip install continuedev
        python -m continuedev --host 0.0.0.0 --port 65432
    ports:
      - 65432:65432

Run it with: docker compose up -d continue
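
As with the llama.cpp service, you can check that the Continue server started correctly and is listening on port 65432 before pointing the plugin at it:

docker compose logs -f continue
ss -ltn | grep 65432   # the published port should show up as LISTEN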

Conclusion

This covers only a first try.

The chat you'll get clearly won't be very powerful, but at least we have a full integration chain.

In the next try, we'll explore the full Rift solution.
