Private A.I. Chatbot

ai Dec 5, 2024

Synopsis

Let's discuss how to install and configure our very own private AI chatbot service using Traefik, LibreChat, Ollama and RunPod. Props go out to my techbuddies at Skatedeluxe, with whom I was able to develop the thoughts, ideas and concepts described here.

Traefik

Traefik is a simple way to automate the discovery, routing and load balancing of microservices. For our needs, we will use Traefik as a reverse proxy in front of our LibreChat frontend. This simplifies obtaining an SSL certificate and routing traffic to our Docker-based LibreChat container.

LibreChat

LibreChat is a free, open source AI chat platform that offers a simple interface to communicate with various AI providers. The interface is similar to other chatbots like ChatGPT. This software application is community supported and is not behind any paywall.

Ollama

Ollama is an application that makes it easy to get started running and interacting with large language models (LLMs). Its usage is similar to Docker's, so any experience with Docker will make your life with Ollama very easy. Ollama allows us to configure and run a variety of LLMs using a simple management interface.

RunPod

RunPod is a distributed GPU cloud service offering a wide array of computational options to help you create your own personalized AI infrastructure. This service takes the work out of building and maintaining a GPU server.

Hardware Requirements

In order to get started, we will need an internet-facing Linux server to host our LibreChat service. You should have root access to this server. We will also run a RunPod "pod", which will do the heavy lifting in the background. Let's start with our self-hosted server.

DNS

Before starting, we will need two subdomains that resolve to your host. Log in to your domain name registrar and create the following subdomains:

  1. traefik.yourhostname.com
  2. ai.yourhostname.com

Replace 'yourhostname.com' with a domain name you manage. The subdomain 'traefik.yourhostname.com' is used to view your Traefik dashboard. This dashboard is optional; however, for this example we will enable it. For more configuration information, feel free to view the Traefik documentation.
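Before moving on, it can be useful to confirm from the server that both subdomains actually resolve. The following is only a minimal sketch and not part of the original setup: the function names 'resolves' and 'check_subdomains' are made up for this example, and it assumes 'getent' is available, which is the case on most glibc-based Linux distributions.

```shell
# Hypothetical helpers; the names are illustrative only.

# Succeeds if the given hostname resolves via the system resolver.
# Assumption: getent is installed.
resolves() {
    getent hosts "$1" > /dev/null 2>&1
}

# Checks the two subdomains used in this guide for a given base domain.
# Prints one line per subdomain and returns non-zero if any is missing.
check_subdomains() {
    domain=$1
    status=0
    for sub in traefik ai; do
        if resolves "${sub}.${domain}"; then
            echo "ok: ${sub}.${domain}"
        else
            echo "missing: ${sub}.${domain}"
            status=1
        fi
    done
    return "$status"
}
```

Calling 'check_subdomains yourhostname.com' should print an 'ok' line for each subdomain once the DNS records have propagated.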

Docker

Installation

SSH into your server and install docker. For more installation information regarding your Linux distribution, view the official docker installation documentation. After installing, make sure that the service is running and enabled:

sudo systemctl start docker
sudo systemctl enable docker

Check that everything is running by executing the following command:

sudo systemctl status docker

If everything is working, you should see something similar to the following:

systemctl status docker
 docker.service - Docker Application Container Engine
     Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: active (running) since Tue 2024-12-03 10:07:40 CET; 5h 12min ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 54741 (dockerd)
      Tasks: 55
     Memory: 167.9M (peak: 614.0M swap: 4.1M swap peak: 4.1M)
        CPU: 45.090s

Docker Compose

We will now set up a way to manage Docker Compose services on our machine through systemctl.

Create a 'compose' directory under /etc/docker/:

sudo mkdir -p /etc/docker/compose

Create a systemd docker compose service file:

cd /etc/systemd/system/
sudo vim docker-compose@.service

Add the following contents:

[Unit]
Description=%i service with docker compose
PartOf=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=true
WorkingDirectory=/etc/docker/compose/%i
ExecStart=/usr/bin/docker compose up -d --remove-orphans
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=multi-user.target

Reload the systemd daemon:

sudo systemctl daemon-reload

This service configuration will allow you to easily manage docker compose based services via systemd. Now let's add our first service, namely Traefik.
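To make the template mechanics concrete: systemd replaces '%i' with whatever follows the '@' in the unit name, so 'docker-compose@traefik.service' runs docker compose inside '/etc/docker/compose/traefik'. The small sketch below mimics that mapping in plain shell. The function name 'compose_dir_for_unit' is invented for illustration; systemd performs this substitution itself.

```shell
# Illustrative only: mirrors how the unit's
# WorkingDirectory=/etc/docker/compose/%i expands for an instance name.
compose_dir_for_unit() {
    unit=$1
    instance=${unit#*@}            # drop everything up to and including '@'
    instance=${instance%.service}  # drop an optional '.service' suffix
    printf '%s\n' "/etc/docker/compose/${instance}"
}

compose_dir_for_unit docker-compose@traefik.service
```

In other words, every directory we create under '/etc/docker/compose/' that contains a compose file can be started under its own unit name without writing another service file.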

Traefik

We will be using traefik as our reverse proxy for all docker compose service configurations located in the folder '/etc/docker/compose/'. If you know what you are doing or would rather use a different reverse proxy, feel free to skip this section.

Dashboard Credentials

For this example, we will create user credentials with basic HTTP authentication. This is necessary so we can log in and view the Traefik dashboard. We will create these credentials with the CLI command 'htpasswd'. For Debian based systems, you can install it with the following command:

sudo apt-get install apache2-utils

Create the user and password string with the following command:

sudo htpasswd -Bnb USERNAME PASSWORD | sed -e s/\\$/\\$\\$/g

Before executing the command, replace USERNAME and PASSWORD with your desired credentials. You should see something like the following:

USERNAME:$$2y$$05$$0LywPBbQ.bYjHaDuwkBilutCTV6ne8lH7uG18AzbyqKh8CKAW3U06

Copy and save this output, as we will need it later when configuring Traefik.
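The 'sed' step doubles every '$' in the hash because docker compose treats a single '$' in a compose file as the start of a variable reference, while '$$' yields a literal '$'. Here is a small sketch of just that escaping step. The helper name 'escape_for_compose' is made up for this example.

```shell
# Doubles every '$' read from stdin so that docker compose does not
# try to interpolate parts of the password hash as variables.
escape_for_compose() {
    sed 's/\$/$$/g'
}

# Example with a shortened, fake hash (not a real credential):
printf '%s\n' 'USERNAME:$2y$05$abc' | escape_for_compose
```

With this helper defined, 'htpasswd -Bnb USERNAME PASSWORD | escape_for_compose' would produce the same kind of string as the command above.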

Docker Compose

Create the 'traefik' and 'traefik/config' directories under '/etc/docker/compose' and change into the 'traefik' directory:

sudo mkdir -p /etc/docker/compose/traefik
sudo mkdir -p /etc/docker/compose/traefik/config
cd /etc/docker/compose/traefik

In order for docker compose to run, we will need a compose.yml configuration file.

sudo vim compose.yml

Add the following content. Replace 'traefik.yourhostname.com' with the domain name for your Traefik dashboard, and replace the value of the 'traefik.http.middlewares.traefik-auth.basicauth.users' label with the user credentials we created with htpasswd. For example:

services:

  # Traefik reverse proxy
  traefik:
    image: traefik:v2.5
    restart: unless-stopped
    container_name: traefik
    hostname: traefik.yourhostname.com
    labels:
      - "traefik.enable=true"

      # define basic auth middleware for dashboard
      - "traefik.http.middlewares.traefik-auth.basicauth.removeheader=true"
      - "traefik.http.middlewares.traefik-auth.basicauth.users=USERNAME:$$2y$$05$$0LywPBbQ.bYjHaDuwkBilutCTV6ne8lH7uG18AzbyqKh8CKAW3U06"

      # define traefik dashboard router and service
      - "traefik.http.routers.traefik.rule=Host(`traefik.yourhostname.com`)"
      - "traefik.http.routers.traefik.service=api@internal"
      - "traefik.http.routers.traefik.tls.certresolver=tlschallenge"
      - "traefik.http.routers.traefik.entrypoints=web-secure"
      - "traefik.http.routers.traefik.middlewares=traefik-auth, secHeaders@file"
      - "traefik.http.services.traefik.loadbalancer.server.port=8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./config/traefik.yaml:/etc/traefik/traefik.yaml:ro
      - ./config/dynamic.yaml:/etc/traefik/dynamic.yaml:ro
      - ./config/acme.json:/etc/traefik/acme.json
    networks:
      - traefik-servicenet
    ports:
      - "80:80"
      - "443:443"

networks:
  traefik-servicenet:
    external: true

This configuration will expose ports 80 and 443 and will automatically handle SSL certificate generation via Let's Encrypt. We also need to create a Docker network for Traefik. This enables the containers to talk to each other and isolates them from direct access via the internet. Run the following command:

sudo docker network create traefik-servicenet

Traefik Configuration

Our Traefik instance needs to be configured. Create a file 'traefik.yaml' in '/etc/docker/compose/traefik/config' and add the following contents:

log:
  level: WARN  # DEBUG, INFO, WARN, ERROR, FATAL, PANIC

providers:
  docker:
    exposedByDefault: false
    endpoint: 'unix:///var/run/docker.sock'
    network: traefik-servicenet
  file:
    filename: /etc/traefik/dynamic.yaml
    watch: true

api:
  dashboard: true # if you don't need the dashboard disable it

entryPoints:
  web:
    address: ':80'
    http:
      redirections:
        entryPoint:
          to: web-secure
          scheme: https
  web-secure:
    address: ':443'

certificatesResolvers:
  tlschallenge:
    acme:
      email: admin@yourhostname.com
      storage: /etc/traefik/acme.json
      tlsChallenge: {}

global:
  checkNewVersion: true
  sendAnonymousUsage: false

Create a file called 'dynamic.yaml' in '/etc/docker/compose/traefik/config' and add the following contents:

tls:
  options:
    default:
      minVersion: VersionTLS12
      cipherSuites:
        - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
        - TLS_AES_128_GCM_SHA256
        - TLS_AES_256_GCM_SHA384
        - TLS_CHACHA20_POLY1305_SHA256
      curvePreferences:
        - CurveP521
        - CurveP384

http:
  # define middlewares
  middlewares:

    # define some security header options,
    # see https://doc.traefik.io/traefik/v2.5/middlewares/http/headers/
    secHeaders:
      headers:
        browserXssFilter: true
        contentTypeNosniff: true
        frameDeny: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 31536000
        customFrameOptionsValue: "SAMEORIGIN"
        customResponseHeaders:
          server: ""
          x-powered-by: ""

As a final step, create the file 'acme.json' in '/etc/docker/compose/traefik/config' and restrict the permissions with the following commands:

echo '{}' | sudo tee /etc/docker/compose/traefik/config/acme.json > /dev/null
sudo chmod 600 /etc/docker/compose/traefik/config/acme.json
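Traefik refuses to use an 'acme.json' whose permissions are more open than 600, so it is worth double-checking the mode before starting the service. The following is a minimal sketch: the function name 'has_mode_600' is hypothetical, and it simply inspects the permission string printed by 'ls'.

```shell
# Succeeds only if the file is a regular file with mode rw------- (600).
# Only the first ten characters of the ls output are compared, so a
# trailing ACL or SELinux marker does not matter.
has_mode_600() {
    [ "$(ls -ld "$1" | cut -c1-10)" = "-rw-------" ]
}
```

For example, 'has_mode_600 /etc/docker/compose/traefik/config/acme.json && echo "permissions look fine"' prints the message only when the file is locked down as expected.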

Start Traefik

Let's start and enable the new service via systemctl:

sudo systemctl start docker-compose@traefik.service
sudo systemctl enable docker-compose@traefik.service

Open a browser to traefik.yourhostname.com and log in with the HTTP basic authentication credentials we created earlier. Feel free to click around and become familiar with the Traefik dashboard.

Screenshot_20241204_161253.png

If it is not starting, or you need help troubleshooting an issue, check the log output by going to '/etc/docker/compose/traefik' and entering the following command:

sudo docker compose logs -f

Our Traefik service is now up and running! Please note that this is a basic configuration. Traefik is a powerful reverse proxy and offers many more configuration options. For more information, check out the official documentation.

LibreChat

Now that we have done most of the dirty work in setting up docker and Traefik, clone the LibreChat code from the official repository:

cd /etc/docker/compose
sudo git clone https://github.com/danny-avila/LibreChat.git librechat
cd librechat

Copy the following files:

sudo cp .env.example .env
sudo cp docker-compose.override.yml.example docker-compose.override.yml
sudo cp librechat.example.yaml librechat.yaml

.env

Open the .env file in your favorite editor and replace the following variables:

#==================================================#
#                        RAG                       #
#==================================================#
RAG_API_URL=http://host.docker.internal:8000
EMBEDDINGS_PROVIDER=ollama
OLLAMA_BASE_URL=https://POD_ID-11434.proxy.runpod.net
EMBEDDINGS_MODEL=nomic-embed-text

----

#========================#
# Registration and Login #
#========================#

ALLOW_REGISTRATION=false

NOTE: The POD_ID string in the OLLAMA_BASE_URL variable is a placeholder. We will replace it later.
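If you prefer scripting these changes over editing the file by hand, a helper along the following lines can set a variable in '.env'. This is a rough sketch under stated assumptions: 'set_env_var' is a made-up name, and values must not contain '|', '&' or backslashes, since they are passed straight to sed.

```shell
# Sets KEY=VALUE in an env file: replaces an existing uncommented
# assignment, or appends a new line if the key is absent or commented out.
set_env_var() {
    key=$1
    value=$2
    file=$3
    tmp=$(mktemp) || return 1
    if grep -q "^${key}=" "$file"; then
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        printf '%s=%s\n' "$key" "$value" >> "$tmp"
    fi
    # Write back in place so ownership and permissions are preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For instance, 'set_env_var ALLOW_REGISTRATION false .env' would flip the registration switch. Because the repository was cloned as root, this needs to run in a root shell.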

docker-compose.override.yml

Let's update our container to communicate with Traefik. Open the 'docker-compose.override.yml' file in your favorite editor and replace the 'api' service with the following content:

services:
  api:
    volumes:
      - type: bind
        source: ./librechat.yaml
        target: /app/librechat.yaml
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.librechat.entrypoints=web-secure"
      - "traefik.http.routers.librechat.rule=Host(`ai.yourhostname.com`)"
      - "traefik.http.routers.librechat.tls=true"
      - "traefik.http.routers.librechat.tls.certresolver=tlschallenge"
      - "traefik.http.middlewares.librechat.headers.stsSeconds=15552000"
      - "traefik.http.services.librechat.loadbalancer.server.port=3080"
    networks:
      - default
      - traefik-servicenet
networks:
  traefik-servicenet:
    external: true

Make sure to replace 'ai.yourhostname.com' with the subdomain you registered at the beginning.

librechat.yaml

Let's configure LibreChat. Open the 'librechat.yaml' file and add the following contents:

endpoints:
  custom:
    - name: "Ollama"
      apiKey: "XXX"
      baseURL: "https://POD_ID-11434.proxy.runpod.net/v1/chat/completions"
      models:
        default: [
          "llama3.1:latest",
          ]
      titleConvo: true
      titleModel: "current_model"
      modelDisplayLabel: "Ollama"

NOTE: The 'POD_ID' string is a placeholder; we will replace it later. Next, let's configure a GPU pod with RunPod.

RunPod

Since RunPod is an external service, head over to RunPod and create an account. Afterward, you will need to deposit a spending balance. Click on the "Billing" tab, enter your credit card details, and buy some credit. We will start small and add $25 to our balance.

Screenshot_20241203_145424.png

Create a Pod

Creating a pod is fairly straightforward. Select 'Pods' in the menu and click 'Deploy':

Screenshot_20241204_135151.png

You will be routed to a page listing the different GPU options. Select the GPU model and number you require. For example, I will be selecting a single 'A40' GPU:

Screenshot_20241204_135411.png

RunPod offers different templates. Take a look and select the one that fits your needs. I will be selecting the default PyTorch template. Click the "Edit Template" button and a popup will appear. Change the template to the values highlighted in blue and click 'Set Overrides':

Screenshot_20241204_140210.png

Next, select the instance pricing plan. I will choose 'On-Demand':

Screenshot_20241204_140414.png

De-select 'Start Jupyter Notebook' and select 'Deploy On-Demand':

Screenshot_20241204_140626.png

Our pod has been created! Copy the pod ID from the dashboard; we will need it to complete the rest of the configuration:

Screenshot_20241204_141114.png

Connect to the Pod

Click the 'connect' button for your selected pod in the dashboard and follow the instructions on how to generate a public/private key pair for RunPod.

Screenshot_20241204_141438.png

Now you can SSH into your pod using the command listed in the 'connection options' tab:

Screenshot_20241204_141559.png

Install and Configure Ollama

SSH into your pod instance and update the apt package index. We will also install a terminal multiplexer like 'screen' or 'tmux' to keep the Ollama process alive in the background. For this example, I will be using screen. RunPod also offers documentation on Ollama integration.

apt update -y
apt install screen -y

Start a screen session and install Ollama via the install script:

screen
curl -fsSL https://ollama.com/install.sh | sh

Now we can start the ollama service:

ollama serve > ollama.log 2>&1

Open a new screen window with "CTRL+a c" and pull the following models:

ollama pull nomic-embed-text
ollama pull llama3.1

Detach from the screen session with "CTRL+a d".
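Once the models are pulled, we can check from the LibreChat server that the pod answers through the RunPod proxy. The sketch below builds the proxy URL in the same 'POD_ID-11434.proxy.runpod.net' form used in our configuration files and queries Ollama's '/api/tags' endpoint, which lists the locally available models. The function names are invented for this example, and it assumes 'curl' is installed and that port 11434 is exposed as an HTTP port on the pod.

```shell
# Builds the RunPod proxy URL for Ollama's default port from a pod ID.
ollama_base_url() {
    printf 'https://%s-11434.proxy.runpod.net\n' "$1"
}

# Asks Ollama on the pod for its list of local models.
# Fails with a non-zero exit code if the pod is unreachable.
list_pod_models() {
    curl -fsS "$(ollama_base_url "$1")/api/tags"
}
```

Running 'list_pod_models abc123', where 'abc123' stands in for your real pod ID, should return a JSON document that mentions both 'llama3.1' and 'nomic-embed-text'.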

Start LibreChat

Copy the pod ID from the RunPod dashboard and return to the server hosting LibreChat. Replace the POD_ID string in the following files on your server:

  1. /etc/docker/compose/librechat/librechat.yaml
  2. /etc/docker/compose/librechat/.env
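The replacement can also be scripted. The following is a minimal sketch, assuming the pod ID contains only letters and digits. The function name 'replace_pod_id' is hypothetical, and it rewrites every occurrence of the literal string POD_ID in the files it is given.

```shell
# Replaces the literal placeholder POD_ID with the given pod ID
# in each listed file, keeping file ownership and permissions.
replace_pod_id() {
    pod_id=$1
    shift
    for file in "$@"; do
        tmp=$(mktemp) || return 1
        sed "s/POD_ID/${pod_id}/g" "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    done
}
```

From a root shell, 'replace_pod_id abc123 /etc/docker/compose/librechat/librechat.yaml /etc/docker/compose/librechat/.env' would update both files, with 'abc123' standing in for your real pod ID. Afterwards, 'grep -n POD_ID' on those files should find nothing.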

Start your LibreChat server:

sudo systemctl start docker-compose@librechat
sudo systemctl enable docker-compose@librechat

Wait a few minutes for Traefik to configure the SSL certificate for LibreChat. Now you can open your browser to ai.yourhostname.com.

If everything was configured correctly, your AI instance should be up and running. Congratulations!

Final Thoughts

This is a basic configuration for demonstration purposes and can definitely be improved. For example, one could:

  1. Create a permanent service daemon for Ollama on the pod.
  2. Put POD_ID in the .env file so the rest of the system can access it globally.

Any thoughts or constructive criticism are welcome in the comments.
