Kubernetes GitOps SearXNG Search Engine

Deploy a self-hosted, privacy-focused SearXNG metasearch engine on your Kubernetes cluster for integration with AI tools like OpenWebUI.

Overview

SearXNG is a privacy-respecting metasearch engine that aggregates results from multiple search engines without tracking users. This deployment features proper SOPS encryption, IP whitelisting, and an integration-ready JSON API.

Features

- Privacy-focused: No user tracking or data collection
- Multi-engine aggregation: Combines results from Google, Bing, DuckDuckGo, Brave, Wikipedia, and more
- JSON API: RESTful API for programmatic access (perfect for AI integration)
- Rate limiting with IP whitelisting: Protects against abuse while allowing legitimate usage
- HTTPS with automatic certificates: Let’s Encrypt via cert-manager
- SOPS-encrypted secrets: Secure secret management following GitOps best practices

Repository Structure

```
├── apps/
│   └── searxng/
│       └── base/
│           ├── kustomization.yaml
│           ├── searxng-namespace.yaml
│           ├── searxng-settings.yaml
│           ├── searxng-deployment.yaml
│           ├── searxng-service.yaml
│           ├── searxng-certificate.yaml
│           └── searxng-ingress.yaml
├── infrastructure/
│   └── security/
│       └── searxng-secrets/
│           ├── kustomization.yaml
│           └── searxng-secret.yaml      # SOPS encrypted
└── clusters/
    └── production/
        ├── apps/
        │   └── kustomization.yaml       # References searxng
        └── flux-system/
            ├── kustomization.yaml       # References searxng-secrets
            └── searxng-secrets.yaml     # Flux Kustomization
```

Deployment Steps

1. Create Application Structure

Create the application folder structure: ...
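The excerpt cuts off at the folder-structure step, but the JSON API mentioned in the features list is worth a quick illustration. A hedged example, not taken from the post: the hostname is a placeholder, and the json format must be enabled in searxng-settings.yaml for this to work.

```bash
# Assumed example: SearXNG answers with JSON when format=json is allowed in its
# settings, which is how AI tools such as OpenWebUI consume results programmatically.
curl -s "https://searxng.<yourdomain.com>/search?q=kubernetes+gitops&format=json" \
  | head -c 500
```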

October 15, 2025 · 5 min · 1021 words · Dmitry Konovalov

Llama.cpp + OpenWebUI Setup on Proxmox

Llama.cpp + OpenWebUI Setup on Proxmox

This document details the complete setup of a llama.cpp server with an OpenWebUI interface running in a Proxmox container with HTTPS access.

Container Specifications

- Container ID: 106
- Name: llama-ai
- RAM: 8GB
- CPU: 4 cores
- Storage: 32GB (local-lvm)
- MAC Address: BC:24:11:15:F2:3A
- IP Address: 172.16.32.135
- OS: Debian 12 (Bookworm)

Services Overview

llama.cpp Server
- Port: 8080
- Model: Qwen2.5-1.5B-Instruct (Q4_0 quantization)
- Context Window: 16,384 tokens (~12,000-13,000 words)
- Service: llama-cpp.service
- Status: Auto-start enabled

OpenWebUI
- Port: 3000
- Interface: Web-based chat interface
- Service: open-webui.service
- Status: Auto-start enabled
- PyTorch: CPU version installed

NGINX Reverse Proxy
- HTTP Port: 80 (redirects to HTTPS)
- HTTPS Port: 443
- Domain: https://llama-ai.<yourdomain.com>
- SSL: Let’s Encrypt with Cloudflare DNS challenge
- Auto-renewal: Enabled

Installation Steps

1. Create Proxmox Container

```bash
pct create 106 local:vztmpl/debian-12-standard_12.12-1_amd64.tar.zst \
  --hostname llama-ai \
  --memory 8192 \
  --cores 4 \
  --rootfs local-lvm:32 \
  --net0 name=eth0,bridge=vmbr0,hwaddr=BC:24:11:15:F2:3A,ip=dhcp \
  --unprivileged 1 \
  --onboot 1
```

2. Install Dependencies

```bash
apt update && apt upgrade -y
apt install -y build-essential cmake git curl wget python3 python3-pip pkg-config libcurl4-openssl-dev
```

3. Build llama.cpp

```bash
cd /opt
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build
cmake .. && make -j$(nproc)
```

4. Download Models

```bash
mkdir -p /opt/llama.cpp/models
cd /opt/llama.cpp/models

# Qwen2.5 0.5B (fastest, 409MB)
wget -O qwen2.5-0.5b-instruct-q4_0.gguf \
  https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q4_0.gguf

# Qwen2.5 1.5B (most capable, 1017MB)
wget -O qwen2.5-1.5b-instruct-q4_0.gguf \
  https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q4_0.gguf

# Llama 3.2 1B (balanced, 738MB)
wget -O llama3.2-1b-instruct-q4_0.gguf \
  https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_0.gguf
```

5. Create llama.cpp Service

```ini
# /etc/systemd/system/llama-cpp.service
[Unit]
Description=Llama.cpp Server
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/llama.cpp
ExecStart=/opt/llama.cpp/build/bin/llama-server \
  --model /opt/llama.cpp/models/qwen2.5-1.5b-instruct-q4_0.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  --ctx-size 16384
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```

6. Install OpenWebUI

```bash
python3 -m venv /opt/openwebui-venv
source /opt/openwebui-venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install open-webui
```

7. Create OpenWebUI Service

```ini
# /etc/systemd/system/open-webui.service
[Unit]
Description=OpenWebUI
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/openwebui-venv
Environment=OPENAI_API_BASE_URL=http://127.0.0.1:8080/v1
Environment=OPENAI_API_KEY=sk-dummy
Environment=WEBUI_AUTH=false
ExecStart=/opt/openwebui-venv/bin/open-webui serve --port 3000 --host 0.0.0.0
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```
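Before adding NGINX in step 8, a quick sanity check is useful (my addition, not one of the post's numbered steps): llama-server exposes an OpenAI-compatible API on port 8080, so a chat completion request should return JSON once the service from step 5 is running.

```bash
# Assumed smoke test: the endpoint and port follow the llama-cpp.service unit above.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say hello"}],"max_tokens":32}'
```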
8. Setup HTTPS with Let’s Encrypt

Install NGINX and Certbot

```bash
apt install -y nginx python3-certbot-nginx python3-certbot-dns-cloudflare
```

Configure Cloudflare Credentials

```bash
mkdir -p /etc/letsencrypt/credentials
chmod 700 /etc/letsencrypt/credentials
echo "dns_cloudflare_api_token = YOUR_CLOUDFLARE_TOKEN" > /etc/letsencrypt/credentials/cloudflare.ini
chmod 600 /etc/letsencrypt/credentials/cloudflare.ini
```

Obtain SSL Certificate

```bash
certbot certonly \
  --dns-cloudflare \
  --dns-cloudflare-credentials /etc/letsencrypt/credentials/cloudflare.ini \
  -d llama-ai.<yourdomain.com> \
  --non-interactive \
  --agree-tos \
  --email [email protected]
```

Configure NGINX

```nginx
# /etc/nginx/sites-available/llama-ai.<yourdomain.com>
server {
    listen 80;
    server_name llama-ai.<yourdomain.com>;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name llama-ai.<yourdomain.com>;

    ssl_certificate /etc/letsencrypt/live/llama-ai.<yourdomain.com>/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llama-ai.<yourdomain.com>/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

9. Enable Services

```bash
systemctl daemon-reload
systemctl enable --now llama-cpp.service
systemctl enable --now open-webui.service
systemctl enable --now nginx
ln -s /etc/nginx/sites-available/llama-ai.<yourdomain.com> /etc/nginx/sites-enabled/
nginx -t && systemctl reload nginx
```

Switching Models

llama.cpp runs as a systemd service with a single model loaded at start (set via --model in llama-cpp.service). To change models, update the service, reload, and restart. ...
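The excerpt ends at the model-switching note. A hedged sketch of that procedure, using the model filenames downloaded in step 4 (the sed edit is my shorthand for editing the unit file by hand):

```bash
# Point the service at a different downloaded model, then reload systemd and restart.
sed -i 's|qwen2.5-1.5b-instruct-q4_0.gguf|llama3.2-1b-instruct-q4_0.gguf|' \
  /etc/systemd/system/llama-cpp.service
systemctl daemon-reload
systemctl restart llama-cpp.service
systemctl status llama-cpp.service --no-pager
```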

October 16, 2025 · 4 min · 836 words · Dmitry Konovalov

OpenWebUI Deployment in FluxCD Kubernetes Cluster

OpenWebUI Deployment in FluxCD Kubernetes Cluster

Overview

Successfully deployed OpenWebUI in a production Kubernetes cluster managed by FluxCD GitOps. OpenWebUI provides a web interface for interacting with Large Language Models (LLMs) and is configured to connect to a llama.cpp backend.

Architecture

```mermaid
graph TB
    A[User] --> B[Nginx Ingress]
    B --> C[OpenWebUI Service]
    C --> D[OpenWebUI Pod]
    D --> E[External llama.cpp Backend]
    F[FluxCD] --> G[Git Repository]
    G --> H[OpenWebUI Kustomization]
    H --> I[Kubernetes Resources]
    J[Cert-Manager] --> K[Let's Encrypt]
    K --> L[TLS Certificate]
    L --> B
```

Deployment Configuration

Core Components

- Namespace: openwebui
- Image: ghcr.io/open-webui/open-webui:main
- Backend: http://llama-cpp.<yourdomain.com>:11434 (external Ollama-compatible API)
- Storage: 10Gi PersistentVolume for user data persistence
- Access: HTTPS via nginx-ingress with Let’s Encrypt certificates

Resource Specifications

```yaml
# Resource Limits & Requests
resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 256Mi
```

Key Features Configured

- Persistent Storage: User conversations and settings preserved across pod restarts
- TLS Encryption: Automatic HTTPS certificates via cert-manager + Let’s Encrypt
- External LLM Backend: Configured to use existing llama.cpp server
- GitOps Management: Fully managed via FluxCD from Git repository

Access Information

- URL: https://openwebui.<yourdomain.com>
- TLS Certificate: Auto-provisioned by cert-manager using Let’s Encrypt production issuer
- DNS Challenge: Uses Cloudflare DNS-01 for certificate validation

FluxCD GitOps Structure

The deployment follows GitOps principles with the following structure: ...
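The excerpt truncates before the repository layout, but the reconciled state can be checked regardless of how the Git tree is arranged. A hedged verification sketch (commands are my addition; the Kustomization and Deployment names assume the post's openwebui naming):

```bash
# Confirm FluxCD has reconciled the OpenWebUI Kustomization, then inspect the
# namespace and the pod logs. "open-webui" is an assumed Deployment name.
flux get kustomizations | grep -i openwebui
kubectl -n openwebui get deploy,pods,svc,ingress,pvc
kubectl -n openwebui logs deploy/open-webui --tail=20
```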

October 15, 2025 · 3 min · 545 words · Dmitry Konovalov