AI‑Powered Slack Bot – ThreadDigest

This project involved building a serverless Slack bot, ThreadDigest, that uses OpenAI’s GPT models to produce concise summaries of conversations. The bot responds to slash commands and @mentions, fetches recent messages, and returns summaries in near real time. The architecture uses AWS Lambda to run on demand without persistent infrastructure and integrates Slack’s Events API and Slash Commands via Lambda function URLs for simplicity.

Key contributions

- Slack bot that responds to /summary commands and @mentions, summarizing messages from channels and threads.
- Event-driven architecture with AWS Lambda (Python 3.11) and no persistent infrastructure.
- Integration with the Slack Events API and Slash Commands via Lambda Function URLs.
- Secure credential management using AWS Secrets Manager.
- Robust signature verification, message parsing, retry logic and error handling.
- Support for dynamic user and channel name resolution.
- App Home interface built with Slack Block Kit for onboarding and quick help.

Highlights

- Natural-language summaries in Slack with near real-time performance.
- Clean, minimal deployment with no external dependencies.
- Hands-on experience with the Slack platform, the OpenAI API and serverless operations.

How It Works

1. A user sends a command (@mention or /summary) to the bot.
2. Slack sends an event to the Lambda function URL, which triggers the Lambda function.
3. Lambda verifies the request signature (sketched below) and processes the Slack event.
4. The bot fetches the relevant conversation history using the Slack API.
5. It sends the messages to the OpenAI API for summarization.
6. The generated summary is posted back to Slack.

Implementation

Slack part

1. Go to https://api.slack.com/apps
2. Create a new app (from scratch).
3. Enable the following:
   - App Home: enable the Home Tab
   - Event Subscriptions: enable and provide the AWS Lambda function URL as the request endpoint
4. Add the Bot Token Scopes:

```
app_mentions:read   # Allow the bot to read messages where it's mentioned
channels:history    # Read message history from public channels
channels:read       # List public channels in the workspace
chat:write          # Post messages as the bot
commands            # Enable use of slash commands (e.g., /summary)
groups:history      # Read message history from private groups
groups:read         # List private groups the bot is a member of
im:history          # Read message history from direct messages
im:write            # Send direct messages as the bot
users:read          # Read basic user info (for name resolution in summaries)
```

5. Subscribe to the events:

```
app_home_opened     # Triggered when a user opens the App Home tab
app_mention         # Triggered when the bot is mentioned in a channel
message.channels    # Triggered for new messages in public channels
message.im          # Triggered for new messages in direct messages (IMs)
```

6. Install the app to your workspace and retrieve the OAuth token (starts with “xoxb-…”) and the Slack signing secret.

OpenAI part

Generate an OpenAI API token (starting with “sk-…”).

AWS part

1. Navigate to AWS Secrets Manager and create a secret:

```json
{
  "slack_bot_token": "xoxb-...",
  "openai_api_key": "sk-...",
  "slack_signing_secret": "..."
}
```

2. Create an IAM role for the Lambda function with the permissions secretsmanager:GetSecretValue and lambda:InvokeFunction.
3. Attach these inline policies to the role: ...
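Step 3 above relies on Slack’s standard request-signing scheme: an HMAC-SHA256 over `v0:{timestamp}:{body}` computed with the signing secret and compared against the `X-Slack-Signature` header. A minimal sketch of such a check (not the project’s actual handler), with header names lowercased as Lambda Function URLs deliver them:

```python
import hashlib
import hmac
import time


def verify_slack_signature(signing_secret: str, headers: dict, body: str) -> bool:
    """Return True only if the request carries a valid Slack v0 signature."""
    timestamp = headers.get("x-slack-request-timestamp", "")
    signature = headers.get("x-slack-signature", "")
    if not timestamp or not signature:
        return False

    # Reject requests older than five minutes to mitigate replay attacks.
    try:
        if abs(time.time() - int(timestamp)) > 300:
            return False
    except ValueError:
        return False

    basestring = f"v0:{timestamp}:{body}".encode()
    expected = "v0=" + hmac.new(signing_secret.encode(), basestring, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```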

July 31, 2025 · 5 min · 1024 words · Dmitry Konovalov

Kubernetes GitOps SearXNG Search Engine

Deploy a self-hosted, privacy-focused SearXNG metasearch engine on your Kubernetes cluster for integration with AI tools like OpenWebUI.

Overview

SearXNG is a privacy-respecting metasearch engine that aggregates results from multiple search engines without tracking users. This deployment features SOPS encryption, IP whitelisting, and an integration-ready JSON API.

Features

- Privacy-focused: no user tracking or data collection
- Multi-engine aggregation: combines results from Google, Bing, DuckDuckGo, Brave, Wikipedia, and more
- JSON API: RESTful API for programmatic access, perfect for AI integration (see the query example below)
- Rate limiting with IP whitelisting: protects against abuse while allowing legitimate usage
- HTTPS with automatic certificates: Let’s Encrypt via cert-manager
- SOPS-encrypted secrets: secure secret management following GitOps best practices

Repository Structure

```
├── apps/
│   └── searxng/
│       └── base/
│           ├── kustomization.yaml
│           ├── searxng-namespace.yaml
│           ├── searxng-settings.yaml
│           ├── searxng-deployment.yaml
│           ├── searxng-service.yaml
│           ├── searxng-certificate.yaml
│           └── searxng-ingress.yaml
├── infrastructure/
│   └── security/
│       └── searxng-secrets/
│           ├── kustomization.yaml
│           └── searxng-secret.yaml       # SOPS encrypted
└── clusters/
    └── production/
        ├── apps/
        │   └── kustomization.yaml        # References searxng
        └── flux-system/
            ├── kustomization.yaml        # References searxng-secrets
            └── searxng-secrets.yaml      # Flux Kustomization
```

Deployment Steps

1. Create Application Structure

Create the application folder structure: ...
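Since the JSON API is the integration point for tools like OpenWebUI, here is a short sketch of how a client might query it. The hostname is a placeholder for whatever Ingress host you configure, and JSON output must be allowed in the SearXNG settings:

```python
import requests

# Placeholder host; substitute the Ingress hostname used in searxng-ingress.yaml.
SEARXNG_URL = "https://searxng.example.com/search"


def search(query: str, limit: int = 5) -> list[dict]:
    """Query the SearXNG JSON API and return the top results."""
    resp = requests.get(
        SEARXNG_URL,
        params={"q": query, "format": "json"},  # format=json must be enabled in the settings file
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])[:limit]


if __name__ == "__main__":
    for hit in search("kubernetes gitops"):
        print(hit.get("title"), "->", hit.get("url"))
```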

October 15, 2025 · 5 min · 1021 words · Dmitry Konovalov

Llama.cpp + OpenWebUI Setup on Proxmox

This document details the complete setup of a llama.cpp server with an OpenWebUI interface running in a Proxmox container with HTTPS access.

Container Specifications

- Container ID: #106
- Name: llama-ai
- RAM: 8GB
- CPU: 4 cores
- Storage: 32GB (local-lvm)
- MAC Address: BC:24:11:15:F2:3A
- IP Address: 172.16.32.135
- OS: Debian 12 (Bookworm)

Services Overview

llama.cpp Server
- Port: 8080
- Model: Qwen2.5-1.5B-Instruct (Q4_0 quantization)
- Context Window: 16,384 tokens (~12,000-13,000 words)
- Service: llama-cpp.service
- Status: Auto-start enabled

OpenWebUI
- Port: 3000
- Interface: Web-based chat interface
- Service: open-webui.service
- Status: Auto-start enabled
- PyTorch: CPU version installed

NGINX Reverse Proxy
- HTTP Port: 80 (redirects to HTTPS)
- HTTPS Port: 443
- Domain: https://llama-ai.<yourdomain.com>
- SSL: Let’s Encrypt with Cloudflare DNS challenge
- Auto-renewal: Enabled

Installation Steps

1. Create Proxmox Container

```bash
pct create 106 local:vztmpl/debian-12-standard_12.12-1_amd64.tar.zst \
  --hostname llama-ai \
  --memory 8192 \
  --cores 4 \
  --rootfs local-lvm:32 \
  --net0 name=eth0,bridge=vmbr0,hwaddr=BC:24:11:15:F2:3A,ip=dhcp \
  --unprivileged 1 \
  --onboot 1
```

2. Install Dependencies

```bash
apt update && apt upgrade -y
apt install -y build-essential cmake git curl wget python3 python3-pip pkg-config libcurl4-openssl-dev
```

3. Build llama.cpp

```bash
cd /opt
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build
cmake .. && make -j$(nproc)
```

4. Download Models

```bash
mkdir -p /opt/llama.cpp/models
cd /opt/llama.cpp/models

# Qwen2.5 0.5B (fastest, 409MB)
wget -O qwen2.5-0.5b-instruct-q4_0.gguf \
  https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q4_0.gguf

# Qwen2.5 1.5B (most capable, 1017MB)
wget -O qwen2.5-1.5b-instruct-q4_0.gguf \
  https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q4_0.gguf

# Llama 3.2 1B (balanced, 738MB)
wget -O llama3.2-1b-instruct-q4_0.gguf \
  https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_0.gguf
```

5. Create llama.cpp Service

```ini
# /etc/systemd/system/llama-cpp.service
[Unit]
Description=Llama.cpp Server
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/llama.cpp
ExecStart=/opt/llama.cpp/build/bin/llama-server \
  --model /opt/llama.cpp/models/qwen2.5-1.5b-instruct-q4_0.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  --ctx-size 16384
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```

6. Install OpenWebUI

```bash
python3 -m venv /opt/openwebui-venv
source /opt/openwebui-venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install open-webui
```

7. Create OpenWebUI Service

```ini
# /etc/systemd/system/open-webui.service
[Unit]
Description=OpenWebUI
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/openwebui-venv
Environment=OPENAI_API_BASE_URL=http://127.0.0.1:8080/v1
Environment=OPENAI_API_KEY=sk-dummy
Environment=WEBUI_AUTH=false
ExecStart=/opt/openwebui-venv/bin/open-webui serve --port 3000 --host 0.0.0.0
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```

8. Setup HTTPS with Let’s Encrypt

Install NGINX and Certbot:

```bash
apt install -y nginx python3-certbot-nginx python3-certbot-dns-cloudflare
```

Configure Cloudflare credentials:

```bash
mkdir -p /etc/letsencrypt/credentials
chmod 700 /etc/letsencrypt/credentials
echo "dns_cloudflare_api_token = YOUR_CLOUDFLARE_TOKEN" > /etc/letsencrypt/credentials/cloudflare.ini
chmod 600 /etc/letsencrypt/credentials/cloudflare.ini
```

Obtain the SSL certificate:

```bash
certbot certonly \
  --dns-cloudflare \
  --dns-cloudflare-credentials /etc/letsencrypt/credentials/cloudflare.ini \
  -d llama-ai.<yourdomain.com> \
  --non-interactive \
  --agree-tos \
  --email [email protected]
```

Configure NGINX:

```nginx
# /etc/nginx/sites-available/llama-ai.<yourdomain.com>
server {
    listen 80;
    server_name llama-ai.<yourdomain.com>;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name llama-ai.<yourdomain.com>;

    ssl_certificate     /etc/letsencrypt/live/llama-ai.<yourdomain.com>/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llama-ai.<yourdomain.com>/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

9. Enable Services

```bash
systemctl daemon-reload
systemctl enable --now llama-cpp.service
systemctl enable --now open-webui.service
systemctl enable --now nginx
ln -s /etc/nginx/sites-available/llama-ai.<yourdomain.com> /etc/nginx/sites-enabled/
nginx -t && systemctl reload nginx
```

Switching Models

llama.cpp runs as a systemd service with a single model loaded at start (set via --model in llama-cpp.service). To change models, update the service file, reload systemd, and restart the service. ...
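OpenWebUI talks to llama-server through the OpenAI-compatible API configured via OPENAI_API_BASE_URL above. A quick way to confirm the backend works independently of the UI is to call that endpoint directly. This sketch assumes the defaults from the service file (port 8080) and uses the requests library:

```python
import requests

# Endpoint taken from the llama-cpp.service config above; adjust host/port if yours differ.
BASE_URL = "http://127.0.0.1:8080/v1"

payload = {
    # llama-server answers with whichever model it was started with,
    # so the name here is mostly informational.
    "model": "qwen2.5-1.5b-instruct",
    "messages": [
        {"role": "user", "content": "Explain what a reverse proxy does in one sentence."}
    ],
    "max_tokens": 64,
    "temperature": 0.2,
}

resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```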

October 16, 2025 · 4 min · 836 words · Dmitry Konovalov

OpenWebUI Deployment in FluxCD Kubernetes Cluster

Overview

Successfully deployed OpenWebUI in a production Kubernetes cluster managed by FluxCD GitOps. OpenWebUI provides a web interface for interacting with Large Language Models (LLMs) and is configured to connect to a llama.cpp backend.

Architecture

```mermaid
graph TB
    A[User] --> B[Nginx Ingress]
    B --> C[OpenWebUI Service]
    C --> D[OpenWebUI Pod]
    D --> E[External llama.cpp Backend]
    F[FluxCD] --> G[Git Repository]
    G --> H[OpenWebUI Kustomization]
    H --> I[Kubernetes Resources]
    J[Cert-Manager] --> K[Let's Encrypt]
    K --> L[TLS Certificate]
    L --> B
```

Deployment Configuration

Core Components

- Namespace: openwebui
- Image: ghcr.io/open-webui/open-webui:main
- Backend: http://llama-cpp.<yourdomain.com>:11434 (external Ollama-compatible API)
- Storage: 10Gi PersistentVolume for user data persistence
- Access: HTTPS via nginx-ingress with Let’s Encrypt certificates

Resource Specifications

```yaml
# Resource Limits & Requests
resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 256Mi
```

Key Features Configured

- Persistent Storage: user conversations and settings preserved across pod restarts
- TLS Encryption: automatic HTTPS certificates via cert-manager + Let’s Encrypt
- External LLM Backend: configured to use the existing llama.cpp server
- GitOps Management: fully managed via FluxCD from the Git repository

Access Information

- URL: https://openwebui.<yourdomain.com>
- TLS Certificate: auto-provisioned by cert-manager using the Let’s Encrypt production issuer
- DNS Challenge: uses Cloudflare DNS-01 for certificate validation

FluxCD GitOps Structure

The deployment follows GitOps principles with the following structure: ...
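After Flux reconciles the resources, a quick smoke test can confirm that both the public UI and the external backend respond. The hostnames below are placeholders for the real ingress host and backend address, and the `/v1/models` path assumes the backend keeps its OpenAI-compatible endpoints:

```python
import requests

# Placeholder endpoints; substitute the ingress hostname and backend address from the deployment.
OPENWEBUI_URL = "https://openwebui.example.com"
BACKEND_MODELS_URL = "http://llama-cpp.example.com:11434/v1/models"


def check(name: str, url: str) -> None:
    """Print the HTTP status returned by a deployed endpoint."""
    resp = requests.get(url, timeout=10)
    print(f"{name}: HTTP {resp.status_code}")


check("OpenWebUI (via nginx ingress)", OPENWEBUI_URL)
check("llama.cpp backend model list", BACKEND_MODELS_URL)
```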

October 15, 2025 · 3 min · 545 words · Dmitry Konovalov

Setting up llama.cpp in LXC Container on Proxmox

This guide documents the complete process of setting up llama.cpp in an LXC container on Proxmox with Intel GPU support and OpenAI-compatible API endpoints.

Overview

- Goal: Replace Ollama with llama.cpp for better performance and lower resource usage
- Hardware: Intel N150 GPU (OpenCL support)
- Container: Debian 12 LXC on Proxmox
- API: OpenAI-compatible endpoints on port 11434

Container Creation

1. Create LXC Container

```bash
# Download Debian 12 template
pveam download local debian-12-standard_12.12-1_amd64.tar.zst

# Create container
pct create 107 local:vztmpl/debian-12-standard_12.12-1_amd64.tar.zst \
  --hostname llama-cpp \
  --memory 8192 \
  --swap 512 \
  --cores 4 \
  --rootfs local-lvm:32 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --features keyctl=1,nesting=1 \
  --unprivileged 1 \
  --onboot 1 \
  --tags ai
```

2. Add GPU Passthrough

```bash
# Get GPU group IDs
stat -c '%g' /dev/dri/card0        # Output: 44
stat -c '%g' /dev/dri/renderD128   # Output: 104

# Add GPU devices to container
pct set 107 --dev0 /dev/dri/card0,gid=44 --dev1 /dev/dri/renderD128,gid=104

# Start container
pct start 107
```

Software Installation

3. Install Dependencies

```bash
# Update package list
pct exec 107 -- apt update

# Install build tools and dependencies
pct exec 107 -- apt install -y \
  build-essential \
  cmake \
  git \
  curl \
  pkg-config \
  libssl-dev \
  python3 \
  python3-pip \
  libcurl4-openssl-dev

# Install OpenCL support for Intel GPU
pct exec 107 -- apt install -y \
  opencl-headers \
  ocl-icd-opencl-dev \
  intel-opencl-icd
```

4. Compile llama.cpp

```bash
# Clone repository
pct exec 107 -- bash -c "cd /opt && git clone https://github.com/ggerganov/llama.cpp.git"

# Configure with OpenCL support
pct exec 107 -- bash -c "cd /opt/llama.cpp && mkdir build && cd build && cmake .. -DGGML_OPENCL=ON -DCMAKE_BUILD_TYPE=Release"

# Compile server binary
pct exec 107 -- bash -c "cd /opt/llama.cpp/build && make -j$(nproc) llama-server"
```

Model Setup

5. Download Models

```bash
# Create models directory
pct exec 107 -- mkdir -p /opt/llama.cpp/models

# Download Qwen2.5-1.5B model (Q4_0 quantized)
pct exec 107 -- bash -c "cd /opt/llama.cpp/models && curl -L -o qwen2.5-1.5b-q4_0.gguf https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q4_0.gguf"
```

Service Configuration

6. Create systemd Service

```bash
# Create service file
pct exec 107 -- bash -c "printf '[Unit]\nDescription=llama.cpp Server\nAfter=network-online.target\nWants=network-online.target\n\n[Service]\nType=simple\nExecStart=/opt/llama.cpp/build/bin/llama-server --host 0.0.0.0 --port 11434 --threads 4 --model /opt/llama.cpp/models/qwen2.5-1.5b-q4_0.gguf --ctx-size 8192 --batch-size 512\nRestart=always\nRestartSec=3\nUser=root\nGroup=root\nEnvironment=HOME=/root\nStandardOutput=journal\nStandardError=journal\nSyslogIdentifier=llama-cpp\n\n[Install]\nWantedBy=multi-user.target\n' > /etc/systemd/system/llama-cpp.service"

# Enable and start service
pct exec 107 -- systemctl daemon-reload
pct exec 107 -- systemctl enable llama-cpp.service
pct exec 107 -- systemctl start llama-cpp.service
```

Testing and Verification

7. Verify Service Status

```bash
# Check service status
pct exec 107 -- systemctl status llama-cpp.service

# Check port binding
pct exec 107 -- ss -tlnp | grep :11434
```

8. Test API

```bash
# Test OpenAI-compatible API
pct exec 107 -- curl -X POST http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen2.5-1.5b-q4_0","messages":[{"role":"user","content":"Hello, how are you?"}],"max_tokens":50}'
```

Expected response: ...
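Because the server exposes OpenAI-compatible endpoints, any OpenAI client can talk to it directly, which is how tools such as OpenWebUI consume it. A small sketch using the official openai Python package; the base URL and dummy key are placeholders, and the key only exists because the client requires one (the server here was not started with an API key):

```python
from openai import OpenAI

# Run from inside the container, or replace localhost with the container's IP address.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="sk-dummy")

reply = client.chat.completions.create(
    model="qwen2.5-1.5b-q4_0",  # llama-server answers with the single model it was started with
    messages=[{"role": "user", "content": "Give me one sentence about LXC containers."}],
    max_tokens=64,
)
print(reply.choices[0].message.content)
```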

October 15, 2025 · 4 min · 798 words · Dmitry Konovalov

Ollama LXC Setup with GPU Acceleration and Web Interface

This guide covers setting up Ollama (Open Large Language Model) in a Proxmox LXC container with GPU passthrough and creating a simple web interface for easy interaction.

Overview

We’ll deploy Ollama in a resource-constrained LXC environment with:

- Intel UHD Graphics GPU acceleration
- llama3.2:1b model (~1.3GB)
- Lightweight Python web interface
- Auto-starting services

Prerequisites

- Proxmox VE host
- Intel integrated graphics (UHD Graphics)
- At least 4GB RAM allocated to the LXC
- 40GB+ storage for the container

Step 1: Container Setup

Check Available GPU Resources

First, verify GPU availability on the Proxmox host: ...
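The lightweight web interface mentioned above boils down to forwarding a prompt to Ollama’s local REST API and returning the response. A minimal sketch of that idea, assuming Flask is installed and Ollama is listening on its default port 11434 (an illustration, not the interface built in the guide):

```python
from flask import Flask, request, jsonify
import requests

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # default Ollama endpoint
MODEL = "llama3.2:1b"                               # model pulled in this guide

app = Flask(__name__)


@app.route("/ask", methods=["POST"])
def ask():
    prompt = request.json.get("prompt", "")
    # Non-streaming request: Ollama returns a single JSON object with a "response" field.
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return jsonify({"answer": resp.json().get("response", "")})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```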

August 11, 2025 · 8 min · 1502 words · Dmitry Konovalov