Proxmox Setup

1. System Requirements: Before starting, ensure your hardware meets the requirements for Proxmox. This documentation outlines the minimum and recommended specifications for CPU, memory, storage, and network. See https://pve.proxmox.com/wiki/System_Requirements
2. Prepare the Installation Media: Learn how to create a bootable USB or DVD for Proxmox installation at https://pve.proxmox.com/wiki/Prepare_Installation_Media . This guide covers the tools and steps needed for preparing your installation media.
3. Installation: Follow the step-by-step instructions for installing Proxmox. This includes partitioning your drives, configuring the network, and completing the initial setup.
...
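The post links out for the media-preparation step; as a rough illustration only (not taken from the linked guide), writing the Proxmox ISO to a USB stick from a Linux machine typically looks like the sketch below, where /dev/sdX and the ISO filename are placeholders you must adjust.

```
# Identify the USB device first; everything on it will be erased
lsblk

# Write the ISO to the stick (placeholders: adjust the ISO name and /dev/sdX)
dd if=proxmox-ve_8.4-1.iso of=/dev/sdX bs=4M status=progress conv=fsync
```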

August 14, 2025 · 1 min · 152 words · Dmitry Konovalov

Proxmox VE 8 to 9 Upgrade Guide

Complete step-by-step guide for upgrading Proxmox VE 8 servers to Proxmox VE 9
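The guide itself is not reproduced in this listing. In broad strokes, and only as a hedged sketch rather than the post's exact commands, the upgrade follows the usual Proxmox major-version pattern: run the pve8to9 checklist, point APT at the Debian 13 (trixie) and Proxmox VE 9 repositories, then dist-upgrade.

```
# Run Proxmox's pre-upgrade checker and resolve anything it flags
pve8to9

# Switch repository definitions from bookworm to trixie
# (illustrative; actual repo files vary per installation)
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list /etc/apt/sources.list.d/*.list

# Perform the upgrade
apt update && apt dist-upgrade -y
```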

October 16, 2025 · 5 min · 905 words · Dmitry Konovalov

Llama.cpp + OpenWebUI Setup on Proxmox

This document details the complete setup of a llama.cpp server with an OpenWebUI interface running in a Proxmox container with HTTPS access.

Container Specifications

- Container ID: #106
- Name: llama-ai
- RAM: 8GB
- CPU: 4 cores
- Storage: 32GB (local-lvm)
- MAC Address: BC:24:11:15:F2:3A
- IP Address: 172.16.32.135
- OS: Debian 12 (Bookworm)

Services Overview

llama.cpp Server
- Port: 8080
- Model: Qwen2.5-1.5B-Instruct (Q4_0 quantization)
- Context Window: 16,384 tokens (~12,000-13,000 words)
- Service: llama-cpp.service
- Status: Auto-start enabled

OpenWebUI
- Port: 3000
- Interface: Web-based chat interface
- Service: open-webui.service
- Status: Auto-start enabled
- PyTorch: CPU version installed

NGINX Reverse Proxy
- HTTP Port: 80 (redirects to HTTPS)
- HTTPS Port: 443
- Domain: https://llama-ai.<yourdomain.com>
- SSL: Let’s Encrypt with Cloudflare DNS challenge
- Auto-renewal: Enabled

Installation Steps

1. Create Proxmox Container

```
pct create 106 local:vztmpl/debian-12-standard_12.12-1_amd64.tar.zst \
  --hostname llama-ai \
  --memory 8192 \
  --cores 4 \
  --rootfs local-lvm:32 \
  --net0 name=eth0,bridge=vmbr0,hwaddr=BC:24:11:15:F2:3A,ip=dhcp \
  --unprivileged 1 \
  --onboot 1
```

2. Install Dependencies

```
apt update && apt upgrade -y
apt install -y build-essential cmake git curl wget python3 python3-pip pkg-config libcurl4-openssl-dev
```

3. Build llama.cpp

```
cd /opt
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build
cmake .. && make -j$(nproc)
```

4. Download Models

```
mkdir -p /opt/llama.cpp/models
cd /opt/llama.cpp/models

# Qwen2.5 0.5B (fastest, 409MB)
wget -O qwen2.5-0.5b-instruct-q4_0.gguf \
  https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q4_0.gguf

# Qwen2.5 1.5B (most capable, 1017MB)
wget -O qwen2.5-1.5b-instruct-q4_0.gguf \
  https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q4_0.gguf

# Llama 3.2 1B (balanced, 738MB)
wget -O llama3.2-1b-instruct-q4_0.gguf \
  https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_0.gguf
```

5. Create llama.cpp Service

```
# /etc/systemd/system/llama-cpp.service
[Unit]
Description=Llama.cpp Server
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/llama.cpp
ExecStart=/opt/llama.cpp/build/bin/llama-server \
  --model /opt/llama.cpp/models/qwen2.5-1.5b-instruct-q4_0.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  --ctx-size 16384
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```

6. Install OpenWebUI

```
python3 -m venv /opt/openwebui-venv
source /opt/openwebui-venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install open-webui
```

7. Create OpenWebUI Service

```
# /etc/systemd/system/open-webui.service
[Unit]
Description=OpenWebUI
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/openwebui-venv
Environment=OPENAI_API_BASE_URL=http://127.0.0.1:8080/v1
Environment=OPENAI_API_KEY=sk-dummy
Environment=WEBUI_AUTH=false
ExecStart=/opt/openwebui-venv/bin/open-webui serve --port 3000 --host 0.0.0.0
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```
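Before putting a reverse proxy in front, it can be worth confirming both services answer locally. This quick check is not part of the original write-up; it assumes llama-server exposes the OpenAI-compatible /v1/models endpoint on port 8080 and that OpenWebUI is listening on port 3000.

```
# Sanity check (assumed commands, not from the original post)
curl -s http://127.0.0.1:8080/v1/models
curl -sI http://127.0.0.1:3000 | head -n 1
```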
8. Setup HTTPS with Let’s Encrypt

Install NGINX and Certbot

```
apt install -y nginx python3-certbot-nginx python3-certbot-dns-cloudflare
```

Configure Cloudflare Credentials

```
mkdir -p /etc/letsencrypt/credentials
chmod 700 /etc/letsencrypt/credentials
echo "dns_cloudflare_api_token = YOUR_CLOUDFLARE_TOKEN" > /etc/letsencrypt/credentials/cloudflare.ini
chmod 600 /etc/letsencrypt/credentials/cloudflare.ini
```

Obtain SSL Certificate

```
certbot certonly \
  --dns-cloudflare \
  --dns-cloudflare-credentials /etc/letsencrypt/credentials/cloudflare.ini \
  -d llama-ai.<yourdomain.com> \
  --non-interactive \
  --agree-tos \
  --email [email protected]
```

Configure NGINX

```
# /etc/nginx/sites-available/llama-ai.<yourdomain.com>
server {
    listen 80;
    server_name llama-ai.<yourdomain.com>;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name llama-ai.<yourdomain.com>;

    ssl_certificate /etc/letsencrypt/live/llama-ai.<yourdomain.com>/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/llama-ai.<yourdomain.com>/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

9. Enable Services

```
systemctl daemon-reload
systemctl enable --now llama-cpp.service
systemctl enable --now open-webui.service
systemctl enable --now nginx
ln -s /etc/nginx/sites-available/llama-ai.<yourdomain.com> /etc/nginx/sites-enabled/
nginx -t && systemctl reload nginx
```

Switching Models

llama.cpp runs as a systemd service with a single model loaded at start (set via --model in llama-cpp.service). To change models, update the service, reload, and restart, as sketched below.

...
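As a hedged illustration of that switch (the excerpt does not show the exact edit), pointing the service at the Llama 3.2 1B file downloaded in step 4 could look like this; the sed pattern assumes the filenames used above.

```
# Swap the model referenced by the service (filenames assumed from step 4)
sed -i 's|qwen2.5-1.5b-instruct-q4_0.gguf|llama3.2-1b-instruct-q4_0.gguf|' \
  /etc/systemd/system/llama-cpp.service

# Reload systemd and restart the server
systemctl daemon-reload
systemctl restart llama-cpp.service

# Confirm the new model loaded
journalctl -u llama-cpp.service -n 20 --no-pager
```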

October 16, 2025 · 4 min · 836 words · Dmitry Konovalov

Setting up llama.cpp in LXC Container on Proxmox

This guide documents the complete process of setting up llama.cpp in an LXC container on Proxmox with Intel GPU support and OpenAI-compatible API endpoints.

Overview

- Goal: Replace Ollama with llama.cpp for better performance and lower resource usage
- Hardware: Intel N150 GPU (OpenCL support)
- Container: Debian 12 LXC on Proxmox
- API: OpenAI-compatible endpoints on port 11434

Container Creation

1. Create LXC Container

```
# Download Debian 12 template
pveam download local debian-12-standard_12.12-1_amd64.tar.zst

# Create container
pct create 107 local:vztmpl/debian-12-standard_12.12-1_amd64.tar.zst \
  --hostname llama-cpp \
  --memory 8192 \
  --swap 512 \
  --cores 4 \
  --rootfs local-lvm:32 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --features keyctl=1,nesting=1 \
  --unprivileged 1 \
  --onboot 1 \
  --tags ai
```

2. Add GPU Passthrough

```
# Get GPU group IDs
stat -c '%g' /dev/dri/card0        # Output: 44
stat -c '%g' /dev/dri/renderD128   # Output: 104

# Add GPU devices to container
pct set 107 --dev0 /dev/dri/card0,gid=44 --dev1 /dev/dri/renderD128,gid=104

# Start container
pct start 107
```

Software Installation

3. Install Dependencies

```
# Update package list
pct exec 107 -- apt update

# Install build tools and dependencies
pct exec 107 -- apt install -y \
  build-essential \
  cmake \
  git \
  curl \
  pkg-config \
  libssl-dev \
  python3 \
  python3-pip \
  libcurl4-openssl-dev

# Install OpenCL support for Intel GPU
pct exec 107 -- apt install -y \
  opencl-headers \
  ocl-icd-opencl-dev \
  intel-opencl-icd
```

4. Compile llama.cpp

```
# Clone repository
pct exec 107 -- bash -c "cd /opt && git clone https://github.com/ggerganov/llama.cpp.git"

# Configure with OpenCL support
pct exec 107 -- bash -c "cd /opt/llama.cpp && mkdir build && cd build && cmake .. -DGGML_OPENCL=ON -DCMAKE_BUILD_TYPE=Release"

# Compile server binary
pct exec 107 -- bash -c "cd /opt/llama.cpp/build && make -j$(nproc) llama-server"
```

Model Setup

5. Download Models

```
# Create models directory
pct exec 107 -- mkdir -p /opt/llama.cpp/models

# Download Qwen2.5-1.5B model (Q4_0 quantized)
pct exec 107 -- bash -c "cd /opt/llama.cpp/models && curl -L -o qwen2.5-1.5b-q4_0.gguf https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q4_0.gguf"
```

Service Configuration

6. Create systemd Service

```
# Create service file
pct exec 107 -- bash -c "printf '[Unit]\nDescription=llama.cpp Server\nAfter=network-online.target\nWants=network-online.target\n\n[Service]\nType=simple\nExecStart=/opt/llama.cpp/build/bin/llama-server --host 0.0.0.0 --port 11434 --threads 4 --model /opt/llama.cpp/models/qwen2.5-1.5b-q4_0.gguf --ctx-size 8192 --batch-size 512\nRestart=always\nRestartSec=3\nUser=root\nGroup=root\nEnvironment=HOME=/root\nStandardOutput=journal\nStandardError=journal\nSyslogIdentifier=llama-cpp\n\n[Install]\nWantedBy=multi-user.target\n' > /etc/systemd/system/llama-cpp.service"

# Enable and start service
pct exec 107 -- systemctl daemon-reload
pct exec 107 -- systemctl enable llama-cpp.service
pct exec 107 -- systemctl start llama-cpp.service
```

Testing and Verification

7. Verify Service Status

```
# Check service status
pct exec 107 -- systemctl status llama-cpp.service

# Check port binding
pct exec 107 -- ss -tlnp | grep :11434
```

8. Test API

```
# Test OpenAI-compatible API
pct exec 107 -- curl -X POST http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen2.5-1.5b-q4_0","messages":[{"role":"user","content":"Hello, how are you?"}],"max_tokens":50}'
```

Expected response: ...
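The expected response is truncated in this excerpt. As an additional hedged check, not part of the original post, recent llama-server builds also expose /health and /v1/models endpoints that can confirm the model is loaded before exercising chat completions.

```
# Assumed follow-up checks using llama-server's built-in endpoints
pct exec 107 -- curl -s http://localhost:11434/health
pct exec 107 -- curl -s http://localhost:11434/v1/models
```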

October 15, 2025 · 4 min · 798 words · Dmitry Konovalov

Ollama LXC Setup with GPU Acceleration and Web Interface

This guide covers setting up Ollama (Open Large Language Model) in a Proxmox LXC container with GPU passthrough and creating a simple web interface for easy interaction.

Overview

We’ll deploy Ollama in a resource-constrained LXC environment with:
- Intel UHD Graphics GPU acceleration
- llama3.2:1b model (~1.3GB)
- Lightweight Python web interface
- Auto-starting services

Prerequisites

- Proxmox VE host
- Intel integrated graphics (UHD Graphics)
- At least 4GB RAM allocated to LXC
- 40GB+ storage for container

Step 1: Container Setup

Check Available GPU Resources

First, verify GPU availability on the Proxmox host:

...
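The excerpt stops at the GPU check. A minimal sketch of what that host-side verification might involve (commands assumed here, not quoted from the full post):

```
# Confirm the Intel DRM devices exist on the Proxmox host
ls -l /dev/dri/

# Identify the integrated GPU on the PCI bus
lspci | grep -i vga

# Optional: check VA-API on the host (needs the vainfo package installed)
vainfo
```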

August 11, 2025 · 8 min · 1502 words · Dmitry Konovalov

Proxmox LVM Thin Pool Auto-Extend Configuration Fix

Problem

Proxmox shows this warning message:

```
WARNING: You have not turned on protection against thin pools running out of space.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
```

This warning indicates that LVM thin pools are not configured to automatically extend when they approach capacity, which could lead to VMs running out of disk space unexpectedly.

Solution

Configure LVM to automatically extend thin pools before they become full.

...
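A hedged sketch of the fix the warning points to: set the autoextend threshold and step in the activation section of /etc/lvm/lvm.conf. The 80/20 values below are illustrative, not necessarily the ones chosen in the full post.

```
# Illustrative values; edit /etc/lvm/lvm.conf inside the activation { } section
#   thin_pool_autoextend_threshold = 80   # start extending once the pool is 80% full
#   thin_pool_autoextend_percent   = 20   # grow the pool by 20% each time

# Verify what LVM actually picked up
lvmconfig activation/thin_pool_autoextend_threshold
lvmconfig activation/thin_pool_autoextend_percent

# Check current thin pool usage
lvs
```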

August 11, 2025 · 2 min · 372 words · Dmitry Konovalov

Plex NFS Performance Optimization - Proxmox Container

How to dramatically improve Plex Media Server performance by optimizing NFS buffer sizes in a Proxmox container environment, reducing system load from 10.76 to 1.39.
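The full post is not included in this listing; as an illustration of the kind of tuning it describes, NFS client buffer sizes are normally set through the rsize/wsize mount options. The share path and values below are placeholders, not the post's measured settings.

```
# Hypothetical /etc/fstab entry with enlarged NFS read/write buffers (placeholders)
# nas.example.com:/volume1/media  /mnt/media  nfs  rsize=1048576,wsize=1048576,vers=4.1,hard,noatime  0  0

# Confirm the options actually negotiated after mounting
nfsstat -m
```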

August 9, 2025 · 4 min · 827 words · Dmitry Konovalov

Configuring GPU Hardware Transcoding for Plex in Proxmox LXC Container

Complete guide to enable Intel GPU hardware transcoding for Plex Media Server running in a Proxmox LXC container using VA-API.
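As a rough sketch of what that usually involves (assumed here, not the post's exact configuration), the Intel iGPU is exposed to the container by passing /dev/dri through the LXC config, after which VA-API can be verified from inside the guest.

```
# Assumed lines in /etc/pve/lxc/<CTID>.conf to expose the Intel iGPU (DRM devices use major 226)
# lxc.cgroup2.devices.allow: c 226:* rwm
# lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir

# Inside the container: verify VA-API sees the GPU
apt install -y vainfo intel-media-va-driver-non-free
vainfo
```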

August 9, 2025 · 5 min · 1055 words · Dmitry Konovalov

Setting up Plex LXC Container on Proxmox with NFS and GPU Passthrough

Overview

This guide covers the complete setup of a Plex Media Server running in an LXC container on Proxmox VE, including NFS storage integration and Intel GPU passthrough for hardware transcoding.

Environment Details

- Host: Proxmox VE 8.4.8 (kernel 6.8.12-13-pve)
- Hardware: Intel N97 processor with integrated UHD Graphics
- Storage: NFS shares from NAS (nas.my.domain.com)
- Container: Ubuntu 22.04 LTS template

Prerequisites

1. Enable IOMMU on Proxmox Host

Ensure IOMMU is enabled in the kernel command line:

...
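The excerpt cuts off at the kernel command line. For reference, a hedged example of enabling IOMMU on an Intel host through GRUB; the exact flags used in the full post may differ.

```
# Illustrative setting in /etc/default/grub on an Intel host
# GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

# Apply, reboot, then confirm the IOMMU came up
update-grub
reboot
dmesg | grep -e DMAR -e IOMMU
```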

August 9, 2025 · 5 min · 953 words · Dmitry Konovalov

Proxmox GPU Passthrough, Q35 Machine Type Network Issues, and Plex Deployment

Overview

This comprehensive guide covers GPU passthrough setup in Proxmox, the network interface issues caused by switching to the Q35 machine type, and the complete deployment of Plex Media Server with Intel QSV hardware transcoding on a Talos Kubernetes cluster.

Part 1: GPU Passthrough Setup

Problem

Need to grant a Proxmox VM direct access to a GPU for hardware acceleration or AI workloads.

Solution Steps

1. Enable IOMMU in host BIOS/UEFI
   - Intel: Enable VT-d
   - AMD: Enable AMD-Vi
2. Configure host kernel parameters

...
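The remaining steps are truncated in this excerpt. They typically continue with loading the VFIO modules and binding the GPU to vfio-pci; the sketch below is assumed, and the PCI vendor:device ID is a placeholder.

```
# Load VFIO modules at boot (append to /etc/modules)
# vfio
# vfio_iommu_type1
# vfio_pci

# Bind the GPU to vfio-pci by vendor:device ID (placeholder ID shown)
# echo "options vfio-pci ids=8086:46d1" > /etc/modprobe.d/vfio.conf

# Rebuild the initramfs so the changes take effect at boot
update-initramfs -u -k all
```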

January 8, 2025 · 5 min · 943 words · Dmitry Konovalov