Project Memory

Receipt Manager (separate from NextSnap)

  • Source: compute1:~/ (receipts.html + server.py), Gitea repo kamaji/receipt-manager
  • Container: receipt-manager on docker1 (~/receipt-manager/), port 8082
  • Deploy: scp receipts.html server.py docker1:~/receipt-manager/ && ssh docker1 'cd ~/receipt-manager && docker-compose down && docker-compose up -d --build'
  • Stack: Python stdlib HTTP server, vanilla JS frontend, Nextcloud WebDAV for storage
  • NOT part of NextSnap — separate app, separate container, separate repo

NextSnap Architecture

  • Flask PWA on docker1 (~/nextsnap), Docker container
  • Deploy: ssh docker1 'cd ~/nextsnap && docker-compose down && docker-compose up -d --build'
  • Files accessed via SSH (scp to deploy, ssh to run commands)
  • Frontend: Vanilla JS, Dexie.js (IndexedDB), Service Worker for offline-first
  • Reverse proxy: nginx-reverse host, config at /etc/nginx/sites-available/reverse-proxy.conf

Infrastructure - nginx-reverse

  • VM: on host compute1 (migrated from console, 2026-02-22)
  • Disk: /1TB/vm/nginx-reverse.qcow2 on compute1
  • IP: 192.168.128.19
  • Sudo: %sudo group has NOPASSWD:ALL; kamaji is in sudo group
  • SSH from compute1: Uses id_rsa key (added to authorized_keys)
  • Certbot: Webroot at /var/www/letsencrypt, auto-renewal enabled
  • Services: nginx (reverse proxy), iperf3 (speed test), WireGuard (wg0 — direct VPN server)
  • WireGuard: Direct VPN server on port 51820, clients connect via 104.52.199.76:51820
  • Autostart enabled on compute1

Infrastructure - docker1

  • VM: "docker1" on host compute1 (migrated from "docker-box" on console, 2026-02-22)
  • Disk: /1TB/vm/docker-box.qcow2 on compute1
  • RAM: 8 GB, vCPUs: 8
  • IP: 192.168.128.5
  • Root disk: 24 GB /dev/vda1 — tight with Docker; monitor usage
  • NFS mount: console:/Storage → /Storage (in fstab with _netdev)
  • Nextcloud data: /opt/nextcloud/data → /Storage/nextcloud/data (symlink)
  • Nextcloud compose: /opt/nextcloud/docker-compose.yml (app + redis + mariadb)
  • Nextcloud Office: Built-in CODE (richdocumentscode AppImage), runs COOLWSD on port 9983 inside container
  • Autostart enabled on compute1

Nextcloud Backup (legacy)

  • Script: docker1:~/nextcloud-backup.sh — hourly cron, predates snapback
  • Note: Snapback now handles the same backups (nextcloud-data, nextcloud-app, nextcloud-db jobs)
  • SSH from docker1→compute1 uses kamaji's ed25519 key; sudo rsync needs explicit SSH opts

Two User Classes

  • Admin: NC admin login (username+password), full access including admin panel
  • Tech: Field workers, login with username+PIN, access to capture/queue/browser only
  • Tech users stored in data/tech_users.json (Docker named volume app_data at /app/data)
  • PIN hashed with werkzeug.security; NC password stored plaintext (admin-visible)
  • Tech user creation auto-provisions NC account via OCS API
  • Session stores NC password (base64) for transparent API calls
  • session['user_type'] = 'admin' | 'tech'; session['is_admin'] for nav guards
  • Admin panel: tappable user list → detail modal (enable/disable, reset PIN, reset NC password, delete)
  • Login page: tabbed UI — "Tech Login" (default) and "Admin Login"

Key Bugs & Lessons

Service Worker Caching (CRITICAL)

  • SW was cache-first for ALL routes including HTML pages → stale code served forever
  • Fix: Only /static/ uses cache-first; pages + API use network-first
  • Always bump SW cache versions when changing any static asset
  • Even after SW bump, user needs one hard refresh to pick up new SW
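
A minimal sketch of that fetch-handler split, assuming a cache name and the /static/ prefix as described above (the real SW may differ in detail):

```js
// sw.js (sketch): cache-first only for /static/, network-first for everything else.
// CACHE_NAME is a hypothetical version string; bump it whenever any static asset changes.
const CACHE_NAME = 'nextsnap-v2';

self.addEventListener('fetch', (event) => {
  const url = new URL(event.request.url);

  if (url.pathname.startsWith('/static/')) {
    // Cache-first: safe because the cache version gets bumped on asset changes.
    event.respondWith(
      caches.match(event.request).then((hit) => hit || fetch(event.request))
    );
  } else {
    // Network-first: pages and API always hit the network, fall back to cache when offline.
    event.respondWith(
      fetch(event.request).catch(() => caches.match(event.request))
    );
  }
});
```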

DOM Element Destroyed by innerHTML (queue.html)

  • getElementById('empty-state') returned null after innerHTML = '' destroyed it
  • Caused silent crash on every subsequent loadQueue() call
  • Counters still worked because updateStats() ran independently
  • Lesson: Never reference DOM elements inside a container you clear with innerHTML
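
A condensed sketch of the failure mode and the fix; the element and container ids here are assumptions, not the actual queue.html markup:

```js
// Bug: #empty-state lives INSIDE the container that loadQueue() clears.
const list = document.getElementById('queue-list');      // hypothetical container id
list.innerHTML = '';                                      // this destroys #empty-state
document.getElementById('empty-state').hidden = true;     // null reference: silent crash

// Fix: never reference elements inside a cleared container; rebuild the empty state instead.
function showEmptyState(list) {
  const el = document.createElement('p');
  el.id = 'empty-state';
  el.textContent = 'No items in queue';
  list.appendChild(el);
}
```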

iOS Safari Blob Eviction

  • Blobs stored in IndexedDB are file-backed references that iOS can evict
  • instanceof Blob and .size checks pass but reading data throws "Load failed"
  • Fix: Store as ArrayBuffer (inline bytes) + navigator.storage.persist()
  • Storage.getBlob(photo) converts ArrayBuffer→Blob at read time
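
A sketch of the write/read path under those fixes; the Dexie table and field names are assumptions:

```js
// Write: inline the bytes, so iOS has no file-backed Blob reference to evict.
async function savePhoto(db, blob) {
  await navigator.storage.persist();                  // request persistent storage (best effort)
  const data = await blob.arrayBuffer();              // ArrayBuffer = inline bytes in IndexedDB
  await db.photos.add({ data, type: blob.type });     // 'photos' table/fields are hypothetical
}

// Read: rebuild a Blob from the stored bytes (the Storage.getBlob(photo) idea).
function getBlob(photo) {
  return new Blob([photo.data], { type: photo.type });
}
```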

iOS Safari Fetch "Load Failed"

  • Generic error for any network failure (backgrounding, connectivity)
  • Upload POST fails client-side, never reaches server
  • Show friendly messages, hide transient errors (< 3 retries) from UI
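
A rough sketch of the quiet-retry idea; the retry threshold and the showToast() helper are illustrative, not the real implementation:

```js
// iOS Safari reports any network failure as a generic "Load failed" TypeError,
// so treat early failures as transient and only surface a message after the 3rd attempt.
async function uploadWithRetry(url, formData, attempt = 1) {
  try {
    return await fetch(url, { method: 'POST', body: formData });
  } catch (err) {
    if (attempt < 3) {
      return uploadWithRetry(url, formData, attempt + 1);    // transient: retry, no UI noise
    }
    showToast('Upload failed. It will retry automatically.'); // hypothetical friendly-message helper
    throw err;
  }
}
```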

Nextcloud OCS API 412 Error

  • All OCS endpoints require OCS-APIRequest: true header (CSRF protection)
  • Without it → 412 Precondition Failed
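
The header requirement in request form, shown as a browser-style fetch for consistency with the other sketches; in NextSnap the OCS calls are actually made server-side from Flask, and the host and credentials here are placeholders:

```js
// Without the OCS-APIRequest header, Nextcloud answers 412 Precondition Failed.
fetch('https://cloud.example.com/ocs/v1.php/cloud/users?format=json', {
  headers: {
    'OCS-APIRequest': 'true',                             // the CSRF-protection header
    'Authorization': 'Basic ' + btoa('admin:password'),   // placeholder credentials
  },
}).then((res) => res.json());
```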

Docker Volume Permissions

  • Named volumes mounted to dirs created in Dockerfile inherit ownership from image
  • If volume already exists with root ownership, must docker-compose down -v to recreate
  • Container runs as nextsnap (uid 1000) — /app/data must be owned by nextsnap

iOS Safari inputmode="numeric" + type="password"

  • Combining these causes browser validation error "string did not match expected pattern"
  • Fix: Use type="text" with inputmode="numeric" and validate in JS

Nextcloud CODE "Document loading failed"

  • COOLWSD (CODE server) stores temp files in /tmp jails inside the container
  • Disk full → "Low disk space" / "Out of storage" in COOLWSD logs → 400/408 on proxy.php
  • Memory exhaustion → inconsistent loading (works sometimes), CODE processes killed/timed out
  • Fix: Free disk space, increase VM RAM; COOLWSD log at /tmp/coolwsd.*/coolwsd.log inside container
  • After fixing, restart container so COOLWSD gets a clean start

MikroTik (cornerLot)

  • Model: RB750Gr3, RouterOS 6.48.2, identity cornerLot
  • IP: 192.168.88.1 (clients), 192.168.128.1 (servers)
  • SSH: Port 65523, user kamaji
  • SSH keys: RouterOS 6.x disables password auth once a key is imported; key must work or user is locked out (WinBox recovery only)
  • SSH from compute1: Dedicated key ~/.ssh/mikrotik_rsa, SSH config alias mikrotik; requires ssh-rsa algo + DEFAULT:SHA1 crypto policy (RHEL 9 blocks SHA-1 by default)
  • NAT rules: Working dstnat rules use dst-address=104.52.199.76, NOT in-interface=ether1
  • Hairpin NAT: srcnat masquerade for 192.168.0.0/16 at rule 2
  • Forward chain: action=accept with no conditions at rule 3 (accepts all)

Podman SELinux on compute1

  • Volume mounts require :z flag (e.g. -v /path:/mount:ro,z)
  • Without it: PermissionError: [Errno 13] Permission denied inside container

NFS Mount Not in fstab

  • /Storage was NFS-mounted manually (console:/Storage) but not in fstab
  • After VM reboot, Docker couldn't start nextcloud-app: broken symlink to /Storage/nextcloud/data
  • Fix: Added console:/Storage /Storage nfs defaults,_netdev 0 0 to fstab

Infrastructure - console

  • IP: 192.168.88.5 (servers subnet)
  • Role: KVM host (libvirt, Debian), NFS server (/Storage)
  • VMs remaining: devtest1 (autostart), pihole2, unifi, win10-signed_to_outlook.com
  • Migrated off (2026-02-22): docker-box→docker1, docker2, nginx-reverse, pihole, jellyfin2, git1 — all to compute1
  • win10-signed_to_outlook.com: moved from compute1 (2026-02-23), /var/lib/libvirt/images/win10.qcow2, 8 GB RAM, 4 vCPUs, autostart enabled, Hyper-V enlightenments enabled (vpindex, synic, stimer, frequencies, reenlightenment, tlbflush — dropped idle CPU from ~32% to ~5%)

Gitea (git1)

  • VM: on host compute1 (migrated from console, 2026-02-22)
  • Disk: /1TB/vm/git1.qcow2 on compute1
  • IP: 192.168.128.23
  • URL: https://git.sdanywhere.com/
  • Version: 1.25.4
  • Config: /etc/gitea/app.ini, runs as kamaji user
  • Repo root: /srv/git/
  • SSH push: Use kamaji@git1:kamaji/<repo>.git format (Gitea SSH wrapper)
  • Do NOT use bare repo paths (git1:/srv/git/...) — pre-receive hook rejects without Gitea env
  • authorized_keys: Plain keys (docker1, nginx-reverse, dns1) for shell access; RSA key has Gitea command= wrapper for git operations
  • Shell access to git1: Use docker1 (ed25519 key), not this machine (RSA key goes through Gitea wrapper)
  • API: Available at https://git.sdanywhere.com/api/v1/, generate tokens via gitea --config /etc/gitea/app.ini admin user generate-access-token
  • Repos: kamaji/receipt-manager (Receipt Manager), kamaji/snapback (backup system)

Snapback (compute1)

  • Repo: kamaji/snapback on Gitea
  • Deployed to: compute1:~/snapback/
  • Backup root: /1TB/backups/ on compute1
  • Cron: 0 * * * * runs all jobs hourly
  • SSH key: id_rsa on compute1 (not ed25519)
  • Log: ~/snapback/snapback.log (not /var/log — not writable by kamaji)
  • ntfy: Sends to https://ntfy.sdanywhere.com topic backups
  • Retention: 24 hourly, 7 daily, 4 weekly
  • Jobs: nextcloud-data (rsync), nextcloud-app (rsync+sudo), nextcloud-db (db_dump), gitea-repos (rsync), gitea-data (rsync), gitea-db (db_dump), nginx-config (rsync), nginx-certs (rsync+sudo), nginx-wireguard (rsync+sudo), mikrotik-config (db_dump, /export), mikrotik-backup (db_dump, local, binary .backup)
  • Local jobs: local: true on a db_dump job runs the command directly instead of via SSH
  • MikroTik SSH: Dedicated key ~/.ssh/mikrotik_rsa on compute1; SSH config alias mikrotik (port 65523, user kamaji, ssh-rsa algo for RouterOS 6.x); requires DEFAULT:SHA1 crypto policy on compute1 (RHEL 9)
  • MikroTik helper: ~/snapback/mikrotik-backup.sh — creates backup on router, SCPs it off, outputs to stdout

Snapback Web Browser (compute1)

  • Container: snapback-web via podman on compute1, port 8082
  • Public URL: https://backups.sdanywhere.com (via nginx-reverse proxy)
  • Auth: HTTP Basic Auth (BROWSER_USER/BROWSER_PASSWORD env vars)
  • Source: compute1:~/snapback/web/ (Flask + gunicorn, vanilla JS SPA)
  • Mounts: /1TB/backups:/backups:ro,z, ~/.ssh:/ssh:ro,z, config.yml:/app/config.yml:ro,z
  • Features: Browse snapshots, preview text/images, download files/zips, restore via rsync
  • SSH key handling: Copies /ssh/* to /tmp/.ssh/ with correct permissions at startup
  • Restore: Only rsync-type jobs; builds rsync command from config job metadata
  • SELinux: Podman on compute1 requires :z flag on volume mounts
  • Deploy: ssh compute1 'podman build -t snapback-web ~/snapback/web/ && podman stop snapback-web && podman rm snapback-web && podman run -d --name snapback-web -p 8082:8082 -v /1TB/backups:/backups:ro,z -v /home/kamaji/.ssh:/ssh:ro,z -v /home/kamaji/snapback/config.yml:/app/config.yml:ro,z -e BROWSER_USER=kamaji -e BROWSER_PASSWORD=<pw> --restart unless-stopped snapback-web'

Sysmon (compute1)

  • Source: devtest1:~/sysmon/, deployed to compute1:~/sysmon/
  • Dashboard: Podman container sysmon on compute1, port 8083 (pure aggregator, no /proc)
  • Stack: Flask + gunicorn, vanilla JS frontend, dark theme
  • Architecture: Single codebase, two modes controlled by SYSMON_SERVERS env var
    • Agent mode (no env var): Reads local /proc + sudo virsh for VMs, serves /api/stats
    • Dashboard mode (env var set): Pure aggregator, no /proc or CPU sampler; serves UI at /
  • Config: SYSMON_SERVERS="compute1:http://host.containers.internal:8084,console:http://192.168.88.5:8083"
  • Agents: compute1 (port 8084, systemd), console (port 8083, systemd) — both bare installs, python3 -m gunicorn
  • UI: Hash-based routing — dashboard summary cards (#) + server detail view (#<name>) with CPU grid, memory, load, uptime, VM table (see the sketch after this list)
  • VM base info: sudo virsh dominfo per VM (30s cache) — name, state, vcpus, memory, autostart
  • VM CPU %: sudo virsh domstats --cpu-total --balloon (5s cache), delta tracking for CPU %
  • VM memory: From balloon stats (available - unused); Windows VMs lack guest-side stats, show allocated only
  • VM disk: sudo virsh guestinfo --filesystem per VM (30s cache, async background refresh to avoid blocking)
  • Dashboard fetch timeout: 8s (cold-cache agent responses can take 2-3s due to dominfo calls)
  • Deploy dashboard: scp ~/sysmon/{app.py,templates/index.html,run.sh,Dockerfile} compute1:~/sysmon/ && ssh compute1 'mkdir -p ~/sysmon/templates && mv ~/sysmon/index.html ~/sysmon/templates/ && ~/sysmon/run.sh'
  • Deploy agent: ~/sysmon/deploy-agent.sh <host> [port] (default port 8083)
  • SELinux note: compute1 (RHEL 9) blocks ~/.local/bin/gunicorn from systemd; use python3 -m gunicorn instead
  • Adding servers: Add name:url to SYSMON_SERVERS, deploy agent on new host, rebuild dashboard
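
Minimal sketch of the hash routing from the UI bullet above; the render helpers are hypothetical names:

```js
// '#' shows the dashboard summary cards, '#<name>' shows that server's detail view.
function route() {
  const name = decodeURIComponent(location.hash.slice(1));
  if (name) {
    showServerDetail(name);   // hypothetical: CPU grid, memory, load, uptime, VM table
  } else {
    showDashboard();          // hypothetical: summary cards for all servers
  }
}
window.addEventListener('hashchange', route);
route();   // render the initial view on load
```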

iperf3 (nginx-reverse)

  • Service: iperf3.service (systemd, installed via apt)
  • Port: 5201 (TCP+UDP)
  • DNS: speed.sdanywhere.com
  • MikroTik NAT: dstnat rules use dst-address=104.52.199.76 (NOT in-interface=ether1 — WAN traffic doesn't match ether1)
  • Hairpin NAT: srcnat masquerade for 192.168.0.0/16 (rule 2 in NAT table)

Infrastructure - Pi-hole HA

  • pihole1 (primary): 192.168.128.3 on compute1, /1TB/vm/pihole.qcow2, 2 GB RAM, 2 vCPUs
  • pihole2 (replica): 192.168.128.2 on console, /var/lib/libvirt/images/pihole2.qcow2, 1 GB RAM, 2 vCPUs
  • Both running Pi-hole v6.4, autostart enabled
  • Sync: Nebula Sync container (nebula-sync) on docker1, full sync every 30 min (pihole1 → pihole2)
  • DHCP DNS: MikroTik hands out both 192.168.128.3 and 192.168.128.2
  • Cross-host redundancy: pihole1 on compute1, pihole2 on console
  • Admin password: both set to same password via sudo pihole setpassword
  • pihole2 SSH: kamaji user, ed25519 key generated on pihole2, can SSH to pihole1

Infrastructure - unifi

  • VM: on host console (moved back from compute1, 2026-02-23)
  • Disk: /var/lib/libvirt/images/unifi.qcow2 on console
  • IP: 192.168.128.6
  • RAM: 1 GB, vCPUs: 2
  • Services: UniFi Network Controller
  • Autostart enabled on console

Infrastructure - jellyfin2

  • VM: on host compute1 (migrated from console, 2026-02-22)
  • Disk: /1TB/vm/jellyfin2.qcow2 on compute1
  • IP: 192.168.128.4
  • RAM: 1 GB, vCPUs: 2
  • NFS mount: console:/Storage → /Storage (in fstab with _netdev)
  • Services: Jellyfin media server (port 8096)
  • Autostart enabled on compute1

Infrastructure - zoneminder

  • VM: on host compute1, cloned from debian12-template (2026-02-23)
  • Disk: /1TB/vm/zoneminder.qcow2 on compute1
  • IP: 192.168.128.20
  • RAM: 4 GB, vCPUs: 4
  • OS: Debian 12 Bookworm
  • Services: ZoneMinder 1.36.33 (CCTV/surveillance), Apache2, MariaDB
  • DB: zm database, user zmuser/zmpass
  • Web UI: http://192.168.128.20/zm/
  • Config: /etc/zm/zm.conf (group www-data), overrides in /etc/zm/conf.d/
  • ffmpeg: Configured in /etc/zm/conf.d/03-ffmpeg.conf
  • Sudo: kamaji has NOPASSWD:ALL via /etc/sudoers.d/kamaji
  • Autostart enabled on compute1

Infrastructure - compute1 (VM host)

  • IP: 192.168.88.9 (servers subnet), also reachable as compute1
  • OS: RHEL 9 (Rocky), QEMU at /usr/libexec/qemu-kvm
  • libvirt: Machine type q35 (RHEL variant), VNC graphics (no SPICE support)
  • Networking: Bridge br0 on enp4s0f1 for VM bridged networking
  • VM storage: /1TB/vm/ on 932 GB local disk
  • VMs hosted: docker1, docker2, nginx-reverse, pihole, jellyfin2, git1, zoneminder, macos/SheepShaver, dos/DOSBox
  • Migration notes: Adapted from console (Debian): emulator path, machine type, macvtap→bridge, SPICE→VNC

Retro VMs

See retro-vms.md — DOSBox (192.168.128.31) and SheepShaver/Mac OS 9 (192.168.128.30) on compute1, via Guacamole

Guacamole (docker1)

  • Containers: guacamole, guacd, guac-mysql on docker1
  • Compose: docker1:~/guacamole/docker-compose.yml
  • DB: MySQL 8.0, user guacuser / guacpassword, database guacamole
  • Port: 8080 (guacamole webapp)
  • Auth: MySQL + LDAP (OpenLDAP, ou=users,dc=sdanywhere,dc=com)
  • kamaji entity_id: 2

VPN (Direct WireGuard)

  • Server: nginx-reverse, 10.10.10.1/24 (wg0), ListenPort 51820
  • Public endpoint: 104.52.199.76:51820 (MikroTik dstnat → 192.168.128.19:51820)
  • Config: /etc/wireguard/wg0.conf on nginx-reverse, wg-quick@wg0 enabled
  • Split tunnel: Clients route 192.168.128.0/24, 192.168.88.0/24, 10.10.10.0/24 through VPN
  • DNS: 192.168.128.3 (Pi-hole primary), 192.168.128.2 (Pi-hole replica) — no public fallback
  • Clients: iPhone at 10.10.10.2/32, MikroTik travel router at 10.10.10.3/32, Windows laptop at 10.10.10.4/32, Linux client at 10.10.10.5/32
  • Routing: No masquerade — MikroTik has static route 10.10.10.0/24 via 192.168.128.19 so LAN devices can reach VPN clients directly; bidirectional FORWARD rules; MSS clamping for TCP on both enp1s0 and wg0
  • MikroTik notrack: Raw prerouting rule dst-address=10.10.10.0/24 action=notrack — required because nginx-reverse delivers VPN→LAN packets directly on L2 (same subnet), creating asymmetric routing; without notrack, MikroTik conntrack marks reply packets as invalid and drops them intermittently
  • sdanywhere.com: VPN services disabled (2026-02-22), still active as web server (64.227.104.26, Debian 12, login root)

ntfy (docker2)

  • VM: on host compute1 (migrated from console, 2026-02-22)
  • Disk: /1TB/vm/docker2.qcow2 on compute1
  • IP: 192.168.128.8
  • Container: binwiederhier/ntfy on docker2, port 8080; uses docker compose v2
  • Deploy: ssh docker2 'cd ~/snapback && docker compose down && docker compose up -d'
  • URL: https://ntfy.sdanywhere.com, auth deny-all, admin user kamaji
  • Config: docker2:~/snapback/ntfy/server.yml, upstream relay to ntfy.sh for mobile push
  • Topic: backups (requires auth)
  • Autostart enabled on compute1

NextSnap File Locations

See nextsnap-files.md