nexus/ctl.sh
Nexus Dev 3d2117ee9f fix(nexus): auto-bootstrap invite and vite onnxruntime middleware
Zero-terminal first boot. Previously the bootstrap_ceo invite had to be
created via a CLI command (paperclipai auth bootstrap-ceo) and the UI
showed a code block instructing the user to run it. Nexus is meant to
be zero-terminal, so the server now auto-creates the invite on startup
when no instance admin exists and exposes its relative path through
/api/health. BootstrapPendingPage redirects straight to /invite/{token}.
The CLI command is left intact for headless/SSH-only setups.
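
For headless setups, the invite path could also be read from the health
endpoint in a shell. This is a sketch only: the message above does not name
the JSON key that carries the path, so `bootstrapInvitePath` below is a
placeholder, and `health_field` is a hypothetical helper that handles only
flat string fields.

```shell
#!/usr/bin/env bash
set -euo pipefail

# health_field JSON KEY  (hypothetical helper, not part of the repo)
# Prints the value of a string field, or nothing if the key is absent.
# A regex is not a JSON parser: escaped quotes and nesting are not handled.
health_field() {
  local json=$1 key=$2
  printf '%s\n' "$json" |
    sed -nE "s/.*\"${key}\"[[:space:]]*:[[:space:]]*\"([^\"]*)\".*/\1/p" |
    head -n 1
}

# Sample payload; 'bootstrapInvitePath' is a placeholder key name.
sample='{"status":"ok","version":"0.0.0","bootstrapInvitePath":"/invite/abc123"}'
health_field "$sample" bootstrapInvitePath
```

Against a live server the first argument would be the output of
`curl -sf "http://localhost:$PORT/api/health"`, with the real key name
substituted for the placeholder.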

Invite flow fixes that surfaced during testing:

  - InviteLanding's invite query had default React Query refetch
    behavior. After a successful bootstrap accept, the invite is marked
    accepted server-side, so the refetch returned "not available" and
    shadowed the success screen, making it look like the bootstrap had
    failed when it actually succeeded. Set staleTime: Infinity +
    refetchOnWindowFocus/Mount/Reconnect: false so the first fetch is a
    one-shot snapshot.

  - Reordered the render checks so result?.kind === "bootstrap" / "join"
    are evaluated before the invite-availability error check — defensive
    against any stray refetch that still leaks through.

  - On bootstrap success, window.location.replace("/") lands the new
    admin directly on the board; the "Bootstrap complete" confirmation
    screen is now an unreachable safety net.

Vite onnxruntime middleware replaces the earlier public/ dump. The
previous commit put ort-wasm-simd-threaded.{mjs,wasm} in ui/public/ so
VAD's onnxWASMBasePath: "/" would find them. That works at runtime but
trips vite's dep optimizer: it scans onnxruntime-web, resolves the
dynamic import string to the public asset, and errors with "files in
/public should not be imported from source code." Remove the files and
add a vite plugin (configureServer middleware) that serves the two URLs
straight from node_modules/.pnpm/onnxruntime-web@*/. Runtime keeps
working and the files never enter vite's module graph.

Production build caveat: the middleware only runs in dev. When building
a static dist for production, the wasm files will need a different
mechanism (e.g. generateBundle hook). Not addressed here.

Also bundled (load-bearing for LAN browser testing):

  - ui/src/lib/queryKeys.ts: add missing 'nexus' group. useNexusMode
    referenced queryKeys.nexus.settings since commit 7bb72a5a (Phase
    33-02) but the key was never added. Caused a blank screen crash on
    any page that mounts Sidebar.

  - ctl.sh: read PORT from .env instead of hardcoding 3100, and read it
    once at the top so every subcommand honors it. Fixes the Version /
    Mode showing '?' in status output after the port move to 6100.
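
The ctl.sh change in the last bullet can be sketched as a small helper. This
is an illustration, not the script's actual code: `read_env_value` is a
hypothetical name, and it assumes plain `KEY=value` lines without quoting or
`export` prefixes. The `|| true` matters because the script runs under
`set -euo pipefail`, where a failed `grep` would otherwise abort before the
default is applied.

```shell
#!/usr/bin/env bash
set -euo pipefail

# read_env_value FILE KEY DEFAULT  (hypothetical helper, not part of ctl.sh)
# Prints the value of the last KEY=... line in FILE, or DEFAULT when the
# file or the key is missing.
read_env_value() {
  local file=$1 key=$2 default=$3 value
  # grep exits non-zero when nothing matches or the file is absent;
  # '|| true' keeps errexit and pipefail from terminating the script.
  value=$(grep -s "^${key}=" "$file" | tail -n 1 | cut -d= -f2- || true)
  value=${value%$'\r'}   # tolerate CRLF line endings
  printf '%s\n' "${value:-$default}"
}

PORT=$(read_env_value "${ENV_FILE:-.env}" PORT 6100)
echo "Using port $PORT"
```

Reading the value once at the top and reusing `$PORT` everywhere is what keeps
the health check, the status output, and the printed URLs in agreement.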

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:50:28 +00:00


#!/usr/bin/env bash
#
# ctl.sh — Nexus dev server control script
#
# Usage:
#   ./ctl.sh start      Start the dev server (background, logs to .paperclip/dev.log)
#   ./ctl.sh stop       Gracefully stop the dev server and all child processes
#   ./ctl.sh restart    Stop then start
#   ./ctl.sh status     Show whether the server is running
#   ./ctl.sh logs       Tail the dev server log
#   ./ctl.sh fg         Start in foreground (interactive, Ctrl-C to stop)
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PIDFILE="$SCRIPT_DIR/.paperclip/dev.pid"
LOGFILE="$SCRIPT_DIR/.paperclip/dev.log"
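# Read PORT from .env (default 6100) so every command uses the same port.
# '|| true' keeps 'set -euo pipefail' from aborting when .env or PORT is missing;
# without it the default below would never be reached.
PORT=$(grep -s '^PORT=' "$SCRIPT_DIR/.env" | cut -d= -f2 || true)
PORT=${PORT:-6100}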
mkdir -p "$(dirname "$PIDFILE")"

is_running() {
  [[ -f "$PIDFILE" ]] || return 1
  local pid
  pid=$(<"$PIDFILE")
  kill -0 "$pid" 2>/dev/null
}

get_pid() {
  [[ -f "$PIDFILE" ]] && cat "$PIDFILE" || echo ""
}

do_start() {
  if is_running; then
    echo "Already running (pid $(get_pid)). Use '$0 restart' to restart."
    exit 0
  fi
  echo "Starting Nexus dev server..."
  cd "$SCRIPT_DIR"
  # setsid creates a new process group so 'kill -- -PGID' reaches all children
  setsid pnpm dev >> "$LOGFILE" 2>&1 &
  local parent_pid=$!
  echo "$parent_pid" > "$PIDFILE"
  # Wait for health endpoint
  local ready=false
  for i in $(seq 1 30); do
    if curl -sf "http://localhost:$PORT/api/health" > /dev/null 2>&1; then
      ready=true
      break
    fi
    sleep 1
  done
  if $ready; then
    echo "Server ready (pid $parent_pid) — http://localhost:$PORT"
    echo "Logs: $LOGFILE"
  else
    echo "Server started (pid $parent_pid) but health check not yet responding."
    echo "Check logs: tail -f $LOGFILE"
  fi
}

do_stop() {
  if ! is_running; then
    echo "Not running."
    # Clean up orphans anyway
    "$SCRIPT_DIR/scripts/kill-dev.sh" 2>/dev/null || true
    rm -f "$PIDFILE"
    return 0
  fi
  local pid
  pid=$(get_pid)
  echo "Stopping Nexus dev server (pid $pid)..."
  # Kill the entire process group (negative PID = PGID).
  # setsid in do_start makes the parent the session/group leader,
  # so this reaches all children: pnpm, tsx, cross-env, node server, etc.
  kill -TERM -- -"$pid" 2>/dev/null || kill -TERM "$pid" 2>/dev/null || true
  # Wait up to 5s for graceful shutdown. Probe the whole group, not only the
  # leader: children can outlive the leader and would otherwise be missed.
  local waited=0
  while (( waited < 50 )) && { kill -0 -- -"$pid" 2>/dev/null || kill -0 "$pid" 2>/dev/null; }; do
    sleep 0.1
    ((waited += 1))
  done
  # Force-kill any stragglers in the group
  if kill -0 -- -"$pid" 2>/dev/null || kill -0 "$pid" 2>/dev/null; then
    echo "Sending SIGKILL to remaining processes..."
    kill -KILL -- -"$pid" 2>/dev/null || kill -KILL "$pid" 2>/dev/null || true
  fi
  # Use kill-dev.sh to mop up embedded postgres
  "$SCRIPT_DIR/scripts/kill-dev.sh" 2>/dev/null || true
  rm -f "$PIDFILE"
  echo "Stopped."
}

do_status() {
  if ! is_running; then
    rm -f "$PIDFILE"
    echo "Nexus is not running."
    return
  fi
  local pid
  pid=$(get_pid)
  # Uptime from pidfile modification time (written at start)
  local uptime started_at
  if [[ -f "$PIDFILE" ]]; then
    local now pidfile_epoch
    now=$(date +%s)
    pidfile_epoch=$(stat -c %Y "$PIDFILE" 2>/dev/null || echo "$now")
    local diff=$(( now - pidfile_epoch ))
    local days=$(( diff / 86400 ))
    local hours=$(( (diff % 86400) / 3600 ))
    local mins=$(( (diff % 3600) / 60 ))
    if (( days > 0 )); then
      uptime="${days}d ${hours}h ${mins}m"
    elif (( hours > 0 )); then
      uptime="${hours}h ${mins}m"
    else
      uptime="${mins}m"
    fi
    started_at=$(date -d "@$pidfile_epoch" "+%Y-%m-%d %H:%M" 2>/dev/null || echo "?")
  else
    uptime="?"
    started_at="?"
  fi
  # Query health endpoint for deploy info
  local health
  health=$(curl -sf --max-time 2 "http://localhost:$PORT/api/health" 2>/dev/null || echo "")
  local mode exposure version
  mode=$(echo "$health" | grep -oP '"deploymentMode"\s*:\s*"\K[^"]+' || echo "?")
  exposure=$(echo "$health" | grep -oP '"deploymentExposure"\s*:\s*"\K[^"]+' || echo "?")
  version=$(echo "$health" | grep -oP '"version"\s*:\s*"\K[^"]+' || echo "?")
  # Detect listen host/port from .env or defaults
  local host port
  host=$(grep -s '^HOST=' "$SCRIPT_DIR/.env" | cut -d= -f2 || echo "127.0.0.1")
  host=${host:-127.0.0.1}
  port=$PORT
  # LAN IP (first non-loopback IPv4)
  local lan_ip
  lan_ip=$(ip -4 -o addr show scope global 2>/dev/null | awk '{print $4}' | cut -d/ -f1 | head -1 || echo "")
  # Detect embedded postgres
  local pg_pid pg_port pg_status
  pg_pid=$(ps --ppid "$pid" -o pid,args --no-headers -w 2>/dev/null | grep postgres | awk '{print $1}' | head -1 || echo "")
  if [[ -z "$pg_pid" ]]; then
    # postgres may be deeper in the tree — search all descendants
    pg_pid=$(ps -eo pid,args --no-headers 2>/dev/null | grep "[p]ostgres -D" | awk '{print $1}' | head -1 || echo "")
  fi
  pg_port=$(grep -s 'embeddedPostgresPort' "$HOME/.paperclip/instances/default/config.json" 2>/dev/null | grep -oP '\d+' || echo "54329")
  if [[ -n "$pg_pid" ]]; then
    pg_status="running (pid $pg_pid, port $pg_port)"
  else
    pg_status="not detected"
  fi
  # Vite HMR port
  local hmr_port=$(( port + 10000 ))
  # Process count in group
  local proc_count
  proc_count=$(ps -g "$pid" --no-headers 2>/dev/null | wc -l || echo "?")
  # Print status
  echo ""
  echo " Nexus Dev Server"
  echo " ──────────────────────────────────────────"
  echo " Status running since $started_at ($uptime)"
  echo " Version $version"
  echo " PID $pid ($proc_count processes)"
  echo " Mode $mode ($exposure)"
  echo ""
  echo " API http://localhost:$port/api"
  echo " UI http://localhost:$port"
  if [[ "$host" == "0.0.0.0" && -n "$lan_ip" ]]; then
    echo " LAN http://$lan_ip:$port"
  fi
  echo " HMR ws://localhost:$hmr_port"
  echo ""
  echo " Postgres $pg_status"
  echo " Log $LOGFILE"
  echo ""
}

do_logs() {
  if [[ -f "$LOGFILE" ]]; then
    tail -f "$LOGFILE"
  else
    echo "No log file at $LOGFILE"
  fi
}

do_fg() {
  if is_running; then
    echo "Already running in background (pid $(get_pid)). Stop it first with '$0 stop'."
    exit 1
  fi
  cd "$SCRIPT_DIR"
  exec pnpm dev
}

case "${1:-}" in
  start)   do_start ;;
  stop)    do_stop ;;
  restart) do_stop; do_start ;;
  status)  do_status ;;
  logs)    do_logs ;;
  fg)      do_fg ;;
  *)
    echo "Usage: $0 {start|stop|restart|status|logs|fg}"
    exit 1
    ;;
esac
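
The setsid and process-group technique that do_start and do_stop rely on can
be tried in isolation. A minimal sketch, assuming Linux with util-linux
`setsid`: `sleep 300` stands in for `pnpm dev`, and `stop_group` is a
hypothetical helper rather than a function of this script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# stop_group PGID [TENTHS]  (hypothetical helper, not part of ctl.sh)
# Sends SIGTERM to every process in the group, waits up to TENTHS * 0.1s
# for the group to empty, then falls back to SIGKILL.
stop_group() {
  local pgid=$1 limit=${2:-50} waited=0
  kill -TERM -- "-$pgid" 2>/dev/null || return 0   # group already gone
  # 'kill -0' on a negative PID succeeds while any group member exists.
  while kill -0 -- "-$pgid" 2>/dev/null && (( waited < limit )); do
    sleep 0.1
    waited=$(( waited + 1 ))
  done
  kill -KILL -- "-$pgid" 2>/dev/null || true
}

# Stand-in for 'setsid pnpm dev': the child becomes leader of a new
# session and process group, so its PID doubles as the PGID.
setsid sleep 300 &
leader=$!

# Give setsid a moment to create the new group before signalling it.
for _ in 1 2 3 4 5 6 7 8 9 10; do
  kill -0 -- "-$leader" 2>/dev/null && break
  sleep 0.1
done

stop_group "$leader"
wait "$leader" 2>/dev/null || true   # reap the child in this shell
```

Signalling the negative PID is what lets a single `kill` reach processes the
script never started directly. Anything that moves itself into another
session is out of reach of this signal, which is presumably why do_stop also
calls scripts/kill-dev.sh for the embedded postgres.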