Build Monitoring

ARROW provides real-time monitoring for VM build and provisioning processes, allowing you to track progress, view logs, and troubleshoot issues. The monitoring system is implemented in backend/api/vm_build_monitor/.

The build monitoring system offers:

  • Real-Time Status: Live updates on build progress via WebSocket
  • Log Streaming: WebSocket-based live log viewing at /api/vm-build/logs/stream
  • Progress Tracking: Visual indicators of build phases with percentage completion
  • Error Visibility: Immediate notification of failures with detailed error messages
  • Build History: Access to past build logs stored in B2 storage

The components connect as follows:

flowchart TD
    subgraph Console["ARROW Console"]
        A[Build Monitor UI]
        B[WebSocket Client]
    end

    subgraph Backend["PocketBase Backend"]
        C[vm_build_monitor handlers]
        D[WebSocket Hub]
        E[Task Queue Manager]
    end

    subgraph Storage["B2 Storage"]
        F[Build Logs]
    end

    subgraph Builders["Build Servers"]
        G[GitHub Actions Runner]
        H[Local Builder]
    end

    B <-->|WebSocket| D
    A --> C
    G -->|POST /api/vm-build/task-status| C
    H -->|POST /api/vm-build/task-logs| C
    C --> E
    C --> F
    D --> A

VM provisioning progresses through defined phases:

stateDiagram-v2
    [*] --> Queued
    Queued --> Imaging: Builder Picks Task
    Imaging --> Provisioning: Image Ready
    Provisioning --> Configuring: VM Deployed
    Configuring --> Complete: Configuration Done
    Imaging --> Failed: Image Error
    Provisioning --> Failed: Deployment Error
    Configuring --> Failed: Config Error
    Failed --> Queued: Rebuild Triggered
    Complete --> [*]

| Phase | Description | Duration |
|---|---|---|
| Queued | Build task waiting for available runner | Variable |
| Imaging | Base image being customized | 10-20 minutes |
| Provisioning | VM being deployed to infrastructure | 5-10 minutes |
| Configuring | Post-deployment configuration | 5-15 minutes |
| Complete | VM ready for use | - |
| Failed | Build encountered an error | - |
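
The phase diagram above can be read as a small state machine. The following is a minimal sketch that transcribes the diagram's transitions into a lookup table; the names `PHASE_TRANSITIONS`, `can_transition`, and `advance` are illustrative and not part of ARROW.

```python
# Allowed transitions, transcribed from the phase diagram above.
PHASE_TRANSITIONS = {
    "Queued": {"Imaging"},
    "Imaging": {"Provisioning", "Failed"},
    "Provisioning": {"Configuring", "Failed"},
    "Configuring": {"Complete", "Failed"},
    "Failed": {"Queued"},  # rebuild triggered
    "Complete": set(),     # terminal phase
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in PHASE_TRANSITIONS.get(current, set())


def advance(current: str, target: str) -> str:
    """Return the new phase, or raise if the transition is not in the diagram."""
    if not can_transition(current, target):
        raise ValueError(f"illegal phase transition: {current} -> {target}")
    return target
```

A client could use a check like this to reject out-of-order status updates, for example a jump from Queued straight to Complete.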

The console displays build status with visual indicators:

  • Blue (Queued): Task waiting in queue
  • Yellow (In Progress): Active build operation
  • Green (Complete): Successfully completed
  • Red (Failed): Build error occurred

The build monitoring system exposes the following endpoints:

| Endpoint | Method | Auth | Purpose |
|---|---|---|---|
| /api/vm-build/status | POST | Builder | Build status updates from builders |
| /api/vm-build/server-health | POST | Builder | Server health metrics reporting |
| /api/vm-build/trigger | POST | image.admin | Trigger manual build |
| /api/vm-build/tasks | GET | Builder | Poll for available tasks |
| /api/vm-build/task-status | POST | Builder | Report task progress |
| /api/vm-build/task-logs | POST | Builder | Submit batch of build logs |
| /api/vm-build/servers | GET | Admin | List build servers |
| /api/vm-build/jobs | GET | Admin | Get build job status |
| /api/vm-build/task-logs/{task_id} | GET | Auth | Retrieve logs for specific task |
| /api/vm-build/cancel-task/{task_id} | POST | image.admin | Cancel running task |
| /api/vm-build/task/{task_id} | DELETE | image.admin | Delete completed task |
| /api/vm-build/rebuild/{task_id} | POST | image.admin | Rebuild failed task |
| /api/vm-build/logs/stream | WebSocket | Auth | Real-time log streaming |
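
As an illustration of the task-scoped endpoints in the table, the sketch below builds (but does not send) a request object for each one. The host `arrow.example.com`, the helper name `build_task_request`, and the use of a bearer-style `Authorization` header are assumptions made for the example; check your deployment for the actual authentication scheme.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://arrow.example.com"  # hypothetical host

# (method, path template) pairs taken from the endpoint table above.
TASK_ACTIONS = {
    "logs": ("GET", "/api/vm-build/task-logs/{task_id}"),
    "cancel": ("POST", "/api/vm-build/cancel-task/{task_id}"),
    "delete": ("DELETE", "/api/vm-build/task/{task_id}"),
    "rebuild": ("POST", "/api/vm-build/rebuild/{task_id}"),
}


def build_task_request(action: str, task_id: str, token: str) -> Request:
    """Build an unsent request for one of the task-scoped endpoints."""
    method, template = TASK_ACTIONS[action]
    url = BASE_URL + template.format(task_id=quote(task_id, safe=""))
    # Assumption: the session token is passed as a bearer token.
    headers = {"Authorization": f"Bearer {token}"}
    return Request(url, method=method, headers=headers)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it can be inspected without a running backend.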

During active builds, logs stream in real time over a WebSocket connection:

  1. Navigate to your device request
  2. Click View Build Logs or the build status indicator
  3. Logs appear as they are generated
  4. Stream continues until build completes or fails

WebSocket Endpoint: /api/vm-build/logs/stream

Connection Authentication: Requires valid session token

The build monitoring system uses structured messages:

Build Log Message:

{
  "type": "build_log",
  "task_id": "task-123",
  "session_id": "session-456",
  "build_info": {
    "client": "client-name",
    "platform": "vm",
    "image": "image-id",
    "build_type": "pvm"
  },
  "log_entry": {
    "timestamp": "2024-01-15T10:31:00Z",
    "level": "INFO",
    "component": "ANSIBLE",
    "context": "Installing packages",
    "message": "Installed: curl, wget, git"
  }
}

Progress Message:

{
  "type": "progress",
  "task_id": "task-123",
  "progress": {
    "percentage": 45,
    "stage": "imaging",
    "stage_name": "Installing Software",
    "current_step": "Running Ansible playbooks",
    "estimated_remaining": "8m 30s"
  }
}

Status Change Message:

{
  "type": "status_change",
  "task_id": "task-123",
  "old_status": "building",
  "new_status": "completed",
  "message": "Build completed successfully",
  "metadata": {
    "build_time_seconds": 1245,
    "download_url": "https://..."
  }
}

| Level | Description |
|---|---|
| DEBUG | Detailed diagnostic information |
| INFO | Normal operational messages |
| WARN | Warning conditions |
| ERROR | Error conditions |
| FATAL | Critical failures |

| Component | Description |
|---|---|
| BUILD-ORCHESTRATOR | Overall build coordination |
| PACKER | Image building operations |
| ANSIBLE | Playbook execution |
| PROXMOX | VM deployment operations |
| NETBIRD | VPN registration |
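
A client receiving these messages needs to branch on the `type` field and, for log entries, may want to filter by level. The sketch below shows one way to do that using only the fields that appear in the sample messages; the function name `summarize`, the output format, and the severity ordering of the levels are assumptions for illustration.

```python
import json
from typing import Optional

# Assumed severity order, lowest to highest, following the level table.
LEVELS = ["DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def summarize(raw: str, min_level: str = "INFO") -> Optional[str]:
    """Turn one raw JSON message into a display line, or None if filtered."""
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind == "build_log":
        entry = msg["log_entry"]
        # Drop entries below the requested severity.
        if LEVELS.index(entry["level"]) < LEVELS.index(min_level):
            return None
        return (f'{entry["timestamp"]} [{entry["level"]}] '
                f'{entry["component"]}: {entry["message"]}')
    if kind == "progress":
        p = msg["progress"]
        return f'{p["percentage"]}% {p["stage_name"]} ({p["current_step"]})'
    if kind == "status_change":
        return f'{msg["old_status"]} -> {msg["new_status"]}: {msg["message"]}'
    return None  # unknown message types are ignored in this sketch
```

In a real viewer this function would be called once per WebSocket frame; the transport is left out here so the parsing logic stays testable on its own.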

Build logs include:

  • Timestamps: When each action occurred
  • Phase Markers: Current build phase indicators
  • Command Output: Output from build commands
  • Status Messages: Progress and status updates
  • Error Details: Full error messages and stack traces

The log viewer interface provides:

  • Auto-Scroll: Automatically follows new log entries
  • Search: Find specific text in logs
  • Pause/Resume: Stop auto-scroll to review
  • Download: Export logs for offline analysis

Visual progress tracking shows:

  • Overall Progress: Percentage complete bar
  • Current Phase: Active phase indicator
  • Elapsed Time: Time since build started
  • Estimated Remaining: Approximate time to completion
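
How ARROW computes the remaining-time estimate is not documented here. As a rough illustration, the sketch below extrapolates linearly from elapsed time and percentage complete and formats the result like the `estimated_remaining` value in the sample progress message. The function name and the linear assumption are both illustrative.

```python
from typing import Optional


def estimate_remaining(elapsed_seconds: float, percentage: float) -> Optional[str]:
    """Linear extrapolation: remaining = elapsed * (100 - pct) / pct."""
    if percentage <= 0 or percentage > 100:
        return None  # no basis for an estimate yet, or invalid input
    remaining = elapsed_seconds * (100 - percentage) / percentage
    minutes, seconds = divmod(round(remaining), 60)
    return f"{minutes}m {seconds}s"
```

For example, a build that is 50% complete after 450 seconds would be estimated at another 450 seconds. Real builds rarely progress linearly, so any estimate of this kind should be read as approximate.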

Each phase reports detailed progress:

Imaging Phase:

  • Downloading base image
  • Applying customizations
  • Installing software packages
  • Running Ansible playbooks
  • Uploading customized image

Provisioning Phase:

  • Creating VM on Proxmox
  • Configuring resources (CPU, memory, disk)
  • Attaching network interfaces
  • Starting VM

Configuring Phase:

  • Registering with NetBird VPN
  • Configuring access groups
  • Setting up policies
  • Verifying connectivity

Build tasks are managed in a queue system:

  • FIFO Processing: Tasks processed in order received
  • Priority Support: Urgent builds can be prioritized
  • Parallel Builds: Multiple runners process tasks concurrently

Tasks move through the following statuses:

stateDiagram-v2
    [*] --> queued
    queued --> started: Builder Claims Task
    started --> building: Build Begins
    building --> completed: Success
    building --> failed: Error
    building --> cancelling: Cancel Requested
    cancelling --> cancelled: Cleanup Done
    failed --> queued: Retry Triggered
    completed --> [*]
    cancelled --> [*]
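
The queue properties listed above (FIFO order with optional prioritization) can be sketched with a heap keyed on priority and arrival order. This is an illustration of the general technique, not ARROW's Task Queue Manager; the class name `BuildQueue` and the integer priority scale are assumptions.

```python
import heapq
import itertools
from typing import Optional


class BuildQueue:
    """Priority queue that falls back to FIFO order within a priority."""

    def __init__(self) -> None:
        self._heap = []
        self._arrival = itertools.count()  # increasing counter breaks ties FIFO

    def enqueue(self, task_id: str, priority: int = 0) -> None:
        # heapq pops the smallest key first, so negate the priority:
        # a higher priority value means a more urgent task.
        heapq.heappush(self._heap, (-priority, next(self._arrival), task_id))

    def claim(self) -> Optional[str]:
        """Return the next task ID for a builder, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

With two normal tasks queued and one urgent task added afterwards, `claim()` returns the urgent task first and then the normal tasks in arrival order.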

Each build task contains:

| Field | Description |
|---|---|
| Task ID | Unique identifier |
| Device Request | Associated device request |
| Organization | Owning organization |
| Status | Current task status |
| Created | When task was created |
| Started | When build began |
| Completed | When build finished |
| Progress | Percentage complete (0-100) |
| Assigned Server | Build server handling the task |
| Build Time | Duration in seconds |
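
The Build Time field relates to the Started and Completed fields. Assuming it is simply the difference between those two timestamps (the source does not state this explicitly), it could be derived as in the sketch below. The function names are illustrative, and timestamps are assumed to use the ISO 8601 form shown in the sample messages.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2024-01-15T10:31:00Z."""
    # fromisoformat() accepts a trailing "Z" only on newer Python versions,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_time_seconds(started: str, completed: str) -> int:
    """Whole seconds between the Started and Completed timestamps."""
    delta = parse_timestamp(completed) - parse_timestamp(started)
    return int(delta.total_seconds())
```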

Build servers report health status to enable intelligent task scheduling:

Health Update Endpoint: POST /api/vm-build/server-health

{
  "server_name": "builder-1",
  "hostname": "builder-1.example.com",
  "status": "online",
  "cpu_usage": 45.5,
  "memory_usage": 60.2,
  "memory_total": 8192,
  "disk_usage": 75.0,
  "disk_total": 500,
  "active_builds": 3,
  "max_concurrent_builds": 5,
  "version": "1.2.0",
  "last_heartbeat": "2024-01-15T10:31:00Z"
}

| Status | Description |
|---|---|
| online | Server available for builds |
| offline | Server not responding |
| busy | Server at capacity |
| maintenance | Server under maintenance |
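
The scheduling policy itself is not described here. As one plausible illustration of how the health fields could drive task assignment, the sketch below picks the online server with the lowest build load and uses CPU usage as a tie-breaker. The function name `pick_server` and the selection rule are assumptions, not ARROW's actual scheduler.

```python
from typing import List, Optional


def pick_server(servers: List[dict]) -> Optional[str]:
    """Return the name of the least-loaded online server, or None."""
    candidates = [
        s for s in servers
        if s["status"] == "online"
        and s["active_builds"] < s["max_concurrent_builds"]
    ]
    if not candidates:
        return None
    # Rank by fraction of build slots in use, then by CPU usage.
    best = min(
        candidates,
        key=lambda s: (s["active_builds"] / s["max_concurrent_builds"],
                       s["cpu_usage"]),
    )
    return best["server_name"]
```

Each dictionary is expected to carry the same keys as the health payload above; servers reporting `offline`, `busy`, or `maintenance` are skipped.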

Builders register with the system via WebSocket:

{
  "type": "register",
  "server_info": {
    "server_name": "builder-1",
    "hostname": "builder-1.example.com",
    "capabilities": ["vm", "hardware"],
    "active_tasks": ["task-id-1", "task-id-2"]
  },
  "timestamp": "2024-01-15T10:30:00Z"
}

Administrators can send commands to build servers:

| Command | Description |
|---|---|
| cancel | Cancel the running task |
| pause | Pause build execution |
| resume | Resume paused build |

Commands are stored in vm_build_commands and delivered to builders on their next poll.

Access task details through:

  1. Device Request: Click the build status on your request
  2. Build Monitor: Admin view of all active builds
  3. Task API: Programmatic access to task information

Builders submit logs in batches via POST /api/vm-build/task-logs:

{
  "task_id": "task-123",
  "entries": [
    {
      "timestamp": "2024-01-15T10:31:00Z",
      "level": "INFO",
      "component": "ANSIBLE",
      "message": "Installing packages: curl, wget, git"
    }
  ]
}

Logs are stored in the database and can be streamed to connected WebSocket clients.
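
On the builder side, batching means collecting entries locally and emitting one request body per batch. The sketch below accumulates entries and produces a JSON body in the shape shown above; the class name `LogBatcher` and the batch size of 50 are assumptions, and actually sending the body to /api/vm-build/task-logs is left to the caller.

```python
import json
from typing import Optional


class LogBatcher:
    """Collects log entries and emits them as task-logs request bodies."""

    def __init__(self, task_id: str, max_entries: int = 50) -> None:
        self.task_id = task_id
        self.max_entries = max_entries  # assumed batch size, not an ARROW limit
        self._entries = []

    def add(self, timestamp: str, level: str, component: str,
            message: str) -> Optional[bytes]:
        """Queue one entry; return a body to POST once the batch is full."""
        self._entries.append({
            "timestamp": timestamp,
            "level": level,
            "component": component,
            "message": message,
        })
        if len(self._entries) >= self.max_entries:
            return self.flush()
        return None

    def flush(self) -> Optional[bytes]:
        """Return the pending entries as a JSON body, or None if empty."""
        if not self._entries:
            return None
        payload = {"task_id": self.task_id, "entries": self._entries}
        self._entries = []
        return json.dumps(payload).encode("utf-8")
```

A builder would call `flush()` once more when the build ends so that a partially filled batch is not lost.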

| Error Type | Cause | Resolution |
|---|---|---|
| Image Download Failed | B2 connectivity issue | Retry build, check storage |
| Ansible Playbook Failed | Configuration error | Review logs, fix playbook |
| Proxmox Deployment Failed | Resource unavailable | Check cluster capacity |
| VPN Registration Failed | NetBird API error | Verify VPN configuration |
| Timeout | Build exceeded time limit | Optimize build, increase timeout |

When builds fail, error details include:

  • Error Type: Classification of the failure
  • Error Message: Detailed description
  • Phase: Which phase failed
  • Logs: Full build logs up to failure
  • Suggestions: Potential resolution steps

For failed builds:

  1. Review Logs: Check the complete build log
  2. Identify Phase: Note which phase failed
  3. Check Error: Read the specific error message
  4. Verify Configuration: Ensure request settings are correct
  5. Retry Build: Trigger a rebuild if appropriate
  6. Contact Support: Escalate persistent issues

When builds complete successfully:

  • Status Update: Request status changes to “Fulfilled”
  • Console Notification: Visual indicator in the console
  • Email Notification: Optional email alert (if configured)
  • Access Details: Connection information available

After successful completion:

  1. Verify VM: Check VM is accessible via VPN
  2. Test Connectivity: Confirm network access
  3. Review Configuration: Verify software installation
  4. Document Access: Note connection details

Successful builds produce:

  • Customized Image: Stored in B2 (for downloadable VMs)
  • Deployed VM: Running instance on Proxmox
  • Build Logs: Complete log history
  • Configuration Records: Settings applied to VM

Access build logs through the proxy endpoint:

  • Endpoint: /api/build-logs/proxy
  • Authentication: Requires valid session
  • Parameters: Task ID and log file reference

Build logs are retained according to policy:

  • Active Builds: Real-time streaming available
  • Recent Builds: Full logs available (30 days)
  • Archived Builds: Compressed logs (90 days)
  • Expired: Logs automatically removed
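
The retention tiers above can be expressed as a simple age check. The sketch below assumes that both the 30-day and 90-day windows are measured from the time the build finished, which the policy does not spell out; the function name and the returned labels are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_state(completed_at: Optional[datetime], now: datetime) -> str:
    """Classify a build's logs as active, recent, archived, or expired."""
    if completed_at is None:
        return "active"  # build still running: real-time streaming applies
    age = now - completed_at
    if age <= timedelta(days=30):
        return "recent"    # full logs available
    if age <= timedelta(days=90):
        return "archived"  # compressed logs
    return "expired"       # logs removed


# Example: a build that finished 45 days ago falls in the archived tier.
now = datetime(2024, 4, 1, tzinfo=timezone.utc)
state = retention_state(now - timedelta(days=45), now)
```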

Export logs for analysis:

  1. Open the build log viewer
  2. Click Download button
  3. Select format (text or JSON)
  4. Save to local system

While a build is running:

  • Monitor Progress: Watch for stalled phases
  • Check Logs: Review logs for warnings
  • Note Timing: Track how long phases take
  • Report Issues: Flag problems early

After a failed build:

  • Capture Logs: Download full logs before retry
  • Document Issues: Note error patterns
  • Check Resources: Verify infrastructure capacity
  • Coordinate Retry: Plan rebuild timing

For ongoing operations:

  • Monitor Queue: Watch task queue depth
  • Track Success Rates: Review build success metrics
  • Optimize Images: Reduce build times
  • Scale Runners: Add capacity as needed