AI Agents HQ

Project Overview

This page dives into the codebase itself — how the Go code is organized, what each package does, and how the layers connect. If you want to understand the code well enough to modify or extend it, this is where to start.

AI Agents HQ is written in Go (also called Golang), a programming language created at Google that is popular for building command-line tools and backend systems. Go was chosen for this project because it has excellent support for concurrency (running multiple things at once), compiles to a single self-contained binary with no runtime dependencies, and has a straightforward standard library.

Project Structure

Here is the complete directory layout with explanations of what each file and directory does:

terminal

$ tree ai_agents_hq/
ai_agents_hq/
  go.mod  # Go module definition (like package.json for Node.js)
  CLAUDE.md  # Instructions for Claude Code (auto-read by Claude)
  GEMINI.md  # Instructions for Gemini CLI (auto-read by Gemini)
  AGENTS.md  # Instructions for Codex CLI (auto-read by Codex)
  plan_codex.md  # Full project specification (source of truth)
  implementation_log.md  # Session-by-session build log
  .gitignore  # Git ignore rules
  .github/workflows/ci.yml  # CI pipeline (runs on every push)
  cmd/hq/  # The CLI application
    main.go  # Entry point and command dispatch
    task_cmd.go  # Task subcommands (create, list, claim, etc.)
    inbox_cmd.go  # Inbox subcommands (send, read)
    command_helpers.go  # Shared CLI utilities (flag parsing, validation, exit codes)
    integration_test.go  # End-to-end tests using the real binary
  internal/teams/  # Core library code (not importable by external projects)
    protocol/  # Data structures and constants
      constants.go  # Exit codes, status enums, protocol version
      constants_test.go  # Tests for constants
      task_schema.go  # Task struct, validation, CAS, lease helpers
      task_schema_test.go  # Tests for task schema
      inbox_schema.go  # InboxEvent struct, append-only operations
      inbox_schema_test.go  # Tests for inbox schema
    storage/  # File system operations
      lockfile.go  # File locking using flock system call
      atomic_write.go  # Safe file writing (temp + rename)
      idempotency.go  # Persistent key store for duplicate detection
      task_store.go  # Task CRUD with locking and CAS
      inbox_store.go  # Inbox read/write with locking
      storage_test.go  # Tests for all storage operations
  .claude/  # Claude-specific configuration
    agents/  # Claude subagent definitions
      researcher.md  # Fast codebase exploration agent
      reviewer.md  # Code review agent
      architect.md  # System design agent
      optimizer.md  # Performance optimization agent
    skills/  # Claude skill templates
      research/SKILL.md
      refactor/SKILL.md
      test-writer/SKILL.md
      code-review/SKILL.md
      team-protocol/SKILL.md
  .gemini/  # Gemini-specific configuration
    settings.json  # Model and tool settings
    agents/  # Gemini subagent definitions
      deep-researcher.md
      api-auditor.md
    skills/  # Gemini skill templates
      deep-research/SKILL.md
      api-audit/SKILL.md
      team-protocol/SKILL.md
  .codex/  # Codex-specific configuration
    config.toml  # 5 profiles (reviewer, security-auditor, etc.)
  agents/  # Shared agent templates (cross-tool)
    researcher.md
    architect.md
    coder.md
    reviewer.md
    optimizer.md
  skills/  # Shared skill templates (cross-tool)
    research/SKILL.md
    refactor/SKILL.md
    test-writer/SKILL.md
    code-review/SKILL.md
    team-protocol/SKILL.md
  bin/  # Helper scripts (reserved for future use)
  memory/global_context.md  # Persistent context for agents

Why `internal/`?

In Go, the internal/ directory has special meaning. Code inside internal/ can only be imported by code within the same module — external projects cannot use it. This is intentional: the protocol and storage packages are implementation details of the hq CLI. If we ever publish this as a library, we would move stable APIs to a public package.

Why `cmd/hq/`?

The cmd/ directory is a Go convention for applications. Each subdirectory under cmd/ becomes a separate binary. cmd/hq/ compiles into the hq executable. If we added more tools later (like a monitoring daemon), they would go in cmd/hq-monitor/ or similar.

Protocol Layer

The protocol layer (internal/teams/protocol/) defines what the data looks like — the data structures, constants, and validation rules. It does not do any file I/O. Think of it as the "schema" or "contract" that everything else agrees on.

constants.go — The Frozen Contract

This file contains values that must never change (or change very carefully) because agents depend on them:

Exit Codes:

Constant              When It Is Used
ExitSuccess           Operation completed successfully
ExitCASConflict       Version mismatch during update
ExitStaleLease        Lease validation failed
ExitProtocolMismatch  Agent's protocol version is wrong
ExitValidationError   Invalid input or state
ExitPermissionDenied  File permission check failed

The numeric values are 0 for success and 10-14 for the five error codes.

Task Statuses:

Six possible states: pending, in_progress, completed, failed, blocked, escalated. These are defined as constants of a named Go string type called TaskStatus, so a misspelled constant name fails to compile instead of silently becoming a bad string.
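A minimal sketch of what such a typed status enum might look like. The constant names (StatusPending and so on) and the IsValid method are assumptions made for illustration; only the type name TaskStatus and the six string values come from the text above.

```go
package main

import "fmt"

// TaskStatus is a named string type. The constant names below are
// hypothetical; the string values are the six statuses listed above.
type TaskStatus string

const (
	StatusPending    TaskStatus = "pending"
	StatusInProgress TaskStatus = "in_progress"
	StatusCompleted  TaskStatus = "completed"
	StatusFailed     TaskStatus = "failed"
	StatusBlocked    TaskStatus = "blocked"
	StatusEscalated  TaskStatus = "escalated"
)

// IsValid reports whether s is one of the six known statuses. A runtime
// check like this is still needed for values parsed from JSON.
func (s TaskStatus) IsValid() bool {
	switch s {
	case StatusPending, StatusInProgress, StatusCompleted,
		StatusFailed, StatusBlocked, StatusEscalated:
		return true
	}
	return false
}

func main() {
	fmt.Println(StatusPending.IsValid())        // true
	fmt.Println(TaskStatus("pendng").IsValid()) // false
}
```

Referring to StatusPendng in code would be a compile error, while a misspelled string read from a file is only caught by a check such as IsValid.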

State Reasons:

When a task is not progressing normally, the stateReason field explains why: dependency_pending, mcp_unavailable, lease_expired, budget_exceeded, agent_error, human_escalation.

Inbox Event Types:

Five types of inter-agent messages: task_completed, review_approved, research_findings, error_report, ad_hoc_request.

Timing Constants:

DefaultLeaseTTL: 30 minutes (in milliseconds: 1,800,000)
HeartbeatInterval: 10 minutes (in milliseconds: 600,000)
MaxFailureCount: 3 (triggers auto-escalation)
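As a rough illustration, the timing values could be declared as below. The millisecond values are taken from the list above; the helper functions msToDuration and shouldEscalate are hypothetical, and treating "reaching 3 failures" as the escalation threshold is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Values from the list above, stored in milliseconds.
const (
	DefaultLeaseTTL   int64 = 1_800_000 // 30 minutes
	HeartbeatInterval int64 = 600_000   // 10 minutes
	MaxFailureCount   int   = 3
)

// msToDuration converts a millisecond count to a time.Duration (hypothetical helper).
func msToDuration(ms int64) time.Duration {
	return time.Duration(ms) * time.Millisecond
}

// shouldEscalate assumes escalation happens once the failure count
// reaches MaxFailureCount (hypothetical helper).
func shouldEscalate(failureCount int) bool {
	return failureCount >= MaxFailureCount
}

func main() {
	fmt.Println(msToDuration(DefaultLeaseTTL))         // 30m0s
	fmt.Println(msToDuration(HeartbeatInterval))       // 10m0s
	fmt.Println(DefaultLeaseTTL / HeartbeatInterval)   // 3
	fmt.Println(shouldEscalate(2), shouldEscalate(3))  // false true
}
```

With these values an agent gets three heartbeat intervals per lease, so a single missed heartbeat does not by itself let the lease expire.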

task_schema.go — The Task Data Structure

This file defines the Task struct (Go's version of a class/object) with 24 fields (including summary and failureDetail added during hardening). Every task JSON file on disk matches this structure exactly.

Key functions:

NewTask(id, tool, subject) — Creates a new task with sensible defaults (status=pending, version=1, empty arrays for blockedBy/blocks, current timestamp)
MarshalTask(t) — Converts a Task struct to pretty-printed JSON bytes
UnmarshalTask(data) — Parses JSON bytes into a Task struct, with validation
ValidateTask(t) — Checks that all required fields are present and valid
CheckCASVersion(t, expected) — Returns an error if the version does not match (used for conflict detection)
IsLeaseValid(t) — Checks if the lease has not expired
IsLeaseHeldBy(t, agent) — Checks if a specific agent holds the lease
SetLease(t, agent) — Sets a 30-minute lease for the given agent
ClearLease(t) — Removes the lease (on complete or fail)
IncrementVersion(t) — Bumps the version number and updates the timestamp
IsReady(t) — Returns true if the task is pending with no blockers
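The CAS and lease helpers listed above can be sketched as follows. This is an illustration under assumptions, not the project's code: the Task struct shows only a hypothetical subset of fields with guessed names, and the helpers take an explicit nowMs argument (milliseconds) so the example is deterministic, whereas the real functions take no clock parameter.

```go
package main

import (
	"errors"
	"fmt"
)

const leaseTTLMs int64 = 1_800_000 // 30 minutes

// Task is a hypothetical subset of the real struct; field names are guesses.
type Task struct {
	ID             int    `json:"id"`
	Status         string `json:"status"`
	Version        int    `json:"version"`
	BlockedBy      []int  `json:"blockedBy"`
	LeaseOwner     string `json:"leaseOwner"`
	LeaseExpiresAt int64  `json:"leaseExpiresAt"` // milliseconds
	UpdatedAt      int64  `json:"updatedAt"`      // milliseconds
}

var ErrCASConflict = errors.New("CAS conflict")

// CheckCASVersion fails when the caller's expected version is stale.
func CheckCASVersion(t *Task, expected int) error {
	if t.Version != expected {
		return fmt.Errorf("%w: have version %d, expected %d", ErrCASConflict, t.Version, expected)
	}
	return nil
}

// SetLease gives agent a lease that expires leaseTTLMs after nowMs.
func SetLease(t *Task, agent string, nowMs int64) {
	t.LeaseOwner = agent
	t.LeaseExpiresAt = nowMs + leaseTTLMs
}

// IsLeaseValid reports whether some agent holds an unexpired lease.
func IsLeaseValid(t *Task, nowMs int64) bool {
	return t.LeaseOwner != "" && nowMs < t.LeaseExpiresAt
}

// IsLeaseHeldBy reports whether agent holds an unexpired lease.
func IsLeaseHeldBy(t *Task, agent string, nowMs int64) bool {
	return IsLeaseValid(t, nowMs) && t.LeaseOwner == agent
}

// IncrementVersion bumps the version and refreshes the timestamp.
func IncrementVersion(t *Task, nowMs int64) {
	t.Version++
	t.UpdatedAt = nowMs
}

// IsReady is true for a pending task with no blockers.
func IsReady(t *Task) bool {
	return t.Status == "pending" && len(t.BlockedBy) == 0
}

func main() {
	task := &Task{ID: 1, Status: "pending", Version: 1}
	var now int64 = 1_000
	fmt.Println(IsReady(task)) // true
	SetLease(task, "worker", now)
	IncrementVersion(task, now)
	fmt.Println(CheckCASVersion(task, 1))                      // CAS conflict: have version 2, expected 1
	fmt.Println(IsLeaseHeldBy(task, "worker", now+leaseTTLMs)) // false
}
```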

inbox_schema.go — The Inbox Data Structure

This file defines InboxEvent (a single message) and Inbox (a wrapper around an array of events).

Key functions:

NewInbox(agent) — Creates an empty inbox for an agent
AppendEvent(inbox, ...) — Adds a new event, checking for duplicate idempotency keys
EventsSince(inbox, eventID) — Returns only events newer than the given ID
MarshalInbox(inbox) / UnmarshalInbox(data) — JSON serialization

The inbox is append-only — events are never removed or modified. This is a deliberate design choice that creates an audit trail and simplifies concurrency (you only ever add to the end of the array).
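A small sketch of an append-only inbox with idempotency-key deduplication. The field names, the sequential integer event IDs, and the AppendEvent parameter list are assumptions for illustration; the source only names the functions.

```go
package main

import (
	"errors"
	"fmt"
)

// InboxEvent and Inbox use hypothetical field names.
type InboxEvent struct {
	EventID        int    `json:"eventId"`
	From           string `json:"from"`
	Type           string `json:"type"`
	IdempotencyKey string `json:"idempotencyKey"`
	Payload        string `json:"payload"`
}

type Inbox struct {
	Agent  string       `json:"agent"`
	Events []InboxEvent `json:"events"`
}

var ErrDuplicateEvent = errors.New("duplicate idempotency key")

// NewInbox creates an empty inbox for an agent.
func NewInbox(agent string) *Inbox {
	return &Inbox{Agent: agent, Events: []InboxEvent{}}
}

// AppendEvent adds an event to the end of the inbox unless its
// idempotency key was already seen. Existing events are never changed.
func AppendEvent(in *Inbox, from, eventType, key, payload string) (InboxEvent, error) {
	for _, e := range in.Events {
		if e.IdempotencyKey == key {
			return e, ErrDuplicateEvent
		}
	}
	ev := InboxEvent{
		EventID:        len(in.Events) + 1, // assumed: sequential IDs
		From:           from,
		Type:           eventType,
		IdempotencyKey: key,
		Payload:        payload,
	}
	in.Events = append(in.Events, ev)
	return ev, nil
}

// EventsSince returns the events with an ID greater than eventID.
func EventsSince(in *Inbox, eventID int) []InboxEvent {
	out := []InboxEvent{}
	for _, e := range in.Events {
		if e.EventID > eventID {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	inbox := NewInbox("reviewer")
	AppendEvent(inbox, "worker", "task_completed", "1-worker-complete", `{"task":1}`)
	_, err := AppendEvent(inbox, "worker", "task_completed", "1-worker-complete", `{"task":1}`)
	fmt.Println(err) // duplicate idempotency key
	AppendEvent(inbox, "researcher", "research_findings", "2-researcher-report", `{}`)
	fmt.Println(len(EventsSince(inbox, 1))) // 1
}
```

Because events are only ever appended, a reader that remembers the last event ID it processed can resume with EventsSince without missing or re-reading anything.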

Storage Layer

The storage layer (internal/teams/storage/) handles reading and writing files safely. It builds on the protocol layer (uses its data structures) and adds file locking, atomic writes, and persistent idempotency checking.

lockfile.go — File Locking

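Uses the flock(2) system call to create advisory locks. flock is a BSD-derived call available on Linux, macOS, and the BSDs; it is not part of POSIX, which standardizes fcntl-based record locks instead. While one process holds an exclusive lock on a file, any other process that tries to lock the same file blocks (waits) until the lock is released. Because the locks are advisory, they only coordinate processes that also call flock.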

How it works:

To lock task-1.json, the system creates a lock file at task-1.json.lock
It opens this lock file and calls flock(fd, LOCK_EX) — an exclusive lock
While the lock is held, any other process calling flock on the same file will wait
When the operation is done, the system calls flock(fd, LOCK_UN) to release

The WithFileLock(path, fn) helper wraps this pattern: acquire lock, run your function, release lock. If the function panics or returns an error, the lock is still released (using Go's defer mechanism).
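The pattern described above might look roughly like this on Linux or macOS, using syscall.Flock from the standard library. This is a sketch, not the project's implementation; tryLock and demo are hypothetical helpers added only to show that the lock is held while fn runs and released afterwards.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// WithFileLock takes an exclusive flock on path+".lock", runs fn, and
// releases the lock even if fn returns an error or panics.
func WithFileLock(path string, fn func() error) error {
	lock, err := os.OpenFile(path+".lock", os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return err
	}
	defer lock.Close()
	if err := syscall.Flock(int(lock.Fd()), syscall.LOCK_EX); err != nil {
		return err
	}
	// Deferred calls run last-in first-out: unlock first, then close.
	defer syscall.Flock(int(lock.Fd()), syscall.LOCK_UN)
	return fn()
}

// tryLock reports whether a non-blocking exclusive lock can be taken
// right now. Closing the descriptor drops the lock again.
func tryLock(lockPath string) bool {
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return false
	}
	defer f.Close()
	return syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB) == nil
}

// demo probes the lock from a second descriptor while fn runs and
// again after WithFileLock has returned.
func demo() (during, after bool) {
	dir, err := os.MkdirTemp("", "hq-lock-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "task-1.json")
	_ = WithFileLock(target, func() error {
		during = tryLock(target + ".lock")
		return nil
	})
	after = tryLock(target + ".lock")
	return during, after
}

func main() {
	during, after := demo()
	fmt.Println(during, after) // false true
}
```

The probe fails while fn is running because flock locks belong to the open file description, so a second open of the same lock file conflicts even inside one process.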

atomic_write.go — Safe File Writing

The AtomicWrite(path, data) function ensures that file writes are all-or-nothing:

Creates a temporary file in the same directory as the target (e.g., task-1.json.tmp12345)
Writes all data to the temp file
Calls fsync — this forces the operating system to flush data from memory buffers to the physical disk
Renames the temp file to the target path. The rename is atomic (it either fully succeeds or does not happen at all) as long as both paths are on the same filesystem, which is why the temp file is created next to the target

Why this matters: if you write directly to a file and the system crashes mid-write, you end up with a half-written, corrupted file. With the atomic write pattern, you either have the old complete file (if the crash happened before rename) or the new complete file (if the rename completed). Never a corrupted file.
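The four steps can be sketched with the standard library as follows. The function name and signature match the description above, but the body is an assumption about how such a helper is typically written, and demo is a hypothetical helper for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// AtomicWrite writes data to a temp file next to path, syncs it to
// disk, and renames it over path.
func AtomicWrite(path string, data []byte) (err error) {
	// Same directory as the target, so the rename stays on one filesystem.
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp*")
	if err != nil {
		return err
	}
	defer func() {
		if err != nil { // on any failure, leave no temp file behind
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		return err
	}
	if err = tmp.Sync(); err != nil { // fsync
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo overwrites a file twice and reports its final content and the
// number of directory entries left behind.
func demo() (string, int, error) {
	dir, err := os.MkdirTemp("", "hq-atomic-")
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "task-1.json")
	if err := AtomicWrite(target, []byte(`{"version":1}`)); err != nil {
		return "", 0, err
	}
	if err := AtomicWrite(target, []byte(`{"version":2}`)); err != nil {
		return "", 0, err
	}
	got, err := os.ReadFile(target)
	if err != nil {
		return "", 0, err
	}
	entries, err := os.ReadDir(dir)
	return string(got), len(entries), err
}

func main() {
	content, files, err := demo()
	fmt.Println(content, files, err) // {"version":2} 1 <nil>
}
```

os.CreateTemp creates the file with mode 0600, which also matches the permission check described for ReadFile.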

Additional functions:

ReadFile(path) — Reads a file and checks that its permissions are 0600 (only the owner can read/write)
EnsureDir(path) — Creates a directory with 0700 permissions if it does not exist
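A sketch of how these two helpers could be written. The sentinel ErrPermissionDenied, the exact error text, and the demo function are hypothetical; the 0600 and 0700 modes come from the descriptions above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// ErrPermissionDenied is a hypothetical sentinel error.
var ErrPermissionDenied = errors.New("permission denied")

// ReadFile refuses to read a file whose mode is not exactly 0600.
func ReadFile(path string) ([]byte, error) {
	info, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	if perm := info.Mode().Perm(); perm != 0o600 {
		return nil, fmt.Errorf("%w: %s has mode %04o, want 0600", ErrPermissionDenied, path, uint32(perm))
	}
	return os.ReadFile(path)
}

// EnsureDir creates path (and any missing parents) with mode 0700.
// An existing directory is left as it is.
func EnsureDir(path string) error {
	return os.MkdirAll(path, 0o700)
}

// demo reads a 0600 file, loosens it to 0644, and reads it again.
func demo() (strictErr, looseErr error) {
	dir, err := os.MkdirTemp("", "hq-perm-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	sub := filepath.Join(dir, "teams", "demo")
	if err := EnsureDir(sub); err != nil {
		panic(err)
	}
	path := filepath.Join(sub, "task-1.json")
	if err := os.WriteFile(path, []byte("{}"), 0o600); err != nil {
		panic(err)
	}
	if err := os.Chmod(path, 0o600); err != nil { // make the mode exact regardless of umask
		panic(err)
	}
	_, strictErr = ReadFile(path)
	if err := os.Chmod(path, 0o644); err != nil {
		panic(err)
	}
	_, looseErr = ReadFile(path)
	return strictErr, looseErr
}

func main() {
	strictErr, looseErr := demo()
	fmt.Println(strictErr)        // <nil>
	fmt.Println(looseErr != nil) // true
}
```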

idempotency.go — Duplicate Detection

The IdempotencyStore is a simple JSON file that stores a list of seen idempotency keys:

terminal

$ cat ~/.hq/teams/demo/idempotency_keys.json

{"keys":["1-worker-complete","1-worker-report","2-researcher-complete"]}

Two operations:

CheckAndSet(key) — Atomically checks if the key exists and adds it if not. Returns true if it was a duplicate, false if it was new. Uses file locking to ensure two concurrent calls with the same key cannot both return "new."
HasKey(key) — Checks without adding (read-only).
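A rough sketch of a file-backed key store with a locked check-and-set, assuming the JSON shape shown above. The struct layout is hypothetical, the lock uses syscall.Flock (Linux/macOS), and for brevity the write uses os.WriteFile rather than the atomic write pattern.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// IdempotencyStore is a hypothetical layout: one JSON file of keys.
type IdempotencyStore struct{ Path string }

type keyFile struct {
	Keys []string `json:"keys"`
}

// load reads the key file; a missing file counts as an empty store.
func (s *IdempotencyStore) load() (keyFile, error) {
	var kf keyFile
	data, err := os.ReadFile(s.Path)
	if errors.Is(err, os.ErrNotExist) {
		return kf, nil
	}
	if err != nil {
		return kf, err
	}
	return kf, json.Unmarshal(data, &kf)
}

// CheckAndSet returns true if key was already stored; otherwise it
// stores the key and returns false. The whole check runs under a lock.
func (s *IdempotencyStore) CheckAndSet(key string) (bool, error) {
	lock, err := os.OpenFile(s.Path+".lock", os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return false, err
	}
	defer lock.Close() // closing the descriptor also drops the flock
	if err := syscall.Flock(int(lock.Fd()), syscall.LOCK_EX); err != nil {
		return false, err
	}
	kf, err := s.load()
	if err != nil {
		return false, err
	}
	for _, k := range kf.Keys {
		if k == key {
			return true, nil
		}
	}
	kf.Keys = append(kf.Keys, key)
	data, err := json.Marshal(kf)
	if err != nil {
		return false, err
	}
	return false, os.WriteFile(s.Path, data, 0o600)
}

// demo submits three keys, one of them twice.
func demo() ([]bool, string) {
	dir, err := os.MkdirTemp("", "hq-idem-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	store := &IdempotencyStore{Path: filepath.Join(dir, "idempotency_keys.json")}
	var results []bool
	for _, key := range []string{"1-worker-complete", "1-worker-complete", "2-researcher-complete"} {
		dup, err := store.CheckAndSet(key)
		if err != nil {
			panic(err)
		}
		results = append(results, dup)
	}
	data, _ := os.ReadFile(store.Path)
	return results, string(data)
}

func main() {
	results, content := demo()
	fmt.Println(results) // [false true false]
	fmt.Println(content) // {"keys":["1-worker-complete","2-researcher-complete"]}
}
```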

task_store.go — Task CRUD

The TaskStore manages task JSON files on disk:

Save(task) — Writes a task to {baseDir}/tasks/{team}/{id}.json using atomic write
Load(id) — Reads and parses a task file
List() — Scans the team's task directory, reads all task files, and returns them sorted by ID
NextID() — Scans existing tasks and returns max ID + 1
UpdateWithLock(id, expectedVersion, fn) — The most important function. It:

Acquires a file lock on the task
Reads the current task from disk
Checks that the version matches expectedVersion (CAS check)
Calls your function fn to modify the task
Increments the version
Writes the updated task back to disk
Releases the lock

This read-modify-write pattern with CAS is the foundation of all task mutations.
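The seven steps above, condensed into one function as an illustration. This sketch takes a file path instead of a task ID, uses a three-field Task, and writes via a fixed ".tmp" name followed by rename; all of these are simplifying assumptions, and the locking relies on syscall.Flock (Linux/macOS).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// Task is a hypothetical three-field stand-in for the real struct.
type Task struct {
	ID      int    `json:"id"`
	Status  string `json:"status"`
	Version int    `json:"version"`
}

var ErrCASConflict = errors.New("CAS conflict")

func UpdateWithLock(path string, expectedVersion int, fn func(*Task) error) (*Task, error) {
	// 1. Acquire an exclusive lock on the task's lock file.
	lock, err := os.OpenFile(path+".lock", os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return nil, err
	}
	defer lock.Close() // 7. Closing the descriptor releases the lock.
	if err := syscall.Flock(int(lock.Fd()), syscall.LOCK_EX); err != nil {
		return nil, err
	}
	// 2. Read the current task from disk.
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var task Task
	if err := json.Unmarshal(data, &task); err != nil {
		return nil, err
	}
	// 3. CAS check against the caller's expected version.
	if task.Version != expectedVersion {
		return nil, fmt.Errorf("%w: on disk %d, expected %d", ErrCASConflict, task.Version, expectedVersion)
	}
	// 4. Let the caller modify the task.
	if err := fn(&task); err != nil {
		return nil, err
	}
	// 5. Increment the version.
	task.Version++
	// 6. Write back via temp file + rename.
	out, err := json.MarshalIndent(&task, "", "  ")
	if err != nil {
		return nil, err
	}
	if err := os.WriteFile(path+".tmp", out, 0o600); err != nil {
		return nil, err
	}
	if err := os.Rename(path+".tmp", path); err != nil {
		return nil, err
	}
	return &task, nil
}

// demo applies two updates that both expect version 1.
func demo() (finalVersion int, conflict bool) {
	dir, err := os.MkdirTemp("", "hq-cas-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "1.json")
	if err := os.WriteFile(path, []byte(`{"id":1,"status":"pending","version":1}`), 0o600); err != nil {
		panic(err)
	}
	claim := func(task *Task) error { task.Status = "in_progress"; return nil }
	updated, err := UpdateWithLock(path, 1, claim)
	if err != nil {
		panic(err)
	}
	_, err = UpdateWithLock(path, 1, claim) // stale: disk is now at version 2
	return updated.Version, errors.Is(err, ErrCASConflict)
}

func main() {
	fmt.Println(demo()) // 2 true
}
```

The lock prevents two writers from interleaving, while the version check rejects a writer whose decision was based on an outdated read.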

inbox_store.go — Inbox Operations

The InboxStore manages inbox JSON files:

Load(agent) — Reads an agent's inbox, or returns an empty inbox if the file/directory does not exist
Send(to, from, eventType, protocolVersion, idempotencyKey, payload) — Appends a new event to the recipient's inbox with deduplication
Read(agent, sinceEventID) — Returns all events (or only those after sinceEventID)

CLI Layer

The CLI layer (cmd/hq/) ties everything together. It parses command-line arguments, calls the storage and protocol layers, and outputs JSON results.

main.go — Entry Point

The entry point parses the first two arguments to determine which command to run:

terminal

hq task list ... → calls runTaskList()

hq task create ... → calls runTaskCreate()

hq task claim ... → calls runTaskClaim()

hq inbox send ... → calls runInboxSend()

...

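The hqBaseDir() function determines where all data is stored. It resolves to ~/.hq by default; the integration tests redirect it to a temporary directory through the HQ_BASE_DIR environment variable.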
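A simplified sketch of two-argument dispatch and base-directory resolution. The dispatch function here only returns the handler's name so that the example stays small; the real entry point calls the handlers. The fallback to a relative ".hq" when the home directory is unknown is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// hqBaseDir returns HQ_BASE_DIR if set, otherwise ~/.hq.
func hqBaseDir() string {
	if dir := os.Getenv("HQ_BASE_DIR"); dir != "" {
		return dir
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return ".hq" // assumed fallback, not from the source
	}
	return filepath.Join(home, ".hq")
}

// dispatch maps the first two arguments to a handler name.
// Remaining arguments would be passed to the handler as flags.
func dispatch(args []string) (string, error) {
	if len(args) < 2 {
		return "", errors.New("usage: hq <group> <command> [flags]")
	}
	switch args[0] + " " + args[1] {
	case "task list":
		return "runTaskList", nil
	case "task create":
		return "runTaskCreate", nil
	case "task claim":
		return "runTaskClaim", nil
	case "inbox send":
		return "runInboxSend", nil
	}
	return "", fmt.Errorf("unknown command %q", args[0]+" "+args[1])
}

func main() {
	handler, err := dispatch(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(handler, "with base dir", hqBaseDir())
}
```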

task_cmd.go — Task Commands

Each task subcommand (create, list, claim, complete, fail, heartbeat) is a separate function that:

Parses its flags using Go's standard flag package
Validates required arguments
Creates a TaskStore pointing at the base directory (~/.hq by default)
Performs the operation (using UpdateWithLock for mutations)
Prints JSON output to stdout
Returns the appropriate exit code

The autoUnblock function (called from runTaskComplete) is notable — it scans every task in the team and removes the completed task's ID from their blockedBy arrays, transitioning blocked tasks to pending when all blockers are resolved. Auto-unblock is best-effort: if it fails (e.g., CAS conflict on a dependent task), a warning is logged but the completion still succeeds.
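The core of the unblock logic can be illustrated in memory. This sketch works on a slice of tasks and integer IDs, both assumptions; the behaviour described above (going through the store for each task and logging a warning on failure) is left out.

```go
package main

import "fmt"

// Task is a hypothetical subset of the real struct.
type Task struct {
	ID        int
	Status    string
	BlockedBy []int
}

// autoUnblock removes completedID from every task's BlockedBy list and
// returns the IDs of tasks that moved from blocked to pending.
func autoUnblock(tasks []*Task, completedID int) []int {
	var unblocked []int
	for _, task := range tasks {
		remaining := make([]int, 0, len(task.BlockedBy))
		for _, id := range task.BlockedBy {
			if id != completedID {
				remaining = append(remaining, id)
			}
		}
		if len(remaining) == len(task.BlockedBy) {
			continue // this task did not depend on completedID
		}
		task.BlockedBy = remaining
		if task.Status == "blocked" && len(remaining) == 0 {
			task.Status = "pending"
			unblocked = append(unblocked, task.ID)
		}
	}
	return unblocked
}

func main() {
	tasks := []*Task{
		{ID: 2, Status: "blocked", BlockedBy: []int{1}},
		{ID: 3, Status: "blocked", BlockedBy: []int{1, 2}},
	}
	fmt.Println(autoUnblock(tasks, 1)) // [2]
	fmt.Println(tasks[1].BlockedBy)    // [2]
}
```

Task 3 stays blocked after task 1 completes because task 2 is still in its blockedBy list; it would become pending only when task 2 completes as well.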

command_helpers.go — Shared Utilities

This file contains shared CLI utilities used across all commands:

newFlagSet(name) / parseFlags(fs, args, command) — Standardized flag parsing with consistent error wrapping
validateTeam(team) / validateAgent(agent) — Identifier validation using safeIdentifierPattern regex ([A-Za-z0-9._-]+)
commandExitCode(err) — Maps sentinel errors to deterministic exit codes (0, 10-14)
exitCommandError(prefix, err) — Prints error to stderr and exits with the correct code
mustMarshalString(s) — Safe JSON string encoding for payloads
splitCSV(value) — Splits comma-separated flag values
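A sketch of three of these helpers. The function names and the character class come from the list above; the sentinel error names, the anchoring of the pattern with ^ and $, the assignment of specific codes within 10-14, and the fallback code 1 are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// The character class is from the source; anchoring it is an assumption.
var safeIdentifierPattern = regexp.MustCompile(`^[A-Za-z0-9._-]+$`)

// Hypothetical sentinel errors.
var (
	ErrCASConflict      = errors.New("CAS conflict")
	ErrStaleLease       = errors.New("stale lease")
	ErrProtocolMismatch = errors.New("protocol mismatch")
	ErrValidation       = errors.New("validation error")
	ErrPermissionDenied = errors.New("permission denied")
)

// validateTeam rejects names containing characters outside the safe set.
func validateTeam(team string) error {
	if !safeIdentifierPattern.MatchString(team) {
		return fmt.Errorf("%w: invalid team %q", ErrValidation, team)
	}
	return nil
}

// splitCSV splits on commas, trims spaces, and drops empty entries.
func splitCSV(value string) []string {
	var out []string
	for _, part := range strings.Split(value, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// commandExitCode maps (possibly wrapped) sentinel errors to exit codes.
// Which error gets which number in 10-14 is assumed here.
func commandExitCode(err error) int {
	switch {
	case err == nil:
		return 0
	case errors.Is(err, ErrCASConflict):
		return 10
	case errors.Is(err, ErrStaleLease):
		return 11
	case errors.Is(err, ErrProtocolMismatch):
		return 12
	case errors.Is(err, ErrValidation):
		return 13
	case errors.Is(err, ErrPermissionDenied):
		return 14
	}
	return 1 // assumed fallback for unexpected errors
}

func main() {
	fmt.Println(validateTeam("demo"))                                           // <nil>
	fmt.Println(commandExitCode(validateTeam("../etc")))                        // 13
	fmt.Println(splitCSV("1, 2,,3"))                                            // [1 2 3]
	fmt.Println(commandExitCode(fmt.Errorf("claim task 1: %w", ErrCASConflict))) // 10
}
```

Using errors.Is rather than == lets commands wrap a sentinel with context (such as the task ID) without changing the exit code that agents see.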

integration_test.go — End-to-End Tests

The integration tests compile the hq binary, then run it as a real subprocess (like a user typing commands in a terminal). There are 27 integration tests covering task lifecycle, concurrency, error handling, and edge cases. Each test:

Creates a temporary directory to stand in for ~/.hq
Sets the HQ_BASE_DIR environment variable to the temp directory
Runs hq commands and checks the output and exit codes
Verifies the final state matches expectations

This catches issues that unit tests miss — like argument parsing bugs, output formatting problems, and the interaction between multiple commands in sequence.

How the Layers Connect

01 CLI (cmd/hq/)

02 Storage (internal/teams/storage/)

03 Protocol (internal/teams/protocol/)

04 JSON Files (~/.hq/)

When an agent runs hq task claim --team demo --task 1 --agent worker --tool claude --protocol-version 2, here is the full call chain:

CLI layer parses arguments, validates protocol version
Storage layer (UpdateWithLock) acquires file lock on task 1
Protocol layer (CheckCASVersion) verifies the version
CLI layer checks task status, tool match, blockers
Protocol layer (SetLease) sets the lease
Protocol layer (IncrementVersion) bumps version + timestamp
Storage layer (AtomicWrite) writes the updated task to disk
Storage layer releases the file lock
CLI layer prints the updated task as JSON

Each layer has a single responsibility and depends only on the layers below it, never on those above. The protocol layer knows nothing about files. The storage layer knows nothing about CLI arguments. This separation makes the code easier to test, understand, and extend.
