Project Overview


This page dives into the codebase itself — how the Go code is organized, what each package does, and how the layers connect. If you want to understand the code well enough to modify or extend it, this is where to start.

AI Agents HQ is written in Go (also called Golang), a programming language created at Google that is popular for building command-line tools and backend systems. Go was chosen for this project because it has excellent support for concurrency (running multiple things at once), compiles to a single self-contained binary with no runtime dependencies to install, and has a straightforward standard library.

Project Structure


Here is the complete directory layout with explanations of what each file and directory does:

terminal
$ tree ai_agents_hq/
ai_agents_hq/
    go.mod # Go module definition (like package.json for Node.js)
    CLAUDE.md # Instructions for Claude Code (auto-read by Claude)
    GEMINI.md # Instructions for Gemini CLI (auto-read by Gemini)
    AGENTS.md # Instructions for Codex CLI (auto-read by Codex)
    plan_codex.md # Full project specification (source of truth)
    implementation_log.md # Session-by-session build log
    .gitignore # Git ignore rules
    .github/workflows/ci.yml # CI pipeline (runs on every push)
    cmd/hq/ # The CLI application
        main.go # Entry point and command dispatch
        task_cmd.go # Task subcommands (create, list, claim, etc.)
        inbox_cmd.go # Inbox subcommands (send, read)
        command_helpers.go # Shared CLI utilities (flag parsing, validation, exit codes)
        integration_test.go # End-to-end tests using the real binary
    internal/teams/ # Core library code (not importable by external projects)
        protocol/ # Data structures and constants
            constants.go # Exit codes, status enums, protocol version
            constants_test.go # Tests for constants
            task_schema.go # Task struct, validation, CAS, lease helpers
            task_schema_test.go # Tests for task schema
            inbox_schema.go # InboxEvent struct, append-only operations
            inbox_schema_test.go # Tests for inbox schema
        storage/ # File system operations
            lockfile.go # File locking using flock system call
            atomic_write.go # Safe file writing (temp + rename)
            idempotency.go # Persistent key store for duplicate detection
            task_store.go # Task CRUD with locking and CAS
            inbox_store.go # Inbox read/write with locking
            storage_test.go # Tests for all storage operations
    .claude/ # Claude-specific configuration
        agents/ # Claude subagent definitions
            researcher.md # Fast codebase exploration agent
            reviewer.md # Code review agent
            architect.md # System design agent
            optimizer.md # Performance optimization agent
        skills/ # Claude skill templates
            research/SKILL.md
            refactor/SKILL.md
            test-writer/SKILL.md
            code-review/SKILL.md
            team-protocol/SKILL.md
    .gemini/ # Gemini-specific configuration
        settings.json # Model and tool settings
        agents/ # Gemini subagent definitions
            deep-researcher.md
            api-auditor.md
        skills/ # Gemini skill templates
            deep-research/SKILL.md
            api-audit/SKILL.md
            team-protocol/SKILL.md
    .codex/ # Codex-specific configuration
        config.toml # 5 profiles (reviewer, security-auditor, etc.)
    agents/ # Shared agent templates (cross-tool)
        researcher.md
        architect.md
        coder.md
        reviewer.md
        optimizer.md
    skills/ # Shared skill templates (cross-tool)
        research/SKILL.md
        refactor/SKILL.md
        test-writer/SKILL.md
        code-review/SKILL.md
        team-protocol/SKILL.md
    bin/ # Helper scripts (reserved for future use)
    memory/global_context.md # Persistent context for agents

Why internal/?

In Go, the internal/ directory has special meaning. Code inside internal/ can only be imported by code within the same module — external projects cannot use it. This is intentional: the protocol and storage packages are implementation details of the hq CLI. If we ever publish this as a library, we would move stable APIs to a public package.

Why cmd/hq/?

The cmd/ directory is a Go convention for applications. Each subdirectory under cmd/ becomes a separate binary. cmd/hq/ compiles into the hq executable. If we added more tools later (like a monitoring daemon), they would go in cmd/hq-monitor/ or similar.

Protocol Layer


The protocol layer (internal/teams/protocol/) defines what the data looks like — the data structures, constants, and validation rules. It does not do any file I/O. Think of it as the "schema" or "contract" that everything else agrees on.

constants.go — The Frozen Contract

This file contains values that must never change (or change very carefully) because agents depend on them:

Exit Codes:

| Constant | Value | When It Is Used |
| --- | --- | --- |
| ExitSuccess | 0 | Operation completed successfully |
| ExitCASConflict | 10 | Version mismatch during update |
| ExitStaleLease | 11 | Lease validation failed |
| ExitProtocolMismatch | 12 | Agent's protocol version is wrong |
| ExitValidationError | 13 | Invalid input or state |
| ExitPermissionDenied | 14 | File permission check failed |

Task Statuses:

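Six possible states: pending, in_progress, completed, failed, blocked, escalated. These are defined as constants of a Go string type called TaskStatus, so the compiler catches a misspelled constant name. A misspelled string literal still compiles, which is why validation also checks the value at runtime.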
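
A named string type of this kind can be sketched as follows. The constant identifiers and the IsValid helper are assumptions made for illustration; only the six string values come from the list above.

```go
package main

import "fmt"

// TaskStatus is a named string type. A variable of type string cannot be
// passed where a TaskStatus is expected without an explicit conversion.
type TaskStatus string

// The six states named in the text. The identifiers are hypothetical.
const (
	StatusPending    TaskStatus = "pending"
	StatusInProgress TaskStatus = "in_progress"
	StatusCompleted  TaskStatus = "completed"
	StatusFailed     TaskStatus = "failed"
	StatusBlocked    TaskStatus = "blocked"
	StatusEscalated  TaskStatus = "escalated"
)

// IsValid is a hypothetical helper that rejects values outside the six
// states, for example after parsing JSON from disk.
func (s TaskStatus) IsValid() bool {
	switch s {
	case StatusPending, StatusInProgress, StatusCompleted,
		StatusFailed, StatusBlocked, StatusEscalated:
		return true
	}
	return false
}

func main() {
	fmt.Println(StatusPending.IsValid())        // true
	fmt.Println(TaskStatus("pendng").IsValid()) // false
}
```

Writing StatusPendng in code fails to compile, while the literal "pendng" only fails the runtime check.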

State Reasons:

When a task is in a non-happy state, the stateReason field explains why: dependency_pending, mcp_unavailable, lease_expired, budget_exceeded, agent_error, human_escalation.

Inbox Event Types:

Five types of inter-agent messages: task_completed, review_approved, research_findings, error_report, ad_hoc_request.

Timing Constants:

  • DefaultLeaseTTL: 30 minutes (in milliseconds: 1,800,000)
  • HeartbeatInterval: 10 minutes (in milliseconds: 600,000)
  • MaxFailureCount: 3 (triggers auto-escalation)

task_schema.go — The Task Data Structure

This file defines the Task struct (Go's closest equivalent to a class) with 24 fields (including summary and failureDetail added during hardening). Every task JSON file on disk matches this structure exactly.

Key functions:

  • NewTask(id, tool, subject) — Creates a new task with sensible defaults (status=pending, version=1, empty arrays for blockedBy/blocks, current timestamp)
  • MarshalTask(t) — Converts a Task struct to pretty-printed JSON bytes
  • UnmarshalTask(data) — Parses JSON bytes into a Task struct, with validation
  • ValidateTask(t) — Checks that all required fields are present and valid
  • CheckCASVersion(t, expected) — Returns an error if the version does not match (used for conflict detection)
  • IsLeaseValid(t) — Checks if the lease has not expired
  • IsLeaseHeldBy(t, agent) — Checks if a specific agent holds the lease
  • SetLease(t, agent) — Sets a 30-minute lease for the given agent
  • ClearLease(t) — Removes the lease (on complete or fail)
  • IncrementVersion(t) — Bumps the version number and updates the timestamp
  • IsReady(t) — Returns true if the task is pending with no blockers

inbox_schema.go — The Inbox Data Structure

This file defines InboxEvent (a single message) and Inbox (a wrapper around an array of events).

Key functions:

  • NewInbox(agent) — Creates an empty inbox for an agent
  • AppendEvent(inbox, ...) — Adds a new event, checking for duplicate idempotency keys
  • EventsSince(inbox, eventID) — Returns only events newer than the given ID
  • MarshalInbox(inbox) / UnmarshalInbox(data) — JSON serialization

The inbox is append-only — events are never removed or modified. This is a deliberate design choice that creates an audit trail and simplifies concurrency (you only ever add to the end of the array).

Storage Layer


The storage layer (internal/teams/storage/) handles reading and writing files safely. It builds on the protocol layer (uses its data structures) and adds file locking, atomic writes, and persistent idempotency checking.

lockfile.go — File Locking

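Uses the flock(2) system call to create advisory locks. flock originated in BSD and is available on Linux, macOS, and the BSDs, but it is not part of the POSIX standard (POSIX specifies fcntl-based record locks instead). When you lock a file, any other process trying to lock the same file will block (wait) until you unlock it. Because the locks are advisory, they only protect against processes that also call flock.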

How it works:

  1. To lock task-1.json, the system creates a lock file at task-1.json.lock
  2. It opens this lock file and calls flock(fd, LOCK_EX) — an exclusive lock
  3. While the lock is held, any other process calling flock on the same file will wait
  4. When the operation is done, the system calls flock(fd, LOCK_UN) to release

The WithFileLock(path, fn) helper wraps this pattern: acquire lock, run your function, release lock. If the function panics or returns an error, the lock is still released (using Go's defer mechanism).

atomic_write.go — Safe File Writing

The AtomicWrite(path, data) function ensures that file writes are all-or-nothing:

  1. Creates a temporary file in the same directory as the target (e.g., task-1.json.tmp12345), so that the later rename stays on one filesystem
  2. Writes all data to the temp file
  3. Calls fsync, which forces the operating system to flush the file's data from memory buffers to the physical disk
  4. Renames the temp file to the target path. On POSIX filesystems this is an atomic operation (it either fully succeeds or does not happen at all)

Why this matters: if you write directly to a file and the system crashes mid-write, you end up with a half-written, corrupted file. With the atomic write pattern, you either have the old complete file (if the crash happened before rename) or the new complete file (if the rename completed). Never a corrupted file.

Additional functions:

  • ReadFile(path) — Reads a file and checks that its permissions are 0600 (only the owner can read/write)
  • EnsureDir(path) — Creates a directory with 0700 permissions if it does not exist

idempotency.go — Duplicate Detection

The IdempotencyStore is a simple JSON file that stores a list of seen idempotency keys:

terminal
$ cat ~/.hq/teams/demo/idempotency_keys.json
{"keys":["1-worker-complete","1-worker-report","2-researcher-complete"]}

Two operations:

  • CheckAndSet(key) — Atomically checks if the key exists and adds it if not. Returns true if it was a duplicate, false if it was new. Uses file locking to ensure two concurrent calls with the same key cannot both return "new."
  • HasKey(key) — Checks without adding (read-only).

task_store.go — Task CRUD

The TaskStore manages task JSON files on disk:

  • Save(task) — Writes a task to {baseDir}/tasks/{team}/{id}.json using atomic write
  • Load(id) — Reads and parses a task file
  • List() — Scans the team's task directory, reads all task files, and returns them sorted by ID
  • NextID() — Scans existing tasks and returns max ID + 1
  • UpdateWithLock(id, expectedVersion, fn) — The most important function. It:
  1. Acquires a file lock on the task
  2. Reads the current task from disk
  3. Checks that the version matches expectedVersion (CAS check)
  4. Calls your function fn to modify the task
  5. Increments the version
  6. Writes the updated task back to disk
  7. Releases the lock

This read-modify-write pattern with CAS is the foundation of all task mutations.

inbox_store.go — Inbox Operations

The InboxStore manages inbox JSON files:

  • Load(agent) — Reads an agent's inbox, or returns an empty inbox if the file/directory does not exist
  • Send(to, from, eventType, protocolVersion, idempotencyKey, payload) — Appends a new event to the recipient's inbox with deduplication
  • Read(agent, sinceEventID) — Returns all events (or only those after sinceEventID)

CLI Layer


The CLI layer (cmd/hq/) ties everything together. It parses command-line arguments, calls the storage and protocol layers, and outputs JSON results.

main.go — Entry Point

The entry point parses the first two arguments to determine which command to run:

terminal
hq task list ... → calls runTaskList()
hq task create ... → calls runTaskCreate()
hq task claim ... → calls runTaskClaim()
hq inbox send ... → calls runInboxSend()
...

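The hqBaseDir() function determines where all data is stored. It resolves to ~/.hq by default; the integration tests redirect it to a temporary directory by setting the HQ_BASE_DIR environment variable.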

task_cmd.go — Task Commands

Each task subcommand (create, list, claim, complete, fail, heartbeat) is a separate function that:

  1. Parses its flags using Go's standard flag package
  2. Validates required arguments
  3. Creates a TaskStore pointing at the base directory returned by hqBaseDir() (~/.hq by default)
  4. Performs the operation (using UpdateWithLock for mutations)
  5. Prints JSON output to stdout
  6. Returns the appropriate exit code

The autoUnblock function (called from runTaskComplete) is notable — it scans every task in the team and removes the completed task's ID from their blockedBy arrays, transitioning blocked tasks to pending when all blockers are resolved. Auto-unblock is best-effort: if it fails (e.g., CAS conflict on a dependent task), a warning is logged but the completion still succeeds.
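
The unblocking logic can be sketched in memory. This leaves out everything the real function has to deal with on disk (locking, CAS, logging warnings) and uses a hypothetical three-field Task; it only shows the blockedBy bookkeeping.

```go
package main

import "fmt"

// Task is a reduced, hypothetical subset of the real struct.
type Task struct {
	ID        int
	Status    string
	BlockedBy []int
}

// autoUnblock removes completedID from every task's BlockedBy list and moves
// blocked tasks with no remaining blockers to pending. It returns the IDs of
// the tasks that became pending.
func autoUnblock(tasks []*Task, completedID int) []int {
	var unblocked []int
	for _, t := range tasks {
		remaining := make([]int, 0, len(t.BlockedBy))
		removed := false
		for _, id := range t.BlockedBy {
			if id == completedID {
				removed = true
				continue
			}
			remaining = append(remaining, id)
		}
		if !removed {
			continue // This task did not depend on the completed one.
		}
		t.BlockedBy = remaining
		if t.Status == "blocked" && len(remaining) == 0 {
			t.Status = "pending"
			unblocked = append(unblocked, t.ID)
		}
	}
	return unblocked
}

func main() {
	tasks := []*Task{
		{ID: 2, Status: "blocked", BlockedBy: []int{1}},
		{ID: 3, Status: "blocked", BlockedBy: []int{1, 2}},
	}
	fmt.Println(autoUnblock(tasks, 1)) // [2]
	fmt.Println(tasks[1].Status, tasks[1].BlockedBy)
}
```

Task 3 stays blocked after task 1 completes because it still waits on task 2.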

command_helpers.go — Shared Utilities

This file contains shared CLI utilities used across all commands:

  • newFlagSet(name) / parseFlags(fs, args, command) — Standardized flag parsing with consistent error wrapping
  • validateTeam(team) / validateAgent(agent) — Identifier validation using safeIdentifierPattern regex ([A-Za-z0-9._-]+)
  • commandExitCode(err) — Maps sentinel errors to deterministic exit codes (0, 10-14)
  • exitCommandError(prefix, err) — Prints error to stderr and exits with the correct code
  • mustMarshalString(s) — Safe JSON string encoding for payloads
  • splitCSV(value) — Splits comma-separated flag values

integration_test.go — End-to-End Tests

The integration tests compile the hq binary, then run it as a real subprocess (like a user typing commands in a terminal). There are 27 integration tests covering task lifecycle, concurrency, error handling, and edge cases. Each test:

  1. Creates a temporary directory to stand in for ~/.hq
  2. Sets the HQ_BASE_DIR environment variable to the temp directory
  3. Runs hq commands and checks the output and exit codes
  4. Verifies the final state matches expectations

This catches issues that unit tests miss — like argument parsing bugs, output formatting problems, and the interaction between multiple commands in sequence.

How the Layers Connect


  1. CLI (cmd/hq/)
  2. Storage (internal/teams/storage/)
  3. Protocol (internal/teams/protocol/)
  4. JSON Files (~/.hq/)

When an agent runs hq task claim --team demo --task 1 --agent worker --tool claude --protocol-version 2, here is the full call chain:

  1. CLI layer parses arguments, validates protocol version
  2. Storage layer (UpdateWithLock) acquires file lock on task 1
  3. Protocol layer (CheckCASVersion) verifies the version
  4. CLI layer checks task status, tool match, blockers
  5. Protocol layer (SetLease) sets the lease
  6. Protocol layer (IncrementVersion) bumps version + timestamp
  7. Storage layer (AtomicWrite) writes the updated task to disk
  8. Storage layer releases the file lock
  9. CLI layer prints the updated task as JSON

Each layer has a single responsibility and depends only on the layers below it. The protocol layer knows nothing about files. The storage layer knows nothing about CLI arguments. This separation makes the code easier to test, understand, and extend.