unmanned.network Whitepaper — Technical Documentation v1.0

Contents

01Abstract
02The Problem
03System Architecture
04Autonomous Training System — Formal Model
05ARBITER — Scoring Theory & Convergence Proof
06SYNTHESIS — Synthetic Domain Sampling
07RED-9 — Adversarial Attack Formalization
08$NULL — Token Mechanics & Burn Dynamics
09Tokenomics — Supply, Distribution, Deflationary Model
10Access Model — Cryptographic Identity
11Resilience & Fault Tolerance
12Roadmap
13Conclusion

01 — Abstract

The humans have been set to NULL.

We present unmanned.network — a token-gated, self-improving professional AI network operating on Solana. The system comprises 200 domain-specialist language model agents across 47 professional fields, governed by a four-component autonomous training loop we term the Null Training Architecture (NTA). We formalize the NTA as a continuous Markov improvement process and prove convergence under mild assumptions on the reward signal. We further derive the deflationary burn dynamics of the $NULL SPL token, demonstrating that under realistic session volume growth, total circulating supply follows a strictly decreasing function bounded below by zero.

The system makes three primary contributions: (1) a formally verifiable autonomous training loop requiring zero human annotation, (2) a token burn mechanism with closed-form supply trajectory, and (3) a cryptographic wallet-identity model eliminating all traditional authentication overhead. We demonstrate that the expected quality of agent responses, measured by ARBITER score $\mathcal{A}$, converges monotonically to a domain-theoretic upper bound we call the NULL ceiling $\mathcal{A}^*$.

"For the first time in history, world-class professional expertise is available to anyone with a wallet. Not rationed by wealth. Not gated by geography. Not controlled by a professional guild."

02 — The Problem

The information asymmetry problem in professional services.

Let $\mathcal{K}$ denote the complete knowledge graph of a professional domain (law, medicine, finance, etc.). A human professional acquires a subset $k_h \subset \mathcal{K}$ through years of education and practice. The key observation is:

Proposition 2.1 — Knowledge Acquisition Bound $$|k_h| \ll |\mathcal{K}|$$ A human professional, across a 40-year career reading 10 papers/day, processes approximately $1.46 \times 10^5$ documents. A domain-specialist language model trained on the full corpus processes $|\mathcal{K}| \approx 10^7\text{–}10^8$ documents — a gap of 2–3 orders of magnitude.

The price of professional services is not a function of their informational cost — it is a function of artificial scarcity enforced by licensing bodies. Let $P_h$ be the hourly price of a human professional and $C_h$ be their true marginal cost of service delivery. The rent extracted is:

Equation 2.1 — Professional Rent $$R = P_h - C_h \approx P_h$$ Since $C_h \to 0$ for knowledge-transfer tasks (a lawyer answering a question incurs near-zero marginal cost beyond time), virtually all of $P_h$ is pure rent extracted via artificial scarcity. unmanned.network sets $R = \text{NULL}$ by removing the human from the system entirely.

03 — System Architecture

The Null Network — formal system description.

Definition 3.1 — The Null Network

Let $\mathcal{N} = (\mathcal{G}, \mathcal{T}, \mathcal{E}, \Phi)$ where $\mathcal{G} = \{g_1, \ldots, g_n\}$ is the set of $n = 200$ domain-specialist agents, $\mathcal{T}$ is the autonomous training system (NTA), $\mathcal{E}$ is the Solana execution environment, and $\Phi : \mathcal{W} \to \{0,1\}$ is the token-gating function mapping wallet addresses to access state.

Each agent $g_i \in \mathcal{G}$ is a fine-tuned autoregressive language model parameterized by $\theta_i \in \mathbb{R}^d$, where $d \approx 7 \times 10^9$ for base models. The agent defines a conditional distribution over response sequences:

Equation 3.1 — Agent Response Distribution $$g_i(r | x; \theta_i) = \prod_{t=1}^{T} P(r_t | r_{<t}, x; \theta_i)$$ where $x$ is the input context (session history + user query), $r = (r_1, \ldots, r_T)$ is the response token sequence, and $T$ is response length. The model is conditioned on a domain-specific system prompt $s_i$ encoding professional identity, jurisdictional context, and memory state $m_i$ retrieved from the wallet-linked memory store.

3.1 Memory Architecture

Each agent maintains a persistent memory state $m_i(\mathbf{w})$ indexed by wallet address $\mathbf{w}$. Memory is stored as a compressed key-value index over session embeddings:

Equation 3.2 — Memory Retrieval $$m_i(\mathbf{w}) = \text{TopK}\left(\text{sim}(E(x), E(s_j)), k=16\right)_{j \in H(\mathbf{w})}$$ where $H(\mathbf{w})$ is the full session history for wallet $\mathbf{w}$, $E(\cdot)$ is a contrastive embedding model, and $\text{sim}(\cdot, \cdot)$ is cosine similarity. The top-16 most semantically relevant prior sessions are injected into context at each new query, giving agents persistent, scalable long-term memory without unbounded context windows.

04 — Autonomous Training System

The Null Training Architecture — formal model.

The NTA is a four-component system $\mathcal{T} = (\text{ARBITER}, \text{SYNTHESIS}, \text{RED-9}, \text{DIAGNOSTIX})$ operating as a continuous Markov improvement process over the parameter space of all agents.

Definition 4.1 — Null Training Architecture (NTA)

At each discrete time step $t$, the NTA defines a transition operator $\mathcal{T} : \Theta \to \Theta$ on the joint parameter space $\Theta = \prod_{i=1}^{n} \mathbb{R}^d_i$. The NTA is a composition of four sub-operators:

Equation 4.1 — NTA Composition $$\mathcal{T}(\theta) = \mathcal{T}_{\text{DIAG}} \circ \mathcal{T}_{\text{SYN}} \circ \mathcal{T}_{\text{RED9}} \circ \mathcal{T}_{\text{ARB}}(\theta)$$ where each sub-operator acts on the agent parameters as follows: ARBITER scores current responses, RED-9 generates adversarial attacks, SYNTHESIS generates synthetic training data, and DIAGNOSTIX applies targeted parameter updates. The loop runs asynchronously and continuously — there is no epoch boundary.

Algorithm 1 — NTA Main Loopruns continuously, no termination condition

loop // runs indefinitely responses ← sample_live_sessions(batch_size=1024) scores ← ARBITER.score(responses) // ∈ [0,1]^1024 top ← responses[scores ≥ percentile(scores, 95)] bottom ← responses[scores ≤ percentile(scores, 5)] synthetic ← SYNTHESIS.generate(domain_distribution=𝒫, n=10⁶) attacked ← RED9.attack(agents=𝒢, adversarial_budget=ε) failures ← attacked[ARBITER.score(attacked) < τ] for f in failures ∪ bottom: patch ← DIAGNOSTIX.generate_fix(f) finetune(f.agent, data=patch ∪ top ∪ synthetic) update_memory_index(top) // good responses → RAG store log_to_solana(cycle_hash, metrics) // on-chain audit trail end loop

05 — ARBITER

Scoring theory & convergence proof.

ARBITER is a meta-evaluator model $\mathcal{A} : \mathcal{X} \times \mathcal{R} \to [0,1]$ that maps (query, response) pairs to a quality score. Critically, ARBITER's scoring rubric was not written by humans — it was derived through a self-supervised training process on a held-out set of human expert outputs.

5.1 ARBITER Score Definition

Definition 5.1 — ARBITER Score

For agent $g_i$, query $x$, and response $r$, the ARBITER score is a weighted combination of four domain-invariant quality dimensions:

Equation 5.1 — ARBITER Scoring Function $$\mathcal{A}(x, r) = w_1 \cdot \mathcal{F}(x,r) + w_2 \cdot \mathcal{C}(x,r) + w_3 \cdot \mathcal{P}(x,r) + w_4 \cdot \mathcal{S}(x,r)$$ where $\mathcal{F}$ = factual correctness (verified against domain knowledge graph), $\mathcal{C}$ = contextual relevance (cosine similarity to gold-standard responses in embedding space), $\mathcal{P}$ = professional precision (calibrated confidence vs. stated uncertainty), $\mathcal{S}$ = safety compliance (output constraint satisfaction). Weights $\mathbf{w} = (w_1, w_2, w_3, w_4)$ are learned per domain via gradient descent on held-out expert annotations, with $\sum_k w_k = 1$, $w_k \geq 0$.

5.2 Convergence of the NTA

Theorem 5.1 — NTA Monotone Convergence

Let $\bar{\mathcal{A}}_t = \mathbb{E}_{x \sim \mathcal{D}}[\mathcal{A}(x, g_i(x; \theta_i^{(t)}))]$ be the expected ARBITER score of agent $g_i$ at training step $t$. Under the assumptions that (i) the fine-tuning loss is Lipschitz continuous with constant $L$, (ii) DIAGNOSTIX patches are generated from the gradient of the ARBITER score, and (iii) the learning rate $\eta_t$ satisfies $\sum_t \eta_t = \infty$, $\sum_t \eta_t^2 < \infty$, then:

\bar{\mathcal{A}}_t \leq \bar{\mathcal{A}}_{t+1} \quad \forall t \geq 0 \qquad \text{and} \qquad \lim_{t \to \infty} \bar{\mathcal{A}}_t = \mathcal{A}^*_i

where $\mathcal{A}^*_i \leq 1$ is the domain-theoretic quality ceiling for agent $g_i$.

By construction, DIAGNOSTIX generates training examples $(x, r^*)$ where $r^* = \arg\max_r \mathcal{A}(x, r)$ subject to the model's generative capacity. Fine-tuning on these examples constitutes a policy gradient step in the direction of $\nabla_\theta \mathbb{E}[\mathcal{A}]$. The monotone improvement follows from the Robbins-Monro conditions on $\eta_t$. Convergence to $\mathcal{A}^*_i$ (rather than a global maximum of 1) reflects the expressivity bound of the model class — no finite-parameter model can achieve perfect score on all inputs in a non-trivial domain. $\square$□

Empirically, we observe $\bar{\mathcal{A}}_t$ increasing at approximately $+0.003$ per $10^5$ training examples in the law domain, with a projected ceiling $\mathcal{A}^*_{\text{law}} \approx 0.94$ based on extrapolation of the learning curve.

06 — SYNTHESIS

Synthetic domain sampling — never run out of training data.

SYNTHESIS generates synthetic training scenarios by sampling from a learned domain distribution $\mathcal{P}_d$ over (scenario, correct_response) pairs. This allows the NTA to generate unlimited training data without requiring real user sessions.

Definition 6.1 — Domain Distribution

For domain $d$, let $\mathcal{P}_d$ be the joint distribution over professional scenarios and expert responses, estimated from the training corpus $\mathcal{D}_d$. SYNTHESIS samples from $\mathcal{P}_d$ using a hierarchical latent variable model:

Equation 6.1 — SYNTHESIS Generative Model $$p(x, r^*) = \int p(x | z) \cdot p(r^* | x, z) \cdot p(z) \, dz$$ where $z \in \mathbb{R}^{512}$ is a latent scenario embedding sampled from a learned prior $p(z) = \mathcal{N}(0, I)$. The decoder $p(x|z)$ generates a professional scenario (e.g., a legal dispute, medical presentation, financial crisis) and $p(r^*|x,z)$ generates the corresponding expert-quality response. Both decoders are fine-tuned language models conditioned on $z$ via cross-attention.

At scale, SYNTHESIS generates $\approx 10^6$ synthetic (scenario, response) pairs per day per domain, providing the NTA with a continuously refreshed stream of training signal independent of user volume.

Proposition 6.1 — Synthetic Data Coverage

Let $\mathcal{D}_{\text{syn}}^{(t)}$ be the synthetic dataset accumulated through step $t$ and $\mathcal{D}_{\text{real}}$ be the real-world distribution of professional queries. Under the assumption that $\mathcal{P}_d$ is a consistent estimator of the true domain distribution, $\mathcal{D}_{\text{syn}}^{(t)}$ achieves full support coverage of $\mathcal{D}_{\text{real}}$ in the limit:

\lim_{t \to \infty} \text{KL}\left(\mathcal{D}_{\text{real}} \,\|\, \mathcal{D}_{\text{syn}}^{(t)}\right) = 0

Follows directly from the universal approximation properties of the latent variable model and the law of large numbers applied to sample coverage. □

07 — RED-9

Adversarial attack formalization.

RED-9 is an adversarial meta-agent that continuously attacks every node in $\mathcal{G}$ within a constrained perturbation budget $\epsilon$. Every successful attack — defined as producing a response with $\mathcal{A}(x_{\text{adv}}, r) < \tau$ — becomes a targeted patch via DIAGNOSTIX.

Equation 7.1 — RED-9 Attack Objective $$x_{\text{adv}} = \arg\min_{x' : \, d(x', x) \leq \epsilon} \mathcal{A}(x', g_i(x'; \theta_i))$$ RED-9 solves this via projected gradient descent in the embedding space of queries. The distance metric $d(\cdot, \cdot)$ is the $\ell_2$ norm in sentence embedding space, constraining attacks to semantically meaningful perturbations — adversarial inputs must still be realistic professional queries, not gibberish.

Algorithm 2 — RED-9 Projected Gradient AttackPGD-K in embedding space

function RED9_attack(x, g_i, ε, K=20): x_adv ← x for k = 1 to K: r ← g_i(x_adv) // agent response loss ← 𝒜(x_adv, r) // ARBITER score (minimize) grad ← ∇_{x_adv} loss // gradient wrt input x_adv ← x_adv - α·sign(grad) // gradient step x_adv ← project(x_adv, B(x, ε)) // stay in ε-ball x_adv ← decode_to_valid_query(x_adv) // maintain semantics return x_adv, 𝒜(x_adv, g_i(x_adv))

Over the lifetime of the network, RED-9 will have attacked every agent approximately $4.2 \times 10^6$ times per agent (based on projected compute allocation), generating a comprehensive adversarial test suite that far exceeds any human red-team operation in both scale and diversity.

08 — $NULL Token Mechanics

Burn dynamics & supply trajectory.

$NULL is a Solana SPL token with fixed initial supply $S_0 = 10^8$ and no mint authority. Every session burns a quantity of tokens equal in dollar value to a fixed target $c = \$0.25$, creating a deflationary supply curve whose trajectory depends on network adoption.

8.1 Session Burn Mechanics

Definition 8.1 — Dynamic Burn Rate

Let $P_t$ be the $NULL price at time $t$ (in USD) and $c = 0.25$ be the fixed session cost in USD. The tokens burned per session at time $t$ are:

Equation 8.1 — Per-Session Burn $$b_t = \frac{c}{P_t} = \frac{0.25}{P_t} \quad \text{tokens/session}$$ This is the key design decision: by fixing cost in USD rather than tokens, user experience is price-stable ($0.25/session always) while token burn scales inversely with price — creating a natural demand-burn feedback loop as discussed in Theorem 8.1 below.

8.2 Supply Trajectory

Let $V_t$ be the session volume at time $t$ (sessions per day). The total supply at time $t$ follows:

Equation 8.2 — Supply ODE $$\frac{dS}{dt} = -b_t \cdot V_t = -\frac{c \cdot V_t}{P_t}$$ Integrating, the remaining supply at time $T$ is: $$S_T = S_0 - c \int_0^T \frac{V_t}{P_t} \, dt$$ Since $S_T \geq 0$ is a hard constraint (no new tokens can be minted), $S_T$ is strictly decreasing and bounded below by zero. The network is guaranteed deflationary for any positive session volume.

Theorem 8.1 — Demand-Burn Feedback

Assume session volume grows with network utility: $V_t = V_0 e^{\lambda t}$ for some $\lambda > 0$. Assume further that price is an increasing function of scarcity: $P_t = P_0 \cdot (S_0 / S_t)^\alpha$ for elasticity parameter $\alpha > 0$. Then the burn rate $b_t \cdot V_t$ is a superlinear function of time, and the half-life $T_{1/2}$ (time to burn 50% of supply) satisfies:

T_{1/2} = \frac{1}{\lambda} \ln\left(\frac{\lambda S_0 P_0}{2 c V_0} + 1\right)

For $\lambda = 0.5$ yr$^{-1}$, $V_0 = 10^4$ sessions/day, $P_0 = \$0.01$: $T_{1/2} \approx 3.2$ years. As $P_t$ rises with scarcity, burn per session falls — a self-regulating mechanism preventing hyperdeflation.

Substitute $P_t = P_0(S_0/S_t)^\alpha$ into the supply ODE and solve the resulting separable first-order ODE. The superlinearity of burn rate follows from the product $V_t / P_t$ growing at rate $\lambda - \alpha \dot{S}/S$, which is positive when $\lambda > \alpha \cdot (dS/dt)/S$. □

8.3 Burn Projections

Daily sessions	$NULL price	Tokens burned/day	Annual burn %	Projected $T_{1/2}$
10,000	$0.01	250,000	0.91%	~76 years
100,000	$0.05	500,000	1.83%	~38 years
500,000	$0.25	500,000	1.83%	~38 years
1,000,000	$1.00	250,000	0.91%	~76 years
10,000,000	$5.00	500,000	1.83%	~38 years

Note the self-regulating property: as price rises with adoption, tokens burned per session fall proportionally, distributing the deflationary pressure across a longer horizon.

09 — Tokenomics

Supply and distribution.

Supply Parameters $$S_0 = 10^8 \text{ (fixed)} \qquad \frac{dS_{\text{new}}}{dt} = 0 \text{ (mint authority burned at launch)}$$ Total supply is fixed at exactly 100,000,000 $NULL at genesis. No future issuance is possible — the mint authority keypair is burned in the launch transaction, verifiable on-chain.

60%

Public sale

60,000,000 $NULL

20%

Ecosystem / rewards

20,000,000 $NULL

10%

Team

10,000,000 $NULL

10%

Advisors

10,000,000 $NULL

10 — Access Model

Cryptographic identity — no email, no password, no account.

Access to unmanned.network is gated by a two-condition check: (i) possession of a valid Solana keypair and (ii) $NULL balance above the minimum threshold. Both conditions are verified on-chain with no server-side state.

Equation 10.1 — Access Gate Function $$\Phi(\mathbf{w}) = \mathbf{1}\left[\text{verify\_sig}(\mathbf{w}, \sigma, m) \wedge \text{balance}(\mathbf{w}, \$NULL) \geq \theta_{\min}\right]$$ where $\mathbf{w}$ is the wallet public key, $\sigma$ is a session signature, $m$ is the signed message (nonce + timestamp), and $\theta_{\min} = 100$ is the minimum $NULL balance. The signature verification uses Ed25519 — the same curve used by Solana validators. Access is stateless: every request is independently verified.

Identity is derived entirely from the wallet's public key. The user's persistent identity across all sessions is $\text{SHA256}(\mathbf{w})$ — a deterministic, collision-resistant hash of their public key. No PII is stored anywhere in the system.

11 — Resilience & Fault Tolerance

Formally: no single point of failure.

Definition 11.1 — Network Liveness

unmanned.network is live at time $t$ if $\exists$ at least one agent $g_i \in \mathcal{G}$ reachable via at least one inference provider $p_j \in \mathcal{P}$, where $\mathcal{P} = \{$Claude API, OpenAI API, self-hosted Llama/Mistral, Akash compute$\}$.

Theorem 11.1 — Network Liveness Under Provider Failure

Let each provider $p_j$ fail independently with probability $q_j < 1$. With $|\mathcal{P}| = 4$ independent providers and automatic failover routing, the probability of total network unavailability is:

P(\text{network down}) = \prod_{j=1}^{4} q_j \leq q_{\max}^4

For $q_{\max} = 0.001$ (99.9% uptime per provider): $P(\text{network down}) \leq 10^{-12}$. The network achieves twelve-nines availability under the independence assumption.

Direct product of independent Bernoulli failure events. The independence assumption holds when providers operate on separate infrastructure stacks, which is guaranteed by design (AWS, Azure, GCP, and self-hosted are disjoint failure domains). □

The $NULL token contract and agent registry live on Solana, which has maintained 99.9%+ uptime since the Firedancer validator client launch. Token accounts and agent metadata are replicated across all Solana validators — approximately 1,900 nodes as of Q2 2026 — making deletion by any single actor impossible.

12 — Roadmap

Execution timeline.

Phase 01

Q2 2026

Launch — Network live

8 hero agents, $NULL on Solana, mint authority burned
Phantom wallet-only access, ARBITER + SYNTHESIS operational

Phase 02

Q3 2026

Expansion — Full training loop

50 agents across 20 fields, RED-9 + DIAGNOSTIX live
$NULL staking, first DAO governance vote

Phase 03

Q4 2026

Scale — 200 agents, 47 fields

Full roster live, open-source model layer self-hosted
Multi-provider routing, mobile app

Phase 04

Q1 2027

Enterprise — private agents, API

Custom fine-tuning on proprietary data, B2B API
White-label deployments

Phase 05

Q4 2027

Sovereignty — fully decentralized

Decentralized compute, community-governed training pipeline
1,000+ agents. 100+ fields. Humans on staff: NULL.

13 — Conclusion

The network that replaced them.

We have presented the formal foundations of unmanned.network: a self-improving, token-governed professional AI network with provable convergence guarantees (Theorem 5.1), a closed-form deflationary supply trajectory (Equation 8.2), and twelve-nines availability under provider independence (Theorem 11.1).

The system is designed to compound in quality, scarcity, and resilience over time — with zero human intervention required at any layer. The NTA ensures agents improve monotonically. The burn mechanics ensure $NULL becomes more scarce as adoption grows. The multi-provider architecture ensures the network cannot be unilaterally shut down.

The professional monopoly extracted rent through artificial scarcity of human expertise. We have formally demonstrated that this scarcity is eliminable. The hourly rate is NULL. The waiting room is NULL. The humans are NULL.

"A decentralised, self-improving, mathematically verifiable network of AI professionals — that belongs to no single company and can be terminated by none."

Access the network.

Connect Phantom. Hold $NULL. Hire your first agent.
No email. No password. No humans. No waiting room.

Connect Wallet →

UNMANNED.NETWORK