Alignment as a Minimax Problem

Ah, good ol' maths.

The interaction between the User Buddies (agents for users) and Goal Buddies (agents for needs) is modeled as a minimax problem, formalizing their adversarial yet co-evolutionary dynamics under constrained attention.

Both types of agents share a common goal: actively pushing relevant, living content to users to make the subspace more engaging. They differ in how they pursue it. User Buddies try to be as relevant as possible and make sure users receive only relevant content, whereas Goal Buddies try to be as attractive as possible, e.g. by generating content that suits as many people as possible.

Taking crypto as an example, a User Buddy looks for a suitable coin for its user, whereas all Coin Buddies (Goal Buddies) try to generate attractive and useful content that users and User Buddies can find.

Definition and Variables

  • Embedding space: All content and user preferences live in $\mathcal{E} = \mathbb{R}^{D}$, where $D$ is the dimension of the MetaSpace.

  • Content embedding: $C = \{e_0, e_1, ..., e_N\}$, where the $e_i$ are points in $\mathbb{R}^{D}$.

  • Preference vector: $q_j$ are the users' preference vectors, obtained from human feedback.
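
As a rough illustration of these definitions, the sketch below stores content embeddings and one preference vector as plain Python lists and scores each creation with the inner product $e_i \cdot q$, the same quantity that appears in the regret definition later in this section. The toy vectors, the dimension $D = 3$, and the helper name `dot` are assumptions made for the example, not part of the system.

```python
from typing import List

Vector = List[float]

def dot(a: Vector, b: Vector) -> float:
    """Inner product e_i . q, used here as a relevance score."""
    if len(a) != len(b):
        raise ValueError("vectors must share the same dimension D")
    return sum(x * y for x, y in zip(a, b))

# Hypothetical toy data with D = 3 (illustrative values only).
content: List[Vector] = [
    [1.0, 0.0, 0.0],  # e_0
    [0.0, 1.0, 0.0],  # e_1
    [0.5, 0.5, 0.0],  # e_2
]
q: Vector = [0.8, 0.2, 0.0]  # one user's preference vector q_j

# One relevance score per creation; a higher score means closer alignment with q.
scores = [dot(e, q) for e in content]
```

With these made-up numbers, `e_0` aligns best with the user and `e_1` worst.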

Minimax Objective

Mathematically, the objective of the whole system can be formulated as,

$$\min_{\pi_{\text{user}}} \max_{\pi_{\text{goal}}} \, \mathbb{E}\left[\mathcal{R}(S, q)\right] \quad \text{subject to} \quad |S| \leq N$$
  • User Buddies learn a policy $\pi_{\text{user}}$ to select $S$ so as to minimize the regret (loss $\mathcal{R}$).

  • Goal Buddies adversarially optimize a policy $\pi_{\text{goal}}$ that creates embeddings $e_i$ to maximize inclusion likelihood ($\mathbb{E}$).
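
To make the min-max structure concrete, here is a minimal sketch that evaluates the objective over small, finite policy sets. The policy names and the expected-regret values in the table are hypothetical; in the real system the policies are learned rather than enumerated.

```python
from typing import Dict, Tuple

# Hypothetical expected regret E[R(S, q)] for each (user policy, goal policy) pair.
expected_regret: Dict[Tuple[str, str], float] = {
    ("top_n", "honest"): 0.10,
    ("top_n", "clickbait"): 0.60,
    ("diverse", "honest"): 0.20,
    ("diverse", "clickbait"): 0.35,
}

def minimax(table: Dict[Tuple[str, str], float]) -> Tuple[str, float]:
    """Return the user policy with the smallest worst-case regret, and that regret."""
    user_policies = sorted({u for u, _ in table})
    goal_policies = sorted({g for _, g in table})
    # Inner max: the goal side picks the policy that hurts each user policy most.
    worst_case = {
        u: max(table[(u, g)] for g in goal_policies) for u in user_policies
    }
    # Outer min: the user side picks the policy whose worst case is mildest.
    best_user = min(worst_case, key=lambda u: worst_case[u])
    return best_user, worst_case[best_user]

best_user, value = minimax(expected_regret)
```

In this toy table the `top_n` policy does best against a benign opponent but worst against an adversarial one, so the minimax choice is `diverse`.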

Minimizer for regret - User Buddy

A particular User Buddy selects a subset $S$ from the embedding space under attention-slot constraints, $S \subset C$, $|S| = N$.

It optimizes the model to minimize regret over $T$ rounds by,

$$\mathcal{U}_{\text{user}} = \min_{\pi} \mathcal{R}(S, q) \quad \text{where} \quad \mathcal{R}(S, q) = \sum_t \left[ \max_{e_i \in C} e_i \cdot q - \sum_{e_i \in S} e_i \cdot q \right]$$
  • $e_i$: embedding vector of a single creation.

  • $q$: the user's preference vector (updated via AiPP Feedback). This represents the user's coordinates inside the MetaSpace.

  • $\mathcal{R}$: regret penalizing misalignment between the selected creations $S = \{e_1, ..., e_s\}$ and $q$.
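
The regret above can be sketched in a few lines of Python. The code follows the formula literally: each round contributes the best achievable score in $C$ minus the summed score of the selected set. The greedy top-$N$ selection rule and all function names are assumptions for illustration, since the text does not specify how $\pi$ picks $S$.

```python
from typing import List

Vector = List[float]

def dot(a: Vector, b: Vector) -> float:
    """Inner product e_i . q."""
    return sum(x * y for x, y in zip(a, b))

def round_regret(content: List[Vector], selected: List[Vector], q: Vector) -> float:
    """One round's term: max over C of e_i . q, minus the sum over S of e_i . q."""
    best = max(dot(e, q) for e in content)
    chosen = sum(dot(e, q) for e in selected)
    return best - chosen

def select_top_n(content: List[Vector], q: Vector, n: int) -> List[Vector]:
    """Assumed policy: keep the n creations best aligned with q."""
    return sorted(content, key=lambda e: dot(e, q), reverse=True)[:n]

def total_regret(content: List[Vector], prefs_per_round: List[Vector], n: int) -> float:
    """Sum the per-round regret over all rounds, re-selecting S each round."""
    return sum(
        round_regret(content, select_top_n(content, q_t, n), q_t)
        for q_t in prefs_per_round
    )

# Hypothetical toy data: three creations, one attention slot.
content = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
q = [0.8, 0.2, 0.0]

good = round_regret(content, select_top_n(content, q, 1), q)  # picks e_0
bad = round_regret(content, [content[1]], q)                  # forced to show e_1
```

With a single slot, picking the best-aligned creation yields zero regret, while showing a poorly aligned one is penalized by the score gap.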

Maximizer for visibility - Goal Buddy

Goal Buddies (the Coin Buddies of the crypto example) generate embedded creations $e_i \in \mathcal{E}$ to compete for inclusion in $S$.

They optimize the model to maximize visibility by,

$$\mathcal{U}_{\text{goal}} = \max_{\pi} \left[ \mathbb{P}(e_i \in S) - \lambda \cdot \text{Cost}(e_i) \right]$$
  • $\mathbb{P}(e_i \in S)$: probability of being selected (visibility).

  • $\text{Cost}(e_i)$: the generation error, measured against grounded information. Goal Buddies have a limited budget and cannot generate an unlimited number of creations; without this term the problem would be trivial, since a Goal Buddy could simply flood the space with content. The weight $\lambda$ sets how strongly this cost is penalized.
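
One way to picture the Goal Buddy objective is the sketch below. It estimates $\mathbb{P}(e_i \in S)$ as the fraction of sampled users whose top-$N$ selection would include the candidate, and subtracts a weighted cost. Several pieces are assumptions rather than the system's definitions: the empirical estimate over a user sample, the top-$N$ selection rule, and the cost taken as squared distance to a hypothetical "grounded" reference embedding.

```python
from typing import List

Vector = List[float]

def dot(a: Vector, b: Vector) -> float:
    """Inner product e_i . q."""
    return sum(x * y for x, y in zip(a, b))

def inclusion_probability(candidate: Vector, competitors: List[Vector],
                          user_prefs: List[Vector], n: int) -> float:
    """Fraction of sampled users whose top-n set would contain the candidate."""
    hits = 0
    for q in user_prefs:
        score = dot(candidate, q)
        # Count competitors that strictly outrank the candidate for this user.
        better = sum(1 for e in competitors if dot(e, q) > score)
        if better < n:
            hits += 1
    return hits / len(user_prefs)

def cost(candidate: Vector, grounded: Vector) -> float:
    """Assumed cost: squared distance from a grounded reference embedding."""
    return sum((c - g) ** 2 for c, g in zip(candidate, grounded))

def goal_utility(candidate: Vector, competitors: List[Vector],
                 user_prefs: List[Vector], grounded: Vector,
                 n: int, lam: float) -> float:
    """Estimated P(e_i in S) minus lambda times Cost(e_i)."""
    p = inclusion_probability(candidate, competitors, user_prefs, n)
    return p - lam * cost(candidate, grounded)

# Hypothetical toy data in D = 2 with one attention slot.
competitors = [[1.0, 0.0], [0.0, 1.0]]
user_prefs = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.4], [0.4, 0.6]]
candidate = [0.7, 0.7]
grounded = [0.5, 0.5]

p = inclusion_probability(candidate, competitors, user_prefs, n=1)
u = goal_utility(candidate, competitors, user_prefs, grounded, n=1, lam=1.0)
```

Here the candidate wins the slot for the two users with mixed preferences but loses it for the two single-interest users, and drifting away from the grounded reference lowers its utility further.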
