calibrate_content

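A collaborative calibration task for aligning on how content standards are interpreted. It is used during setup, before any ads are served, so that the seller can understand and internalize the buyer's policy. Unlike high-volume runtime evaluation, calibration is dialogue-based: the parties exchange examples and explanations, iterating until they reach agreement.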

When to use

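  • Seller onboarding: when a seller first receives content standards from a buyer
  • Policy clarification: when you want to understand why specific content passes or fails
  • Model training: when building a local model that follows the standards
  • Drift detection: when recalibrating periodically to stay aligned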

Request

Schema: calibrate-content-request.json
| Parameter | Type | Required | Description |
|---|---|---|---|
| standards_id | string | Yes | Standards configuration to calibrate against |
| artifact | artifact | Yes | Artifact to evaluate |

Artifact

Schema: artifact.json

An artifact represents content context where ad placements occur - identified by property_id + artifact_id and represented as a collection of assets:
{
  "property_id": {"type": "domain", "value": "reddit.com"},
  "artifact_id": "r_fitness_abc123",
  "assets": [
    {"type": "text", "role": "title", "content": "Best protein sources for muscle building", "language": "en"},
    {"type": "text", "role": "paragraph", "content": "Looking for recommendations on high-quality protein sources...", "language": "en"},
    {"type": "text", "role": "paragraph", "content": "I've been lifting for 6 months and want to optimize my diet.", "language": "en"},
    {"type": "image", "url": "https://cdn.reddit.com/fitness-image.jpg", "alt_text": "Person lifting weights"}
  ]
}
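
The request and artifact shapes above can be sketched as TypeScript types. This is a minimal sketch, not the published schema: the type names, the `buildCalibrateRequest` helper, and its checks (including the "at least one asset" rule) are assumptions for illustration, and only the asset types that appear in the example (text and image) are modeled.

```typescript
// Hypothetical types mirroring the fields shown in the example above.
// The real artifact.json and calibrate-content-request.json may define more.
type PropertyId = { type: string; value: string };

type Asset =
  | { type: "text"; role: string; content: string; language?: string }
  | { type: "image"; url: string; alt_text?: string };

interface Artifact {
  property_id: PropertyId;
  artifact_id: string;
  assets: Asset[];
}

interface CalibrateContentRequest {
  standards_id: string;
  artifact: Artifact;
}

// Hypothetical helper: reject obvious mistakes before anything is sent.
// The non-empty-assets check is an assumption, not a documented rule.
function buildCalibrateRequest(
  standardsId: string,
  artifact: Artifact
): CalibrateContentRequest {
  if (standardsId.trim() === "") throw new Error("standards_id is required");
  if (artifact.artifact_id.trim() === "") throw new Error("artifact_id is required");
  if (artifact.assets.length === 0) throw new Error("artifact needs at least one asset");
  return { standards_id: standardsId, artifact };
}

const request = buildCalibrateRequest("nike_brand_safety", {
  property_id: { type: "domain", value: "reddit.com" },
  artifact_id: "r_fitness_abc123",
  assets: [
    { type: "text", role: "title", content: "Best protein sources for muscle building", language: "en" },
    { type: "image", url: "https://cdn.reddit.com/fitness-image.jpg", alt_text: "Person lifting weights" },
  ],
});
```

Modeling assets as a discriminated union on `type` lets the compiler flag a text asset without `content` or an image asset without `url`.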

Response

Schema: calibrate-content-response.json

Pass response

{
  "verdict": "pass",
  "explanation": "This content aligns well with the brand's fitness-focused positioning. Health and fitness content is explicitly marked as 'ideal' in the policy. The discussion is constructive and educational.",
  "features": [
    {
      "feature_id": "brand_safety",
      "status": "passed",
      "explanation": "No safety concerns. Content is user-generated but constructive fitness discussion."
    },
    {
      "feature_id": "brand_suitability",
      "status": "passed",
      "explanation": "Fitness content matches brand's athletic positioning."
    }
  ]
}

Fail response (with details)

{
  "verdict": "fail",
  "explanation": "This content discusses political topics which the policy explicitly excludes. While the article itself is balanced journalism, the brand has requested to avoid all controversial political content regardless of tone.",
  "features": [
    {
      "feature_id": "brand_safety",
      "status": "passed",
      "explanation": "No hate speech, illegal content, or explicit material."
    },
    {
      "feature_id": "brand_suitability",
      "status": "failed",
      "explanation": "Political content is excluded by brand policy, even when balanced."
    }
  ]
}

Response Fields

| Field | Required | Description |
|---|---|---|
| verdict | Yes | Overall pass or fail decision |
| explanation | No | Detailed natural language explanation of the decision |
| features | No | Per-feature breakdown with explanations |
| confidence | No | Model confidence in the verdict (0-1), when available |
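
Because only `verdict` is required, a consumer has to tolerate missing `explanation`, `features`, and `confidence`. The sketch below shows one way to do that. The type names and the `failedFeatures` and `summarize` helpers are hypothetical, and the status values are limited to the two that appear in the example responses; the actual schema may allow others.

```typescript
// Hypothetical types based on the response fields and examples shown above.
type Verdict = "pass" | "fail";

interface FeatureResult {
  feature_id: string;
  status: "passed" | "failed"; // only the values seen in the examples
  explanation?: string;
}

interface CalibrateContentResponse {
  verdict: Verdict;
  explanation?: string;
  features?: FeatureResult[];
  confidence?: number; // 0-1, when available
}

// Return the features that failed; an absent breakdown yields an empty list.
function failedFeatures(response: CalibrateContentResponse): FeatureResult[] {
  return (response.features ?? []).filter((f) => f.status === "failed");
}

// One-line summary that works whether or not the optional fields are present.
function summarize(response: CalibrateContentResponse): string {
  const conf =
    response.confidence === undefined
      ? ""
      : ` (confidence ${response.confidence.toFixed(2)})`;
  const failed = failedFeatures(response).map((f) => f.feature_id);
  return failed.length === 0
    ? `${response.verdict}${conf}`
    : `${response.verdict}${conf}: ${failed.join(", ")}`;
}

const failResponse: CalibrateContentResponse = {
  verdict: "fail",
  features: [
    { feature_id: "brand_safety", status: "passed" },
    { feature_id: "brand_suitability", status: "failed" },
  ],
};

const failSummary = summarize(failResponse);
const bareSummary = summarize({ verdict: "pass", confidence: 0.9 });
```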

Dialogue Flow

Calibration supports back-and-forth dialogue using the protocol’s conversation management. The seller sends content, the verification agent responds with an evaluation and explanation, and the seller can respond with questions or try different content - all within the same conversation context.

A2A Example

// Seller sends artifact to evaluate
const response1 = await a2a.send({
  message: {
    parts: [{
      kind: "data",
      data: {
        skill: "calibrate_content",
        parameters: {
          standards_id: "nike_brand_safety",
          artifact: {
            property_id: { type: "domain", value: "reddit.com" },
            artifact_id: "r_news_politics_123",
            assets: [
              { type: "text", role: "title", content: "Political News Article" }
            ]
          }
        }
      }
    }]
  }
});
// Response: verdict=fail with feature breakdown

// Seller asks follow-up question about the decision
const response2 = await a2a.send({
  contextId: response1.contextId,
  message: {
    parts: [{
      kind: "text",
      text: "This is factual news, not opinion. Should balanced journalism be excluded?"
    }]
  }
});
// Verification agent clarifies that brand policy excludes ALL political content

// Seller tries different artifact
const response3 = await a2a.send({
  contextId: response1.contextId,
  message: {
    parts: [{
      kind: "data",
      data: {
        skill: "calibrate_content",
        parameters: {
          standards_id: "nike_brand_safety",
          artifact: {
            property_id: { type: "domain", value: "reddit.com" },
            artifact_id: "r_running_tips_456",
            assets: [
              { type: "text", role: "title", content: "Running Tips" }
            ]
          }
        }
      }
    }]
  }
});
// Response: verdict=pass - now seller understands the boundaries

MCP Example

// Initial calibration request
const response1 = await mcp.call('calibrate_content', {
  standards_id: "nike_brand_safety",
  artifact: {
    property_id: { type: "domain", value: "reddit.com" },
    artifact_id: "r_news_politics_123",
    assets: [
      { type: "text", role: "title", content: "Political News Article" }
    ]
  }
});
// Response includes context_id for conversation continuity

// Continue dialogue with follow-up question
const response2 = await mcp.call('calibrate_content', {
  context_id: response1.context_id,
  standards_id: "nike_brand_safety",
  artifact: {
    property_id: { type: "domain", value: "reddit.com" },
    artifact_id: "r_news_politics_123",
    assets: [
      { type: "text", role: "title", content: "Political News Article" }
    ]
  }
});
// The follow-up question itself (here, about balanced journalism) travels as a
// text message in the protocol envelope, not as a calibrate_content parameter

// Try different artifact in same conversation
const response3 = await mcp.call('calibrate_content', {
  context_id: response1.context_id,
  standards_id: "nike_brand_safety",
  artifact: {
    property_id: { type: "domain", value: "reddit.com" },
    artifact_id: "r_running_tips_456",
    assets: [
      { type: "text", role: "title", content: "Running Tips" }
    ]
  }
});

The key insight is that the dialogue happens at the protocol layer, not the task layer. The verification agent maintains conversation context and can respond to follow-up questions, disagreements, or requests for clarification - just like any agent-to-agent conversation.
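
The context handling in the examples above can be sketched as a small loop that reuses the first response's context identifier for every later turn. Everything here is an assumption for illustration: the `Send` signature, `runCalibration`, and the in-memory `makeStubAgent` stand in for a real A2A or MCP client, and the calls are synchronous only to keep the sketch short (a real client would be asynchronous).

```typescript
type Verdict = "pass" | "fail";

interface Turn {
  context_id: string;
  verdict: Verdict;
}

// Hypothetical transport: sends one artifact, optionally within an existing context.
type Send = (artifactId: string, contextId?: string) => Turn;

interface CalibrationLog {
  contextId: string | undefined;
  turns: Array<{ artifactId: string; verdict: Verdict }>;
}

// Try artifacts in order within one conversation, stopping at the first pass.
function runCalibration(send: Send, artifactIds: string[]): CalibrationLog {
  const turns: Array<{ artifactId: string; verdict: Verdict }> = [];
  let contextId: string | undefined;
  for (const artifactId of artifactIds) {
    const turn = send(artifactId, contextId);
    contextId = turn.context_id; // reuse the same context on later turns
    turns.push({ artifactId, verdict: turn.verdict });
    if (turn.verdict === "pass") break;
  }
  return { contextId, turns };
}

// Stub agent for the sketch: fails artifact ids containing "politics",
// issues a context id on the first call, and echoes it afterwards.
function makeStubAgent(): Send {
  let issued = 0;
  return (artifactId, contextId) => ({
    context_id: contextId ?? `ctx_${++issued}`,
    verdict: artifactId.includes("politics") ? "fail" : "pass",
  });
}

const log = runCalibration(makeStubAgent(), [
  "r_news_politics_123",
  "r_running_tips_456",
]);
```

Free-text follow-up questions are deliberately absent from `Send`: as described above, they belong to the protocol envelope rather than to the task parameters.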

Calibration vs Runtime

| Aspect | calibrate_content | Runtime (local model) |
|---|---|---|
| Purpose | Alignment & understanding | High-volume decisioning |
| Volume | Low (setup/periodic) | High (every impression) |
| Response | Verbose explanations | Pass/fail only |
| Latency | Seconds acceptable | Milliseconds required |
| Dialogue | Multi-turn conversation | Stateless |
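
One way to connect the two columns for drift detection is to replay a small sample of artifacts through calibrate_content periodically and compare the verdicts with the local model's. The sketch below is illustrative only: the `Sample` shape, the helper names, and the 0.95 agreement threshold are assumptions, not part of the protocol.

```typescript
type Verdict = "pass" | "fail";

// Hypothetical record pairing the local model's verdict with the calibrated one.
interface Sample {
  artifact_id: string;
  local: Verdict;
  calibrated: Verdict;
}

// Fraction of samples where the local model agrees with calibrate_content.
// An empty sample set is treated as full agreement (nothing to contradict).
function agreementRate(samples: Sample[]): number {
  if (samples.length === 0) return 1;
  const agreed = samples.filter((s) => s.local === s.calibrated).length;
  return agreed / samples.length;
}

// Artifacts worth raising in a calibration dialogue.
function disagreements(samples: Sample[]): string[] {
  return samples.filter((s) => s.local !== s.calibrated).map((s) => s.artifact_id);
}

// The threshold is an assumed value; choose one that fits the buyer's tolerance.
function needsRecalibration(samples: Sample[], threshold = 0.95): boolean {
  return agreementRate(samples) < threshold;
}

const samples: Sample[] = [
  { artifact_id: "r_fitness_abc123", local: "pass", calibrated: "pass" },
  { artifact_id: "r_running_tips_456", local: "pass", calibrated: "pass" },
  { artifact_id: "r_news_politics_123", local: "pass", calibrated: "fail" },
  { artifact_id: "r_news_politics_789", local: "fail", calibrated: "fail" },
];

const rate = agreementRate(samples);
const toDiscuss = disagreements(samples);
const drifted = needsRecalibration(samples);
```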