Sampling
The Model Context Protocol (MCP) provides a standardized way for servers to request LLM sampling ("completions" or "generations") from language models via clients. This flow lets clients keep control over model access, selection, and permissions while servers use AI capabilities, with no server API keys required. Servers can request text-based or image-based interactions and optionally include context from MCP servers in their prompts.
User Interaction Model
Sampling in MCP allows servers to implement agentic behaviors by letting LLM calls occur nested inside other MCP server features.
Implementations are free to expose sampling through any interface pattern that suits their needs; the protocol itself does not mandate a specific user interaction model.
For trust & safety and security, there SHOULD always be a human in the loop with the ability to deny sampling requests.
Applications SHOULD:
- Provide a UI that makes it easy and intuitive to review sampling requests
- Allow users to view and edit prompts before sending
- Present generated responses for review before delivery
Capabilities
Clients that support sampling MUST declare the sampling capability during initialization:
{
  "capabilities": {
    "sampling": {}
  }
}
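A server may want to confirm that the client declared this capability before it sends any sampling request. The following is a minimal sketch, assuming the client's initialization parameters are available as a plain dictionary; the helper name `client_supports_sampling` is hypothetical and not part of any SDK.

```python
from typing import Any, Mapping


def client_supports_sampling(initialize_params: Mapping[str, Any]) -> bool:
    """Return True if the client declared the `sampling` capability.

    The capability is declared as an empty object ({}), so the check tests
    for key presence rather than truthiness.
    """
    capabilities = initialize_params.get("capabilities") or {}
    return "sampling" in capabilities


# Hypothetical initialization parameters, shaped like the declaration above.
declared = {"capabilities": {"sampling": {}}}
print(client_supports_sampling(declared))
```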
Protocol Messages
Creating Messages
To request a language model generation, servers send a sampling/createMessage request:
Request:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "What is the capital of France?"
        }
      }
    ],
    "modelPreferences": {
      "hints": [
        {
          "name": "claude-3-sonnet"
        }
      ],
      "intelligencePriority": 0.8,
      "speedPriority": 0.5
    },
    "systemPrompt": "You are a helpful assistant.",
    "maxTokens": 100
  }
}
Response:
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "role": "assistant",
    "content": {
      "type": "text",
      "text": "The capital of France is Paris."
    },
    "model": "claude-3-sonnet-20240307",
    "stopReason": "endTurn"
  }
}
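On the server side, a request like the one above could be assembled programmatically. This is a sketch only: `build_create_message_request` is a hypothetical helper, not an SDK function, and it treats `systemPrompt` and `modelPreferences` as optional purely for illustration.

```python
import json
from typing import Any, Optional


def build_create_message_request(
    request_id: int,
    prompt: str,
    max_tokens: int,
    system_prompt: Optional[str] = None,
    model_preferences: Optional[dict] = None,
) -> dict:
    """Build a JSON-RPC 2.0 sampling/createMessage request with one user text message."""
    params: dict = {
        "messages": [
            {"role": "user", "content": {"type": "text", "text": prompt}}
        ],
        "maxTokens": max_tokens,
    }
    # Optional fields are included only when the caller supplies them.
    if system_prompt is not None:
        params["systemPrompt"] = system_prompt
    if model_preferences is not None:
        params["modelPreferences"] = model_preferences
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sampling/createMessage",
        "params": params,
    }


request = build_create_message_request(
    1,
    "What is the capital of France?",
    max_tokens=100,
    system_prompt="You are a helpful assistant.",
    model_preferences={"hints": [{"name": "claude-3-sonnet"}], "intelligencePriority": 0.8},
)
print(json.dumps(request, indent=2))
```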
Message Flow
sequenceDiagram
    participant Server
    participant Client
    participant User
    participant LLM

    Note over Server,Client: Server initiates sampling
    Server->>Client: sampling/createMessage

    Note over Client,User: Human-in-the-loop review
    Client->>User: Present request for approval
    User-->>Client: Review and approve/modify

    Note over Client,LLM: Model interaction
    Client->>LLM: Forward approved request
    LLM-->>Client: Return generation

    Note over Client,User: Response review
    Client->>User: Present response for approval
    User-->>Client: Review and approve/modify

    Note over Server,Client: Complete request
    Client-->>Server: Return approved response
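The flow in the diagram can be sketched as a client-side handler with two review points. Everything here is an assumption made for illustration: the function name, the callback signatures, and the convention that a reviewer returns `None` to reject. The error code -1 mirrors the example error in this document, and the message for a rejected response is invented.

```python
from typing import Any, Callable, Optional

Params = dict
Result = dict


def handle_sampling_request(
    request: dict,
    review_request: Callable[[Params], Optional[Params]],
    call_model: Callable[[Params], Result],
    review_response: Callable[[Result], Optional[Result]],
) -> dict:
    """Run one sampling/createMessage request through both human review steps."""

    def rejected(message: str) -> dict:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -1, "message": message},
        }

    # 1. The user reviews (and may modify) the request; None means rejection.
    approved_params = review_request(request["params"])
    if approved_params is None:
        return rejected("User rejected sampling request")

    # 2. Only the approved request is forwarded to the model.
    generation = call_model(approved_params)

    # 3. The user reviews (and may modify) the generation before delivery.
    approved_result = review_response(generation)
    if approved_result is None:
        return rejected("User rejected sampling response")  # illustrative message

    # 4. The approved response is returned to the server.
    return {"jsonrpc": "2.0", "id": request["id"], "result": approved_result}
```

Passing the review steps and the model call as callbacks keeps the sketch independent of any particular UI or model provider, which matches the protocol's stance of not prescribing an interaction model.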
Data Types
Messages
Sampling messages can contain:
Text Content
{
  "type": "text",
  "text": "The message content"
}
Image Content
{
  "type": "image",
  "data": "base64-encoded-image-data",
  "mimeType": "image/jpeg"
}
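Both content shapes can be produced with small helpers. The sketch below uses only Python's standard `base64` module; the helper names `text_content` and `image_content` are hypothetical.

```python
import base64


def text_content(text: str) -> dict:
    """Build a text content object."""
    return {"type": "text", "text": text}


def image_content(raw_bytes: bytes, mime_type: str) -> dict:
    """Build an image content object from raw image bytes.

    The bytes are base64-encoded and decoded to ASCII so that the value
    can be embedded in a JSON string.
    """
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {"type": "image", "data": encoded, "mimeType": mime_type}


# The first three bytes of a JPEG file, used here as stand-in image data.
sample = image_content(b"\xff\xd8\xff", "image/jpeg")
print(sample)
```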
Model Preferences
Model selection in MCP requires careful abstraction since servers and clients may use different AI providers with distinct model offerings. A server cannot simply request a specific model by name since the client may not have access to that exact model or may prefer to use a different provider’s equivalent model.
To solve this, MCP implements a preference system that combines abstract capability priorities with optional model hints:
Capability Priorities
Servers express their needs through three normalized priority values (0-1):
- costPriority: How important is minimizing costs? Higher values prefer cheaper models.
- speedPriority: How important is low latency? Higher values prefer faster models.
- intelligencePriority: How important are advanced capabilities? Higher values prefer more capable models.
Model Hints
While priorities help select models based on characteristics, hints allow servers to suggest specific models or model families:
- Hints are treated as substrings that can match model names flexibly
- Multiple hints are evaluated in order of preference
- Clients MAY map hints to equivalent models from different providers
- Hints are advisory—clients make final model selection
For example:
{
  "hints": [
    { "name": "claude-3-sonnet" },  // Prefer Sonnet-class models
    { "name": "claude" }            // Fall back to any Claude model
  ],
  "costPriority": 0.3,          // Cost is less important
  "speedPriority": 0.8,         // Speed is very important
  "intelligencePriority": 0.5   // Moderate capability needs
}
The client processes these preferences to select an appropriate model from its available options. For instance, if the client doesn’t have access to Claude models but has Gemini, it might map the sonnet hint to gemini-1.5-pro based on similar capabilities.
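The protocol leaves the selection strategy to the client, so the following is only one possible approach, sketched under several assumptions: the client keeps a catalog of models rated from 0 to 1 on cheapness, speed, and intelligence (the `ModelInfo` type and all ratings are invented), hints are tried in order as substrings, and the priorities act as weights in a simple weighted sum. Mapping a hint to an equivalent model from another provider is not attempted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    """Hypothetical catalog entry; ratings are in [0, 1], higher is better."""
    name: str
    cheapness: float
    speed: float
    intelligence: float


def select_model(preferences: dict, catalog: list) -> ModelInfo:
    """Pick a model: the first matching hint narrows the candidates, priorities rank them."""
    if not catalog:
        raise ValueError("catalog must contain at least one model")

    candidates = catalog
    for hint in preferences.get("hints", []):
        name = hint.get("name", "")
        # Hints are substrings; an empty hint would match everything, so skip it.
        matches = [m for m in catalog if name and name in m.name]
        if matches:
            candidates = matches
            break  # earlier hints take precedence over later ones

    def score(model: ModelInfo) -> float:
        return (
            preferences.get("costPriority", 0.0) * model.cheapness
            + preferences.get("speedPriority", 0.0) * model.speed
            + preferences.get("intelligencePriority", 0.0) * model.intelligence
        )

    return max(candidates, key=score)


# Invented ratings, for illustration only.
catalog = [
    ModelInfo("claude-3-sonnet-20240307", cheapness=0.5, speed=0.6, intelligence=0.8),
    ModelInfo("gemini-1.5-pro", cheapness=0.5, speed=0.5, intelligence=0.8),
    ModelInfo("example-fast-model", cheapness=0.9, speed=0.9, intelligence=0.4),
]
preferences = {
    "hints": [{"name": "claude-3-sonnet"}, {"name": "claude"}],
    "costPriority": 0.3,
    "speedPriority": 0.8,
    "intelligencePriority": 0.5,
}
print(select_model(preferences, catalog).name)
```

Because hints are advisory, the sketch falls back to scoring the whole catalog when no hint matches rather than failing.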
Error Handling
Clients SHOULD return errors for common failure cases, such as the user rejecting a sampling request.
Example error:
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -1,
    "message": "User rejected sampling request"
  }
}
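A server receiving such a response needs to distinguish it from a successful result. One way to do that, sketched here with a hypothetical `SamplingError` exception and `unwrap_sampling_response` helper, is to raise on the `error` member and otherwise return the `result`.

```python
class SamplingError(Exception):
    """Raised when the client answers a sampling request with a JSON-RPC error."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"sampling failed ({code}): {message}")
        self.code = code
        self.message = message


def unwrap_sampling_response(response: dict) -> dict:
    """Return the `result` of a response, or raise SamplingError if it carries an error."""
    error = response.get("error")
    if error is not None:
        raise SamplingError(error["code"], error["message"])
    return response["result"]


rejection = {
    "jsonrpc": "2.0",
    "id": 1,
    "error": {"code": -1, "message": "User rejected sampling request"},
}
try:
    unwrap_sampling_response(rejection)
except SamplingError as exc:
    print(exc.code, exc.message)
```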
Security Considerations
- Clients SHOULD implement user approval controls
- Both parties SHOULD validate message content
- Clients SHOULD respect model preference hints
- Clients SHOULD implement rate limiting
- Both parties MUST handle sensitive data appropriately
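The rate-limiting recommendation does not prescribe an algorithm. As one possibility, a client could apply a sliding-window limit per server before a sampling request is shown to the user. The class below is a sketch under that assumption; its name, parameters, and injectable clock are all illustrative.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` calls within any `window_seconds` span."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_requests = max_requests
        self._window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._timestamps = deque()

    def allow(self) -> bool:
        """Record and permit the request if the window has room, else refuse it."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self._window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self._max_requests:
            return False
        self._timestamps.append(now)
        return True


limiter = SlidingWindowRateLimiter(max_requests=2, window_seconds=60.0)
print([limiter.allow() for _ in range(3)])
```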