
Generation: Parameters & Output format

Parameter configuration is an important aspect of tuning AI agents. Each parameter influences how the model generates text, affecting creativity, coherence, determinism, and overall output quality. Teammately's system provides optimized parameter management for each specific use case.

Core Parameters

Temperature

Temperature controls the randomness of the model's outputs:

  • Low temperature [0.0-0.3]: More deterministic, focused responses. Ideal for factual Q&A, structured data extraction, and use cases requiring consistency
  • Medium temperature [0.4-0.7]: Balanced creativity and determinism. Suitable for most general-purpose applications and conversational AI
  • High temperature [0.8-1.0]: Increased randomness and creativity. Useful for brainstorming, creative writing, and generating diverse alternatives
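Conceptually, temperature divides the model's logits before the softmax step: low values sharpen the distribution toward the top token, high values flatten it. A minimal sketch in plain Python with illustrative logits (not Teammately's internal implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by T before softmax; T must be > 0
    # (T -> 0 approaches greedy/argmax decoding).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cool = softmax_with_temperature(logits, 0.2)  # sharp: top token dominates
warm = softmax_with_temperature(logits, 1.0)  # baseline distribution
```

At temperature 0.2 the top token's probability climbs above 0.99, while at 1.0 it stays near 0.63, which is why low temperatures feel deterministic.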

Top_p (Nucleus Sampling)

Top_p sampling selects from the smallest possible set of tokens whose cumulative probability exceeds the threshold p:

  • Low values [0.1-0.5]: More focused, deterministic outputs with limited variety
  • Medium values [0.5-0.8]: Balanced approach suitable for most applications
  • High values [0.9-1.0]: Considers a wider range of tokens, increasing diversity

Top_p can be used alongside or as an alternative to temperature. While temperature reshapes the entire probability distribution, top_p truncates it to the most likely tokens.
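The truncation step can be sketched as follows: sort tokens by probability, keep the smallest prefix whose cumulative mass reaches `p`, and renormalize. This is an illustrative implementation, not Teammately's internal code:

```python
def nucleus_filter(probs, top_p):
    # Keep the smallest set of tokens whose cumulative probability
    # reaches top_p, then renormalize over that set.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]
kept_probs = nucleus_filter(probs, 0.8)  # keeps only the two most likely tokens
```

With `top_p=0.8`, tokens 0 and 1 (cumulative mass 0.8) survive and are renormalized to 0.625 and 0.375; the long tail is cut off entirely.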

Max Tokens

This parameter sets the maximum length of the generated response. Setting it too low may truncate important information, while setting it too high may lead to unnecessary verbosity or hallucinations.

When you use a Teammately Agent, it dynamically adjusts max tokens based on the input context length, the agent's complexity, and the response type.
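In a raw API call you would set this cap yourself. The payload below follows the common OpenAI-style convention as an illustration; the field names (`model`, `messages`, `max_tokens`) and the model name are assumptions, and Teammately's own request shape may differ:

```python
# Hypothetical OpenAI-style request payload (field names are illustrative).
request = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Summarize this report."}],
    # Cap the response length: raise for long summaries,
    # lower for short, predictable answers.
    "max_tokens": 512,
}
```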

Frequency Penalty

Frequency penalty reduces the likelihood of repetition by penalizing tokens that have already appeared in the generated text:

  • 0.0: No penalty applied
  • [0.1-0.5]: Mild discouragement of repetition while preserving natural language patterns
  • [0.6-1.0]: Strong prevention of repetition, useful for reducing "stuck in a loop" behaviors

This parameter is particularly useful for longer generations where repetition can become problematic.
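A common way to implement this (the approach described in OpenAI's documentation) is to subtract `penalty × count(token)` from each token's logit, so tokens that have appeared more often are penalized more. A minimal sketch with illustrative logits:

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty):
    # Subtract penalty * count(token) from each token's logit:
    # the more often a token has appeared, the larger its penalty.
    counts = Counter(generated_tokens)
    return {tok: logit - penalty * counts.get(tok, 0)
            for tok, logit in logits.items()}

logits = {"the": 2.0, "cat": 1.5, "sat": 1.0}
adjusted = apply_frequency_penalty(logits, ["the", "the", "cat"], penalty=0.4)
# "the" drops by 0.8 (two occurrences), "cat" by 0.4, "sat" is unchanged
```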

Presence Penalty

Presence penalty reduces the likelihood of the model repeating the same topics:

  • 0.0: No penalty applied
  • [0.1-0.5]: Gentle encouragement toward topic diversity
  • [0.6-1.0]: Strong push toward covering new topics

Unlike the frequency penalty (which targets specific repeated tokens), the presence penalty discourages thematic repetition more broadly.
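The distinction shows up directly in the math: presence penalty is a flat, one-time subtraction for any token that has appeared at least once, regardless of how many times. An illustrative sketch, mirroring the frequency-penalty example above:

```python
def apply_presence_penalty(logits, generated_tokens, penalty):
    # Subtract a flat penalty from every token that has appeared
    # at least once -- occurrence count does not matter.
    seen = set(generated_tokens)
    return {tok: logit - (penalty if tok in seen else 0.0)
            for tok, logit in logits.items()}

logits = {"the": 2.0, "cat": 1.5, "sat": 1.0}
adjusted = apply_presence_penalty(logits, ["the", "the", "cat"], penalty=0.6)
# "the" and "cat" each drop by 0.6 exactly once; "sat" is unchanged,
# so "cat" (0.9) now ranks below the never-generated "sat" (1.0)
```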

Output Formatting

Teammately supports multiple output formats to match your integration needs:

  • Text Output - The default format providing natural language responses. Usually best for user-facing conversations, narrative outputs, and creative content
  • JSON Schema - Structured output conforming to a predefined schema. Use it to ensure consistent, parseable responses from agents and to simplify integration with existing services
  • JSON Object - Free-form structured data without a predefined schema. It is more flexible than schema-based outputs and supports dynamic response structures
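To illustrate the JSON Schema option, here is a hypothetical structured-output specification. It follows the widely used `response_format` / JSON Schema convention as an example only; the `support_ticket` schema and all field names are assumptions, and Teammately's exact configuration shape may differ:

```python
# Hypothetical structured-output spec (illustrative field names).
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "support_ticket",
        "schema": {
            "type": "object",
            "properties": {
                "category": {"type": "string"},
                "priority": {"type": "string",
                             "enum": ["low", "medium", "high"]},
            },
            # Required keys guarantee a consistent, parseable shape
            # for downstream services.
            "required": ["category", "priority"],
        },
    },
}
```

Constraining output to a schema like this is what makes agent responses safe to feed directly into existing services without brittle text parsing.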

Parameter Selection Strategy

Teammately's Agent parameter management system:

  1. Analyzes your use case requirements
  2. Considers the target model's characteristics
  3. Dynamically adjusts parameters based on input context
  4. Applies best practices from extensive testing
  5. Continuously optimizes based on performance metrics

Our approach eliminates the need for manual parameter tuning while ensuring optimal results across different models and use cases.