Model Configs

A model config is a named configuration of a particular model. It lets you select your model provider and tweak settings that guide the generative model's behaviour.

You can have multiple model configs per model, choose which one is the default, and, in Workflows, pick the config you want to use for each workflow.
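
To illustrate this, here is a minimal sketch of how several configs for one model might be represented, with one marked as the default. The field names and values are hypothetical and are not Lleverage's actual schema.

```python
# Hypothetical illustration only: field names are not Lleverage's actual schema.
model_configs = [
    {"name": "concise-answers", "model": "gpt-4o", "temperature": 0.2, "is_default": True},
    {"name": "creative-drafts", "model": "gpt-4o", "temperature": 0.9, "is_default": False},
]

def pick_config(configs, name=None):
    """Return the config with the given name, or the default config if no name is given."""
    if name is not None:
        return next(c for c in configs if c["name"] == name)
    return next(c for c in configs if c["is_default"])
```

A workflow would then reference a config by name, falling back to the default when none is specified.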

Depending on the selected model, configurations typically consist of the following (a worked example follows the list):

  • Max Tokens: this parameter sets the maximum number of tokens (roughly, chunks of words or characters) the model may produce in its response. It affects things like quality, speed, and relevance. You can also let Lleverage dynamically calculate the max tokens based on your prompt and use case by setting max_tokens=-1.

  • Temperature: this parameter influences the randomness and creativity of the model's responses. The higher the value, the more creative and random the responses become.

  • Top-P (Nucleus): this parameter manages the randomness of the model's responses by focusing on the most probable next tokens. Top-p sets a cumulative probability threshold; the model samples only from the smallest set of candidate tokens whose combined probability reaches that threshold. This sampling method can help generate more coherent and contextually appropriate text while still allowing for creativity and variation.

  • Frequency Penalty: This parameter penalizes new tokens based on their frequency in the text already generated, thus decreasing the model's tendency to repeat itself. It can be particularly useful in applications like long-form content generation where repetitiveness can detract from the quality of the output.

  • Presence Penalty: Similar to the frequency penalty, the presence penalty discourages the model from repeating the same terms, enhancing the variety and richness of the content. It’s useful in keeping the dialogue or text output engaging and varied, especially in conversational AI applications.

  • Stop Sequences: These are specific tokens or sequences of tokens where, once generated, the model stops generating any further tokens. This is particularly useful for structuring responses or ensuring that the model does not generate beyond a desired endpoint, such as the end of a sentence or paragraph, or a specific closing statement in customer interactions.
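
To make these parameters concrete, the sketch below shows how they map onto a typical OpenAI-style chat completion request. The model name, prompt, and parameter values are placeholders chosen for illustration; Lleverage applies the equivalent settings for whichever provider and model you select in your config.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",     # placeholder model name
    messages=[{"role": "user", "content": "Summarise our refund policy in two sentences."}],
    max_tokens=150,          # upper bound on tokens in the response
    temperature=0.7,         # higher = more random and creative
    top_p=0.9,               # nucleus sampling: cumulative probability threshold
    frequency_penalty=0.5,   # discourage repeating tokens that appear often
    presence_penalty=0.3,    # discourage reusing tokens that already appeared at all
    stop=["\n\n"],           # stop generating at the first blank line
)
print(response.choices[0].message.content)
```

Lower temperature and top-p values suit predictable, factual tasks, while higher values and lighter penalties suit creative or conversational use cases.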
