Advanced Usage

Symbolic Variables and Batch Mode

To generate responses for a batch of examples, prepare a prompt with symbolic variables and provide input data that will be injected into the prompt. allms will automatically make these requests asynchronously and retry them in case of an API error.

Let's say we want to classify reviews of coffee as positive or negative. Here's how to do it:

from allms.models import AzureOpenAIModel
from allms.domain.configuration import AzureOpenAIConfiguration
from allms.domain.input_data import InputData

configuration = AzureOpenAIConfiguration(
    api_key="<OPENAI_API_KEY>",
    base_url="<OPENAI_API_BASE>",
    api_version="<OPENAI_API_VERSION>",
    deployment="<OPENAI_API_DEPLOYMENT_NAME>",
    model_name="<OPENAI_API_MODEL_NAME>"
)

model = AzureOpenAIModel(config=configuration)

positive_review_0 = "Very good coffee, lightly roasted, with good aroma and taste. The taste of sourness is barely noticeable (which is good because I don't like sour coffees). After grinding, the aroma spreads throughout the room. I recommend it to all those who do not like strongly roasted and pitch-black coffees. A very good solution is to close the package with string, which allows you to preserve the aroma and freshness."
positive_review_1 = "Delicious coffee!! Delicate, just the way I like it, and the smell after opening is amazing. It smells freshly roasted. Faithful to Lavazza coffee for years, I decided to look for other flavors. Based on the reviews, I blindly bought it and it was a bull's-eye - it outperformed Lavazza in taste. For me, the best."
negative_review = "Marketing is doing its job and I was tempted too, but this coffee is nothing above the level of coffees from the supermarket. And the method of brewing or grinding does not help here. The coffee is simply weak - both in terms of strength and taste. I do not recommend."

prompt = "You'll be provided with a review of a coffe. Decide if the review is positive or negative. Review: {review}"
input_data = [
    InputData(input_mappings={"review": positive_review_0}, id="0"),
    InputData(input_mappings={"review": positive_review_1}, id="1"),
    InputData(input_mappings={"review": negative_review}, id="2")
]

responses = model.generate(prompt=prompt, input_data=input_data)

As output, we'll get a List[ResponseData], where each ResponseData contains the response for a single example from input_data. Because the requests are performed asynchronously, the order of the responses is not guaranteed to match the order of the input_data. That's why each ResponseData also carries its corresponding input_data, so you can match every response back to the example that produced it.

So let's see the responses:

>>> {f"review_id={response.input_data.id}": response.response for response in responses}
{
    'review_id=0': 'The review is positive.',
    'review_id=1': 'The review is positive.',
    'review_id=2': 'The review is negative.'
}
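
Because the response order is not guaranteed, you may want to restore the original input order before further processing. Here's a minimal sketch, assuming the numeric string ids used in this example:

# Sort the responses back into the original input order by their ids
ordered_responses = sorted(responses, key=lambda response: int(response.input_data.id))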

Multiple Symbolic Variables

You can also define a prompt with multiple symbolic variables. The rule is that each symbolic variable in the prompt must have a corresponding mapping in the input_mappings of InputData. Let's say we want to provide two reviews in one prompt and let the model decide which one is positive. Here's how to do it:

prompt = """You'll be provided with two reviews of a coffee. Decide which one is positive.

First review: {first_review}
Second review: {second_review}"""
input_data = [
    InputData(input_mappings={"first_review": positive_review_0, "second_review": negative_review}, id="0"),
    InputData(input_mappings={"first_review": negative_review, "second_review": positive_review_1}, id="1"),
]

responses = model.generate(prompt=prompt, input_data=input_data)

And the results:

>>> {f"example_id={response.input_data.id}": response.response for response in responses}
{
    'example_id=0': 'The first review is positive.',
    'example_id=1': 'The second review is positive.'
}

Controlling the Number of Concurrent Requests

As mentioned above, allms automatically makes requests asynchronously. By default, the maximum number of concurrent requests is set to 1000. You can control this limit by setting the max_concurrency parameter when initializing the model, as shown below. Choose a value appropriate for your model endpoint.
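
For example, to cap concurrency at 10 simultaneous requests (a minimal sketch, reusing the configuration object from the earlier examples):

model = AzureOpenAIModel(
    config=configuration,
    max_concurrency=10  # at most 10 requests will be in flight at once
)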

Using a Common asyncio Event Loop

By default, each model instance has its own event loop for handling the execution of async tasks. If you want to use a common loop for multiple models, or to supply a custom one, you can pass it to the model constructor:

import asyncio

from allms.models import AzureOpenAIModel
from allms.domain.configuration import AzureOpenAIConfiguration

custom_event_loop = asyncio.new_event_loop()

configuration = AzureOpenAIConfiguration(
    api_key="<OPENAI_API_KEY>",
    base_url="<OPENAI_API_BASE>",
    api_version="<OPENAI_API_VERSION>",
    deployment="<OPENAI_API_DEPLOYMENT_NAME>",
    model_name="<OPENAI_API_MODEL_NAME>"
)

model = AzureOpenAIModel(
    config=configuration,
    event_loop=custom_event_loop
)
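
The same loop can then be shared by other models. A minimal sketch, assuming a second, hypothetical configuration (another_configuration) for a different deployment:

another_model = AzureOpenAIModel(
    config=another_configuration,  # hypothetical configuration for a second deployment
    event_loop=custom_event_loop   # both models schedule their async tasks on the same loop
)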