How to monitor GPT API usage in the Serverspace dashboard

When you work with the GPT API in the Serverspace panel, the cost of each request depends directly on the number of tokens processed. Tokens are the units of text that make up both the request you send and the response the model returns.

As a rough guide, one token corresponds to:

about 4 characters, or roughly three quarters of a word, in English,
a part of a word or even a single character in many other languages.

The longer the request and response, the higher the cost.
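
To get a feel for how text length translates into tokens, here is a minimal sketch that estimates a token count from character length. The figure of 4 characters per token is an assumed rule of thumb for English text, and estimate_tokens is a hypothetical helper, not part of any Serverspace API; the model's real tokenizer may give noticeably different counts, especially for other languages.

```python
import math

# Assumption: about 4 characters per token, a common rule of thumb for English.
# Real tokenizers differ by model and language, so this is only an estimate.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate based on character count."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


prompt = "Summarize the following support ticket in two sentences."
print(f"Estimated tokens: {estimate_tokens(prompt)}")
```

For billing purposes, rely on the token counts reported for your actual requests rather than on an estimate like this one.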

Token pricing

Serverspace prices each model separately, with its own rates for input and output tokens. For example, for the GPT-5.3 Codex model:

Input token cost: 2.3 € / 1M tokens
Output token cost: 18.42 € / 1M tokens

What this means in practice:
Input tokens are the text you send to the model (prompt, instructions, context).
Output tokens are the response generated by the model.

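Important: output tokens are usually more expensive because generating text requires more computational resources than reading it. In the example above, an output token costs about eight times as much as an input token.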
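
The arithmetic behind these rates can be sketched as follows. The rates are the GPT-5.3 Codex example figures quoted above; the function name request_cost_eur and the token counts are illustrative assumptions, not part of the Serverspace panel or API.

```python
# Example rates quoted above for GPT-5.3 Codex, in euros per 1 million tokens.
INPUT_RATE_EUR_PER_M = 2.3
OUTPUT_RATE_EUR_PER_M = 18.42


def request_cost_eur(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: input and output tokens are billed at separate rates."""
    input_cost = input_tokens * INPUT_RATE_EUR_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_RATE_EUR_PER_M / 1_000_000
    return input_cost + output_cost


# Hypothetical request: a 1,500-token prompt and a 500-token response.
cost = request_cost_eur(input_tokens=1_500, output_tokens=500)
print(f"{cost:.5f} EUR")  # prints "0.01266 EUR"
```

In this hypothetical request the response is a third of the prompt's length, yet it accounts for most of the cost (0.00921 € of 0.01266 €).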

Why token control is important

Controlling token usage helps to:

avoid unexpected costs,
predict API budget,
optimize response quality/length,
manage load in applications.

Without token limits, the model may generate overly long responses, increasing the cost of each request.

How token limits work

The Serverspace panel provides a setting:

Maximum number of tokens
This setting defines the upper limit of the model’s response length.

If a limit is set:

the model cannot exceed the specified number of tokens in its response,
the output is cut off automatically when the limit is reached, so a long answer may end mid-sentence,
the cost of each request becomes predictable, because the output portion has a known upper bound.
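
A token limit turns the output cost into a known worst case, which makes budgeting straightforward. The sketch below assumes the GPT-5.3 Codex example rates from this article; the function names, the prompt size, the limit, and the budget are hypothetical values chosen for illustration.

```python
# Example rates from this article for GPT-5.3 Codex, in euros per 1 million tokens.
INPUT_RATE_EUR_PER_M = 2.3
OUTPUT_RATE_EUR_PER_M = 18.42


def max_request_cost_eur(input_tokens: int, max_output_tokens: int) -> float:
    """Upper bound on the cost of one request when the response is capped."""
    return (
        input_tokens * INPUT_RATE_EUR_PER_M
        + max_output_tokens * OUTPUT_RATE_EUR_PER_M
    ) / 1_000_000


def requests_within_budget(
    budget_eur: float, input_tokens: int, max_output_tokens: int
) -> int:
    """Number of worst-case requests that fit into the given budget."""
    worst_case = max_request_cost_eur(input_tokens, max_output_tokens)
    if worst_case <= 0:
        raise ValueError("token counts must give a positive cost")
    return int(budget_eur // worst_case)


# Hypothetical setup: 2,000-token prompts, a limit of 1,000 tokens, a 50 EUR budget.
print(f"{max_request_cost_eur(2_000, 1_000):.5f} EUR")  # prints "0.02302 EUR"
print(requests_within_budget(50, 2_000, 1_000))  # prints 2172
```

Because most responses stop before reaching the limit, actual spending will normally be lower than this worst-case figure.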

Benefits of using token limits

Using token limits allows you to:

Control your budget: overly long and expensive responses are ruled out
Increase cost predictability: expenses are easier to plan
Optimize performance: shorter responses are returned faster
Flexibly manage model behavior: choose the balance between brevity and detail