Discover our new AI Inference feature!
Model image
Set startup command
Port
Pod configurations

To run the model, the system deploys a Kubernetes pod and allocates the specified amount of memory (MiB) and CPU- or GPU-optimized resources.

Routing placement
Please select a flavor
Autoscaling limits

If one pod cannot handle all user requests, the system deploys additional pods. When the load decreases, the extra pods are deleted.

sec
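The scale-out behavior described above can be sketched as a simple control loop. This is a minimal illustration, not the platform's actual implementation; all names (`desired_replicas`, `capacity_per_pod`) are assumptions:

```python
import math

def desired_replicas(load: float, capacity_per_pod: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Return how many pods the current load requires,
    clamped to the configured autoscaling limits."""
    # Illustrative names: `load` is requests/sec, `capacity_per_pod`
    # is how many requests/sec a single pod can serve.
    needed = math.ceil(load / capacity_per_pod) if load > 0 else min_replicas
    return max(min_replicas, min(max_replicas, needed))
```

For example, with limits of 1 to 5 pods and 100 requests/sec per pod, a load of 250 requests/sec yields 3 pods; when the load drops to zero, the count returns to the minimum.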

Autoscaling triggers

When a trigger exceeds its threshold, the number of pods increases within the configured limits and decreases back to the initial level when the value drops, ensuring stable operation and high performance.

%
%
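A trigger check of this kind can be sketched as follows; the metric names and the function itself are illustrative assumptions, not the platform's API:

```python
def should_scale_out(metrics: dict[str, float],
                     thresholds: dict[str, float]) -> bool:
    """Scale out when any monitored metric (e.g. CPU or GPU
    utilization, in %) exceeds its configured threshold."""
    return any(metrics.get(name, 0.0) > limit
               for name, limit in thresholds.items())
```

For example, with thresholds of 80% CPU and 90% GPU, a reading of 85% CPU fires the trigger, while 50% CPU and 70% GPU does not.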
Environment variables

Environment variables defined here will be available only inside your container.
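Inside the container, such a variable is read like any other environment variable. A minimal sketch; the variable name `DEMO_INFERENCE_VAR` is an illustrative placeholder:

```python
import os

def read_setting(name: str, default: str = "") -> str:
    """Read a container-scoped environment variable, falling back
    to `default` if it was not defined for this deployment."""
    return os.environ.get(name, default)
```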

API Keys
Enable API Key authentication
Pod lifetime

This is how long autoscaling waits before deleting a pod that receives no requests. The countdown restarts with each request, starting from the last one.

sec
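The pod-lifetime countdown above can be sketched as an idle timer that restarts on every request. All names here are illustrative, not the platform's implementation:

```python
import time

class IdleTimer:
    """Tracks the pod-lifetime countdown: it restarts on every
    request, and the pod becomes eligible for deletion once
    `lifetime_sec` elapses with no requests."""

    def __init__(self, lifetime_sec: float):
        self.lifetime_sec = lifetime_sec
        self.last_request = time.monotonic()

    def on_request(self) -> None:
        # Each incoming request restarts the countdown.
        self.last_request = time.monotonic()

    def should_delete(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        return now - self.last_request >= self.lifetime_sec
```

With a 600-second lifetime, a pod idle for 599 seconds is kept; at 600 seconds since the last request it becomes eligible for deletion.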
Deployment details
Plan
$ per month
Approximately: $ per hour
Type
Regions