dstack is a critical piece of the agentic stack because it provides the programmable compute layer that autonomous agents require to execute compute-heavy tasks. While many AI agents operate within the constraints of a browser or a text-based environment, 'agentic orchestration' lets these systems transcend those limits. If an agent determines it needs to fine-tune a model or run a high-intensity simulation, dstack provides the API-driven infrastructure to spin up the necessary GPU resources, execute the containerized task, and tear the resources down once the work completes.
Within the broader ecosystem, dstack occupies the infrastructure management layer. It is active in the space where software intent meets hardware reality. By championing a model in which infrastructure is treated as a dynamic resource rather than a static cluster, the project pushes toward a future where agents can manage their own compute budgets and hardware requirements autonomously. This makes dstack a fundamental tool for developers building agents that need to perform real-world engineering or data science tasks.
The infrastructure requirements for artificial intelligence differ fundamentally from those of standard web applications. While tools like Kubernetes are the standard for CPU-based container management, they often introduce excessive overhead for machine learning teams who simply need to run a fine-tuning job or deploy an inference service. dstack is an open-source control plane designed to bridge this gap. It provides a specialized layer for GPU provisioning that is intended to be more accessible than Slurm and more focused than Kubernetes.
At its core, dstack allows users to define their infrastructure requirements—specifying vCPUs, RAM, and specific GPU types—within a configuration file. The platform then automates the process of finding and provisioning these resources across various environments. This includes major cloud providers like AWS, GCP, and Azure, as well as specialized GPU clouds such as Lambda, CoreWeave, and FluidStack. By providing a unified interface for these disparate backends, the software prevents teams from being locked into a single provider's ecosystem or pricing model.
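To make this concrete, a minimal dstack task configuration might look like the sketch below. The name, container image, and resource values are hypothetical placeholders, not taken from the source; consult the dstack documentation for the exact schema.

```yaml
# .dstack.yml — a hypothetical task configuration (values are illustrative)
type: task
name: finetune-llm                       # arbitrary run name
image: nvcr.io/nvidia/pytorch:24.07-py3  # example container image
commands:
  - pip install -r requirements.txt
  - python train.py
resources:
  cpu: 16          # vCPUs
  memory: 64GB     # RAM
  gpu: A100:1      # one NVIDIA A100
```

Given a file like this, dstack searches its configured backends for an instance matching the requested resources and provisions it, so the same file can run unchanged on AWS, GCP, Azure, or a specialized GPU cloud.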
The software operates by managing the lifecycle of 'runs' and 'fleets.' A run represents a specific task, such as a training job or a development environment, while a fleet allows for the reuse of existing instances to minimize startup times and costs. This abstraction is handled through a Python API or an HTTP API, allowing engineers to integrate infrastructure management directly into their code. For example, a developer can submit a task with a specific Docker image and environment variables, and dstack will handle the SSH tunneling, port forwarding, and data mounting required to execute that task on a remote GPU.
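The fleet abstraction can be sketched with a similarly shaped configuration. Again, this is an illustrative assumption rather than a verbatim example from the source; field values are placeholders.

```yaml
# fleet.dstack.yml — a hypothetical fleet definition (values are illustrative)
type: fleet
name: shared-gpu-fleet   # instances in this fleet can be reused across runs
nodes: 2                 # keep two instances provisioned for reuse
resources:
  gpu: 24GB              # minimum GPU memory per instance
```

Submitting runs against an existing fleet lets dstack reuse warm instances instead of provisioning from scratch, which is what reduces startup time and cost.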
This approach solves a common problem in AI development: the scarcity and cost of hardware. By supporting multiple backends, dstack enables a strategy where jobs are sent to the provider with the lowest current price or the highest availability. If an NVIDIA H100 is unavailable on one cloud, the orchestrator can shift the workload to another without requiring the user to rewrite their deployment scripts.
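A rough sketch of how a run configuration could express this provider flexibility is shown below. Fields like `backends` and `spot_policy` appear in dstack's configuration language, but treat the exact names and values here as assumptions to be checked against the current docs.

```yaml
# A hypothetical task that may land on whichever backend has capacity
type: task
name: train-anywhere
commands:
  - python train.py
resources:
  gpu: H100:1                  # request one H100, wherever it is available
backends: [aws, gcp, azure]    # candidate providers, matched by price/availability
spot_policy: auto              # prefer spot instances, fall back to on-demand
```

Because the workload is described declaratively, shifting it from one cloud to another requires no changes to deployment scripts; the orchestrator resolves the placement at submission time.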
Founded by Andrey Cheptsov, Vitaly Khudobakhshov, and Riwaj Sapkota, the company is headquartered in Munich, Germany. Cheptsov, who previously worked on developer tools at JetBrains, brings a focus on developer experience to the infrastructure space. The team's central thesis is that orchestration should be 'agentic.' In this context, agentic means the control plane does not just wait for static commands; it actively manages the state of the infrastructure to fulfill the user's intent, handling failures, scaling, and provisioning autonomously.
dstack is primarily used by ML engineers and DevOps teams at startups and research labs who find Kubernetes too complex to maintain. It is particularly relevant for those running distributed training across multiple nodes or serving large language models that require dynamic scaling. By remaining open-source, the project attracts a community of users who value portability and transparency in their infrastructure stack, positioning dstack as a key component in the growing move toward decentralized and multi-cloud AI computing.