Case Study | Building Flexible GPU Clouds with HAMi at DaoCloud
Discover how DaoCloud operates two major cloud native platforms for AI workloads—D.run Compute Cloud and DaoCloud Enterprise—using HAMi to achieve >80% GPU utilization across 10,000+ GPUs spanning 10+ data centers.
Company Overview
DaoCloud operates two major cloud native platforms for AI workloads. D.run Compute Cloud is a public GPU cloud serving individual developers and small teams, while DaoCloud Enterprise (DCE) is a private Kubernetes platform for enterprise customers running both training and inference.
Two major cloud native platforms
D.run Compute Cloud for public GPU cloud
DaoCloud Enterprise (DCE) for private K8s
10+ data centers across China and Hong Kong
DaoCloud
Cloud native platform provider for AI workloads
Challenges in GPU Resource Management
As GPU demand grew rapidly across both platforms, several challenges emerged that required a flexible GPU virtualization solution.
Whole-card Allocation
With GPUs allocated as whole cards, many inference and lightweight workloads used only a fraction of each GPU. This left significant compute and memory idle and limited how DaoCloud could package GPU SKUs.
Heterogeneous Hardware
DaoCloud needed to support mainstream NVIDIA GPUs while also integrating domestic accelerators from multiple vendors. Proprietary vGPU solutions increased licensing costs.
Multi-tenant Governance
On DCE, enterprise customers wanted shared GPU pools with department-level quotas, queue-based resource allocation, and clear isolation across teams.
Cloud Native Alignment
DaoCloud's core strategy revolves around Kubernetes and open-source technologies. Any GPU sharing solution had to stay fully cloud native, vendor-agnostic, and compatible with existing CNCF tooling.
HAMi as the Unified GPU Layer
DaoCloud adopted HAMi, a CNCF Sandbox project for heterogeneous AI computing virtualization, as the unified GPU layer across both D.run and DCE. HAMi provides device virtualization, vGPU partitioning, and scheduling for heterogeneous accelerators in Kubernetes clusters.
D.run Compute Cloud: vGPU SKUs for Public GPU Users
On D.run, DaoCloud integrated HAMi into each regional Kubernetes cluster to enable fine-grained GPU sharing and higher utilization.
vGPU Slicing
Physical GPUs are partitioned into multiple vGPU slices, each with a defined share of compute and memory, so lightweight inference jobs can run on fractional GPUs.
SKU-based Marketplace
vGPU slices are exposed as standardized SKUs in a central marketplace. Users select GPU SKUs based on workload size.
Multi-region Deployment
HAMi powers 7 active D.run regions across Mainland China and Hong Kong, covering over 10 data centers.
Domestic Accelerator Support
DaoCloud extended HAMi to support domestic GPU vendors, ensuring consistent management under a unified abstraction layer.
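To make the slicing and SKU ideas above concrete, here is a minimal Python sketch of how a marketplace SKU could be translated into a Kubernetes Pod request for one vGPU slice. The SKU names and sizes, the `VGpuSku` class, and the helper functions are hypothetical and are not D.run's actual catalog or code. The resource names `nvidia.com/gpu`, `nvidia.com/gpumem`, and `nvidia.com/gpucores` are assumed to match HAMi's conventions for NVIDIA devices and should be checked against the deployed HAMi version.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VGpuSku:
    """A hypothetical marketplace SKU describing one vGPU slice."""
    name: str
    memory_mib: int    # device memory granted to the slice
    core_percent: int  # share of the GPU's compute, 1-100


# Illustrative catalog; names and sizes are assumptions, not real D.run SKUs.
CATALOG = {
    "small": VGpuSku("small", memory_mib=4096, core_percent=25),
    "medium": VGpuSku("medium", memory_mib=8192, core_percent=50),
}


def pod_manifest(pod_name: str, image: str, sku: VGpuSku) -> dict:
    """Build a Pod manifest that requests one vGPU slice sized by the SKU."""
    if not 1 <= sku.core_percent <= 100:
        raise ValueError("core_percent must be between 1 and 100")
    # Resource names assumed to follow HAMi's NVIDIA conventions.
    limits = {
        "nvidia.com/gpu": "1",                         # one vGPU
        "nvidia.com/gpumem": str(sku.memory_mib),      # device memory in MiB
        "nvidia.com/gpucores": str(sku.core_percent),  # percent of GPU cores
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name, "labels": {"sku": sku.name}},
        "spec": {
            "containers": [
                {"name": "main", "image": image, "resources": {"limits": limits}}
            ]
        },
    }


def slices_per_gpu(sku: VGpuSku, gpu_memory_mib: int) -> int:
    """Count how many slices of this SKU fit on one card without over-subscription."""
    by_memory = gpu_memory_mib // sku.memory_mib
    by_compute = 100 // sku.core_percent
    return min(by_memory, by_compute)


if __name__ == "__main__":
    manifest = pod_manifest("demo-inference", "example/inference:latest", CATALOG["small"])
    print(json.dumps(manifest, indent=2))
    print(slices_per_gpu(CATALOG["small"], gpu_memory_mib=24576))
```

In this sketch a slice is bounded by both memory and compute, so the number of slices per card is whichever limit is reached first. A real deployment may relax one of the two limits, for example when memory over-subscription is enabled.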
DaoCloud Enterprise (DCE): Shared GPU Pool for Enterprise
On DCE, DaoCloud built a centralized GPU resource pool using HAMi, unifying GPU capacity for multiple enterprise tenants.
Unified GPU Pool
Enterprise users contribute and consume GPUs from a central pool that serves both training and inference workloads.
Quotas & RBAC
HAMi's vGPU resources are integrated with DaoCloud's existing quota and role-based access systems.
Simplified Experience
Algorithm engineers request GPU resources through the platform without worrying about underlying hardware differences.
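The source does not describe how DaoCloud's quota system is implemented. As a generic illustration of department-level quotas on a shared vGPU pool, the following Python sketch admits a request only when it fits within the requesting department's remaining allowance. The class names, fields, and units are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DepartmentQuota:
    """Hypothetical per-department cap on vGPU resources in the shared pool."""
    max_vgpus: int
    max_memory_mib: int
    used_vgpus: int = 0
    used_memory_mib: int = 0


@dataclass
class GpuPool:
    """Tracks quota usage per department; not DaoCloud's actual implementation."""
    quotas: dict = field(default_factory=dict)

    def try_allocate(self, department: str, vgpus: int, memory_mib: int) -> bool:
        """Admit the request only if it stays within the department's quota."""
        quota = self.quotas.get(department)
        if quota is None:
            return False  # departments without a quota cannot allocate
        if quota.used_vgpus + vgpus > quota.max_vgpus:
            return False
        if quota.used_memory_mib + memory_mib > quota.max_memory_mib:
            return False
        quota.used_vgpus += vgpus
        quota.used_memory_mib += memory_mib
        return True

    def release(self, department: str, vgpus: int, memory_mib: int) -> None:
        """Return resources to the department's allowance, never below zero."""
        quota = self.quotas[department]
        quota.used_vgpus = max(0, quota.used_vgpus - vgpus)
        quota.used_memory_mib = max(0, quota.used_memory_mib - memory_mib)


if __name__ == "__main__":
    pool = GpuPool({"research": DepartmentQuota(max_vgpus=2, max_memory_mib=16384)})
    print(pool.try_allocate("research", vgpus=1, memory_mib=8192))
    print(pool.try_allocate("research", vgpus=2, memory_mib=8192))
```

Checking both the vGPU count and the memory total keeps one team from exhausting the pool with many small slices or a few very large ones.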
Co-developing HAMi with the Community
DaoCloud has been one of HAMi's earliest and most active contributors.
Contributed real-world insights from D.run and DCE back to the open-source community
Collaborated upstream to improve GPU over-subscription mechanisms, node configuration management, and heterogeneous hardware handling
Helped maintain documentation and deployment guides to support production adoption by other cloud providers
Significant Results: Cost Reduction and Improved Efficiency
By integrating HAMi, DaoCloud consolidated previously fragmented GPU resources into a more unified, efficient, and scalable GPU layer across both public and private clouds.
GPU Utilization
>80%
Average utilization per card after HAMi deployment
Cost Reduction
20-30%
Reduction in GPU-related operating costs
Unified Abstraction
Single Layer
Across NVIDIA and domestic GPUs
Deployment Scale
10,000+ GPUs
Across 10+ data centers
Multi-region
7 Regions
Active D.run regions across Mainland China and Hong Kong
Open Collaboration
Active
Contributing improvements upstream
“HAMi is more than compatible with DaoCloud's business, it's something we've built together. As one of HAMi's earliest contributors, we've witnessed its evolution from inception to maturity. HAMi now runs across both D.run and DCE, and our real-world improvements continuously flow back to the community. HAMi and DaoCloud share the same open-source DNA, and we'll continue contributing to HAMi to bring true vGPU technology to the world.”
Open-Source Partnership Success
DaoCloud has integrated HAMi across both its public and private GPU cloud platforms, raising average GPU utilization above 80% and cutting GPU-related operating costs by 20-30% while contributing improvements back to the open-source community.