Plan your CycleCloud production deployment
Before you deploy Azure CycleCloud in a production environment, you need to carefully plan your infrastructure, configuration, and operational processes. This article provides guidance on key decisions and requirements to ensure a successful and reliable CycleCloud deployment. It covers initial setup, application integration, data management, and disaster recovery.
Azure CycleCloud deployment
- Choose the version of CycleCloud to deploy
- Prepare your Azure subscription by choosing the subscription, virtual network, subnet, and resource group for the CycleCloud server deployment
- Choose the resource group to host clusters or let CycleCloud create the resource group (default setting)
- Create a storage account for locker access
- Decide if you want to use SSH keys, Microsoft Entra ID, or LDAP for authentication
- Confirm which SKU to use for CycleCloud: CycleCloud System Requirements
- Decide if you want to deploy the environment in a locked-down network. If so, review the requirements in: Operating in a locked down network
- Deploy the CycleCloud server
Warning
Don't select "Enable hierarchical namespace" (Azure Data Lake Storage Gen2) when you create the storage account. CycleCloud can't use a Blob storage account with ADLS Gen2 enabled as a storage locker.
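A quick pre-flight check can catch this setting before you point a locker at the account. The following is a minimal sketch, not part of CycleCloud. It assumes you exported the storage account's properties as JSON (for example, from `az storage account show`) and that the hierarchical namespace setting appears under an `isHnsEnabled` key. The function name `is_valid_locker_account` and the sample account name are hypothetical.

```python
import json


def is_valid_locker_account(account_json: str) -> bool:
    """Return True if the storage account looks usable as a CycleCloud locker.

    Assumption: the exported JSON carries an `isHnsEnabled` key when
    hierarchical namespace is turned on. A missing or null key is treated
    as "not enabled".
    """
    account = json.loads(account_json)
    return not bool(account.get("isHnsEnabled"))


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = '{"name": "examplelocker", "isHnsEnabled": true}'
    if not is_valid_locker_account(sample):
        print("Hierarchical namespace is enabled; choose a different account.")
```

Running a check like this as part of your deployment script lets you fail early, before the CycleCloud server is configured.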
Azure CycleCloud configuration
- Sign in to the CycleCloud server and create a site and a CycleCloud admin account: CycleCloud Setup
- Create a CycleCloud locker that points to the storage account
Azure CycleCloud cluster configuration
- Define user access to the clusters: Cluster User Management
- Choose the scheduler to use.
- Choose the version for the scheduler and head node.
- Choose the versions for the compute and execute nodes. This choice depends entirely on the application you're running.
- Decide whether to deploy clusters using a template or manually:
- Define and upload cluster templates to the locker: Cluster Template Reference.
- Manually create a cluster: Create a New Cluster.
- Decide if you need to run any scripts on the scheduler or execute nodes once deployed
Applications
- What dependencies (libraries, and so on) do the applications have? How will you make these dependencies available?
- How long does it take to set up and install an application? This factor might determine how you make the application available to the execute nodes. It might also require a custom image.
- Are there any license dependencies that you need to consider? Does the application need to contact an on-premises license server?
- Where will you execute the applications? This choice depends on install times and performance requirements:
- Through a custom image
- Using a marketplace image
- From an NFS share, blob storage, or Azure NetApp Files
- Do the applications need to run on a specific VM version? Is MPI a requirement? If it is, you'll need a different family of machines, like the H series.
- What's the best number of cores per job for each application?
- Can you use Spot VMs? See: Using Spot VMs in CycleCloud
- Make sure you have the right subscription quotas to meet the core requirements for the applications.
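The quota check in the last item is simple arithmetic: multiply cores per VM by the peak VM count for each node type, total the result per VM family, and compare it with the quota you have. The sketch below shows one way to do that. The `NodeArrayPlan` type, the function name, the family labels, and all numbers are hypothetical. It assumes you've already looked up your per-family quota yourself, and it doesn't cover the separate total regional vCPU limit.

```python
from dataclasses import dataclass


@dataclass
class NodeArrayPlan:
    """Planned peak usage for one group of identical VMs (hypothetical type)."""
    vm_family: str     # label for the VM family the quota applies to
    cores_per_vm: int
    max_vms: int


def quota_shortfalls(plans, family_quota):
    """Return {vm_family: missing_cores} for each family that lacks quota."""
    required = {}
    for plan in plans:
        cores = plan.cores_per_vm * plan.max_vms
        required[plan.vm_family] = required.get(plan.vm_family, 0) + cores
    return {
        family: need - family_quota.get(family, 0)
        for family, need in required.items()
        if need > family_quota.get(family, 0)
    }


if __name__ == "__main__":
    # Illustrative numbers only; substitute your own plan and quota values.
    plans = [
        NodeArrayPlan("hpc_family", cores_per_vm=120, max_vms=10),
        NodeArrayPlan("general_family", cores_per_vm=4, max_vms=1),
    ]
    quota = {"hpc_family": 960, "general_family": 100}
    print(quota_shortfalls(plans, quota))  # {'hpc_family': 240}
```

An empty result means the plan fits within the per-family quota you supplied. Any entry is the number of additional cores to request before deployment.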
Data
- Determine where in Azure the input data resides. The right location depends on the applications' performance requirements and the data size:
- Locally on the execute nodes
- From an NFS share
- In blob storage
- Using Azure NetApp Files
- Determine if there's any post-processing needed on the output data
- Decide where the output data resides once processing is complete
- Decide if the output data needs to be copied elsewhere
- Determine archive and backup requirements
Job submission
- How do users submit jobs?
- Do users have a script to run on the scheduler VM, or is there a frontend to help with data upload and job submission?
Backup and disaster recovery
- Will you use templates for cluster creation? Using templates makes recreating a CycleCloud server faster and keeps deployments consistent.
- What are your disaster recovery requirements? What would happen to your business if an Azure region wasn't available when you expected?
- Has your business defined any SLAs for the applications?
- Can you use another region as a standby?
- Are your jobs long running? Would checkpointing help?
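To make the checkpointing question concrete, here's a minimal sketch of a job that periodically saves its progress and resumes from the last saved state after an interruption. It isn't a CycleCloud feature. The function names, the JSON state format, and the `stop_after` parameter (which simulates an interruption) are all hypothetical. It also assumes the checkpoint file lives on storage that survives the loss of the node, such as a shared file system.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # atomic rename within the same file system


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"next_step": 0, "total": 0}


def run_job(path, steps, checkpoint_every=10, stop_after=None):
    """Sum the integers 0..steps-1, resuming from the last checkpoint.

    `stop_after` simulates an interruption: the job returns None when it
    reaches that step, and unsaved progress since the last checkpoint is lost.
    """
    state = load_checkpoint(path)
    for step in range(state["next_step"], steps):
        if stop_after is not None and step >= stop_after:
            return None
        state["total"] += step
        state["next_step"] = step + 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        ckpt = os.path.join(workdir, "job.ckpt")
        run_job(ckpt, 100, stop_after=55)  # interrupted partway through
        print(run_job(ckpt, 100))          # resumes from step 50 and finishes
```

The checkpoint interval is a trade-off: saving more often loses less work after an interruption but adds I/O overhead to every run.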
