cloudera.cloud.ml module – Create or Destroy CDP Machine Learning Workspaces

Note

This module is part of the cloudera.cloud collection (version 2.5.1).

It is not included in ansible-core. To check whether it is installed, run ansible-galaxy collection list.

To install it, use: ansible-galaxy collection install cloudera.cloud. This module has additional requirements; see Requirements for details.

To use it in a playbook, specify: cloudera.cloud.ml.
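For repeatable installs, the collection can also be pinned in a Galaxy requirements file. The sketch below assumes a file named `requirements.yml` (a common convention, not something this module mandates) and assumes the version stated above is available from your configured Galaxy server.

```yaml
# requirements.yml (conventional file name; adjust to your project layout)
collections:
  - name: cloudera.cloud
    version: "2.5.1"   # the collection version this page documents
```

It can then be installed with `ansible-galaxy collection install -r requirements.yml`.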

Synopsis 

Create or Destroy CDP Machine Learning Workspaces

Requirements 

The below requirements are needed on the host that executes this module.

cdpy

Parameters 

Suboptions are indented beneath their parent parameter.

Parameter	Type	Comments
cdp_region (aliases: cdp_endpoint_region, endpoint_region)	string	Specify the Cloudera Data Platform endpoint region. Default: `"default"`
database (aliases: existing_database, database_config)	dictionary	Configuration for exporting model metrics to an existing Postgres database.
  existingDatabaseHost	string	The Postgres hostname.
  existingDatabaseName	string	The Postgres database name.
  existingDatabasePassword	string	The Postgres password.
  existingDatabasePort	string	The Postgres port.
  existingDatabaseUser	string	The Postgres user.
debug (aliases: debug_endpoints)	boolean	Capture the CDP SDK debug log. Choices: `false` (default), `true`
delay (aliases: polling_delay)	integer	The internal polling interval (in seconds) while the module waits for the ML Workspace to achieve the declared state. Default: `15`
environment (aliases: env)	string / required	The name of the Environment for the ML Workspace.
force (aliases: force_delete)	boolean	Flag to force delete a workspace even if errors occur during deletion. Force delete removes the guarantee that the cloud provider resources are destroyed. Applicable to `state=absent` only. Choices: `false` (default), `true`
governance (aliases: enable_governance)	boolean	The flag to enable governance by integrating with Cloudera Atlas for the ML Workspace. Choices: `false` (default), `true`
ip_addresses (aliases: loadbalancer_access_ips)	list / elements=string	List of allowed CIDR blocks for the load balancer.
k8s_request (aliases: provision_k8s)	dictionary	Configuration for the Kubernetes provisioning of the ML Workspace.
  environmentName	string / required	The Environment for the ML Workspace.
  instanceGroups	list / elements=dictionary / required	The instance groups for the ML Workspace provisioning request.
    autoscaling	dictionary	The autoscaling configuration for the instance group.
      enabled	boolean	The flag enabling autoscaling. Choices: `false`, `true` (default)
      maxInstance	integer / required	The maximum number of instances.
      minInstance	integer / required	The minimum number of instances.
    ingressRules	list / elements=string	The networking rules for the ingress.
    instanceCount	integer	The initial number of instances. Default: `0`
    instanceTier	string	The provision tier of the instances. For example, `ON_DEMAND`.
    instanceType	string / required	The cloud provider instance type for the instance. For example, (AWS) `m5.2xlarge`.
    name	string	A unique name of the instance group.
    rootVolume	dictionary	Configuration of the root volume for each instance.
      size	integer / required	The volume size (in GB).
  network	dictionary	The overlay network for the Container Network Interface (CNI). AWS only.
    plugin	string	The identifier for the specific Container Network Interface (CNI) vendor. For example, `calico`, `weave`.
    topology	dictionary	The options for overlay topology.
      subnets	list / elements=string	Configuration for the topology subnets.
  tags	dictionary	Tags to add to the cloud provider resources.
    key	string	The key/value pair for the tag.
metrics (aliases: enable_metrics)	boolean	The flag to enable the exporting of model metrics to a metrics store for the ML Workspace. Choices: `false` (default), `true`
monitoring (aliases: enable_monitoring)	boolean	The flag to manage monitoring for the ML Workspace. Choices: `false` (default), `true`
name (aliases: workspace)	string / required	The name of the ML Workspace.
nfs (aliases: existing_nfs)	string	An existing NFS mount (hostname and desired path). Applicable to Azure and Private Cloud deployments only.
nfs_version	string	The NFS protocol version of the NFS server as declared in `nfs`. Applicable to Azure and Private Cloud deployments only.
private_cluster (aliases: enable_private_cluster)	boolean	Flag to specify if a private K8s cluster should be created. Choices: `false` (default), `true`
profile	string	If provided, the CDP SDK will use this value as its profile.
public_loadbalancer (aliases: enable_public_loadbalancer)	boolean	Flag to manage the usage of a public load balancer. Choices: `false` (default), `true`
state	string	The declarative state of the ML Workspace. Choices: `"present"` (default), `"absent"`
storage (aliases: remove_storage)	boolean	Flag to delete the ML Workspace backing storage during delete operations. Applicable to `state=absent` only. Choices: `false`, `true` (default)
timeout (aliases: polling_timeout)	integer	The internal polling timeout (in seconds) while the module waits for the ML Workspace to achieve the declared state. Default: `3600`
tls (aliases: enable_tls)	boolean	The flag to manage TLS for the ML Workspace. Choices: `false`, `true` (default)
verify_endpoint_tls (aliases: endpoint_tls)	boolean	Verify the TLS certificates for the CDP endpoint. Choices: `false`, `true` (default)
wait	boolean	Flag to enable internal polling to wait for the ML Workspace to achieve the declared state. If set to `false`, the module will return immediately. Choices: `false`, `true` (default)
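To illustrate how the nested `database` suboptions fit together, here is a minimal sketch of a task that points model metrics at an existing Postgres database. The hostname, database name, user, and the `ml_metrics_db_password` variable are placeholders invented for this illustration. Setting `metrics: true` alongside `database` is an assumption; the parameter descriptions do not state whether both are required.

```yaml
# Sketch only: all connection values below are hypothetical placeholders.
- name: Create an ML Workspace that exports model metrics to an existing Postgres database
  cloudera.cloud.ml:
    name: ml-metrics-example
    env: cdp-env
    metrics: true                                   # assumed to be needed together with `database`
    database:
      existingDatabaseHost: postgres.example.internal
      existingDatabaseName: ml_metrics
      existingDatabasePort: "5432"                  # quoted because the suboption is typed as a string
      existingDatabaseUser: ml_metrics_user
      existingDatabasePassword: "{{ ml_metrics_db_password }}"  # hypothetical variable, e.g. stored in Ansible Vault
```

Keeping the password in a variable rather than inline avoids committing the secret with the playbook.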

Examples 

# Note: These examples do not set authentication details.

# Create an ML Workspace with TLS turned off and wait for setup completion
- cloudera.cloud.ml:
    name: ml-example
    env: cdp-env
    tls: no
    wait: yes
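As a further sketch, the boolean feature flags and the load balancer allow list from the parameter table can be combined in one task. The workspace name and CIDR block below are placeholders, and how `public_loadbalancer` and `ip_addresses` interact is an assumption not confirmed by the parameter descriptions.

```yaml
# Sketch only: name and CIDR block are hypothetical placeholders.
- name: Create an ML Workspace with governance, monitoring, and restricted load balancer access
  cloudera.cloud.ml:
    name: ml-restricted-example
    env: cdp-env
    governance: true
    monitoring: true
    public_loadbalancer: true
    ip_addresses:
      - 203.0.113.0/24    # reserved documentation range; replace with your own CIDR blocks
```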

# Create an ML Workspace (in AWS) with a custom Kubernetes request configuration
- cloudera.cloud.ml:
    name: ml-k8s-example
    env: cdp-env
    k8s_request:
      environmentName: cdp-env
      instanceGroups:
        - name: default_settings
          autoscaling:
            maxInstances: 10
            minInstances: 1
          instanceType: m5.2xlarge
        - name: cpu_settings
          autoscaling:
            maxInstances: 10
            minInstances: 1
          instanceCount: 0
          instanceTier: "ON_DEMAND"
          instanceType: m5.2xlarge
          rootVolume:
            size: 60
        - name: gpu_settings
          autoscaling:
            maxInstances: 1
            minInstances: 0
          instanceCount: 0
          instanceTier: "ON_DEMAND"
          instanceType: "p2.8xlarge"
          rootVolume:
            size: 40
    wait: yes

# Remove an ML Workspace, but return immediately
- cloudera.cloud.ml:
    name: ml-example
    env: cdp-env
    state: absent
    wait: no
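The deletion-related parameters (`force`, `storage`) and the polling parameters (`delay`, `timeout`) can be combined as in the following sketch. The chosen interval and timeout are arbitrary illustration values, not recommendations.

```yaml
# Sketch only: polling values are illustrative.
- name: Force removal of an ML Workspace while keeping its backing storage
  cloudera.cloud.ml:
    name: ml-example
    env: cdp-env
    state: absent
    force: true       # continue even if errors occur; cloud resources may be left behind
    storage: false    # do not delete the backing storage (the default is true)
    wait: true
    delay: 30         # poll every 30 seconds
    timeout: 1800     # stop waiting after 30 minutes
```

Because `force` removes the guarantee that cloud provider resources are destroyed, it is worth checking the cloud account afterwards for leftover resources.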

Return Values 

Common return values are described in the Ansible documentation. The following fields are unique to this module:

Nested keys are indented beneath their parent key.

Key	Type	Description
sdk_out	string	Returns the captured CDP SDK log. Returned: when supported
sdk_out_lines	list / elements=string	Returns a list of each line of the captured CDP SDK log. Returned: when supported
workspace	dictionary	The information about the ML Workspace. Returned: when supported
  cloudPlatform	string	The cloud platform of the environment that was used to create this workspace. Returned: always
  clusterBaseDomain	string	The base domain of the cluster. Returned: when supported
  creationDate	string	Creation date of the workspace (date-time). Returned: always Sample: `"2021-05-19T15:35:17.997000+00:00"`
  creatorCrn	string	The CRN of the creator of the workspace. Returned: always
  crn	string	The CRN of the workspace. Returned: always
  endpointPublicAccess	boolean	Flag indicating if the cluster is publicly accessible. Returned: always
  environmentCrn	string	CRN of the environment. Returned: always
  environmentName	string	The name of the workspace’s environment. Returned: always
  failureMessage	string	Failure message from the most recent failure that has occurred during workspace provisioning. Returned: during failure
  filesystemID	string	A filesystem ID referencing the filesystem that was created on the cloud provider environment that this workspace uses. Returned: always
  governanceEnabled	boolean	Flag indicating if Cloudera Atlas governance is enabled for the cluster. Returned: when supported
  healthInfoLists	list / elements=dictionary	The health information of the workspace. Returned: success
    HealthInfo	list / elements=string	HealthInfo object containing the health information of a resource. Returned: always
      details	list / elements=string	The detail of the health info. Returned: always
      isHealthy	boolean	The boolean that indicates the health status. Returned: always
      message	string	The message to show for the health info. Returned: always
      resourceName	string	The resource name being checked. Returned: always
      updatedAt	string	The Unix timestamp for the heartbeat. Returned: always
  httpsEnabled	boolean	Indicates if HTTPS communication was enabled on this workspace when provisioned. Returned: always
  instanceGroups	list / elements=dictionary	The instance group details for the cluster. Returned: always
    instanceCount	integer	The initial number of instance nodes. Returned: always
    instanceGroupName	string	The unique name of the instance group. Returned: always
    instances	list / elements=dictionary	Instances in the instance group. Returned: always
      availabilityZone	string	Availability zone of the instance. Returned: always
      instanceId	string	Unique instance ID generated by the cloud provider. Returned: always
    instanceType	string	The cloud provider instance type for the node instances. Returned: always
    maxInstances	integer	The maximum number of instances that can be deployed to this instance group. Returned: always
    minInstances	integer	The minimum number of instances that can be deployed to this instance group. If the value is 0, the group might be empty. Returned: always
    tags	list / elements=dictionary	Key/value pairs applied to all applicable resources deployed in the cloud provider. Returned: always
      key	string	Tag name. Returned: always
      value	string	Tag value. Returned: always
  instanceName	string	The name of the workspace. Returned: always
  instanceStatus	string	The workspace’s current status. Returned: always
  instanceUrl	string	URL of the workspace’s user interface. Returned: always
  k8sClusterName	string	The Kubernetes cluster name. Returned: always
  loadBalancerIPWhitelists	list / elements=string	The whitelist of IP addresses for the load balancer. Returned: always
  modelMetricsEnabled	boolean	Flag indicating if model metrics export is enabled for the cluster. Returned: when supported
  monitoringEnabled	boolean	Whether usage monitoring is enabled on this workspace. Returned: always
  tags	list / elements=dictionary	Tags provided by the user at the time of workspace creation. Returned: always
    key	string	Tag name. Returned: always
    value	string	Tag value. Returned: always
  version	string	The version of Cloudera Machine Learning that was installed on the workspace. Returned: always
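As a sketch of how these return values might be consumed, the tasks below register the module result and print a few of the documented `workspace` fields. The variable name `ml_result` is an arbitrary choice for this illustration. Because `workspace` is returned only "when supported", the second task is guarded with a `when` condition.

```yaml
# Sketch only: `ml_result` is a hypothetical variable name.
- name: Create an ML Workspace and capture the result
  cloudera.cloud.ml:
    name: ml-example
    env: cdp-env
  register: ml_result

- name: Show the workspace name, status, and URL
  ansible.builtin.debug:
    msg: >-
      Workspace {{ ml_result.workspace.instanceName }}
      is {{ ml_result.workspace.instanceStatus }}
      at {{ ml_result.workspace.instanceUrl }}
  when: ml_result.workspace is defined
```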

Authors

Webster Mudge (@wmudge)
Dan Chaffelson (@chaffelson)
