cloudera.cloud.datahub_cluster module – Manage CDP Datahubs
Note
This module is part of the cloudera.cloud collection (version 2.5.1).
It is not included in ansible-core.
To check whether it is installed, run ansible-galaxy collection list.
To install it, use: ansible-galaxy collection install cloudera.cloud.
You need further requirements to be able to use this module; see Requirements for details.
To use it in a playbook, specify: cloudera.cloud.datahub_cluster.
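For repeatable installs, the collection can also be pinned in a requirements file. The following is a minimal sketch; the file name requirements.yml is the usual convention rather than something this page prescribes, and the pinned version simply mirrors the version stated in the note above.

```yaml
# requirements.yml (conventional file name, not mandated by this module)
collections:
  - name: cloudera.cloud
    version: "2.5.1"   # matches the collection version stated in this page
```

The file would then be installed with: ansible-galaxy collection install -r requirements.yml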
Synopsis
Create, start, stop, and delete CDP Datahubs.
Requirements
The below requirements are needed on the host that executes this module.
cdpy
Parameters
Parameter |
Comments |
---|---|
Name of the image catalog to use for cluster instances. |
|
Specify the Cloudera Data Platform endpoint region. Default: |
|
Capture the CDP SDK debug log. Choices:
|
|
The name or CRN of the cluster definition to use for cluster creation. |
|
The internal polling interval (in seconds) while the module waits for the datahub to achieve the declared state. Default: |
|
The CDP environment name or CRN to which the datahub will be attached. |
|
Cluster extensions for Data Hub cluster. |
|
Flag indicating if the datahub should be force deleted. This option can be used when cluster deletion fails. This removes the entry from Cloudera Datahub service. Any lingering resources have to be deleted from the cloud provider manually. Choices:
|
|
Instance group details. |
|
The attached volume configuration. This does not include root volume. |
|
The attached volume count. |
|
The attached volume size. |
|
The attached volume type. |
|
The instance group name. |
|
The instance group type. |
|
The cloud provider specific instance type to be used. |
|
Number of instances in the instance group. |
|
The names or CRNs of the recipes that would be applied to the instance group. |
|
Recovery mode for the instance group. |
|
The root volume size. |
|
The list of subnet IDs in case of multi-availability zone setup. Specifying this field overrides the datahub level subnet ID setup for the multi-availability zone configuration. |
|
The volume encryption settings. This setting does not apply to Azure, which always encrypts volumes. |
|
Enable encryption for all volumes in the instance group. Default is false. Choices:
|
|
The ARN of the encryption key to use. If nothing is specified, the default key will be used. |
|
ID of the image used for cluster instances. |
|
(AWS) Flag indicating whether to defer to the CDP Environment for availability zone/subnet placement. Useful for when you are not sure which subnet is available to the datahub cluster. Choices:
|
|
The name of the datahub. This name must be unique, must have between 5 and 20 characters, and must contain only lowercase letters, numbers, and hyphens. Names are case-sensitive. |
|
If provided, the CDP SDK will use this value as its profile. |
|
The declarative state of the datahub. If creating a datahub, the associated Environment and Datalake must be started as well. Choices:
|
|
The subnet ID in AWS, or the Subnet Name on Azure or GCP. Mutually exclusive with the subnet and subnets options. |
|
List of subnet IDs in case of multi-availability zone setup. Mutually exclusive with the subnet and subnets options. |
|
JMESPath expression to filter the subnets to be used for the load balancer. The expression will be applied to the full list of subnets for the specified environment. Each subnet in the list is an object with the following attributes: subnetId, subnetName, availabilityZone, cidr. The filter expression must only filter the list, but not apply any attribute projection. Mutually exclusive with the subnet and subnets options. |
|
Tags associated with the datahub and its resources. |
|
Name or CRN of the cluster template to use for cluster creation. |
|
The internal polling timeout (in seconds) while the module waits for the datahub to achieve the declared state. Default: |
|
Verify the TLS certificates for the CDP endpoint. Choices:
|
|
Flag to enable internal polling to wait for the datahub to achieve the declared state. If set to FALSE, the module will return immediately. Choices:
|
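Two of the parameter descriptions above lend themselves to a short illustration: the naming rule for the datahub (5 to 20 characters; lowercase letters, numbers, and hyphens only) and the JMESPath subnet filter. The sketch below is hedged accordingly: the variable datahub_name, the availability zone value, and the option name subnets_filter are assumptions, because the parameter names are not shown in this page. The filter only selects list entries and applies no attribute projection, as the description requires.

```yaml
# Hypothetical sketch: datahub_name, the zone value, and the option name
# subnets_filter are assumptions, not taken from this page.
- name: Check the datahub name against the documented constraints
  ansible.builtin.assert:
    that:
      - datahub_name is match('^[a-z0-9-]{5,20}$')
    fail_msg: "Datahub names need 5 to 20 lowercase letters, numbers, or hyphens"
  vars:
    datahub_name: example-datahub

- name: Create a datahub, filtering the environment's subnets by availability zone
  cloudera.cloud.datahub_cluster:
    name: example-datahub
    env: name-or-crn
    definition: definition-name
    # Filters the subnet list only; no attribute projection is applied.
    subnets_filter: "[?availabilityZone == 'us-east-1a']"
```

Because the filter is described as mutually exclusive with the other subnet options, the sketch sets neither a single subnet nor a subnet list.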
Examples
# Note: These examples do not set authentication details.

- name: Create a datahub specifying instance group details (and do not wait for status change)
  cloudera.cloud.datahub_cluster:
    name: datahub-name
    env: name-or-crn
    state: present
    subnet: subnet-id-for-cloud-provider
    image: image-uuid-from-catalog
    catalog: name-of-catalog-for-image
    template: template-name
    groups:
      - nodeCount: 1
        instanceGroupName: master
        instanceGroupType: GATEWAY
        instanceType: instance-type-for-cloud-provider
        rootVolumeSize: 100
        recoveryMode: MANUAL
        recipeNames: []
        attachedVolumeConfiguration:
          - volumeSize: 100
            volumeCount: 1
            volumeType: volume-type-for-cloud-provider
    tags:
      project: Arbitrary content
    wait: no

- name: Create a datahub specifying only a definition name
  cloudera.cloud.datahub_cluster:
    name: datahub-name
    env: name-or-crn
    definition: definition-name
    tags:
      project: Arbitrary content
    wait: no

- name: Stop the datahub (and wait for status change)
  cloudera.cloud.datahub_cluster:
    name: example-datahub
    state: stopped

- name: Start the datahub (and wait for status change)
  cloudera.cloud.datahub_cluster:
    name: example-datahub
    state: started

- name: Delete the datahub (and wait for status change)
  cloudera.cloud.datahub_cluster:
    name: example-datahub
    state: absent
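
The parameter descriptions mention a force-delete flag plus a polling interval and timeout, none of which the examples above exercise. A minimal sketch of a forced deletion with explicit polling might look like the following. The option names force, delay, and timeout are assumptions (the parameter names are not shown in this page), and the numeric values are purely illustrative, so check the module's parameter list before relying on them.

```yaml
# Hypothetical sketch: the option names force, delay, and timeout are assumed.
- name: Force delete the datahub and poll until it is gone
  cloudera.cloud.datahub_cluster:
    name: example-datahub
    state: absent
    force: true      # assumed name of the force-delete flag
    delay: 30        # assumed name of the polling interval, in seconds
    timeout: 1800    # assumed name of the polling timeout, in seconds
    wait: true
```

As the force-delete description states, this only removes the entry from the Datahub service; any lingering resources still have to be deleted from the cloud provider manually.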
Return Values
Common return values are documented here; the following are the fields unique to this module:
Key |
Description |
---|---|
The information about the Datahub. Returned: always |
|
The Cloudera Manager details. Returned: success |
|
CDP Platform version. Returned: when supported |
|
Cloudera Manager version. Returned: always |
|
The cloud platform. Returned: when supported |
|
The name of the cluster. Returned: always |
|
The status of the cluster. Returned: when supported |
|
The CRN of the cluster template used for the cluster creation. Returned: when supported |
|
The date when the cluster was created. Return value is a date timestamp. Returned: when supported |
|
The CRN of the credential. Returned: when supported |
|
The CRN of the cluster. Returned: always |
|
The CRN of the attached datalake. Returned: when supported |
|
The exposed service API endpoints. Returned: when supported |
|
The endpoints. Returned: always |
|
The more consumable name of the exposed service. Returned: always |
|
The related knox entry. Returned: always |
|
The SSO mode of the given service. Returned: always |
|
Flag of the access status of the given endpoint. Returned: always |
|
The name of the exposed service. Returned: always |
|
The server url for the given exposed service’s API. Returned: always |
|
The CRN of the environment. Returned: when supported |
|
The name of the environment. Returned: when supported |
|
The image details. Returned: when supported |
|
The image catalog name. Returned: when supported |
|
The image catalog URL. Returned: when supported |
|
The ID of the image used for cluster instances. This is internally generated by the cloud provider to uniquely identify the image. Returned: when supported |
|
The name of the image used for cluster instances. Returned: when supported |
|
The instance details. Returned: when supported |
|
List of availability zones associated with the instance group. Returned: when supported |
|
List of instances in this instance group. Returned: always |
|
List of volumes attached to this instance. Returned: when supported |
|
The number of volumes. Returned: when supported |
|
The size of each volume in GB. Returned: when supported |
|
The type of volumes. Returned: when supported |
|
The availability zone of the instance. Returned: when supported |
|
Flag indicating if Cloudera Manager has been deployed or not. Returned: when supported |
|
The fully-qualified domain name (FQDN) of the instance. Returned: when supported |
|
The ID of the given instance. Returned: always |
|
The name of the instance group associated with the instance. Returned: when supported |
|
The type of the given instance. Values are Returned: always |
|
The VM type of the instance. Supported values depend on the cloud platform. Returned: when supported |
|
The private IP of the given instance. Returned: when supported |
|
The public IP of the given instance. Returned: when supported |
|
The rack ID of the instance in Cloudera Manager. Returned: when supported |
|
The SSH port for the instance. Returned: when supported |
|
The health state of the instance. Returned: always |
|
The status of the instance. This includes information like whether the instance is being provisioned, stopped, decommissioning failures etc. Returned: when supported |
|
The reason for the current status of this instance. Returned: when supported |
|
The subnet ID of the instance. Returned: when supported |
|
The name of the instance group where the given instance is located. Returned: always |
|
The list of subnet IDs in case of multi-availability zone setup. Returned: when supported |
|
The cluster node count. Returned: when supported |
|
The status of the stack. Returned: when supported |
|
The status reason. Returned: when supported |
|
The workload type for the cluster. Returned: when supported |
|
Returns the captured CDP SDK log. Returned: when supported |
|
Returns a list of each line of the captured CDP SDK log. Returned: when supported |
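
To work with these return values in a playbook, the module result can be registered and inspected. The sketch below is illustrative only: the variable name datahub_result is hypothetical, and the key names datahub, crn, and status are assumptions inferred from the descriptions above, since the key names themselves are not shown in this page. The first debug task prints the whole result, which is the safest way to confirm the actual structure.

```yaml
# Hypothetical sketch: datahub_result and the keys datahub, crn, and status
# are assumptions inferred from the return value descriptions.
- name: Start the datahub and capture the module result
  cloudera.cloud.datahub_cluster:
    name: example-datahub
    state: started
  register: datahub_result

- name: Show everything the module returned
  ansible.builtin.debug:
    var: datahub_result

- name: Show the cluster CRN and status (key names are assumptions)
  ansible.builtin.debug:
    msg: >-
      CRN={{ datahub_result.datahub.crn | default('unknown') }},
      status={{ datahub_result.datahub.status | default('unknown') }}
```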