cloudera.cloud.datalake module – Manage CDP Datalakes
Note
This module is part of the cloudera.cloud collection (version 2.5.1).
It is not included in ansible-core.
To check whether it is installed, run ansible-galaxy collection list.
To install it, use: ansible-galaxy collection install cloudera.cloud.
You need further requirements to be able to use this module; see Requirements for details.
To use it in a playbook, specify: cloudera.cloud.datalake.
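For repeatable installs, the collection can also be pinned in a requirements file and installed with `ansible-galaxy collection install -r requirements.yml`. The following is a minimal sketch; the file name is only a convention, and the pinned version is simply the one stated in the note above.

```yaml
# requirements.yml (conventional file name, not mandated by this module)
collections:
  - name: cloudera.cloud
    version: "2.5.1"  # version stated in the note above; adjust as needed
```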
Synopsis
Create and delete CDP Datalakes.
To start and stop a datalake, use the cloudera.cloud.env module to change the associated CDP Environment’s state.
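As a rough sketch of the start/stop workflow described above, the tasks below change the state of the environment rather than the datalake. They assume that cloudera.cloud.env accepts `name` and `state` parameters and that `stopped` and `started` are valid states; check that module's documentation before relying on this. The environment name is a hypothetical placeholder.

```yaml
# Assumption: cloudera.cloud.env supports `name` plus the states
# `stopped` and `started`. The environment name is a placeholder.
- name: Stop the environment, which also stops its datalake
  cloudera.cloud.env:
    name: example-environment
    state: stopped

- name: Start the environment (and its datalake) again
  cloudera.cloud.env:
    name: example-environment
    state: started
```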
Requirements
The below requirements are needed on the host that executes this module.
cdpy
Parameters
Parameter |
Comments |
---|---|
Specify the Cloudera Data Platform endpoint region. Default: |
|
Capture the CDP SDK debug log. Choices:
|
|
The internal polling interval (in seconds) while the module waits for the datalake to reach the declared state. Default: |
|
The CDP environment name or CRN to which the datalake will be attached. If the environment is AWS-based, instance_profile and storage must be present. Choices:
|
|
Flag indicating if the datalake should be force deleted. This option can be used when cluster deletion fails. This removes the entry from Cloudera Datalake service. Any lingering resources have to be deleted from the cloud provider manually. Choices:
|
|
(AWS) The IAM instance profile of the ID Broker role, which can assume the Datalake Admin S3 role. (Azure) The URI of the Identity of the ID Broker Role, which can assume the Datalake Admin ADLS role. (GCP) The Service Account email of the ID Broker Role, which can assume the Datalake Admin GCS role. |
|
(AWS) Flag indicating if the datalake is deployed across multi-availability zones. Choices:
|
|
The name of the datalake. This name must be unique, must have between 5 and 100 characters, and must contain only lowercase letters, numbers, and hyphens. Names are case-sensitive. |
|
If provided, the CDP SDK will use this value as its profile. |
|
Flag indicating if Ranger RAZ fine-grained access should be enabled for the datalake. Choices:
|
|
Recipes that will be attached to the datalake instance groups. |
|
Datalake instance/host group name, e.g. `master` or `idbroker`. |
|
Names of the recipes |
|
The Cloudera Runtime version for the datalake, when supported |
|
The scale of the datalake. Note that the choice of MEDIUM_DUTY_HA is unsupported since datalake version 7.2.18. Choices:
|
|
The declarative state of the datalake. If creating a datalake, the associated environment must be started as well. Choices:
|
|
(AWS) The S3 bucket (and optional path) for the Storage Location Base for the datalake, starting with (Azure) The ADLS bucket URI (and optional path) for the Datalake storage (GCP) The bucket name and optional path for the GCS Storage Location Base for the Datalake, starting with |
|
Tags associated with the datalake and its resources. |
|
The internal polling timeout (in seconds) while the module waits for the datalake to achieve the declared state. Default: |
|
Verify the TLS certificates for the CDP endpoint. Choices:
|
|
Flag to enable internal polling to wait for the datalake to achieve the declared state. If set to FALSE, the module will return immediately. Choices:
|
Examples
# Note: These examples do not set authentication details.

# Create a datalake in AWS
- cloudera.cloud.datalake:
    name: example-datalake
    state: present
    environment: an-aws-environment-name-or-crn
    instance_profile: arn:aws:iam::1111104421142:instance-profile/example-role
    storage: s3a://example-bucket/datalake/data
    tags:
      project: Arbitrary content

# Create a datalake in AWS, but don't wait for completion (see datalake_info for datalake status)
- cloudera.cloud.datalake:
    name: example-datalake
    state: present
    wait: no
    environment: an-aws-environment-name-or-crn
    instance_profile: arn:aws:iam::1111104421142:instance-profile/example-role
    storage: s3a://example-bucket/datalake/data
    tags:
      project: Arbitrary content

# Delete the datalake (and wait for status change)
- cloudera.cloud.datalake:
    name: example-datalake
    state: absent
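The parameter descriptions also cover Azure, so a comparable sketch for that provider may be useful. Every value below is a hypothetical placeholder. The managed identity resource ID and `abfs://` URI formats are assumptions about what the Azure variants of `instance_profile` and `storage` expect. The follow-up task assumes that the cloudera.cloud.datalake_info module mentioned above accepts a `name` parameter.

```yaml
# Create a datalake in Azure without waiting for completion.
# All identifiers are placeholders; the value formats are assumptions.
- cloudera.cloud.datalake:
    name: example-datalake
    state: present
    wait: no
    environment: an-azure-environment-name-or-crn
    instance_profile: "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>"
    storage: "abfs://<container>@<storage-account>.dfs.core.windows.net"

# Check on the datalake later (assumes datalake_info takes `name`)
- cloudera.cloud.datalake_info:
    name: example-datalake
  register: datalake_status  # hypothetical variable name

- ansible.builtin.debug:
    var: datalake_status
```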
Return Values
Common return values are documented here; the following fields are unique to this module:
Key |
Description |
---|---|
The information about the Datalake Returned: on success |
|
AWS-specific configuration details. Returned: when supported |
|
The instance profile used for the ID Broker instance. Returned: always |
|
Azure-specific environment configuration information. Returned: when supported |
|
The managed identity used for the ID Broker instance. Returned: always |
|
The Cloudera Manager details. Returned: when supported |
|
Cloudera Manager repository URL. Returned: always |
|
Cloudera Manager server URL. Returned: when supported |
|
Cloudera Manager version. Returned: always Sample: |
|
Cloud provider of the Datalake. Returned: when supported Sample: |
|
The timestamp when the Datalake was created. Returned: when supported Sample: |
|
CRN of the CDP Credential. Returned: when supported |
|
CRN value for the Datalake. Returned: always |
|
Name of the Datalake. Returned: always |
|
Whether or not RAZ is enabled. Returned: always |
|
Details for the exposed service API endpoints of the Datalake. Returned: when supported |
|
The exposed API endpoints. Returned: always |
|
User-friendly name of the exposed service. Returned: always Sample: |
|
The related Knox entry for the service. Returned: always Sample: |
|
The Single Sign-On (SSO) mode for the service. Returned: always Sample: |
|
Flag for the access status of the service. Returned: always |
|
The name of the exposed service. Returned: always Sample: |
|
The server URL for the exposed service’s API. Returned: always Sample: |
|
CRN of the associated Environment. Returned: when supported |
|
GCP-specific environment configuration information. Returned: when supported |
|
The email ID of the service account used for the ID Broker instance. Returned: always |
|
The instance details of the Datalake. Returned: when supported |
|
Details about the instances. Returned: always |
|
The identifier of the instance. Returned: always Sample: |
|
The state of the instance. Returned: always Sample: |
|
Name of the instance group associated with the instances. Returned: always Sample: |
|
The product versions. Returned: when supported |
|
The name of the product. Returned: always Sample: |
|
The version of the product. Returned: always Sample: |
|
The region of the Datalake. Returned: when supported |
|
The status of the Datalake. Returned: when supported Sample: |
|
An explanation of the status. Returned: when supported Sample: |
|
Returns the captured CDP SDK log. Returned: when supported |
|
Returns a list of each line of the captured CDP SDK log. Returned: when supported |
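To inspect these return values in practice, one option is to register the module result and print it. The sketch below reuses the AWS example values from this page. The variable name `datalake_result` is hypothetical, and the whole result is printed rather than individual keys, so it makes no assumption about the key names.

```yaml
- name: Create a datalake and capture the module result
  cloudera.cloud.datalake:
    name: example-datalake
    state: present
    environment: an-aws-environment-name-or-crn
    instance_profile: arn:aws:iam::1111104421142:instance-profile/example-role
    storage: s3a://example-bucket/datalake/data
  register: datalake_result  # hypothetical variable name

- name: Show everything the module returned
  ansible.builtin.debug:
    var: datalake_result
```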