Module emr

class Cluster

extends CustomResource

Provides an Elastic MapReduce Cluster, a web service that makes it easy to process large amounts of data efficiently. See Amazon Elastic MapReduce Documentation for more information.

Example Usage

Enable Debug Logging

Debug logging in EMR is implemented as a step. It is highly recommended to utilize the lifecycle configuration block with ignore_changes if other steps are being managed outside of Terraform.

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const example = new aws.emr.Cluster("example", {
    // Other required arguments (for example releaseLabel and serviceRole) are omitted here.
    steps: [{
        actionOnFailure: "TERMINATE_CLUSTER",
        hadoopJarStep: {
            args: ["state-pusher-script"],
            jar: "command-runner.jar",
        },
        name: "Setup Hadoop Debugging",
    }],
});

ec2_attributes

Attributes for the Amazon EC2 instances running the job flow

  • key_name - (Optional) Amazon EC2 key pair that can be used to ssh to the master node as the user called hadoop
  • subnet_id - (Optional) VPC subnet ID where you want the job flow to launch. The cc1.4xlarge instance type cannot be specified for nodes of a job flow launched in an Amazon VPC
  • additional_master_security_groups - (Optional) String containing a comma separated list of additional Amazon EC2 security group IDs for the master node
  • additional_slave_security_groups - (Optional) String containing a comma separated list of additional Amazon EC2 security group IDs for the slave nodes
  • emr_managed_master_security_group - (Optional) Identifier of the Amazon EC2 EMR-Managed security group for the master node
  • emr_managed_slave_security_group - (Optional) Identifier of the Amazon EC2 EMR-Managed security group for the slave nodes
  • service_access_security_group - (Optional) Identifier of the Amazon EC2 service-access security group - required when the cluster runs on a private subnet
  • instance_profile - (Required) Instance profile for the EC2 instances of the cluster; the instances assume this role
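
The two additional_*_security_groups attributes take a single comma separated string rather than a list. A minimal sketch of building that string from an array of IDs is shown below; the helper name joinSecurityGroupIds and the example IDs are hypothetical and are not part of the SDK.

```typescript
// Hypothetical helper (not part of @pulumi/aws): joins security group IDs
// into the comma separated string expected by additionalMasterSecurityGroups
// and additionalSlaveSecurityGroups.
function joinSecurityGroupIds(ids: string[]): string {
    return ids
        .map((id) => id.trim())          // drop stray whitespace around each ID
        .filter((id) => id.length > 0)   // skip empty entries
        .join(",");
}

// Placeholder IDs, for illustration only.
const additionalMaster = joinSecurityGroupIds(["sg-0123abcd", " sg-4567efgh "]);
console.log(additionalMaster);
```

The resulting string could then be supplied as the value of either attribute inside ec2Attributes.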

NOTE on EMR-Managed security groups: AWS adds and maintains any missing inbound or outbound access rules on these security groups to ensure proper communication between instances in a cluster. The EMR service maintains these rules for the groups provided in emr_managed_master_security_group and emr_managed_slave_security_group; attempts to remove the required rules may succeed, only for the EMR service to re-add them within minutes. This may cause Terraform to fail to destroy an environment that contains an EMR cluster, because the EMR service does not revoke the rules it added when the cluster is deleted, leaving a cyclic dependency between the security groups that prevents their deletion. To avoid this, set the optional revoke_rules_on_delete attribute on any Security Group used in emr_managed_master_security_group and emr_managed_slave_security_group. See Amazon EMR-Managed Security Groups for more information about the EMR-managed security group rules.
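
As a sketch of the workaround described in the note, the arguments for a security group intended for use as an EMR-managed group might set revokeRulesOnDelete. The interface below is a local stand-in that mirrors only the fields used here, the factory name is hypothetical, and the VPC ID is a placeholder; in a real program the object would presumably be passed to a security group resource.

```typescript
// Local stand-in for the subset of security group arguments used here.
// The camelCase field names are assumptions based on this SDK's naming
// convention; check them against the security group resource reference.
interface EmrSecurityGroupArgs {
    vpcId: string;
    description: string;
    revokeRulesOnDelete: boolean;
}

// Hypothetical factory: every group handed to EMR sets revokeRulesOnDelete
// so that rules added by the EMR service do not block deletion.
function emrManagedGroupArgs(vpcId: string, role: "master" | "slave"): EmrSecurityGroupArgs {
    return {
        vpcId,
        description: `EMR-managed security group for ${role} nodes`,
        revokeRulesOnDelete: true,
    };
}

const masterGroupArgs = emrManagedGroupArgs("vpc-00000000", "master"); // placeholder VPC ID
const slaveGroupArgs = emrManagedGroupArgs("vpc-00000000", "slave");
console.log(masterGroupArgs.revokeRulesOnDelete, slaveGroupArgs.revokeRulesOnDelete);
```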

kerberos_attributes

Attributes for Kerberos configuration

  • ad_domain_join_password - (Optional) The Active Directory password for ad_domain_join_user
  • ad_domain_join_user - (Optional) Required only when establishing a cross-realm trust with an Active Directory domain. A user with sufficient privileges to join resources to the domain.
  • cross_realm_trust_principal_password - (Optional) Required only when establishing a cross-realm trust with a KDC in a different realm. The cross-realm principal password, which must be identical across realms.
  • kdc_admin_password - (Required) The password used within the cluster for the kadmin service on the cluster-dedicated KDC, which maintains Kerberos principals, password policies, and keytabs for the cluster.
  • realm - (Required) The name of the Kerberos realm to which all nodes in a cluster belong. For example, EC2.INTERNAL
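
The required and conditional fields above can be checked before a cluster is declared. The following is a minimal sketch, assuming the camelCase field names used elsewhere in this module; the interface is a local mirror of the documented fields, and the checkKerberos helper and the rule that a join password needs a join user are illustrative assumptions rather than SDK behavior.

```typescript
// Local mirror of the documented kerberos_attributes fields (camelCase).
interface KerberosAttributes {
    kdcAdminPassword: string;
    realm: string;
    adDomainJoinUser?: string;
    adDomainJoinPassword?: string;
    crossRealmTrustPrincipalPassword?: string;
}

// Hypothetical check: returns a list of problems, empty when consistent.
function checkKerberos(attrs: KerberosAttributes): string[] {
    const problems: string[] = [];
    if (!attrs.kdcAdminPassword) problems.push("kdcAdminPassword is required");
    if (!attrs.realm) problems.push("realm is required");
    // Assumption: the join password only makes sense together with the join user.
    if (attrs.adDomainJoinPassword && !attrs.adDomainJoinUser) {
        problems.push("adDomainJoinPassword needs adDomainJoinUser");
    }
    return problems;
}

// Placeholder values, for illustration only.
console.log(checkKerberos({ kdcAdminPassword: "example-password", realm: "EC2.INTERNAL" }));
```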

instance_group

Attributes for each instance group in the cluster

  • instance_role - (Required) The role of the instance group in the cluster. Valid values are: MASTER, CORE, and TASK.
  • instance_type - (Required) The EC2 instance type for all instances in the instance group
  • instance_count - (Optional) Target number of instances for the instance group
  • name - (Optional) Friendly name given to the instance group
  • bid_price - (Optional) If set, the bid price for each EC2 instance in the instance group, expressed in USD. Setting this attribute declares the instance group as a Spot Instance group and implicitly creates a Spot request. Leave this blank to use On-Demand Instances. bid_price cannot be set for the MASTER instance group, since that group must always be On-Demand
  • ebs_config - (Optional) A list of attributes for the EBS volumes attached to each instance in the instance group. Each ebs_config defined will result in additional EBS volumes being attached to each instance in the instance group. Defined below
  • autoscaling_policy - (Optional) The autoscaling policy document. This is a JSON formatted string. See EMR Auto Scaling
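
The bid_price rule above (Spot pricing is not allowed on the MASTER group) can be sketched as a small pre-flight check. The InstanceGroupSpec interface is a local mirror of a few documented fields, and the helper names and instance types are hypothetical.

```typescript
// Local mirror of a few documented instance_group fields (camelCase).
interface InstanceGroupSpec {
    instanceRole: "MASTER" | "CORE" | "TASK";
    instanceType: string;
    instanceCount?: number;
    bidPrice?: string; // USD, as a string; omit for On-Demand Instances
}

// True when the group would be launched as Spot Instances.
function isSpot(group: InstanceGroupSpec): boolean {
    return group.bidPrice !== undefined && group.bidPrice !== "";
}

// Hypothetical check: reports every MASTER group that sets a bid price.
function checkBidPrices(groups: InstanceGroupSpec[]): string[] {
    return groups
        .filter((g) => g.instanceRole === "MASTER" && isSpot(g))
        .map((g) => `bidPrice is not allowed on the MASTER group (${g.instanceType})`);
}

// Placeholder instance types and price, for illustration only.
const exampleGroups: InstanceGroupSpec[] = [
    { instanceRole: "MASTER", instanceType: "m5.xlarge", instanceCount: 1 },
    { instanceRole: "TASK", instanceType: "m5.xlarge", instanceCount: 2, bidPrice: "0.30" },
];
console.log(checkBidPrices(exampleGroups));
```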

ebs_config

Attributes for the EBS volumes attached to each EC2 instance in the instance_group

  • size - (Required) The volume size, in gibibytes (GiB).
  • type - (Required) The volume type. Valid options are gp2, io1, standard and st1. See EBS Volume Types.
  • iops - (Optional) The number of I/O operations per second (IOPS) that the volume supports
  • volumes_per_instance - (Optional) The number of EBS volumes with this configuration to attach to each EC2 instance in the instance group (default is 1)
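
Because each ebs_config adds volumes_per_instance volumes of the given size, the additional EBS capacity per instance is the sum of size times volumes_per_instance over all configurations. A minimal sketch of that arithmetic follows; the EbsConfig interface is a local mirror of the documented fields and the helper name is hypothetical.

```typescript
// Local mirror of the documented ebs_config fields (camelCase).
interface EbsConfig {
    size: number;                 // GiB
    type: "gp2" | "io1" | "standard" | "st1";
    iops?: number;
    volumesPerInstance?: number;  // documented default is 1
}

// Total additional EBS capacity, in GiB, attached to each instance.
function totalEbsGiBPerInstance(configs: EbsConfig[]): number {
    return configs.reduce(
        (sum, c) => sum + c.size * (c.volumesPerInstance ?? 1),
        0,
    );
}

// Placeholder sizes, for illustration only: 40 + 2 * 100 = 240 GiB.
const exampleEbs: EbsConfig[] = [
    { size: 40, type: "gp2" },
    { size: 100, type: "io1", iops: 1000, volumesPerInstance: 2 },
];
console.log(totalEbsGiBPerInstance(exampleEbs));
```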

bootstrap_action

  • name - (Required) Name of the bootstrap action
  • path - (Required) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system
  • args - (Optional) List of command line arguments to pass to the bootstrap action script

step

Attributes for step configuration

  • action_on_failure - (Required) The action to take if the step fails. Valid values: TERMINATE_JOB_FLOW, TERMINATE_CLUSTER, CANCEL_AND_WAIT, and CONTINUE
  • hadoop_jar_step - (Required) The JAR file used for the step. Defined below.
  • name - (Required) The name of the step.

hadoop_jar_step

Attributes for Hadoop job step configuration

  • args - (Optional) List of command line arguments passed to the JAR file’s main function when executed.
  • jar - (Required) Path to a JAR file run during the step.
  • main_class - (Optional) Name of the main class in the specified Java file. If not specified, the JAR file should specify a Main-Class in its manifest file.
  • properties - (Optional) Key-Value map of Java properties that are set when the step runs. You can use these properties to pass key value pairs to your main function.
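
Putting the step and hadoop_jar_step attributes together, a step value can be sketched as a plain object using the camelCase names from the steps property. The types below are local mirrors of the documented fields, and the commandRunnerStep helper is hypothetical; the jar and argument values are taken from the debug logging example.

```typescript
// The four documented values for action_on_failure.
type ActionOnFailure = "TERMINATE_JOB_FLOW" | "TERMINATE_CLUSTER" | "CANCEL_AND_WAIT" | "CONTINUE";

// Local mirrors of the documented hadoop_jar_step and step fields.
interface HadoopJarStep {
    jar: string;
    args?: string[];
    mainClass?: string;
    properties?: { [key: string]: string };
}

interface Step {
    name: string;
    actionOnFailure: ActionOnFailure;
    hadoopJarStep: HadoopJarStep;
}

// Hypothetical helper: builds a step that runs command-runner.jar.
function commandRunnerStep(
    stepName: string,
    args: string[],
    actionOnFailure: ActionOnFailure = "CONTINUE",
): Step {
    return {
        name: stepName,
        actionOnFailure,
        hadoopJarStep: { jar: "command-runner.jar", args },
    };
}

const debugStep = commandRunnerStep("Setup Hadoop Debugging", ["state-pusher-script"], "TERMINATE_CLUSTER");
console.log(JSON.stringify(debugStep));
```

Typing actionOnFailure as a union lets the compiler reject values outside the documented set.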

constructor

new Cluster(name: string, args: ClusterArgs, opts?: pulumi.CustomResourceOptions)

Create a Cluster resource with the given unique name, arguments, and options.

  • name The unique name of the resource.
  • args The arguments to use to populate this resource's properties.
  • opts A bag of options that control this resource's behavior.

method get

public static get(name: string, id: pulumi.Input<pulumi.ID>, state?: ClusterState, opts?: pulumi.CustomResourceOptions): Cluster

Get an existing Cluster resource’s state with the given name, ID, and optional extra properties used to qualify the lookup.

method getProvider

getProvider(moduleMember: string): ProviderResource | undefined

method isInstance

static isInstance(obj: any): boolean

Returns true if the given object is an instance of CustomResource. This is designed to work even when multiple copies of the Pulumi SDK have been loaded into the same process.

property additionalInfo

public additionalInfo: pulumi.Output<string | undefined>;

A JSON string for selecting additional features such as adding proxy information. Note: There is currently no API to retrieve the value of this argument after the EMR cluster is created, so Terraform cannot detect drift from the actual EMR cluster if the value is changed outside Terraform.

property applications

public applications: pulumi.Output<string[] | undefined>;

A list of applications for the cluster. Valid values are: Flink, Hadoop, Hive, Mahout, Pig, Spark, and JupyterHub (as of EMR 5.14.0). Case insensitive

property autoscalingRole

public autoscalingRole: pulumi.Output<string | undefined>;

An IAM role for automatic scaling policies. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate EC2 instances in an instance group.

property bootstrapActions

public bootstrapActions: pulumi.Output<{
    args: string[];
    name: string;
    path: string;
}[] | undefined>;

List of bootstrap actions that will be run before Hadoop is started on the cluster nodes. Defined below

property clusterState

public clusterState: pulumi.Output<string>;

property configurations

public configurations: pulumi.Output<string | undefined>;

List of configurations supplied for the EMR cluster you are creating

property configurationsJson

public configurationsJson: pulumi.Output<string | undefined>;

A JSON string for supplying a list of configurations for the EMR cluster.

property coreInstanceCount

public coreInstanceCount: pulumi.Output<number>;

Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster’s master node and use the remainder of the nodes (core_instance_count-1) as core nodes. Cannot be specified if instance_groups is set. Default 1
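
The split described above can be written out as arithmetic. This is a minimal sketch of the documented rule only (one master node, the remainder as core nodes); the nodeSplit helper is hypothetical and does not query EMR.

```typescript
// Hypothetical helper: applies the documented rule that one node becomes
// the master and the remaining (coreInstanceCount - 1) nodes become core nodes.
function nodeSplit(coreInstanceCount: number = 1): { master: number; core: number } {
    if (!Number.isInteger(coreInstanceCount) || coreInstanceCount < 1) {
        throw new RangeError("coreInstanceCount must be an integer of at least 1");
    }
    return { master: 1, core: coreInstanceCount - 1 };
}

// With the documented default of 1 there are no core nodes.
console.log(nodeSplit());
// With a count of 3 there is one master node and two core nodes.
console.log(nodeSplit(3));
```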

property coreInstanceType

public coreInstanceType: pulumi.Output<string>;

The EC2 instance type of the slave nodes. Cannot be specified if instance_groups is set

property customAmiId

public customAmiId: pulumi.Output<string | undefined>;

A custom Amazon Linux AMI for the cluster (instead of an EMR-owned AMI). Available in Amazon EMR version 5.7.0 and later.

property ebsRootVolumeSize

public ebsRootVolumeSize: pulumi.Output<number | undefined>;

Size in GiB of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Available in Amazon EMR version 4.x and later.

property ec2Attributes

public ec2Attributes: pulumi.Output<{
    additionalMasterSecurityGroups: string;
    additionalSlaveSecurityGroups: string;
    emrManagedMasterSecurityGroup: string;
    emrManagedSlaveSecurityGroup: string;
    instanceProfile: string;
    keyName: string;
    serviceAccessSecurityGroup: string;
    subnetId: string;
} | undefined>;

Attributes for the EC2 instances running the job flow. Defined below

property id

id: Output<ID>;

id is the provider-assigned unique ID for this managed resource. It is set during deployments and may be missing (undefined) during planning phases.

property instanceGroups

public instanceGroups: pulumi.Output<{
    autoscalingPolicy: string;
    bidPrice: string;
    ebsConfigs: {
        iops: number;
        size: number;
        type: string;
        volumesPerInstance: number;
    }[];
    id: string;
    instanceCount: number;
    instanceRole: string;
    instanceType: string;
    name: string;
}[]>;

A list of instance_group objects for each instance group in the cluster. Exactly one of master_instance_type and instance_group must be specified. If instance_group is set, then it must contain a configuration block for at least the MASTER instance group type (as well as any additional instance groups). Defined below
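
The two constraints above (exactly one of the two settings, and a MASTER group when instance groups are used) can be sketched as a check on the arguments before they are passed to the constructor. The interfaces are local stand-ins that mirror only the fields involved, and the checkInstanceLayout helper is hypothetical.

```typescript
// Local stand-ins for the fields involved in the rule.
interface GroupSpec {
    instanceRole: string;
}

interface ClusterLayout {
    masterInstanceType?: string;
    instanceGroups?: GroupSpec[];
}

// Hypothetical check: returns a list of problems, empty when the layout is valid.
function checkInstanceLayout(layout: ClusterLayout): string[] {
    const problems: string[] = [];
    const groups = layout.instanceGroups ?? [];
    const hasMasterType = layout.masterInstanceType !== undefined;
    const hasGroups = groups.length > 0;
    // Exactly one of the two must be present.
    if (hasMasterType === hasGroups) {
        problems.push("specify exactly one of masterInstanceType and instanceGroups");
    }
    // When instance groups are used, one of them must be the MASTER group.
    if (hasGroups && !groups.some((g) => g.instanceRole === "MASTER")) {
        problems.push("instanceGroups must include a MASTER group");
    }
    return problems;
}

// Placeholder instance type, for illustration only.
console.log(checkInstanceLayout({ masterInstanceType: "m5.xlarge" }));
```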

property keepJobFlowAliveWhenNoSteps

public keepJobFlowAliveWhenNoSteps: pulumi.Output<boolean>;

Whether to keep the cluster running when it has no steps or when all steps are complete (default is on)

property kerberosAttributes

public kerberosAttributes: pulumi.Output<{
    adDomainJoinPassword: string;
    adDomainJoinUser: string;
    crossRealmTrustPrincipalPassword: string;
    kdcAdminPassword: string;
    realm: string;
} | undefined>;

Kerberos configuration for the cluster. Defined below

property logUri

public logUri: pulumi.Output<string | undefined>;

S3 bucket to write the log files of the job flow. If a value is not provided, logs are not created

property masterInstanceType

public masterInstanceType: pulumi.Output<string>;

The EC2 instance type of the master node. Exactly one of master_instance_type and instance_group must be specified.

property masterPublicDns

public masterPublicDns: pulumi.Output<string>;

The public DNS name of the master EC2 instance.

property name

public name: pulumi.Output<string>;

The name of the job flow

property releaseLabel

public releaseLabel: pulumi.Output<string>;

The release label for the Amazon EMR release

property scaleDownBehavior

public scaleDownBehavior: pulumi.Output<string>;

The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized.

property securityConfiguration

public securityConfiguration: pulumi.Output<string | undefined>;

The security configuration name to attach to the EMR cluster. Only valid for EMR clusters with release_label 4.8.0 or greater

property serviceRole

public serviceRole: pulumi.Output<string>;

IAM role that will be assumed by the Amazon EMR service to access AWS resources

property steps

public steps: pulumi.Output<{
    actionOnFailure: string;
    hadoopJarStep: {
        args: string[];
        jar: string;
        mainClass: string;
        properties: {[key: string]: any};
    };
    name: string;
}[]>;

List of steps to run when creating the cluster. Defined below. It is highly recommended to utilize the lifecycle configuration block with ignore_changes if other steps are being managed outside of Terraform.

property tags

public tags: pulumi.Output<{[key: string]: any} | undefined>;

A map of tags to apply to the EMR cluster

property terminationProtection

public terminationProtection: pulumi.Output<boolean>;

Whether termination protection is enabled (default is off)

property urn

urn: Output<URN>;

urn is the stable logical URN used to distinctly address a resource, both before and after deployments.

property visibleToAllUsers

public visibleToAllUsers: pulumi.Output<boolean | undefined>;

Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default true

class InstanceGroup

extends CustomResource

Provides an Elastic MapReduce Cluster Instance Group configuration. See Amazon Elastic MapReduce Documentation for more information.

NOTE: At this time, Instance Groups cannot be destroyed through either the API or the web interface. Instance Groups are destroyed when the EMR Cluster is destroyed. Terraform will resize any Instance Group to zero when destroying the resource.

Example Usage

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

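// aws_emr_cluster_tf_test_cluster is assumed to be an aws.emr.Cluster defined elsewhere in the program.
const task = new aws.emr.InstanceGroup("task", {
    clusterId: aws_emr_cluster_tf_test_cluster.id,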
    instanceCount: 1,
    instanceType: "m5.xlarge",
});

constructor

new InstanceGroup(name: string, args: InstanceGroupArgs, opts?: pulumi.CustomResourceOptions)

Create an InstanceGroup resource with the given unique name, arguments, and options.

  • name The unique name of the resource.
  • args The arguments to use to populate this resource's properties.
  • opts A bag of options that control this resource's behavior.

method get

public static get(name: string, id: pulumi.Input<pulumi.ID>, state?: InstanceGroupState, opts?: pulumi.CustomResourceOptions): InstanceGroup

Get an existing InstanceGroup resource’s state with the given name, ID, and optional extra properties used to qualify the lookup.

method getProvider

getProvider(moduleMember: string): ProviderResource | undefined

method isInstance

static isInstance(obj: any): boolean

Returns true if the given object is an instance of CustomResource. This is designed to work even when multiple copies of the Pulumi SDK have been loaded into the same process.

property clusterId

public clusterId: pulumi.Output<string>;

ID of the EMR Cluster to attach to. Changing this forces a new resource to be created.

property ebsConfigs

public ebsConfigs: pulumi.Output<{
    iops: number;
    size: number;
    type: string;
    volumesPerInstance: number;
}[] | undefined>;

One or more ebs_config blocks as defined below. Changing this forces a new resource to be created.

property ebsOptimized

public ebsOptimized: pulumi.Output<boolean | undefined>;

Indicates whether an Amazon EBS volume is EBS-optimized. Changing this forces a new resource to be created.

property id

id: Output<ID>;

id is the provider-assigned unique ID for this managed resource. It is set during deployments and may be missing (undefined) during planning phases.

property instanceCount

public instanceCount: pulumi.Output<number | undefined>;

Target number of instances for the instance group. Defaults to 0.

property instanceType

public instanceType: pulumi.Output<string>;

The EC2 instance type for all instances in the instance group. Changing this forces a new resource to be created.

property name

public name: pulumi.Output<string>;

Human friendly name given to the instance group. Changing this forces a new resource to be created.

property runningInstanceCount

public runningInstanceCount: pulumi.Output<number>;

property status

public status: pulumi.Output<string>;

property urn

urn: Output<URN>;

urn is the stable logical URN used to distinctly address a resource, both before and after deployments.

class SecurityConfiguration

extends CustomResource

Provides a resource to manage AWS EMR Security Configurations

Example Usage

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const foo = new aws.emr.SecurityConfiguration("foo", {
    configuration: `{
  "EncryptionConfiguration": {
    "AtRestEncryptionConfiguration": {
      "S3EncryptionConfiguration": {
        "EncryptionMode": "SSE-S3"
      },
      "LocalDiskEncryptionConfiguration": {
        "EncryptionKeyProviderType": "AwsKms",
        "AwsKmsKey": "arn:aws:kms:us-west-2:187416307283:alias/tf_emr_test_key"
      }
    },
    "EnableInTransitEncryption": false,
    "EnableAtRestEncryption": true
  }
}
`,
});
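
As an alternative to a hand-written template literal, the configuration string can be produced with JSON.stringify from a plain object, which avoids quoting and trailing-comma mistakes. This is a minimal sketch that reuses the values from the example above; the variable names are illustrative.

```typescript
// Same document as in the example above, expressed as a plain object.
const securityConfig = {
    EncryptionConfiguration: {
        AtRestEncryptionConfiguration: {
            S3EncryptionConfiguration: {
                EncryptionMode: "SSE-S3",
            },
            LocalDiskEncryptionConfiguration: {
                EncryptionKeyProviderType: "AwsKms",
                AwsKmsKey: "arn:aws:kms:us-west-2:187416307283:alias/tf_emr_test_key",
            },
        },
        EnableInTransitEncryption: false,
        EnableAtRestEncryption: true,
    },
};

// Serialize with two-space indentation; the result is a valid JSON string
// that could be supplied as the configuration argument.
const configurationJson: string = JSON.stringify(securityConfig, null, 2);
console.log(configurationJson);
```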

constructor

new SecurityConfiguration(name: string, args: SecurityConfigurationArgs, opts?: pulumi.CustomResourceOptions)

Create a SecurityConfiguration resource with the given unique name, arguments, and options.

  • name The unique name of the resource.
  • args The arguments to use to populate this resource's properties.
  • opts A bag of options that control this resource's behavior.

method get

public static get(name: string, id: pulumi.Input<pulumi.ID>, state?: SecurityConfigurationState, opts?: pulumi.CustomResourceOptions): SecurityConfiguration

Get an existing SecurityConfiguration resource’s state with the given name, ID, and optional extra properties used to qualify the lookup.

method getProvider

getProvider(moduleMember: string): ProviderResource | undefined

method isInstance

static isInstance(obj: any): boolean

Returns true if the given object is an instance of CustomResource. This is designed to work even when multiple copies of the Pulumi SDK have been loaded into the same process.

property configuration

public configuration: pulumi.Output<string>;

A JSON formatted Security Configuration

property creationDate

public creationDate: pulumi.Output<string>;

Date the Security Configuration was created

property id

id: Output<ID>;

id is the provider-assigned unique ID for this managed resource. It is set during deployments and may be missing (undefined) during planning phases.

property name

public name: pulumi.Output<string>;

The name of the EMR Security Configuration. By default generated by Terraform.

property namePrefix

public namePrefix: pulumi.Output<string | undefined>;

Creates a unique name beginning with the specified prefix. Conflicts with name.

property urn

urn: Output<URN>;

urn is the stable logical URN used to distinctly address a resource, both before and after deployments.

interface ClusterArgs

The set of arguments for constructing a Cluster resource.

property additionalInfo

additionalInfo?: pulumi.Input<string>;

A JSON string for selecting additional features such as adding proxy information. Note: There is currently no API to retrieve the value of this argument after the EMR cluster is created, so Terraform cannot detect drift from the actual EMR cluster if the value is changed outside Terraform.

property applications

applications?: pulumi.Input<pulumi.Input<string>[]>;

A list of applications for the cluster. Valid values are: Flink, Hadoop, Hive, Mahout, Pig, Spark, and JupyterHub (as of EMR 5.14.0). Case insensitive

property autoscalingRole

autoscalingRole?: pulumi.Input<string>;

An IAM role for automatic scaling policies. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate EC2 instances in an instance group.

property bootstrapActions

bootstrapActions?: pulumi.Input<pulumi.Input<{
    args: pulumi.Input<pulumi.Input<string>[]>;
    name: pulumi.Input<string>;
    path: pulumi.Input<string>;
}>[]>;

List of bootstrap actions that will be run before Hadoop is started on the cluster nodes. Defined below

property configurations

configurations?: pulumi.Input<string>;

List of configurations supplied for the EMR cluster you are creating

property configurationsJson

configurationsJson?: pulumi.Input<string>;

A JSON string for supplying a list of configurations for the EMR cluster.

property coreInstanceCount

coreInstanceCount?: pulumi.Input<number>;

Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster’s master node and use the remainder of the nodes (core_instance_count-1) as core nodes. Cannot be specified if instance_groups is set. Default 1

property coreInstanceType

coreInstanceType?: pulumi.Input<string>;

The EC2 instance type of the slave nodes. Cannot be specified if instance_groups is set

property customAmiId

customAmiId?: pulumi.Input<string>;

A custom Amazon Linux AMI for the cluster (instead of an EMR-owned AMI). Available in Amazon EMR version 5.7.0 and later.

property ebsRootVolumeSize

ebsRootVolumeSize?: pulumi.Input<number>;

Size in GiB of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Available in Amazon EMR version 4.x and later.

property ec2Attributes

ec2Attributes?: pulumi.Input<{
    additionalMasterSecurityGroups: pulumi.Input<string>;
    additionalSlaveSecurityGroups: pulumi.Input<string>;
    emrManagedMasterSecurityGroup: pulumi.Input<string>;
    emrManagedSlaveSecurityGroup: pulumi.Input<string>;
    instanceProfile: pulumi.Input<string>;
    keyName: pulumi.Input<string>;
    serviceAccessSecurityGroup: pulumi.Input<string>;
    subnetId: pulumi.Input<string>;
}>;

Attributes for the EC2 instances running the job flow. Defined below

property instanceGroups

instanceGroups?: pulumi.Input<pulumi.Input<{
    autoscalingPolicy: pulumi.Input<string>;
    bidPrice: pulumi.Input<string>;
    ebsConfigs: pulumi.Input<pulumi.Input<{
        iops: pulumi.Input<number>;
        size: pulumi.Input<number>;
        type: pulumi.Input<string>;
        volumesPerInstance: pulumi.Input<number>;
    }>[]>;
    id: pulumi.Input<string>;
    instanceCount: pulumi.Input<number>;
    instanceRole: pulumi.Input<string>;
    instanceType: pulumi.Input<string>;
    name: pulumi.Input<string>;
}>[]>;

A list of instance_group objects for each instance group in the cluster. Exactly one of master_instance_type and instance_group must be specified. If instance_group is set, then it must contain a configuration block for at least the MASTER instance group type (as well as any additional instance groups). Defined below

property keepJobFlowAliveWhenNoSteps

keepJobFlowAliveWhenNoSteps?: pulumi.Input<boolean>;

Whether to keep the cluster running when it has no steps or when all steps are complete (default is on)

property kerberosAttributes

kerberosAttributes?: pulumi.Input<{
    adDomainJoinPassword: pulumi.Input<string>;
    adDomainJoinUser: pulumi.Input<string>;
    crossRealmTrustPrincipalPassword: pulumi.Input<string>;
    kdcAdminPassword: pulumi.Input<string>;
    realm: pulumi.Input<string>;
}>;

Kerberos configuration for the cluster. Defined below

property logUri

logUri?: pulumi.Input<string>;

S3 bucket to write the log files of the job flow. If a value is not provided, logs are not created

property masterInstanceType

masterInstanceType?: pulumi.Input<string>;

The EC2 instance type of the master node. Exactly one of master_instance_type and instance_group must be specified.

property name

name?: pulumi.Input<string>;

The name of the job flow

property releaseLabel

releaseLabel: pulumi.Input<string>;

The release label for the Amazon EMR release

property scaleDownBehavior

scaleDownBehavior?: pulumi.Input<string>;

The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized.

property securityConfiguration

securityConfiguration?: pulumi.Input<string>;

The security configuration name to attach to the EMR cluster. Only valid for EMR clusters with release_label 4.8.0 or greater

property serviceRole

serviceRole: pulumi.Input<string>;

IAM role that will be assumed by the Amazon EMR service to access AWS resources

property steps

steps?: pulumi.Input<pulumi.Input<{
    actionOnFailure: pulumi.Input<string>;
    hadoopJarStep: pulumi.Input<{
        args: pulumi.Input<pulumi.Input<string>[]>;
        jar: pulumi.Input<string>;
        mainClass: pulumi.Input<string>;
        properties: pulumi.Input<{[key: string]: any}>;
    }>;
    name: pulumi.Input<string>;
}>[]>;

List of steps to run when creating the cluster. Defined below. It is highly recommended to utilize the lifecycle configuration block with ignore_changes if other steps are being managed outside of Terraform.

property tags

tags?: pulumi.Input<{[key: string]: any}>;

A map of tags to apply to the EMR cluster

property terminationProtection

terminationProtection?: pulumi.Input<boolean>;

Whether termination protection is enabled (default is off)

property visibleToAllUsers

visibleToAllUsers?: pulumi.Input<boolean>;

Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default true

interface ClusterState

Input properties used for looking up and filtering Cluster resources.

property additionalInfo

additionalInfo?: pulumi.Input<string>;

A JSON string for selecting additional features such as adding proxy information. Note: There is currently no API to retrieve the value of this argument after the EMR cluster is created, so Terraform cannot detect drift from the actual EMR cluster if the value is changed outside Terraform.

property applications

applications?: pulumi.Input<pulumi.Input<string>[]>;

A list of applications for the cluster. Valid values are: Flink, Hadoop, Hive, Mahout, Pig, Spark, and JupyterHub (as of EMR 5.14.0). Case insensitive

property autoscalingRole

autoscalingRole?: pulumi.Input<string>;

An IAM role for automatic scaling policies. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate EC2 instances in an instance group.

property bootstrapActions

bootstrapActions?: pulumi.Input<pulumi.Input<{
    args: pulumi.Input<pulumi.Input<string>[]>;
    name: pulumi.Input<string>;
    path: pulumi.Input<string>;
}>[]>;

List of bootstrap actions that will be run before Hadoop is started on the cluster nodes. Defined below

property clusterState

clusterState?: pulumi.Input<string>;

property configurations

configurations?: pulumi.Input<string>;

List of configurations supplied for the EMR cluster you are creating

property configurationsJson

configurationsJson?: pulumi.Input<string>;

A JSON string for supplying a list of configurations for the EMR cluster.

property coreInstanceCount

coreInstanceCount?: pulumi.Input<number>;

Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster’s master node and use the remainder of the nodes (core_instance_count-1) as core nodes. Cannot be specified if instance_groups is set. Default 1

property coreInstanceType

coreInstanceType?: pulumi.Input<string>;

The EC2 instance type of the slave nodes. Cannot be specified if instance_groups is set

property customAmiId

customAmiId?: pulumi.Input<string>;

A custom Amazon Linux AMI for the cluster (instead of an EMR-owned AMI). Available in Amazon EMR version 5.7.0 and later.

property ebsRootVolumeSize

ebsRootVolumeSize?: pulumi.Input<number>;

Size in GiB of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Available in Amazon EMR version 4.x and later.

property ec2Attributes

ec2Attributes?: pulumi.Input<{
    additionalMasterSecurityGroups: pulumi.Input<string>;
    additionalSlaveSecurityGroups: pulumi.Input<string>;
    emrManagedMasterSecurityGroup: pulumi.Input<string>;
    emrManagedSlaveSecurityGroup: pulumi.Input<string>;
    instanceProfile: pulumi.Input<string>;
    keyName: pulumi.Input<string>;
    serviceAccessSecurityGroup: pulumi.Input<string>;
    subnetId: pulumi.Input<string>;
}>;

Attributes for the EC2 instances running the job flow. Defined below

property instanceGroups

instanceGroups?: pulumi.Input<pulumi.Input<{
    autoscalingPolicy: pulumi.Input<string>;
    bidPrice: pulumi.Input<string>;
    ebsConfigs: pulumi.Input<pulumi.Input<{
        iops: pulumi.Input<number>;
        size: pulumi.Input<number>;
        type: pulumi.Input<string>;
        volumesPerInstance: pulumi.Input<number>;
    }>[]>;
    id: pulumi.Input<string>;
    instanceCount: pulumi.Input<number>;
    instanceRole: pulumi.Input<string>;
    instanceType: pulumi.Input<string>;
    name: pulumi.Input<string>;
}>[]>;

A list of instance_group objects for each instance group in the cluster. Exactly one of master_instance_type and instance_group must be specified. If instance_group is set, then it must contain a configuration block for at least the MASTER instance group type (as well as any additional instance groups). Defined below

property keepJobFlowAliveWhenNoSteps

keepJobFlowAliveWhenNoSteps?: pulumi.Input<boolean>;

Whether to keep the cluster running when it has no steps or when all steps are complete (default is on)

property kerberosAttributes

kerberosAttributes?: pulumi.Input<{
    adDomainJoinPassword: pulumi.Input<string>;
    adDomainJoinUser: pulumi.Input<string>;
    crossRealmTrustPrincipalPassword: pulumi.Input<string>;
    kdcAdminPassword: pulumi.Input<string>;
    realm: pulumi.Input<string>;
}>;

Kerberos configuration for the cluster. Defined below

property logUri

logUri?: pulumi.Input<string>;

S3 bucket to write the log files of the job flow. If a value is not provided, logs are not created

property masterInstanceType

masterInstanceType?: pulumi.Input<string>;

The EC2 instance type of the master node. Exactly one of master_instance_type and instance_group must be specified.

property masterPublicDns

masterPublicDns?: pulumi.Input<string>;

The public DNS name of the master EC2 instance.

property name

name?: pulumi.Input<string>;

The name of the job flow

property releaseLabel

releaseLabel?: pulumi.Input<string>;

The release label for the Amazon EMR release

property scaleDownBehavior

scaleDownBehavior?: pulumi.Input<string>;

The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized.

property securityConfiguration

securityConfiguration?: pulumi.Input<string>;

The security configuration name to attach to the EMR cluster. Only valid for EMR clusters with release_label 4.8.0 or greater

property serviceRole

serviceRole?: pulumi.Input<string>;

IAM role that will be assumed by the Amazon EMR service to access AWS resources

property steps

steps?: pulumi.Input<pulumi.Input<{
    actionOnFailure: pulumi.Input<string>;
    hadoopJarStep: pulumi.Input<{
        args: pulumi.Input<pulumi.Input<string>[]>;
        jar: pulumi.Input<string>;
        mainClass: pulumi.Input<string>;
        properties: pulumi.Input<{[key: string]: any}>;
    }>;
    name: pulumi.Input<string>;
}>[]>;

List of steps to run when creating the cluster. Defined below. It is highly recommended to use the lifecycle configuration block with ignore_changes if other steps are being managed outside of Terraform.
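As an illustration of the step shape, the sketch below builds the debug-logging step from the example usage as a plain object, using the `actionOnFailure` key from the signature above. `StepSketch` and `ActionOnFailure` are hypothetical local types; marking `args`, `mainClass`, and `properties` as optional is an assumption of this sketch, since the signature lists them without `?`.

```typescript
// Failure actions accepted by EMR steps.
type ActionOnFailure =
    | "TERMINATE_JOB_FLOW"
    | "TERMINATE_CLUSTER"
    | "CANCEL_AND_WAIT"
    | "CONTINUE";

// Hypothetical local mirror of one element of `steps`; not the @pulumi/aws type.
interface StepSketch {
    actionOnFailure: ActionOnFailure;
    hadoopJarStep: {
        args?: string[];
        jar: string;
        mainClass?: string; // assumed optional for this sketch
        properties?: { [key: string]: string }; // assumed optional for this sketch
    };
    name: string;
}

// The debug-logging step, expressed with the property names from the signature.
const debugStep: StepSketch = {
    actionOnFailure: "TERMINATE_CLUSTER",
    hadoopJarStep: {
        args: ["state-pusher-script"],
        jar: "command-runner.jar",
    },
    name: "Setup Hadoop Debugging",
};

// `steps` takes a list, so a single step is still wrapped in an array.
const steps: StepSketch[] = [debugStep];
```

An array like `steps` would then be passed as the `steps` property of the cluster arguments.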

property tags

tags?: pulumi.Input<{[key: string]: any}>;

Map of tags to apply to the EMR Cluster

property terminationProtection

terminationProtection?: pulumi.Input<boolean>;

Whether termination protection is enabled. Defaults to off.

property visibleToAllUsers

visibleToAllUsers?: pulumi.Input<boolean>;

Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Defaults to true.

interface InstanceGroupArgs

The set of arguments for constructing an InstanceGroup resource.

property clusterId

clusterId: pulumi.Input<string>;

ID of the EMR Cluster to attach to. Changing this forces a new resource to be created.

property ebsConfigs

ebsConfigs?: pulumi.Input<pulumi.Input<{
    iops: pulumi.Input<number>;
    size: pulumi.Input<number>;
    type: pulumi.Input<string>;
    volumesPerInstance: pulumi.Input<number>;
}>[]>;

One or more ebs_config blocks as defined below. Changing this forces a new resource to be created.

property ebsOptimized

ebsOptimized?: pulumi.Input<boolean>;

Indicates whether an Amazon EBS volume is EBS-optimized. Changing this forces a new resource to be created.

property instanceCount

instanceCount?: pulumi.Input<number>;

Target number of instances for the instance group. Defaults to 0.

property instanceType

instanceType: pulumi.Input<string>;

The EC2 instance type for all instances in the instance group. Changing this forces a new resource to be created.

property name

name?: pulumi.Input<string>;

Human friendly name given to the instance group. Changing this forces a new resource to be created.
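To make the required and optional properties concrete, here is a minimal sketch using plain values. `InstanceGroupArgsSketch`, `EbsConfigSketch`, and `withDocumentedDefaults` are hypothetical local names, the cluster ID and instance type are placeholders, and treating `iops` and `volumesPerInstance` as optional is an assumption. The helper only mirrors the documented default of 0 for `instanceCount`; it is not how the provider applies it.

```typescript
// Hypothetical local mirrors of the argument shapes; not the @pulumi/aws types.
interface EbsConfigSketch {
    iops?: number; // assumed optional for this sketch
    size: number;
    type: string;
    volumesPerInstance?: number; // assumed optional for this sketch
}

interface InstanceGroupArgsSketch {
    clusterId: string; // required
    instanceType: string; // required
    ebsConfigs?: EbsConfigSketch[];
    ebsOptimized?: boolean;
    instanceCount?: number;
    name?: string;
}

// Fills in the documented default so the effective value is explicit.
function withDocumentedDefaults(args: InstanceGroupArgsSketch): InstanceGroupArgsSketch {
    return {
        ...args,
        instanceCount: args.instanceCount === undefined ? 0 : args.instanceCount,
    };
}

// Placeholder values; a real clusterId would come from an existing EMR cluster.
const taskGroup = withDocumentedDefaults({
    clusterId: "j-EXAMPLE",
    instanceType: "m5.xlarge",
    name: "task-group",
    ebsConfigs: [{ size: 40, type: "gp2", volumesPerInstance: 1 }],
});
```

Because most of these properties force a new resource when changed, it can help to settle `instanceType`, `name`, and `ebsConfigs` up front and treat `instanceCount` as the value adjusted over time.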

interface InstanceGroupState

Input properties used for looking up and filtering InstanceGroup resources.

property clusterId

clusterId?: pulumi.Input<string>;

ID of the EMR Cluster to attach to. Changing this forces a new resource to be created.

property ebsConfigs

ebsConfigs?: pulumi.Input<pulumi.Input<{
    iops: pulumi.Input<number>;
    size: pulumi.Input<number>;
    type: pulumi.Input<string>;
    volumesPerInstance: pulumi.Input<number>;
}>[]>;

One or more ebs_config blocks as defined below. Changing this forces a new resource to be created.

property ebsOptimized

ebsOptimized?: pulumi.Input<boolean>;

Indicates whether an Amazon EBS volume is EBS-optimized. Changing this forces a new resource to be created.

property instanceCount

instanceCount?: pulumi.Input<number>;

Target number of instances for the instance group. Defaults to 0.

property instanceType

instanceType?: pulumi.Input<string>;

The EC2 instance type for all instances in the instance group. Changing this forces a new resource to be created.

property name

name?: pulumi.Input<string>;

Human friendly name given to the instance group. Changing this forces a new resource to be created.

property runningInstanceCount

runningInstanceCount?: pulumi.Input<number>;

property status

status?: pulumi.Input<string>;

interface SecurityConfigurationArgs

The set of arguments for constructing a SecurityConfiguration resource.

property configuration

configuration: pulumi.Input<string>;

A JSON formatted Security Configuration

property name

name?: pulumi.Input<string>;

The name of the EMR Security Configuration. By default generated by Terraform.

property namePrefix

namePrefix?: pulumi.Input<string>;

Creates a unique name beginning with the specified prefix. Conflicts with name.
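Since `configuration` is a JSON string and `name` conflicts with `namePrefix`, both can be handled in one small helper. This is a sketch under stated assumptions: `SecurityConfigurationArgsSketch` and `buildSecurityConfigurationArgs` are hypothetical local names, and the JSON keys shown follow the EMR security configuration format as best understood here, so they should be checked against the AWS documentation before use.

```typescript
// Hypothetical local mirror of the argument shape; not the @pulumi/aws type.
interface SecurityConfigurationArgsSketch {
    configuration: string;
    name?: string;
    namePrefix?: string;
}

// Serializes the settings object and rejects the documented name/namePrefix conflict.
function buildSecurityConfigurationArgs(
    settings: object,
    naming: { name?: string; namePrefix?: string },
): SecurityConfigurationArgsSketch {
    if (naming.name !== undefined && naming.namePrefix !== undefined) {
        throw new Error("name conflicts with namePrefix; set only one");
    }
    return { configuration: JSON.stringify(settings), ...naming };
}

// Assumed example settings: at-rest encryption for S3 using SSE-S3.
const securityArgs = buildSecurityConfigurationArgs(
    {
        EncryptionConfiguration: {
            EnableInTransitEncryption: false,
            EnableAtRestEncryption: true,
            AtRestEncryptionConfiguration: {
                S3EncryptionConfiguration: { EncryptionMode: "SSE-S3" },
            },
        },
    },
    { namePrefix: "emr-sc-" },
);
```

Building the settings as an object and serializing once avoids hand-written JSON strings, which are easy to get wrong with quoting.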

interface SecurityConfigurationState

Input properties used for looking up and filtering SecurityConfiguration resources.

property configuration

configuration?: pulumi.Input<string>;

A JSON formatted Security Configuration

property creationDate

creationDate?: pulumi.Input<string>;

Date the Security Configuration was created

property name

name?: pulumi.Input<string>;

The name of the EMR Security Configuration. By default generated by Terraform.

property namePrefix

namePrefix?: pulumi.Input<string>;

Creates a unique name beginning with the specified prefix. Conflicts with name.