dataproc

class pulumi_gcp.dataproc.Cluster(resource_name, opts=None, cluster_config=None, labels=None, name=None, project=None, region=None, __name__=None, __opts__=None)

Manages a Cloud Dataproc cluster resource within GCP. For more information see the official Dataproc documentation.

!> Warning: Due to limitations of the API, all arguments except labels, cluster_config.worker_config.num_instances, and cluster_config.preemptible_worker_config.num_instances are non-updatable. Changing any other argument will cause recreation of the whole cluster!

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • cluster_config (pulumi.Input[dict]) – Allows you to configure various aspects of the cluster. Structure defined below.
  • labels (pulumi.Input[dict]) – The list of labels (key/value pairs) to be applied to instances in the cluster. GCP generates some labels itself, including goog-dataproc-cluster-name, which is the name of the cluster.
  • name (pulumi.Input[str]) – The name of the cluster, unique within the project and zone.
  • project (pulumi.Input[str]) – The ID of the project in which the cluster will exist. If it is not provided, the provider project is used.
  • region (pulumi.Input[str]) – The region in which the cluster and associated nodes will be created. Defaults to global.
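
For reference, here is a minimal sketch of creating a cluster. The region, machine types, and label values are illustrative, and the nested cluster_config key casing can vary across provider versions:

    import pulumi
    import pulumi_gcp as gcp

    # Sketch: a small cluster with one master and two workers. Only labels
    # and the worker/preemptible num_instances fields can be updated in
    # place; changing anything else recreates the cluster.
    cluster = gcp.dataproc.Cluster("example-cluster",
        region="us-central1",
        labels={"env": "dev"},
        cluster_config={
            "master_config": {
                "num_instances": 1,
                "machine_type": "n1-standard-1",
            },
            "worker_config": {
                "num_instances": 2,
                "machine_type": "n1-standard-1",
            },
        })

    # GCP adds its own labels on top, e.g. goog-dataproc-cluster-name.
    pulumi.export("cluster_name", cluster.name)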
cluster_config = None

Allows you to configure various aspects of the cluster. Structure defined below.

labels = None

The list of labels (key/value pairs) to be applied to instances in the cluster. GCP generates some labels itself, including goog-dataproc-cluster-name, which is the name of the cluster.

name = None

The name of the cluster, unique within the project and zone.

project = None

The ID of the project in which the cluster will exist. If it is not provided, the provider project is used.

region = None

The region in which the cluster and associated nodes will be created. Defaults to global.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
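
These two hooks are generic pulumi.CustomResource plumbing rather than Dataproc-specific behavior. A sketch of what an override might look like, with hypothetical name tables (real SDKs generate these per resource):

    import pulumi

    # Hypothetical maps between Python snake_case names and the engine's
    # camelCase names; illustrative only.
    _SNAKE_TO_CAMEL = {"cluster_config": "clusterConfig"}
    _CAMEL_TO_SNAKE = {v: k for k, v in _SNAKE_TO_CAMEL.items()}

    class ExampleResource(pulumi.CustomResource):
        def translate_output_property(self, prop):
            # Engine name -> Python name when reading outputs.
            return _CAMEL_TO_SNAKE.get(prop, prop)

        def translate_input_property(self, prop):
            # Python name -> engine name when sending inputs.
            return _SNAKE_TO_CAMEL.get(prop, prop)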
class pulumi_gcp.dataproc.Job(resource_name, opts=None, force_delete=None, hadoop_config=None, hive_config=None, labels=None, pig_config=None, placement=None, project=None, pyspark_config=None, reference=None, region=None, scheduling=None, spark_config=None, sparksql_config=None, __name__=None, __opts__=None)

Manages a job resource within a Dataproc cluster in GCP. For more information see the official Dataproc documentation.

!> Note: This resource does not support ‘update’, and changing any attribute will cause the resource to be recreated.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • force_delete (pulumi.Input[bool]) – By default, you can only delete inactive jobs within Dataproc. Setting this to true, and calling destroy, will ensure that the job is first cancelled before issuing the delete.

  • hadoop_config (pulumi.Input[dict])
  • hive_config (pulumi.Input[dict])
  • labels (pulumi.Input[dict]) – The list of labels (key/value pairs) to add to the job.
  • pig_config (pulumi.Input[dict])
  • placement (pulumi.Input[dict])
  • project (pulumi.Input[str]) – The project in which the cluster can be found and jobs subsequently run against. If it is not provided, the provider project is used.
  • pyspark_config (pulumi.Input[dict])
  • reference (pulumi.Input[dict])
  • region (pulumi.Input[str]) – The Cloud Dataproc region. This essentially determines which clusters are available for this job to be submitted to. If not specified, defaults to global.
  • scheduling (pulumi.Input[dict])
  • spark_config (pulumi.Input[dict])
  • sparksql_config (pulumi.Input[dict])
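
For reference, a sketch of submitting a PySpark job to an existing cluster; the cluster name, bucket, and script path are illustrative:

    import pulumi
    import pulumi_gcp as gcp

    # Sketch: run a PySpark script on an existing cluster in the same
    # region. force_delete lets a destroy cancel an active job rather
    # than fail.
    job = gcp.dataproc.Job("example-job",
        region="us-central1",
        force_delete=True,
        placement={
            "cluster_name": "example-cluster",
        },
        pyspark_config={
            "main_python_file_uri": "gs://example-bucket/jobs/wordcount.py",
        },
        labels={"env": "dev"})

    # Output property: where the driver's stdout can be read from.
    pulumi.export("driver_output", job.driver_output_resource_uri)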

driver_controls_files_uri = None

If present, the location of miscellaneous control files which may be used as part of job setup and handling. If not present, control files may be placed in the same location as driver_output_uri.

driver_output_resource_uri = None

A URI pointing to the location of the stdout of the job’s driver program.

force_delete = None

By default, you can only delete inactive jobs within Dataproc. Setting this to true, and calling destroy, will ensure that the job is first cancelled before issuing the delete.

labels = None

The list of labels (key/value pairs) to add to the job.

project = None

The project in which the cluster can be found and jobs subsequently run against. If it is not provided, the provider project is used.

region = None

The Cloud Dataproc region. This essentially determines which clusters are available for this job to be submitted to. If not specified, defaults to global.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str

translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str