glue

class pulumi_aws.glue.CatalogDatabase(resource_name, opts=None, catalog_id=None, description=None, location_uri=None, name=None, parameters=None, __name__=None, __opts__=None)

Provides a Glue Catalog Database Resource. You can refer to the Glue Developer Guide for a full explanation of the Glue Data Catalog functionality.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • catalog_id (pulumi.Input[str]) – ID of the Glue Catalog to create the database in. If omitted, this defaults to the AWS Account ID.
  • description (pulumi.Input[str]) – Description of the database.
  • location_uri (pulumi.Input[str]) – The location of the database (for example, an HDFS path).
  • name (pulumi.Input[str]) – The name of the database.
  • parameters (pulumi.Input[dict]) – A map of key-value pairs that define parameters and properties of the database.
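
A minimal sketch of constructing this resource, assuming the pulumi_aws package is installed and the code runs inside a Pulumi program. The resource name, database name, description, and parameter values are hypothetical. The keyword arguments are collected in a plain helper so they can be inspected without a Pulumi engine.

```python
def catalog_database_args(name, description=None, location_uri=None, parameters=None):
    """Collect CatalogDatabase keyword arguments, dropping any left unset."""
    args = {
        "name": name,
        "description": description,
        "location_uri": location_uri,
        "parameters": parameters,
    }
    return {key: value for key, value in args.items() if value is not None}


def create_database():
    """Create the resource; call this only from inside a Pulumi program."""
    import pulumi_aws as aws  # imported lazily so the helper above works without it

    return aws.glue.CatalogDatabase(
        "example-db",  # hypothetical resource_name
        **catalog_database_args(
            name="example_db",
            description="Hypothetical analytics catalog",
            parameters={"team": "data-eng"},
        )
    )
```

Leaving catalog_id out of the call relies on the documented default, the AWS Account ID.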
catalog_id = None

ID of the Glue Catalog to create the database in. If omitted, this defaults to the AWS Account ID.

description = None

Description of the database.

location_uri = None

The location of the database (for example, an HDFS path).

name = None

The name of the database.

parameters = None

A map of key-value pairs that define parameters and properties of the database.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
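
To illustrate what these two hooks are for, the sketch below uses a stand-in base class (not the real pulumi Resource class) and a hypothetical name table to convert between snake_case Python names and camelCase names. The camelCase convention on the engine side is an assumption made for the example. Names missing from the table pass through unchanged.

```python
# Hypothetical lookup tables for the example; not taken from the library.
SNAKE_TO_CAMEL = {"catalog_id": "catalogId", "location_uri": "locationUri"}
CAMEL_TO_SNAKE = {camel: snake for snake, camel in SNAKE_TO_CAMEL.items()}


class StandInResource:
    """Stand-in for a Resource base class whose hooks return names unchanged."""

    def translate_output_property(self, prop):
        return prop

    def translate_input_property(self, prop):
        return prop


class SnakeCaseResource(StandInResource):
    """Subclass that translates property names in both directions."""

    def translate_output_property(self, prop):
        # Incoming (output) name -> Python attribute name.
        return CAMEL_TO_SNAKE.get(prop, prop)

    def translate_input_property(self, prop):
        # Python argument name -> outgoing (input) name.
        return SNAKE_TO_CAMEL.get(prop, prop)
```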
class pulumi_aws.glue.CatalogTable(resource_name, opts=None, catalog_id=None, database_name=None, description=None, name=None, owner=None, parameters=None, partition_keys=None, retention=None, storage_descriptor=None, table_type=None, view_expanded_text=None, view_original_text=None, __name__=None, __opts__=None)

Provides a Glue Catalog Table Resource. You can refer to the Glue Developer Guide for a full explanation of the Glue Data Catalog functionality.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • catalog_id (pulumi.Input[str]) – ID of the Glue Catalog and database to create the table in. If omitted, this defaults to the AWS Account ID plus the database name.
  • database_name (pulumi.Input[str]) – Name of the metadata database where the table metadata resides. For Hive compatibility, this must be all lowercase.
  • description (pulumi.Input[str]) – Description of the table.
  • name (pulumi.Input[str]) – Name of the table.
  • owner (pulumi.Input[str]) – Owner of the table.
  • parameters (pulumi.Input[dict]) – A map of properties associated with this table, in key-value form.
  • partition_keys (pulumi.Input[list]) – A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.
  • retention (pulumi.Input[float]) – Retention time for this table.
  • storage_descriptor (pulumi.Input[dict]) – A storage descriptor object containing information about the physical storage of this table. You can refer to the Glue Developer Guide for a full explanation of this object.
  • table_type (pulumi.Input[str]) – The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).
  • view_expanded_text (pulumi.Input[str]) – If the table is a view, the expanded text of the view; otherwise null.
  • view_original_text (pulumi.Input[str]) – If the table is a view, the original text of the view; otherwise null.
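
A minimal sketch of preparing arguments for this resource, assuming pulumi_aws is installed and the code runs inside a Pulumi program. The helper, the resource name, and the table and database names are hypothetical. The lowercase check mirrors the Hive compatibility requirement stated for database_name. The storage_descriptor argument is left out because its keys are not described in this section.

```python
def catalog_table_args(name, database_name, table_type=None, parameters=None):
    """Collect CatalogTable keyword arguments after checking database_name."""
    if database_name != database_name.lower():
        raise ValueError("database_name must be all lowercase for Hive compatibility")
    args = {
        "name": name,
        "database_name": database_name,
        "table_type": table_type,
        "parameters": parameters,
    }
    return {key: value for key, value in args.items() if value is not None}


def create_table():
    """Create the resource; call this only from inside a Pulumi program."""
    import pulumi_aws as aws  # imported lazily so the helper above works without it

    return aws.glue.CatalogTable(
        "example-table",  # hypothetical resource_name
        **catalog_table_args("events", "example_db", table_type="EXTERNAL_TABLE")
    )
```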
catalog_id = None

ID of the Glue Catalog and database to create the table in. If omitted, this defaults to the AWS Account ID plus the database name.

database_name = None

Name of the metadata database where the table metadata resides. For Hive compatibility, this must be all lowercase.

description = None

Description of the table.

name = None

Name of the table.

owner = None

Owner of the table.

parameters = None

A map of properties associated with this table, in key-value form.

partition_keys = None

A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.

retention = None

Retention time for this table.

storage_descriptor = None

A storage descriptor object containing information about the physical storage of this table. You can refer to the Glue Developer Guide for a full explanation of this object.

table_type = None

The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).

view_expanded_text = None

If the table is a view, the expanded text of the view; otherwise null.

view_original_text = None

If the table is a view, the original text of the view; otherwise null.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
class pulumi_aws.glue.Classifier(resource_name, opts=None, grok_classifier=None, json_classifier=None, name=None, xml_classifier=None, __name__=None, __opts__=None)

Provides a Glue Classifier resource.

NOTE: It is only valid to create one type of classifier (grok, JSON, or XML). Changing classifier types will recreate the classifier.
Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • grok_classifier (pulumi.Input[dict]) – A classifier that uses grok patterns. Defined below.
  • json_classifier (pulumi.Input[dict]) – A classifier for JSON content. Defined below.
  • name (pulumi.Input[str]) – The name of the classifier.
  • xml_classifier (pulumi.Input[dict]) – A classifier for XML content. Defined below.
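
The one-classifier-type rule from the note above can be checked before the resource is declared. The helper below is a hypothetical sketch, not part of pulumi_aws. It treats each classifier dict as opaque, because the keys inside those dicts are not listed in this section.

```python
def classifier_args(name, grok_classifier=None, json_classifier=None, xml_classifier=None):
    """Return Classifier keyword arguments, requiring exactly one classifier type."""
    candidates = {
        "grok_classifier": grok_classifier,
        "json_classifier": json_classifier,
        "xml_classifier": xml_classifier,
    }
    chosen = {key: value for key, value in candidates.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError(
            "exactly one of grok_classifier, json_classifier, xml_classifier "
            "must be set, got {}".format(len(chosen))
        )
    args = {"name": name}
    args.update(chosen)
    return args
```

Inside a Pulumi program the result could be passed on as pulumi_aws.glue.Classifier("example", **classifier_args(...)).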
grok_classifier = None

A classifier that uses grok patterns. Defined below.

json_classifier = None

A classifier for JSON content. Defined below.

name = None

The name of the classifier.

xml_classifier = None

A classifier for XML content. Defined below.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
class pulumi_aws.glue.Connection(resource_name, opts=None, catalog_id=None, connection_properties=None, connection_type=None, description=None, match_criterias=None, name=None, physical_connection_requirements=None, __name__=None, __opts__=None)

Provides a Glue Connection resource.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • catalog_id (pulumi.Input[str]) – The ID of the Data Catalog in which to create the connection. If none is supplied, the AWS account ID is used by default.
  • connection_properties (pulumi.Input[dict]) – A map of key-value pairs used as parameters for this connection.
  • connection_type (pulumi.Input[str]) – The type of the connection. Defaults to JDBC.
  • description (pulumi.Input[str]) – Description of the connection.
  • match_criterias (pulumi.Input[list]) – A list of criteria that can be used in selecting this connection.
  • name (pulumi.Input[str]) – The name of the connection.
  • physical_connection_requirements (pulumi.Input[dict]) – A map of physical connection requirements, such as VPC and SecurityGroup. Defined below.
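
A minimal sketch of a JDBC connection, assuming pulumi_aws is installed and the code runs inside a Pulumi program. The property keys JDBC_CONNECTION_URL, USERNAME, and PASSWORD come from the AWS Glue API rather than from this section, and the helper, resource name, URL, and credentials are hypothetical placeholders.

```python
def jdbc_connection_args(name, url, username, password):
    """Collect Connection keyword arguments for a JDBC data store."""
    return {
        "name": name,
        "connection_type": "JDBC",
        "connection_properties": {
            "JDBC_CONNECTION_URL": url,
            "USERNAME": username,
            "PASSWORD": password,
        },
    }


def create_connection(password):
    """Create the resource; call this only from inside a Pulumi program."""
    import pulumi_aws as aws  # imported lazily so the helper above works without it

    return aws.glue.Connection(
        "example-connection",  # hypothetical resource_name
        **jdbc_connection_args(
            "example_connection",
            "jdbc:mysql://example.invalid:3306/exampledb",
            "example_user",
            password,
        )
    )
```

Taking the password as an argument keeps the secret out of the source file; how it is supplied is left to the caller.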
catalog_id = None

The ID of the Data Catalog in which to create the connection. If none is supplied, the AWS account ID is used by default.

connection_properties = None

A map of key-value pairs used as parameters for this connection.

connection_type = None

The type of the connection. Defaults to JDBC.

description = None

Description of the connection.

match_criterias = None

A list of criteria that can be used in selecting this connection.

name = None

The name of the connection.

physical_connection_requirements = None

A map of physical connection requirements, such as VPC and SecurityGroup. Defined below.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
class pulumi_aws.glue.Crawler(resource_name, opts=None, classifiers=None, configuration=None, database_name=None, description=None, dynamodb_targets=None, jdbc_targets=None, name=None, role=None, s3_targets=None, schedule=None, schema_change_policy=None, security_configuration=None, table_prefix=None, __name__=None, __opts__=None)

Manages a Glue Crawler. More information can be found in the AWS Glue Developer Guide.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • classifiers (pulumi.Input[list]) – List of custom classifiers. By default, all AWS classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
  • configuration (pulumi.Input[str]) – JSON string of configuration information.
  • database_name (pulumi.Input[str]) – Glue database where results are written.
  • description (pulumi.Input[str]) – Description of the crawler.
  • dynamodb_targets (pulumi.Input[list]) – List of nested DynamoDB target arguments. See below.
  • jdbc_targets (pulumi.Input[list]) – List of nested JDBC target arguments. See below.
  • name (pulumi.Input[str]) – Name of the crawler.
  • role (pulumi.Input[str]) – The IAM role friendly name (including path without leading slash), or ARN of an IAM role, used by the crawler to access other resources.
  • s3_targets (pulumi.Input[list]) – List of nested Amazon S3 target arguments. See below.
  • schedule (pulumi.Input[str]) – A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).
  • schema_change_policy (pulumi.Input[dict]) – Policy for the crawler’s update and deletion behavior.
  • security_configuration (pulumi.Input[str]) – The name of the Security Configuration to be used by the crawler.
  • table_prefix (pulumi.Input[str]) – The table prefix used for catalog tables that are created.
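
The schedule format described above can be generated rather than typed by hand. The helper below is a hypothetical sketch that builds a once-a-day expression in the documented cron(...) form. In the second function, the resource name, database name, role, bucket path, and the "path" key of the S3 target are assumptions, since the nested target arguments are not listed in this section.

```python
def daily_cron(hour, minute):
    """Build a cron expression that fires once a day at hour:minute UTC."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Field order: minutes, hours, day of month, month, day of week, year.
    return "cron({} {} * * ? *)".format(minute, hour)


def create_crawler():
    """Create the resource; call this only from inside a Pulumi program."""
    import pulumi_aws as aws  # imported lazily so daily_cron works without it

    return aws.glue.Crawler(
        "example-crawler",  # hypothetical resource_name
        database_name="example_db",
        role="example-glue-role",
        schedule=daily_cron(12, 15),
        s3_targets=[{"path": "s3://example-bucket/data"}],  # key name assumed
    )
```

Calling daily_cron(12, 15) reproduces the 12:15 UTC example from the schedule description, cron(15 12 * * ? *).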
arn = None

The ARN of the crawler.

classifiers = None

List of custom classifiers. By default, all AWS classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.

configuration = None

JSON string of configuration information.

database_name = None

Glue database where results are written.

description = None

Description of the crawler.

dynamodb_targets = None

List of nested DynamoDB target arguments. See below.

jdbc_targets = None

List of nested JDBC target arguments. See below.

name = None

Name of the crawler.

role = None

The IAM role friendly name (including path without leading slash), or ARN of an IAM role, used by the crawler to access other resources.

s3_targets = None

List of nested Amazon S3 target arguments. See below.

schedule = None

A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers. For example, to run something every day at 12:15 UTC, you would specify: cron(15 12 * * ? *).

schema_change_policy = None

Policy for the crawler’s update and deletion behavior.

security_configuration = None

The name of the Security Configuration to be used by the crawler.

table_prefix = None

The table prefix used for catalog tables that are created.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
class pulumi_aws.glue.GetScriptResult(dag_edges=None, dag_nodes=None, language=None, python_script=None, scala_code=None, id=None)

A collection of values returned by get_script.

python_script = None

The Python script generated from the DAG when the language argument is set to PYTHON.

scala_code = None

The Scala code generated from the DAG when the language argument is set to SCALA.

id = None

id is the provider-assigned unique ID for this managed resource.

class pulumi_aws.glue.Job(resource_name, opts=None, allocated_capacity=None, command=None, connections=None, default_arguments=None, description=None, execution_property=None, max_capacity=None, max_retries=None, name=None, role_arn=None, security_configuration=None, timeout=None, __name__=None, __opts__=None)

Provides a Glue Job resource.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • allocated_capacity (pulumi.Input[float]) – DEPRECATED (Optional) The number of AWS Glue data processing units (DPUs) to allocate to this Job. At least 2 DPUs need to be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory.
  • command (pulumi.Input[dict]) – The command of the job. Defined below.
  • connections (pulumi.Input[list]) – The list of connections used for this job.
  • default_arguments (pulumi.Input[dict]) – The map of default arguments for this job. You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes. For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide. For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.
  • description (pulumi.Input[str]) – Description of the job.
  • execution_property (pulumi.Input[dict]) – Execution property of the job. Defined below.
  • max_capacity (pulumi.Input[float]) – The maximum number of AWS Glue data processing units (DPUs) that can be allocated when this job runs.
  • max_retries (pulumi.Input[float]) – The maximum number of times to retry this job if it fails.
  • name (pulumi.Input[str]) – The name of the job.
  • role_arn (pulumi.Input[str]) – The ARN of the IAM role associated with this job.
  • security_configuration (pulumi.Input[str]) – The name of the Security Configuration to be associated with the job.
  • timeout (pulumi.Input[float]) – The job timeout in minutes. The default is 2880 minutes (48 hours).
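
The capacity and timeout figures above reduce to simple arithmetic. The sketch below encodes the documented constants (4 vCPUs and 16 GB of memory per DPU, a 2880-minute default timeout); the function names are hypothetical.

```python
VCPUS_PER_DPU = 4
MEMORY_GB_PER_DPU = 16
DEFAULT_TIMEOUT_MINUTES = 2880


def dpu_resources(dpus):
    """Return the compute and memory represented by a number of DPUs."""
    return {
        "vcpus": dpus * VCPUS_PER_DPU,
        "memory_gb": dpus * MEMORY_GB_PER_DPU,
    }


def timeout_hours(minutes=DEFAULT_TIMEOUT_MINUTES):
    """Convert a job timeout from minutes to hours."""
    return minutes / 60
```

For the documented allocated_capacity default of 10 DPUs, dpu_resources(10) gives 40 vCPUs and 160 GB of memory, and timeout_hours() gives 48.0.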
allocated_capacity = None

DEPRECATED (Optional) The number of AWS Glue data processing units (DPUs) to allocate to this Job. At least 2 DPUs need to be allocated; the default is 10. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory.

command = None

The command of the job. Defined below.

connections = None

The list of connections used for this job.

default_arguments = None

The map of default arguments for this job. You can specify arguments here that your own job-execution script consumes, as well as arguments that AWS Glue itself consumes. For information about how to specify and consume your own Job arguments, see the Calling AWS Glue APIs in Python topic in the developer guide. For information about the key-value pairs that AWS Glue consumes to set up your job, see the Special Parameters Used by AWS Glue topic in the developer guide.

description = None

Description of the job.

execution_property = None

Execution property of the job. Defined below.

max_capacity = None

The maximum number of AWS Glue data processing units (DPUs) that can be allocated when this job runs.

max_retries = None

The maximum number of times to retry this job if it fails.

name = None

The name of the job.

role_arn = None

The ARN of the IAM role associated with this job.

security_configuration = None

The name of the Security Configuration to be associated with the job.

timeout = None

The job timeout in minutes. The default is 2880 minutes (48 hours).

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
class pulumi_aws.glue.SecurityConfiguration(resource_name, opts=None, encryption_configuration=None, name=None, __name__=None, __opts__=None)

Manages a Glue Security Configuration.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • encryption_configuration (pulumi.Input[dict]) – Configuration block containing encryption configuration. Detailed below.
  • name (pulumi.Input[str]) – Name of the security configuration.
encryption_configuration = None

Configuration block containing encryption configuration. Detailed below.

name = None

Name of the security configuration.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
class pulumi_aws.glue.Trigger(resource_name, opts=None, actions=None, description=None, enabled=None, name=None, predicate=None, schedule=None, type=None, __name__=None, __opts__=None)

Manages a Glue Trigger resource.

Parameters:
  • resource_name (str) – The name of the resource.
  • opts (pulumi.ResourceOptions) – Options for the resource.
  • actions (pulumi.Input[list]) – List of actions initiated by this trigger when it fires. Defined below.
  • description (pulumi.Input[str]) – A description of the new trigger.
  • enabled (pulumi.Input[bool]) – Whether to start the trigger. Defaults to true. A trigger of type ON_DEMAND cannot be disabled.
  • name (pulumi.Input[str]) – The name of the trigger.
  • predicate (pulumi.Input[dict]) – A predicate to specify when the new trigger should fire. Required when trigger type is CONDITIONAL. Defined below.
  • schedule (pulumi.Input[str]) – A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers.
  • type (pulumi.Input[str]) – The type of trigger. Valid values are CONDITIONAL, ON_DEMAND, and SCHEDULED.
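
The constraints stated in the parameter list can be checked before a trigger is declared. The helper below is a hypothetical sketch, not part of pulumi_aws, and covers only the three rules given in this section: the valid type values, the predicate requirement for CONDITIONAL triggers, and the ban on disabling ON_DEMAND triggers.

```python
VALID_TRIGGER_TYPES = ("CONDITIONAL", "ON_DEMAND", "SCHEDULED")


def trigger_problems(trigger_type, predicate=None, enabled=True):
    """Return a list of rule violations for the given Trigger arguments."""
    problems = []
    if trigger_type not in VALID_TRIGGER_TYPES:
        problems.append("type must be one of " + ", ".join(VALID_TRIGGER_TYPES))
    if trigger_type == "CONDITIONAL" and predicate is None:
        problems.append("predicate is required when type is CONDITIONAL")
    if trigger_type == "ON_DEMAND" and not enabled:
        problems.append("an ON_DEMAND trigger cannot be disabled")
    return problems
```

An empty list means the arguments pass these checks; it does not guarantee that the provider accepts them.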
actions = None

List of actions initiated by this trigger when it fires. Defined below.

description = None

A description of the new trigger.

enabled = None

Whether to start the trigger. Defaults to true. A trigger of type ON_DEMAND cannot be disabled.

name = None

The name of the trigger.

predicate = None

A predicate to specify when the new trigger should fire. Required when trigger type is CONDITIONAL. Defined below.

schedule = None

A cron expression used to specify the schedule. For more information, see Time-Based Schedules for Jobs and Crawlers.

type = None

The type of trigger. Valid values are CONDITIONAL, ON_DEMAND, and SCHEDULED.

translate_output_property(prop)

Provides subclasses of Resource an opportunity to translate names of output properties into a format of their choosing before writing those properties to the resource object.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
translate_input_property(prop)

Provides subclasses of Resource an opportunity to translate names of input properties into a format of their choosing before sending those properties to the Pulumi engine.

Parameters: prop (str) – A property name.
Returns: A potentially transformed property name.
Return type: str
pulumi_aws.glue.get_script(dag_edges=None, dag_nodes=None, language=None, opts=None)

Use this data source to generate a Glue script from a Directed Acyclic Graph (DAG).