/create/datasource

URL: http://<db.host>:<db.port>/create/datasource

Creates a data source, which contains the location and connection information for a data store that is external to the database.

Input Parameter Description

Name | Type | Description
name | string | Name of the data source to be created.
location | string | Location of the remote storage in 'storage_provider_type://[storage_path[:storage_port]]' format. Supported storage provider types are 'azure', 'gcs', 'hdfs', 'jdbc', 'kafka', 'confluent' and 's3'.
user_name | string | Name of the remote system user; may be an empty string.
password | string | Password for the remote system user; may be an empty string.
options | map of string to strings | Optional parameters. The default value is an empty map ( {} ).

Supported Parameters (keys) | Parameter Description
skip_validation | Bypass validation of the connection to the remote source. The default value is false. The supported values are true and false.
connection_timeout | Timeout in seconds for connecting to this storage provider.
wait_timeout | Timeout in seconds for reading from this storage provider.
credential | Name of the credential object to be used in the data source.
s3_bucket_name | Name of the Amazon S3 bucket to use as the data source.
s3_region | Name of the Amazon S3 region where the given bucket is located.
s3_verify_ssl | Set to false for testing purposes or when necessary to bypass TLS errors (e.g. self-signed certificates). The default value is true. The supported values are true and false.
s3_use_virtual_addressing | Whether to use virtual addressing when referencing the Amazon S3 source. The default value is true. Supported values: true (request URIs are specified in virtual-hosted-style format, where the bucket name is part of the domain name in the URL); false (path-style URIs are used for requests).
s3_aws_role_arn | Amazon IAM Role ARN which has the required S3 permissions and can be assumed for the given S3 IAM user.
s3_encryption_customer_algorithm | Customer encryption algorithm used for encrypting data.
s3_encryption_customer_key | Customer encryption key used to encrypt or decrypt data.
hdfs_kerberos_keytab | Kerberos keytab file location for the given HDFS user. This may be a KIFS file.
hdfs_delegation_token | Delegation token for the given HDFS user.
hdfs_use_kerberos | Use Kerberos authentication for the given HDFS cluster. The default value is false. The supported values are true and false.
azure_storage_account_name | Name of the Azure storage account to use as the data source; this is valid only if tenant_id is specified.
azure_container_name | Name of the Azure storage container to use as the data source.
azure_tenant_id | Active Directory tenant ID (or directory ID).
azure_sas_token | Shared access signature token for the Azure storage account to use as the data source.
azure_oauth_token | OAuth token to access the given storage container.
gcs_bucket_name | Name of the Google Cloud Storage bucket to use as the data source.
gcs_project_id | Name of the Google Cloud project to use as the data source.
gcs_service_account_keys | Google Cloud service account keys to use for authenticating the data source.
is_stream | Whether to load from Azure/GCS/S3 continuously as a stream. The default value is false. The supported values are true and false.
kafka_topic_name | Name of the Kafka topic to use as the data source.
jdbc_driver_jar_path | JDBC driver jar file location. This may be a KIFS file.
jdbc_driver_class_name | Name of the JDBC driver class.
anonymous | DEPRECATED: use an anonymous connection to the storage provider. This is now the default; specify use_managed_credentials for a non-anonymous connection. The default value is true. The supported values are true and false.
use_managed_credentials | When no credentials are supplied, anonymous access is used by default. If this is set, the cloud provider user settings are used instead. The default value is false. The supported values are true and false.
use_https | Use HTTPS to connect to the data source if true; otherwise use HTTP. The default value is true. The supported values are true and false.
schema_registry_location | Location of the Confluent Schema Registry in '[storage_path[:storage_port]]' format.
schema_registry_credential | Confluent Schema Registry credential object name.
schema_registry_port | Confluent Schema Registry port (optional).
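
To illustrate how the input parameters fit together, here is a minimal Python sketch that assembles a request body for an S3 data source. It assumes the endpoint accepts a JSON object sent by HTTP POST, which this page does not state. The helper names, data source name, bucket, region, and credential name are all hypothetical. Because options is a map of string to strings, boolean and numeric values are converted to strings such as 'true' and '30'.

```python
import json

# Storage provider types listed for the location parameter.
SUPPORTED_PROVIDERS = {"azure", "gcs", "hdfs", "jdbc", "kafka", "confluent", "s3"}


def endpoint_url(host, port):
    """Fill in the documented URL pattern http://<db.host>:<db.port>/create/datasource."""
    return f"http://{host}:{port}/create/datasource"


def build_create_datasource_request(name, location, user_name="", password="", options=None):
    """Return a dict holding the five documented input parameters.

    Hypothetical helper: it only checks the provider prefix of `location`
    and converts every option value to a string.
    """
    provider, separator, _ = location.partition("://")
    if not separator or provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported location: {location!r}")

    string_options = {}
    for key, value in (options or {}).items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        string_options[str(key)] = str(value)

    return {
        "name": name,
        "location": location,
        "user_name": user_name,
        "password": password,
        "options": string_options,
    }


# All values below are placeholders chosen for illustration.
request = build_create_datasource_request(
    name="example_s3_ds",
    location="s3://",
    options={
        "s3_bucket_name": "example-bucket",
        "s3_region": "us-east-1",
        "credential": "example_credential",
        "skip_validation": True,
        "connection_timeout": 30,
    },
)
body = json.dumps(request)  # assumed wire format: a JSON object
```

The resulting body string could then be posted to the address returned by endpoint_url using any HTTP client. The authentication scheme and required headers depend on the deployment and are not covered here.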

Output Parameter Description

The GPUdb server embeds the endpoint response inside a standard response structure which contains status information and the actual response to the query. Here is a description of the various fields of the wrapper:

Name | Type | Description
status | String | 'OK' or 'ERROR'
message | String | Empty on success; otherwise an error message
data_type | String | 'create_datasource_response', or 'none' in case of an error
data | String | Empty string
data_str | JSON or String | See below.

The JSON embedded in data_str represents the result of the /create/datasource endpoint:

Name | Type | Description
name | string | Value of input parameter name.
info | map of string to strings | Additional information.

Empty string in case of an error.
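
The wrapper described above can be unpacked in two steps: check status, then decode data_str. The Python sketch below shows one way to do that. It assumes the wrapper arrives as JSON text and that data_str may be either a JSON-encoded string or an already-decoded object. The function name, exception class, and sample values are hypothetical.

```python
import json


class CreateDatasourceError(Exception):
    """Raised when the wrapper reports a status other than 'OK'."""


def parse_create_datasource_response(raw_text):
    """Return the embedded response (name and info) from the wrapper text."""
    wrapper = json.loads(raw_text)
    if wrapper.get("status") != "OK":
        # On failure, message carries the error text and data_str is empty.
        raise CreateDatasourceError(wrapper.get("message", ""))

    embedded = wrapper.get("data_str", "")
    if isinstance(embedded, str):
        # data_str may be a JSON document encoded as a string.
        embedded = json.loads(embedded) if embedded else {}
    return embedded


# Hand-written sample wrapper, not captured from a real server.
sample_ok = json.dumps({
    "status": "OK",
    "message": "",
    "data_type": "create_datasource_response",
    "data": "",
    "data_str": json.dumps({"name": "example_s3_ds", "info": {}}),
})

result = parse_create_datasource_response(sample_ok)
print(result["name"])
```

Raising an exception on 'ERROR' keeps the failure path separate from the normal return value, so callers never need to inspect an empty data_str.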