Version:

SQL Support

Kinetica has broad support for the SQL-92 standard through its ODBC connector interface. For details on installation, configuration, & use, see the The ODBC/JDBC Connector section.

Query

The basic form of the supported SELECT statement is:

SELECT [DISTINCT | TOP <n>] <column expression list>
FROM [<schema name>.]<table name>
    [<join type> JOIN <join table name> ON <join expression>],...
[WHERE <filtering expression list>]
[GROUP BY <grouping expression list>]
[HAVING <group filtering expression list>]
[ORDER BY <ordering expression list>]
[LIMIT [<offset>, ]<num rows>]

Note

  • The * can be used to specify all columns in the column expression list.
  • Columns can be double-quoted to induce case-sensitivity and to use reserved words as column names; e.g., "PERCENT".
  • TOP <n> returns the first n records (up to 20000 records by default), but is configurable.
  • See Joins for more detail & limitations on joins. Supported join type keywords are:
    • INNER
    • CROSS
    • LEFT
    • RIGHT
    • FULL OUTER
  • The grouping expression list may contain column names, expressions, or column positions (e.g., GROUP BY 2 to aggregate on the 2nd column in the SELECT list).
  • Ordering is NULLS FIRST. Changing this behavior is not currently supported.
  • LIMIT applies paging to the result set, starting at the 0-based offset (if specified) and returning num rows records.

For example:

SELECT
    e.first_name || ' ' || e.last_name as "Employee_Name",
    m.first_name || ' ' || m.last_name as "Manager_Name"
FROM
    emp e
LEFT JOIN
    emp m ON e.manager_id = m.id
WHERE
    e.dept_id = 1
ORDER BY
    e.hire_date

Tableless Query

A query without a FROM clause can be used to return a single row of data containing a constant or constant expression.

For example, to select the value of pi using the arccosine function:

SELECT acos(0) * 2

Aggregation

The GROUP BY clause can be used to segment data into groups and apply aggregate functions over the values within each group. Aggregation functions applied to data without a GROUP BY clause will be applied over the entire result set.

For example, to find the average salary across all employees:

SELECT AVG(sal)
FROM emp

To find the average salary of employees within each department:

SELECT dept_id, AVG(sal)
FROM emp
GROUP BY dept_id
ORDER BY dept_id

Set Operations

UNION

The UNION set operator creates a single list of records from the results of two SELECT statements. Use the ALL keyword to keep all records from both sets; omit it to remove duplicate records and form a single list of records unique between the two sets. See Limitations and Cautions for limitations.

<SELECT statement>
UNION [ALL]
<SELECT statement>

INTERSECT

The INTERSECT set operator creates a single list of records that exist in both of the result sets from two SELECT statements. See Limitations for limitations.

<SELECT statement>
INTERSECT
<SELECT statement>

EXCEPT

The EXCEPT set operator performs set subtraction, creating a single list of records that exist in the first SELECT statement's result set, but not in the second SELECT statement's result set. See Limitations for limitations:

<SELECT statement>
EXCEPT
<SELECT statement>

WITH (Common Table Expressions)

The WITH set operation, also known as a Common Table Expression (CTE) creates a set of data that can be aliased and used one or more times in subsequent operations. The aliased set can be used within the SELECT, FROM, or WHERE clauses of a subsequent query; a subsequent CTE within the same WITH operation; or an INSERT, UPDATE, or DELETE statement.

Recursive WITH operations are not supported--the aliased set cannot refer to itself. The alias must be unique within the WITH statement--no other column or column alias can be similarly named, for example. Also, when used in a FROM clause and given a table alias, the table alias must be preceded with AS.

Each CTE definition within a WITH statement is structured as follows:

<cte name> [(column list)] AS (<SELECT statement>)

Each WITH statement can contain one or more CTE definitions, followed by a SELECT, INSERT, UPDATE, or DELETE statement, as shown here:

WITH <cte definition>,...
<SELECT | INSERT | UPDATE | DELETE statement>

For example:

WITH
    dept2_emp_sal_by_mgr (manager_id, sal) AS
    (
        SELECT manager_id, sal
        FROM emp
        WHERE dept_id = 2
    )
SELECT
    manager_id dept2_mgr_id,
    MAX(sal) dept2_highest_emp_sal_per_mgr,
    COUNT(*) as dept2_total_emp_per_mgr
FROM dept2_emp_sal_by_mgr
GROUP BY manager_id
WITH
    dept2_emp AS
    (
        SELECT first_name, last_name, manager_id
        FROM emp
        WHERE dept_id = 2
    ),
    dept2_mgr AS
    (
        SELECT first_name, last_name, id
        FROM emp
        WHERE dept_id = 2
    )
INSERT INTO dept2_emp_mgr_roster (emp_first_name, emp_last_name, mgr_first_name, mgr_last_name)
SELECT d2emp.first_name, d2emp.last_name, d2mgr.first_name, d2mgr.last_name
FROM
    dept2_emp as d2emp
    JOIN dept2_mgr as d2mgr ON d2emp.manager_id = d2mgr.id

Expressions

An expression can consist of a literal constant, a column name, or a function applied to a constant or column name. A compound expression is an operation or function applied to one or more expressions.

The following are the supported expression operators:

  • + addition
  • - subtraction
  • * multiplication
  • / division
  • () grouping
  • || string concatenation

Note

Use double quotes to specify column names in a case-sensitive manner.

Conditional Functions

Scalar Function Details
DECODE(expr, match_a, value_a, ..., match_N, value_N) Evaluates expr: returns the first value whose corresponding match is equal to expr
IF(expr, value_if_true, value_if_false))

Evaluates expr: if true, returns value_if_true; otherwise, value_if_false

  • expr - any true/false condition. Note that when an integer column is used directly, this function will interpret non-zero values as true and zero values as false)
  • value_if_true - any numeric value
  • value_if_false - any numeric value

Conversion Functions

Conversion Function Description

CAST(expr AS [SQL_]<conv_type>)

or

CONVERT(expr, SQL_<conv_type>)

Converts expr into conv_type data type. See Conversion Types for the complete set of data types.

Note: CONVERT requires the SQL_ prefix, while it is optional with CAST; however, CAST does require the prefix for the following data types:

  • SQL_TYPE_DATE
  • SQL_TYPE_TIME
  • SQL_TYPE_TIMESTAMP
  • SQL_WVARCHAR
CHAR256(charN) Converts the given charN to char256 type
CHAR128(charN) Converts the given charN to char128 type
CHAR64(charN) Converts the given charN to char64 type
CHAR32(charN) Converts the given charN to char32 type
CHAR16(charN) Converts the given charN to char16 type
CHAR8(charN) Converts the given charN to char8 type
CHAR4(charN) Converts the given charN to char4 type
CHAR2(charN) Converts the given charN to char2 type
CHAR1(charN) Converts the given charN to char1 type
CHAR(int) Returns the character associated with the ASCII code in int

Conversion Types

Numeric String Date/Time
  • BIGINT
  • DECIMAL
  • DOUBLE
  • FLOAT
  • INTEGER
  • NUMERIC
  • REAL
  • SMALLINT
  • TINYINT
  • CHAR
  • LONGVARCHAR
  • VARCHAR
  • WVARCHAR
  • DATE
  • TIME
  • TIMESTAMP
  • TYPE_DATE
  • TYPE_TIME
  • TYPE_TIMESTAMP

Date/Time Functions

Date/Time Function Description
CURRENT_DATE() Returns the date as YYYY-MM-DD
CURRENT_DATETIME() Returns the date & time as YYYY-MM-DD HH24:MI:SS.mmm
CURRENT_TIME() Returns the time as HH24:MI:SS.mmm
CURRENT_TIMESTAMP() Returns the date & time as the number of milliseconds since the epoch
DATE(expr) Returns date in the format YYYY-MM-DD from expr
DATEDIFF(expr_end, expr_begin) Determines the difference between two dates, irrespective of time component, as the number of days when expr_begin is subtracted from expr_end; returns a negative number of days if expr_begin occurs after expr_end
DATETIME(expr) Returns expr (as a string) as a datetime (YYYY-MM-DD HH:MM:SS.mmm)
DAY(expr) Alias for DAYOFMONTH
DAYNAME(expr) Extracts the day of the week from expr and converts it to the corresponding day name [Sunday - Saturday ]
DAYOFMONTH(expr) Extracts the day of the month from expr [1 - 31]
DAYOFWEEK(expr) Extracts the day of the week from expr [1 - 7]; (1 = Sunday)
DAY_OF_WEEK(expr) Synonymous with DAYOFWEEK(expr)
DAYOFYEAR(expr) Extracts the day of the year from expr [1 - 366]
DAY_OF_YEAR(expr) Synonymous with DAYOFYEAR(expr)
HOUR(expr) Extracts the hour of the day from expr [0 - 23]

<expr> + INTERVAL <amount> <part>

<expr> - INTERVAL <amount> <part>

Adds to or subtracts from the date/time expr the integral amount units of type part. This mirrors the behavior of the TIMESTAMPADD function, only with a different format and different date/time part constants. The following date/time constants are supported for part:

Constant Description
YEAR Year is modified by interval amount (not affected by leap year, etc.)
MONTH Month is modified by interval amount and date adjusted if overflow/underflow occurs; day adjusted to last day of calculated month if not a valid day for that month (e.g. Apr 31st -> Apr 30th)
DAY Day is modified by interval amount (time not affected by daylight savings time, etc.); date is adjusted, if overflow/underflow occurs
HOUR Hour is modified by interval amount (time not affected by daylight savings time, etc.); date is adjusted, if overflow/underflow occurs
MINUTE Minute is modified by interval amount; date/time are adjusted, if overflow/underflow occurs
SECOND Second is modified by interval amount; date/time are adjusted, if overflow/underflow occurs
MILLISECOND Millisecond is modified by interval amount; date/time are adjusted, if overflow/underflow occurs
MICROSECOND Microsecond is modified by interval amount; date/time are adjusted, if overflow/underflow occurs
QUARTER Month is modified by three times the interval amount, irrespective of the number of days in the months between; day adjusting performed the same way as in MONTH description, but only on final month (i.e., Jan 31st + 1 quarter will be Apr 30th, not Apr 28th because of February)
WEEK Day is modified by 7 times the interval amount (time not affected by daylight savings time, etc.); month & year are adjusted, if overflow/underflow occurs
MINUTE(expr) Extracts the minute of the day from expr [0 - 59]
MONTH(expr) Extracts the month of the year from expr [1 - 12]
MONTHNAME(expr) Extracts the month of the year from expr and converts it to the corresponding month name [January - December ]
MSEC(expr) Extracts the millsecond of the second from expr [0 - 999]
QUARTER(expr) Extracts the quarter of the year from expr [1 - 4]; (1 = January, February, & March)
SECOND(expr) Extracts the seconds of the minute from expr [0 - 59]
SEC(expr) Alias for SECOND(expr)
TIME(expr) Returns the time (HH:MM:SS) from the expr
TIMESTAMP(expr) Returns the timestamp from the expr (as string)
TIMESTAMPADD(ts_part, ts_amount, expr)

Adds the positive or negative integral ts_amount units of type ts_part to expr. The following date/time constants are supported for ts_part:

Constant Description
SQL_TSI_YEAR Year is modified by interval amount (not affected by leap year, etc.)
SQL_TSI_MONTH Month is modified by interval amount and date adjusted if overflow/underflow occurs; day adjusted to last day of calculated month if not a valid day for that month (e.g. Apr 31st -> Apr 30th)
SQL_TSI_DAY Day is modified by interval amount (time not affected by daylight savings time, etc.); date is adjusted, if overflow/underflow occurs
SQL_TSI_HOUR Hour is modified by interval amount (time not affected by daylight savings time, etc.); date is adjusted, if overflow/underflow occurs
SQL_TSI_MINUTE Minute is modified by interval amount; date/time are adjusted, if overflow/underflow occurs
SQL_TSI_SECOND Second is modified by interval amount; date/time are adjusted, if overflow/underflow occurs
SQL_TSI_FRAC_SECOND Nanosecond is modified by interval amount; date/time are adjusted, if overflow/underflow occurs
SQL_TSI_QUARTER Month is modified by three times the interval amount, irrespective of the number of days in the months between; day adjusting performed the same as the SQL_TSI_MONTH description, but only on final month (i.e., Jan 31st + 1 quarter will be Apr 30th, not Apr 28th because of February)
SQL_TSI_WEEK Day is modified by 7 times the interval amount (time not affected by daylight savings time, etc.); month & year are adjusted, if overflow/underflow occurs
TIMESTAMPDIFF(ts_part, expr_begin, expr_end)

Determines the difference between two dates, calculating the result in the units specified; more precisely, how many units of ts_part need to be added to or subtracted from expr_begin to equal expr_end (or get as close as possible to after going past) using the rules specified in TIMESTAMPADD. The ts_part constants are the same as in TIMESTAMPADD.

Note: This is symmetric with TIMESTAMPADD in all cases; e.g., adding 1 SQL_TSI_MONTH to Mar 31st results in Apr 30th, and the TIMESTAMPDIFF in SQL_TSI_MONTH units between those two dates is 1.

WEEK(expr) Extracts the week of the year from expr [1 - 53]; each full week starts on Sunday (1 = week containing Jan 1st)
YEAR(expr) Extracts the year from expr; 4-digit year, A.D.

Timestamp/Date Conversion Functions

Function Details
DATE_TO_EPOCH_SECS(year, month, day, hours, minutes, seconds) Converts the full date to seconds since the epoch. Negative values are accepted (e.g., DATE_TO_EPOCH_SECS(2017,06,-15,09,22,15) would return 1494926535, which resolves to Tuesday, May 16, 2017 9:22:15 AM)
DATE_TO_EPOCH_MSECS(year, month, day, hours, minutes, seconds, milliseconds) Converts the full date to milliseconds since the epoch. Negative values are accepted
WEEK_TO_EPOCH_SECS(year, week_number) Converts the year and week number to seconds since the epoch. Negative values are accepted (e.g., WEEK_TO_EPOCH_SECS(2017,-32) would return 1463270400, which resolves to Sunday, May 15, 2016 12:00:00 AM). Each new week begins Sunday at midnight
WEEK_TO_EPOCH_MSECS(year, week_number) Converts the year and week number to seconds since the epoch. Negative values are accepted
MSECS_SINCE_EPOCH(timestamp) Converts the timestamp to millseconds since the epoch
TIMESTAMP_FROM_DATE_TIME(date, time) converts the date and time (as strings) to timestamp format, e.g., TIMESTAMP_FROM_DATE_TIME('2017-06-15', '10:37:30') would return 1497523050000, which resolves to Thursday, June 15, 2017 10:37:30 AM.

Geospatial Functions

Enhanced Performance Functions

The functions below all compare x and y coordinates to geometry objects (or vice versa), thus increasing their performance in queries. Each of these functions have a geometry-to-geometry version listed in the next section.

Function Description
STXY_CONTAINS(geom, x, y) Returns 1 (true) if geom contains the x and y coordinate, e.g. lies in the interior of geom. The coordinate cannot be on the boundary and also be contained because geom does not contain its boundary
STXY_CONTAINSPROPERLY(x, y, geom) Returns 1 (true) if the x and y coordinate intersects the interior of geom but not the boundary (or exterior) because geom does not contain its boundary but does contain itself
STXY_COVEREDBY(x, y, geom) Returns 1 (true) if the x and y coordinate is covered by geom
STXY_COVERS(geom, x, y) Returns 1 (true) if geom covers the x and y coordinate
STXY_DISJOINT(x, y, geom) Returns 1 (true) if the given x and y coordinate and the geometry geom do not spatially intersect.
STXY_DISTANCE(x, y, geom) Returns the minimum 2-D Cartesian distance (in degrees) between the given x and y coordinate and geom.
STXY_DWITHIN(x, y, geom, distance) Returns 1 (true) if the x and y coordinate is within the specified distance from geom.
STXY_ENVDWITHIN(x, y, geom, distance) Returns 1 (true) if the x and y coordinate is within the specified distance from the bounding box of geom.
STXY_ENVINTERSECTS(x, y, geom) Returns 1 (true) if the bounding box of the given geometry geom intersects the x and y coordinate.
STXY_INTERSECTION(x, y, geom) Returns the shared portion between the x and y coordinate and the given geometry geom, i.e. the point itself.
STXY_INTERSECTS(x, y, geom) Returns 1 (true) if the x and y coordinate and geom intersect in 2-D.
STXY_TOUCHES(x, y, geom) Returns 1 (true) if the x and y coordinate and geometry geom have at least one point in common but their interiors do not intersect.
STXY_WITHIN(x, y, geom) Returns 1 (true) if the x and y coordinate is completely inside the geom geometry i.e., not on the boundary

Geometry Functions

Function Details
DIST(x1, y1, x2, y2) Computes the Euclidean distance (in degrees), i.e. SQRT( (x1-x2)*(x1-x2) + (y1-y2)*(y1-y2) ).
GEODIST(lon1, lat1, lon2, lat2) Computes the geographic great-circle distance (in meters) between two lat/lon points.
ST_ADDPOINT(linestring, point, position) Adds a the given point geometry to the given linestring geometry at the specified position, which is a 0-based index.
ST_ALMOSTEQUALS(geom1, geom2, decimal) Returns 1 (true) if given geometries, geom1 and geom2, are almost spatially equal within the given amount of decimal scale. Note that geometries will still be considered equal if the decimal scale for the geometries is within a half order of magnitude of each other, e.g, if decimal is set to 2, then POINT(63.4 123.45) and POINT(63.4 123.454) are equal, but POINT(63.4 123.45) and POINT(63.4 123.459) are not equal. The geometry types must match to be considered equal.
ST_AREA(geom) Returns the area of the given geometry geom (in degrees) if it is a POLYGON or MULTIPOLYGON. Returns 0 if the input geometry type is (MULTI)POINT or (MULTI)LINESTRING.
ST_BOUNDARY(geom) Returns the closure of the combinatorial boundary of a given geometry geom. Returns an empty geometry if geom is an empty geometry. Returns a null if geom is a GEOMETRYCOLLECTION
ST_BOUNDINGDIAGONAL(geom) Returns the diagonal of the given geometry's (geom) bounding box.
ST_BUFFER(geom, radius, style)

Returns a geometry that represents all points whose distance from the given geometry geom is less than or equal to the given distance radius in degrees. Additional style parameters can be specified as a list of blank-separated key-value pairs, e.g., 'quad_segs=8 endcap=round' :

  • quad_segs -- the number of segments used to approximate a quarter circle (default is 8)
  • endcap -- the endcap style of the buffer (default is round); options are round, flat (or butt), and square
  • join -- the join style of the buffer (default is round); options are round, mitre (or miter), and bevel
  • mitre_limit -- the mitre ratio limit expressed as a floating point number (miter_limit is also acceptable)
ST_CENTROID(geom) Calculates the center of the given geometry geom as a POINT. For (MULTI)POINTs, the center is calculated as the average of the input coordinates. For (MULTI)LINESTRINGs, the center is calculated as the weighted length of each given LINESTRING. For (MULTI)POLYGONs, the center is calculated as the weighted area of each given POLYGON. If geom is an empty geometry, an empty GEOMETRYCOLLECTION is returned
ST_CLIP(geom1, geom2) Returns the geometry shared between given geometries geom1 and geom2.
ST_COLLECTIONEXTRACT(collection, type)

Returns only the specified type from the given geometry collection. Type is a number that maps to the following:

  • 1 = POINT
  • 2 = LINESTRING
  • 3 = POLYGON
ST_COLLECTIONHOMOGENIZE(collection) Returns the simplest form of the given collection, e.g., a collection with a single POINT will be returned as POINT(x y), and a collection with multiple individual points will be returned as a MULTIPOINT.
ST_CONTAINS(geom1, geom2) Returns 1 (true) if no points of geom2 lie in the exterior of geom1 and at least one point of geom2 lies in the interior of geom1. Note that geom1 does not contain its boundary but does contain itself.
ST_CONTAINSPROPERLY(geom1, geom2) Returns 1 (true) if geom2 intersects the interior of geom1 but not the boundary (or exterior). Note that geom1 does not contain its boundary but does contain itself.
ST_CONVEXHULL(geom) Returns the minimum convex geometry that encloses all geometries in the given geom set.
ST_COORDDIM(geom) Returns the coordinate dimension of the given geom, e.g., a geometry with x, y, and z coordinates would return 3.
ST_COVEREDBY(geom1, geom2) Returns 1 (true) if no point in geom1 is outside geom2.
ST_COVERS(geom1, geom2) Returns 1 (true) if no point in geom2 is outside geom1.
ST_CROSSES(geom1, geom2) Returns 1 (true) if the given geometries, geom1 and geom2, spatially cross, meaning some but not all interior points in common.
ST_DIFFERENCE(geom1, geom2) Returns a geometry that represents the part of geom1 that does not intersect with geom2.
ST_DIMENSION(geom) Returns the dimension of the given geometry geom, which is less than or equal to the coordinate dimension. If geom is a single geometry, a 0 is for POINT, a 1 is for LINESTRING, and a 2 is for POLYGON. If geom is a collection, it will return the largest dimension from the collection. If geom is empty, 0 is returned.
ST_DISJOINT(geom1, geom2) Returns 1 (true) if the given geometries, geom1 and geom2, do not spatially intersect.
ST_DISTANCE(geom1, geom2) Returns the minimum 2-D Cartesian distance (in degrees) between the given geometries, geom1 and geom2.
ST_DWITHIN(geom1, geom2, distance) Returns 1 (true) if the geometries geom1 and geom2 are within the specified distance of each other.
ST_ELLIPSE(centerx, centery, height, width)

Returns an ellipse using the following values:

  • centerx -- the x coordinate or longitude used to center the ellipse
  • centery -- the y coordinate or latitude used to center the ellipse
  • height -- the height of the ellipse (in degrees)
  • width -- the width of the ellipse (in degrees)
ST_ENDPOINT(geom) Returns the last point of the given geom as a POINT if it's a LINESTRING. If geom is not a a LINESTRING, null is returned.
ST_ENVDWITHIN(geom1, geom2, distance) Returns 1 (true) if geom1 is within the specified distance of the bounding box of geom2.
ST_ENVELOPE(geom) Returns the bounding box of a given geometry geom.
ST_ENVINTERSECTS(geom1, geom2) Returns 1 (true) if the bounding box of the given geometries, geom1 and geom2, intersect.
ST_EQUALS(geom1, geom2) Returns 1 (true) if the given geometries, geom1 and geom2, are spatially equal. Note that order does not matter.
ST_EQUALSEXACT(geom1, geom2, tolerance) Returns 1 (true) if the given geometries, geom1 and geom2, are almost spatially equal within some given tolerance. If the values within the given geometries are within the tolerance value of each other, they're considered equal, e.g., if tolerance is 2, POINT(1 1) and POINT(1 3) are considered equal, but POINT(1 1) and POINT(1 3.1) are not. Note that the geometry types have to match for them to be considered equal.
ST_ERASE(geom1, geom2) Returns the result of erasing a portion of geom1 equal to the size of geom2.
ST_EXPAND(geom, units) Returns the bounding box expanded in all directions by the given units of the given geom. The expansion can also be defined for separate directions by providing separate parameters for each direction, e.g., ST_EXPAND(geom, unitsx, unitsy, unitsz, unitsm).
ST_EXPANDBYRATE(geom, rate) Returns the bounding box expanded by a given rate (a ratio of width and height) for the given geometry geom. The rate must be between 0 and 1.
ST_EXTERIORRING(geom) Returns a LINESTRING representing the exterior ring of the given POLYGON geom
ST_GENERATEPOINTS(geom, num) Creates a MULTIPOINT containing a number num of randomly generated points within the boundary of geom.
ST_GEOMETRYN(geom, index) Returns the index geometry back from the given geom geometry. The index starts from 1 to the number of geometry in geom.
ST_GEOMETRYTYPE(geom) Returns the type of geometry from the given geom.
ST_GEOMETRYTYPEID(geom)

Returns the type ID of from geom. Type and ID mappings:

  • POINT = 0
  • LINESTRING = 1
  • LINEARRING = 2
  • POLYGON = 3
  • MULTIPOINT = 4
  • MULTILINESTRING = 5
  • MULTIPOLYGON = 6
  • GEOMETRYCOLLECTION = 7
ST_GEOMFROMTEXT(wkt) Returns a geometry from the given Well-Known text representation wkt. Note that this function is only compatible with constants.
ST_INTERIORRINGN(geom, n) Returns the n-th interior LINESTRING ring of the POLYGON geom. If geom is not a POLYGON or the given n is out of range, a null is returned. The index begins at 1.
ST_INTERSECTION(geom1, geom2) Returns the shared portion between given geometries geom1 and geom2.
ST_INTERSECTS(geom1, geom2) Returns 1 (true) if the given geometries, geom1 and geom2, intersect in 2-D.
ST_ISCLOSED(geom) Returns 1 (true) if the given geometry's (geom) start and end points coincide.
ST_ISCOLLECTION(geom) Returns 1 (true) if geom is a collection, e.g., GEOMETRYCOLLECTION, MULTIPOINT, MULTILINESTRING, etc.
ST_ISEMPTY(geom) Returns 1 (true) if geom is empty.
ST_ISRING(geom) Returns 1 (true) if LINESTRING geom is both closed (per ST_ISCLOSED) and "simple" (per ST_ISSIMPLE). Returns 0 if geom is not a LINESTRING.
ST_ISSIMPLE(geom) Returns 1 (true) if geom has no anomalous geometric points, e.g., self-intersection or self-tangency.
ST_ISVALID(geom) Returns 1 (true) if geom (typically a [MULTI]POLYGON) is well formed. A POLYGON is valid if its rings do not cross and its boundary intersects only at POINTs (not along a line). The POLYGON must also not have dangling LINESTRINGs. A MULTIPOLYGON is valid if all of its elements are also valid and the interior rings of those elements do not intersect. Each element's boundaries may touch but only at POINTs (not along a line).
ST_LENGTH(geom) Returns the 2-D length of the geometry (in degrees) if it is a LINESTRING or MULTILINESTRING. Returns 0 if another type of geometry, e.g., POINT, MULTIPOINT, etc.
ST_LINEFROMMULTIPOINT(geom) Creates a LINESTRING from geom if it is a MULTIPOINT. Returns null if geom is not a MULTIPOINT.
ST_MAKEENVELOPE(xmin, ymin, xmax, ymax) Creates a rectangular POLYGON from the given min and max parameters.
ST_MAKELINE(geom[, geom2, geom3])

Creates a LINESTRING from geom if it is a MULTIPOINT. If geom is a POINT, there must be at least one other POINT to construct a LINESTRING. If geom is a LINESTRING, it must have at least two points. Returns null if geom is not a POINT, MULTIPOINT, or LINESTRING.

Note: This function can be rather costly in terms of performance.

ST_MAKEPOINT(x, y)

Creates a POINT at the given coordinate.

Note: This function can be rather costly in terms of performance.

ST_MAKEPOLYGON(geom)

Creates a POLYGON from geom. Inputs must be closed LINESTRINGs.

Note: This function can be rather costly in terms of performance.

ST_MAXX(geom) Returns the maximum x coordinate of a bounding box for the given geom geometry. This function works for 2-D and 3-D geometries.
ST_MAXY(geom) Returns the maximum y coordinate of a bounding box for the given geom geometry. This function works for 2-D and 3-D geometries.
ST_MAXZ(geom) Returns the maximum z coordinate of a bounding box for the given geom geometry. This function works for 2-D and 3-D geometries.
ST_MINX(geom) Returns the minimum x coordinate of a bounding box for the given geom geometry. This function works for 2-D and 3-D geometries.
ST_MINY(geom) Returns the minimum y coordinate of a bounding box for the given geom geometry. This function works for 2-D and 3-D geometries.
ST_MINZ(geom) Returns the minimum z coordinate of a bounding box for the given geom geometry. This function works for 2-D and 3-D geometries.
ST_MULTI(geom) Returns geom as a MULTI- geometry, e.g., a POINT would return a MULTIPOINT.
ST_MULTIPLERINGBUFFERS(geom, distance, outside)

Creates multiple buffers at specified distance around the given geom geometry. Multiple distances are specified as comma-separated values in an array, e.g., [10,20,30]. Valid values for outside are:

  • FULL -- indicates that buffers will overlap or cover the given geom geometry. This is the default.
  • OUTSIDE_ONLY -- indicates that buffers will be rings around the given geom geometry.
ST_NORMALIZE(geom) Returns geom in its normalized (canonical) form, which may rearrange the points in lexicographical order.
ST_NPOINTS(geom) Returns the number of points (vertices) in geom.
ST_NUMGEOMETRIES(geom) If geom is a collection or MULTI- geometry, returns the number of geometries. If geom is a single geometry, returns 1.
ST_NUMINTERIORRINGS(geom) Returns the number of interior rings if geom is a POLYGON. Returns null if geom is anything else.
ST_NUMPOINTS(geom) Returns the number of points in the geom LINESTRING. Returns null if geom is not a LINESTRING.
ST_OVERLAPS(geom1, geom2) Returns 1 (true) if given geometries geom1 and geom2 share space.
ST_POINT(x, y) Returns a POINT with the given x and y coordinates.
ST_POINTN(geom, n) Returns the n-th point in LINESTRING geom. Negative values are valid, but note that they are counted backwards from the end of geom. A null is returned if geom is not a LINESTRING.
ST_POINTS(geom) Returns a MULTIPOINT containing all of the coordinates of geom.
ST_REMOVEPOINT(geom, offset) Remove a point from LINESTRING geom. Index starts at 0, but you can use offset to make point selection easier.
ST_REMOVEREPEATEDPOINTS(geom, tolerance) Removes points from geom if the point's vertices are greater than or equal to the tolerance of the previous point in the geometry's list. If geom is not a MULTIPOINT, MULTILINESTRING, or a MULTIPOLYGON, no points will be removed.
ST_REVERSE(geom) Return the geometry with its coordinate order reversed.
ST_SCALE(geom, x, y) Scales geom by multiplying its respective vertices by the given x and y values. This function also supports scaling geom using another geometry object, e.g., ST_SCALE('POINT(3 4)', 'POINT(5 6)') would return POINT(15 24). If specifying x and y for scale, note that the default value is 0, e.g., ST_SCALE('POINT(1 3)', 4) would return POINT(4 0).
ST_SETPOINT(geom1, position, geom2) Replace a point of LINESTRING geom1 with POINT geom2 at position (base 0). Negative values are valid, but note that they are counted backwards from the end of geom.
ST_SHAREDPATH(geom1, geom2) Returns a collection containing paths shared by geom1 and geom2.
ST_SNAP(geom1, geom2, tolerance) Snaps geom1 to geom2 within the given tolerance. If the tolerance causes geom1 to not snap, the geometries will be returned unchanged.
ST_SPLIT(geom1, geom2) Returns a collection of geometries resulting from the split between geom1 and geom2 geometries.
ST_STARTPOINT(geom) Returns the first point of LINESTRING geom as a POINT. Returns null if geom is not a LINESTRING.
ST_SYMDIFFERENCE(geom1, geom2) Returns a geometry that represents the portions of geom1 and geom2 geometries that do not intersect.
ST_TOUCHES(geom1, geom2) Returns 1 (true) if the given geometries, geom1 and geom2, have at least one point in common but their interiors do not intersect.
ST_TRANSLATE(geom, deltax, deltay) Translate geom by given offsets deltax and deltay. A z-coordinate offset can be applied using deltaz.
ST_UNION(geom1, geom2) Returns a geometry that represents the point set union of the two given geometries, geom1 and geom2.
ST_UNIONCOLLECTION(geom) Returns a geometry that represents the point set union of a single given geometry geom.
ST_UPDATE(geom1, geom2) Returns a geometry that is geom1 geometry updated by geom2 geometry
ST_WITHIN(geom1, geom2) Returns 1 (true) if the geom1 geometry is inside the geom2 geometry. Note that as long as at least one point is inside of geom2, geom1 is considered within geom2 even if the rest of the geom1 lies along the boundary of geom2
ST_X(geom) Returns the X coordinate of the POINT geom; if the coordinate is not available, null is returned. geom must be a POINT.
ST_Y(geom) Returns the Y coordinate of the POINT geom; if the coordinate is not available, null is returned. geom must be a POINT.

Geospatial Aggregation Functions

Function Details
ST_DISSOLVE(geom) Dissolves all geometries within a given set into a single geometry. Note that the resulting single geometry can still be a group of noncontiguous geometries but represented as a single group.

Math Functions

Math Function Description
ABS(expr) Returns the absolute value of expr
ACOS(expr) Calculates the inverse cosine (arccosine) of expr
ACOSH(expr) Calculates the inverse hyperbolic cosine of expr
ASIN(expr) Calculates the inverse sine (arcsine) of expr
ASINH(expr) Calculates the inverse hyperbolic sine of expr
ATAN(expr) Calculates the inverse tangent (arctangent) of expr
ATANH(expr) Calculates the inverse hyperbolic tangent of expr
ATAN2(x, y) Calculates the inverse tangent (arctangent) using two arguments
ATN2(x, y) Alias for ATAN2(x, y)
CBRT(expr) Calculates the cube root of expr
CEIL(expr) Alias for CEILING
CEILING(expr) Rounds expr up to the next highest integer
COS(expr) Calculates the cosine of expr
COSH(expr) Calculates the hyperbolic cosine of expr
COT(expr) Calculates the cotangent of expr
DEGREES(expr) Converts expr (in radians) to degrees
DIVZ(a, b, c) Returns the quotient a / b unless b == 0, in which case it returns c
EXP(expr) Raises e to the power of expr
FLOOR(expr) Rounds expr down to the next lowest integer
GREATEST(expr_a, ..., expr_N) Returns whichever of expr_a through expr_N has the largest value, based on typed comparison
HYPOT(x, y) Calculates the hypotenuse of x and y
ISNAN(expr) Returns 1 (true) if expr is not a number by IEEE standard; otherwise, returns 0 (false)
IS_NAN(expr) Synonymous with ISNAN(expr)
ISINFINITY(expr) Returns 1 (true) if expr is infinity by IEEE standard; otherwise, returns 0 (false)
IS_INFINITY(expr) Synonymous with ISINFINITY(expr)
LDEXP(x, exp) Returns the value of x * 2^ exp
LEAST(expr_a, ..., expr_N) Returns whichever of expr_a through expr_N has the smallest value, based on typed comparison
LN(expr) Calculates the natural logarithm of expr
LOG(expr) Synonymous with LN(expr)
LOG10(expr) Calculates the base-10 logarithm of expr
MOD(dividend, divisor) Calculates the remainder after integer division of dividend by divisor
PI() Returns the value of pi
POW(base, exponent) Alias for POWER(base, exponent)
POWER(base, exponent) Calculates base raised to the power of exponent
RADIANS(expr) Converts expr (in degrees) to radians
ROUND(expr, scale)

Rounds expr to the nearest decimal number with scale decimal places when scale is a positive number; rounds to the nearest number such that the result has -(scale) zeros to the left of the decimal point when scale is negative; use scale of 0 to round to the nearest integer. Examples:

  • ROUND(12345.678, 2) -> 12345.68
  • ROUND(12345.678, 0) -> 12346
  • ROUND(12345.678, -2) -> 12300
SIGN(expr)

Determines whether a number is positive, negative, or zero; returns one of the following three values:

  • positive: 1
  • zero: 0
  • negative: -1
SIN(expr) Calculates the sine of expr
SINH(expr) Calculates the hyperbolic sine of expr
SQRT(expr) Calculates the square root of expr
TAN(expr) Calculates the tangent of expr
TANH(expr) Calculates the hyperbolic tangent of expr
TRUNCATE(expr, scale)

Rounds expr down to the nearest decimal number with scale decimal places, following the same rules as ROUND. Examples:

  • TRUNCATE(12345.678, 2) -> 12345.67
  • TRUNCATE(12345.678, 0) -> 12345
  • TRUNCATE(12345.678, -2) -> 12300

Null Functions

Some of the following null functions require parameters to be of convertible data types. Note that limited-width (charN) & unlimited-width (non-charN) string types are not convertible.

Null Function Description
COALESCE(expr_a, ..., expr_N) Returns the value of the first expression that is not null starting with expr_a and ending with expr_N. If all are null, then null is returned. All expressions should be of the same or convertible data type.
IFNULL(expr_a, expr_b) Returns expr_a if it is not null; otherwise, returns expr_b. Both should be of the same or convertible data type.
ISNULL(expr) Returns 1 if expr is null; otherwise, returns 0
IS_NULL(expr) Synonymous with ISNULL(expr)
NULLIF(expr_a, expr_b) Returns null if expr_a equals expr_b; otherwise, returns the value of expr_a; both expressions should be of the same or convertible data type.
NVL(expr_a, expr_b) Alias for IFNULL
NVL2(expr, value_if_not_null, value_if_null) Evaluates expr: if not null, returns value_if_not_null; if null, returns value_if_null. Both value_if_not_null & value_if_null should be of the same data type as expr or implicitly convertible.
ZEROIFNULL(expr) Alias for IFNULL(expr, 0); not applicable to strings

String Functions

The following functions only apply to limited-width (charN) strings.

String Function Description
ASCII(expr) Returns the ASCII code for the first character in expr
CHAR(expr) The character represented by the standard ASCII code expr in the range [ 0 - 127 ]
CONCAT(expr_a, expr_b) Performs a string concatenation of expr_a & expr_b
CONCAT_TRUNCATE(string 1, string2) A concatenation of string1 and string2 that truncates string2 to fit within the size of string1, e.g., CONCAT_TRUNCATE('ABC123','DEFG') would return ABC123DE because ABC123 is treated as a char8 (rounding up from 6 characters).
CONTAINS(pattern, expr) Returns 1 if expr contains the string-literal pattern; otherwise, returns 0
ENDS_WITH(pattern, expr) Returns 1 if expr ends with the string-literal pattern; otherwise, returns 0
LCASE(expr) Converts expr to lowercase
LEFT(expr, num_chars) Returns the leftmost num_chars characters from expr
LENGTH(expr) Returns the number of characters in expr
LOCATE(match_expr, ref_expr, [start_pos]) Returns the starting position of the first match of match_expr in ref_expr, starting from position 1 or start_pos (if specified)
LOWER(expr) Alias for LCASE
LTRIM(expr) Removes whitespace from the left side of expr
POSITION(match_expr, ref_expr, [start_pos]) Alias for LOCATE
REPLACE(ref_expr, match_expr, repl_expr) Replaces every occurrence of match_expr in ref_expr with repl_expr
RIGHT(expr, num_chars) Returns the rightmost num_chars characters from the expr
RTRIM(expr) Removes whitespace from the right side of expr
STARTS_WITH(pattern, expr) Returns 1 if expr starts with the string-literal pattern; otherwise, returns 0
SUBSTR(expr, start_pos, num_chars) Alias for SUBSTRING
SUBSTRING(expr, start_pos, num_chars) Returns num_chars characters from the expr, starting at the 1-based start pos
TRIM(expr) Removes whitespace from both sides of expr
UCASE(expr) Converts expr to uppercase
UPPER(expr) Alias for UCASE

CASE

The case statement acts as a scalar function, but has two more complex forms. Note that for each of these CASE statements, the value expressions must all be of the same or convertible data type.

In the first form, each WHEN is followed by a conditional expression whose corresponding THEN expression will have its value returned, if true. Control will continue through each WHEN until a match is found and the corresponding value returned; if no match is found, the value of the ELSE expression will be returned, or null, if no ELSE clause exists.

CASE
    WHEN <cond_expr_a> THEN <value_expr_a>
    ...
    WHEN <cond_expr_N> THEN <value_expr_N>
    ELSE <value_expr>
END

In the second form, the CASE expression is evaluated. A match of that result will be attempted against each WHEN expression until a match is found and the value of the corresponding THEN expression returned; if no match is found, the value of the ELSE expression will be returned, or null, if no ELSE clause exists.

CASE <expr>
    WHEN <match_expr_a> THEN <value_expr_a>
    ...
    WHEN <match_expr_N> THEN <value_expr_N>
    ELSE <value_expr>
END

Note

This second version below has greater optimization than the first.

Examples:

CASE
    WHEN color = 1 THEN 'Red'
    WHEN color >= 2 THEN 'Green'
    ELSE 'Blue'
END
CASE mod(length(text), 2)
    WHEN 0 THEN 'Even'
    WHEN 1 THEN 'Odd'
    ELSE null
END

Aggregation Functions

Function Description
ATTR(expr) If MIN(expr) = MAX(expr), returns expr; otherwise *
AVG(expr) Calculates the average value of expr
COUNT(*) Returns the number of records in a table
COUNT(expr) Returns the number of non-null data values in expr
COUNT(DISTINCT expr) Returns the number of distinct non-null data values in expr
MAX(expr) Finds the maximum value of expr
MEAN(expr) Alias for AVG
MIN(expr) Finds the minimum value of expr
STDDEV(expr) Alias for STDDEV_POP
STDDEV_POP(expr) Calculates the population standard deviation of the values of expr
STDDEV_SAMP(expr) Calculates the sample standard deviation of the values of expr
SUM(expr) Sums all the values of expr
VAR(expr) Alias for VAR_POP
VAR_POP(expr) Calculates the population variance of the values of expr
VAR_SAMP(expr) Calculates the sample variance of the values of expr

Predicates

Predicate are generally used within a SQL WHERE clause to query records. They compare the values of two or more expressions; whenever a record meets the criteria defined in a predicate clause it will be marked as eligible to be part of the query result set. If it meets all predicate clauses defined within a query, it will be returned in the result set.

A single predicate clause may use a simple predicate operator to compare the values of two expressions or a more complex predicate clause form. A compound predicate clause uses a compound predicate operator to link together multiple predicate clauses to further refine a result set.

Unlimited-width (non-charN) strings can only be used within equality-based predicates, e.g. =, IN, etc.

Predicate Operators

  • = equality
  • != or <> inequality
  • < less than
  • <= less than or equal to
  • > greater than
  • >= greater than or equal to

Predicate Clauses

In the following list of predicate clauses, ref_expr is the reference expression to apply the predicate to; note that EXISTS has no reference expression.

Predicate Clause Description
<expr_a> <pred_op> <expr_b> Matches records where expr_a relates to expr_b according to predicate operator pred_op.
<ref_expr> <pred_op> ALL (<SELECT statement>) Matches records where the reference expression ref_expr relates to all of the results of SELECT statement according to the predicate operator pred_op
<ref_expr> <pred_op> ANY (<SELECT statement>) Matches records where the reference expression ref_expr relates to any of the results of SELECT statement according to the predicate operator pred_op
<ref_expr> [NOT] BETWEEN <begin_expr> AND <end_expr> Matches records where the reference expression ref_expr is (or is NOT) between the values of begin_expr and end_expr
<ref_expr> [NOT] IN (<match_list>) Matches records where the reference expression ref_expr is (or is NOT) in the match_list list of match values. The list can either be a comma-separated list of terms/expressions or the result of a SELECT statement.
<ref_expr> IS [NOT] NULL Matches records where the reference expression ref_expr is (or is NOT) null.
<ref_expr> [NOT] LIKE <match_expr>

Matches records where reference expression ref_expr does (or does NOT) match the string value of match_expr. The match is a string literal one, with the following exceptions:

  • % matches any string of 0 or more characters
  • _ matches any single character
[NOT] EXISTS (<SELECT statement>)

Matches records where SELECT statement returns 1 or more records.

Note: This clause has limited utility, as correlated subqueries, upon whose real value it relies, are unsupported at this time.

Compound Predicate Operators

Predicate Operator Description
<pred_a> AND <pred_b> Matches records where both pred_a & pred_b are true
<pred_a> OR <pred_b> Matches records where either pred_a or pred_b is true
NOT <pred_b> Matches records where pred is false

Non-Correlated Subqueries

These are subqueries that are self-contained, in that they can be executed independently of the surrounding query.

SELECT num0
FROM calcs
WHERE num0 = (SELECT max(num0) FROM calcs)
SELECT num0, (SELECT max(num0) FROM calcs) as max_num0
FROM calcs
SELECT *
FROM table1
WHERE x IN (SELECT y FROM table2)
SELECT *
FROM table1
WHERE EXISTS (SELECT y FROM table2 WHERE y > 5)
SELECT max(num0)
FROM (SELECT num0 FROM calcs) as num0s

Note

SQL-92 requires that subqueries in FROM clauses have aliases, even if a FROM clause only consists of a single subquery; the num0s table alias in the last example serves that function. The as keyword is optional.

Hints

Hint strings (KI_HINT) can be added as comments within queries, and affect just the query in which they appear. They will override the corresponding client & server settings of the same names (when such settings exist). For example:

SELECT /* KI_HINT_KEEP_TEMP_TABLES, KI_HINT_ROWS_PER_FETCH(20000) */ first_name, last_name
FROM customer
Hint Description
KI_HINT_ALLOW_PARTIAL_PASSDOWN This avoids the error: Query not fully handled, though the query should be reported for further investigation & handling. This hint will try to process the query as best as it can, which will probably not have optimal efficiency.
KI_HINT_DONT_COMBINE Don’t combine joins and unions for this query.
KI_HINT_DONT_FILTER_IN_AGGREGATE Use when issuing a filter on an aggregation of a join
KI_HINT_EXPLAIN_JOINS Output join explain plan to logs
KI_HINT_JOBID_PREFIX(x) Tag corresponding database job names(s) with x; e.g., KI_HINT_JOBID_PREFIX(tag) will result in job names like ODBC_tag_01234567-89ab-cdef-0123-456789abcdef
KI_HINT_KEEP_CROSSJOINS Use when issuing a cross join that is being handled as an inner join and returning incorrect results
KI_HINT_KEEP_TEMP_TABLES Don’t erase temp tables created by this query
KI_HINT_MATERIALIZE_AFTER_JOIN Use projection to materialize joins
KI_HINT_MAX_QUERY_DIMENSIONS(n) Number of joins within the query that are not against primary key columns
KI_HINT_MAX_ROWS_TO_FETCH(n) Set maximum number of rows for a query to retrieve
KI_HINT_NO_PASSDOWN Don’t use optimizations for this query
KI_HINT_ROWS_PER_FETCH(n) Set number of rows to be requested in each batch (also used for batch size when inserting)
KI_HINT_SIMULATION Don’t issue calls to the database, but output the calls that would be made to the log

Data Definition (DDL)

Kinetica supports the basic notion of SQL tables as containers of one or more columns of data. Tables can be created, altered, and dropped.

A column definition consists of a column type and optional column size, column properties, and nullability. Column properties are used to optimize data storage & speed.

The format of a defined column is column name, followed by column definition. A column definition is column type optionally followed by any column size limit or column properties all enclosed in parentheses, followed by an optional nullability statement:

<column name> <column type> [(<column size / column property>,...)] [[NOT] NULL]

This format applies to any DDL statement requiring the definition of columns, like CREATE TABLE and ALTER TABLE (when adding/modifying a column).

For example, the following are valid defined columns:

id INT(SHARD_KEY)                -- makes the id column an integer that is also the table's shard key
name VARCHAR(50, TEXT_SEARCH)    -- makes the name column a 50-char limited string that is text-searchable
ip VARCHAR(IPV4)                 -- makes the ip column a string in IPv4 format
cost DECIMAL(10, 2, STORE_ONLY)  -- makes the cost column an 8.2 decimal that is not held in memory

Column Types

Category Data Type Description
Number INTEGER Native type: integer
INT Alias for INTEGER
BIGINT Native type: long
LONG Alias for BIGINT
FLOAT Native type: float
REAL Alias for FLOAT
DOUBLE Native type: double
DECIMAL Alias for BIGINT
DECIMAL(P,S)

Native type: varies by P & S

P S Native Type
1, 2 0 int8
3, 4 0 int16
5, 6 0 integer
7- n 0 long
1- n 1-4 decimal
5, 6 5, 6 float
7- n 5- n double
NUMERIC Alias for DECIMAL
String VARCHAR Native type: string; character limit based on configured system property
VARCHAR(N) Native type: char1 - char256 or string, whichever is large enough to hold N characters
STRING Alias for VARCHAR
TEXT Alias for VARCHAR
Date/Time TYPE_DATE Native type: date
TYPE_TIME Native type: time
TYPE_TIMESTAMP Native type: timestamp
DATE Alias for TYPE_DATE
DATETIME Native type: datetime
TIME Alias for TYPE_TIME
TIMESTAMP Alias for TYPE_TIMESTAMP
Binary BINARY Native type: bytes
BYTES Alias for BINARY
VARBINARY Alias for BINARY
LONGVARBINARY Alias for BINARY
Geospatial WKT Native type: wkt
GEOMETRY Alias for WKT
ST_GEOMETRY Alias for WKT

Column Properties

Property Description
DICT Applies dict data handling to a column, enabling dictionary-encoding of its values; see Dictionary Encoding for details
DISK_OPTIMIZED Applies disk-optimized data handling to a column
IPV4 Treats the associated string-based column as an IPv4 address
PRIMARY_KEY Treats the associated column as a primary key, or part of a composite primary key if other columns also have this property
SHARD_KEY Treats the associated column as a shard key, or part of a composite shard key if other columns also have this property
STORE_ONLY Applies store-only data handling to a column
TEXT_SEARCH Applies text-searchability to a column

CREATE SCHEMA

Schemas are logical containers for tables, referred to as collections, natively.

To create one:

CREATE < SCHEMA | COLLECTION > <schema name>

ALTER SCHEMA

Any of the following facets of a schema can be altered:

Rename Schema

A schema can be renamed.

ALTER SCHEMA <schema name>
RENAME TO <new schema name>

Set Protection Mode

A schema can be put into a protected mode, which will prevent tables & view within it from expiring. This will override the individual protection modes of this contained entities, though it will not overwrite their modes. When a schema is taken out of protected status, the contained tables & views return to their previous protection mode. A protected entity cannot have its TTL updated.

Setting PROTECTED to TRUE enables protection for a schema's tables & views, and prevents ALTER...SET TTL from being executed against it or its contained entities.

Setting PROTECTED to FALSE disables protection for a schema's tables & views, and allows subsequent ALTER...SET TTL requests to succeed.

ALTER SCHEMA <schema name>
SET PROTECTED < TRUE | FALSE >

Set TTL

A schema can have a time-to-live (TTL) set, which is a convenience for setting all of the tables & views within that schema to a given TTL.

ALTER SCHEMA <schema name>
SET TTL <new ttl>

DROP SCHEMA

When removing a schema from the database, there are two options available, which control how the removal takes place. Normally, an error will be reported if the schema to drop doesn't exist; if IF EXISTS is specified, no error will be reported. Also, an error will be reported if the schema to drop contains any database objects (tables, etc.); if CASCADE is specified, the schema and all objects within it will be removed.

DROP < SCHEMA | COLLECTION > [IF EXISTS] <schema name> [CASCADE]

CREATE TABLE

Creates a new table in the configured ParentSet collection, unless a schema is specified. If OR REPLACE is specified, any existing table with the same name will be dropped before creating this one. If REPLICATED is specified, the table will be distributed within the database as a replicated table. If TEMP is specified, the table will be removed the next time the database is restarted; otherwise, the table will persist through database restarts.

The basic form of the supported CREATE TABLE statement follows. See here for column definition format.

CREATE [OR REPLACE] [REPLICATED] [TEMP] TABLE [<schema name>.]<table name>
(
    <column name> <column definition>,
    ...
    <column name> <column definition>,
    [PRIMARY KEY (<column name>,...)]
)

For example:

CREATE REPLICATED TABLE master.various_types
(
    i   INTEGER NOT NULL,                  -- non-nullable integer
    bi  BIGINT(SHARD_KEY) NOT NULL,        -- long, shard key
    f   FLOAT,                             -- float
    d   DOUBLE(STORE_ONLY),                -- double, not in-memory
    s   VARCHAR(STORE_ONLY, TEXT_SEARCH),  -- string, searchable, not in-memory, only limited in size by system-configured value
    c   VARCHAR(30, DICT),                 -- char32 using dictionary-encoding of values
    ip  VARCHAR(IPV4),                     -- IP address
    ts  TYPE_TIMESTAMP,                    -- timestamp
    td  TYPE_DATE                          -- simple date
    tt  TYPE_TIME                          -- simple time
    dt  DATETIME                           -- date/time
    dc1 DECIMAL,                           -- long
    dc2 DECIMAL(18,4),                     -- decimal
    dc3 DECIMAL(6,5),                      -- float
    dc4 DECIMAL(7, 5, STORE_ONLY),         -- double, not in-memory
    n   NUMERIC(5, 3),                     -- will use the next largest native numeric type to hold the specified number type
    wkt WKT,                               -- geospatial column for WKT string data
    PRIMARY KEY (i, bi)                    -- primary key columns must be NOT NULL
)

CREATE TABLE ... AS

Creates a new table from the given query in the configured ParentSet collection, unless a schema is specified. If OR REPLACE is specified, any existing table with the same name will be dropped before creating this one. If TEMP is specified, the table will be removed the next time the database is restarted; otherwise, the table will persist through database restarts. If VRAM is specified, the table will be loaded into GPU memory.

While primary keys & foreign keys are not transferred to the new table, shard keys will be, if the column(s) composing them are part of the SELECT list. A new shard key can be specified for the created table by using the KI_SHARD_KEY(<column list>) pseudo-function in the SELECT list.

See Limitations for other restrictions.

The general format is:

CREATE [OR REPLACE] [TEMP] [VRAM] TABLE [<schema name>.]<table name> AS
(
    <SELECT statement>
)

For example, to create a temporary table that is a copy of an existing table, failing if a table with the same name as the target table already exists:

CREATE TEMP TABLE new_temporary_table AS
(
    SELECT *
    FROM old_table
)

To create a permanent table with columns a, b, c, & d and a new shard key on columns a & b, replacing a table with the same name as the target table, if it exists:

CREATE OR REPLACE TABLE new_sharded_table AS
(
    SELECT a, b, c, d, KI_SHARD_KEY(a, b)
    FROM old_table
)

ALTER TABLE

Any of the following facets of a table can be altered:

Rename Table

A table can be renamed.

ALTER TABLE [<schema name>.]<table name>
RENAME TO <new table name>

Move Table

A table can be moved from one schema to another.

The general form of the command is:

ALTER TABLE [<schema name>.]<table name>
< MOVE TO | SET SCHEMA > <new schema name>

For example, to move the sales_2017 table from the olap schema to the archive schema:

ALTER TABLE olap.sales_2017
MOVE TO archive

To move the sales_2017 table from the archive schema to the root schema:

ALTER TABLE archive.sales_2017
MOVE TO ""

Set Access Mode

A table can have its global accessibility modified for all users in the system, independently from and further restricting any role-based access controls in place. Note that changing the access mode cannot widen access for users not already granted access; it can only narrow access for those who already have access. This setting will also trump administrative access to a table.

ALTER TABLE [<schema name>.]<table name>
SET ACCESS MODE < NO_ACCESS | READ_ONLY | WRITE_ONLY | READ_WRITE >

Set Protection Mode

A table can be altered to not expire, by altering its protection mode. Note that tables don't normally expire, but can be set to expire.

Setting PROTECTED to TRUE enables protection for a table and prevents ALTER TABLE...SET TTL from being executed against it.

Setting PROTECTED to FALSE disables protection for a table and allows subsequent ALTER TABLE...SET TTL requests to succeed.

ALTER TABLE <table name>
SET PROTECTED < TRUE | FALSE >

Set TTL

A table's time-to-live (TTL) can be altered.

ALTER TABLE <table name>
SET TTL <new ttl>

Add Column

A column can be added, specifying a column definition.

A new column can have its values initially populated through the use of the DEFAULT keyword. These values can either be a string/numeric constant or the name of an existing column in the table from which values can be copied into the new column. This default value is only in effect for the column creation; the new column will have no default value after that.

ALTER TABLE [<schema name>.]<table name>
ADD <column name> <column definition> [DEFAULT <string/numeric constant | column name>]

For example:

ALTER TABLE employee
ADD salary NUMERIC(10, 2, STORE_ONLY) NOT NULL DEFAULT 0

Modify Column

A column can have its column definition modified, affecting column type, column size, column properties, and nullability.

If a column is modified to be non-nullable, it will be populated with default values--empty string for string fields and 0 for numeric fields.

Either of the following can be used to modify a column:

ALTER TABLE [<schema name>.]<table name>
MODIFY [COLUMN] <column name> <column definition>
ALTER TABLE [<schema name>.]<table name>
ALTER COLUMN <column name> <column definition>

For example:

ALTER TABLE employee
ALTER COLUMN first_name VARCHAR(50, TEXT_SEARCH) NOT NULL

Compress Column

A column can have its data compressed in memeory.

The general form to alter a column's compression setting is:

ALTER TABLE [<schema name>.]<table name>
SET COLUMN <column name> COMPRESSION [TO] <compression type>

For example, to use LZ4 compression on a column:

ALTER TABLE employee
SET COLUMN first_name COMPRESSION lz4

To use no compression on a column:

ALTER TABLE employee
SET COLUMN first_name COMPRESSION none

Drop Column

An existing column can be removed from a table:

ALTER TABLE [<schema name>.]<table name>
DROP COLUMN <column name>

Add Column Index

An index can be added to any column not marked as store-only in order to improve the performance of operations whose expressions contain relational operators against the column.

ALTER TABLE [<schema name>.]<table name>
ADD INDEX (<column name>)

For example, to index the employee table's department ID column:

ALTER TABLE employee
ADD INDEX (dept_id)

Drop Column Index

An existing column index can be removed from a table:

ALTER TABLE [<schema name>.]<table name>
DROP INDEX (<column name>)

For example, to drop the index on the employee table's department ID column:

ALTER TABLE employee
DROP INDEX (dept_id)

TRUNCATE TABLE

Deletes all the records from a table:

TRUNCATE TABLE [<schema name>.]<table name>

DROP TABLE

Drops a table:

DROP TABLE [<schema name>.]<table name>

Data Manipulation (DML)

INSERT

To insert one record with literal values, use this format:

INSERT INTO [<schema name>.]<table name> [(<column list>)]
VALUES (<column value list>)

For example:

INSERT INTO target_table (x, y, point_name)
VALUES (99, 100, 'Vertex A')

To insert multiple records, using another table as the source, use this format:

INSERT INTO [<schema name>.]<table name> [(<column list>)]
<SELECT statement>

For example:

INSERT INTO target_table
SELECT *
FROM source_table
WHERE x > 100

Note

When specifying a column list, any non-nullable fields not included in the list will be given default values--empty string for strings, and 0 for numerics. The fields in the column list and the fields selected must align.

UPDATE

Updates can only set columns to constant values, not other column values or results of expressions. The general format is:

UPDATE [<schema name>.]<table name>
SET
  <key 1> = <value 1>,
  ...
  <key n> = <value n>
[WHERE <expression list>]

For example:

UPDATE emp
SET
  sal = 100000,
  mgr = 1234
WHERE empno = 1111

DELETE

Deletes records from a table; the general format is:

DELETE
FROM [<schema name>.]<table name>
[WHERE <expression list>]

For example:

DELETE
FROM emp
WHERE empno = 1111

Working with Files

The ODBC Server allows files to be read from and written to via SQL. The files are accessed by the ODBC Server process, which means they need to have system file permissions set appropriately for that process owner.

Writing to CSV Files

The ODBC Server can export data to a file, prepending the header information necessary for reading the data back from the file again.

A new file can be created with a data export via the following syntax:

CREATE TABLE FILE."<file name>" AS
SELECT <column list>
FROM <table name>
WHERE <expression list>

For example, to write all records from the emp table to an emp_2017.csv file:

CREATE TABLE FILE."emp_2017.csv" AS
SELECT *
FROM emp

A file can have data appended to it using the following syntax:

INSERT INTO FILE."<file name>"
SELECT <column list>
FROM <table name>
WHERE <expression list>

For example, to append records from the emp table of employees hired in the second half of 2017 to an emp_2017.csv file:

INSERT INTO FILE."emp_2017.csv"
SELECT id, dept_id, manager_id, first_name, last_name, sal, hire_date
FROM emp
WHERE hire_date BETWEEN '2017-07-01' AND '2017-12-31'

Reading from CSV Files

The ODBC Server can query data from a CSV file using a standard SELECT statement.

In order for the ODBC Server to read data from a CSV file, the file must be properly formatted:

  • The first row must contain the same column name/type format required by GAdmin for importing data. This is also the same format written during export, described in Writing to CSV Files above.

  • The fields must be comma-delimited

  • Strings can optionally be enclosed in double-quotes; double-quotes must be used when the data contains commas; use two consecutive double-quotes as an escape code for double-quoted string data containing double-quotes:

    "This string contains a "" quote mark and a "","" double-quoted comma."

Data can be read from a file with the following syntax:

SELECT <column list>
FROM FILE."<file name>"
WHERE <expression list>

For example, to read employee IDs & names in department 2 from an emp_2017.csv file:

SELECT id, first_name || ' ' || last_name as "Full Name"
FROM FILE."emp_2017.csv"
WHERE dept_id = 2

The SELECT statement can be paired with a CREATE TABLE ... AS to import records into a new table or an INSERT INTO to import records into an existing table.

For example, to import all employee records from an emp_2017.csv file into a new table named emp_2017_archive:

CREATE TABLE emp_2017_archive AS
SELECT *
FROM FILE."emp_2017.csv"

To import department 2 employee records from an emp_2017.csv file into an emp table, whose column types match the file's field types:

INSERT INTO emp (id, dept_id, manager_id, first_name, last_name)
SELECT id, dept_id, manager_id, first_name, last_name
FROM FILE."emp_2017.csv"
WHERE dept_id = 2

Security

Kinetica provides basic table-level role-based access control for users. It also allows global read/write and administrative access to be granted.

CREATE ROLE

Roles are groups of permissions that can be assigned to users or other roles. A role name must contain only lowercase letters, digits, and underscores, and cannot begin with a digit. It also must not be named the same as an existing user or role, including the default roles authenticated & public.

To create a new role, use this format:

CREATE ROLE <role name>

For example, to create an analyst role:

CREATE ROLE analyst

CREATE USER

Users can be added to the system and assigned table-level & system-level permissions either directly or via roles. A user name must contain only lowercase letters, digits, and underscores, and cannot begin with a digit. It also must not be named the same as an existing user or role.

To add a user to the system, use this format:

CREATE USER <user name> [ < [WITH] PASSWORD [=] | IDENTIFIED BY [PASSWORD] > '<user password>' ]

Note

The password needs to be single-quoted and must not contain single quotes.

For example, two of the ways to create a new internal user with the user ID of jdoe and a password of secret are:

CREATE USER jdoe IDENTIFIED BY 'secret'
CREATE USER jdoe WITH PASSWORD 'secret'

To create a user with an existing external LDAP user, the user name should match the LDAP user name and be prepended with the @ symbol; no password is supplied, as the user will be externally authenticated:

CREATE USER @jdoe

DROP ROLE

Dropping a role will remove the associated permissions & roles granted through the role to all users with the role. Users & roles granted the same permissions either directly or via other roles will retain those permissions.

To drop an existing role, use this format:

DROP ROLE <role name>

For example, to drop the analyst role:

DROP ROLE analyst

DROP USER

To remove a user from the system, use this format:

DROP USER <user name>

For example, to drop an internal user jdoe:

DROP USER jdoe

To drop an external LDAP user jdoe:

DROP USER @jdoe

GRANT Role

To grant a role to a user or role:

GRANT <role name> TO <user name | role name>

For example, to grant a role allowing access to analyst tables to the analyst role, and then grant that analyst role to user jdoe:

GRANT analyst_table_access TO analyst
GRANT analyst TO jdoe

GRANT SYSTEM Permission

To grant a system-level permission to a user or role:

GRANT SYSTEM < ADMIN | READ | WRITE > TO <user name | role name>

For example, to grant system administrator permission to jdoe and then grant read access to all tables to the auditor role:

GRANT SYSTEM admin TO jdoe
GRANT SYSTEM read TO auditor

GRANT Table Permission

To grant a table-level permission to a user or role:

GRANT < SELECT | INSERT | UPDATE | DELETE | ALL > [PRIVILEGES] ON [TABLE] [<schema name>.]<table name> TO <user name | role name>

For example, to grant full access on the order table to jdoe and then grant SELECT access on the order_history table to the analyst role:

GRANT ALL PRIVILEGES ON TABLE order TO jdoe
GRANT SELECT ON order_history TO analyst

REVOKE Role

To revoke a role from a user or role:

REVOKE <role name> FROM <user name | role name>

For example, to revoke a role allowing access to analyst tables from the analyst role, and then revoke that analyst role from user jdoe:

REVOKE analyst_table_access FROM analyst
REVOKE analyst FROM jdoe

REVOKE SYSTEM Permission

To revoke a system-level permission from a user or role:

REVOKE SYSTEM < ADMIN | READ | WRITE > FROM <user name | role name>

For example, to revoke system administrator permission from jdoe and then revoke write access to all tables from the auditor role:

REVOKE SYSTEM admin FROM jdoe
REVOKE SYSTEM write FROM auditor

REVOKE Table Permission

To revoke a table-level permission from a user or role:

REVOKE < SELECT | INSERT | UPDATE | DELETE | ALL > [PRIVILEGES] ON [TABLE] [<schema name>.]<table name> FROM <user name | role name>

For example, to revoke full access on the order table from jdoe and then revoke DELETE access on the order_history table from the analyst role:

REVOKE ALL PRIVILEGES ON TABLE order FROM jdoe
REVOKE DELETE ON order_history FROM analyst

SHOW SECURITY

To list the permissions & roles for one or more users and/or roles (or all users and roles in the system):

SHOW SECURITY [FOR <user name | role name>,...]

For example, to show the permissions & roles for jdoe:

SHOW SECURITY FOR jdoe

To show all users & roles:

SHOW SECURITY

Limitations

Kinetica has some limitations for any columns marked as store-only and string columns lacking a charN attribute.

Unlimited-Width Strings

String columns with no charN attribute are stored in whole on disk and in hashed form in memory. Because the strings are only available for processing as hashes, only equality-based operations can be applied to them.

Partially Applicable Operations

  • Scalar Column Functions
    • Null Functions (except ZEROIFNULL )
    • Conditional Functions (equality-based)
  • Aggregate Column Functions
    • COUNT
    • COUNT DISTINCT
  • WHERE (predicate condition must be equality-based)
  • JOIN (join condition must be equality-based)

Inapplicable Operations

  • Subqueries requiring a temporary view to be created
  • GROUP BY
  • UNION
  • INTERSECT
  • EXCEPT
  • CREATE TABLE...AS

Store-Only Columns

Columns marked store-only are only stored on disk, not in memory. Because they are not available for processing, only data extraction operations can be applied to them.

Inapplicable Operations

  • Scalar or Aggregate Column Functions
  • Subqueries requiring a temporary view to be created
  • WHERE
  • GROUP BY
  • JOIN
  • UNION
  • INTERSECT
  • EXCEPT
  • CREATE TABLE...AS

Not Yet Supported SQL

  • Correlated Subqueries (though these can be supported via rewrites as joins)
  • Ranking Functions
    • RANK
    • DENSE_RANK
    • ROW_NUMBER
    • NTILE
  • Partitioning
    • OVER/PARTITIONED BY
  • Spatial Queries (e.g., SQL GIS)
  • View DDL
    • CREATE VIEW
    • CREATE OR REPLACE VIEW
    • DROP VIEW