VOOZH about

URL: https://docs.databricks.com/aws/en/sql/language-manual/functions/cast

⇱ cast function | Databricks on AWS


Skip to main content
Last updated on

Applies to: πŸ‘ check marked yes
Databricks SQL πŸ‘ check marked yes
Databricks Runtime

Casts the value expr to the target data type type. This operator is a synonym for :: (colon colon sign) operator

Syntax​

cast(sourceExpr AS targetType)

Arguments​

  • sourceExpr: Any castable expression.
  • targetType: The data type of the result.

Returns​

The result is type targetType.

The following combinations of data type casting are valid:

Source (row) Target(column)

VOID

numeric

STRING

DATE

TIME

TIMESTAMP

TIMESTAMP_NTZ

year-month interval

day-time interval

BOOLEAN

BINARY

ARRAY

MAP

STRUCT

VARIANT

OBJECT

VOID

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

N

numeric

N

Y

Y

N

N

Y

N

Y

Y

Y

N

N

N

N

Y

N

STRING

N

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

N

N

N

Y

N

DATE

N

N

Y

Y

N

Y

Y

N

N

N

N

N

N

N

Y

N

TIME

N

N

Y

N

Y

N

N

N

N

N

N

N

N

N

N

N

TIMESTAMP

N

Y

Y

Y

N

Y

Y

N

N

N

N

N

N

N

Y

N

TIMESTAMP_NTZ

N

N

Y

Y

N

Y

Y

N

N

N

N

N

N

N

Y

N

year-month interval

N

Y

Y

N

N

N

N

Y

N

N

N

N

N

N

N

N

day-time interval

N

Y

Y

N

N

N

N

N

Y

N

N

N

N

N

N

N

BOOLEAN

N

Y

Y

N

N

Y

N

N

N

Y

N

N

N

N

Y

N

BINARY

N

Y

Y

N

N

N

N

N

N

N

Y

N

N

N

Y

N

ARRAY

N

N

Y

N

N

N

N

N

N

N

N

Y

N

N

Y

N

MAP

N

N

Y

N

N

N

N

N

N

N

N

N

Y

N

N

N

STRUCT

N

N

Y

N

N

N

N

N

N

N

N

N

N

Y

N

N

VARIANT

N

Y

Y

Y

N

Y

Y

N

N

Y

Y

Y

Y

Y

Y

N

OBJECT

N

N

N

N

N

N

N

N

N

N

N

N

Y

Y

N

N

Common error conditions​

Rules and limitations based on targetType​

warning

In Databricks Runtime, if spark.sql.ansi.enabled is false, an overflow will not cause an error but instead will β€œwrap” the result.

A sourceExpr value with an invalid format or invalid characters for targetType will result in a NULL.

numeric​

If the targetType is a numeric and sourceExpr is of type:

  • VOID

    The result is a NULL of the specified numeric type.

  • numeric

    If targetType is an integral numeric, the result is sourceExpr truncated to a whole number.

    Otherwise, the result is sourceExpr rounded to a fit the available scale of targetType.

    If the value is outside the range of targetType, Databricks raises CAST_OVERFLOW.

    Use try_cast to turn overflow errors into NULL.

  • STRING

    sourceExpr is read as a literal value of the targetType.

    If sourceExpr doesn't comply with the format for literal values, Databricks raises CAST_INVALID_INPUT.

    If the value is outside the range of the targetType, Databricks raises CAST_OVERFLOW.

    Use try_cast to turn overflow and invalid format errors into NULL.

  • TIMESTAMP

    The result is the number of seconds elapsed between 1970-01-01 00:00:00 UTC and sourceExpr.

    If targetType is an integral numeric, the result is truncated to a whole number.

    Otherwise, the result is rounded to a fit the available scale of targetType.

    If the result is outside the range of targetType, Databricks raises CAST_OVERFLOW.

    Use try_cast to turn overflow errors into NULL.

  • INTERVAL

    Applies to: πŸ‘ check marked yes
    Databricks SQL πŸ‘ check marked yes
    Databricks Runtime 11.3 LTS and above

    Target type must be an exact numeric.

    Given an INTERVAL upper_unit TO lower_unit the result is measured in total number of lower_unit. If the lower_unit is SECOND, fractional seconds are stored to the right of the decimal point. For all other intervals the result is always an integral number.

  • BOOLEAN

    If sourceExpr is:

    • true: The result is 1.
    • false: The result is 0.
    • NULL: The result is NULL.
  • VARIANT

    The rules of the type of the actual value of the VARIANT type apply.

Examples​

SQL
>SELECT cast(NULLASINT);
NULL

>SELECT cast(5.6ASINT);
5

>SELECT cast(5.6ASDECIMAL(2,0));
6

>SELECT cast(-5.6ASINT);
-5

>SELECT cast(-5.6ASDECIMAL(2,0));
-6

>SELECT cast(128ASTINYINT);
Error: CAST_OVERFLOW

>SELECT cast(128ASDECIMAL(2,0));
Error: CAST_OVERFLOW

>SELECT cast('123'ASINT);
123

>SELECT cast('123.0'ASINT);
Error: CAST_INVALID_INPUT

>SELECT cast(TIMESTAMP'1970-01-01 00:00:01'AS LONG);
1

>SELECT cast(TIMESTAMP'1970-01-01 00:00:00.000001'ASDOUBLE);
1.0E-6

>SELECT cast(TIMESTAMP'2022-02-01 00:00:00'ASSMALLINT);
Error: CAST_OVERFLOW
>SELECT cast(trueASBOOLEAN);
1

>SELECT cast(INTERVAL'1-2'YEARTOMONTHASINTEGER);
14

>SELECT cast(INTERVAL'1:30.5'MINUTETOSECONDASDECIMAL(5,2));
90.50

>SELECT cast(TRUEASINT);
1

>SELECT cast(FALSEASINT);
0

>SELECT cast('15'::VARIANT ASINT);
15

STRING​

If the sourceExpr is a STRING the resulting STRING inherits the collation of sourceExpr. In all other cases the collation of the resulting STRING is the default collation.

To change the collation add the collate expression.

If the targetType is a STRING type and sourceExpr is of type:

  • VOID

    The result is a NULL string.

  • exact numeric

    The result is the literal number with an optional minus-sign and no leading zeros except for the single digit to the left of the decimal point. If the targetType is DECIMAL(p, s) with s greater 0, a decimal point is added and trailing zeros are added up to scale.

  • floating-point binary

    If the absolute number is less that 10,000,000 and greater or equal than 0.001, the result is expressed without scientific notation with at least one digit on either side of the decimal point.

    Otherwise, Databricks uses a mantissa followed by E and an exponent. The mantissa has an optional leading minus sign followed by one digit to the left of the decimal point, and the minimal number of digits greater than zero to the right. The exponent has and optional leading minus sign.

  • DATE

    If the year is between 9999 BCE and 9999 CE, the result is a dateString of the form -YYYY-MM-DD and YYYY-MM-DD respectively.

    For years prior or after this range, the necessary number of digits are added to the year component and + is used for CE.

  • TIME

    The result is a timeString of the form hh:mm:ss. Fractional seconds .f... are added if necessary.

  • TIMESTAMP

    If the year is between 9999 BCE and 9999 CE, the result is a timestampString of the form -YYYY-MM-DD hh:mm:ss and YYYY-MM-DD hh:mm:ss respectively.

    For years prior or after this range, the necessary number of digits are added to the year component and + is used for CE.

    Fractional seconds .f... are added if necessary.

  • TIMESTAMP_NTZ

    If the year is between 9999 BCE and 9999 CE, the result is a timestampString of the form -YYYY-MM-DD hh:mm:ss and YYYY-MM-DD hh:mm:ss respectively.

    For years prior or after this range, the necessary number of digits are added to the year component and + is used for CE.

    Fractional seconds .f... are added if necessary.

  • year-month interval

    The result is its shortest representation of the interval literal. If the interval is negative, the sign is embedded in the interval-string. For units smaller than 10,leading zeros are omitted.

    A typical year-month interval string has the form:

    • INTERVAL 'Y' YEAR
    • INTERVAL 'Y-M' YEAR TO MONTH
    • INTERVAL 'M' MONTH
  • day-time interval

    The result is its shortest representation of the interval literal. If the interval is negative, the sign is embedded in the interval-string. For units smaller than 10, leading zeros are omitted.

    A typical day time interval string has the form:

    • INTERVAL 'D' DAY
    • INTERVAL 'D h' DAY TO HOUR
    • INTERVAL 'D h:m' DAY TO MINUTE
    • INTERVAL 'D h:m:s' DAY TO SECOND
    • INTERVAL 'h' HOUR
    • INTERVAL 'h:m' HOUR TO MINUTE
    • INTERVAL 'm:s' MINUTE TO SECOND
    • INTERVAL 's' SECOND
  • BOOLEAN

    The result of the true boolean is the STRING literal true. For false it's the STRING literal false. For NULL it's the NULL string.

  • BINARY

    A result is the binary sourceExpr interpreted as a UTF-8 character sequence.

    Databricks doesn't validate the UTF-8 characters. A cast from BINARY to STRING will never inject substitution characters or raise an error.

  • ARRAY

    The result is a comma separated list of cast elements, which is braced with square brackets [ ]. One space follows each comma. A NULL element is translated to a literal null.

    Databricks doesn't quote or otherwise mark individual elements, which may themselves contain brackets or commas.

  • MAP

    The result is a comma separated list of cast key value pairs, which is braced with curly braces { }. One space follows each comma. Each key value pair is separated by a ->. A NULL map value is translated to literal null.

    Databricks doesn't quote or otherwise mark individual keys or values, which may themselves may contain curly braces, commas or ->.

  • STRUCT

    The result is a comma separated list of cast field values, which is braced with curly braces { }. One space follows each comma. A NULL field value is translated to a literal null.

    Databricks doesn't quote or otherwise mark individual field values, which may themselves may contain curly braces, or commas.

  • VARIANT

    The rules of the type of the actual value of the VARIANT type apply.

Examples​

SQL
>SELECT cast(NULLAS STRING);
NULL

>SELECT cast(-3Y AS STRING);
-3

>SELECT cast(5::DECIMAL(10,5)AS STRING);
5.00000

>SELECT cast(12345678e-4AS STRING);
1234.5678

>SELECT cast(1e7 as string);
1.0E7

>SELECT cast(1e6 as string);
1000000.0

>SELECT cast(1e-4as string);
1.0E-4

>SELECT cast(1e-3as string);
0.001

>SELECT cast(12345678e7 AS STRING);
1.2345678E14

>SELECT cast(DATE'1900-12-31'AS STRING);
1900-12-31

-- Caesar no more
>SELECT cast(DATE'-0044-03-15'AS STRING);
-0044-03-15

>SELECT cast(DATE'100000-12-31'AS STRING);
+100000-12-31

>SELECT cast(current_timestamp()AS STRING);
2022-04-0222:29:09.783

>SELECT cast(TIMESTAMP_NTZ'2023-01-01'AS STRING);
2023-01-0100:00:00

>SELECT cast(INTERVAL-'13-02'YEARTOMONTHAS STRING);
INTERVAL'-13-2'YEARTOMONTH

>SELECT cast(INTERVAL'12:04.9900'MINUTETOSECONDAS STRING);
INTERVAL'12:04.99'MINUTETOSECOND

>SELECT cast(trueAS STRING);
true

>SELECT cast(falseAS STRING);
false

-- A bad UTF-8 string
>SELECT cast(x'33800033'AS STRING);
3οΏ½3

>SELECT hex(cast(x'33800033'AS STRING));
33800033

>SELECT cast(array('hello',NULL,'world')AS STRING);
[hello,null, world]

>SELECT cast(array('hello','wor, ld')AS STRING);
[hello, wor, ld]

>SELECT cast(array()AS STRING);
[]

>SELECT cast(map('hello',1,'world',null)AS STRING);
{hello ->1, world ->null}

>SELECT cast(map('hello -> 1',DATE'2022-01-01')AS STRING);
{hello ->1->2022-01-01}

>SELECT cast(map()AS STRING);
{}

>SELECT cast(named_struct('a',5,'b',6,'c',NULL)AS STRING);
{5,6,null}

>SELECT cast(named_struct()AS STRING);
{}

>SELECT cast(DATE'2024-01-05'::VARIANT AS STRING);
2024-01-05

>SELECT cast(5AS STRING)COLLATE UNICODE;
5

DATE​

If the targetType is a DATE type and sourceExpr is of type:

  • VOID

    The result is a NULL DATE.

  • STRING

    sourceExpr must be a valid dateString.

    If sourceExpr is not a valid dateString, Databricks raises CAST_INVALID_INPUT.

    Use try_cast to turn invalid data errors into NULL.

  • TIMESTAMP

    The result is date portion of the timestamp sourceExpr.

  • TIMESTAMP_NTZ

    The result is date portion of the timestamp_ntz sourceExpr.

  • VARIANT

    The type rules for the actual value held by the VARIANT data type apply.

Examples​

SQL
>SELECT cast(NULLASDATE);
NULL

>SELECT cast('1900-10-01'ASDATE);
1900-10-01

>SELECT cast('1900-10-01'ASDATE);
1900-10-01

-- There is no February 30.
>SELECT cast('1900-02-30'ASDATE);
Error: CAST_INVALID_INPUT

>SELECT cast(TIMESTAMP'1900-10-01 12:13:14'ASDATE);
1900-10-01

>SELECT cast(TIMESTAMP_NTZ'1900-10-01 12:13:14'ASDATE);
1900-10-01

>SELECT cast(TIMESTAMP_NTZ'1900-10-01 12:13:14'::VARIANT ASDATE);
1900-10-01

TIMESTAMP​

If the targetType is a TIMESTAMP type and sourceExpr is of type:

  • VOID

    The result is a NULL DATE.

  • numeric

    sourceExpr is read as the number of seconds since 1970-01-01 00:00:00 UTC.

    Fractions smaller than microseconds are truncated.

    If the value is outside of the range of TIMESTAMP, Databricks raises CAST_OVERFLOW.

    Use try_cast to turn overflow errors into NULL.

  • STRING

    sourceExpr must be a valid timestampString.

    If sourceExpr is not a valid timestampString, Databricks raises CAST_INVALID_INPUT.

    Use try_cast to turn invalid data errors into NULL.

  • DATE

    The result is the sourceExpr DATE at 00:00:00hrs.

  • TIMESTAMP_NTZ

The result is a timestamp value with the same year/month/day/hour/minute/second fields of timestamp_ntz sourceExpr.

  • VARIANT

    The type rules for the actual value held by the VARIANT data type apply.

Examples​

SQL
>SELECT cast(NULLASTIMESTAMP);
NULL

>SETTIME ZONE '+00:00';
>SELECT cast(0.0ASTIMESTAMP);
1970-01-0100:00:00

>SELECT cast(0.0000009ASTIMESTAMP);
1970-01-0100:00:00

>SELECT cast(1e20 ASTIMESTAMP);
Error: CAST_OVERFLOW

>SELECT cast('1900'ASTIMESTAMP);
1900-01-0100:00:00

>SELECT cast('1900-10-01 12:13:14'ASTIMESTAMP);
1900-10-0112:13:14

>SELECT cast('1900-02-30 12:13:14'ASTIMESTAMP);
Error: CAST_INVALID_INPUT

>SELECT cast(DATE'1900-10-01'ASTIMESTAMP);
1900-10-0100:00:00

>SELECT cast(TIMESTAMP_NTZ'2023-01-01 02:03:04.567'asTIMESTAMP)
2023-01-0102:03:04.567

>SELECT cast(DATE'1900-10-01'::VARIANT ASTIMESTAMP);
1900-10-0100:00:00

TIMESTAMP_NTZ​

If the targetType is a TIMESTAMP_NTZ type and sourceExpr is of type:

  • VOID

    The result is a NULL DATE.

  • STRING

    sourceExpr must be a valid timestampString.

    If sourceExpr is not a valid timestampString, Databricks raises CAST_INVALID_INPUT.

    Use try_cast to turn invalid data errors into NULL.

  • DATE

    The result is the sourceExpr DATE at 00:00:00hrs.

  • TIMESTAMP

    The result is local time as the sourceExpr in the session time zone.

  • VARIANT

    The type rules for the actual value held by the VARIANT data type apply.

Examples​

SQL
>SELECT cast(NULLAS TIMESTAMP_NTZ);
NULL

>SELECT cast('1900'AS TIMESTAMP_NTZ);
1900-01-0100:00:00

>SELECT cast('1900-10-01 12:13:14'AS TIMESTAMP_NTZ);
1900-10-0112:13:14

>SELECT cast('1900-02-30 12:13:14'AS TIMESTAMP_NTZ);
Error: CAST_INVALID_INPUT

>SELECT cast(DATE'1900-10-01'AS TIMESTAMP_NTZ);
1900-10-0100:00:00

>SELECT current_timezone(), CAST(TIMESTAMP'2021-7-1T8:43:28'as TIMESTAMP_NTZ);
America/Los_Angeles 2021-07-0108:43:28

>SELECT current_timezone(), CAST(TIMESTAMP'2021-7-1T8:43:28UTC+3'as TIMESTAMP_NTZ);
America/Los_Angeles 2021-06-3022:43:28

>SELECT cast(DATE'1900-10-01'::VARIANT AS TIMESTAMP_NTZ);
1900-10-0100:00:00

TIME​

If the targetType is a TIME type and sourceExpr is of type:

  • VOID

    The result is a NULL TIME.

  • STRING

    sourceExpr must be a valid timeString.

    If sourceExpr is not a valid timeString, Databricks raises CAST_INVALID_INPUT.

    Use try_cast to turn invalid data errors into NULL.

  • TIME

    If the target precision is lower, the fractional seconds are truncated (not rounded). If the target precision is higher, the result is padded with trailing zeros.

Examples​

SQL
>SELECT CAST(NULLASTIME);
NULL

>SELECT CAST('10:30:00'ASTIME);
10:30:00

>SELECT CAST('25:00:00'ASTIME);
Error: CAST_INVALID_INPUT

>SELECT CAST(TIME'09:15:30.123456'ASTIME(3));
09:15:30.123

year-month interval​

If the targetType is a year-month interval and sourceExpr is of type:

Examples​

SQL
>SELECT cast(NULLASINTERVALYEAR);
NULL

>SELECT cast('1-4'ASINTERVALYEARTOMONTH)::STRING;
INTERVAL'1-4'YEARTOMONTH

>SELECT cast('1'ASINTERVALYEARTOMONTH);
Error: CAST_INVALID_INPUT

>SELECT cast(INTERVAL'1-4'YEARTOMONTHASINTERVALMONTH)::STRING;
INTERVAL'16'MONTH

>SELECT cast(14ASINTERVALYEARTOMONTH)::STRING;
INTERVAL'1-2'YEARTOMONTH

>SELECT cast(INTERVAL'1-11'YEARTOMONTHASINTERVALYEAR)::STRING;
INTERVAL'1'YEAR

day-time interval​

If the targetType is a day-time interval and sourceExpr is of type:

SQL
>SELECT cast(NULLASINTERVALHOUR);
NULL

>SELECT cast('1 4:23'ASINTERVALDAYTOMINUTE)::STRING;
INTERVAL'1 04:23'DAYTOMINUTE

>SELECT cast('1'ASINTERVALDAYTOMINUTE);
Error: CAST_INVALID_INPUT

>SELECT cast(INTERVAL'1 4:23'DAYTOMINUTEASINTERVALMINUTE)::STRING;
INTERVAL'1703'MINUTE

>SELECT cast(INTERVAL'1 4:23'DAYTOMINUTEASINTERVALHOUR)::STRING;
INTERVAL'28'HOUR

>SELECT cast(125.3ASINTERVALMINUTETOSECOND)::STRING;
INTERVAL'2:5.3'MINUTETOSECOND

BOOLEAN​

If the targetType is a BOOLEAN and sourceExpr is of type:

  • VOID

    The result is a NULL of type BOOLEAN.

  • numeric

    If sourceExpr is:

  • STRING

    If sourcEexpr is (case insensitive):

    • 'T', 'TRUE', 'Y', 'YES', or '1': The result is true
    • 'F', 'FALSE', 'N', 'NO', or '0': The result is false
    • NULL: The result is NULL

    Otherwise, Databricks raises CAST_INVALID_INPUT.

    Use try_cast to turn invalid data errors into NULL.

  • VARIANT

    The rules of the type of the actual value of the VARIANT type apply.

Examples​

SQL
>SELECT cast(NULLASBOOLEAN);
NULL

>SELECT cast('T'ASBOOLEAN);
true

>SELECT cast('True'ASBOOLEAN);
true

>SELECT cast('1'ASBOOLEAN);
true

>SELECT cast('0'ASBOOLEAN);
false

>SELECT cast('n'ASBOOLEAN);
false

>SELECT cast('on'ASBOOLEAN);
Error: CAST_INVALID_INPUT

>SELECT cast(0ASBOOLEAN);
false

>SELECT cast(0.0E10 ASBOOLEAN);
false

>SELECT cast(1ASBOOLEAN);
true

>SELECT cast(0.1ASBOOLEAN);
true

>SELECT cast('NaN'::FLOATASBOOLEAN);
true

>SELECT cast(1::VARIANT ASBOOLEAN);
true

BINARY​

If the targetType is a BINARY and sourceExpr is of type:

  • VOID

    The result is a NULL of type BINARY.

  • STRING

    The result is the UTF-8 encoding of the surceExpr.

  • VARIANT

    The type rules for the actual value held by the VARIANT data type apply.

Examples​

SQL
>SELECT cast(NULLASBINARY);
NULL

>SELECT hex(cast('Spark SQL'ASBINARY));
537061726B2053514C

>SELECT hex(cast('OΠ΄esa'ASBINARY));
4FD0B4657361

>SELECT hex(cast('OΠ΄esa'::VARIANT ASBINARY));
4FD0B4657361

ARRAY​

If the targetType is an ARRAY < targetElementType > and sourceExpr is of type:

  • VOID

    The result is a NULL of the targeType.

  • ARRAY < sourceElementType >

    If the cast from sourceElementType to targetElementType is supported, the result is an ARRAY<targetElementType> with all elements cast to the targetElementType.

    Databricks raises an error if the cast isn't supported or if any of the elements can't be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

  • VARIANT

    The type rules for the actual value held by the VARIANT data type apply.

Examples​

SQL
>SELECT cast(NULLAS ARRAY<INT>);
NULL

>SELECT cast(array('t','f',NULL)AS ARRAY<BOOLEAN>);
[true,false,NULL]

>SELECT cast(array('t','f',NULL)ASINTERVALYEAR);
Error: DATATYPE_MISMATCH

>SELECT cast(array('t','f','o')AS ARRAY<BOOLEAN>);
Error: CAST_INVALID_INPUT

>SELECT cast(array('t','f',NULL)::VARIANT AS ARRAY<BOOLEAN>);
[true,false,NULL]

MAP​

If the targetType is an MAP < targetKeyType, targetValueType > and sourceExpr is of type:

  • VOID

    The result is a NULL of the targetType.

  • MAP <sourceKeyType, sourceValueType >

    If the casts from sourceKeyType to targetKeyType and sourceValueType to targetValueType are supported, the result is an MAP<targetKeyType, targetValueType> with all keys cast to the targetKeyType and all values cast to the targetValueType.

    Databricks raises an error if the cast isn't supported or if any of the keys or values can't be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

  • VARIANT

    The type rules for the actual value held by the VARIANT data type apply.

  • OBJECT < [sourceFieldName : sourceFieldType [, …]] >

    Each sourceFieldName of type STRING is cast to targetKeyType and mapped to a map key. Each source field value of sourceFieldType is cast targetValueType and mapped the respective map value.

    Databricks raises an error if any casts are not supported or if any of the field values or key values can't be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

Examples​

SQL
>SELECT cast(NULLAS MAP<STRING,INT>);
NULL

>SELECT cast(map('10','t','15','f','20',NULL)AS MAP<INT,BOOLEAN>);
{10->true,15->false,20->null}

>SELECT cast(map('10','t','15','f','20',NULL)AS MAP<INT, ARRAY<INT>>);
Error: DATATYPE_MISMATCH

>SELECT cast(map('10','t','15','f','20','o')AS MAP<INT,BOOLEAN>);
Error: CAST_INVALID_INPUT

-- Casting an OBJECT to a MAP
>SELECT schema_of_variant(parse_json('{"cars": 12, "bicycles": 5 }'));
OBJECT<bicycles: BIGINT, cars: BIGINT>
>SELECT CAST(parse_json('{"cars": 12, "bicycles": 5 }')AS MAP<STRING,INTEGER>);
{bicycles ->5, cars ->12}

STRUCT​

If the targetType is a STRUCT <[targetFieldName : targetFieldType [NOT NULL] [COMMENT str] [, …]] > and sourceExpr is of type:

  • VOID

    The result is a NULL of the targetType.

  • STRUCT < [sourceFieldName : sourceFieldType [NOT NULL] [COMMENT str] [, …]] >

    The sourceExpr can be cast to targetType if all of these conditions are true:

    • The source type has the same number of fields as the target
    • For all fields: sourceFieldTypeN can be cast to the targetFieldTypeN.
    • For all field values: The source field value N can be cast to targetFieldTypeN and the value isn't null if target field N is marked as NOT NULL.

    sourceFieldNames, source NOT NULL constraints, and source COMMENTs need not match the targetType and are ignored.

    Databricks raises an error if the cast isn't supported or if any of the fields can't be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

  • VARIANT

    The type rules for the actual value held by the VARIANT data type apply.

  • OBJECT < [sourceFieldName : sourceFieldType [, …]] >

    All sourceFieldNames are matched to sourceFieldNames. Each source field value of sourceFieldType is cast to the matched targetValueType and mapped to the respective map value.

    If a targetFieldName is not matched, the field value is NULL.

    If a sourceFieldName is not matched, the field is ignored.

    Databricks raises an error if any casts are not supported or if any of the field values or key values can't be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

Examples​

SQL
>SELECT cast(NULLAS STRUCT<a:INT>);
NULL

>SELECT cast(named_struct('a','t','b','1900-01-01')AS STRUCT<b:BOOLEAN, c:DATENOTNULLCOMMENT'Hello'>);
{"b":true,"c":1900-01-01}

>SELECT cast(named_struct('a','t','b',NULL::DATE)AS STRUCT<b:BOOLEAN, c:DATENOTNULLCOMMENT'Hello'>);
Error: DATATYPE_MISMATCH

>SELECT cast(named_struct('a','t','b','1900')AS STRUCT<b:BOOLEAN, c:ARRAY<INT>>);
Error: DATATYPE_MISMATCH

>SELECT cast(named_struct('a','t','b','hello')AS STRUCT<b:BOOLEAN, c:DATE>);
Error: CAST_INVALID_INPUT

>SELECT cast(named_struct('a','t','b','1900-01-01')::VARIANT AS STRUCT<b:BOOLEAN, c:DATENOTNULLCOMMENT'Hello'>);
{"b":true,"c":1900-01-01}

-- Casting an OBJECT to a STRUCT
>SELECT schema_of_variant(parse_json('{"name": "jason", "age": 25 }'));
OBJECT<age: BIGINT, name: STRING>
>SELECT CAST(parse_json('{"name": "jason", "age": 25 }')AS STRUCT<id: BIGINT, name: STRING>);
{"id":null,"name":"jason"}

VARIANT​

If the targetType is a VARIANT and sourceExpr is of type:

  • VOID

    The result is a NULL of type VARIANT.

  • numeric

    The result is a VARIANT, representing the numeric value. The precision of DECIMAL types must be <= 38.

    All integral numerics are mapped to BIGINT.

    All DECIMAL values are mapped to their narrowest precision and scale.

  • STRING

    The result is a VARIANT, representing the STRING value.

  • DATE

    The result is a VARIANT, representing the DATE value.

  • TIMESTAMP

    The result is a VARIANT, representing the TIMESTAMP value.

  • TIMESTAMP_NTZ

    The result is a VARIANT, representing the TIMESTAMP NTZ value.

  • BOOLEAN

    The result is a VARIANT, representing the BOOLEAN value.

  • BINARY

    The result is a VARIANT, representing the BINARY value.

  • ARRAY < sourceElementType >

    If the cast from sourceElementType to VARIANT is supported, the result is VARIANT, representing ARRAY<sourceElementType>.

    Databricks raises an error if the cast isn't supported.

  • STRUCT

    This cast is not directly supported since VARIANT cannot preserve the order of struct fields. Use to_variant_object to convert the STRUCT to VARIANT as an OBJECT.

  • MAP

    This cast is not directly supported since VARIANT cannot preserve the MAP type. Use to_variant_object to convert the MAP to VARIANT as an OBJECT. The map keys must be strings.

    Databricks raises an error if the cast isn't supported.

    Use try_cast to turn invalid data or overflow errors into NULL.

Examples​

SQL
>SELECT cast(NULLAS VARIANT);
NULL

>SELECT cast(5.1000AS VARIANT);
5.1

>SELECT schema_of_variant(cast(5AS VARIANT));
BIGINT