CREATE TABLE
The CREATE TABLE command creates a new table.
Syntax
CREATE [ROWSTORE] [REFERENCE | TEMPORARY | GLOBAL TEMPORARY] TABLE [IF NOT EXISTS] <table_name>
    (<create_definition>,...)
    [<table_options>]
    [[AS] SELECT ...]

CREATE TABLE [IF NOT EXISTS] new_tbl_name
    { LIKE original_tbl_name | (LIKE original_tbl_name) }
    [WITH DEEP | SHALLOW COPY]

<create_definition>:
    <column_name> { <column_definition> | AS <computed_column_definition> }
  | [CONSTRAINT [symbol]] PRIMARY KEY [<index_type>] (<index_column_name>,...)
      [<index_option>] ...
  | { INDEX | KEY } [<index_name>] [<index_type>] (<index_column_name>,...)
      [<index_option>] ...
  | [CONSTRAINT [symbol]] UNIQUE [INDEX | KEY] [<index_name>] [<index_type>] (<index_column_name>,...)
      [<index_option>] ...
  | [CONSTRAINT [symbol]] SHARD KEY [<index_type>] (<index_column_name>,...) [METADATA_ONLY]
      [<index_option>] ...
  | SORT KEY (<index_column_name>,... [DESC])
  | COLUMN GROUP [column_group_name] (*)
  | FULLTEXT [USING VERSION 1] [<index_name>] (<index_column_name>,...)
  | FULLTEXT USING VERSION 2 [<index_name>] (<index_column_name>,...)
  | VECTOR { INDEX | KEY } [<index_name>] (<column>) [INDEX_OPTIONS '<json>']
  | MULTI VALUE INDEX (col1) INDEX_OPTIONS='<options>'
  | FOREIGN KEY [foreign_key_name] (col1, ..., coln) REFERENCES table_referenced (col_referenced)

<column_definition>:
    <data_type> [NOT NULL | NULL] [DEFAULT <default_value>] [ON UPDATE <update_value>]
    [AUTO_INCREMENT [AS SEQUENCE]] [UNIQUE [KEY] | [PRIMARY] KEY] [SPARSE] [SERIES TIMESTAMP]

<computed_column_definition>:
    computed_column_expression PERSISTED [data_type | AUTO]

<data_type>:
    BIT[(<length>)]
  | TINYINT[(<length>)] [UNSIGNED]
  | SMALLINT[(<length>)] [UNSIGNED]
  | INT[(<length>)] [UNSIGNED]
  | INTEGER[(<length>)] [UNSIGNED]
  | BIGINT[(<length>)] [UNSIGNED]
  | REAL[(<length>,<decimals>)] [UNSIGNED]
  | DOUBLE[(<length>,<decimals>)] [UNSIGNED]
  | DECIMAL[(<length>[,<decimals>])] [UNSIGNED]
  | NUMERIC[(<length>[,<decimals>])] [UNSIGNED]
  | DATETIME
  | DATETIME(6)
  | TIMESTAMP
  | TIMESTAMP(6)
  | DATE
  | TIME
  | CHAR[(<length>)] [CHARACTER SET <character_set_name>] [COLLATE <collation_name>]
  | VARCHAR(<length>) [CHARACTER SET <character_set_name>] [COLLATE <collation_name>]
  | TINYBLOB
  | BLOB
  | MEDIUMBLOB
  | LONGBLOB
  | TINYTEXT [BINARY]
  | TEXT [BINARY]
  | MEDIUMTEXT [BINARY]
  | LONGTEXT [BINARY]
  | ENUM(<value1>,<value2>,<value3>,...)
  | SET(<value1>,<value2>,<value3>,...)
  | JSON [COLLATE <collation_name>]
  | GEOGRAPHY
  | GEOGRAPHYPOINT

<index_column_name>:
    <column_name> [(<length>)] [ASC | DESC]

<index_type>:
    USING { BTREE | HASH }

<index_option>:
    KEY_BLOCK_SIZE [=] <value>
  | <index_type>
  | COMMENT '<string>'
  | BUCKET_COUNT [=] <value>
  | WITH (<index_kv_options>)
  | UNENFORCED [RELY | NORELY]

<index_kv_options>:
    <index_kv_option> [, <index_kv_option>] ...

<index_kv_option>:
    RESOLUTION = <value>
  | COLUMNSTORE_SEGMENT_ROWS = <value>
  | COLUMNSTORE_FLUSH_BYTES = <value>

<table_options>:
    <table_option> [[,] <table_option>] ...

<table_option>:
    AUTO_INCREMENT [=] <value>
  | COMMENT [=] '<string>'
  | AUTOSTATS_ENABLED = { TRUE | FALSE }
  | AUTOSTATS_CARDINALITY_MODE = { INCREMENTAL | PERIODIC | OFF }
  | AUTOSTATS_HISTOGRAM_MODE = { CREATE | UPDATE | OFF }
  | AUTOSTATS_SAMPLING = { ON | OFF }
  | COMPRESSION = SPARSE
CREATE { TABLE | TABLES } AS INFER PIPELINE Syntax
CREATE TABLE [IF NOT EXISTS] <table_name>
    AS INFER PIPELINE AS LOAD DATA
    { LINK <link_name>
    | MONGODB "<collection>" CONFIG '<config_json>' CREDENTIALS '<credentials_json>'
    | MYSQL "<source_db>.<source_table>" CONFIG '<config_json>' CREDENTIALS '<credentials_json>' }
    FORMAT AVRO;

CREATE TABLES [IF NOT EXISTS]
    AS INFER PIPELINE AS LOAD DATA
    { LINK <link_name>
    | MONGODB '*' CONFIG '<conf_json>' CREDENTIALS '<cred_json>'
    | MYSQL "*" CONFIG '<config_json>' CREDENTIALS '<credentials_json>' }
    FORMAT AVRO;
SingleStore supports replicating data via Change Data Capture (CDC) pipelines using the CREATE { TABLE | TABLES } AS INFER PIPELINE syntax only from MongoDB® and MySQL data sources. Refer to the following for replicating data from the respective data source:
Note
The CREATE {TABLE|TABLES} ... AS INFER PIPELINE statement only supports links (LINK clause) to the MySQL and MongoDB® data sources.
Remarks
Note
Unless CREATE ROWSTORE TABLE ... or SORT KEY() are specified, the value of the default_table_type engine variable determines the type of table (columnstore or rowstore) that is created.
When default_table_type is set to columnstore, you can create a columnstore table using standard CREATE TABLE syntax.
default_table_type is set to columnstore for SingleStore Helios workspaces. You cannot change the value of default_table_type.
The setting of default_table_type applies to temporary tables. When creating GLOBAL TEMPORARY tables, if default_table_type is set to columnstore, you must use CREATE ROWSTORE GLOBAL TEMPORARY TABLE. GLOBAL TEMPORARY is not supported on columnstore tables.
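The note above can be illustrated with a minimal sketch. The table names (events_cs, events_rs) and their columns are hypothetical and chosen only for illustration; the statements assume a deployment where you are allowed to create both table types.

```sql
-- Check which table type a plain CREATE TABLE would produce.
SELECT @@default_table_type;

-- An explicit SORT KEY makes this a columnstore table,
-- whatever default_table_type is set to.
CREATE TABLE events_cs (
  id BIGINT,
  created DATETIME(6),
  SORT KEY (created)
);

-- The ROWSTORE keyword makes this a rowstore table,
-- whatever default_table_type is set to.
CREATE ROWSTORE TABLE events_rs (
  id BIGINT,
  created DATETIME(6),
  PRIMARY KEY (id)
);
```

Stating the table type explicitly in this way keeps a schema script portable between environments that may use different default_table_type values.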
- For more information about the data types listed above, and for an explanation of UNSIGNED, refer to the Data Types topic.
- The SET data type restricts the values that can be inserted for a table column. Only the set of strings that are listed for a column at the time of table creation can be inserted.
- <table_name> is the name of the table to create in the SingleStore Helios database.
- The following note applies when the engine variable table_name_case_sensitivity is set to OFF: After you create a table, you cannot create another table having the same table name with a different case. Refer to the Database Object Case Sensitivity topic for more information.
- CREATE TABLE is slower in SingleStore Helios than in MySQL. See Code Generation for more information.
- The MULTI VALUE INDEX clause is only supported for BSON and JSON type columns. Refer to Multi-Value Hash Index (BSON) or Multi-Value Hash Index (JSON) for creating multi-value indexes on the respective column type.
- The KEY syntax is equivalent to the INDEX syntax when used in CREATE TABLE. The convention is to use the KEY syntax. INDEX syntax is generally used when creating an index on an existing table. See CREATE INDEX for more information.
- SingleStore does not enforce foreign key referential integrity, but it accepts FOREIGN KEY syntax when the ignore_foreign_keys engine variable is set to ON. The default value for ignore_foreign_keys is OFF. See the Specifying Unenforced Unique Constraints page for more information.
- The BTREE index type creates a skip list index in SingleStore Helios. This index has very similar characteristics to a BTREE index.
- If you do not want to specify a column (or columns) to sort on, or do not care about the sort order for your data, you can specify an empty key (e.g. SORT KEY()).
- The SORT KEY() order can be specified as ascending (SORT KEY(index_column_name)) or descending (SORT KEY(index_column_name DESC)). SingleStore does not support scanning a SORT KEY() in reverse order to its sort order. In the example below, the query that orders by col1 DESC matches the sort key and uses an OrderedColumnStoreScan, while the query that orders by col1 ascending uses a plain ColumnStoreScan followed by a TopSort:

  CREATE TABLE ct_sort (col1 int, SORT KEY(col1 DESC));

  EXPLAIN SELECT * FROM ct_sort ORDER BY col1 DESC;
  +-------------------------------------------------------------------------------------------+
  | EXPLAIN                                                                                   |
  +-------------------------------------------------------------------------------------------+
  | Project [remote_0.col1]                                                                   |
  | TopSort limit:[@@SESSION.`sql_select_limit`] [remote_0.col1 DESC]                         |
  | Gather partitions:all alias:remote_0 parallelism_level:sub_partition                      |
  | Project [t1.col1]                                                                         |
  | Top limit:[?]                                                                             |
  | ColumnStoreFilter [<after per-thread scan begin> AND <before per-thread scan end>]        |
  | OrderedColumnStoreScan test1.t1, SORT KEY col1 (col1 DESC) table_type:sharded_columnstore |
  +-------------------------------------------------------------------------------------------+

  EXPLAIN SELECT * FROM ct_sort ORDER BY col1;
  +------------------------------------------------------------------------------------+
  | EXPLAIN                                                                            |
  +------------------------------------------------------------------------------------+
  | Project [remote_0.col1]                                                            |
  | TopSort limit:[@@SESSION.`sql_select_limit`] [remote_0.col1]                       |
  | Gather partitions:all alias:remote_0 parallelism_level:segment                     |
  | Project [t1.col1]                                                                  |
  | TopSort limit:[?] [t1.col1]                                                        |
  | ColumnStoreScan test1.t1, SORT KEY col1 (col1 DESC) table_type:sharded_columnstore |
  +------------------------------------------------------------------------------------+

- SORT KEY() is not allowed when using CREATE ROWSTORE TABLE ....
- KEY() USING CLUSTERED COLUMNSTORE is a legacy syntax that is equivalent to SORT KEY(). SingleStore recommends using SORT KEY().
- BUCKET_COUNT is specific to the HASH index type. It controls the bucket count of the hash table. It applies to rowstore hash indexes only and does not affect columnstore hash indexes.
- The UNENFORCED index option can be used on a UNIQUE constraint to specify that the unique constraint is unenforced. See Unenforced Unique Constraints.
- RESOLUTION is specific to indexes on geospatial columns. See Working with Geospatial Features for more information.
- COLUMNSTORE_SEGMENT_ROWS and COLUMNSTORE_FLUSH_BYTES control configuration variables specific to columnstore tables. See Advanced Columnstore Configuration Options for more information.
- SingleStore supports binary, utf8, and utf8mb4 character sets. The COLLATE clause in the CREATE TABLE statement can be used to override the server character set and collation, which are used as default values for the table.
- AUTOSTATS_ENABLED controls whether automatic statistics are collected on this table. There are three categories of autostats: AUTOSTATS_CARDINALITY_MODE, AUTOSTATS_HISTOGRAM_MODE, and AUTOSTATS_SAMPLING. SingleStore Helios allows you to independently control how each category of statistics is automatically gathered. Multiple autostats settings can be combined in a single CREATE TABLE statement. See Automatic Statistics for more information.
- This command causes implicit commits. Refer to COMMIT for more information.
- Specify the AUTO option instead of the <data_type> in the <computed_column_definition> clause to automatically infer the data type of the column from the <computed_column_expression>.

  Note
  When using CREATE TABLE to define a persistent computed column, do not rely on AUTO NOT NULL to enforce non-nullability. For persistent computed columns, AUTO can cause the NOT NULL constraint to be ignored. To ensure that the computed column is non-nullable, define the computed column with an explicit type rather than AUTO. For more information, refer to Persistent Computed Columns.

- <computed_column_expression> defines the value of a computed column using other columns in the table, constants, built-in functions, operators, and combinations thereof. For more information, see Persistent Computed Columns.
- Temporary tables, created with the TEMPORARY option, will be deleted when the client session terminates. For ODBC/JDBC, this is when the connection closes. For interactive client sessions, it is when the user terminates the client program.
- Global temporary tables, created with the GLOBAL TEMPORARY option, exist beyond the duration of a client session. If failover occurs, the global temporary tables lose data and enter an errored state; they need to be dropped and recreated. This command can be run only on the master aggregator. See Global Temporary Tables for details.
- The SERIES TIMESTAMP clause can be used to designate a column as the default time-ordering column for time series functions such as FIRST(), LAST(), and TIME_BUCKET(). Refer to SERIES TIMESTAMP for more information.
- Keyless sharding distributes data across partitions uniformly at random, with the limitation that it does not allow single-partition queries or local joins, since rows are not assigned to specific partitions. Keyless sharding is the default for tables that do not have a primary key or an explicit shard key. You can explicitly declare a table as keyless sharded by specifying a shard key with an empty list of columns in the SHARD KEY() constraint in the table definition.
- The METADATA_ONLY option on the SHARD KEY syntax prevents an index from being created on the shard key. This decreases overall memory usage but can cause queries to run slower. It can only be specified when the table is created.
- The optional COLUMN GROUP clause creates a materialized copy of each row as a separate index. It is supported on columnstore tables only. This structure can be used to accelerate full-row retrievals and updates for wide columnstore tables.

  Column group indexes use less RAM than rowstore tables, which can reduce operation costs. Using column group indexes on columnstores can allow you to get both fast lookups and fast analytics on the same table. The column group index improves the performance of lookups, and the standard columnar representation is available to give fast analytics. Using a column group index on a columnstore table is easier to manage than having to move data between rowstore and columnstore tables.

  To find the size of a column group index, refer to the How the Columnstore Works page.

  The column_group_name argument is optional. If a column group name is not specified when creating a table, one is chosen by the engine. Using COLUMN GROUP [column_group_name] (*) creates a column group index on all columns in the table. Column group indexes created on a subset of table columns are not supported.

  The following is a syntax example of a columnstore table that creates a column group on all columns:

  CREATE TABLE col_group_1 (id BIGINT, col1 VARCHAR(10), col2 VARCHAR(10), ..., coln INT, COLUMN GROUP col_gp_inx (*));

- The WITH DEEP COPY argument of the CREATE TABLE new_tbl_name LIKE original_tbl_name statement copies an existing table (original_tbl_name) and creates a new table that has the same definition as the original table, including all of the data and metadata (such as indexes) in the original table. Users must have SELECT permissions to be able to execute SQL statements against the new table. Computed columns will be recomputed during the WITH DEEP COPY process.
- The WITH SHALLOW COPY argument of the CREATE TABLE new_tbl_name LIKE original_tbl_name statement copies an existing table (original_tbl_name) and creates a new table that has the same definition as the original table. The data is not physically copied to the new table; instead, the new table references the data of the original table. As a result, any SELECT query made against either table produces the same result until one of them is updated. Users must have SELECT permissions to be able to execute SQL statements against the new table.
- Refer to the Permissions Matrix for the required permissions.
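The remarks on keyless sharding and METADATA_ONLY can be sketched as follows. The table names (keyless_demo, metadata_only_demo) and columns are hypothetical; the clause placement follows the SHARD KEY form given in the Syntax section and is a sketch rather than a recommended schema.

```sql
-- Keyless sharding declared explicitly with an empty shard key:
-- rows are spread across partitions at random.
CREATE TABLE keyless_demo (
  id BIGINT,
  payload VARCHAR(100),
  SHARD KEY ()
);

-- Sharded on id, but with no index built on the shard key.
-- METADATA_ONLY can only be given here, at table creation time.
CREATE TABLE metadata_only_demo (
  id BIGINT,
  payload VARCHAR(100),
  SHARD KEY (id) METADATA_ONLY
);
```

The second form keeps rows with the same id on the same partition while trading lookup speed on id for lower memory usage, as described in the remark above.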
MySQL Compatibility
SingleStore Helios’s syntax differs from MySQL mainly in the data types and storage it supports, and some specific index hints.
- KEY_BLOCK_SIZE [=] <value>: value is currently ignored.
DEFAULT Behavior
If DEFAULT <default_value> is specified in <column_definition>, and no value is inserted in the column, then <default_value> will be placed in the column during an INSERT operation.
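A minimal sketch of this behavior, using a hypothetical table named default_demo:

```sql
CREATE TABLE default_demo (
  id INT,
  status VARCHAR(20) DEFAULT 'new'
);

-- status is omitted, so the DEFAULT value is stored for this row.
INSERT INTO default_demo (id) VALUES (1);

-- status is supplied, so the DEFAULT value is not used for this row.
INSERT INTO default_demo (id, status) VALUES (2, 'shipped');

SELECT id, status FROM default_demo ORDER BY id;
```

Under the rule described above, the row with id 1 should carry the default status and the row with id 2 the explicitly inserted one.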
ON UPDATE Behavior
If ON UPDATE <update_value> is specified in <column_definition> and any other column is updated but the specified column is not explicitly updated, then <update_value> is placed in the column during an UPDATE operation. If the column is of the type TIMESTAMP, TIMESTAMP(6), DATETIME, or DATETIME(6), then you can set <update_value> to one of the following values: CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP(6), NOW(), or NOW(6).
ON UPDATE can be used with these TIMESTAMP/DATETIME[(6)] types only, and you can only use one of the time functions as the argument. For more information, refer to Data Types.
Note that if an ON UPDATE or DEFAULT clause is defined as DATETIME or DATETIME(6) and its precision does not match the precision of the column data type, SingleStore issues a warning during table creation. The operation still proceeds and SingleStore creates the table by adjusting the data type of the ON UPDATE or DEFAULT clause to match the column data type.
For example, consider the following CREATE TABLE statement:
CREATE TABLE example (
  col1 DATETIME(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(),
  col2 DATETIME(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
  col3 DATETIME DEFAULT CURRENT_TIMESTAMP() ON UPDATE CURRENT_TIMESTAMP(),
  col4 DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP(6)
);
Query OK, 0 rows affected, 2 warnings
Because of the data type mismatch between the DEFAULT and ON UPDATE clauses and the respective col1 and col4 column data types, this command returns warnings. Run the SHOW WARNINGS command to view the warnings (output is formatted for readability):
SHOW WARNINGS\G
*** 1. row ***
Level: Warning
Code: 1706
Message: Feature 'DATETIME type with conflicting scale' is not supported
by SingleStore. Execution will continue, but the DEFAULT and/or ON UPDATE
timestamp scale will match the declared value of column 'col1'. To avoid
this warning, change the scale of your DEFAULT and/or ON UPDATE expression
to match the declared column type.
*** 2. row ***
Level: Warning
Code: 1706
Message: Feature 'DATETIME type with conflicting scale' is not supported
by SingleStore. Execution will continue, but the DEFAULT and/or ON UPDATE
timestamp scale will match the declared value of column 'col4'. To avoid
this warning, change the scale of your DEFAULT and/or ON UPDATE expression
to match the declared column type.
To verify that the table was created with the proper data type, run the SHOW CREATE TABLE command (output is formatted for readability):
SHOW CREATE TABLE example\G
Table: example
Create Table: CREATE TABLE `example` (
`col1` datetime(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
`col2` datetime(6) DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
`col3` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`col4` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
SORT KEY `__UNORDERED` ()
, SHARD KEY ()
) AUTOSTATS_CARDINALITY_MODE=INCREMENTAL AUTOSTATS_HISTOGRAM_MODE=CREATE AUTOSTATS_SAMPLING=ON SQL_MODE='STRICT_ALL_TABLES,NO_AUTO_CREATE_USER' CHARACTER SET=`utf8mb4` COLLATE=`utf8mb4_bin`
The type of the DEFAULT and ON UPDATE clauses was updated to match the respective data type of the col1 and col4 columns.
SERIES TIMESTAMP
The SERIES TIMESTAMP clause designates a table column as the default time-ordering column for implicit use by time series functions. This designation eliminates the need to repeatedly specify the time column in function calls, reduces query verbosity, and helps prevent errors in complex time series analyses. This clause does not change the column's data type. Instead, it defines the column's behavior in time series functions such as FIRST(), LAST(), and TIME_BUCKET().
You can designate only one column per table as SERIES TIMESTAMP. Attempting to designate a second column returns an error. The column must use one of the following data types:
- DATE
- TIME
- DATETIME
- DATETIME(6)
- TIMESTAMP
- TIMESTAMP(6)
SingleStore recommends using DATETIME or DATETIME(6) instead of TIMESTAMP types because the TIMESTAMP and TIMESTAMP(6) types only extend through 2038 and thus may require future application maintenance that may be difficult for production systems. Refer to Timestamp Behavior for more information.
For optimal time series query performance, designate the SERIES TIMESTAMP column as the table's SORT KEY unless careful analysis shows that a different sort key is better.
You can add SERIES TIMESTAMP to an existing table by using ALTER TABLE. To verify the SERIES TIMESTAMP setting for a table, use SHOW CREATE TABLE.
For example, create a table with a SERIES TIMESTAMP column:
CREATE TABLE sensor_readings (
  sensor_id INT,
  reading_value DOUBLE,
  recorded_at DATETIME(6) NOT NULL SERIES TIMESTAMP,
  SORT KEY(recorded_at),
  SHARD KEY(sensor_id)
);
Add SERIES TIMESTAMP to an existing table:
ALTER TABLE events ADD event_time DATETIME(6) SERIES TIMESTAMP;
Without SERIES TIMESTAMP, you must explicitly pass the time column to each function:
SELECT sensor_id,
       TIME_BUCKET('5m', recorded_at) AS bucket,
       FIRST(reading_value, recorded_at),
       LAST(reading_value, recorded_at)
FROM sensor_readings
GROUP BY sensor_id, bucket
ORDER BY bucket;
With SERIES TIMESTAMP designated on recorded_at, the same query is shorter and easier to write:
SELECT sensor_id,
       TIME_BUCKET('5m') AS bucket,
       FIRST(reading_value),
       LAST(reading_value)
FROM sensor_readings
GROUP BY sensor_id, bucket
ORDER BY bucket;
Refer to Analyzing Time Series Data for more information on working with time series data.
Storage of CHAR(<length>) as VARCHAR(<length>)
SingleStore stores a column defined as type CHAR of length len as a VARCHAR of length len if len is greater than or equal to the value of the engine variable varchar_column_string_optimization_length. If the value of the variable is 0, the column is not stored as a VARCHAR.
Storing a CHAR of length len as a VARCHAR of length len will allow the column to realize the performance benefit of a VARCHAR, in many cases. For example, CHARs use 3 bytes of memory, and VARCHARs use 1 byte. VARCHAR can have better performance and use less memory during query execution because SingleStore does not need to allocate and process full-length strings with 3 bytes per char, as it does when using CHAR.
Suppose the value of varchar_column_string_optimization_length is 3 and you run:
CREATE TABLE ct_char (a CHAR(4));
The column a is stored as a VARCHAR(4).
CREATE TEMPORARY
CREATE TEMPORARY or CREATE ROWSTORE TEMPORARY (if default_table_type is set to columnstore, you must use the latter syntax) creates a table that will be deleted when the client session terminates.
CREATE TEMPORARY TABLE IF NOT EXISTS ct_temp_1 (id INT AUTO_INCREMENT PRIMARY KEY, a INT, b INT, SHARD KEY(id));
CREATE ROWSTORE TEMPORARY TABLE IF NOT EXISTS ct_temp_1 (id INT AUTO_INCREMENT PRIMARY KEY, a INT, b INT, SHARD KEY(id));
CREATE ROWSTORE GLOBAL TEMPORARY
CREATE ROWSTORE GLOBAL TEMPORARY creates a table that exists beyond the duration of a client session. If a failover occurs, a global temporary table loses data and enters an errored state; the global temporary table needs to be dropped and recreated.
CREATE ROWSTORE GLOBAL TEMPORARY TABLE IF NOT EXISTS ct_temp_2 (id INT AUTO_INCREMENT PRIMARY KEY, a INT, b INT, SHARD KEY(id));
CREATE TABLE with an AUTO_INCREMENT Column
Refer to CREATE TABLE with an AUTO_INCREMENT Column for more information.
CREATE TABLE AS SELECT
CREATE TABLE AS SELECT (also referred to as CREATE TABLE ... SELECT) creates a table from the results of a SELECT query.
Here is the basic syntax. You can create the new table and set shard keys, sort keys, and/or other indexes:
CREATE [ROWSTORE] [REFERENCE | TEMPORARY | GLOBAL TEMPORARY] TABLE [IF NOT EXISTS] <table_name_2>
    (column_name(s), [SHARD KEY(column_name)] | [SORT KEY(column_name)] | [KEY(column_name)])
    AS SELECT [*] | [column_name(s)] FROM table_name_1;
Here is an example of a CREATE TABLE AS SELECT command with a shard key, sort key and an index:
CREATE TABLE ctas_table (a BIGINT, b BIGINT, SHARD KEY(a), SORT KEY(b), KEY(a)) AS SELECT * FROM orig_table;
The table will include a column for each column of the SELECT query. You can define indexes, additional columns, and other parts of the table definition in the create_definition. Persisted computed columns can also be specified this way. Some examples:
CREATE TABLE table_1 (PRIMARY KEY(a, b)) AS SELECT * FROM table_2;
CREATE TABLE table_1 (SORT KEY(a, b)) AS SELECT * FROM table_2;
CREATE TABLE table_1 (a int, b int) AS SELECT c, d FROM table_2;
CREATE TABLE table_1 (b AS a+1 PERSISTED int) AS SELECT a FROM table_2;
In the case that the original table (table_2 in the above examples) has an AUTO_INCREMENT column, it will be created as a non-auto-increment column in the new table (table_1).
CREATE TABLE AS SELECT to Extract Data from One Column in an Existing Table
Extract time column from an event table to build a times table.
CREATE TABLE events (type VARCHAR(256), time TIMESTAMP);
INSERT INTO events VALUES ('WRITE', NOW());
CREATE TABLE times (id INT AUTO_INCREMENT KEY, time TIMESTAMP) AS SELECT time FROM events;
SELECT * FROM times;
+----+---------------------+
| id | time                |
+----+---------------------+
|  1 | 2023-06-21 15:57:35 |
+----+---------------------+
CREATE TABLE AS SELECT to Extract Distinct Values from an Existing Table
SELECT * FROM courses ORDER BY course_code, section_number;
+-------------+----------------+-----------------+
| course_code | section_number | number_students |
+-------------+----------------+-----------------+
| CS-101      |              1 |              20 |
| CS-101      |              2 |              16 |
| CS-101      |              3 |              22 |
| CS-101      |              4 |              25 |
| CS-101      |              5 |              22 |
| CS-150      |              1 |              10 |
| CS-150      |              2 |              16 |
| CS-150      |              3 |              11 |
| CS-150      |              4 |              17 |
| CS-150      |              5 |               9 |
| CS-201      |              1 |              14 |
| CS-201      |              2 |              17 |
| CS-301      |              1 |               7 |
| CS-301      |              2 |              10 |
+-------------+----------------+-----------------+
CREATE TABLE IF NOT EXISTS distinct_courses (PRIMARY KEY(course_code)) AS SELECT DISTINCT(course_code) FROM courses;
SELECT * FROM distinct_courses ORDER BY course_code;
+-------------+
| course_code |
+-------------+
| CS-101      |
| CS-150      |
| CS-201      |
| CS-301      |
+-------------+
CREATE TABLE USING HASH
The USING HASH clause creates a hash index in a table.
If a rowstore or columnstore table is being created, the following applies:
The USING HASH clause creates a hash index in a table.
If a rowstore or columnstore table is being created, the following applies:
- You can create single-column or multi-column hash indexes.
- When you create a unique single-column hash index, the shard key can contain only one column, and that column must be the same column that you have created the index on. When you create a unique multi-column hash index, the shard key must be a subset of the columns that you have created the index on.
- You can create multiple single-column hash indexes on a reference table.
If a columnstore table is being created, the following applies:
- You can create at most one unique hash index. You can create multiple multi-column hash indexes.
- You cannot create a unique hash index on a FLOAT, REAL, or DOUBLE column.
If a rowstore table is being created, the following applies:
- Non-unique hash indexes on rowstore tables are not supported.
- When the USING HASH clause is used to define a non-unique index, a skiplist index is created.
- After a table has been created, the SHOW WARNINGS command will display a warning that a skiplist index was created instead of a hash index.
- The SHOW INDEXES command can be used to verify what types of indexes have been created.
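The rowstore behavior described in the last list can be checked with a short sketch. The table name rs_hash_demo and its columns are hypothetical; the exact warning text and index listing are not shown here because they depend on the server.

```sql
-- A rowstore table that requests a non-unique hash index on column a.
-- Per the rules above, a skiplist index is expected to be created instead.
CREATE ROWSTORE TABLE rs_hash_demo (
  id INT,
  a INT,
  PRIMARY KEY (id),
  KEY (a) USING HASH
);

-- Should report that a skiplist index was created instead of a hash index.
SHOW WARNINGS;

-- Lists the indexes that were actually created, with their types.
SHOW INDEXES FROM rs_hash_demo;
```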
CREATE TABLE with Multiple Hash Indexes
The following example creates a columnstore table with three hash indexes. One of these indexes has a multi-column key.
CREATE TABLE articles_3 (
  id INT UNSIGNED,
  month int UNSIGNED,
  year int UNSIGNED,
  title VARCHAR(200),
  body TEXT,
  SHARD KEY(title),
  SORT KEY(id),
  KEY(id) USING HASH,
  UNIQUE KEY(title) USING HASH,
  KEY(month, year) USING HASH
);
The query SELECT * FROM articles_3 WHERE title = 'Interesting title here'; uses the hash index on title because the query contains an equality predicate on title. The query runs faster with the hash index than without.
The query SELECT * FROM articles_3 WHERE year > 2010 AND month > 5; does not use the hash index on month and year since the query does not use an equality predicate.
See the ColumnstoreFilter in the Query Plan Operations topic for an example EXPLAIN plan for a columnstore query that uses a hash index.
See Highly Selective Joins for an example of a columnstore query with a join that uses a hash index.
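One way to see which of these queries can use a hash index is to compare their plans with EXPLAIN. The sketch below runs against the articles_3 table created above; the third query is an added illustration, not part of the original example, and plan output is omitted because it varies by version and data.

```sql
-- Equality predicate on title: eligible for the unique hash index on title.
EXPLAIN SELECT * FROM articles_3 WHERE title = 'Interesting title here';

-- Range predicates only: the hash index on (month, year) is not used.
EXPLAIN SELECT * FROM articles_3 WHERE year > 2010 AND month > 5;

-- Equality predicates on both columns of the multi-column index
-- (illustrative addition): eligible for the hash index on (month, year).
EXPLAIN SELECT * FROM articles_3 WHERE month = 5 AND year = 2010;
```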
CREATE TABLE with One Hash Index Containing Multiple Columns
KEY(<column 1 name>,<column 2 name>, ... <column n name>) is equivalent to KEY(<column 1 name>,<column 2 name>,... <column n name>) USING HASH. For example,
CREATE TABLE ct_hash_1 (a INT, b INT, c INT, KEY(a,b));
is equivalent to:
CREATE TABLE ct_hash_1 (a INT, b INT, c INT, SORT KEY(), KEY(a,b) USING HASH);
A query against ct_hash_1 with an equality filter on a, an equality filter on b, or equality filters on both a and b could benefit from KEY(a,b) USING HASH. A query that uses both equality filters would be the most efficient.
Depending on the cardinality, the performance of the query may be worse than the performance of the same query where ct_hash_1 is a rowstore table and KEY(a,b) is defined on that table.
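The three query shapes mentioned above can be sketched against ct_hash_1 as follows. The literal values are arbitrary placeholders.

```sql
-- Equality filters on both indexed columns: the most efficient case.
SELECT * FROM ct_hash_1 WHERE a = 1 AND b = 2;

-- Equality filter on a only: could still benefit from KEY(a,b) USING HASH.
SELECT * FROM ct_hash_1 WHERE a = 1;

-- Equality filter on b only: could still benefit from KEY(a,b) USING HASH.
SELECT * FROM ct_hash_1 WHERE b = 2;
```

Prefixing any of these statements with EXPLAIN is a simple way to confirm whether the index is chosen for a given data set.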
CREATE TABLE with FULLTEXT Columns
SingleStore supports full-text search across text columns in a columnstore table using the FULLTEXT index type. A full-text index can only be added in a CREATE TABLE or ALTER TABLE ADD FULLTEXT statement and only on the text types CHAR, VARCHAR, TEXT, and LONGTEXT (refer to Data Types).
If you query a column c that is part of a multi-column FULLTEXT index, where the query uses a FULLTEXT MATCH on c, the index on c will be applied.
This differs from a multi-column non-FULLTEXT index, where behavior is as follows: if you query column c that is part of index i, where the query uses an equality filter on c, the index on c will only be applied if c is the leftmost column in i.
Any column that is part of a FULLTEXT index can be queried, even if it is not the leftmost. Searches across FULLTEXT columns are done using the SELECT ... MATCH AGAINST syntax. For more information, see MATCH.
CREATE TABLE with FULLTEXT Index on Two Columns
This example creates a FULLTEXT index for both the title column and the body column. Either column could be queried separately using MATCH <column_name>, and the index on the column would be applied. The USING VERSION 1 syntax is optional.
CREATE TABLE articles_1 (
  id INT UNSIGNED,
  year int UNSIGNED,
  title VARCHAR(200),
  body TEXT,
  SORT KEY(id),
  FULLTEXT USING VERSION 1 (title, body)
);
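As a sketch of querying each indexed column separately with the legacy MATCH <column_name> form, the statements below search articles_1. The search term is an arbitrary placeholder, and the table is assumed to already contain rows that have been flushed to disk (for example with OPTIMIZE TABLE articles_1 FLUSH, as in the version 2 example on this page).

```sql
-- Search only the title column of the two-column FULLTEXT index.
SELECT id, title FROM articles_1 WHERE MATCH (title) AGAINST ('database');

-- Search only the body column of the same index.
SELECT id, title FROM articles_1 WHERE MATCH (body) AGAINST ('database');
```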
CREATE TABLE with Version 2 FULLTEXT Index
This example creates a FULLTEXT index for both the title column and the body column. Either column can be queried separately using MATCH (TABLE <table_name>) AGAINST (<expression>), and the index on the column will be applied.
Refer to Working with Full-Text Search for more information about version 2 full-text search.
CREATE TABLE articles (
  id INT UNSIGNED,
  year int UNSIGNED,
  title VARCHAR(200),
  body TEXT,
  SORT KEY(id),
  FULLTEXT USING VERSION 2 art_ft_index (title, body)
);
INSERT INTO articles (id, year, title, body) VALUES
  (1, 2021, 'Introduction to SQL', 'SQL is a standard language for accessing and manipulating databases.'),
  (2, 2022, 'Advanced SQL Techniques', 'Explore advanced techniques and functions in SQL for better data manipulation.'),
  (3, 2020, 'Database Optimization', 'Learn about various optimization techniques to improve database performance.'),
  (4, 2023, 'SQL in Web Development', 'Discover how SQL is used in web development to interact with databases.'),
  (5, 2019, 'Data Security in SQL', 'An overview of best practices for securing data in SQL databases.'),
  (6, 2021, 'SQL and Data Analysis', 'Using SQL for effective data analysis and reporting.'),
  (7, 2022, 'Introduction to Database Design', 'Fundamentals of designing a robust and scalable database.'),
  (8, 2020, 'SQL Performance Tuning', 'Tips and techniques for tuning SQL queries for better performance.'),
  (9, 2023, 'Using SQL with Python', 'Integrating SQL with Python for data science and automation tasks.'),
  (10, 2019, 'NoSQL vs SQL', 'A comparison of NoSQL and SQL databases and their use cases.');
OPTIMIZE TABLE articles FLUSH;
SELECT * FROM articles WHERE MATCH (TABLE articles) AGAINST ('body:database');
+----+------+---------------------------------+-----------------------------------------------------------------------------+
| id | year | title                           | body                                                                        |
+----+------+---------------------------------+-----------------------------------------------------------------------------+
|  7 | 2022 | Introduction to Database Design | Fundamentals of designing a robust and scalable database.                   |
|  3 | 2020 | Database Optimization           | Learn about various optimization techniques to improve database performance.|
+----+------+---------------------------------+-----------------------------------------------------------------------------+
Refer to the MATCH ... AGAINST page for more legacy and version 2 full-text examples.
Errors
These are the possible errors you may encounter when using FULLTEXT.
| Error | Error String |
|---|---|
| Invalid Type specified for column | Invalid type specified for |
| Specifying | |
| Specifying the same column multiple times | Column may only be specified once in a |
| Specifying a column that is not defined on the table | Column not defined |
| Specifying | Only column store tables may have a |
CREATE TABLE with VECTOR Index
SingleStore supports indexed vector search across VECTOR columns in columnstore tables using the VECTOR index type. Vector indexes use Approximate Nearest Neighbor (ANN) search which is appropriate for very large data sets and/or use cases with high concurrency requirements for a nearest-neighbor search. Vector indexes can be added in CREATE TABLE or ALTER TABLE statements and are supported for indexes on a single column of type VECTOR(<N>[, F32]) where <N> is the number of dimensions.
A variety of index types and configuration parameters are available. SingleStore recommends using the IVF_PQFS and HNSW_FLAT index types. Refer to Vector Indexing, Tuning Vector Indexes and Queries, and Configuring Full Text and Vector Indexes for details.
Example
This example creates an IVF_PQFS VECTOR index for column v in a table named vect.
CREATE TABLE vect (k int, v VECTOR(2) NOT NULL);
INSERT INTO vect VALUES …
ALTER TABLE vect ADD VECTOR INDEX (v) INDEX_OPTIONS '{"index_type":"IVF_PQFS", "nlist":1024, "nprobe":20}';
OPTIMIZE TABLE vect FLUSH;
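Because a vector index can also be declared in the CREATE TABLE statement itself, the same index could be defined inline. The following is a minimal sketch, not taken from the original example: the table name vect_inline, the index name ivf_v, and the query vector are hypothetical, and the query assumes the default DOT_PRODUCT metric together with the `<*>` dot product operator. Refer to Vector Indexing to confirm the options and operators available in your version.

```sql
-- Hypothetical table: the index is declared inline instead of via ALTER TABLE.
CREATE TABLE vect_inline (
  k INT,
  v VECTOR(2) NOT NULL,
  VECTOR INDEX ivf_v (v) INDEX_OPTIONS '{"index_type":"IVF_PQFS", "nlist":1024, "nprobe":20}'
);

-- Load rows here, then flush them to disk as in the example above.
OPTIMIZE TABLE vect_inline FLUSH;

-- Assumed query shape: rank rows by dot product against a sample vector.
-- The vector literal is illustrative only.
SET @query_vec = ('[0.5, 0.5]') :> VECTOR(2);
SELECT k, v <*> @query_vec AS score
FROM vect_inline
ORDER BY score DESC
LIMIT 5;
```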
CREATE TABLE WITH DEEP COPY
The WITH DEEP COPY feature creates a new table that will have the same table definition as the original table. CREATE TABLE WITH DEEP COPY makes a full copy of the data and indexes in the original table. Computed columns will be recomputed during the WITH DEEP COPY process. Any operation (INSERT, UPDATE, DELETE, ALTER TABLE, or DROP TABLE) on the original table will not affect the copied table and vice versa.
The WITH DEEP COPY feature eliminates the need to run a CREATE TABLE LIKE followed by an INSERT SELECT statement. In one statement, a table can be created that contains all of the columns, data, and other metadata (such as indexes) of the table from which it is copied.
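As a rough illustration of the difference, the sketch below contrasts the two-statement approach with the single-statement form. It assumes a columnstore table named orig_table, like the one defined in the examples later in this section. The names copy_two_step and copy_one_step are hypothetical.

```sql
-- Two-statement approach: copy the definition, then copy the rows.
CREATE TABLE copy_two_step LIKE orig_table;
INSERT INTO copy_two_step SELECT * FROM orig_table;

-- Single-statement approach: definition, data, and indexes in one step.
CREATE TABLE copy_one_step LIKE orig_table WITH DEEP COPY;

-- Both copies should report the same row count as the original.
SELECT COUNT(*) FROM orig_table;
SELECT COUNT(*) FROM copy_two_step;
SELECT COUNT(*) FROM copy_one_step;
```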
There are some cases where WITH DEEP COPY is not supported. Error messages will be generated if these cases are attempted.
Creating a rowstore deep copy of an existing table:
CREATE TABLE orig_table (a BIGINT, b BIGINT, SHARD KEY(a), SORT KEY(b));
CREATE ROWSTORE TABLE ctdc_table LIKE orig_table WITH DEEP COPY;
ERROR 1706 ER_MEMSQL_FEATURE_LOCKDOWN: Feature 'CREATE Explicit ROWSTORE TABLE LIKE' is not supported by SingleStore.
Creating a reference deep copy of an existing table:
CREATE TABLE orig_table (a BIGINT, b BIGINT, SHARD KEY(a), SORT KEY(b));
CREATE REFERENCE TABLE ctdc_table LIKE orig_table WITH DEEP COPY;
ERROR 1706 ER_MEMSQL_FEATURE_LOCKDOWN: Feature 'CREATE REFERENCE TABLE LIKE' is not supported by SingleStore.
CREATE TABLE WITH DEEP COPY Same Database
CREATE TABLE orig_table (a BIGINT, b BIGINT, SHARD KEY(a), SORT KEY(b));
INSERT INTO orig_table (a, b) VALUES (9,3),(5,2),(10,4),(12,7);
SELECT * FROM orig_table;
+----+----+
| a | b |
+----+----+
| 10 | 4 |
| 12 | 7 |
| 5 | 2 |
| 9 | 3 |
+----+----+
CREATE TABLE ctdc_table LIKE orig_table WITH DEEP COPY;
SELECT * FROM ctdc_table;
+----+----+
| a | b |
+----+----+
| 10 | 4 |
| 12 | 7 |
| 5 | 2 |
| 9 | 3 |
+----+----+
CREATE TABLE WITH DEEP COPY Across Databases
CREATE TABLE test1.orig_table (a BIGINT, b BIGINT, SHARD KEY(a), SORT KEY(b));
INSERT INTO test1.orig_table (a, b) VALUES (9,3),(5,2),(10,4),(12,7);
SELECT * FROM test1.orig_table;
+----+----+
| a | b |
+----+----+
| 10 | 4 |
| 12 | 7 |
| 5 | 2 |
| 9 | 3 |
+----+----+
CREATE TABLE test2.ctdc_table LIKE test1.orig_table WITH DEEP COPY;
SELECT * FROM test2.ctdc_table;
+----+----+
| a | b |
+----+----+
| 10 | 4 |
| 12 | 7 |
| 5 | 2 |
| 9 | 3 |
+----+----+
CREATE TABLE WITH SHALLOW COPY
The WITH SHALLOW COPY feature creates a new table that will have the same structure as the original table. CREATE TABLE WITH SHALLOW COPY is a metadata-only operation and will not duplicate the actual data. Any operation (INSERT, UPDATE, DELETE, ALTER TABLE, or DROP TABLE) on the original table will not affect the shallow copy table and vice versa.
The WITH SHALLOW COPY feature is useful when you want to test new data models or perform operations on existing tables without risking live data. After testing or modifying the copied table, it can be easily deleted or promoted to the main table.
Creating a shallow table copy is orders of magnitude faster and uses less disk space than creating a table (CREATE TABLE new_table LIKE original_table) and then inserting all the rows from the original table into the new table (INSERT INTO new_table...SELECT * FROM original_table). For example, it's possible to shallow copy a large table in a second, whereas a full copy would take minutes.
When a shallow copy of a table is created, SingleStore copies some of the in-memory metadata of the table from which the copy is created (the "source table"), that is, the columnar blobs and index blobs. However, the in-memory rowstore portion of the source table is not copied. Therefore, the memory utilization of the copy at the time of creation is bounded by the memory utilization of the source table. After that, any changes made to the copy take up additional memory, as with any other columnstore table.
To promote a shallow copy table to a main table, drop the original table and then rename the shallow copy table to the original table name. This approach is faster for resuming query operations on the table. For optimal performance, ensure that autostats is enabled and that the background merger is turned on by executing ALTER TABLE col_table BACKGROUND_MERGER=ON; before using the table.
When creating a shallow copy of an existing table:
- The plancache of the source table is not reused.
- Only the on-disk portion of the table is copied. In-memory rowstore segment rows of the table are not copied. If you want the in-memory rowstore rows, run OPTIMIZE TABLE tbl_name FLUSH on the source table before the shallow copy.
- Autostats on the new table are disabled. As a result, queries on the new table may run with a worse execution plan. See Disabling and Enabling Automatic Statistics for how to enable autostats for the new table.
- The background merger is disabled for the new table. This can be changed by running an ALTER TABLE statement on the new table to enable it after the table is created:
  CREATE TABLE ctsc_table LIKE orig_table WITH SHALLOW COPY;
  ALTER TABLE ctsc_table BACKGROUND_MERGER=ON;
- An exclusive lock is taken on the original table, which blocks operations on the table while a shallow copy is created.
- To verify the memory usage of a shallow copy table, use the TABLE_STATISTICS and INTERNAL_TABLE_STATISTICS views, just as with any other table.
- After a shallow copy is created, it is treated like any other columnstore table. This means that information about shallow copies is not specifically tracked, and currently it is not possible to retrieve a list of all shallow copy tables from information_schema.
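The points above can be combined into one workflow. The following is a sketch only, reusing the orig_table and ctsc_table names from this section. The ENABLE AUTOSTATS statement and the RENAME TO form are assumptions about the ALTER TABLE syntax, so confirm them against Disabling and Enabling Automatic Statistics and the ALTER TABLE page before relying on them.

```sql
-- 1. Flush in-memory rowstore segment rows so the shallow copy includes them.
OPTIMIZE TABLE orig_table FLUSH;

-- 2. Create the metadata-only copy (orig_table is locked while this runs).
CREATE TABLE ctsc_table LIKE orig_table WITH SHALLOW COPY;

-- 3. Re-enable the features that are disabled on a new shallow copy.
ALTER TABLE ctsc_table BACKGROUND_MERGER=ON;
ALTER TABLE ctsc_table ENABLE AUTOSTATS;  -- assumed syntax, see the autostats page

-- 4. Compare memory usage of the source table and the copy.
SELECT table_name, SUM(memory_use) AS memory_usage
FROM information_schema.table_statistics
WHERE table_name IN ('orig_table', 'ctsc_table')
GROUP BY table_name;

-- 5. Optional: promote the copy by dropping the original and renaming the copy.
DROP TABLE orig_table;
ALTER TABLE ctsc_table RENAME TO orig_table;  -- assumed RENAME form
```

Step 5 is destructive, so run it only after the modified copy has been validated.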
CREATE TABLE WITH SHALLOW COPY Performance Example
This example creates a table ctsc_r with 4,194,304 rows, then makes a shallow copy of it and a full copy of it using an INSERT INTO...SELECT statement. The shallow copy is over 200 times faster on a small test system.
First, create a table of random dummy id values.
CREATE TABLE ctsc_r(id BINARY(16));
INSERT ctsc_r VALUES (sys_guid());
DELIMITER //
DO
DECLARE c BIGINT;
BEGIN
  SELECT COUNT(*) INTO c FROM ctsc_r;
  WHILE (c < 4*1024*1024) LOOP
    INSERT INTO ctsc_r SELECT sys_guid() FROM ctsc_r;
    SELECT COUNT(*) INTO c FROM ctsc_r;
  END LOOP;
END //
DELIMITER ;
SELECT COUNT(*) FROM ctsc_r;
+----------+
| COUNT(*) |
+----------+
| 4194304 |
+----------+
Make a shallow copy of all rows of ctsc_r to ctsc_r2:
OPTIMIZE TABLE ctsc_r FLUSH;
CREATE TABLE ctsc_r2 LIKE ctsc_r WITH SHALLOW COPY; /* shallow copy time = 0.02 sec */
SELECT COUNT(*) FROM ctsc_r2;
+----------+
| COUNT(*) |
+----------+
| 4194304 |
+----------+
All rows were copied to ctsc_r2, and the statement took 0.02 seconds to execute.
Next, make a full copy of all rows of ctsc_r to ctsc_r3:
CREATE TABLE ctsc_r3 LIKE ctsc_r;
INSERT INTO ctsc_r3 SELECT id FROM ctsc_r; /* full copy time = 4.86 sec */
SELECT COUNT(*) FROM ctsc_r3;
+----------+
| COUNT(*) |
+----------+
| 4194304 |
+----------+
All rows were copied to ctsc_r3; however, it took almost 5 seconds to execute.
Restrictions
The WITH SHALLOW COPY option is not supported for the following operations: copying across databases, copying rowstore tables, copying temporary tables, or creating a temporary table as a shallow copy. Error messages will be generated if these cases are attempted.
Across databases:
CREATE TABLE test1.orig_table (a BIGINT, b BIGINT, SHARD KEY(a), SORT KEY(b));
CREATE TABLE test2.ctsc_table LIKE test1.orig_table WITH SHALLOW COPY;
ERROR 1706 (HY000): Feature 'shallow copy of a table across databases' is not supported by SingleStore.
Rowstore tables:
CREATE ROWSTORE TABLE orig_table (a BIGINT, b BIGINT, SHARD KEY(a));
CREATE TABLE ctsc_table LIKE orig_table WITH SHALLOW COPY;
ERROR 1706 (HY000): Feature 'shallow copy of non-columnstore tables' is not supported by SingleStore.
Temporary or rowstore global temporary tables:
CREATE ROWSTORE GLOBAL TEMPORARY TABLE orig_table (a BIGINT, b BIGINT, SHARD KEY(a));
CREATE TABLE ctsc_table LIKE orig_table WITH SHALLOW COPY;
ERROR 1706 (HY000): Feature 'shallow copy of temporary and global temporary tables' is not supported by SingleStore.
Creating a temporary table as a shallow copy:
CREATE TABLE orig_table(a INT);
CREATE TEMPORARY TABLE ctsc_table LIKE orig_table WITH SHALLOW COPY;
ERROR 1706 (HY000): Feature 'temporary or global temporary table that is created as a shallow copy' is not supported by SingleStore.
CREATE TABLE with COMPRESSION = SPARSE
SingleStore Helios supports sparse data compression for rowstore tables. Nullable structured columns can use sparse data compression. The data types of these columns include numbers, dates, datetimes, timestamps, times, and varchars.
Columns that use sparse data compression only store non-NULL data values. The CREATE TABLE with COMPRESSION = SPARSE Use Case example at the end of this section discusses a sparse data compression use case and includes the query to retrieve the actual memory usage of rowstore tables that use sparse data compression.
Sparse compression has the following limitations:
- The SPARSE clause cannot be used for key columns. However, if a rowstore table uses sparse data compression using the COMPRESSION = SPARSE clause, then the key columns are stored in-row.
- The SPARSE clause cannot be used for columns where the non-NULL size of the column is greater than 15 bytes.
Refer to the Data Types topic for details.
CREATE TABLE with COMPRESSION = SPARSE All Sparse Columns
The following example demonstrates the COMPRESSION = SPARSE clause. This clause indicates that all columns in the table will use sparse data compression.
CREATE ROWSTORE TABLE transaction_1(
  id BIGINT NOT NULL,
  explanation VARCHAR(70),
  shares DECIMAL(18,2),
  share_price DECIMAL(18,2),
  total_amount as shares * share_price PERSISTED DECIMAL(18,2),
  transaction_date DATE,
  dividend_exdate DATE,
  misc_expenses DECIMAL(18,2),
  country_abbreviation CHAR(6),
  correction_date DATE,
  settlement_date DATE
) COMPRESSION = SPARSE;
CREATE TABLE with COMPRESSION = SPARSE Selected Sparse Columns
The following example demonstrates the SPARSE clause. This clause is applied to the columns that will use sparse data compression.
CREATE ROWSTORE TABLE transaction_2(
  id BIGINT NOT NULL,
  explanation VARCHAR(70) SPARSE,
  shares DECIMAL(18,2) SPARSE,
  share_price DECIMAL(18,2),
  total_amount as shares * share_price PERSISTED DECIMAL(18,2),
  transaction_date DATE,
  dividend_exdate DATE SPARSE,
  misc_expenses DECIMAL(18,2) SPARSE,
  country_abbreviation CHAR(6),
  correction_date DATE SPARSE,
  settlement_date DATE SPARSE
);
Listing Whether Columns use Sparse Compression
The following query lists the columns in the transaction_2 table created in the previous example. For each column, the query indicates whether the column uses sparse compression.
SELECT column_name, is_sparse FROM information_schema.columns WHERE table_name = 'transaction_2';
+----------------------+-----------+
| column_name | is_sparse |
+----------------------+-----------+
| id | NO |
| explanation | YES |
| shares | YES |
| share_price | NO |
| total_amount | NO |
| transaction_date | NO |
| dividend_exdate | YES |
| misc_expenses | YES |
| country_abbreviation | NO |
| correction_date | YES |
| settlement_date | YES |
+----------------------+-----------+
CREATE TABLE with COMPRESSION = SPARSE Use Case
Sparse rowstore compression works best on a wide table with more than half NULL values. The distribution of the NULL values in the table does not contribute to the amount of memory used.
For example, consider this wide table, ct_sparse, which has 300 columns:
CREATE ROWSTORE TABLE ct_sparse (
  c1 double,
  c2 double,
  …
  c300 double
) COMPRESSION = SPARSE;
In SingleStore 7.3, ct_sparse was loaded with 1.05 million rows, with two-thirds of the values being NULL. To retrieve the actual memory usage (in GB) of ct_sparse, run the following command:
SELECT table_name, SUM(memory_use) memory_usage FROM information_schema.table_statistics WHERE table_name = 'ct_sparse' GROUP BY table_name;
+-------------+--------------+
| table_name | memory_usage |
+-------------+--------------+
| ct_sparse | 1.23 |
+-------------+--------------+
The following table lists the memory usage of ct_sparse, with and without sparse compression:
| Compression Setting | Memory Use | Savings (Percent) |
|---|---|---|
| NONE | 2.62 GB | NA |
| SPARSE | 1.23 GB | 53% |
For this wide table with two-thirds NULL values, you can store more than twice the data in the same amount of RAM.
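The savings figure and the "more than twice the data" claim follow directly from the two memory measurements. As a quick check, the arithmetic can be reproduced with a query that uses only the numbers from the table above:

```sql
-- Savings: (2.62 GB - 1.23 GB) / 2.62 GB, expressed as a percentage.
-- Capacity ratio: how many times more data fits in the same RAM.
SELECT
  ROUND(100 * (2.62 - 1.23) / 2.62) AS savings_percent,  -- about 53
  ROUND(2.62 / 1.23, 2) AS capacity_ratio;               -- about 2.13
```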
