VOOZH about

URL: https://www.cockroachlabs.com/docs/v26.2/user-defined-functions

โ‡ฑ User-Defined Functions


User-Defined Functions

A user-defined function (UDF) is a named function defined at the database level that can be called in queries and other contexts. CockroachDB supports invoking UDFs in SELECT, FROM, and WHERE clauses of DML statements.

Both stored procedures and user-defined functions are types of routines. However, they differ in the following ways:

  • Functions return a value, and procedures do not return a value.
  • Procedures must be invoked using a CALL statement. Functions can be invoked in nearly any context, such as SELECT, FROM, and WHERE clauses, DEFAULT expressions, and computed column expressions.
  • Functions have volatility settings, and procedures do not.

Overview

The basic components of a user-defined function are a name, list of arguments, return type, volatility, language, and function body.

  • An argument has a mode and a type.
  • The return type can be a built-in SQL type, user-defined ENUM or composite type, RECORD, TABLE, PL/pgSQL REFCURSOR type, implicit record type, TRIGGER, or VOID.
  • The volatility indicates whether the function has side effects. VOLATILE and NOT LEAKPROOF are the default.
    • Annotate a function with side effects with VOLATILE. This also prevents the cost-based optimizer from pre-evaluating the function.
    • A STABLE or IMMUTABLE function does not mutate data. You cannot create a STABLE or IMMUTABLE function that executes a mutation (INSERT, UPSERT, UPDATE, DELETE) statement.
    • LEAKPROOF indicates that a function has no side effects and that it communicates nothing that depends on its arguments besides the return value (i.e., it cannot throw an error that depends on the value of its arguments). You must precede LEAKPROOF with IMMUTABLE, and only IMMUTABLE can be set to LEAKPROOF. NOT LEAKPROOF is allowed with any other volatility.
    • Non-VOLATILE functions can be optimized through inlining. For more information, see Create an inlined UDF.
  • LANGUAGE specifies the language of the function body. CockroachDB supports the languages SQL and PLpgSQL (PL/pgSQL).
  • The function body:
    • Can reference arguments by name or by their ordinal in the function definition with the syntax $1.
    • Can be enclosed in a single line with single quotes '' or multiple lines with $$.
    • Can reference tables.
    • Can reference only the SELECT statement.

Statement statistics

SQL statements executed within user-defined functions are tracked in the SQL statistics subsystem and will appear in the SQL Activity > Statements page and the Insights page in the DB Console. This allows you to monitor the performance and execution statistics of individual statements within your functions.

When the UDF is invoked as part of a transaction, the statements executed within the function also appear in the Transaction details in the Statement Fingerprints table.

Note:

Statement diagnostics cannot be collected for statements executed inside user-defined functions. You can request statement diagnostics for the top-level invocation of the function, and the resulting trace includes spans for each statement executed. However, there is no way to target statements executed inside the function with a statement diagnostics request. For details, refer to Known limitations.

Examples

Create a UDF

The following is a UDF that returns the sum of two integers:

CREATE FUNCTION add(a INT, b INT) RETURNS INT IMMUTABLE LEAKPROOF LANGUAGE SQL AS 'SELECT a + b';

Where:

  • name: add
  • arguments: a of type INT, b of type INT
  • return type: INT
  • volatility: IMMUTABLE LEAKPROOF
  • language: SQL
  • function body: 'SELECT a + b'

Alternatively, you could define this function as:

CREATE FUNCTION add(a INT, b INT) RETURNS INT IMMUTABLE LEAKPROOF LANGUAGE SQL AS 'SELECT $1 + $2';

Or as:

CREATE FUNCTION add(a INT, b INT) RETURNS INT LANGUAGE SQL AS $$
 SELECT a + b;
$$;

For more examples of UDF creation, see CREATE FUNCTION.

View a UDF definition

To view the definition for the add() function:

SHOW CREATE FUNCTION add;

If you do not specify a schema for the function add when you create it, the default schema is public:

 function_name | create_statement
----------------+---------------------------------------------------
 add | CREATE FUNCTION public.add(IN a INT8, IN b INT8)
 | RETURNS INT8
 | IMMUTABLE
 | LEAKPROOF
 | CALLED ON NULL INPUT
 | LANGUAGE SQL
 | AS $$
 | SELECT a + b;
 | $$
(1 row)

Invoke a UDF

You invoke a UDF like a built-in function.

To invoke the add() function:

SELECT add(3,5) as sum;
 sum
-------
 8
(1 row)

Call a UDF from a view

You can include UDF calls in a view definition. The function is invoked each time the view is queried. A materialized view will invoke the function only when the view is created or refreshed.

Refer to Create a view that references routines.

Create a UDF using PL/pgSQL

The following user-defined function returns the nth integer in the Fibonacci sequence.

It uses the PL/pgSQL LOOP syntax to iterate through a simple calculation, and RAISE EXCEPTION to return an error message if the specified n is negative.

CREATE FUNCTION fib(n int) RETURNS INT AS $$
 DECLARE
 tmp INT;
 a INT := 0;
 b INT := 1;
 i INT := 2;
 BEGIN
 IF n < 0 THEN
 RAISE EXCEPTION 'n must be non-negative';
 END IF;
 IF n = 0 THEN RETURN 0; END IF;
 IF n = 1 THEN RETURN 1; END IF;
 LOOP
 IF i > n THEN EXIT; END IF;
 tmp := a + b;
 a := b;
 b := tmp;
 i := i + 1;
 END LOOP;
 RETURN b;
 END
 $$ LANGUAGE PLpgSQL;
SELECT fib(8);
 fib
-------
 21

Create an inlined UDF

When possible, the cost-based optimizer will improve a function's performance by inlining the UDF within the query plan. The UDF must have the following attributes:

The following example demonstrates how inlining improves a UDF's performance.

  1. Create tables a and b:

    CREATE TABLE a (
     a INT
    );
    
    CREATE TABLE b (
     b INT PRIMARY KEY
    );
    
  2. Insert a value (10) into 1000 rows in a and 1 row in b:

    INSERT INTO a SELECT 10 FROM generate_series(1, 1000);
    INSERT INTO b VALUES (10);
    
  3. Create a VOLATILE function foo_v() and a STABLE function foo_s():

    CREATE FUNCTION foo_v(x INT) RETURNS INT VOLATILE LANGUAGE SQL AS $$
     SELECT b FROM b WHERE b = x
    $$;
    
    CREATE FUNCTION foo_s(x INT) RETURNS INT STABLE LANGUAGE SQL AS $$
     SELECT b FROM b WHERE b = x
    $$;
    

    Each function returns a specified value from table b.

  4. View the query plan when foo_v() (the VOLATILE function) is used in a selection query to retrieve equal values from table a:

    EXPLAIN ANALYZE SELECT foo_v(a) FROM a WHERE a = 10;
    
     info
    --------------------------------------------------------------------------------------------
     planning time: 2ms
     execution time: 77ms
     distribution: local
     vectorized: true
     rows read from KV: 1,000 (39 KiB, 1 gRPC calls)
     cumulative time spent in KV: 330ยตs
     maximum memory usage: 80 KiB
     network usage: 0 B (0 messages)
     sql cpu time: 75ms
     estimated RUs consumed: 0
    
     โ€ข render
     โ”‚
     โ””โ”€โ”€ โ€ข filter
     โ”‚ nodes: n1
     โ”‚ actual row count: 1,000
     โ”‚ sql cpu time: 75ms
     โ”‚ estimated row count: 1,000
     โ”‚ filter: a = 10
     โ”‚
     โ””โ”€โ”€ โ€ข scan
     nodes: n1
     actual row count: 1,000
     KV time: 330ยตs
     KV contention time: 0ยตs
     KV rows read: 1,000
     KV bytes read: 39 KiB
     KV gRPC calls: 1
     estimated max memory allocated: 60 KiB
     sql cpu time: 87ยตs
     estimated row count: 1,000 (100% of the table; stats collected 19 seconds ago)
     table: a@a_pkey
     spans: FULL SCAN
    (33 rows)
    

    The query takes 77ms to execute because the function is invoked for each row scanned in table a.

  5. View the query plan when using foo_s() (the STABLE function) instead:

    EXPLAIN ANALYZE SELECT foo_s(a) FROM a WHERE a = 10;
    
     info
    ------------------------------------------------------------------------------------------------
     planning time: 5ms
     execution time: 4ms
     distribution: local
     vectorized: true
     rows read from KV: 1,001 (39 KiB, 2 gRPC calls)
     cumulative time spent in KV: 832ยตs
     maximum memory usage: 420 KiB
     network usage: 0 B (0 messages)
     sql cpu time: 3ms
     estimated RUs consumed: 0
    
     โ€ข render
     โ”‚
     โ””โ”€โ”€ โ€ข merge join (left outer)
     โ”‚ nodes: n1
     โ”‚ actual row count: 1,000
     โ”‚ estimated max memory allocated: 340 KiB
     โ”‚ estimated max sql temp disk usage: 0 B
     โ”‚ sql cpu time: 3ms
     โ”‚ estimated row count: 1,000
     โ”‚ equality: (a) = (b)
     โ”‚ right cols are key
     โ”‚
     โ”œโ”€โ”€ โ€ข filter
     โ”‚ โ”‚ nodes: n1
     โ”‚ โ”‚ actual row count: 1,000
     โ”‚ โ”‚ sql cpu time: 5ยตs
     โ”‚ โ”‚ estimated row count: 1,000
     โ”‚ โ”‚ filter: a = 10
     โ”‚ โ”‚
     โ”‚ โ””โ”€โ”€ โ€ข scan
     โ”‚ nodes: n1
     โ”‚ actual row count: 1,000
     โ”‚ KV time: 722ยตs
     โ”‚ KV contention time: 0ยตs
     โ”‚ KV rows read: 1,000
     โ”‚ KV bytes read: 39 KiB
     โ”‚ KV gRPC calls: 1
     โ”‚ estimated max memory allocated: 60 KiB
     โ”‚ sql cpu time: 202ยตs
     โ”‚ estimated row count: 1,000 (100% of the table; stats collected 42 seconds ago)
     โ”‚ table: a@a_pkey
     โ”‚ spans: FULL SCAN
     โ”‚
     โ””โ”€โ”€ โ€ข scan
     nodes: n1
     actual row count: 1
     KV time: 110ยตs
     KV contention time: 0ยตs
     KV rows read: 1
     KV bytes read: 30 B
     KV gRPC calls: 1
     estimated max memory allocated: 20 KiB
     sql cpu time: 11ยตs
     estimated row count: 1 (100% of the table; stats collected 42 seconds ago)
     table: b@b_pkey
     spans: FULL SCAN
    (57 rows)
    

    The query takes only 4ms to execute because the function is inlined and transformed to a join with an equality comparison (a) = (b), which has much less overhead than invoking a function for each row scanned in table a.

Video Demo

For a deep-dive demo on UDFs, watch the following video:

Known limitations

User-defined functions have the following limitations:

  • User-defined functions are not supported in partial index predicates. #155488
  • Views cannot reference a UDF that contains mutation statements (INSERT, UPDATE, UPSERT, DELETE). #151686
  • A RECORD-returning UDF cannot be created without a RETURN statement in the root block, which would restrict the wildcard type to a concrete one. #122945
  • User-defined functions are not currently supported in:
    • Expressions (column, index, constraint) in tables. #87699
    • Partial index predicates. #155488
  • User-defined functions cannot call themselves recursively. #93049
  • The setval function cannot be resolved when used inside UDF bodies. #110860
  • Casting subqueries to user-defined types in UDFs is not supported. #108184
  • Routines cannot be invoked with named arguments, e.g., SELECT foo(a => 1, b => 2); or SELECT foo(b := 1, a := 2);. #122264
  • Routines cannot be created if they reference temporary tables. #121375
  • Routines cannot be created with unnamed INOUT parameters. For example, CREATE PROCEDURE p(INOUT INT) AS $$ BEGIN NULL; END; $$ LANGUAGE PLpgSQL;. #121251
  • Routines cannot be created if they return fewer columns than declared. For example, CREATE FUNCTION f(OUT sum INT, INOUT a INT, INOUT b INT) LANGUAGE SQL AS $$ SELECT (a + b, b); $$;. #121247
  • Routines cannot be created with an OUT parameter of type RECORD. #123448
  • DDL statements (e.g., CREATE TABLE, CREATE INDEX) are not allowed within UDFs or stored procedures. #110080
  • Polymorphic types cannot be cast to other types (e.g., TEXT) within routine parameters. #123536
  • Routine parameters and return types cannot be declared using the ANYENUM polymorphic type, which is able to match any ENUM type. #123048
  • Statement diagnostics cannot be collected for statements executed inside UDFs or stored procedures. You can request statement diagnostics for the top-level invocation of the function or procedure, and the resulting trace includes spans for each statement executed. However, there is no way to target statements executed inside the function or procedure with a statement diagnostics request. #159526
  • Statements within routines do not currently respect hint injections. The workaround is to modify the inline hints directly in the body by replacing the routine. #162627

Also refer to the PL/pgSQL known limitations.

See also

On this page

Product

Resources

Learn

Support Channels

Company

Get developer news

Thanks!

ร—
โ˜… โ˜… โ˜… โ˜… โ˜