Scalar User-Defined Functions

To register scalar UDFs, call one of the RegisterScalarFunction overloads specifying the function name, input parameter type(s), return type, and the actual callback for the scalar function:

connection.RegisterScalarFunction<int, bool>("is_prime", (readers, writer, rowCount) =>
{
    for (ulong index = 0; index < rowCount; index++)
    {
        var prime = true;
        var value = readers[0].GetValue<int>(index);

        for (int i = 2; i <= Math.Sqrt(value); i++)
        {
            if (value % i == 0)
            {
                prime = false;
                break;
            }
        }

        writer.WriteValue(prime, index);
    }
});

var primes = connection.Query<int>("SELECT i FROM range(2, 100) t(i) where is_prime(i::INT)").ToList();

//primes will be 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97

In the example above, RegisterScalarFunction has two type parameters: int indicates the input parameter type and bool indicates the return type of the function. The readers parameter passed to the callback is an array of type IDuckDBDataReader and will have a length that matches the number of input parameters. The writer parameter of type IDuckDBDataWriter is used for writing to the output.

Pure Scalar Functions

By default scalar UDFs are treated as pure functions, meaning that the scalar function will return the same result when the input is the same. For example, if your table has repeated values, the is_prime will be called only once for every unique value. If your function doesn't follow this rule, you need to pass false to isPureFunction parameter:

connection.RegisterScalarFunction<long, long>("my_random_scalar", (readers, writer, rowCount) =>
{
    for (ulong index = 0; index < rowCount; index++)
    {
        var value = Random.Shared.NextInt64(readers[0].GetValue<long>(index));

        writer.WriteValue(value, index);
    }
}, false);

connection.Query<long>("SELECT my_random_scalar(i) FROM some_table");

Variable Number of Arguments

Similar to the params keyword in C#, DuckDB scalar functions can also accept variable number of arguments. To register such function pass true to params parameter:

connection.RegisterScalarFunction<long, long>("my_rand", (readers, writer, rowCount) =>
{
    for (ulong index = 0; index < rowCount; index++)
    {
        var value = 0L;

        if (readers.Count == 0)
        {
            value = Random.Shared.NextInt64();
        }

        if (readers.Count == 1)
        {
            value = Random.Shared.NextInt64(readers[0].GetValue<long>(index));
        }

        if (readers.Count == 2)
        {
            value = Random.Shared.NextInt64(readers[0].GetValue<long>(index), readers[1].GetValue<long>(index));
        }

        writer.WriteValue(value, index);
    }
}, false, true);

In this example my_rand will return a random number if no parameter is passed to it, or return a random number less than the specified value, or return a random number between the two specified values.

SELECT my_rand() FROM some_table;

SELECT my_rand(upper_bound) FROM some_table;

SELECT my_rand(lower_bound, upper_bound) FROM some_table;

Supporting Different Input Types

If your scalar function supports different input types, you can use object as input type. The following scalar function to_string formats the input according to the specified format and returns it to DuckDB:

connection.RegisterScalarFunction<object, string, string>("to_string", (readers, writer, rowCount) =>
{
    for (ulong index = 0; index < rowCount; index++)
    {
        var format = readers[1].GetValue<string>(index);

        var value = readers[0].GetValue(index);

        if (value is IFormattable formattable)
        {
            writer.WriteValue(formattable.ToString(format, CultureInfo.InvariantCulture), index);
        }
    }
});

You can call this function like this:

SELECT id, to_string(id, 'G'), order_date, to_string(order_date, 'dd-MM-yyyy'), amount, to_string(amount, 'G'), FROM TestTableAnyType

You can also inspect the type of the input and read the input in strongly-typed way:

//to_string accepts and object and format string and returns string
connection.RegisterScalarFunction<object, string, string>("to_string", (readers, writer, rowCount) =>
{
    for (ulong index = 0; index < rowCount; index++)
    {
        var format = readers[1].GetValue<string>(index);

        switch (readers[0].DuckDBType)
        {
            case DuckDBType.Integer:
                writer.WriteValue(readers[0].GetValue<int>(index).ToString(format, CultureInfo.InvariantCulture), index);
                break;
            case DuckDBType.Date:
                writer.WriteValue(readers[0].GetValue<DateOnly>(index).ToString(format, CultureInfo.InvariantCulture), index);
                break;
            case DuckDBType.Double:
                writer.WriteValue(readers[0].GetValue<double>(index).ToString(format, CultureInfo.InvariantCulture), index);
                break;
            default:
                writer.WriteValue(readers[0].GetValue(index).ToString(), index);
                break;
        }
    }
});