Scalar User-Defined Functions
To register scalar UDFs, call one of the Register
connection.RegisterScalarFunction<int, bool>("is_prime", (readers, writer, rowCount) =>
{
for (ulong index = 0; index < rowCount; index++)
{
var prime = true;
var value = readers[0].GetValue<int>(index);
for (int i = 2; i <= Math.Sqrt(value); i++)
{
if (value % i == 0)
{
prime = false;
break;
}
}
writer.WriteValue(prime, index);
}
});
var primes = connection.Query<int>("SELECT i FROM range(2, 100) t(i) where is_prime(i::INT)").ToList();
//primes will be 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97
In the example above, RegisterScalarFunction
has two type parameters: int
indicates the input parameter type and bool
indicates the return type of the function. The readers
parameter passed to the callback is an array of type IDuckDBDataReader
and will have a length that matches the number of input parameters. The writer
parameter of type IDuckDBDataWriter
is used for writing to the output.
Pure Scalar Functions
By default scalar UDFs are treated as pure functions, meaning that the scalar function will return the same result when the input is the same. For example, if your table has repeated values, the is_prime
will be called only once for every unique value. If your function doesn't follow this rule, you need to pass false
to isPureFunction
parameter:
connection.RegisterScalarFunction<long, long>("my_random_scalar", (readers, writer, rowCount) =>
{
for (ulong index = 0; index < rowCount; index++)
{
var value = Random.Shared.NextInt64(readers[0].GetValue<long>(index));
writer.WriteValue(value, index);
}
}, false);
connection.Query<long>("SELECT my_random_scalar(i) FROM some_table");
Variable Number of Arguments
Similar to the params
keyword in C#, DuckDB scalar functions can also accept variable number of arguments. To register such function pass true
to params
parameter:
connection.RegisterScalarFunction<long, long>("my_rand", (readers, writer, rowCount) =>
{
for (ulong index = 0; index < rowCount; index++)
{
var value = 0L;
if (readers.Count == 0)
{
value = Random.Shared.NextInt64();
}
if (readers.Count == 1)
{
value = Random.Shared.NextInt64(readers[0].GetValue<long>(index));
}
if (readers.Count == 2)
{
value = Random.Shared.NextInt64(readers[0].GetValue<long>(index), readers[1].GetValue<long>(index));
}
writer.WriteValue(value, index);
}
}, false, true);
In this example my_rand
will return a random number if no parameter is passed to it, or return a random number less than the specified value, or return a random number between the two specified values.
SELECT my_rand() FROM some_table;
SELECT my_rand(upper_bound) FROM some_table;
SELECT my_rand(lower_bound, upper_bound) FROM some_table;
Supporting Different Input Types
If your scalar function supports different input types, you can use object
as input type. The following scalar function to_string
formats the input according to the specified format and returns it to DuckDB:
connection.RegisterScalarFunction<object, string, string>("to_string", (readers, writer, rowCount) =>
{
for (ulong index = 0; index < rowCount; index++)
{
var format = readers[1].GetValue<string>(index);
var value = readers[0].GetValue(index);
if (value is IFormattable formattable)
{
writer.WriteValue(formattable.ToString(format, CultureInfo.InvariantCulture), index);
}
}
});
You can call this function like this:
SELECT id, to_string(id, 'G'), order_date, to_string(order_date, 'dd-MM-yyyy'), amount, to_string(amount, 'G'), FROM TestTableAnyType
You can also inspect the type of the input and read the input in strongly-typed way:
//to_string accepts and object and format string and returns string
connection.RegisterScalarFunction<object, string, string>("to_string", (readers, writer, rowCount) =>
{
for (ulong index = 0; index < rowCount; index++)
{
var format = readers[1].GetValue<string>(index);
switch (readers[0].DuckDBType)
{
case DuckDBType.Integer:
writer.WriteValue(readers[0].GetValue<int>(index).ToString(format, CultureInfo.InvariantCulture), index);
break;
case DuckDBType.Date:
writer.WriteValue(readers[0].GetValue<DateOnly>(index).ToString(format, CultureInfo.InvariantCulture), index);
break;
case DuckDBType.Double:
writer.WriteValue(readers[0].GetValue<double>(index).ToString(format, CultureInfo.InvariantCulture), index);
break;
default:
writer.WriteValue(readers[0].GetValue(index).ToString(), index);
break;
}
}
});