<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>DuckDB | Giorgi Dalakishvili | Personal Website</title><link>https://www.giorgi.dev/tags/duckdb/</link><atom:link href="https://www.giorgi.dev/tags/duckdb/index.xml" rel="self" type="application/rss+xml"/><description>DuckDB</description><generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>© Giorgi Dalakishvili 2019 - 2026</copyright><lastBuildDate>Tue, 10 Mar 2026 13:00:00 +0400</lastBuildDate><image><url>img/map[gravatar:%!s(bool=false) shape:circle]</url><title>DuckDB</title><link>https://www.giorgi.dev/tags/duckdb/</link></image><item><title>DuckDB.NET 1.5.0: Simplified Scalar Functions, Table Functions, and NULL Handling</title><link>https://www.giorgi.dev/database/duckdb-net-1-5-udf-improvements/</link><pubDate>Tue, 10 Mar 2026 13:00:00 +0400</pubDate><guid>https://www.giorgi.dev/database/duckdb-net-1-5-udf-improvements/</guid><description>&lt;p>In the
&lt;a href="https://www.giorgi.dev/database/duckdb-net-1-5-performance/">previous post&lt;/a>, I covered the performance improvements in DuckDB.NET 1.5.0. This post covers the API side: a new high-level registration API for scalar and table user-defined functions, named parameter support for table functions, and explicit NULL handling for scalar UDFs.&lt;/p>
&lt;h2 id="scalar-functions-before-and-after">Scalar Functions: Before and After&lt;/h2>
&lt;p>DuckDB.NET has supported scalar UDFs since version 1.0. The existing API gives you full control over vectors and row iteration - but even simple functions require boilerplate.&lt;/p>
&lt;p>Here&amp;rsquo;s an &lt;code>is_prime&lt;/code> function using the low-level API:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">int&lt;/span>, &lt;span style="color:#66d9ef">bool&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;is_prime&amp;#34;&lt;/span>, (readers, writer, rowCount) =&amp;gt;
{
&lt;span style="color:#66d9ef">for&lt;/span> (&lt;span style="color:#66d9ef">ulong&lt;/span> index = &lt;span style="color:#ae81ff">0&lt;/span>; index &amp;lt; rowCount; index++)
{
&lt;span style="color:#66d9ef">var&lt;/span> &lt;span style="color:#66d9ef">value&lt;/span> = readers[&lt;span style="color:#ae81ff">0&lt;/span>].GetValue&amp;lt;&lt;span style="color:#66d9ef">int&lt;/span>&amp;gt;(index);
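&lt;span style="color:#66d9ef">var&lt;/span> prime = &lt;span style="color:#66d9ef">value&lt;/span> &amp;gt;= &lt;span style="color:#ae81ff">2&lt;/span>; &lt;span style="color:#75715e">// 0, 1 and negative numbers are not prime&lt;/span>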
&lt;span style="color:#66d9ef">for&lt;/span> (&lt;span style="color:#66d9ef">int&lt;/span> i = &lt;span style="color:#ae81ff">2&lt;/span>; i &amp;lt;= Math.Sqrt(&lt;span style="color:#66d9ef">value&lt;/span>); i++)
{
&lt;span style="color:#66d9ef">if&lt;/span> (&lt;span style="color:#66d9ef">value&lt;/span> % i == &lt;span style="color:#ae81ff">0&lt;/span>)
{
prime = &lt;span style="color:#66d9ef">false&lt;/span>;
&lt;span style="color:#66d9ef">break&lt;/span>;
}
}
writer.WriteValue(prime, index);
}
});
&lt;/code>&lt;/pre>&lt;/div>&lt;p>You have to manually iterate over rows, read values from input vectors by index, and write results to the output vector. This works, but it&amp;rsquo;s a lot of ceremony for a function that just checks if a number is prime.&lt;/p>
&lt;h3 id="high-level-scalar-functions">High-Level Scalar Functions&lt;/h3>
&lt;p>The new simplified API lets you register a scalar function with a plain &lt;code>Func&amp;lt;&amp;gt;&lt;/code> delegate. The framework handles row iteration and vector access automatically:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">int&lt;/span>, &lt;span style="color:#66d9ef">bool&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;is_prime&amp;#34;&lt;/span>, IsPrime);
&lt;/code>&lt;/pre>&lt;/div>&lt;p>That&amp;rsquo;s it. The library wraps your function, iterates over each row in the chunk, reads the input value, calls your function, and writes the result.&lt;/p>
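&lt;p>The &lt;code>IsPrime&lt;/code> method passed above is not shown in this post. A minimal sketch of what it could look like, assuming an ordinary static method (the &lt;code>PrimeFunctions&lt;/code> class name and the method body are illustrations, not taken from DuckDB.NET):&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">// Hypothetical helper: an ordinary method with no knowledge of vectors or chunks.
public static class PrimeFunctions
{
    public static bool IsPrime(int value)
    {
        if (value &amp;lt; 2) return false;           // 0, 1 and negatives are not prime
        if (value % 2 == 0) return value == 2; // 2 is the only even prime
        // Test odd divisors up to the square root; long avoids overflow in i * i.
        for (int i = 3; (long)i * i &amp;lt;= value; i += 2)
        {
            if (value % i == 0) return false;
        }
        return true;
    }
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With this layout the method would be passed as &lt;code>PrimeFunctions.IsPrime&lt;/code>, or as plain &lt;code>IsPrime&lt;/code> when it lives in the same class as the registration call.&lt;/p>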
&lt;p>The API supports zero to four parameters, plus variable arguments:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">&lt;span style="color:#75715e">// Zero parameters
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterScalarFunction(&lt;span style="color:#e6db74">&amp;#34;the_answer&amp;#34;&lt;/span>, () =&amp;gt; &lt;span style="color:#ae81ff">4&lt;/span>&lt;span style="color:#ae81ff">2&lt;/span>);
&lt;span style="color:#75715e">// Two parameters
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">long&lt;/span>, &lt;span style="color:#66d9ef">long&lt;/span>, &lt;span style="color:#66d9ef">long&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;add&amp;#34;&lt;/span>, (a, b) =&amp;gt; a + b);
&lt;span style="color:#75715e">// Three parameters
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">int&lt;/span>, &lt;span style="color:#66d9ef">int&lt;/span>, &lt;span style="color:#66d9ef">int&lt;/span>, &lt;span style="color:#66d9ef">int&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;clamp&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">value&lt;/span>, min, max) =&amp;gt; Math.Clamp(&lt;span style="color:#66d9ef">value&lt;/span>, min, max));
&lt;span style="color:#75715e">// Variable arguments
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterScalarFunction(&lt;span style="color:#e6db74">&amp;#34;sum_all&amp;#34;&lt;/span>, (&lt;span style="color:#66d9ef">long&lt;/span>[] args) =&amp;gt; args.Sum());
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Variable argument functions receive all inputs as an array. DuckDB allows calling them with any number of arguments:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> sum_all(&lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#ae81ff">2&lt;/span>, &lt;span style="color:#ae81ff">3&lt;/span>); &lt;span style="color:#75715e">-- 6
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> sum_all(&lt;span style="color:#ae81ff">10&lt;/span>); &lt;span style="color:#75715e">-- 10
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> sum_all(); &lt;span style="color:#75715e">-- 0
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="dynamic-type-support">Dynamic Type Support&lt;/h3>
&lt;p>The simplified API also supports &lt;code>object&lt;/code> as an input type, which maps to DuckDB&amp;rsquo;s &lt;code>ANY&lt;/code> type. This lets you write functions that accept any column type:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">object&lt;/span>, &lt;span style="color:#66d9ef">string&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;net_type_name&amp;#34;&lt;/span>,
&lt;span style="color:#66d9ef">value&lt;/span> =&amp;gt; &lt;span style="color:#66d9ef">value&lt;/span>.GetType().Name);
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> net_type_name(&lt;span style="color:#ae81ff">42&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;Int32&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> net_type_name(&lt;span style="color:#ae81ff">42&lt;/span>::BIGINT); &lt;span style="color:#75715e">-- &amp;#39;Int64&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> net_type_name(&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">hello&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;String&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> net_type_name(&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">2024-01-01&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>::DATE); &lt;span style="color:#75715e">-- &amp;#39;DateOnly&amp;#39;
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You can combine &lt;code>object&lt;/code> with typed parameters. Here&amp;rsquo;s a .NET format function that accepts any formattable value:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction(&lt;span style="color:#e6db74">&amp;#34;format_net&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">object&lt;/span> &lt;span style="color:#66d9ef">value&lt;/span>, &lt;span style="color:#66d9ef">string&lt;/span> format) =&amp;gt; &lt;span style="color:#66d9ef">value&lt;/span> &lt;span style="color:#66d9ef">is&lt;/span> IFormattable f
? f.ToString(format, CultureInfo.InvariantCulture)
: &lt;span style="color:#66d9ef">value&lt;/span>.ToString());
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> format_net(&lt;span style="color:#ae81ff">255&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">X&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;FF&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> format_net(&lt;span style="color:#ae81ff">0&lt;/span>.&lt;span style="color:#ae81ff">15&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">P&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;15.00 %&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> format_net(&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">2024-11-06&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>::DATE, &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">yyyy/MM/dd&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;2024/11/06&amp;#39;
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="null-handling-for-scalar-functions">NULL Handling for Scalar Functions&lt;/h2>
&lt;p>By default, DuckDB short-circuits NULLs automatically: if any input to a scalar function is NULL, the result is NULL and your function is never called. This is the correct behavior for most functions, but sometimes you need to handle NULLs explicitly - for example, to implement COALESCE-like logic or to return a default value.&lt;/p>
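&lt;p>The default behavior can be pictured with a small stand-alone sketch. It is only a conceptual model: plain arrays stand in for DuckDB vectors, and &lt;code>NullPropagationSketch&lt;/code> and &lt;code>Apply&lt;/code> are hypothetical names, not part of DuckDB or DuckDB.NET:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">using System;
public static class NullPropagationSketch
{
    // Arrays stand in for the input and output vectors of one chunk.
    public static bool?[] Apply(int?[] input, Func&amp;lt;int, bool&amp;gt; func)
    {
        var output = new bool?[input.Length];
        for (var i = 0; i &amp;lt; input.Length; i++)
        {
            // NULL in, NULL out: func is never called for a NULL row.
            output[i] = input[i] is int value ? func(value) : (bool?)null;
        }
        return output;
    }
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Calling &lt;code>Apply(new int?[] { 2, null, 9 }, x =&amp;gt; x % 2 == 0)&lt;/code> yields true, NULL, false, and the delegate runs only for the two non-NULL rows.&lt;/p>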
&lt;p>With the high-level API, the opt-in is the type signature itself. Use a nullable parameter type and DuckDB.NET automatically detects it at registration time and configures the function to receive NULLs:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">int?&lt;/span>, &lt;span style="color:#66d9ef">string&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;describe_val&amp;#34;&lt;/span>,
x =&amp;gt; x.HasValue ? x.Value.ToString() : &lt;span style="color:#e6db74">&amp;#34;nothing&amp;#34;&lt;/span>);
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> describe_val(&lt;span style="color:#ae81ff">42&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;42&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> describe_val(&lt;span style="color:#66d9ef">NULL&lt;/span>::INT); &lt;span style="color:#75715e">-- &amp;#39;nothing&amp;#39;
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This works for reference types too - &lt;code>string?&lt;/code> (with nullable annotations enabled) opts into null handling, while &lt;code>string&lt;/code> keeps the default NULL propagation:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">string?&lt;/span>, &lt;span style="color:#66d9ef">string&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;echo_or_default&amp;#34;&lt;/span>,
s =&amp;gt; s ?? &lt;span style="color:#e6db74">&amp;#34;was_null&amp;#34;&lt;/span>);
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> echo_or_default(&lt;span style="color:#66d9ef">NULL&lt;/span>::VARCHAR); &lt;span style="color:#75715e">-- &amp;#39;was_null&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> echo_or_default(&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">hello&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;hello&amp;#39;
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You can mix nullable and non-nullable parameters. If any parameter is nullable, the function receives NULLs - but non-nullable parameters will throw a clear error if they get one:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">int?&lt;/span>, &lt;span style="color:#66d9ef">int&lt;/span>, &lt;span style="color:#66d9ef">string&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;coalesce_add&amp;#34;&lt;/span>,
(a, b) =&amp;gt; a.HasValue ? (a.Value + b).ToString() : b.ToString());
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> coalesce_add(&lt;span style="color:#66d9ef">NULL&lt;/span>::INT, &lt;span style="color:#ae81ff">5&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;5&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> coalesce_add(&lt;span style="color:#ae81ff">10&lt;/span>, &lt;span style="color:#ae81ff">5&lt;/span>); &lt;span style="color:#75715e">-- &amp;#39;15&amp;#39;
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>NULL handling also works with the low-level vector API. In scalar function callbacks, &lt;code>GetValue&amp;lt;T&amp;gt;&lt;/code> returns &lt;code>null&lt;/code> for NULL rows when &lt;code>T&lt;/code> is a nullable type (&lt;code>int?&lt;/code>, &lt;code>string&lt;/code>, etc.) instead of throwing, so you can write:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterScalarFunction&amp;lt;&lt;span style="color:#66d9ef">string&lt;/span>, &lt;span style="color:#66d9ef">string&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;echo_nullable&amp;#34;&lt;/span>, (readers, writer, rowCount) =&amp;gt;
{
&lt;span style="color:#66d9ef">for&lt;/span> (&lt;span style="color:#66d9ef">ulong&lt;/span> i = &lt;span style="color:#ae81ff">0&lt;/span>; i &amp;lt; rowCount; i++)
{
&lt;span style="color:#66d9ef">var&lt;/span> &lt;span style="color:#66d9ef">value&lt;/span> = readers[&lt;span style="color:#ae81ff">0&lt;/span>].GetValue&amp;lt;&lt;span style="color:#66d9ef">string&lt;/span>&amp;gt;(i);
writer.WriteValue(&lt;span style="color:#66d9ef">value&lt;/span> ?? &lt;span style="color:#e6db74">&amp;#34;was_null&amp;#34;&lt;/span>, i);
}
}, &lt;span style="color:#66d9ef">new&lt;/span>() { HandlesNulls = &lt;span style="color:#66d9ef">true&lt;/span> });
&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="table-functions-before-and-after">Table Functions: Before and After&lt;/h2>
&lt;p>The low-level table function API requires you to define columns, return a &lt;code>TableFunction&lt;/code> with data, and provide a mapper callback that writes each row to output vectors:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterTableFunction&amp;lt;&lt;span style="color:#66d9ef">int&lt;/span>&amp;gt;(&lt;span style="color:#e6db74">&amp;#34;employees&amp;#34;&lt;/span>, parameters =&amp;gt;
{
&lt;span style="color:#66d9ef">var&lt;/span> count = parameters[&lt;span style="color:#ae81ff">0&lt;/span>].GetValue&amp;lt;&lt;span style="color:#66d9ef">int&lt;/span>&amp;gt;();
&lt;span style="color:#66d9ef">var&lt;/span> employees = Enumerable.Range(&lt;span style="color:#ae81ff">1&lt;/span>, count)
.Select(i =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> Employee(i, &lt;span style="color:#e6db74">$&amp;#34;Employee{i}&amp;#34;&lt;/span>, &lt;span style="color:#ae81ff">5&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span> + i * &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>));
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#66d9ef">new&lt;/span> TableFunction(
&lt;span style="color:#66d9ef">new&lt;/span> List&amp;lt;ColumnInfo&amp;gt; { &lt;span style="color:#66d9ef">new&lt;/span>(&lt;span style="color:#e6db74">&amp;#34;id&amp;#34;&lt;/span>, &lt;span style="color:#66d9ef">typeof&lt;/span>(&lt;span style="color:#66d9ef">int&lt;/span>)), &lt;span style="color:#66d9ef">new&lt;/span>(&lt;span style="color:#e6db74">&amp;#34;name&amp;#34;&lt;/span>, &lt;span style="color:#66d9ef">typeof&lt;/span>(&lt;span style="color:#66d9ef">string&lt;/span>)) },
employees);
},
(item, writers, rowIndex) =&amp;gt;
{
&lt;span style="color:#66d9ef">var&lt;/span> employee = (Employee)item!;
writers[&lt;span style="color:#ae81ff">0&lt;/span>].WriteValue(employee.Id, rowIndex);
writers[&lt;span style="color:#ae81ff">1&lt;/span>].WriteValue(employee.Name, rowIndex);
});
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The column definitions and mapper are separate, so they can easily get out of sync - add a column to the schema but forget the mapper, or reorder them differently.&lt;/p>
&lt;h3 id="high-level-table-functions">High-Level Table Functions&lt;/h3>
&lt;p>The new API uses a projection expression to define both the columns and the mapper in one place:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;employees&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> count) =&amp;gt; GetEmployees(count),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { e.Id, e.Name });
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The first argument is the function name. The second is a data function that receives the SQL parameters and returns an &lt;code>IEnumerable&amp;lt;T&amp;gt;&lt;/code>. The third is a projection expression that defines which properties to expose as columns - column names and types are extracted automatically from the expression.&lt;/p>
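&lt;p>The examples call &lt;code>GetEmployees&lt;/code> and use an &lt;code>Employee&lt;/code> type without defining them. One possible definition, inferred from how they are used in this post. Two assumptions: &lt;code>Salary&lt;/code> is a &lt;code>double&lt;/code> because other examples multiply it by a &lt;code>double&lt;/code>, and &lt;code>EmployeeSource&lt;/code> is a hypothetical class name:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">using System.Collections.Generic;
using System.Linq;
// Assumed shape: a record, since later examples use &amp;#34;e with { ... }&amp;#34;.
public record Employee(int Id, string Name, double Salary);
public static class EmployeeSource
{
    // Hypothetical data function: lazily yields employees numbered 1..count.
    public static IEnumerable&amp;lt;Employee&amp;gt; GetEmployees(int count) =&amp;gt;
        Enumerable.Range(1, count)
            .Select(i =&amp;gt; new Employee(i, $&amp;#34;Employee{i}&amp;#34;, 50000 + i * 100));
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With &lt;code>using static EmployeeSource;&lt;/code> in scope, &lt;code>GetEmployees(count)&lt;/code> can be called unqualified, as the registration examples do.&lt;/p>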
&lt;p>The projection supports anonymous types, object initializers, computed columns, and single properties:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">&lt;span style="color:#75715e">// Computed columns
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;ext_computed&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> count) =&amp;gt; GetEmployees(count),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { FullName = &lt;span style="color:#e6db74">&amp;#34;Dr. &amp;#34;&lt;/span> + e.Name, DoubleSalary = e.Salary * &lt;span style="color:#ae81ff">2&lt;/span> });
&lt;span style="color:#75715e">// Object initializer
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;ext_init&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> count) =&amp;gt; GetEmployees(count),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> EmployeeDto { Id = e.Id, Name = e.Name });
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Like the scalar API, table functions support zero to four parameters:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">&lt;span style="color:#75715e">// No parameters
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;all_employees&amp;#34;&lt;/span>,
() =&amp;gt; employees.AsEnumerable(),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { e.Id, e.Name });
&lt;span style="color:#75715e">// Four parameters
&lt;/span>&lt;span style="color:#75715e">&lt;/span>connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;ext_four&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> start, &lt;span style="color:#66d9ef">int&lt;/span> count, &lt;span style="color:#66d9ef">string&lt;/span> prefix, &lt;span style="color:#66d9ef">double&lt;/span> multiplier) =&amp;gt;
Enumerable.Range(start, count).Select(i =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> Employee(i, &lt;span style="color:#e6db74">$&amp;#34;{prefix}{i}&amp;#34;&lt;/span>, i * multiplier)),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { e.Id, e.Name, e.Salary });
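&lt;/code>&lt;/pre>&lt;/div>&lt;p>Async data sources work too - call &lt;code>ToBlockingEnumerable()&lt;/code> (available since .NET 7) on the &lt;code>IAsyncEnumerable&amp;lt;T&amp;gt;&lt;/code> to turn it into a regular &lt;code>IEnumerable&amp;lt;T&amp;gt;&lt;/code>:&lt;/p>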
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;ext_async&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> count) =&amp;gt; FetchEmployeesAsync(count).ToBlockingEnumerable(),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { e.Id, e.Name });
&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="named-parameters-for-table-functions">Named Parameters for Table Functions&lt;/h2>
&lt;p>DuckDB supports named parameters for table functions. DuckDB.NET 1.5.0 exposes this with the &lt;code>[Named]&lt;/code> attribute:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;employees&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> count, [Named] &lt;span style="color:#66d9ef">string?&lt;/span> prefix) =&amp;gt;
GetEmployees(count).Select(e =&amp;gt; e with { Name = (prefix ?? &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>) + e.Name }),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { e.Id, e.Name });
&lt;/code>&lt;/pre>&lt;/div>&lt;p>In SQL, positional parameters come first, and named parameters use the &lt;code>=&lt;/code> syntax:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#66d9ef">FROM&lt;/span> employees(&lt;span style="color:#ae81ff">3&lt;/span>, &lt;span style="color:#66d9ef">prefix&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">Dr. &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>);
&lt;span style="color:#75715e">-- (1, &amp;#39;Dr. Employee1&amp;#39;), (2, &amp;#39;Dr. Employee2&amp;#39;), (3, &amp;#39;Dr. Employee3&amp;#39;)
&lt;/span>&lt;span style="color:#75715e">&lt;/span>
&lt;span style="color:#75715e">-- Named parameters are optional - omit them and they&amp;#39;re NULL
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#66d9ef">FROM&lt;/span> employees(&lt;span style="color:#ae81ff">3&lt;/span>);
&lt;span style="color:#75715e">-- (1, &amp;#39;Employee1&amp;#39;), (2, &amp;#39;Employee2&amp;#39;), (3, &amp;#39;Employee3&amp;#39;)
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You can have multiple named parameters:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;employees&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> count, [Named] &lt;span style="color:#66d9ef">string?&lt;/span> prefix, [Named] &lt;span style="color:#66d9ef">double?&lt;/span> multiplier) =&amp;gt;
GetEmployees(count).Select(e =&amp;gt; e with
{
Name = (prefix ?? &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>) + e.Name,
Salary = e.Salary * (multiplier ?? &lt;span style="color:#ae81ff">1&lt;/span>)
}),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { e.Id, e.Name, e.Salary });
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#75715e">-- Provide both
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#66d9ef">FROM&lt;/span> employees(&lt;span style="color:#ae81ff">2&lt;/span>, &lt;span style="color:#66d9ef">prefix&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">X&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>, multiplier &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>.&lt;span style="color:#ae81ff">0&lt;/span>);
&lt;span style="color:#75715e">-- Provide only one
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">SELECT&lt;/span> &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#66d9ef">FROM&lt;/span> employees(&lt;span style="color:#ae81ff">2&lt;/span>, &lt;span style="color:#66d9ef">prefix&lt;/span> &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">Y&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>);
&lt;/code>&lt;/pre>&lt;/div>&lt;p>By default, the SQL parameter name matches the C# parameter name. You can override it with a custom name:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">connection.RegisterTableFunction(&lt;span style="color:#e6db74">&amp;#34;employees&amp;#34;&lt;/span>,
(&lt;span style="color:#66d9ef">int&lt;/span> count, [Named(&lt;span style="color:#e6db74">&amp;#34;max_rows&amp;#34;&lt;/span>)] &lt;span style="color:#66d9ef">int?&lt;/span> limit) =&amp;gt; GetEmployees(limit ?? count),
e =&amp;gt; &lt;span style="color:#66d9ef">new&lt;/span> { e.Id, e.Name });
&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#66d9ef">SELECT&lt;/span> &lt;span style="color:#f92672">*&lt;/span> &lt;span style="color:#66d9ef">FROM&lt;/span> employees(&lt;span style="color:#ae81ff">10&lt;/span>, max_rows &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>);
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Named parameters should typically be nullable types (&lt;code>string?&lt;/code>, &lt;code>int?&lt;/code>) since they&amp;rsquo;re optional in SQL. If a named parameter is non-nullable and the caller omits it, DuckDB.NET throws a clear error:&lt;/p>
&lt;pre>&lt;code>Table function 'employees' named parameter 'limit' is NULL, but parameter type 'Int32' is non-nullable.
&lt;/code>&lt;/pre>&lt;h2 id="summary">Summary&lt;/h2>
&lt;p>DuckDB.NET 1.5.0 adds four improvements to user-defined functions:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>High-level scalar functions&lt;/strong>: Register with a plain &lt;code>Func&amp;lt;&amp;gt;&lt;/code> instead of manually iterating vectors. Supports 0–4 parameters, variable arguments, and dynamic &lt;code>object&lt;/code> type.&lt;/li>
&lt;li>&lt;strong>High-level table functions&lt;/strong>: Define columns and mapping with a single projection expression. Supports 0–4 parameters, computed columns, and async data sources.&lt;/li>
&lt;li>&lt;strong>NULL handling for scalar functions&lt;/strong>: Use nullable parameter types (&lt;code>int?&lt;/code>, &lt;code>string?&lt;/code>) and DuckDB.NET automatically configures the function to receive NULLs instead of short-circuiting.&lt;/li>
&lt;li>&lt;strong>Named parameters for table functions&lt;/strong>: The &lt;code>[Named]&lt;/code> attribute maps C# parameters to DuckDB&amp;rsquo;s &lt;code>=&lt;/code> syntax for optional, named arguments.&lt;/li>
&lt;/ul>
&lt;p>The low-level vector-based APIs remain available for cases where you need direct control over chunk processing.&lt;/p></description></item><item><title>DuckDB.NET 1.5.0 Performance: Up to 40% Faster Writes and 22% Fewer Allocations</title><link>https://www.giorgi.dev/database/duckdb-net-1-5-performance/</link><pubDate>Tue, 03 Mar 2026 05:00:00 +0400</pubDate><guid>https://www.giorgi.dev/database/duckdb-net-1-5-performance/</guid><description>&lt;h2 id="whats-new-in-duckdbnet-150">What&amp;rsquo;s New in DuckDB.NET 1.5.0&lt;/h2>
&lt;p>
&lt;a href="https://github.com/Giorgi/DuckDB.NET" target="_blank" rel="noopener">DuckDB.NET&lt;/a> 1.5.0 focuses on performance. The codebase has been optimized across multiple layers - from the low-level native interop (
&lt;a href="https://github.com/Giorgi/DuckDB.NET/commit/370ca9070710784258cde85e43a88144f8200b6f" target="_blank" rel="noopener">LibraryImport migration&lt;/a>,
&lt;a href="https://github.com/Giorgi/DuckDB.NET/commit/377c3a1c76e2117015c0b29b4896b4535ac15cce" target="_blank" rel="noopener">SuppressGCTransition&lt;/a>) to the ADO.NET provider (reader reuse, appender boxing elimination) and type converters (decimal rewrite, BigNum conversion). The results show up across every major code path: reading, writing, and type conversion.&lt;/p>
&lt;p>Note that the current pre-release (1.5.0-alpha) still uses the DuckDB 1.4.4 native library under the hood - the performance gains come entirely from improvements on the .NET side. The stable DuckDB.NET 1.5.0 release will ship with DuckDB 1.5.0 once it becomes available.&lt;/p>
&lt;h2 id="what-changed">What Changed&lt;/h2>
&lt;h3 id="libraryimport-migration">LibraryImport Migration&lt;/h3>
&lt;p>All P/Invoke declarations have been migrated from &lt;code>[DllImport]&lt;/code> to the source-generated &lt;code>[LibraryImport]&lt;/code>. The runtime no longer needs to generate marshalling stubs at JIT time - the source generator produces them at compile time, which removes the runtime stub-generation cost for every imported function. As part of this migration, string return values now use custom marshallers (&lt;code>DuckDBOwnedStringMarshaller&lt;/code> and &lt;code>DuckDBCallerOwnedStringMarshaller&lt;/code>) that handle ownership semantics transparently - that is, whether DuckDB or the caller is responsible for freeing the memory.&lt;/p>
&lt;h3 id="suppressgctransition-on-fast-native-calls">SuppressGCTransition on Fast Native Calls&lt;/h3>
&lt;p>Many DuckDB C API calls are trivially fast - retrieving a vector data pointer, checking validity, getting chunk size. These calls complete in nanoseconds, but the .NET runtime&amp;rsquo;s GC transition (cooperative → preemptive → cooperative) adds measurable overhead on each call. Adding &lt;code>[SuppressGCTransition]&lt;/code> to these methods skips the transition entirely. This is only safe for native functions that execute in under a microsecond, perform no blocking syscalls or I/O, don&amp;rsquo;t call back into the runtime, don&amp;rsquo;t throw exceptions, and don&amp;rsquo;t manipulate locks. The attribute was applied to every DuckDB C API method that meets these criteria - primarily vector data and validity pointer access, chunk size queries, and similar lightweight operations.&lt;/p>
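&lt;p>As a rough sketch (not DuckDB.NET&amp;rsquo;s actual source), a declaration that combines both attributes might look like the following. The class name and the &lt;code>duckdb&lt;/code> library name are assumptions; &lt;code>duckdb_vector_get_data&lt;/code> is a DuckDB C API function. Compiling it assumes .NET 7 or later with &lt;code>AllowUnsafeBlocks&lt;/code> enabled, and actually calling it requires the native library to be loadable.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper class; the &amp;#34;duckdb&amp;#34; library name is an assumption.
internal static partial class NativeSketch
{
    // LibraryImport: the marshalling code for this partial method is generated at compile time.
    // SuppressGCTransition: skips the GC mode switch around the call, which is only
    // appropriate because the native function merely returns a pointer.
    [LibraryImport(&amp;#34;duckdb&amp;#34;)]
    [SuppressGCTransition]
    internal static partial IntPtr duckdb_vector_get_data(IntPtr vector);
}
&lt;/code>&lt;/pre>&lt;/div>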
&lt;h3 id="aggressiveinlining-on-hot-path-methods">AggressiveInlining on Hot-Path Methods&lt;/h3>
&lt;p>&lt;code>[MethodImpl(MethodImplOptions.AggressiveInlining)]&lt;/code> was added to frequently called methods in the reader and writer paths - including &lt;code>IsValid()&lt;/code>, &lt;code>GetFieldData&amp;lt;T&amp;gt;()&lt;/code>, &lt;code>AppendValueInternal&amp;lt;T&amp;gt;()&lt;/code>, and type conversion helpers. For example, &lt;code>IsValid()&lt;/code> and &lt;code>GetFieldData&amp;lt;T&amp;gt;()&lt;/code> are called for every column of every row. At 100,000 rows and 20 columns, that&amp;rsquo;s 2 million calls to each of them per query. Inlining these small methods directly into the read/write loops removes the call overhead.&lt;/p>
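&lt;p>To illustrate the kind of method that benefits, here is a minimal sketch of a validity check marked for inlining. The class is hypothetical, a managed &lt;code>ulong[]&lt;/code> stands in for the native validity mask, and the one-bit-per-row layout in 64-bit words is an assumption of the sketch.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">#nullable enable
using System;
using System.Runtime.CompilerServices;

// Hypothetical reader; a managed array stands in for the native validity mask.
public sealed class ValiditySketch
{
    private readonly ulong[]? validity; // null means every row is valid

    public ValiditySketch(ulong[]? validity) =&amp;gt; this.validity = validity;

    // Small enough that inlining it into the read loop removes the call overhead.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public bool IsValid(int row)
    {
        if (validity is null) return true;
        // Assumed layout: one bit per row, word = row / 64, bit = row % 64.
        return (validity[row &amp;gt;&amp;gt; 6] &amp;amp; (1UL &amp;lt;&amp;lt; (row &amp;amp; 63))) != 0;
    }
}
&lt;/code>&lt;/pre>&lt;/div>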
&lt;h3 id="appender-boxing-elimination">Appender Boxing Elimination&lt;/h3>
&lt;p>In 1.4.4, all &lt;code>AppendValue()&lt;/code> overloads funneled into a single generic &lt;code>AppendValueInternal&amp;lt;T&amp;gt;(T? value)&lt;/code> without a &lt;code>struct&lt;/code> constraint. When &lt;code>T&lt;/code> is a value type, the nullable wrapper was boxed on every call. In 1.5.0, this is split into &lt;code>struct&lt;/code>-constrained and &lt;code>class&lt;/code>-constrained overloads - the struct path unwraps &lt;code>Nullable&amp;lt;T&amp;gt;&lt;/code> directly via &lt;code>HasValue&lt;/code>/&lt;code>Value&lt;/code>, eliminating the boxing. With an 8-column row made up of 6 value-type columns and 2 string columns, as in the appender benchmark in this post, that&amp;rsquo;s 6 fewer heap allocations per row.&lt;/p>
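&lt;p>The split can be sketched roughly as follows. This is an illustration of the constraint-based overload pattern, not the library&amp;rsquo;s code: the class, its counters, and &lt;code>AppendBoxed&lt;/code> are hypothetical, and the appender only counts what it receives so that the example runs without DuckDB.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">#nullable enable
using System;

// Hypothetical appender that only counts values and NULLs.
public sealed class AppenderSketch
{
    public int Nulls { get; private set; }
    public int Values { get; private set; }

    // For comparison: passing a Nullable&amp;lt;T&amp;gt; that has a value as object boxes it on the heap.
    public void AppendBoxed(object? value)
    {
        if (value is null) Nulls++; else Values++;
    }

    // Struct path: T? is Nullable&amp;lt;T&amp;gt;, inspected via HasValue/Value without boxing.
    public void AppendValue&amp;lt;T&amp;gt;(T? value) where T : struct
    {
        if (value.HasValue) Write(value.Value); else Nulls++;
    }

    // Class path: reference types are already heap objects, so a null check is enough.
    public void AppendValue&amp;lt;T&amp;gt;(T? value) where T : class
    {
        if (value is null) Nulls++; else Values++;
    }

    private void Write&amp;lt;T&amp;gt;(T value) where T : struct =&amp;gt; Values++;
}
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Because the two overloads differ in their parameter type (&lt;code>Nullable&amp;lt;T&amp;gt;&lt;/code> versus &lt;code>T&lt;/code>), the compiler picks the right one from the argument, so call sites stay the same.&lt;/p>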
&lt;h3 id="decimal-conversion-rewrite">Decimal Conversion Rewrite&lt;/h3>
&lt;p>The decimal reader was rewritten for all four internal storage paths:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>SmallInt/Integer/BigInt paths&lt;/strong>: Replaced &lt;code>decimal.Divide(raw, powerOfTen)&lt;/code> with the direct constructor &lt;code>new decimal(Math.Abs(raw), 0, 0, raw &amp;lt; 0, scale)&lt;/code>. This constructs the decimal from its binary components without any arithmetic.&lt;/li>
&lt;li>&lt;strong>HugeInt path&lt;/strong>: Uses &lt;code>BigInteger.DivRem&lt;/code> with pre-computed &lt;code>BigInteger[]&lt;/code> powers of ten instead of repeated intermediate BigInteger arithmetic.&lt;/li>
&lt;li>&lt;strong>Pre-computed lookup tables&lt;/strong>: Static &lt;code>decimal[]&lt;/code> and &lt;code>BigInteger[]&lt;/code> arrays for powers of ten (scales 0–28 and 0–38 respectively) are computed once at startup and reused for all conversions.&lt;/li>
&lt;/ul>
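&lt;p>For the integer-backed paths, the idea can be sketched as below for a 64-bit raw value. The method name is hypothetical, and the sketch computes an unsigned magnitude instead of calling &lt;code>Math.Abs&lt;/code> so that &lt;code>long.MinValue&lt;/code> is also handled; the scale must be in the range 0–28 accepted by the &lt;code>decimal&lt;/code> constructor.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">using System;

public static class DecimalSketch
{
    // Builds a decimal from a raw scaled integer (raw = 12345, scale = 2 gives 123.45)
    // by filling in the binary components instead of dividing by a power of ten.
    public static decimal FromScaled(long raw, byte scale)
    {
        bool negative = raw &amp;lt; 0;
        // Unsigned magnitude; unlike Math.Abs this also works for long.MinValue.
        ulong magnitude = negative ? (ulong)(-(raw + 1)) + 1UL : (ulong)raw;
        int lo = unchecked((int)(uint)magnitude);          // low 32 bits
        int mid = unchecked((int)(uint)(magnitude &amp;gt;&amp;gt; 32)); // next 32 bits
        return new decimal(lo, mid, 0, negative, scale);
    }
}
&lt;/code>&lt;/pre>&lt;/div>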
&lt;h3 id="vector-reader-reuse-across-chunks">Vector Reader Reuse Across Chunks&lt;/h3>
&lt;p>DuckDB returns data in chunks of up to 2,048 rows. In 1.4.4, new &lt;code>VectorDataReader&lt;/code> instances were allocated for each chunk. In 1.5.0, readers implement &lt;code>Reset(IntPtr vector)&lt;/code> which updates data and validity pointers in place, reusing the same reader objects across all chunks in a result set. Composite readers (struct, list, map, decimal) override &lt;code>Reset&lt;/code> to also update their nested child readers.&lt;/p>
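&lt;p>A minimal sketch of the reuse pattern follows. All type names are hypothetical, and a small record stands in for the native vector handle, so &lt;code>Reset&lt;/code> here takes that record rather than an &lt;code>IntPtr&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">#nullable enable
using System;

// Stand-in for a native vector so the sketch runs without DuckDB.
public sealed record VectorSketch(IntPtr Data, IntPtr Validity, VectorSketch? Child = null);

public class ReaderSketch
{
    public IntPtr Data { get; private set; }
    public IntPtr Validity { get; private set; }

    // Re-point this reader at the next chunk&amp;#39;s vector; no new reader is allocated.
    public virtual void Reset(VectorSketch vector)
    {
        Data = vector.Data;
        Validity = vector.Validity;
    }
}

public sealed class ListReaderSketch : ReaderSketch
{
    public ReaderSketch Child { get; } = new ReaderSketch();

    // Composite readers also refresh their nested child reader.
    public override void Reset(VectorSketch vector)
    {
        base.Reset(vector);
        if (vector.Child is not null) Child.Reset(vector.Child);
    }
}
&lt;/code>&lt;/pre>&lt;/div>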
&lt;h3 id="bignum-on-conversion">BIGNUM O(n) Conversion&lt;/h3>
&lt;p>The &lt;code>BIGNUM&lt;/code> (previously &lt;code>VarInt&lt;/code>) to &lt;code>BigInteger&lt;/code> conversion was rewritten from an O(n²) digit-by-digit algorithm to a direct O(n) construction using &lt;code>BigInteger&lt;/code>&amp;rsquo;s byte-span constructor. For positive values, the raw bytes are passed directly. For negative values, a byte complement is needed - small payloads (≤128 bytes) use &lt;code>stackalloc&lt;/code> for this, while larger ones rent a buffer from &lt;code>ArrayPool&amp;lt;byte&amp;gt;&lt;/code>. The result: &lt;strong>93% faster&lt;/strong> reads and &lt;strong>98-99% fewer allocations&lt;/strong> at 10K–100K rows.&lt;/p>
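&lt;p>The buffer strategy can be sketched as follows. The payload layout used here (big-endian magnitude bytes, bitwise-complemented when the value is negative) is an assumption made for illustration and may not match DuckDB&amp;rsquo;s exact format; the class and method names are hypothetical.&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-cs" data-lang="cs">#nullable enable
using System;
using System.Buffers;
using System.Numerics;

public static class BigNumSketch
{
    // Assumed layout (illustration only): big-endian magnitude bytes,
    // with every byte complemented when the value is negative.
    public static BigInteger ToBigInteger(ReadOnlySpan&amp;lt;byte&amp;gt; payload, bool negative)
    {
        if (!negative)
        {
            // Positive: hand the bytes straight to the span constructor.
            return new BigInteger(payload, isUnsigned: true, isBigEndian: true);
        }

        byte[]? rented = null;
        // Small payloads use the stack; larger ones rent from the shared pool.
        Span&amp;lt;byte&amp;gt; buffer = payload.Length &amp;lt;= 128
            ? stackalloc byte[128]
            : (rented = ArrayPool&amp;lt;byte&amp;gt;.Shared.Rent(payload.Length));
        try
        {
            Span&amp;lt;byte&amp;gt; magnitude = buffer.Slice(0, payload.Length);
            for (int i = 0; i &amp;lt; payload.Length; i++)
                magnitude[i] = unchecked((byte)~payload[i]);
            return -new BigInteger(magnitude, isUnsigned: true, isBigEndian: true);
        }
        finally
        {
            if (rented is not null) ArrayPool&amp;lt;byte&amp;gt;.Shared.Return(rented);
        }
    }
}
&lt;/code>&lt;/pre>&lt;/div>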
&lt;h2 id="benchmarks">Benchmarks&lt;/h2>
&lt;p>All benchmarks compare &lt;strong>DuckDB.NET.Data.Full 1.4.4&lt;/strong> (stable, from NuGet.org) against &lt;strong>1.5.0-alpha.35&lt;/strong> (pre-release, from GitHub) using
&lt;a href="https://github.com/dotnet/BenchmarkDotNet" target="_blank" rel="noopener">BenchmarkDotNet&lt;/a>. Each version is compiled and run independently using &lt;code>WithMsBuildArguments&lt;/code> to swap the package version at build time, so both versions run against the same benchmark code with no shared state.&lt;/p>
&lt;h3 id="reader-17-22-faster">Reader: 17-22% Faster&lt;/h3>
&lt;p>The reader benchmark creates a table with a mix of column types (INT, VARCHAR, DOUBLE, BOOLEAN, DECIMAL, BIGINT, TIMESTAMP) and reads 100,000 rows using typed getters (&lt;code>GetInt32&lt;/code>, &lt;code>GetString&lt;/code>, &lt;code>GetDecimal&lt;/code>, etc.).&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Method&lt;/th>
&lt;th>Job&lt;/th>
&lt;th>ColumnCount&lt;/th>
&lt;th align="right">Mean&lt;/th>
&lt;th align="right">Ratio&lt;/th>
&lt;th align="right">Allocated&lt;/th>
&lt;th align="right">Alloc Ratio&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>ReadAllColumns&lt;/td>
&lt;td>alpha-1.5.0&lt;/td>
&lt;td>5&lt;/td>
&lt;td align="right">20.45 ms&lt;/td>
&lt;td align="right">-22%&lt;/td>
&lt;td align="right">3.82 MB&lt;/td>
&lt;td align="right">-1%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>ReadAllColumns&lt;/td>
&lt;td>stable-1.4.4&lt;/td>
&lt;td>5&lt;/td>
&lt;td align="right">26.12 ms&lt;/td>
&lt;td align="right">baseline&lt;/td>
&lt;td align="right">3.85 MB&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>ReadAllColumns&lt;/td>
&lt;td>alpha-1.5.0&lt;/td>
&lt;td>10&lt;/td>
&lt;td align="right">44.15 ms&lt;/td>
&lt;td align="right">-17%&lt;/td>
&lt;td align="right">7.63 MB&lt;/td>
&lt;td align="right">-1%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>ReadAllColumns&lt;/td>
&lt;td>stable-1.4.4&lt;/td>
&lt;td>10&lt;/td>
&lt;td align="right">53.60 ms&lt;/td>
&lt;td align="right">baseline&lt;/td>
&lt;td align="right">7.68 MB&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>ReadAllColumns&lt;/td>
&lt;td>alpha-1.5.0&lt;/td>
&lt;td>20&lt;/td>
&lt;td align="right">89.56 ms&lt;/td>
&lt;td align="right">-17%&lt;/td>
&lt;td align="right">11.45 MB&lt;/td>
&lt;td align="right">-1%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>ReadAllColumns&lt;/td>
&lt;td>stable-1.4.4&lt;/td>
&lt;td>20&lt;/td>
&lt;td align="right">107.50 ms&lt;/td>
&lt;td align="right">baseline&lt;/td>
&lt;td align="right">11.56 MB&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Allocations are nearly identical because both versions allocate the same &lt;code>string&lt;/code> objects for VARCHAR columns. The speedup comes from &lt;code>LibraryImport&lt;/code>, &lt;code>SuppressGCTransition&lt;/code>, and &lt;code>AggressiveInlining&lt;/code> working together on the hot read path.&lt;/p>
&lt;h3 id="appender-20-40-faster-22-less-memory">Appender: 20-40% Faster, 22% Less Memory&lt;/h3>
&lt;p>The appender benchmark creates rows with 8 columns (INT, VARCHAR, DOUBLE, BOOLEAN, DECIMAL, TIMESTAMP, BIGINT, VARCHAR) using the &lt;code>CreateRow().AppendValue(...).EndRow()&lt;/code> API.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Method&lt;/th>
&lt;th>Job&lt;/th>
&lt;th>RowCount&lt;/th>
&lt;th align="right">Mean&lt;/th>
&lt;th align="right">Ratio&lt;/th>
&lt;th align="right">Allocated&lt;/th>
&lt;th align="right">Alloc Ratio&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>AppendRows&lt;/td>
&lt;td>alpha-1.5.0&lt;/td>
&lt;td>10,000&lt;/td>
&lt;td align="right">19.58 ms&lt;/td>
&lt;td align="right">-20%&lt;/td>
&lt;td align="right">8.85 MB&lt;/td>
&lt;td align="right">-22%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AppendRows&lt;/td>
&lt;td>stable-1.4.4&lt;/td>
&lt;td>10,000&lt;/td>
&lt;td align="right">24.41 ms&lt;/td>
&lt;td align="right">baseline&lt;/td>
&lt;td align="right">11.36 MB&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AppendRows&lt;/td>
&lt;td>alpha-1.5.0&lt;/td>
&lt;td>100,000&lt;/td>
&lt;td align="right">86.90 ms&lt;/td>
&lt;td align="right">-41%&lt;/td>
&lt;td align="right">89.20 MB&lt;/td>
&lt;td align="right">-22%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AppendRows&lt;/td>
&lt;td>stable-1.4.4&lt;/td>
&lt;td>100,000&lt;/td>
&lt;td align="right">147.46 ms&lt;/td>
&lt;td align="right">baseline&lt;/td>
&lt;td align="right">114.38 MB&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AppendRows&lt;/td>
&lt;td>alpha-1.5.0&lt;/td>
&lt;td>1,000,000&lt;/td>
&lt;td align="right">803.55 ms&lt;/td>
&lt;td align="right">-40%&lt;/td>
&lt;td align="right">899.61 MB&lt;/td>
&lt;td align="right">-22%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AppendRows&lt;/td>
&lt;td>stable-1.4.4&lt;/td>
&lt;td>1,000,000&lt;/td>
&lt;td align="right">1,349.50 ms&lt;/td>
&lt;td align="right">baseline&lt;/td>
&lt;td align="right">1151.38 MB&lt;/td>
&lt;td align="right">&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The consistent 22% allocation reduction is the boxing elimination at work. At 1 million rows with 6 value-type columns each, that&amp;rsquo;s roughly 6 million boxing allocations removed. The speed gain of about 40% at 100,000 rows and above comes from boxing elimination, reduced GC pressure from fewer allocations, and &lt;code>AggressiveInlining&lt;/code> on the write path.&lt;/p>
&lt;h2 id="summary">Summary&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Area&lt;/th>
&lt;th>Speedup&lt;/th>
&lt;th>Memory Reduction&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Reader (mixed types)&lt;/td>
&lt;td>17-22%&lt;/td>
&lt;td>~1%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Appender (8 columns)&lt;/td>
&lt;td>20-40%&lt;/td>
&lt;td>22%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>These improvements require no code changes - just update the DuckDB.NET.Data package to 1.5.0 and the optimizations apply automatically.&lt;/p>
&lt;p>In the next post, I&amp;rsquo;ll cover the API improvements in 1.5.0 - including a simplified API for scalar and table user-defined functions.&lt;/p></description></item></channel></rss>