
DuckDB is an in-process SQL OLAP database management system

Why DuckDB?



Simple and portable

In-process, serverless
C++11, no dependencies, single-file build
APIs for Python, R, Java, Julia, Swift, …
Runs on Windows, Linux, macOS, OpenBSD, …



Feature-rich

Transactions, persistence
Extensive SQL support
Direct Parquet, CSV, and JSON querying
Joins, aggregates, window functions



Fast

Optimized for analytics
Vectorized and parallel engine
Larger-than-memory processing
Parallel Parquet, CSV, and NDJSON loaders



Free and extensible

Free & open-source
Permissive MIT License
Flexible extension mechanism

Installation

Choose the environment in which you want to use DuckDB

Command Line
Python
R
Java
node.js
Julia
C++
ODBC

https://github.com/duckdb/duckdb/releases/download/v0.8.1/duckdb_cli-windows-amd64.zip

Latest release: DuckDB 0.8.1

When to use DuckDB



Processing and storing tabular datasets, e.g. from CSV or Parquet files
Interactive data analysis, e.g. joining and aggregating multiple large tables
Concurrent large changes to multiple large tables, e.g. appending rows or adding, removing, and updating columns
Large result set transfer to client

When not to use DuckDB



High-volume transactional use cases (e.g. tracking orders in a webshop)
Large client/server installations for centralized enterprise data warehousing
Writing to a single database from multiple concurrent processes
Multiple concurrent processes reading from a single writable database

Blog

DuckDB ADBC - Zero-Copy data transfer via Arrow Database Connectivity

TLDR: DuckDB has added support for Arrow Database Connectivity (ADBC), an API standard that enables efficient data ingestion and retrieval from database systems, similar to the Open Database Connectivity (ODBC) interface. However, unlike ODBC, ADBC specifically caters to the columnar storage model, facilitating fast data transfers between a columnar database and […]

2023-07-07

From Waddle to Flying: Quickly expanding DuckDB's functionality with Scalar Python UDFs

TLDR: DuckDB now supports vectorized Scalar Python User Defined Functions (UDFs). By implementing Python UDFs, users can easily expand the functionality of DuckDB while taking advantage of DuckDB’s fast execution model, SQL and data safety. User Defined Functions (UDFs) enable users to extend the functionality of a Database Management System […]

2023-05-26

Correlated Subqueries in SQL

Subqueries in SQL are a powerful abstraction that allows simple queries to be used as composable building blocks. They allow you to break down complex problems into smaller parts, which makes it easier to write, understand, and maintain large and complex queries. DuckDB uses a state-of-the-art subquery decorrelation optimizer […]
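
To make the idea concrete, here is a small sketch of a correlated subquery: the inner query refers to the current row of the outer query. It runs on Python's built-in `sqlite3` module only so that it works without installing anything. The SQL is standard and not DuckDB-specific, and the table and values are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
con.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("ann", "eng", 120), ("bo", "eng", 100), ("cy", "ops", 80), ("di", "ops", 90)],
)

# The subquery is correlated: it references e.dept from the outer query,
# so the average is computed per department of the row being tested.
query = """
SELECT name, dept, salary
FROM employees AS e
WHERE salary > (
    SELECT avg(salary)
    FROM employees AS inner_e
    WHERE inner_e.dept = e.dept
)
ORDER BY name
"""
above_avg = con.execute(query).fetchall()
print(above_avg)
```

Evaluated naively, the inner query would run once per outer row. A decorrelation optimizer rewrites such queries into joins so that this repeated work is avoided.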
