WebAssembly (Wasm) has transformed what browsers can do, enabling high-performance applications that need nothing beyond the browser itself. DuckDB can also run in the browser via Wasm, which opens up numerous possibilities. In this post, we'll explore various use cases of DuckDB in the browser and introduce a fun, practical example that you can try yourself, complete with source code.

Why Wasm?

Wasm is gaining traction in web development. Popular applications like Figma use Wasm to run complex software written in languages such as C++ or Rust directly in the browser. This allows for fast, lightweight applications that are easy to deploy. As browsers become more capable, even using WebGPU to harness GPU power directly, scenarios such as training machine learning models locally from nothing more than a browser link are becoming feasible, eliminating setup hassles. An exciting project in the Wasm ecosystem is Pyodide, which ports CPython to WebAssembly and offers a full Python environment in your browser from just a URL, minimizing reliance on cloud resources. Check out the Pyodide REPL here.

Current Uses of DuckDB Wasm

DuckDB, an embedded database written in C++, is well suited to Wasm. It has been compiled to WebAssembly, allowing it to run inside any browser. You can experience this here by running DuckDB directly in your browser.

DuckDB Wasm is particularly useful in user interfaces that need lightweight analytic operations, because running queries on the client reduces network traffic. Here are some common scenarios:

- Ad-hoc queries on data lakes, such as schema exploration or data previews.
- Dynamic querying in dashboards by adjusting filters on the fly.
- Educational tools for SQL learning or in-browser SQL IDEs.

For example, lakeFS has integrated DuckDB Wasm for ad-hoc queries within their Web UI. Similarly, companies like Evidence and Count leverage DuckDB Wasm to enhance performance.

[Image: Running DuckDB, embedded in the lakeFS UI]
[Image: Universal SQL Architecture from Evidence: Data -> Storage -> DuckDB Wasm -> Components]
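To make the dashboard scenario mentioned above a bit more concrete, here is a minimal sketch of how filter state might be turned into a parameterized query before it is handed to DuckDB Wasm. The helper name `buildFilteredQuery`, the `sales` table, and the column allowlist are all hypothetical and chosen only for illustration; they are not taken from any of the products mentioned.

```javascript
// Hypothetical helper: turn dashboard filter state into a parameterized query.
// Column names are illustrative, not taken from any real dashboard.
const ALLOWED_COLUMNS = new Set(["country", "year", "category"]);

function buildFilteredQuery(table, filters) {
  const clauses = [];
  const params = [];
  for (const [column, value] of Object.entries(filters)) {
    // Identifiers cannot be bound as parameters, so check them against an allowlist.
    if (!ALLOWED_COLUMNS.has(column)) {
      throw new Error(`Unknown filter column: ${column}`);
    }
    // Skip filters the user has cleared.
    if (value === null || value === undefined || value === "") continue;
    clauses.push(`${column} = ?`);
    params.push(value);
  }
  const where = clauses.length > 0 ? ` WHERE ${clauses.join(" AND ")}` : "";
  // The table name is assumed to come from trusted application code.
  return { sql: `SELECT * FROM ${table}${where}`, params };
}

const { sql, params } = buildFilteredQuery("sales", {
  country: "FR",
  year: 2023,
  category: "", // cleared filter, ignored
});
console.log(sql);    // SELECT * FROM sales WHERE country = ? AND year = ?
console.log(params); // the values "FR" and 2023, in that order
```

Each time a filter changes, the resulting `sql` and `params` could be run locally, for instance through a DuckDB Wasm prepared statement, so no round trip to a server is needed. How the statement is actually executed depends on how the application sets up its connection.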

DuckDB Wasm as a Firefox extension

When browsing object storage (whether AWS S3, GCP Cloud Storage, or Azure Blob Storage), it is common to want to quickly inspect a file or its schema, either for debugging or to preview a sample of the data. In this small project, we created a Firefox extension that displays the schema of Parquet files when you hover your mouse over them in GCP Cloud Storage. Here's a short video demo.

The internals are pretty simple: with DuckDB Wasm, we can run a query directly on the client side that reads the metadata of the remote Parquet file, and then display it. Let's look at the main components of the Firefox extension code, written in JavaScript.

We instantiate the database:

    // `duckdb` and `bundle` are set up elsewhere in the extension.
    async function makeDB() {
      const logger = new duckdb.ConsoleLogger();
      const worker = await duckdb.createWorker(bundle.mainWorker);
      const db = new duckdb.AsyncDuckDB(logger, worker);
      await db.instantiate(bundle.mainModule);
      return db;
    }

Create a function to handle query results:

    // `conn` is a connection opened elsewhere on the database created above.
    async function query(sql) {
      const q = await conn.query(sql);
      // Convert each result row into a plain object.
      const rows = q.toArray().map(Object.fromEntries);
      rows.columns = q.schema.fields.map((d) => d.name);
      return rows;
    }

And finally a function to handle hover events:

    async function hover(request, sender, sendResponse) {
      const fileName = request['filename'];
      const url = sender.url;
      // Everything between '/storage/browser/' and the first ';' in the page URL.
      const bucketName = url.split('/storage/browser/')[1].split(';')[0];
      const filePath = `s3://${bucketName}/${fileName}`;
      console.log(filePath);
      const schema = await query(
        `SELECT path_in_schema AS column_name, type FROM parquet_metadata('${filePath}');`
      );
      return Promise.resolve({ schema });
    }

As you can see, we use the parquet_metadata() function to retrieve the Parquet schema. After that, what is left is to define the handler and the panel displayed.
Check out the complete extension code here, and watch our full livestream with Christophe Blefari discussing DuckDB Wasm and this project.
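The path handling inside the hover function can also be pulled out into small pure functions, which makes it easier to test without a browser. The sketch below is an illustration only: `storagePrefixFromUrl` and `buildSchemaQuery` are hypothetical names that do not appear in the extension, the example URL shape is assumed from the split logic in the hover function, and the quote escaping is an addition that the original snippet does not perform.

```javascript
// Hypothetical helpers mirroring the path handling in the hover function.

// Return the part of a Cloud Storage console URL between '/storage/browser/'
// and the first ';' or '?', or null when the URL does not have that shape.
function storagePrefixFromUrl(url) {
  const marker = "/storage/browser/";
  const start = url.indexOf(marker);
  if (start === -1) return null;
  const prefix = url.slice(start + marker.length).split(/[;?]/)[0];
  return prefix.length > 0 ? prefix : null;
}

// Build the metadata query, doubling single quotes so the path
// stays inside the SQL string literal.
function buildSchemaQuery(filePath) {
  const escaped = filePath.replace(/'/g, "''");
  return `SELECT path_in_schema AS column_name, type FROM parquet_metadata('${escaped}');`;
}

// Example with a made-up bucket and file name.
const url = "https://console.cloud.google.com/storage/browser/my-bucket;tab=objects?project=demo";
const prefix = storagePrefixFromUrl(url);
const filePath = `s3://${prefix}/data.parquet`;
console.log(buildSchemaQuery(filePath));
```

Returning null for an unexpected URL, instead of letting the chained split calls throw, gives the caller a chance to skip the lookup quietly when the extension runs on a page it does not recognize.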

What about MotherDuck?

The MotherDuck UI uses DuckDB Wasm to keep querying responsive, especially when manipulating data that is already loaded locally. In that case there is no need to communicate with the cloud, and both data and compute remain on your local machine. We've also launched our Wasm SDK to enable developers to create data-driven applications using Wasm, powered by DuckDB and MotherDuck.