If you are a data scientist or do anything with data, DuckDB is like a Swiss Army knife. There are so many ways it can help your workflow. The original video from CMU in 2020 [1] is a classic. Minutes 3-8 present a good argument for adding DuckDB to your data cleaning/processing workflow.
What you may be remembering were reports of exceptional cases where it didn't handle out-of-memory errors well. I was one of the people affected. I was running complex analytic queries on 400 GB of Parquet files and I only had 128 GB of memory. It used jemalloc, which didn't degrade gracefully. They fixed a lot of the OOM issues, so it's more robust now. I haven't had a crash for a long time.
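For anyone running into the same larger-than-memory situation, here is a rough sketch of the settings I would try first: cap DuckDB's memory and give it a directory to spill to. The setting names (`memory_limit`, `temp_directory`, `threads`, `preserve_insertion_order`) are real DuckDB options; the helper names, the 96 GB cap, and the spill path are made up for illustration, not something from the reports above.

```python
def memory_settings(limit_gb: int, spill_dir: str, threads: int) -> list[str]:
    """Build DuckDB SET statements that cap memory and allow spilling to disk."""
    if limit_gb <= 0 or threads <= 0:
        raise ValueError("limit_gb and threads must be positive")
    safe_dir = spill_dir.replace("'", "''")  # escape quotes for the SQL literal
    return [
        f"SET memory_limit = '{limit_gb}GB'",    # hard cap on memory use
        f"SET temp_directory = '{safe_dir}'",    # where spilled data goes
        f"SET threads = {threads}",              # fewer threads, less memory in flight
        "SET preserve_insertion_order = false",  # lets DuckDB buffer less
    ]


def connect_with_limits(limit_gb: int, spill_dir: str, threads: int):
    """Open an in-memory DuckDB connection with the settings above applied."""
    import duckdb  # imported here so the helper above has no dependencies

    con = duckdb.connect()
    for statement in memory_settings(limit_gb, spill_dir, threads):
        con.execute(statement)
    return con
```

Something like `connect_with_limits(96, "/mnt/scratch/duckdb_spill", 8)` leaves headroom on a 128 GB box. It won't rescue every query, but it tends to turn a hard crash into a slower query or a clean error.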
I'm currently using it to provide a SQL interface for data analytics for customers whose data isn't large enough to justify a big platform like Snowflake or Databricks. Its syntax is close enough to the other two that, especially since the logic is already normalized through abstracted query definitions, it's a drop-in replacement.
Given that it's so lightweight, I can use it to run searches in an AWS Lambda function, which is crazy useful.
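To give a flavor of the Lambda setup, here is a minimal sketch, not my actual code. It assumes the `duckdb` package is bundled with the function and that a Parquet file sits at a hypothetical path; the `name` column, the `q`/`limit` event fields, and the 100-row cap are also invented for the example.

```python
import json

# Hypothetical data location; /tmp is the writable directory on Lambda.
PARQUET_PATH = "/tmp/events.parquet"
MAX_ROWS = 100


def build_search(event: dict) -> tuple[str, list]:
    """Turn a request into a parameterized query.

    The search term is passed as a bound parameter, never pasted into the SQL.
    The limit is clamped to an integer range before being formatted in.
    """
    term = str(event.get("q", ""))
    limit = max(1, min(int(event.get("limit", 20)), MAX_ROWS))
    sql = (
        f"SELECT * FROM read_parquet('{PARQUET_PATH}') "
        f"WHERE name ILIKE ? LIMIT {limit}"
    )
    return sql, [f"%{term}%"]


def handler(event, context):
    """Lambda entry point: run the search in a throwaway in-memory database."""
    import duckdb  # assumed to be packaged with the function

    sql, params = build_search(event)
    con = duckdb.connect()  # in-memory, nothing to provision
    try:
        cursor = con.execute(sql, params)
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        con.close()
    body = [dict(zip(columns, row)) for row in rows]
    return {"statusCode": 200, "body": json.dumps(body, default=str)}
```

The nice part is that there is no server to keep warm: each invocation opens an in-memory database, scans the file, and exits.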
And if you want to add a semantic layer on top of data, Malloy [2] is my favorite so far (it has duckdb built in):
[1]: https://www.youtube.com/watch?v=PFUZlNQIndo
[2]: https://docs.malloydata.dev/documentation/