Hasty Briefs (beta)

ThalamusDB: Query text, tables, images, and audio

4 days ago
  • #SQL
  • #multimodal-data
  • #approximate-processing
  • ThalamusDB is an approximate processing engine supporting SQL queries with semantic operators on multimodal data.
  • Install ThalamusDB using pip: `pip install thalamusdb`.
  • Set environment variables for API keys, e.g., `export OPENAI_API_KEY=[Your Key]`.
  • Run the ThalamusDB console with a DuckDB database file and a model configuration file.
  • Example database `cars.db` contains a table with text descriptions and image paths.
  • Supports semantic predicates inside SQL queries, such as `nlfilter(pic, 'the car in the picture is red')`.
  • Works with text, images, and audio files stored as paths in text columns.
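  • Supports two semantic filter operators: `NLfilter`, which filters rows by a natural-language condition, and `NLjoin`, which filters row pairs by a natural-language join condition.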
  • Model configuration file specifies models for different data types and operators.
  • Designed for approximate processing: while a query runs, it displays bounds on the result for aggregation queries and, for retrieval queries, the rows in the intersection of all possible results.
  • Error bounds help track progress toward exact results.
  • Configurable stopping criteria include maximum time, number of LLM calls, number of tokens, and an error threshold.
  • Documentation and an example are available on GitHub and Google Colab.
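
To make the two semantic operators concrete, here is a minimal pure-Python sketch of what a row filter and a row-pair filter compute. It is not ThalamusDB's implementation or API: the functions `nl_filter` and `nl_join`, the stub judges, and the `SEEN_IN_IMAGE` lookup are all hypothetical stand-ins for real model calls, and the sample rows are made up.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]

# Made-up "model knowledge": the color a vision model might report per image path.
SEEN_IN_IMAGE = {"img/a.jpg": "red", "img/b.jpg": "blue", "img/c.jpg": "red"}


def stub_filter_judge(path: str, condition: str) -> bool:
    """Stand-in for a model call: true if the image's color is named in the condition."""
    return SEEN_IN_IMAGE[path] in condition.lower().split()


def stub_join_judge(text: str, path: str, condition: str) -> bool:
    """Stand-in for a model call on a pair: true if the text names the image's color."""
    return SEEN_IN_IMAGE[path] in text.lower().split()


def nl_filter(rows: List[Row], column: str, condition: str,
              judge: Callable[[str, str], bool]) -> List[Row]:
    """Keep the rows whose column value satisfies the condition (one judge call per row)."""
    return [row for row in rows if judge(row[column], condition)]


def nl_join(left: List[Row], right: List[Row], left_col: str, right_col: str,
            condition: str, judge: Callable[[str, str, str], bool]) -> List[Tuple[Row, Row]]:
    """Keep the row pairs that satisfy the condition (one judge call per pair)."""
    return [(l, r) for l in left for r in right
            if judge(l[left_col], r[right_col], condition)]


# Image paths live in an ordinary text column, as in the summary above.
cars = [{"id": 1, "pic": "img/a.jpg"},
        {"id": 2, "pic": "img/b.jpg"},
        {"id": 3, "pic": "img/c.jpg"}]
ads = [{"ad": "Red coupe for sale"}, {"ad": "Blue pickup, low mileage"}]

red_cars = nl_filter(cars, "pic", "the car in the picture is red", stub_filter_judge)
pairs = nl_join(ads, cars, "ad", "pic",
                "the ad describes the car in the picture", stub_join_judge)
```

The sketch shows why the join is the more expensive operator: the filter needs one judgment per row, while the naive join needs one per pair of rows.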
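
The approximate-processing bullets can be illustrated with a small sketch of how bounds on a count tighten as rows are checked. It is a simplified illustration under stated assumptions, not ThalamusDB's algorithm: `approximate_count`, `Snapshot`, and the `judge` callback are hypothetical names, each judge call stands for one LLM call, and only two of the stopping criteria (a call budget and an error threshold) are modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Snapshot:
    processed: int        # rows judged so far (one simulated LLM call each)
    lower: int            # rows confirmed to match
    upper: int            # confirmed rows plus rows not yet judged
    confirmed: List[str]  # rows that belong to every possible final result


def approximate_count(items: Sequence[str], judge: Callable[[str], bool],
                      max_calls: int, max_error: int) -> List[Snapshot]:
    """Count matching items, recording bounds after each call until a criterion is met."""
    snapshots = [Snapshot(0, 0, len(items), [])]
    confirmed: List[str] = []
    for calls, item in enumerate(items, start=1):
        last = snapshots[-1]
        # Stop when the call budget is spent or the bounds are tight enough.
        if calls > max_calls or last.upper - last.lower <= max_error:
            break
        if judge(item):
            confirmed.append(item)
        remaining = len(items) - calls
        snapshots.append(Snapshot(calls, len(confirmed),
                                  len(confirmed) + remaining, list(confirmed)))
    return snapshots


colors = ["red", "blue", "red", "green", "red"]


def is_red(color: str) -> bool:
    return color == "red"


budget_limited = approximate_count(colors, is_red, max_calls=3, max_error=0)
exact = approximate_count(colors, is_red, max_calls=10, max_error=0)
```

With a budget of three calls the run ends with a count somewhere between the last snapshot's `lower` and `upper`; with enough calls the two bounds meet and the result is exact. The `confirmed` list plays the role of the intersection rows for a retrieval query: every row in it is part of the answer no matter how the unchecked rows turn out.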