Hasty Briefs (beta)

LightlyStudio – an open-source multimodal data curation and labeling tool

3 days ago
  • #data-annotation
  • #machine-learning
  • #open-source
  • LightlyStudio is an open-source tool for data curation, annotation, and management.
  • Built with Rust for performance, it can handle datasets such as COCO and ImageNet on a MacBook Pro with an M1 chip and 16 GB of RAM.
  • Compatible with Python 3.8+ on Windows, Linux, and macOS.
  • Install via pip: `pip install lightly-studio`.
  • Example datasets can be downloaded from a GitHub repository, or you can use your own YOLO or COCO dataset.
  • Includes examples for image-only datasets, YOLO object detection, COCO instance segmentation, and COCO captions.
  • LightlyStudio provides a Python interface for dataset indexing, querying, and manipulation.
  • Supports loading data from cloud storage (e.g., S3, GCS) and local folders.
  • Sample attributes include ID, file name, path, tags, and metadata, which can be accessed and modified.
  • Dataset queries allow filtering, sorting, and slicing operations using expressions.
  • Automated data selection, a premium feature, picks the most useful samples based on typicality and diversity.
  • Version 0.4.0 was released as a preview on 2025-10-21.
  • Contributions are welcome; the issues page lists open tasks and improvements.
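
To make the points about sample attributes and dataset queries more concrete, here is a minimal pure-Python sketch of the filter, sort, and slice pattern. It does not use LightlyStudio's actual API: the `Sample` class, the `query` helper, and all file names and metadata values are hypothetical stand-ins chosen for illustration, so consult the project's documentation for the real interface.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Set


@dataclass
class Sample:
    """Hypothetical stand-in for a dataset sample (not LightlyStudio's class)."""
    sample_id: int
    file_name: str
    path: str
    tags: Set[str] = field(default_factory=set)
    metadata: Dict[str, Any] = field(default_factory=dict)


def query(
    samples: List[Sample],
    where: Optional[Callable[[Sample], bool]] = None,
    order_by: Optional[Callable[[Sample], Any]] = None,
    reverse: bool = False,
    start: int = 0,
    stop: Optional[int] = None,
) -> List[Sample]:
    """Hypothetical helper: filter first, then sort, then slice."""
    result = [s for s in samples if where is None or where(s)]
    if order_by is not None:
        result = sorted(result, key=order_by, reverse=reverse)
    return result[start:stop]


# Made-up example data.
samples = [
    Sample(1, "cat.jpg", "/data/cat.jpg", {"train"}, {"width": 640}),
    Sample(2, "dog.jpg", "/data/dog.jpg", {"val"}, {"width": 1280}),
    Sample(3, "bird.jpg", "/data/bird.jpg", {"train"}, {"width": 1920}),
    Sample(4, "fish.jpg", "/data/fish.jpg", {"train"}, {"width": 320}),
]

# Attributes such as tags and metadata can be read and modified in place.
samples[1].tags.add("needs-review")
samples[1].metadata["reviewed"] = False

# Keep "train" samples wider than 500 px, widest first, and take the top two.
selected = query(
    samples,
    where=lambda s: "train" in s.tags and s.metadata["width"] > 500,
    order_by=lambda s: s.metadata["width"],
    reverse=True,
    stop=2,
)
selected_names = [s.file_name for s in selected]
print(selected_names)
```

Expressing the filter and the sort key as plain callables keeps the sketch short; a real query layer would more likely build expressions that can be pushed down to an index rather than evaluated sample by sample.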