Show HN: Cordon – Reduce large log files to anomalous sections
4 days ago
- #anomaly-detection
- #log-analysis
- #machine-learning
- Cordon uses transformer-based embeddings and density-based scoring for semantic anomaly detection in log files.
- Key principle: Repetitive patterns are considered normal; unusual, rare, or clustered events are highlighted.
- Features include semantic analysis, density-based scoring, noise reduction, and multiple backends (sentence-transformers or llama.cpp).
- GPU acceleration requires NVIDIA GPUs with Pascal architecture or newer; CPU mode is always available.
- Installation options include pip, uv, and cloning the repository for development.
- Basic usage involves running Cordon on log files with options for window size, k-neighbors, and anomaly percentile.
- Advanced configurations allow for GPU acceleration, anomaly range filtering, and detailed output.
- Cordon reduces large log files to their semantically significant sections, with reductions of up to 98% in some cases.
- Workflow includes ingestion, segmentation, vectorization, scoring, thresholding, merging, and formatting.
- Parameters like window_size, k_neighbors, and anomaly_percentile can be adjusted for different log types.
- Use cases include LLM pre-processing, initial triage, anomaly detection, and exploratory analysis.
- GPU acceleration provides significant speedups for large log files, with PyTorch used for k-NN scoring.
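The core scoring idea above (repetitive patterns score low, rare ones score high) can be sketched as follows. This is an illustrative reconstruction, not Cordon's actual code: `knn_density_scores` and `anomalous_windows` are hypothetical names, plain NumPy stands in for the PyTorch k-NN step, and random vectors stand in for the transformer embeddings.

```python
import numpy as np

def knn_density_scores(embeddings: np.ndarray, k: int = 3) -> np.ndarray:
    """Score each log window by its mean distance to its k nearest
    neighbours: dense (repetitive) regions score low, isolated
    (rare/unusual) regions score high."""
    # Pairwise Euclidean distances between all window embeddings.
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # exclude self-distance
    # Mean distance to the k nearest neighbours is the anomaly score.
    return np.sort(dists, axis=1)[:, :k].mean(axis=1)

def anomalous_windows(scores: np.ndarray, percentile: float = 90.0) -> list[int]:
    """Keep only the windows whose score exceeds the given percentile,
    mirroring the anomaly_percentile parameter described above."""
    threshold = np.percentile(scores, percentile)
    return [i for i, s in enumerate(scores) if s > threshold]

# Toy data: nine near-identical windows plus one clear outlier.
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 0.01, (9, 4)), [[5.0, 5.0, 5.0, 5.0]]])
flagged = anomalous_windows(knn_density_scores(emb, k=3), percentile=90.0)
```

Raising `k` smooths the scores (a window needs more nearby look-alikes to count as normal), while raising the percentile tightens the cutoff and reports fewer sections.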
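The merging step in the workflow (adjacent flagged windows joined into one reported section) might look like the minimal sketch below. `merge_windows` is a hypothetical helper, and the half-open line ranges are an assumption for illustration, not Cordon's actual output format.

```python
def merge_windows(indices: list[int], window_size: int,
                  gap: int = 0) -> list[tuple[int, int]]:
    """Merge flagged window indices into contiguous half-open
    (start_line, end_line) sections; windows within `gap` lines of
    the previous section are joined into it."""
    sections: list[tuple[int, int]] = []
    for i in sorted(indices):
        start, end = i, i + window_size  # lines covered by this window
        if sections and start <= sections[-1][1] + gap:
            # Overlaps or touches the previous section: extend it.
            sections[-1] = (sections[-1][0], max(sections[-1][1], end))
        else:
            sections.append((start, end))
    return sections

# Windows 0 and 1 overlap and fuse; window 8 stands alone.
sections = merge_windows([0, 1, 8], window_size=3)
```

Merging is what turns scattered per-window hits into the few readable "anomalous sections" the tool reports, and is a large part of the noise reduction mentioned above.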