
Show HN: Cordon – Reduce large log files to anomalous sections

  • #anomaly-detection
  • #log-analysis
  • #machine-learning
  • Cordon uses transformer-based embeddings and density-based scoring for semantic anomaly detection in log files (the embedding and scoring stages are sketched after this list).
  • Key principle: Repetitive patterns are considered normal; unusual, rare, or clustered events are highlighted.
  • Features include semantic analysis, density-based scoring, noise reduction, and multiple backends (sentence-transformers or llama.cpp).
  • GPU acceleration requires NVIDIA GPUs with Pascal architecture or newer; CPU mode is always available.
  • Installation options include pip, uv, and cloning the repository for development.
  • Basic usage involves running Cordon on log files with options for window size, k-neighbors, and anomaly percentile.
  • Advanced configurations allow for GPU acceleration, anomaly range filtering, and detailed output.
  • Cordon reduces large log files to semantically significant sections, achieving up to 98% reduction in some cases.
  • Workflow includes ingestion, segmentation, vectorization, scoring, thresholding, merging, and formatting; the thresholding and merging steps are sketched below.
  • Parameters like window_size, k_neighbors, and anomaly_percentile can be adjusted for different log types (illustrative starting points below).
  • Use cases include LLM pre-processing, initial triage, anomaly detection, and exploratory analysis.
  • GPU acceleration provides significant speedups for large log files, with PyTorch used for k-NN scoring.
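
The bullets above describe a window-then-embed pipeline. Below is a minimal sketch of those two stages using the sentence-transformers backend mentioned in the post; the model name, the fixed window size, and the non-overlapping windows are assumptions for illustration, not Cordon's actual defaults.

```python
from sentence_transformers import SentenceTransformer

def segment(lines, window_size=10):
    """Group consecutive log lines into fixed-size, non-overlapping windows."""
    return ["\n".join(lines[i:i + window_size])
            for i in range(0, len(lines), window_size)]

with open("app.log") as f:            # any large log file
    lines = f.read().splitlines()
windows = segment(lines, window_size=10)

model = SentenceTransformer("all-MiniLM-L6-v2")   # small general-purpose model
# One embedding per window; normalization keeps distances on a common scale.
embeddings = model.encode(windows, normalize_embeddings=True)  # (n_windows, dim)
```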
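Density-based scoring over those embeddings can be expressed in a few lines of PyTorch, which the post says is used for k-NN scoring. The rule here (mean distance to the k nearest neighbors, computed from a full pairwise distance matrix) is one plausible reading of "density-based scoring", not a confirmed implementation detail, and the n×n matrix limits this sketch to modest window counts. It continues from the `embeddings` produced above.

```python
import torch

def knn_scores(emb: torch.Tensor, k_neighbors: int = 5) -> torch.Tensor:
    """Anomaly score per window: mean distance to its k nearest neighbors.

    Windows in dense clusters (repetitive log patterns) score low;
    isolated or rare windows score high.
    """
    dists = torch.cdist(emb, emb)                        # pairwise (n, n)
    # The k+1 smallest distances per row include the zero self-distance.
    nearest, _ = dists.topk(k_neighbors + 1, largest=False)
    return nearest[:, 1:].mean(dim=1)                    # drop self, average

# CPU always works; a CUDA device (Pascal or newer, per the post) just speeds it up.
device = "cuda" if torch.cuda.is_available() else "cpu"
emb = torch.as_tensor(embeddings, dtype=torch.float32, device=device)
scores = knn_scores(emb, k_neighbors=5)
```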
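Thresholding and merging, the next stages in the workflow bullet, might look like the following: keep windows whose score exceeds the anomaly-percentile cutoff, then fuse adjacent survivors into contiguous sections. A 95th-percentile cutoff keeps roughly 5% of windows, which is consistent with the "up to 98% reduction" claim once merging and formatting are applied. The function name and signature are hypothetical; it continues from the `scores` and `windows` above.

```python
import numpy as np

def threshold_and_merge(scores, anomaly_percentile=95):
    """Flag windows above the percentile cutoff; merge adjacent flags."""
    cutoff = np.percentile(scores, anomaly_percentile)
    flagged = [i for i, s in enumerate(scores) if s >= cutoff]
    sections, start, prev = [], None, None
    for i in flagged:
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i                                  # extend current section
        else:
            sections.append((start, prev))            # close previous section
            start = prev = i
    if start is not None:
        sections.append((start, prev))
    return sections      # inclusive (first_window, last_window) index pairs

# Emit only the anomalous sections; everything else is dropped as routine.
for a, b in threshold_and_merge(scores.cpu().numpy(), anomaly_percentile=95):
    print(f"--- windows {a}-{b} ---")
    print("\n".join(windows[a:b + 1]))
```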
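Finally, hypothetical starting points for the tunables from the parameters bullet. The values are assumptions chosen to illustrate the trade-offs (chatty, repetitive logs tolerate larger windows and stricter cutoffs; sparse logs need smaller windows so lone events stand out), not recommendations from the project.

```python
# Hypothetical presets; values are assumptions, not Cordon's documented defaults.
PRESETS = {
    # Chatty, highly repetitive traffic logs: larger windows, stricter cutoff.
    "web_access": {"window_size": 20, "k_neighbors": 8, "anomaly_percentile": 98},
    # Sparse application/debug logs: smaller windows so lone events stand out.
    "app_debug":  {"window_size": 5,  "k_neighbors": 3, "anomaly_percentile": 90},
}

params = PRESETS["app_debug"]   # feed into the sketches above
```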