Show HN: Flint – A 30B model fine-tuned for less repetition
19 hours ago
- #AI Creativity
- #Model Diversity
- #Divergent Thinking
- Flint is a language model designed specifically for inspiration and creative tasks, not for giving correct answers.
- It targets output convergence in frontier LLMs, whose responses tend to be repetitive and alike, hindering divergent thinking.
- Flint achieves a dramatic increase in output diversity on creative tasks without degrading performance in other areas like accuracy or responsible AI benchmarks.
- On NoveltyBench, Flint produces a mean of 7.47 distinct responses out of 10, far ahead of SOTA models such as Gemini 3.1 Pro (3.19), GPT-5.4 (2.54), and Claude 4.6 Sonnet (1.83).
- In intra-model similarity tests, Flint has a mean similarity of 0.721 (lower is better), versus GPT-5.4 at 0.864, Gemini 3.1 Pro at 0.871, and Claude 4.6 Sonnet at 0.905.
- Flint also shows lower inter-model similarity (0.672) than other models, making it the most distinctive in the comparison.
- It maintains performance on benchmarks like MMLU-STEM (78.9%), TruthfulQA MC1 (34.4%), and ToxiGen (59.6%), showing that divergence tuning does not compromise capability.
- Flint is not a replacement for frontier models but a multiplier in creative workflows: it generates range while larger models provide depth and reasoning, and humans apply judgment.
- The model is available in alpha version via the Springboards app.
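The intra-model similarity scores above measure how alike a model's repeated samples for the same prompt are. The post does not say which similarity function the benchmark uses (likely embedding-based), so as a purely illustrative sketch, here is the idea using Python's stdlib `difflib.SequenceMatcher` as a stand-in metric:

```python
from itertools import combinations
from difflib import SequenceMatcher

def mean_pairwise_similarity(responses):
    """Mean similarity over all unordered pairs of responses.

    Lower values mean more diverse output. SequenceMatcher is a
    surface-level stand-in here; real benchmarks typically compare
    embeddings instead.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

# Hypothetical samples: a near-duplicate set vs. a varied set.
repetitive = [
    "The sun set over the hills.",
    "The sun set over the hill.",
    "The sun set over the hills!",
]
varied = [
    "A clock ticking backwards.",
    "Salt on a winter road.",
    "The hum of a server farm at dusk.",
]

# A model that converges scores higher (worse) than one that diversifies.
assert mean_pairwise_similarity(repetitive) > mean_pairwise_similarity(varied)
```

The same pairwise-mean construction extends to inter-model similarity by pairing samples drawn from two different models rather than from one.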