Show HN: Flint – A 30B model fine-tuned for less repetition
19 hours ago
- #AI Creativity
- #Model Diversity
- #Divergent Thinking
- Flint is a language model designed specifically for inspiration and creative tasks, not for giving correct answers.
- It targets output convergence in frontier LLMs, whose responses tend to be repetitive and alike, hindering divergent thinking.
- Flint achieves a dramatic increase in output diversity on creative tasks without degrading performance in other areas like accuracy or responsible AI benchmarks.
- On NoveltyBench, Flint produces a mean of 7.47 distinct responses out of 10, far ahead of SOTA models such as Gemini 3.1 Pro (3.19), GPT-5.4 (2.54), and Claude 4.6 Sonnet (1.83).
- In intra-model similarity tests, Flint has a mean similarity of 0.721 (lower is better), versus GPT-5.4 at 0.864, Gemini 3.1 Pro at 0.871, and Claude 4.6 Sonnet at 0.905.
- Flint also shows lower inter-model similarity (0.672) than other models, making it the most distinctive in the comparison.
- It maintains performance on benchmarks like MMLU-STEM (78.9%), TruthfulQA MC1 (34.4%), and ToxiGen (59.6%), showing that divergence tuning does not compromise capability.
- Flint is not a replacement for frontier models but a multiplier in creative workflows: it generates range while larger models provide depth and reasoning, and humans apply judgment.
- The model is available in alpha version via the Springboards app.
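The intra-model similarity scores above measure how alike a model's repeated samples for the same prompt are. The post does not say which similarity function the benchmark uses (likely embedding-based), so as a purely illustrative sketch, here is the idea using Python's stdlib `difflib.SequenceMatcher` as a stand-in metric:

```python
from itertools import combinations
from difflib import SequenceMatcher

def mean_pairwise_similarity(responses):
    """Mean similarity over all unordered pairs of responses.

    Lower values mean more diverse output. SequenceMatcher is a
    surface-level stand-in here; real benchmarks typically compare
    embeddings instead.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

# Hypothetical samples: a near-duplicate set vs. a varied set.
repetitive = [
    "The sun set over the hills.",
    "The sun set over the hill.",
    "The sun set over the hills!",
]
varied = [
    "A clock ticking backwards.",
    "Salt on a winter road.",
    "The hum of a server farm at dusk.",
]

# A model that converges scores higher (worse) than one that diversifies.
assert mean_pairwise_similarity(repetitive) > mean_pairwise_similarity(varied)
```

The same pairwise-mean construction extends to inter-model similarity by pairing samples drawn from two different models rather than from one.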