Preventing Flash of Incomplete Markdown when streaming AI responses
a year ago
- #AI
- #Markdown
- #Streaming
- Flash of Incomplete Markdown (FOIM) occurs when a streamed AI response is rendered before a Markdown construct, such as a link, has fully arrived. It is analogous to Flash of Unstyled Content (FOUC).
- Streak ran into two problems in AI-generated responses: FOIM, and URLs hallucinated by the model that produced incorrect links.
- Solution: Use short, Wikipedia-style reference links (e.g., [1](#REF3)) to reduce token usage and prevent hallucinations.
- Implemented a state machine to buffer and replace short URLs with full URLs server-side before streaming to the client.
- Benefits include fewer tokens, faster responses, no link hallucinations, and improved privacy by not transmitting URLs to OpenAI.
- Markdown link handling requires careful state management to support the various link formats and to avoid flashing incomplete Markdown.
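
The buffering state machine described above could look roughly like the following. This is a minimal sketch, not Streak's actual implementation: the `LinkRewriter` class, its state names, and the `refs` mapping from short references to full URLs are all hypothetical, and it handles only inline links of the form `[label](target)`. Text outside a link passes straight through, while anything from an opening `[` onward is held back until the link either completes (and is rewritten) or turns out not to be a link (and is released unchanged).

```python
from typing import Dict, List

# Parser states (hypothetical names, chosen for this sketch).
TEXT, LABEL, AFTER_LABEL, TARGET = range(4)


class LinkRewriter:
    """Buffers partial Markdown links and swaps short refs for full URLs."""

    def __init__(self, refs: Dict[str, str]) -> None:
        self.refs = refs  # assumed mapping, e.g. {"#REF3": "https://..."}
        self._reset()

    def _reset(self) -> None:
        self.state = TEXT
        self.buf = ""     # raw text held back while a link may be forming
        self.label = ""
        self.target = ""

    def _give_up(self, out: List[str]) -> None:
        # The buffered text is not a link after all: release it unchanged.
        out.append(self.buf)
        self._reset()

    def _step(self, ch: str, out: List[str]) -> None:
        if self.state == TEXT:
            if ch == "[":
                self.state, self.buf = LABEL, ch
            else:
                out.append(ch)
        elif self.state == LABEL:
            if ch == "[" or ch == "\n":
                self._give_up(out)
                self._step(ch, out)  # reprocess the character as plain text
            else:
                self.buf += ch
                if ch == "]":
                    self.state = AFTER_LABEL
                else:
                    self.label += ch
        elif self.state == AFTER_LABEL:
            if ch == "(":
                self.buf += ch
                self.state = TARGET
            else:
                self._give_up(out)
                self._step(ch, out)
        else:  # TARGET
            if ch == ")":
                # Unknown targets are passed through unchanged in this sketch.
                url = self.refs.get(self.target, self.target)
                out.append(f"[{self.label}]({url})")
                self._reset()
            elif ch == "\n":
                self._give_up(out)
                self._step(ch, out)
            else:
                self.buf += ch
                self.target += ch

    def feed(self, chunk: str) -> str:
        """Consume one streamed chunk; return the text that is safe to send."""
        out: List[str] = []
        for ch in chunk:
            self._step(ch, out)
        return "".join(out)

    def flush(self) -> str:
        """Call at end of stream to release any text still held back."""
        out, _ = self.buf, self._reset()
        return out


rewriter = LinkRewriter({"#REF3": "https://example.com/docs/page"})
for chunk in ["See [1", "](#RE", "F3) for details."]:
    print(repr(rewriter.feed(chunk)))
print(repr(rewriter.flush()))
```

In this sketch the client never sees a half-written link: the second chunk yields an empty string because the link is still incomplete, and the full link appears only once the closing `)` arrives. A production version would also need to decide what to do with unknown references (for example, dropping the link and keeping only the label) and to cover the other link formats the last bullet alludes to.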