Step 3.5 Flash: Fast Enough to Think. Reliable Enough to Act

6 days ago
  • #AI
  • #Machine Learning
  • #Foundation Models

  • Step 3.5 Flash is an open-source foundation model with 196B parameters, activating only 11B per token for efficient reasoning and agentic capabilities.
  • Delivers deep reasoning at 100–300 tok/s generation throughput, powered by Multi-Token Prediction (MTP-3).
  • Excels in coding and agentic tasks, scoring 74.4% on SWE-bench Verified and 51.0% on Terminal-Bench 2.0.
  • Supports a cost-efficient 256K context window by interleaving Sliding Window Attention (SWA) layers with full-attention layers at a 3:1 ratio.
  • Optimized for local deployment on high-end consumer hardware like Mac Studio M4 Max and NVIDIA DGX Spark.
  • Demonstrates strong tool use, orchestrating complex tasks such as a stock-investment scenario through MCP integration.
  • Achieves high scores on elite logic and mathematics benchmarks, including AIME 2025 (99.8) and HMMT 2025 Nov. (98.0).
  • Supports agentic coding, decomposing complex requirements into actionable steps within a codebase.
  • Performs well in deep research tasks, scoring 65.27% on Scale AI Research Rubrics.
  • Features a multi-agent orchestration framework for complex task handling.
  • Enables edge-cloud collaboration, enhancing performance in complex scenarios like AndroidDaily Hard tasks.
  • Shows reliability in interaction, with proactive intent clarification and professional advisory capabilities.
  • Built on a sparse Mixture of Experts (MoE) architecture with optimized decoding and inference speeds.
  • Uses a scalable RL framework (MIS-PO) for stable, long-horizon optimization and continuous self-improvement.
  • Benchmarked against top open-source models, showing strong performance in reasoning, coding, and agentic capabilities.
  • Known issues include reliance on longer generation trajectories and reduced stability in specialized domains.
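
Two of the efficiency claims in the summary (11B of 196B parameters active per token, and a 3:1 SWA ratio over a 256K context) can be put into rough numbers. The following is a minimal back-of-the-envelope sketch, not the model's actual configuration: it assumes the 3:1 ratio means three sliding-window layers for every full-attention layer, and the layer count (8) and window size (512 tokens) are hypothetical values chosen only for illustration.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of parameters used per token in a sparse MoE model."""
    return active_params_b / total_params_b


def layer_pattern(num_layers: int, swa_per_full: int = 3) -> list[str]:
    """Label each layer as sliding-window ("swa") or full attention ("full").

    Assumption: a 3:1 ratio means every fourth layer is full attention.
    """
    group = swa_per_full + 1
    return ["full" if (i + 1) % group == 0 else "swa" for i in range(num_layers)]


def cached_positions(pattern: list[str], context_len: int, window: int) -> int:
    """Total token positions held in the KV cache, summed over layers.

    A full-attention layer keeps the whole context; a sliding-window
    layer keeps at most `window` recent positions.
    """
    return sum(
        context_len if kind == "full" else min(window, context_len)
        for kind in pattern
    )


if __name__ == "__main__":
    # Figures from the summary: 196B total, 11B active per token.
    print(f"active share: {active_fraction(196, 11):.1%}")

    # Hypothetical values, for illustration only.
    num_layers, window = 8, 512
    context_len = 256 * 1024  # "256K" read as 262,144 tokens

    mixed = layer_pattern(num_layers)
    all_full = ["full"] * num_layers
    ratio = cached_positions(mixed, context_len, window) / cached_positions(
        all_full, context_len, window
    )
    print(mixed)
    print(f"KV positions vs. all-full attention: {ratio:.1%}")
```

Under these assumptions, the cache cost at long context is dominated by the one-in-four full-attention layers, which is why the mixed layout lands close to a quarter of the all-full-attention cost.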