TradingKey - On September 29, DeepSeek officially released its new model, DeepSeek-V3.2-Exp, and announced a significant reduction in its official API pricing, effective immediately.
Input pricing for DeepSeek-V3.2-Exp has been cut by more than 50%, while output pricing has dropped by 75%, a reduction DeepSeek attributes mainly to the lower serving costs of the new model. Industry experts suggest this pricing strategy will make it difficult for other companies to compete on comparable services.
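To put the stated percentage cuts in concrete terms, the minimal sketch below compares the cost of a sample workload before and after the reduction. The per-million-token prices and token volumes are placeholders for illustration, not DeepSeek's actual rates; only the 50% and 75% cuts come from the announcement.

```python
# Hypothetical cost comparison based only on the announced percentage cuts.
# The baseline prices and token volumes below are assumptions, not real rates.
OLD_INPUT_PRICE = 0.56   # assumed old price, USD per million input tokens
OLD_OUTPUT_PRICE = 1.68  # assumed old price, USD per million output tokens

NEW_INPUT_PRICE = OLD_INPUT_PRICE * 0.5     # "more than 50%" cheaper (upper bound)
NEW_OUTPUT_PRICE = OLD_OUTPUT_PRICE * 0.25  # 75% cheaper

def workload_cost(input_tokens_m, output_tokens_m, in_price, out_price):
    """Cost in USD for a workload measured in millions of tokens."""
    return input_tokens_m * in_price + output_tokens_m * out_price

old = workload_cost(100, 20, OLD_INPUT_PRICE, OLD_OUTPUT_PRICE)
new = workload_cost(100, 20, NEW_INPUT_PRICE, NEW_OUTPUT_PRICE)
print(f"old: ${old:.2f}, new: ${new:.2f}, saving: {100 * (1 - new / old):.0f}%")
```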
Moreover, this version marks a breakthrough in core technology: it introduces DeepSeek Sparse Attention (DSA), the company's first implementation of a fine-grained sparse attention mechanism. DeepSeek describes V3.2-Exp as an experimental model and a transitional step toward its next-generation architecture. Built on V3.1-Terminus, it adds the DSA mechanism, which is designed to reduce computational resource consumption while improving inference efficiency.
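DeepSeek's announcement does not spell out how DSA works internally. The sketch below illustrates only the general idea behind fine-grained sparse attention, namely that each query attends to a small selected subset of keys rather than the full sequence; it is a hypothetical toy, not DeepSeek's DSA algorithm, and for simplicity it still computes full scores before selection, whereas a real sparse design would avoid that cost.

```python
# Toy illustration of fine-grained sparse attention: each query keeps only its
# top-k highest-scoring keys, and the softmax/weighted sum runs over that subset.
# This is NOT DeepSeek's actual DSA implementation.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    # q, k, v: (batch, heads, seq_len, head_dim)
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # (B, H, L, L) full score matrix
    top_k = min(top_k, scores.size(-1))
    idx = scores.topk(top_k, dim=-1).indices      # k strongest keys per query
    mask = torch.full_like(scores, float("-inf"))
    mask.scatter_(-1, idx, 0.0)                   # 0 where kept, -inf elsewhere
    weights = F.softmax(scores + mask, dim=-1)    # attention only over selected keys
    return weights @ v                            # (B, H, L, head_dim)

if __name__ == "__main__":
    B, H, L, D = 1, 4, 1024, 64
    q, k, v = (torch.randn(B, H, L, D) for _ in range(3))
    print(topk_sparse_attention(q, k, v, top_k=64).shape)  # torch.Size([1, 4, 1024, 64])
```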
According to DeepSeek's evaluations, V3.2-Exp performs on par with V3.1-Terminus while achieving a significant improvement in long-context training and inference efficiency without compromising output quality. The V3.2-Exp model is now open-sourced on Hugging Face and ModelScope. Meanwhile, speculation suggests that DeepSeek's V4 and R2 models are unlikely to be released in the near term.
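For readers who want to try the open-sourced weights, the snippet below sketches a standard Hugging Face Transformers loading call. The repository name "deepseek-ai/DeepSeek-V3.2-Exp" is assumed from the release naming and may differ, and the full model is far too large for a single consumer GPU; this demonstrates only the loading step, not a practical deployment.

```python
# Minimal sketch of loading the open-sourced checkpoint via Transformers.
# The repo id below is an assumption based on the release name.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-V3.2-Exp"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # DeepSeek ships custom model code with the repo
    torch_dtype="auto",
    device_map="auto",       # shard across available devices where possible
)
```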
Both Huawei Cloud and Cambricon have completed adaptation of the DeepSeek-V3.2-Exp model, with Huawei Cloud supporting long sequences up to a maximum context length of 160K tokens.