March_AI Modeling | Operation, Limits, and Breakthroughs of Large Language Models (Part 2)
OpenAI's o-series models, DeepSeek R1/R1-Zero, and xAI's recently launched Grok 3 are all equipped with reasoning capabilities. Drawing on commentary from key opinion leaders (KOLs) and related papers published online, we can speculate on their core technologies and design concepts:
- Reinforcement learning (RL) is added on top of the LLM's original pretraining process.
- The final reasoning (inference) model also incorporates reinforcement learning and Monte Carlo Tree Search (MCTS), among other techniques, mainly to scale up and make better use of test-time compute.
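To make the idea of spending test-time compute on search more concrete, here is a minimal sketch of Monte Carlo Tree Search on a toy problem. It is an illustration only and does not describe how any of the models above are actually built. Everything in it is an assumption made for the example: the action set `ACTIONS`, the answer length `DEPTH`, the goal `TARGET`, and the `score` function, which stands in for whatever verifier or reward model a real system might use.

```python
import math
import random

ACTIONS = (1, 2, 3)   # hypothetical "reasoning steps" available at each depth
DEPTH = 4             # hypothetical number of steps in a complete answer
TARGET = 10           # hypothetical goal: the chosen steps should sum to this


def score(seq):
    """Toy verifier: 1.0 when the steps sum to TARGET, lower the further off."""
    worst = max(abs(DEPTH * min(ACTIONS) - TARGET),
                abs(DEPTH * max(ACTIONS) - TARGET))
    return 1.0 - abs(sum(seq) - TARGET) / worst


class Node:
    def __init__(self, seq, parent=None):
        self.seq = seq          # tuple of actions chosen so far
        self.parent = parent
        self.children = {}      # action -> Node
        self.visits = 0
        self.value = 0.0        # sum of rewards backed up through this node

    def is_terminal(self):
        return len(self.seq) == DEPTH

    def is_fully_expanded(self):
        return len(self.children) == len(ACTIONS)

    def best_child(self, c):
        # UCT: mean reward plus an exploration bonus for rarely visited children.
        return max(
            self.children.values(),
            key=lambda ch: ch.value / ch.visits
            + c * math.sqrt(math.log(self.visits) / ch.visits),
        )


def mcts(iterations, c=1.4, rng=None):
    """Run `iterations` search steps; return the best complete sequence seen."""
    rng = rng or random.Random(0)
    root = Node(())
    best_seq, best_reward = None, -1.0
    for _ in range(iterations):
        node = root
        # 1. Selection: follow UCT while every child has already been tried.
        while not node.is_terminal() and node.is_fully_expanded():
            node = node.best_child(c)
        # 2. Expansion: add one untried child.
        if not node.is_terminal():
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            child = Node(node.seq + (action,), parent=node)
            node.children[action] = child
            node = child
        # 3. Simulation: finish the sequence with random steps and score it.
        seq = node.seq
        while len(seq) < DEPTH:
            seq += (rng.choice(ACTIONS),)
        reward = score(seq)
        if reward > best_reward:
            best_seq, best_reward = seq, reward
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return best_seq, best_reward


if __name__ == "__main__":
    for budget in (1, 10, 2000):
        print(budget, mcts(budget))
```

The `iterations` argument plays the role of the test-time compute budget: a larger budget lets the search evaluate more candidate sequences before committing to an answer, which is the general trade-off the second bullet describes.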
From the perspective of end users at inference time, the core mechanism of o3 is modeled in the