Have you ever wondered why AI sometimes gives wrong answers or information that doesn't make sense?
This problem may soon become much less common, because researchers from Alibaba and the University of Science and Technology of China have developed a new AI called START (Self-Taught Reasoner with Tools).
START is not just an ordinary AI. It thinks step by step and knows how to use tools while thinking, just as we use calculators or computer programs to solve difficult problems.
How does START work? What makes START different?
START builds on the concept of Large Reasoning Models (LRMs), that is, large language models that focus on reasoning.
- Chain-of-Thought (CoT): Think systematically in stages:
- START uses a method called "Chain-of-Thought", which breaks a complex problem into small steps and solves them one at a time. This makes complex problems easier to analyze and handle. It's like solving a difficult math problem: we have to work through it step by step.
- Using External Tools: Python Interpreter:
- When faced with a problem that requires complex calculations or more information, START runs a Python program to calculate, examine data, or generate the desired result. This helps it give accurate and reliable answers. It's like having a smart calculator and a personal data-analysis assistant.
- Self-Taught Learning: Learn and improve yourself:
- START has a system that helps it learn from its own experience and improve its reasoning and tool-use skills, just as we practice problems repeatedly until we get better at them.
- It uses techniques called Hint-infer and Hint Rejection Sampling Fine-Tuning (Hint-RFT), which teach the model to use external tools effectively without requiring a large amount of sample data.
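To make the "reasoning plus Python" idea more concrete, here is a minimal sketch of one way a reasoning trace could call a Python interpreter. It is not START's actual implementation: the `<python>` and `<output>` tags and the function names `run_python` and `reason_with_tool` are assumptions made up for this illustration.

```python
import contextlib
import io
import re

# Hypothetical convention: the model wraps tool calls in <python>...</python> tags.
CODE_PATTERN = re.compile(r"<python>(.*?)</python>", re.DOTALL)


def run_python(code: str) -> str:
    """Execute a code string and return what it printed, or an error message."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # no sandboxing in this sketch: only run trusted code
    except Exception as exc:
        return f"Error: {type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def reason_with_tool(reasoning: str) -> str:
    """Attach the interpreter output after each tool call in a reasoning trace."""
    def attach_output(match: re.Match) -> str:
        output = run_python(match.group(1))
        return f"{match.group(0)}\n<output>{output}</output>"

    return CODE_PATTERN.sub(attach_output, reasoning)


# A toy chain of thought that delegates the arithmetic to Python.
trace = (
    "Step 1: I need the sum of the squares from 1 to 100.\n"
    "Step 2: Doing this by hand is error-prone, so I will compute it.\n"
    "<python>print(sum(n * n for n in range(1, 101)))</python>\n"
    "Step 3: Use that number in the final answer."
)
result = reason_with_tool(trace)
print(result)
```

In a real system the interpreter output would be fed back to the model so it can keep reasoning from the computed value, and the code would run in an isolated sandbox rather than through a bare `exec`.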
START mechanism
(A little technical, you can skip it.)

Illustration of the operation of START from the research paper
START works through two main processes:
- Hint-infer:
- At this stage, START inserts "hints" into the reasoning process to encourage the model to call external tools.
- An example of a hint: "Wait, maybe using Python here is a good idea."
- Hint Rejection Sampling Fine-Tuning (Hint-RFT):
- In this step, the results from Hint-infer are screened, scored, and refined to create a high-quality dataset (D_seed).
- D_seed is then used to fine-tune the base model (QwQ-32B-Preview), producing START-0.
- START-0 is then used to create a richer dataset (D_START), which is used for a final round of fine-tuning that produces START.
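The two processes above can be sketched roughly as follows. This is a simplified illustration, not the paper's pipeline: where the hint is inserted, the `<python>` tag used to detect a tool call, the dictionary layout of each candidate, and the function names `insert_hint` and `rejection_sample` are all assumptions, and the "scoring" and "refining" stages are reduced to a single pass/fail filter.

```python
HINT = "Wait, maybe using Python here is a good idea."


def insert_hint(partial_reasoning: str, hint: str = HINT) -> str:
    """Hint-infer (simplified): append a hint so the model continues from it."""
    return partial_reasoning.rstrip() + "\n" + hint + "\n"


def rejection_sample(candidates, reference_answer):
    """Keep only trajectories that called the tool and reached the reference answer."""
    kept = []
    for candidate in candidates:
        used_tool = "<python>" in candidate["trace"]  # hypothetical tool-call marker
        correct = candidate["answer"].strip() == reference_answer.strip()
        if used_tool and correct:
            kept.append(candidate)
    return kept


# Toy candidates standing in for sampled model outputs (made-up data).
candidates = [
    {"trace": "Let me compute. <python>print(6 * 7)</python>", "answer": "42"},
    {"trace": "I will just guess.", "answer": "42"},
    {"trace": "<python>print(6 * 8)</python>", "answer": "48"},
]

prompt = insert_hint("The product of 6 and 7 is needed.")
d_seed = rejection_sample(candidates, "42")
print(prompt)
print(len(d_seed))
```

Only the first candidate survives: the second reaches the right answer without using the tool, and the third uses the tool but gets the wrong answer. The surviving trajectories play the role of the fine-tuning dataset.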
How good is START? Examples of its amazing abilities
START is not just "good", but "very good".
Let's take a look at some examples of START's amazing capabilities.
- Solve competitive math problems:
- START scores high on competition-level math benchmarks such as MATH500 and AMC23, reaching about 95% accuracy on some of them, a level many strong students would find hard to match.
- Answer PhD Science Questions:
- START can accurately answer difficult and complex questions at the doctoral level. On the GPQA (Graduate-Level Google-Proof Q&A) benchmark, START scored significantly better than previous generations of AI.
- Computer Code and Debugging:
- START doesn't just write code. It can also check and fix errors in its own code, which is a very important skill in software development.
What's even more amazing is that START can do these things without being told every step.
Surprisingly good (and, secretly, a little scary 😆)
Technology Behind START: QwQ-32B-Preview and Fine-Tuning
START is built on the QwQ-32B-Preview model, a highly capable Large Language Model (LLM), and adds Python as an important tool to help it think and process data.
START also goes through a two-phase fine-tuning process that improves its reasoning and tool use.
How will START change our world? Application Potential
START has the potential to transform our world in many ways, such as:
- Scientific Research:
- START can help scientists analyze complex data, find hidden relationships, or even help develop new theories.
- Education:
- START can be used to develop smart learning materials personalized for each learner, making difficult content easier to understand.
- Software Development:
- START can help programmers write code, check for errors, and optimize their programs, making software development faster and more efficient.
- Complex Problem Solving:
- Whether it's financial data analysis, business strategy planning, or logistics management, START can help us make better decisions and solve complex problems effectively.
The future of AI that "thinks", not just "remembers"
START is not just an ordinary AI, but a giant leap forward in artificial intelligence. It shows that AI can really "think", not just recall information to answer.
START still has some limitations, such as its limited ability to work with languages other than Python, but it also opens the door to a new world of AI that is smarter, more context-sensitive, and ready to help humans solve more complex problems in the future.
Who knows, in the next few years, we may see AI that can interact and reason like a real human.
START could be the beginning of a major revolution in AI that will change our world forever.