The DeepSeek-R1 Effect and Web3-AI

The Artificial Intelligence (AI) world was taken by storm a few days ago with the release of DeepSeek-R1, an open-source reasoning model that matches the performance of the top foundation models while claiming to have been built with a remarkably low training budget and novel post-training techniques. The release of DeepSeek-R1 not only challenged the conventional wisdom about scaling laws in foundation models, which traditionally favors massive training budgets, but did so in the most active research area in the field: reasoning.

The open-weight (as opposed to open-source) nature of the release made the model easily accessible to the AI community, leading to a wave of clones within a few hours. In addition, DeepSeek-R1 made its mark on the ongoing AI race between China and the United States, reinforcing what has become increasingly evident: Chinese models are of exceptionally high quality and fully capable of driving innovation with original ideas.

Unlike most advances in generative AI, which seem to widen the gap between Web2 and Web3 in the foundation-model space, the release of DeepSeek-R1 carries real implications and presents exciting opportunities for Web3-AI. To assess these, we must first take a closer look at DeepSeek-R1's most important innovations and differentiators.

Inside DeepSeek-R1

DeepSeek-R1 was the result of introducing incremental innovations into a well-established pretraining framework for foundation models. To a large extent, DeepSeek-R1 follows the same training methodology as most high-profile foundation models. This approach consists of three central steps (a toy sketch follows the list):

  1. Pretraining: The model is initially pretrained to predict the next token using massive amounts of unlabeled data.
  2. Supervised fine-tuning (SFT): This step optimizes the model in two critical areas: following instructions and answering questions.
  3. Alignment with human preferences: A final fine-tuning phase is performed to align the model's responses with human preferences.
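
For intuition, here is a minimal, purely illustrative sketch of that three-stage pipeline. The stage functions are stubs standing in for pretraining, SFT, and preference alignment; they do not correspond to any particular library's API.

```python
# Illustrative sketch of the standard three-stage foundation-model pipeline.
# All stage functions are stubs; real systems use dedicated training frameworks.

def pretrain(corpus):
    """Stage 1: next-token prediction over massive amounts of unlabeled text."""
    return {"weights": "base", "seen_documents": len(corpus)}

def supervised_fine_tune(model, instruction_pairs):
    """Stage 2: SFT on (instruction, answer) pairs."""
    model["weights"] = "sft"
    model["sft_examples"] = len(instruction_pairs)
    return model

def align_with_preferences(model, preference_data):
    """Stage 3: align responses with human preferences (e.g., RLHF-style tuning)."""
    model["weights"] = "aligned"
    model["preference_pairs"] = len(preference_data)
    return model

base = pretrain(["unlabeled web text ..."] * 1000)
sft_model = supervised_fine_tune(base, [("explain X", "X works by ...")])
final_model = align_with_preferences(sft_model, [("answer A", "answer B", "A preferred")])
print(final_model)
```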

Most major foundation models, including those developed by OpenAI, Google, and Anthropic, follow this same general process. At a high level, DeepSeek-R1's training procedure does not appear significantly different. However, rather than pretraining a base model from scratch, R1 leveraged the base model of its predecessor, DeepSeek-V3 Base, which boasts an impressive 671 billion parameters.

Essentially, DeepSeek-R1 is the result of applying SFT to DeepSeek-V3 Base with a large-scale reasoning dataset. The real innovation lies in the construction of these reasoning datasets, which are notoriously difficult to build.

First Step: DeepSeek-R1-Zero

One of the most important aspects of DeepSeek-R1 is that the process produced not just one model but two. Perhaps the most significant innovation of DeepSeek-R1 was the creation of an intermediate model called R1-Zero, which specializes in reasoning tasks. This model was trained almost exclusively by means of reinforcement learning, with minimal dependence on labeled data.

Reinforcement learning is a technique where a model is rewarded for generating correct answers, enabling it to generalize knowledge over time.
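
As a rough sketch of the idea, and not DeepSeek's actual reward implementation, a rule-based reward for math-style tasks might simply check whether the model's final answer matches a known reference, a signal that can be computed automatically without human labels:

```python
import re

def rule_based_reward(model_output: str, reference_answer: str) -> float:
    """Toy verifiable reward: 1.0 if the model's final answer matches the
    reference, else 0.0. R1-Zero-style RL relies on automatically checkable
    rewards like this instead of human-labeled preference data."""
    # Assumes the model is prompted to end its output with "Answer: <value>".
    match = re.search(r"Answer:\s*(.+)", model_output)
    if not match:
        return 0.0
    return 1.0 if match.group(1).strip() == reference_answer.strip() else 0.0

print(rule_based_reward("Reasoning... Answer: 42", "42"))  # 1.0
print(rule_based_reward("Reasoning... Answer: 41", "42"))  # 0.0
```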

R1-Zero is quite impressive, as it was able to match GPT-o1 on reasoning tasks. However, the model struggled with more general tasks such as question answering and readability. That said, the purpose of R1-Zero was never to create a generalist model, but rather to demonstrate that it is possible to achieve advanced reasoning capabilities using reinforcement learning alone, even if the model does not perform well in other areas.

Second Step: DeepSeek-R1

DeepSeek-R1 was designed to be a general-purpose model that excels at reasoning, which meant it needed to surpass R1-Zero. To achieve this, DeepSeek started once again with its V3 model, but this time fine-tuned it on a small reasoning dataset.

As mentioned earlier, reasoning datasets are difficult to produce. This is where R1-Zero played a crucial role. The intermediate model was used to generate a synthetic reasoning dataset, which was then used to fine-tune DeepSeek-V3. This process resulted in another intermediate reasoning model, which was subsequently put through a comprehensive reinforcement learning phase using a dataset of 600,000 samples, also generated by R1-Zero. The final result of this process was DeepSeek-R1.
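
One simple way to picture how an intermediate model such as R1-Zero can bootstrap a reasoning dataset is rejection sampling: generate several candidate traces per question, keep only those whose final answers verify, and write the survivors out for fine-tuning. The sketch below is illustrative only; `generate_trace` is a stub standing in for a real model call.

```python
import json
import random

def generate_trace(question: str) -> dict:
    """Stub standing in for a call to an intermediate reasoning model
    (e.g., R1-Zero). Returns a reasoning trace and a final answer."""
    answer = str(eval(question))      # toy arithmetic "model"
    if random.random() < 0.3:         # simulate occasional wrong answers
        answer = str(int(answer) + 1)
    return {"question": question, "trace": f"Compute {question} step by step.", "answer": answer}

def build_reasoning_dataset(questions, references, samples_per_question=4):
    """Rejection sampling: keep only candidate traces whose answers verify."""
    kept = []
    for question, reference in zip(questions, references):
        for _ in range(samples_per_question):
            candidate = generate_trace(question)
            if candidate["answer"] == reference:
                kept.append(candidate)
    return kept

dataset = build_reasoning_dataset(["2+2", "3*7"], ["4", "21"])
with open("synthetic_reasoning.jsonl", "w") as f:
    for row in dataset:
        f.write(json.dumps(row) + "\n")
print(f"kept {len(dataset)} verified traces for fine-tuning")
```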

While I have omitted several technical details of the R1 training process, here are the two most important takeaways:

  1. R1-Zero demonstrated that it is possible to develop sophisticated reasoning capabilities using basic reinforcement learning. Although R1-Zero was not a strong generalist model, it successfully generated the reasoning data needed for R1.
  2. R1 expanded the traditional pretraining pipeline used by most foundation models by incorporating R1-Zero into the process. In addition, it leveraged a significant amount of synthetic reasoning data generated by R1-Zero.

As a result, DeepSeek-R1 emerged as a model that matched the reasoning capabilities of GPT-o1, while being built using a simpler and probably significantly cheaper training process.

There is broad agreement that R1 marks an important milestone in the history of generative AI, one that is likely to reshape the way foundation models are developed. When it comes to Web3, it will be interesting to explore how R1 affects the evolving landscape of Web3-AI.

DeepSeek-R1 and Web3-AI

Until now, Web3 has struggled to establish compelling use cases that clearly add value to the creation and utilization of foundation models. To a certain extent, the traditional workflow for pretraining foundation models appears to be the antithesis of Web3 architectures. However, despite being in its early stages, the release of DeepSeek-R1 has highlighted several opportunities that could naturally align with Web3-AI architectures.

1) Reinforcement Learning Fine-Tuning Networks

R1-Zero demonstrated that it is possible to develop reasoning models using pure reinforcement learning. From a computational standpoint, reinforcement learning is highly parallelizable, making it well suited to decentralized networks. Imagine a Web3 network in which nodes are compensated for fine-tuning a model on reinforcement learning tasks, each applying different strategies. This approach is far more feasible than other pretraining paradigms that require complex GPU topologies and centralized infrastructure.
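
To make the parallelism concrete, here is a hedged, in-process sketch of such a network: each hypothetical node runs rollouts independently against a verifiable reward and reports only scored results, while a coordinator compensates nodes in proportion to the verified reward. The node, task, and payout structures are all invented for illustration.

```python
import random
from dataclasses import dataclass

@dataclass
class RolloutResult:
    node_id: str
    task_id: str
    reward: float  # verifiable, so the coordinator can re-check it

def run_node(node_id: str, tasks: list) -> list:
    """Hypothetical network node: runs RL rollouts locally and reports verified
    rewards. No gradient synchronization with other nodes is required, which is
    what makes the workload embarrassingly parallel."""
    return [RolloutResult(node_id, task, reward=random.random()) for task in tasks]

def settle_payments(results: list, rate_per_reward: float = 0.01) -> dict:
    """Toy coordinator: compensates each node in proportion to verified reward."""
    payouts = {}
    for result in results:
        payouts[result.node_id] = payouts.get(result.node_id, 0.0) + result.reward * rate_per_reward
    return payouts

results = run_node("node-a", ["task-1", "task-2"]) + run_node("node-b", ["task-3"])
print(settle_payments(results))
```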

2) Synthetic Reasoning Dataset Generation

Another important contribution of DeepSeek-R1 was demonstrating the value of synthetically generated reasoning datasets for cognitive tasks. This process is also well suited to a decentralized network, where nodes perform dataset-generation jobs and are compensated as those datasets are used for pretraining or fine-tuning foundation models. Because the data is synthetically generated, the entire network can be fully automated without human intervention, making it an ideal fit for Web3 architectures.
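
A minimal sketch of how such a generation job might be structured, assuming a hypothetical scheme in which a node commits to the content hash of the dataset it produced and payment is released once the dataset is consumed downstream:

```python
import hashlib
import json
from dataclasses import dataclass, field

@dataclass
class DatasetJob:
    """Hypothetical generation job on a decentralized network: a node produces a
    synthetic reasoning dataset and commits to its content hash; payment is
    released when the dataset is later used for pretraining or fine-tuning."""
    job_id: str
    node_id: str
    rows: list = field(default_factory=list)

    def commitment(self) -> str:
        payload = json.dumps(self.rows, sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

job = DatasetJob("job-7", "node-a", rows=[{"question": "2+2", "trace": "...", "answer": "4"}])
print("dataset commitment:", job.commitment())
# A consumer that later uses this dataset can verify it against the recorded
# commitment before payment to the generating node is released.
```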

3) Decentralized Inference for Small Distilled Reasoning Models

DeepSeek-R1 is a massive model with 671 billion parameters. However, almost immediately after its release, a wave of distilled reasoning models emerged, ranging from 1.5 to 70 billion parameters. These smaller models are significantly more practical for inference in decentralized networks. For example, a 1.5B to 2B distilled R1 model could be embedded in a DeFi protocol or deployed within the nodes of a DePIN network. More simply, we are likely to see the rise of cost-effective reasoning endpoints powered by decentralized compute networks. Reasoning is a domain in which the performance gap between small and large models narrows, creating a unique opportunity for Web3 to effectively leverage these distilled models in decentralized inference settings.
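
As a minimal local-inference sketch using Hugging Face transformers, assuming the distilled checkpoint id `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` (verify the exact model id and license before relying on it):

```python
# Minimal local inference sketch for a small distilled reasoning model.
# The checkpoint id below is an assumption; confirm it on the model hub first.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "If a pool doubles its liquidity every epoch, how many epochs does it take to go from 1x to 8x?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```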

4) Reasoning Data Provenance

One of the defining features of reasoning models is their ability to generate reasoning traces for a given task. DeepSeek-R1 makes these traces available as part of its inference output, reinforcing the importance of provenance and traceability for reasoning tasks. The internet today operates primarily on outputs, with little visibility into the intermediate steps that lead to those results. Web3 provides an opportunity to trace and verify each reasoning step, potentially creating a “new internet of reasoning” where transparency and verifiability become the norm.
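
One hypothetical way to anchor such provenance is to hash-chain the individual reasoning steps so that the full trace can be committed (for example, on-chain) and any step verified later; the sketch below is illustrative only:

```python
import hashlib

def trace_commitments(reasoning_steps):
    """Toy provenance scheme: hash-chain each reasoning step so the full trace
    can be anchored on-chain and any individual step verified afterwards."""
    commitments, previous = [], ""
    for step in reasoning_steps:
        digest = hashlib.sha256((previous + step).encode()).hexdigest()
        commitments.append(digest)
        previous = digest
    return commitments

steps = ["Restate the problem.", "Derive the intermediate result.", "State the final answer."]
chain = trace_commitments(steps)
print("trace root:", chain[-1])  # a single value committing to every step in order
```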

Web3-AI Has a Chance in the Post-R1 Reasoning Era

The release of DeepSeek-R1 has marked a turning point in the evolution of generative AI. By combining clever innovations with established pretraining paradigms, it has challenged traditional AI workflows and ushered in a new era of reasoning-focused AI. Unlike many previous foundation models, DeepSeek-R1 introduces elements that bring generative AI closer to Web3.

The key aspects of R1, such as synthetic reasoning datasets, more parallelizable training, and the growing need for traceability, align naturally with Web3 principles. While Web3-AI has struggled to gain meaningful traction, this new post-R1 reasoning era may provide the best opportunity yet for Web3 to play a more significant role in the future of AI.
