- YouTube's opt-in AI training makes creators quiet architects of future tech tools
- Many creators say yes to AI training access even when there is no money involved
- Oxylabs collected millions of videos in a dataset that AI developers can ethically trust
An increasing number of YouTubers are allowing AI companies to train models on their videos, and surprisingly, many do so without direct compensation.
Under YouTube's current setup, creators can opt in by checking boxes that grant access to roughly 18 major AI developers.
If no box is checked, YouTube does not allow the video to be used for AI training. The default position is therefore non-participation, and any opt-in is entirely voluntary.
Creators choose influence over income
The lack of payment may seem unusual, and the motivation appears to hinge on influence rather than income.
Creators who opt in may see it as a strategic move to shape how generative AI tools interpret and present information: by contributing their content, they effectively make it more visible in AI-generated answers.
As a result, their work could shape how questions are answered by everything from AI writing tools to large language models (LLMs) to coding assistants.
Oxylabs has now launched the first consent-based YouTube dataset, consisting of four million videos from a million different channels.
All contributors explicitly agreed to the use of their content for AI training, and according to Oxylabs, the videos come complete with transcripts and metadata, carefully curated to be particularly useful for training AI on image and video understanding tasks.
"In an ecosystem striving to find a reasonable balance between respecting copyright and enabling innovation, YouTube streamlining consent for AI training and giving creators flexibility is an important step forward," said Julius Černiauskas, CEO of Oxylabs.
This model not only simplifies the process for AI developers seeking ethically sourced data, but also gives creators assurance about how their work is used.
"Many channel owners have already opted in for their videos to be used in developing the next generation of AI tools. This allows us to create and provide high-quality, structured video datasets. Meanwhile, AI developers have no trouble verifying the data's legitimate origin."
However, wider concerns persist about how governments and legislators are handling similar issues.
For example, Britain's Data (Use and Access) Bill has stalled in Parliament, prompting figures such as Elton John to criticize the government's handling of creators' rights.
In this regulatory vacuum, creators and developers are likely to face uncertainty.
Oxylabs presents itself as filling this gap with a consent-based model, but critics will still question whether such initiatives truly address deeper questions of value and fairness.