- Apple started with almost no SwiftUI examples and achieved surprising results
- StarChat-Beta was pushed into uncharted territory without clear guidance
- Almost a million working SwiftUI programs emerged after repeated iterations
Apple researchers recently revealed an experiment in which an AI model was trained to generate user interface code in SwiftUI, even though almost no SwiftUI examples were present in its original training data.
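For context, SwiftUI is Apple’s declarative UI framework, and the programs in question are short view definitions. A minimal illustrative example (not taken from the study) looks like this:

```swift
import SwiftUI

// A small declarative view of the kind the model was asked to produce:
// the interface is described as a composition of views, not drawn imperatively.
struct GreetingView: View {
    var body: some View {
        VStack(spacing: 12) {
            Text("Hello, UICoder")
                .font(.headline)
            Button("Tap me") {
                print("Button tapped")
            }
        }
        .padding()
    }
}
```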
The study began with StarChat-Beta, an open-source model designed for coding. Its training sources, including The Stack and other collections, contained almost no Swift code.
This absence meant that the model could not lean on existing examples to guide its answers, which made it all the more surprising when a stronger system eventually emerged.
Creating a loop of self-improvement
The team’s solution was to create a feedback cycle. They gave StarChat-Beta a set of interface descriptions and asked it to generate SwiftUI programs from these prompts.
Each generated program was compiled to make sure it actually ran. Interfaces that worked were then compared to the original descriptions by another model, GPT-4V, which assessed whether the output matched the request.
Only programs that passed both checks remained in the dataset. This cycle was repeated five times, and with each round the cleaner dataset was fed back to train the next model, as sketched below.
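As a rough sketch of that loop: the generation and GPT-4V rating steps below are hypothetical stubs standing in for the models, and only the compile check is concrete (typechecking SwiftUI source also assumes a macOS Swift toolchain).

```swift
import Foundation

/// Ask the current model for a SwiftUI program (stub).
/// In the study this call would go to StarChat-Beta, and later to UICoder.
func generateProgram(for description: String) -> String {
    return "import SwiftUI\nstruct ContentView: View { var body: some View { Text(\"\(description)\") } }"
}

/// Compile-check a candidate program with swiftc and report success.
func compiles(_ source: String) -> Bool {
    let file = FileManager.default.temporaryDirectory
        .appendingPathComponent("Candidate.swift")
    guard (try? source.write(to: file, atomically: true, encoding: .utf8)) != nil else {
        return false
    }
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    process.arguments = ["swiftc", "-typecheck", file.path]
    guard (try? process.run()) != nil else { return false }
    process.waitUntilExit()
    return process.terminationStatus == 0
}

/// Placeholder for the vision-model check (GPT-4V in the study),
/// which compared a rendered screenshot against the description.
func matchesDescription(_ source: String, _ description: String) -> Bool {
    return true // assume a screenshot-vs-description comparison here
}

// Five rounds of generate -> compile -> rate -> filter, as described above.
var dataset: [(description: String, program: String)] = []
let descriptions = ["A login form with two text fields and a button"]
for round in 1...5 {
    for description in descriptions {
        let program = generateProgram(for: description)
        if compiles(program) && matchesDescription(program, description) {
            dataset.append((description, program))
        }
    }
    // In the study, the filtered dataset is used to fine-tune the
    // next round's model before the loop repeats.
    print("Round \(round): \(dataset.count) examples kept")
}
```

The key design choice is that both filters are automatic, so the dataset can grow over successive rounds without human labeling.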
At the end of the process, the researchers had nearly a million working SwiftUI programs and a model they called UICoder.
The model was then measured with both automated tests and human evaluation, and the results showed that it not only performed better than its base model but also achieved a compilation success rate higher than GPT-4.
One of the striking aspects of the study is that Swift code had been almost entirely excluded from the initial training data.
According to the team, this happened by accident when The Stack dataset was created, leaving only scattered examples found on web pages.
This oversight rules out the idea that UICoder simply recycled code it had already seen; instead, its improvement came from the iterative cycle of generating, filtering, and retraining on its own outputs.
While the results centered on SwiftUI, the researchers suggested that the approach would “likely generalize to other languages and UI toolkits.”
If so, this could open a path for other models to be trained in specialized domains where training data is scarce.
The prospect raises questions about reliability, sustainability, and whether synthetic datasets can continue to scale without introducing hidden flaws.
UICoder was also trained under carefully controlled conditions, and its success in broader environments is not guaranteed.
Via 9to5Mac



