While machine learning has advanced by leaps and bounds, it’s still hard to create an AI that’s good at more than one thing. A model trained to handle one class of programming challenges would fail when given a different kind of problem to tackle. So the team decided to skip explicit training on algorithms and code structure, instead treating the task more like a translation problem.
Programming challenges usually include a description of the task, and the resulting code submitted by a human participant is technically just an expression of the description. The AI works in two phases: It takes the description and converts it to an internal representation. Then, it uses that representation to generate functional code based on the data it was shown in training. And there was a lot of data.
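To make the two-phase idea concrete, here is a minimal sketch of a description-to-code "translation" model, assuming an off-the-shelf encoder-decoder Transformer in PyTorch. This is an illustrative stand-in, not AlphaCode’s actual architecture; the vocabulary size, model dimensions, and random token IDs below are assumptions.

```python
# Phase 1: encode the problem description into an internal representation.
# Phase 2: decode code tokens conditioned on that representation.
import torch
import torch.nn as nn

VOCAB_SIZE = 1000  # assumed shared token vocabulary for text and code
D_MODEL = 64

class DescriptionToCode(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.seq2seq = nn.Transformer(
            d_model=D_MODEL, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.to_vocab = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, description_ids, code_ids):
        src = self.embed(description_ids)   # problem statement tokens
        tgt = self.embed(code_ids)          # partially generated program tokens
        out = self.seq2seq(src, tgt)
        return self.to_vocab(out)           # logits over the next code token

# Usage: real token IDs would come from a tokenizer; random IDs stand in here.
model = DescriptionToCode()
desc = torch.randint(0, VOCAB_SIZE, (1, 32))
code = torch.randint(0, VOCAB_SIZE, (1, 48))
print(model(desc, code).shape)  # (1, 48, VOCAB_SIZE)
```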
DeepMind fed the AI 700GB of code from GitHub, complete with the comments that explain it. As Ars Technica points out, that’s a huge amount of text data. With the essence of programming internalized, DeepMind set up its own programming contests and fed the results back into the AI to fine-tune the model’s performance. The team says this is an order of magnitude more training data than past code-generating systems have gotten, and that made all the difference.
The researchers found that AlphaCode was able to generate a huge number of potential answers to a coding problem, but about 40 percent of them would exhaust all the available system memory or fail to reach an answer in a reasonable amount of time. The output had to be filtered down to the roughly one percent of solutions that were actually good code. DeepMind found that clusters of similar programs tended to indicate correct answers, whereas the wrong ones were scattered more or less at random. By focusing on those clustered answers, AlphaCode was able to correctly solve about one-third of the coding challenges. It turns out many human programmers do little better, so AlphaCode placed in the top 54 percent of participants. It’s not about to take jobs from DeepMind engineers, but give it time.
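The filtering step can be sketched roughly as follows: run each candidate program on the example inputs, drop any that crash or time out, then group the survivors by the outputs they produce and prefer the largest groups. The function names, timeout value, and grouping key here are assumptions for illustration, not DeepMind’s published implementation; the caller supplies run_program, a sandboxed executor.

```python
from collections import defaultdict

def filter_and_cluster(candidates, example_inputs, run_program, timeout_s=2.0):
    """candidates: list of source strings.
    run_program(src, inp, timeout_s) -> output string, or None on crash/timeout."""
    clusters = defaultdict(list)
    for src in candidates:
        outputs = []
        for inp in example_inputs:
            out = run_program(src, inp, timeout_s)
            if out is None:          # exhausted memory, crashed, or timed out
                break
            outputs.append(out)
        else:
            # Programs that behave identically on all examples land in one cluster.
            clusters[tuple(outputs)].append(src)
    # Large clusters of behaviorally similar programs tend to hold correct answers;
    # wrong answers scatter across many small clusters.
    return sorted(clusters.values(), key=len, reverse=True)
```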