
> The model was able to solve competition style coding problems above average human score.

I am not sure if I am thinking of the right study, but as far as I remember, the pipeline included humans wading through and filtering solutions, and while a compiler may have been attached, they also scored themselves. The marketing blurb, of course, tried to make it sound as if the model had competed.



The model generates a large number of solutions; they then filter for those that actually compile and produce the right output when executed, then cluster the survivors to select a few (<10 solutions) and submit those. They are not allowed to present too many attempts.
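The pipeline described above can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's actual code: the function name `select_submissions`, the `run` callback, and the use of extra inputs for clustering are my assumptions about how such a filter might be wired up.

```python
from collections import defaultdict

def select_submissions(candidates, example_tests, extra_inputs, run,
                       max_submissions=10):
    """Hypothetical sketch of a generate -> filter -> cluster -> submit step.

    candidates:     opaque program objects (e.g. source strings)
    example_tests:  list of (stdin, expected_stdout) pairs from the problem
    extra_inputs:   additional inputs used only to group programs by behaviour
    run:            run(program, stdin) -> stdout, or None on compile/run failure
    """
    # 1. Keep only candidates that compile/run and pass the example tests.
    passing = []
    for prog in candidates:
        outs = [run(prog, stdin) for stdin, _ in example_tests]
        if all(o is not None and o.strip() == expected.strip()
               for o, (_, expected) in zip(outs, example_tests)):
            passing.append(prog)

    # 2. Cluster by behaviour: programs that produce identical outputs on the
    #    extra inputs are treated as (probably) semantically equivalent.
    clusters = defaultdict(list)
    for prog in passing:
        key = tuple(run(prog, stdin) for stdin in extra_inputs)
        clusters[key].append(prog)

    # 3. Submit one representative from each of the largest clusters,
    #    respecting the cap on attempts.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [cluster[0] for cluster in ranked[:max_submissions]]
```

The point of the clustering step is that you cannot submit hundreds of near-duplicates, so grouping programs by observed behaviour and picking one per group spends the limited attempts on genuinely distinct solutions.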

Here's a good analysis of the paper: https://www.youtube.com/watch?v=s9UAOmyah1A


Ah, the paper describes a fixed method for the final selection step, and also AI-generated tests to narrow the results even further before that. Quite a bit better, even if the participation is still only simulated.



