
Wait until you hear that most models tend to perform worse in non-English languages.


Do you know if that's true of non-English models?

As I said elsewhere, DeepSeek injects Chinese characters into its responses. Anecdotally, that seems to happen when the context gets longer. That suggests the models are primarily trained on Chinese text, so I would expect them to use fewer tokens for Chinese than for English.
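To put a rough number on the token point, here is a minimal Python sketch. It assumes a byte-level tokenizer with no learned merges, where every UTF-8 byte costs one token; that is an assumption for illustration only, not a description of DeepSeek's actual tokenizer. The helper name and the sample strings are made up.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average UTF-8 bytes per character.

    Under the assumed byte-level tokenizer with no merges, this is also
    the number of tokens spent per character.
    """
    return len(text.encode("utf-8")) / len(text)


# Hypothetical sample strings, chosen only for illustration.
english = "hello world"
chinese = "你好世界"

print(utf8_bytes_per_char(english))  # ASCII letters are 1 byte each
print(utf8_bytes_per_char(chinese))  # these CJK characters are 3 bytes each
```

Under that assumption the Chinese sample starts at three tokens per character versus one for English, so any token advantage for Chinese would have to come from merges learned on Chinese-heavy training data.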



