It still does a much better job at translation than even Llama 2 70B, and at only 6.7B params.

If it's MOE, that may explain why it's faster and better...


MOE?




