| ▲ | brokensegue 4 hours ago | |
there are models between 2-grams and 600M-parameter models that would be good options. i don't expect a 2-gram to do very well here. also, i'm not sure why this model isn't a fine choice if it solves their problem | ||
| ▲ | throwa356262 3 hours ago | |
What would you suggest instead? | ||