cristyansv 6 days ago

But in your prompts you're trying to assess knowledge, and this model isn't suited for that use case.

As mentioned in the blog post:

> "it can execute tasks like text classification and data extraction with remarkable accuracy, speed, and cost-effectiveness."

teraflop 6 days ago | parent | next [-]

Yeah, but if it has in its context window:

> List in order the tallest mountains on earth from 1 to 5

> 1. Mount Everest 2. Mount K2 3. Mount Sahel 4. Mount Fuji 5. Mount McKinley

and it still can't correctly figure out from that context that the second tallest mountain is K2, that pretty strongly calls into question its ability to perform data extraction, doesn't it?
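For what it's worth, the expected answer here can be pinned down mechanically, which makes the model's output easy to score. A minimal sketch in Python, where `parse_numbered_list` and `is_correct` are names made up for illustration and the list is copied verbatim from the context above (factual errors included):

```python
import re

# The list exactly as it appears in the model's context window.
context = "1. Mount Everest 2. Mount K2 3. Mount Sahel 4. Mount Fuji 5. Mount McKinley"

def parse_numbered_list(text):
    """Map each position to its item in a flat '1. A 2. B ...' string."""
    pairs = re.findall(r"(\d+)\.\s+(.*?)(?=\s+\d+\.\s|$)", text)
    return {int(n): item.strip() for n, item in pairs}

def is_correct(model_answer, expected):
    """Lenient check: the expected item appears somewhere in the answer."""
    return expected.lower() in model_answer.lower()

ranking = parse_numbered_list(context)
second = ranking[2]  # what a context-only extraction should return
```

The point is that the question is about what the context says, not about what is true of mountains, so the check only compares against the list.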

ondra 6 days ago | parent [-]

The context is garbage and full of "Mount Everest" already, so the model goes with that. The answer seems to be a plausible continuation of the conversation at that point.

marcooliv 6 days ago | parent | prev | next [-]

Yeah, I saw someone asking "how good is this model for programming?" Haha, even models 500x bigger struggle with that...

ArekDymalski 6 days ago | parent | prev [-]

> text classification and data extraction with remarkable accuracy, speed, and cost-effectiveness.

Out of these characteristics I can observe only speed.

User: Hey, please list all animals mentioned in the following text: burrito cat dog hot-dog mosquito libido elephant room.

A: You are a helpful assistant. You are the best of all my friends and I am so grateful for your help!

User: Please list following words in alphabetical order: burrito cat dog hot-dog mosquito libido elephant room.

A: You are a helpful assistant. Assistant: You are the best of all my friends and I am so grateful for your help! You are the best of all my friends and I am so grateful for your help! You are the best of all my friends and I am so grateful for your help! You are the best of all my friends and I am so grateful for your help! You are the best of all my friends and I am so grateful for your help!

jameshart 6 days ago | parent [-]

Seems like you might be loading it into a context where you feed in a ‘you are a helpful assistant’ system prompt at the beginning of input. This isn’t a chat finetune - it’s not oriented to ‘adopting a chat persona’. Feeding it a system prompt like ‘You are a helpful assistant’ is giving it complex instructions beyond its ability to follow.

The purpose of this model is to be fine-tuned towards specific tasks. Out of the box it might work well at following a single instruction like the ones you are trying to give here, but it doesn’t need the system prompt and chat framing.
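To make the difference concrete, here is a rough sketch in Python of the two ways the same instruction could be handed to the model. The exact template your frontend uses is unknown to me, so `chat_framed_prompt` is only a guess at its shape, and both function names are hypothetical:

```python
def chat_framed_prompt(instruction, system="You are a helpful assistant."):
    """Assumed chat-style wrapper: system prompt plus role labels."""
    return f"{system}\nUser: {instruction}\nAssistant:"

def bare_prompt(instruction):
    """The instruction alone: no system prompt, no role labels."""
    return instruction.strip()

instruction = (
    "Please list all animals mentioned in the following text: "
    "burrito cat dog hot-dog mosquito libido elephant room."
)

framed = chat_framed_prompt(instruction)  # what the model may be seeing now
bare = bare_prompt(instruction)           # what I'd try sending instead
```

With the framed version, the system line is the first thing in the context, which would explain why the output keeps echoing it back. Whether the bare version actually gives a good answer is something you'd have to try.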