• Saik0@lemmy.saik0.com
      15 days ago

      So I keep seeing people reference this… And I found it curious that LLMs would have problems with it. So I asked them… Several of them…

      Outside of this image… Codestral (my default) actually got it correct and didn’t talk itself out of being correct… But that’s no fun, so I asked 5 others at once.

      What’s sad is that Dolphin Mixtral is a 26.44GB model…
      Gemma 2 is the 5.44GB variant
      Gemma 2B is the 1.63GB variant
      LLaVa Llama3 is the 5.55GB variant
      Mistral is the 4.11GB variant

      So I asked Codestral again because why not! And this time it talked itself out of being correct…

      Edit: fixed newline formatting.

      • realitista@lemm.ee
        14 days ago

        Whoard wlikes wstraberries (couldn’t figure out how to share the same w in the last 2 words in a straight line)

      • werefreeatlast@lemmy.world
        15 days ago

        LOL 😆😅! I totally made it up! And it worked! So maybe it’s not just R’s that it has trouble counting. It’s any letter at all.

      • Regrettable_incident@lemmy.world
        15 days ago

        Interesting… I’d say Gemma 2B wasn’t actually wrong - it just didn’t answer the question you asked! I wonder if they have this problem with other letters - like maybe it’s something to do with how we say w as double-you… But maybe not, because they seem to be underestimating rather than overestimating. But yeah, I guess the fuckers just can’t count. You’d think a question using the phrase ‘How many…’ would be a giveaway that they might need to count something rather than rely on their knowledge base.
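
For contrast, here is roughly what "actually counting" looks like when done deterministically. This is only a minimal sketch: the helper name `count_letter` is made up for illustration, and the word "strawberry" is an assumption about which question was being asked, since the thread never spells it out.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    # Hypothetical helper: lowercase both sides so "R" and "r" match.
    return word.lower().count(letter.lower())


# Assumed example word; the thread only mentions counting R's and W's.
print(count_letter("strawberry", "r"))  # 3
print(count_letter("strawberry", "w"))  # 1
```

The point of the comparison is that a character-level count is trivial for code, whereas the models in the thread appear to answer from recall instead of stepping through the letters.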

        • Saik0@lemmy.saik0.com
          14 days ago

          “I’d say Gemma 2B wasn’t actually wrong”

          I call that talking itself out of being correct.