• thedeadwalking4242@lemmy.world
    arrow-up
    10
    arrow-down
    9
    ·
    8 hours ago

    Just a heads up for anyone who may use this in an argument. I just tested on several models and the generated response accounted for the logical fallacy. Unfortunately it isn’t real.

    (Funny nonetheless)

    • Axolotl@feddit.it
      arrow-up
      40
      ·
      edit-2
      7 hours ago

      Tested on GPT-5 mini and it’s real tho?

      Edit: Gemini gives different results

      • xthexder@l.sw0.com
        arrow-up
        3
        ·
        2 hours ago

        Bold of Gemini to imply any sort of liability for what it says. Google’s lawyers really don’t want that to be the case.

      • troglodytis@lemmy.world
        arrow-up
        1
        ·
        edit-2
        3 hours ago

        Gemini got jokes, but why does it think walking emits zero carbons? Humans are carbon emitters, more so with exercise. Hell, I farted while giggling at its humor.

        Much less carbon than the car? Yep. Zero? Nope

      • Ephera@lemmy.ml
        English
        arrow-up
        30
        ·
        7 hours ago

        Man, I really hate how much they waffle. The only valid response is “You have to drive, because you need your car at the car wash in order to wash it”.

        I don’t need an explanation of what kind of problem it is, nor a breakdown of the options. I don’t need a bullet-point list of arguments. I don’t need pros and cons. And I definitely don’t need a verdict.

      • thedeadwalking4242@lemmy.world
        arrow-up
        3
        arrow-down
        1
        ·
        edit-2
        7 hours ago

        I used paid models, which are the only ones the LLM bros will care about. Even they kinda know not to glaze the free models. So not surprising.

        (I have to have the paid models for work; my lead developer is an LLM nut)

    • Mniot@programming.dev
      English
      arrow-up
      9
      ·
      8 hours ago

      With these, it’s basically impossible to tell whether the example is totally fabricated; true, but only happens a small percentage of the time; true and happens most of the time, but you got lucky; or true and reliable, but the company has since patched this specific case because it blew up online.