I’ve tried coding, and every model I’ve tried fails at anything beyond really, really basic small functions, the kind you learn as a newbie. Compare that to, say, 4o mini, which can spit out more sensible stuff that works.

I’ve tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.

So, what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can’t really handle anything above 4B in a timely manner. 8B is about 1 t/s!

  • shnizmuffin@lemmy.inbutts.lol
    4 months ago

    Hey, you’re treating that data with the respect it demands, right? And you definitely collected consent from those chat participants before you Hoover’d up their [re-reads example] extremely Personal Identification Information AND Personal Health Information, right? Because if you didn’t, you’re in violation of a bunch of laws and the Twitch TOS.

    • CrayonDevourer@lemmy.world
      4 months ago

      If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing. If you wanna spew your personal life on Twitch, there are bots that listen to all of the channels everywhere on Twitch. They aren’t violating any laws, or Twitch TOS. So, *buzzer* WRONG.

      Right now, the same thing is being done to you on Lemmy. And Reddit. And Facebook. And everywhere else.

      Look at a bot called “FrostyTools” for Twitch. It reads Twitch chat and uses an AI to provide summaries of chat every 30 minutes or so. If that’s not violating TOS, then neither am I. And thousands upon thousands of people use FrostyTools.

      I have the consent of the streamer, I have the consent of Twitch (through their developer API), and upon using Twitch, you give the right to them to collect, distribute, and use that data at their whim.

      • aksdb@lemmy.world
        4 months ago

        > So, *buzzer* WRONG.

        Quite arrogant after you just constructed a faulty comparison.

        > If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing.

        That’s absolutely not the same thing. Overhearing something that is in the background is fundamentally different from actively recording everything going on in a public space. You film yourself or some performance in a park and someone happens to be in the background? No problem. You build a system to identify everyone in the park and collect recordings of their conversations? Absolutely a problem, depending on the jurisdiction. The intent of the recording(s) and the reasonable expectations of the people recorded are factored in in many jurisdictions, and being in public doesn’t automatically entail consent to being recorded.

        See for example https://www.freedomforum.org/recording-in-public/

        (And just to clarify: I am not arguing against your explanation of Twitch’s TOS, only against the bad comparison you brought.)

        • kattfisk@lemmy.dbzer0.com
          4 months ago

          You’re both getting side-tracked by this discussion of recording. The recording is likely legal in most places.

          It’s the processing of that unstructured data to extract and store personal information that is problematic. At that point you go from simply recording a conversation of which you are a part, to processing and storing people’s personal data without their knowledge, consent, or expectation.

          • aksdb@lemmy.world
            4 months ago

            True.

            Although in Germany, for example, it can also be an issue when recording. If you have a security camera pointed at a public space (that can include the sidewalk in front of your house), passersby can sue you to take it down and potentially get you fined. Even pretending to constantly record such an area can yield that result.

            • tfm@europe.pub
              4 months ago

              I’m not a lawyer but I suppose it would depend on the ToS and if the user agrees to the recording and processing. But if it allows the extraction of the real identity of the user it’s probably a GDPR issue.

          • David J. Atkinson@c.im
            4 months ago

            @kattfisk That seems to imply that you cannot personally listen to or watch recordings that you have made in public. In doing so, you are abstracting personal details that you might have missed before, refreshing your memory, and so on. What is the material difference between you doing this without machine help versus with automation that makes it ethically problematic? What if a friend helped you, not a machine?

            • shnizmuffin@lemmy.inbutts.lol
              4 months ago

              > What is the material difference between you doing this without machine help versus with automation that makes it ethically problematic?

              Object permanence, perfect recall, data security and consent. It’s the difference between seeing someone naked vs taking a picture of someone naked.

              Regardless - users, streamers, and developers are all prohibited from scraping and storing the Twitch chat.

        • CrayonDevourer@lemmy.world
          4 months ago

          > You build a system to identify everyone in the park and collect recordings of their conversations? Absolutely a problem, depending on the jurisdiction.

          Literally not. The police use this right now to record your location and time seen using license plates all over the nation - with private corporations providing the service.

          > and being in public doesn’t automatically entail consent to being recorded.

          And yes, it’s called ‘expectation to the right of privacy’. Public venues are not ‘private’ locations, and thus do not need consent. You can, quite literally, record anyone in public.

          Even the link you provided agrees.

          • tfm@europe.pub
            4 months ago

            In the US maybe but not in Germany, Austria and probably most countries in Europe.

      • catty@lemmy.world (OP)
        4 months ago

        Doesn’t Twitch own all the data that is written there? Their TOS will probably state something like you can’t store the data yourself locally.

        • CrayonDevourer@lemmy.world
          4 months ago

          I’m not storing their data. I’m feeding it to an LLM which infers things and storing that data. Other Twitch bots store twitch data too. Everything from birthdays to imaginary internet points.

            • CrayonDevourer@lemmy.world
              4 months ago

              There’s not actually that much code. It’s like 8 lines for an AI ‘agent’, and maybe another 16 lines for ‘tools’. I’m using Streamlink for grabbing the audio stream, and PulseAudio has a ‘monitor’ device you can use to listen to what’s playing on the speakers. Throw it on a very minimal Linux distro on a VM, and that’s it.

              I don’t do ‘vibe coding’, but that IS where I got the idea from. People who are doing ‘vibe coding’ nowadays aren’t just plugging things into a generic AI; they’re spinning up ‘agents’ and making tools via MCP, and then those agents are tasked with specific things and use the tools to directly write to files, search the internet, read documents, etc.

              • tfm@europe.pub
                4 months ago

                I’d also consider writing a script with AI, which you don’t understand, as vibe coding. Basically if you wouldn’t be able to do it on your own it’s vibe coding.

      • shnizmuffin@lemmy.inbutts.lol
        4 months ago

        Let’s take a look at the Developer Agreement that you cited:

        > You must only retain chat logs as long as necessary for the operation of Your Services or to improve Your Services; do not do so for the purpose of creating public databases or websites, or, in general, to collect information about Twitch’s end users. You must enable, and process, all requests by end users to block, discontinue, delete, or otherwise opt-out of any retention of chat logs for Your Services.

        This very clearly states that you are disallowed from retaining chat logs for the general purpose of collecting information about Twitch’s end users.

        You said that you, “store ‘facts’ about specific users so that they can be referenced quickly,” but then later in a different thread state, “I’m not storing their data. I’m feeding it to an LLM which infers things and storing that data.” You’re retrieving information about specific users at a later time. You’ve built a database of structureless PII from chat logs. You’ve chosen to store the data as inferences, which makes it a bad database, but still a database.

        I have questions:

        When your streamer mentions something deeply personal, like, “how their mothers surgery went,” that your tool helped them remember, do they disclose that your tool was involved in that transaction? When the viewer gets weirded out and asks your streamer to not mention that again, or forget it entirely, do you have a way to remove that information from your database and a way to prove it’s been deleted? When other people in chat think it’s gross, and ask to opt-out, can you even do it?
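
        For what it’s worth, here is a minimal sketch of what honoring a delete/opt-out request could look like for a store of inferred notes. Everything in it is an assumption for illustration: the `ReminderStore` class, its method names, and the in-memory dict are hypothetical and say nothing about how the tool under discussion actually works.

```python
from dataclasses import dataclass, field


@dataclass
class ReminderStore:
    """Hypothetical in-memory store of per-user notes inferred from chat."""

    notes: dict[str, list[str]] = field(default_factory=dict)
    opted_out: set[str] = field(default_factory=set)

    def remember(self, user: str, note: str) -> bool:
        """Keep a note unless the user has opted out; report whether it was kept."""
        key = user.lower()
        if key in self.opted_out:
            return False
        self.notes.setdefault(key, []).append(note)
        return True

    def forget(self, user: str) -> int:
        """Delete everything held about a user and block future retention."""
        key = user.lower()
        removed = len(self.notes.pop(key, []))
        self.opted_out.add(key)
        return removed

    def export(self, user: str) -> list[str]:
        """Return a copy of what is currently held about a user."""
        return list(self.notes.get(user.lower(), []))


store = ReminderStore()
store.remember("SomeViewer", "mother had surgery last week")
store.remember("SomeViewer", "birthday in June")
removed = store.forget("SomeViewer")  # number of notes deleted
```

        Note what this does not cover: proving deletion. Backups, logs, and anything already sitting in an LLM’s context or prompt history would need their own handling.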


        Regarding FrostyTools: I don’t think it’s storing the chat logs for a later time. They don’t have a data retention section in their TOS or Privacy Policy that isn’t related to the streamer. (As in, they hold on to the streamer’s Twitch account and some other information for billing, authentication, etc.) I think it’s taking the chat logs only for as long as it needs to output a response and then deleting it. Also, this excerpt from the FrostyTools TOS made me chuckle:

        > This means that you, and not FrostyTools, are entirely responsible for all Content that you upload, post, email, transmit, stream, or otherwise make available via the Service. FrostyTools does not control the Content posted via the Service and, as such, does not guarantee the accuracy, integrity or quality of such Content. You understand that by using the Service, you may be exposed to Content that is offensive, indecent or objectionable. Under no circumstances will FrostyTools be liable in any way for any Content, including, but not limited to, any errors or omissions in any Content, or any loss or damage of any kind incurred as a result of the use of any Content posted, emailed, transmitted, streamed, or otherwise made available via the Service.

        > You agree that you must evaluate, and bear all risks associated with, the use of any Content, including any reliance on the accuracy, completeness, or usefulness of such Content. In this regard, you acknowledge that you may not rely on any Content created by the Service or submitted to the Service.

        This leads me to believe that you can violate the Twitch TOS quoted above using FrostyTools. It is apparent that FrostyTools has positioned itself as an application that creates User Generated Content (like Photoshop or Word).

        • CrayonDevourer@lemmy.world
          4 months ago

          > You must only retain chat logs as long as necessary for the operation of Your Services or to improve Your Services

          I’m not storing chat logs.

          > do not do so for the purpose of creating public databases or websites, or, in general, to collect information about Twitch’s end users.

          Not creating any kind of public database either. It’s a private tool. Its purpose isn’t to massively-collect data about all of twitch either - it’s to provide reminders for social situations. If anything, it’s an accessibility tool for the disabled.

          > You must enable, and process, all requests by end users to block, discontinue, delete, or otherwise opt-out of any retention of chat logs for Your Services.

          Again - not storing chat logs. They are processed for information, and that information is inferred. I am storing reminders for the Twitch streamer to talk about a certain subject at a certain time. If I put a reminder in my phone to remember to tell you happy birthday because I saw it on Twitch, am I “creating a database of user information”? No. I’m creating a reminder for myself to remember to say happy birthday.

          Having a computer help me remember those things isn’t a violation. Hell, even something like Microsoft’s new AI in Windows does the same thing - are THEY violating Twitch TOS when you have a browser window open? The answer is no.

          > When your streamer mentions something deeply personal, like, “how their mothers surgery went,” that your tool helped them remember, do they disclose that your tool was involved in that transaction?

          No, nor should they be required to.

          > When the viewer gets weirded out and asks your streamer to not mention that again, or forget it entirely, do you have a way to remove that information from your database and a way to prove it’s been deleted? When other people in chat think it’s gross, and ask to opt-out, can you even do it?

          When they mention not wanting to talk about something, that’s listed as something they don’t like to talk about, so in a way, yes.

          Additionally, I instruct the ‘agent’ to disregard anything political or religious, though so far it’s not very good at distinguishing those things. It’s also easy to feed it false information, though it usually corrects that over time.

      • interdimensionalmeme@lemmy.ml
        4 months ago

        There is no expectation of privacy in public spaces. Participants in these streams, which are open to all, are not prohibited from repeating what they have heard.

        • kattfisk@lemmy.dbzer0.com
          4 months ago

          Repeating what they heard is very different from automatically processing the chat to harvest personal information about the participants.

          Just because some data is publicly available doesn’t mean all processing of that data is legal and moral.

          • interdimensionalmeme@lemmy.ml
            4 months ago

            It is qualitatively equivalent. Any single piece of information could have been copied, so it is safe to assume it has all been copied.

            Although I would be on board with supporting an expectation of privacy in public spaces and making private CCTV recording illegal.

        • carl_dungeon@lemmy.world
          4 months ago

          Right, and what I was saying was that even if it wasn’t “public”, single-party consent means the person recording can be that single party, so it’s still a non-issue.