• RIotingPacifist@lemmy.world
    arrow-up
    3
    arrow-down
    1
    ·
    8 hours ago

    Seems like the easiest fix is to consider the output of LLMs to be a derivative work of the training data.

    No need for a new license: if you’re training a model on GPL code, the code it produces is GPL.

    • Ferk@lemmy.ml
      arrow-up
      1
      ·
      edit-2
      4 hours ago

      You are not gonna protect abstract ideas using copyright. Essentially, what he’s proposing implies turning this “TGPL” into some sort of viral NDA, which is a different category of contract.

      It’s harder to convince someone that a content-focused license like the GPLv3 also protects abstract ideas than to create a new form of contract/license designed specifically to keep abstract ideas (not just the content itself) from spreading in ways you don’t want.

      • RIotingPacifist@lemmy.world
        arrow-up
        1
        ·
        4 hours ago

        LLMs don’t have anything to do with abstract ideas; they quite literally produce derivative content based on their training data & prompt.

        • Ferk@lemmy.ml
          arrow-up
          1
          ·
          edit-2
          3 hours ago

          LLMs abstract information from the content through an algorithm. What they store is not the content itself but the result of a series of tests/analyses on it: a set of characteristics/ideas. If that’s derivative, then ALL abstract ideas are derivative. It’s not possible to make abstractions without collecting data derived from a source you are observing.

          If derivative abstractions were already something that copyright could protect, then nobody would have needed to create patents, etc.

    • Joe@discuss.tchncs.de
      arrow-up
      3
      ·
      7 hours ago

      Let me know if you convince any lawmakers, and I’ll show you some lawmakers who are about to be invited to expensive “business” trips and lunches by lobbyists.

      • RIotingPacifist@lemmy.world
        arrow-up
        1
        ·
        7 hours ago

        The same can be said of the approach described in the article: the “GPLv4” would be useless unless the resulting weights are considered a derivative work.

        A paint manufacturer can’t claim copyright on paintings made using that paint.

        • Joe@discuss.tchncs.de
          arrow-up
          4
          ·
          edit-2
          6 hours ago

          Indeed. I suspect it would need to be framed around national security and national interests to have any realistic chance of success. AI is seen as a necessity for the future of many countries: embrace it, or be steamrolled later by those who did. So a soft-touch approach is being taken.

          Copyright and licensing uncertainty could hinder that, and the status quo today in many places is to not treat training as copyright infringement (e.g. US), or to require an explicit opt-out (e.g. EU). A lack of international agreements means it’s all a bit wishy-washy, and hard to prove and enforce.

          Things get (only slightly) easier if the material is behind a terms-of-service wall.