• Dharma Curious (he/him)@slrpnk.net
    2 days ago

    I really appreciate that! I was asking more for the information of it, I doubt I could do anything with the link. Lol. I don’t understand thing 1 about this stuff. I don’t even know wtf a weight is in this context lol

    • edinbruh@feddit.it
      1 day ago

      In this context, “weight” is a mathematical term. Have you ever heard the term “weighted average”? Basically, it means calculating an average where some elements are more influential (more important) than others. The number that indicates the importance of an element is called a weight.
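      To make that concrete, here is a tiny sketch of a weighted average. The grades and weights are made-up numbers, and `weighted_average` is just an illustrative helper name, not anything from a library.

```python
def weighted_average(values, weights):
    """Average where each value counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical example: three grades, the last one counts double.
grades = [6.0, 8.0, 10.0]
weights = [1.0, 1.0, 2.0]

plain = sum(grades) / len(grades)            # every grade counts the same
result = weighted_average(grades, weights)   # the last grade pulls harder

print(plain, result)
```

      The plain average ignores importance, while the weighted one shifts toward the grade with the bigger weight.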

      One oversimplification of how any neural network works could be this:

      • The NN receives some values as input
      • The NN calculates many weighted averages from those values. Each average uses a different list of weights.
      • The NN does a simple special operation on each average. It’s not important what the operation actually is, but it must be there: without it, any stack of layers would be equivalent to a single layer. It can be almost anything, as long as it can’t be written using only sums and multiplications by constants (that is, it must be non-linear).
      • The modified averages are the input values for the next layer.
      • Each layer has different lists of weights.
      • In reality this is all done using some mathematical and computational tricks, but the basic idea is the same.
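      The steps in the list can be sketched in a few lines of plain Python. This is only a toy under some assumptions: the weights are made-up numbers, the "special operation" is assumed to be ReLU (replace negatives with zero), and, like real networks, it uses weighted sums rather than normalized averages and leaves out extras such as bias terms.

```python
def relu(x):
    """Assumed 'special operation': negative values become zero."""
    return max(0.0, x)


def layer(inputs, weight_lists):
    """One weighted sum per list of weights, each followed by the special operation."""
    return [
        relu(sum(i * w for i, w in zip(inputs, weights)))
        for weights in weight_lists
    ]


def network(inputs, layers):
    """The outputs of each layer become the inputs of the next one."""
    values = inputs
    for weight_lists in layers:
        values = layer(values, weight_lists)
    return values


# Hypothetical weights: each layer has its own lists of weights.
layers = [
    [[0.5, -1.0], [1.0, 1.0]],  # layer 1: 2 inputs -> 2 values
    [[1.0, 2.0]],               # layer 2: 2 values -> 1 output
]

output = network([2.0, 1.0], layers)
print(output)
```

      Everything the network "knows" lives in the numbers inside `layers`, which is why people talk about sharing the weights.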

      Training an AI means finding the weights that give the best results, and thus, for an AI to be open-source, we need both the weights and the training code that generated them.
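      As a rough idea of what "finding the weights" looks like, here is a minimal sketch that trains a single weight with gradient descent, one common way of doing it. The data, the learning rate and the number of steps are all made-up assumptions, and real training code is far bigger than this.

```python
# Hypothetical training data: pairs (input, expected output), here output = 3 * input.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

w = 0.0               # start from an arbitrary weight
learning_rate = 0.05  # assumed step size

for step in range(200):
    # Gradient of the mean squared error between w * x and the expected y.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    # Nudge the weight in the direction that reduces the error.
    w -= learning_rate * grad

print(w)  # ends up very close to 3
```

      The loop is the "training code", `data` is the "training data", and the final `w` is the "weight" that gets published.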

      Personally, I feel that we should also have the original training data itself to call it open source, not just weights and code.

      • MrMcGasion@lemmy.world
        1 day ago

        Absolutely agree that to be called open source the training data should also be open. It would also pretty much mean that true open source models would be ethically trained.

      • Dharma Curious (he/him)@slrpnk.net
        1 day ago

        Thank you!

        And yeah, it really does seem like the training data should be open. Like, not even just to be considered open source, just to be allowed to do this at all, ethically, the training data should be known, at least to some degree. Like, there’s so much shit out there, knowing what they trained on would help make some kind of ethical choice in using it