• Throwaway4669332255@lemmy.world
    English
    arrow-up
    2
    ·
    1 year ago

    What’s crazy to me is that they use tons of copyrighted data for training, yet put in a statement saying you can’t use Llama outputs to train other models.

  • drkt@feddit.dk
    English
    arrow-up
    1
    ·
    1 year ago

    They’re capitalizing on all the free labor being done on local models.

    • wagesj45@kbin.social
      arrow-up
      2
      ·
      1 year ago

      That wouldn’t even bother me so much if they weren’t trying to lock people into the “llama ecosystem” by restricting its use to only improving other llama models.

      • rufus@discuss.tchncs.de
        English
        arrow-up
        1
        ·
        1 year ago

        I’m sorry for repeating myself, but didn’t Meta just stop disclosing the exact training dataset? Presumably because they’re using copyrighted data from the internet? Isn’t that hypocritical? IMHO we need laws, and/or companies need to stop disregarding copyright when training their own models and then claiming copyright once other people start doing the same thing.

        • wagesj45@kbin.social
          arrow-up
          3
          ·
          1 year ago

          Personally, I don’t think copyright holders really have a leg to stand on as far as that goes. Simply having and using a copyrighted work isn’t a violation, and the work that is produced in the form of a trained neural network is the very definition of transformative. I also think Meta would have the same issue trying to use a copyright claim against someone using Llama output to improve other, non-Llama models. That’s why they had to slip it into the terms of service.

          I guess what you might see going forward is every book that’s published comes with a user agreement you agree to by opening the book… But that doesn’t sound practical in any sense.