• fubo@lemmy.world
    1 year ago

    Oh sure, if a copyright holder can demonstrate that a specific work is reproduced. Not just “I think your AI read my book and that’s why it’s so good at carpentry.”

    • silence7@slrpnk.netOP
      1 year ago

      The thing is that they’re all reproduced, at least in part. That’s how these models work.

      • fubo@lemmy.world
        1 year ago

        Reproducing a work is a specific thing. Using an idea from that work, or a transformation of that idea, is not reproducing that work.

        Again: If a copyright holder can show that an AI system has reproduced the text (or images, etc.) of a specific work, they should absolutely have a copyright claim.

        But “you read my book, therefore everything you do is a derivative work of my book” is an incorrect legal argument. And when it escalates to “… and therefore I should get to shut you down,” it’s a threat of censorship.

        • silence7@slrpnk.netOP
          1 year ago

          The problem is that LLMs (and image AIs) effectively store pieces of works as correlations inside them and occasionally spit some of them back out. It’s not just that “it saw it”; it’s more “like a scrapbook with fragments of all these different works.”

          • fubo@lemmy.world
            1 year ago

            I’ve memorized some copyrighted works too.

            If I perform them publicly, the copyright holder would have a case against me.

            But the mere fact that I could recite those works doesn’t make everything that I say into a copyright violation.

            The copyright holder has to show that I’ve actually reproduced their work, not just that I’ve memorized it inside my brain.

            • silence7@slrpnk.netOP
              1 year ago

              The difference is that your brain isn’t a piece of media which gets copied. The AI is. So when it memorizes a work, it commits a copyright violation.

              • fubo@lemmy.world
                1 year ago

                If that reasoning held, then every web browser, search engine bot, etc. would be violating copyright every time it accessed a web page, because doing so involves making a copy in memory.

                Making an internal copy isn’t the same as publishing, performing, etc. a work.

                • silence7@slrpnk.netOP
                  1 year ago

                  There’s an implied license to copy content for the purpose of displaying it as a web page. Copies for other purposes… not so much. There has been a whole series of lawsuits over the years over just how much you can copy, and for what purpose.

                  • fubo@lemmy.world
                    1 year ago

                    There isn’t an “implied license”. Rather, copyright is simply not infringed until the work is actually republished, performed, etc. without the copyright holder’s permission.

                    Making internal in-memory copies (e.g. for search-engine indexing) is simply not an infringement to begin with, just as it’s not an infringement for me to memorize a copyrighted work, though it would be an infringement if I were to recite it in a public performance without permission.

                    Copyright simply does not grant the copyright-holder absolute & total control of everything downstream from the work. It restricts republishing, performing, etc.; it does not restrict memorization, indexing, summarizing in a review, answering questions about the work, etc.

                    Again: if the AI system is made to regurgitate the actual text of the work, that’s still a copyright infringement. But merely having learned from it is not.

              • conciselyverbose@kbin.social
                1 year ago

                No, it doesn’t. Learning from copyrighted material is black-and-white fair use.

                The fact that the AI isn’t intelligent doesn’t matter. It’s protected.

        • Cylusthevirus@kbin.social
          1 year ago

          A person reading and internalizing concepts is considerably different than an algo slurping in every recorded work of fiction and occasionally shitting out a bit of mostly Shakespeare. One of these has agency and personhood, the other is a tool.

      • FaceDeer@kbin.social
        1 year ago

        No, that’s not how these models work. You’re repeating the old saw about these being “collage machines”, which is a gross mischaracterization.