‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says::Pressure grows on artificial intelligence firms over the content used to train their products

  • flop_leash_973@lemmy.world
    link
    fedilink
    English
    arrow-up
    72
    arrow-down
    2
    ·
    edit-2
    9 months ago

    If it ends up being OK for a company like OpenAI to commit copyright infringement to train their AI models it should be OK for John/Jane Doe to pirate software for private use.

    But that would never happen. Almost like the whole of copyright has been perverted into a scam.

    • tinwhiskers@lemmy.world
      link
      fedilink
      English
      arrow-up
      15
      arrow-down
      21
      ·
      9 months ago

      Using copyrighted material is not the same thing as copyright infringement. You need to (re)publish it for it to become an infringement, and OpenAI is not publishing the material made with their tool; the users of it are. There may be some grey areas for the law to clarify, but as yet, they have not clearly infringed anything, any more than a human reading copyrighted material and making a derivative work.

      • hperrin@lemmy.world
        link
        fedilink
        English
        arrow-up
        13
        arrow-down
        3
        ·
        9 months ago

        It comes from OpenAI and is given to OpenAI’s users, so they are publishing it.

        • linearchaos@lemmy.world
          link
          fedilink
          English
          arrow-up
          5
          arrow-down
          3
          ·
          9 months ago

          It’s being mishmashed with a billion other documents just like to make a derivative work. It’s not like open hours giving you a copy of Hitchhiker’s Guide to the Galaxy.

          • hperrin@lemmy.world
            link
            fedilink
            English
            arrow-up
            3
            arrow-down
            2
            ·
            9 months ago

            New York Times was able to have it return a complete NYT article, verbatim. That’s not derivative.

            • Fraubush@lemm.ee
              link
              fedilink
              English
              arrow-up
              4
              arrow-down
              1
              ·
              9 months ago

              I thought the same thing until I read another perspective into it from Mike Masnick and, from what he writes, it seems pretty clear they manipulated ChatGPT with some very specific prompts that someone who doesn’t already pay NYT for access would not be able to do. For example, feeding it 3 verbatim paragraphs from an article and asking it to generate the rest if you understand how these LLMs work, its really not surprising that you can indeed force it to do things like that but it’s an extreme and I’m qith Masnick and the user your responding to on this one myself.

              I also watched most of today’s subcommittee hearing on AI and journalism. A lot of the arguments are that this will destroy local journalism. Look, strong local journalism is some of the most important work that is dying right now. But the grave was dug by these large media companies and hedge funds that bought up and gutted those local news orgs and not many people outside of the industry batted an eye while that was happening. This is a bit of a tangent but I don’t exactly trust the giant headgefunds who gutted these local news journalists ocer the padt deacde to all of a sudden care at all about how important they are.

              Sorry fir the tangent butbheres the article i mentioned thats more on topic - http://mediagazer.com/231228/p11#a231228p11

              • hperrin@lemmy.world
                link
                fedilink
                English
                arrow-up
                3
                arrow-down
                2
                ·
                9 months ago

                So they gave it the 3 paragraphs that are available publicly, said continue, and it spat out the rest of the article that’s behind a paywall. That sure sounds like copyright infringement.

      • Syntha@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        2
        ·
        9 months ago

        Insane how this comment is downvoted, when, as far as a I’m aware, it’s literally just the legal reality at this point in time.