‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says::Pressure grows on artificial intelligence firms over the content used to train their products

  • Exatron@lemmy.world
    English
    8 months ago

    The problem is that a human doesn’t absorb exact copies of what they learn from, and fair use doesn’t include taking entire works, shoving them in a box, and shaking it until something you want comes out.

    • S410@lemmy.ml
      English
      8 months ago

      Except for all the cases when humans do exactly that.

      A lot of learning is, really, little more than memorization: spelling of words, mathematical formulas, physical constants, etc. But, of course, those are pretty small, so they don’t count?

      Then there are things like sayings, which are entire phrases that only really work if they’re repeated verbatim. You sure can deliver the same idea using different words, but it’s not the same saying at that point.

      To make a cover of a song, for example, you have to memorize the lyrics and melody of the original, exactly, to be able to re-create it. If you want to make that cover in the style of some other artist, you, obviously, have to learn their style: that is, analyze and memorize what makes that style unique. (e.g. C418 - Haggstrom, but it’s composed by John Williams)

      Sometimes the artists don’t even realize they’re doing exactly that, so we end up with “subconscious plagiarism” cases, e.g. Bright Tunes Music v. Harrisongs Music.

      Some people, like Stephen Wiltshire, are very good at memorizing and replicating certain things; way better than you, me, or even current machine learning systems. And for that they’re praised.