• ArtificialHoldings@lemmy.world
    15 hours ago

    This has been the legal argument behind AI training sets ever since developers began collecting datasets. The US Copyright Office heard these arguments in 2023: https://www.copyright.gov/ai/listening-sessions.html

    MR. LEVEY: Hi there. I’m Curt Levey, President of the Committee for Justice. We’re a nonprofit that focuses on a variety of legal and policy issues, including intellectual property, AI, tech policy. There certainly are a number of very interesting questions about AI and copyright. I’d like to focus on one of them, which is the intersection of AI and copyright infringement, which some of the other panelists have already alluded to.

    That issue is at the forefront given recent high-profile lawsuits claiming that generative AI, such as DALL-E 2 or Stable Diffusion, are infringing by training their AI models on a set of copyrighted images, such as those owned by Getty Images, one of the plaintiffs in these suits. And I must admit there’s some tension in what I think about the issue at the heart of these lawsuits. I and the Committee for Justice favor strong protection for creatives because that’s the best way to encourage creativity and innovation.

    But, at the same time, I was an AI scientist long ago in the 1990s before I was an attorney, and I have a lot of experience in how AI, that is, the neural networks at the heart of AI, learn from very large numbers of examples, and at a deep level, it’s analogous to how human creators learn from a lifetime of examples. And we don’t call that infringement when a human does it, so it’s hard for me to conclude that it’s infringement when done by AI.

    Now some might say, why should we analogize to humans? And I would say, for one, we should be intellectually consistent about how we analyze copyright. And number two, I think it’s better to borrow from precedents we know that assumed human authorship than to invent the wheel over again for AI. And, look, neither human nor machine learning depends on retaining specific examples that they learn from.

    So the lawsuits that I’m alluding to argue that infringement springs from temporary copies made during learning. And I think my number one takeaway would be, like it or not, a distinction between man and machine based on temporary storage will ultimately fail maybe not now but in the near future. Not only are there relatively weak legal arguments in terms of temporary copies, the precedent on that, more importantly, temporary storage of training examples is the easiest way to train an AI model, but it’s not fundamentally required and it’s not fundamentally different from what humans do, and I’ll get into that more later if time permits.

    The “temporary storage” idea is pretty central for visual models like Midjourney or DALL-E, whose training sets are full of copyrighted works lol. There is a legal basis for temporary storage too:

    The “Ephemeral Copy” Exception (17 U.S.C. § 112 & § 117)

    U.S. copyright law carves out specific exceptions for temporary copies made as part of certain technological processes.
    Section 117 lets the owner of a copy of a computer program make a further copy when that is an essential step in running the program.
    Section 112 lets broadcasters and other transmitting organizations make ephemeral recordings of works they are already entitled to transmit.
    
      • ArtificialHoldings@lemmy.world
        8 hours ago

        Copyright law doesn’t cover recipes - a recipe like that is only protected as a “trade secret”. But the approximate recipe for Coca-Cola is well known and can be googled.

    • ArtificialHoldings@lemmy.world
      12 hours ago

      BTW, if anyone is interested - many visual models use the same training set, collected by a German non-profit: https://laion.ai/

      It’s “technically not copyright infringement” because the set is just a list of links to images, each paired with a text description of that image. Because the set only points to the images instead of hosting copies, whoever distributes it isn’t handing out the copyrighted works themselves.
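
      To make the "link plus description" idea concrete, here is a rough sketch of what one row of such a dataset could look like. The class name, field names, and example.com URLs are made up for illustration and are not the dataset's actual schema.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ImageTextPair:
    """One hypothetical dataset row: a pointer to an image plus its description."""
    url: str      # where the image is hosted; the image itself is not stored
    caption: str  # text description paired with that image


def stores_image_data(pair: ImageTextPair) -> bool:
    """Return True if any field of the row holds raw bytes (i.e. actual image data)."""
    return any(
        isinstance(getattr(pair, f.name), (bytes, bytearray))
        for f in fields(pair)
    )


# Placeholder rows; a real set of this kind would have a very large number of them.
SAMPLE = [
    ImageTextPair("https://example.com/cat.jpg", "a cat sitting on a sofa"),
    ImageTextPair("https://example.com/dog.png", "a dog running on a beach"),
]

for pair in SAMPLE:
    print(pair.url, "|", pair.caption, "| stores image data:", stores_image_data(pair))
```

      Every row is plain text, so anyone who wants to train on the set still has to download each image from its URL themselves.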