Report: Potential NYT lawsuit could force OpenAI to wipe ChatGPT and start over

wanderingmagus@lemm.ee · 1 year ago

walrusintraining@lemmy.world · 1 year ago

It’s not like AI is using works to create something new. ChatGPT is like someone buying 10 different books, putting them into one book as a collection of stories, then mass-producing and selling the “new” book. It’s the same thing, just much more convoluted.

PupBiru@kbin.social · 1 year ago

it’s not even close to that black and white… i’d say it’s a much more grey area:

it’s possibly more like buying a bunch of books by the same author and emulating their style… that’s perfectly acceptable until you start using their characters

if you wrote a research paper about the linguistic and statistical features that make up an author’s style, that also wouldn’t be a problem

so there’s something beyond just the author’s “style” that they think is being infringed. we need to sort out exactly where the line is. what’s the extension to these 2 ideas that makes training an LLM a problem?

lily33@lemm.ee · 1 year ago

Except it’s not a collection of stories, it’s an amalgamation - and at a very granular level at that. For instance, take the beginning of a sentence from the middle of the first book, then switch to a sentence in the third, then finish with another part of the original sentence. Change some words here and there, and add one for good measure (based on some sentence in the seventh book). Then fix the grammar. All the while, make sure there’s some continuity between the sentences you’re stringing together.

That counts as “new” for me. And a lot of stuff humans do isn’t more original.

legion02@lemmy.world · 1 year ago

Maybe the bigger argument against free-rein training is that you’re attributing personal rights to a language model. Also, even people aren’t completely free (legally) to derive things from memory, which is why clean-room design is a thing.

Veraxus@kbin.social · edited 1 year ago

“Chatgpt is similar to if someone were to buy 10 copies of different books, put them into 1 book as a collection of stories, then mass produce and sell the ‘new’ book”

That is not even close to correct. LLMs are little more than massively complex webs of statistics. Here’s a basic primer:

https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/
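
To make the "web of statistics" idea a bit more concrete, here is a toy sketch. It is a plain bigram counter, which is vastly simpler than an actual LLM (no neural network, no context beyond one word), so treat it only as an illustration of "predict the next word from counts" and not as how ChatGPT is built. The function names and the example sentence are made up for this post.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def next_word_probabilities(follows, word):
    """Turn the raw follow-counts for `word` into probabilities."""
    counts = follows.get(word, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # word never seen, or never followed by anything
    return {w: c / total for w, c in counts.items()}


# Hypothetical one-sentence "training set", just for illustration.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)
probs = next_word_probabilities(model, "the")
print(probs)  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

What the toy model stores is a table of counts, not the sentence itself. With a training set this tiny the counts give away most of the original text, though, so the sketch says nothing either way about how much a real model can reproduce from its training data.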