Found this post super informative as it relates to Mastodon, and thought Lemmy might also benefit from this perspective. I’m not sure I share his optimism, but his points seem sound enough to quiet some of the alarm bells over Meta joining the Fediverse.
As has been mentioned before, Meta can scrape most data from the Fediverse already as it is publicly available.
One strategy could be to default to publishing to followers only rather than publicly. It would be a great loss for the open web, but it might be a necessary one to make sure blocked instances do not get access to most of our data.
Another solution could be to publish all posts under a Non-Commercial Creative Commons 4.0 license, which I assume would legally block Meta from using our content in any context, as they earn piles of cash by mixing user-generated content with ads. Not sure if they would respect it, but it might give us the option of a class-action lawsuit in the EU?
Actually, the copyright option might be the best one. In theory, the instance would need to state that all work is only licensed, and that the copyright on every comment and post is retained by the creator/OP.
It’s just a simple tweak of the terms of service, but that would be enough to do it. Getting them to respect it is another ball game, because as we’ve seen with Midjourney and other photo apps, they have clearly scraped watermarked photos that they didn’t have the rights to, and have used them both to train their models and in the final output. This is why there was discussion of a class-action lawsuit, although I didn’t hear where that ended up going.
I’m hoping that this happens irrespective of other steps that may need to be taken with respect to Meta or other corporate interests in the Fediverse. Since the data is all completely public, it would help clarify “ownership” of original content, allow for meme culture and virality to continue to occur, but still give some avenue for people to raise claims against these large entities.
Someone is eventually going to try to marry a blockchain to this tech so that there’s a permanent record of content, with receipts going back to the beginning. Privacy concerns all over the place, but it seems like such a natural extension of the already completely public nature of the content being generated throughout the Fediverse.
Is their main data source not the direct user interaction with their website or app?
They don’t know the identity of every Fediverse user. Having a profile on a Fediverse user isn’t worth anything unless it can be linked directly to a web browser used by that person; unless they can show you ads, they don’t make any money.
The Mas.to admin claims to have “blocked Meta’s domains” already. Maybe that could prevent the scraping? Maybe not.
Another option is to be able to create aliases on demand for different things you participate in. Just accept that it’s going to be scraped at some point.