Title is self-explanatory. The benefits of this would be tremendous: if correctly trained and perfected, it would be the greatest tool to democratize knowledge about Marxism.
There are already several open-source large language models out there, but I think the biggest bottlenecks are the knowledge needed to deploy such models and the computing power needed to run them.
Thread to discuss this subject.
First off, as someone who has programmed GPT stuff since way before ChatGPT: we don’t even need to train our own model. That is overly expensive and unnecessary for our purpose. What is much smarter in this case is to take all of the Marxist works and let a chatbot access their contents using semantic search. The way we do this is to split the works into small chunks, which we then convert into embedding vectors. When the user sends a message to the chatbot, the message and its context are converted into an embedding vector as well. We then take the dot product between the vector of the user’s message and the vector of each chunk in order to find the chunks most relevant to the question the user has asked. A pre-trained model can then use the fetched chunks to answer the user’s question.
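To make the steps above concrete, here is a rough sketch of the retrieval part in plain Python. It is only an illustration under stated assumptions: `toy_embed` is a word-count stand-in for a real embedding model, the three `notes` are made-up placeholder summaries rather than quotations from any work, and all function names are hypothetical. In a real setup you would swap `toy_embed` for an actual embedding model and send the final prompt to a pre-trained chat model.

```python
import math
import re


def chunk_text(text, max_words=50):
    """Split a text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(chunks):
    """Assign each distinct token in the corpus one vector dimension."""
    vocab = {}
    for chunk in chunks:
        for token in tokenize(chunk):
            vocab.setdefault(token, len(vocab))
    return vocab


def toy_embed(text, vocab):
    """ASSUMPTION: stand-in for a real embedding model.
    Returns an L2-normalised word-count vector over the vocabulary."""
    vec = [0.0] * len(vocab)
    for token in tokenize(text):
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(question, chunks, chunk_vecs, vocab, k=2):
    """Return the k chunks whose vectors score highest against the question."""
    q = toy_embed(question, vocab)
    ranked = sorted(range(len(chunks)), key=lambda i: dot(q, chunk_vecs[i]), reverse=True)
    return [chunks[i] for i in ranked[:k]]


def build_prompt(question, context):
    """Assemble the text that would be sent to a pre-trained chat model."""
    joined = "\n".join("- " + c for c in context)
    return "Answer using only these excerpts:\n" + joined + "\n\nQuestion: " + question


# Placeholder summaries for illustration only, not quotations.
notes = [
    "Surplus value is the difference between the value a worker produces and the wage the worker is paid.",
    "The state is described as an instrument of class rule.",
    "Dialectics studies contradiction and change in nature and society.",
]

chunks = [c for note in notes for c in chunk_text(note)]
vocab = build_vocab(chunks)
chunk_vecs = [toy_embed(c, vocab) for c in chunks]

question = "What is surplus value?"
top = retrieve(question, chunks, chunk_vecs, vocab, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Because the vectors are normalised, the dot product here is the same as cosine similarity. With a real embedding model the ranking would reflect meaning rather than shared words, which is the whole point of semantic search, but the chunk, embed, rank, and prompt steps stay the same.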
Of course, training one’s own model can be good if we want it to be even more accurate and familiar with the material. However, a good starting point would be to use semantic search.
I’m imagining the bot starting to debate itself based on different theorists. There could be problems where Trotsky or Dühring get the last word because Lenin and Engels did not feel like giving them the dignity of a response. Alternatively, it might treat some theorists as more authoritative and uphold their bad takes against newer updates. What if there is a chronological hierarchy that leads to a Hoxhaist bot calling the USSR and China social-imperialist?
we need a model that debunks MSM articles using older MSM articles to offset the bullshit asymmetry principle