AI may have found its niche as entertainment in its own right
Some of us have no sense of humour, you insensitive clod!
Let's not conflate speculations on the implied mechanism used for providing summaries by AI with requirements of correctness of summaries implied by consideration in a legal service agreement.
However, open source is not about giving out a model for cheap/free to whoever asks. It is about giving away the foundations that allow complete duplication, so that other members of humanity, smarter or more informed, can contribute and/or branch away from the work.
The cost of training is irrelevant. It merely reflects the low quality of the processes and ideas that are being used by the companies that currently build them. It's by sharing the raw materials and allowing others to solve the same problems better that efficiency and progress are made.
The current paradigms of pretraining, fine-tuning, transfer learning, etc. lead to an enforced conceptual modularity that is just a way to embed a middleman economy into the science: some provider takes care of data for others, builds a foundation model for others, and they can tinker on top of that. It is counterproductive and scientifically a dead end, while giving you the feeling of progress that comes from taking psychological ownership of the full system when all you've done is tinker at the edge by specializing an existing model.
You don't get anything new that way, only epsilon variations on an existing body of work. It's a dead end, because successful intelligences in the real world all around us do not need anywhere near the resources expended on AI, and intelligent biological systems do not function anywhere near the way these AI systems do. For example, nobody reads the whole internet just to be able to talk about a topic, and no animal brain works like a deep network.
If you want (scientific) progress, you must break out of the tinkerer mindset. Take the full set of preferred elements that build the full state of the art system, and be prepared to do radical surgery at any level that makes sense, because the current architectures are simply bad. You can't do that with existing "open" systems that lock you into these architectural paradigms and choices.
Your example of Olmo talks about openness, but I had a look at their website and I don't see a link to raw data archives. There are instructions on how to train a model, and they discuss a token data collection called Dolma 3. But tokens are not raw data; most of the implied information is already lost once you've tokenized. They do a good job of describing in detail their process for dataset curation on their GitHub page though, which deserves credit. It's worth reading, because it shows how their models are being locked into patterns that limit them from the get-go, long before the first weight is even being trained.
This will be great for Haiku, FreeBSD, and OpenBSD installs: there's not the remotest possibility there'll be binaries for these. Not because the software couldn't be ported, but because the sorts of people politicians hire to write software would never be able to figure out the installer.
I still can't get ChatGPT, Gemini, or Claude to write a decent story or do an engineering design beyond basic complexity. They're all improving, but they're best thought of as brain-storming aids rather than actual development tools.
Serifs are, in fact, for ease of reading, but point well taken. The justifications are wrong and the people making them are petty assholes.
It's true, serifs are for ease of reading
where the sun don't shine.
The Linux Foundation has always been kind of useless, but they're really outdoing themselves this time.
More like old vs new terribleness.
So you'd rather wait to fix the bigger social problems first before fixing smaller ones? I don't know, I think it's good to attack the smaller problems first. It makes you feel good about small victories, you gain experience with similar problems, and it prevents analysis paralysis. It also builds momentum; everyone likes a winner.
You also have to remember that minors aren't full people, they are legal dependents and censorship is the wrong word to use in this case. It is absolutely the right and obligation of guardians and governments to make decisions for them about what they can and cannot do on the Internet, among other things. The kids will grow up soon enough, and be free to choose by then.
"We Americans, we're a simple people... but piss us off, and we'll bomb your cities." -- Robin Williams, _Good Morning Vietnam_