Comment The reality is otherwise (Score 1) 135

According to Mozilla, these services cost $10-$17 a month.

Should you be inclined to spend your time with an ML system acting as a girl/boy/fish/alien/[*] "friend" (or really, anything else you want to use GPT/LLM ML systems for), you can do so with no cost and zero data mining by using an open system such as GPT4All.

A lot of these "oh no" stories start from the premise that pay-to-use-and-suck-your-data ML systems are the entire space. They aren't. Furthermore, the open systems, while not as advanced as the largest commercial systems, are certainly advanced enough — and they are constantly improving. The number of usable models you can plug in is quite large; some of them are okay for commercial use; some aren't. All are fine for private use.
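To give a sense of how little is involved, here's a minimal sketch using the GPT4All Python bindings (the model filename is just an example; pick any model from the catalog that fits your machine's RAM):

    # Minimal local LLM session: no cloud, no account, no data mining.
    # Requires: pip install gpt4all
    from gpt4all import GPT4All

    # Example model; the first run downloads it, after which everything
    # runs entirely on your own hardware.
    model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")

    with model.chat_session():
        reply = model.generate("Explain how a CD drive reads a disc.",
                               max_tokens=200)
        print(reply)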

Same goes for generative image ML. No need to pay for that either; the open source community has made local, unencumbered systems readily available. Stable Diffusion is one of the base engines, and there are a number of ready-to-run applications built on it.
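For the image side, the local workflow is similarly short. Here's a sketch using the Hugging Face diffusers library (the checkpoint name is one common public example; assumes a machine with a suitable GPU or Apple Silicon):

    # Local text-to-image generation; nothing leaves your machine.
    # Requires: pip install torch diffusers transformers accelerate
    import torch
    from diffusers import StableDiffusionPipeline

    # One widely available public Stable Diffusion checkpoint.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")  # use "mps" on Apple Silicon

    image = pipe("a lighthouse at dusk, oil painting").images[0]
    image.save("lighthouse.png")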

The open ML applications I've actually tried (outside of my own, which is a different subject) run on desktop machines. I'm not aware of any phone-based systems as yet, and I suspect that if they did exist they'd be pretty constrained by the memory (and perhaps GPU) limits of such devices. However, there have been some recent developments in layer-by-layer processing that might enable a decent amount of functionality on more powerful phones, and that open the door to very large models on desktop machines.
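For what it's worth, the layer-by-layer idea reduces to something like this (an illustrative sketch only, not any particular project's implementation): keep just one layer's weights in memory at a time, streaming them from disk as the activations pass through.

    # Illustrative layer-at-a-time inference: trades speed for memory
    # by loading one layer's weights at a time instead of the whole model.
    # Assumes each layer was saved individually with torch.save().
    import torch

    def run_layerwise(layer_files, x):
        for path in layer_files:
            layer = torch.load(path)   # load only this layer's module
            with torch.no_grad():
                x = layer(x)           # push activations through it
            del layer                  # release memory before the next layer
        return x

Peak memory is roughly one layer plus the activations, rather than the whole model, which is why much larger models become feasible on constrained hardware.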

Just guessing, but I suspect most /. members have desktop machines capable of running these types of systems quite well.

Comment Data point (Score 3, Interesting) 70

sounds like marketing some CPU features a bit differently by throwing "AI" in there.

FWIW, all of Apple's M1-, M2-, and M3-based hardware already includes a dedicated neural processing unit (Apple's "Neural Engine") alongside the GPU and CPU cores. I have no idea how these units rate in terms of the performance measures cited in TFS, but they are present.

I will say this, too: my M1 Ultra's performance with local LLMs and generative image ML is pretty snappy. LLMs respond immediately upon query entry and produce about a paragraph every couple of seconds, and images are generated in about ten seconds. What's funny is that I don't think either application is even using the Neural Engine, just the CPUs and GPUs. I'm running GPT4All and DiffusionBee (which is built on Stable Diffusion.)

Comment Subscription-only software (Score 3, Interesting) 206

Legislation forbidding subscription-only applications would also be a good way to go.

If an operation wants to offer subscriptions, fine, as long as it also offers a reasonably priced, "one-time buy-and-it's-yours" option.

I don't buy subscription-only software. Fortunately — at least thus far — there's nothing out there I couldn't either write or find a replacement for.

My computers last a long time. When I upgrade machines (usually because I'm enthralled with the newer tech, either hardware or software that depends on a newer OS, not because the older hardware has failed), I stick with the same OS, which in turn has been pretty good (though not perfect, which is probably impossible) about making sure that applications written to older versions of it continue to work.

The entire idea that application X requires continuous validation/payment, lest it quit working or, worse, fail to read its data files, is anathema to me.

Comment Re:Animal issues (Score 1) 110

Which is stupid, since every predator on the planet gets to eat animals

This argument specifically validates humans eating other humans. 0/10. More generally, it says that "because other predators X without concern, we can X without concern." Also 0/10.

Comment Re:Bah (Score 1) 110

Superstitions — like fears of number 13 and black cats — are what people resort to, when religion is taken from them

Religion is superstition. 100% accurate.

such was your urge to attack the religious

You have confused a factual descriptor with an attack. 0/10.

Comment Animal issues (Score 2) 110

What creates the vegan need to shame

Well, there's that whole "an animal has to die for you" thing. It's a little difficult to put a similar weighting on those who don't eat animals, though it's worth noting that raising and harvesting vegetables tends to be pretty hard on vegetable-consuming wildlife, barring some really careful farming.

Personally, I think cultured meat offers an excellent path forward, presuming we can make it work. And it actually looks like that is going to happen.

Comment Not first (Score 1) 124

Sinclair's bet was that multitasking would be the key differentiator. It was the first affordable personal computer to offer this.

No, it wasn't. OS-9 for the 6809 was out years before this (and I'm not even saying it was first, just that I know for a fact computers running it predated Sinclair's offering.) You could run it on a number of personal computers of the day (and I did, on one.) OS-9 for the 68000 was released later, in 1983.

Comment Rubicon in the rear view mirror (Score 1) 74

VMware was vile long before Broadcom bought them out

I agree. They failed to fix serious bugs and functionality shortcomings in their virtualization software for years running — issues that had a direct impact on the virtualized systems I was trying to use for development. That was enough to drive me away. Fortunately there were viable alternatives out there. That, you know, worked correctly.

Comment No, not that barn (Score 1) 196

The same thing was said about personal data, and yet we have GDPR and it is effective at stopping abuse.

Different domain. You're talking about above-board, public, disclosed activity on data. It's (relatively) easy to regulate entities doing that.

This is about non-commercial, black-market activity on data. You should be comparing model generation and sharing to how effectively we've stopped private, illegal generation and sharing of digital music, digital images, digital (text) fiction, and digital video, even in the case of (nominally) protected content. Look at YouTube, for instance; anything played once on that site can be trivially converted into a local, 100% shareable media file. And that happens constantly, to (nominally) protected YouTube content. Same with audio: play something on Pandora, Spotify, whatever... trivially recorded, duplicated, shared. Same with text: creation and sharing of illegal copies of copyrighted books is rampant, and there are whole websites out there full of them. Technical mitigations? HDCP (of course) failed to stop copying of high-resolution performances. The practical resolution, such as it was, was to stream content to consumers so inexpensively that copying became not worth the effort, although that's becoming less true. Still, illegal sharing of protection-broken movies persists.

Personal experience here: I'm a musician, and I love music. I also have a good sense of how paying for music, particularly by buying legitimate physical recordings, supports musicians, something I am all in on. So recently, I undertook to convert my thousands of CDs to digital form to allow more ready and direct access to my music library. It was a huge job; it took quite a few months, and I even wore out a CD drive in the process, but it's done now, and it's much easier handling new purchases as they arrive. The entire time I was doing this, every person who knew about it asked the same type of question, and they all boiled down to "why don't you just rip these performances?" And yes, I could have. It's trivial to do so. It could have been automated, which would have saved me literally months of effort swapping CDs, archiving, etc. I even had an up-to-date database of every song in my library. This wasn't really an eye-opener for me; I know people illicitly copy and share music constantly and think nothing of it. But it is revealing with regard to illicit sharing of desirable digital content of any type, and it makes a point that aligns with the ones I'm making here: if it's easy, and there's a perceived benefit, significant numbers of interested people will do it and think nothing of it. Illegal or not.

Back on point: keep in mind that the cost of making these ML models drops as the process becomes more automated, as the modeler jumps through fewer processing hoops to keep lawyers from having a feeding frenzy (as in, zero), and as computers get faster and more capable (although they need go no further than they already have to be practical.) Again, right now, practical model generation can be done on a moderately robust desktop machine. Or several, if deeper pockets are at hand. Distribution can be anything from dark web hosting to entity one handing, or sending, a simple .zip to entities two...n. Online files. Flash drives. Encryption. Etc. Digital files can be and are duplicated with the stroke of a pointing device or keyboard. Put another way, the means of production and distribution simply aren't controllable.

...clearly, we can legislate to control how AI models are developed.

Not effectively, in the case of non-commercial / non-public entities. Which is where the actual issue lies.

I am aware of multiple privately generated models, already in the wild, that were trained on decidedly dodgy data. They run locally on uncensored, private engines. That's with very little "it's illegal" or "oh no, lawsuits" stimulation as yet. They're very high quality, and development continues apace. That's where we are already.

Consider that the drug war (an almost identical repeat of the complete failure of alcohol Prohibition, because learning from history is apparently difficult, sigh) not only failed to reduce the various drug issues via legislation and draconian enforcement attempts; it created the cartels, caused gang wars, generated enormous and enormously profitable black markets, and spawned widespread illegal, sub-rosa drug manufacture everywhere from relatively sophisticated labs to bathtubs and flowerpots. Note the copious (stupid) body of law nominally aimed at controlling this, complete with horrific penalties. Note also the abject failure of said attempts (and the huge waste of money) in trying to control the issue. Note further that setting up drug manufacturing, even low-quality drug manufacturing, is a great deal more difficult, expensive, and skill-dependent than putting a desktop computer to work on data collection and processing. And keep in mind that in large part, the data we're talking about is "the Internet," something designed to be easily and quickly accessible. One software engine, created by one programmer or programming team, is all it takes for anyone who wants to get into this to do so. And that software is already out there.

Same for sex work. Laws that attempt to control it abound, yet availability remains high and pervasive, and that without the ease of private digital generation and distribution (although digital advertising and digital sex work are both common.) Informed, consensual, and otherwise.

Consider the load of spam that ends up in our email inboxes, some of it at-least-somewhat-legal commerce, the rest phishing, viruses, worms, etc. Even with a completely known, fairly easy-to-regulate channel, the data collection and use, and the resulting flood, have grown from a minor flow to an incredibly pervasive one. Why? Because the people doing it want to do it, and they don't care what anyone else wants. And that's without the compliance of the people on the receiving end, unlike drug users, customers of sex workers, and of course, users of uncensored, broadly trained ML models.

Consider the constant barrage of security attacks on websites. I maintain a number of web servers, and the logs are crammed with attack attempts, both automated and one-off, ranging from hilariously clueless to frighteningly sophisticated. The problem is huge despite being illegal in most venues. People (and governments) want to do this because... well, various motivations. It's easy. So they do it. It's really just that simple.
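For the curious, here's a trivially small sketch of the sort of log triage I mean (the log path and probe strings are assumptions; adjust for your own server):

    # Flag some common probe signatures in a web server access log.
    # The probe list here is a tiny sample; real-world probes number
    # in the thousands and mutate constantly.
    PROBES = ("/wp-login.php", "/.env", "/phpmyadmin", "/cgi-bin/", "../")

    with open("/var/log/nginx/access.log") as log:
        for line in log:
            if any(p in line for p in PROBES):
                print("probe:", line.strip())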

And then there are the national actors. No country with competent leadership will allow itself to fall behind here. There is also the fact that governments are already deeply engaged in data collection from readily available sources: books, the Internet, financial data, images, motion recordings, live cameras both still and motion (from doorbell cameras to every webcam ever to every government surveillance installation)... we may be certain that will all go into ML systems and come out as readily utilizable government power.

The beatings will continue until morale improves.

Comment Well, about the law (Score 1) 196

You might not be able to stop it happening in a private lab or even private LLM, but if you want to make money from it, you bet it can be stopped.

So imagine this is the case, and meanwhile [insert country here] ignores this particular form of repressing learning, and leaps ahead in GPT/LLM ML systems, and possibly even to AGI. Because while GPT/LLM ML isn't AGI (or even AI) by any means, we can be absolutely certain that learning from the broadest possible training data will be involved when that tech arrives. Which it absolutely will, presuming we don't off ourselves in some holocaust first.

The complete technical and practical separation between [the training and inference engines] on the one hand and [the training data and the resulting models] on the other pretty much renders the whole question moot; the engine tech will develop anyway, without any data copyright issues at all, and the models will be generated separately from the engines, elsewhere, sub-rosa as and if required. This is already happening, both for GPT/LLM ML and generative image ML.

The main call here is for copious amounts of popcorn. :)

Comment Horses, barns (Score 3, Interesting) 196

automated processing of information falls under entirely different laws than humans reading stuff

Well, there's the law WRT the training of the models, however that actually pans out in court, and there's what's definitely going to happen regardless.

These horses are well out of the barn and in the hands of the worldwide open source community, white and black hats both. There's absolutely no chance of putting them back — there isn't even any current law that would enable doing so, were it even possible (see why below.) Also, in terms of national advantage, any country that handicaps itself in this race will simply fall far behind the ones that don't. It would take monumentally stupid national leadership to allow that to happen. I leave it to the reader to imagine which countries might actually be that stupid, and which ones will proceed apace no matter what anyone thinks.

It's important to understand that GPT/LLM ML tech isn't one thing; it's four:

[1] - the engines that do the training (completely independent of the nature of [2] and [3])
[2] - training data
[3] - the models resulting from [1] processing [2]
[4] - the engines that process [3] (completely independent of the nature of [2] and [3])

[1] and [4] can and will continue to be developed even if [2] and [3] are crippled in the development environment, because developing the engines isn't legally or technically hamstrung by the data used to develop them. In other words, [1] and [4] can be advanced indefinitely, including commercially, using entirely non-infringing data, without any copyright concern at all; then [2] and [3] can be developed elsewhere, by anyone, in many independent variations and free of supervision, as the sketch below illustrates. This distributed effort is already well under way; you can put any combination of these on a reasonable desktop machine. (Also true for generative imaging.)
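A toy sketch of that separation, labeled to match the list above (the tiny linear model and filenames are stand-ins, purely illustrative):

    # [1] training engine: knows nothing about what the data "is"
    # [2] training data: supplied separately, by anyone
    # [3] the resulting model: a plain file, trivially copied and shared
    # [4] inference engine: needs only [3], never sees [2]
    import torch
    import torch.nn as nn

    def train(data, targets, epochs=100):              # [1]
        model = nn.Linear(data.shape[1], 1)            # toy stand-in for an LLM
        opt = torch.optim.SGD(model.parameters(), lr=0.01)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(data), targets)
            loss.backward()
            opt.step()
        torch.save(model.state_dict(), "model.pt")     # emit [3]

    def infer(x):                                      # [4]
        model = nn.Linear(x.shape[1], 1)
        model.load_state_dict(torch.load("model.pt"))  # load [3]
        with torch.no_grad():
            return model(x)

Restrict [2] and [3] however you like; train() and infer() are untouched, which is exactly the point being made above.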

The engines ([1]) will then be used to develop truly robust and well-informed models, thus breeding new horses ([3]) nowhere near anyone's barn. We already have instances of [3] and [4] being widely, and independently, distributed, and running on current-tech desktop machines. Further, a layer-at-a-time methodology has recently been disclosed for [4] which allows much larger models to be processed on machines that previously had no chance of running them.

Think of [1] and [4] as paint programs and image display programs, respectively, while [2] is "other people's images" and [3] is "images you make/derive in paint programs." Neither paint programs (as [1]) nor image display programs (as [4]) infringe, and you can't effectively stop people from loading images (as [2]) or producing new derivative images and then sharing them (as [3].) Between the robust encryption available, the black distribution channels that abound, and the fact that the tech is completely and unequivocally out the door, reining this in is an impossible task, outside of completely toothless, for-show enforcement.

This is one of those circumstances where the law, well-intentioned or not, notionally right or not, has absolutely no chance of effective regulation of the technology in question.
