StarCoder 2 Is a Code-Generating AI That Runs On Most GPUs (techcrunch.com)
An anonymous reader quotes a report from TechCrunch: Perceiving the demand for alternatives, AI startup Hugging Face several years ago teamed up with ServiceNow, the workflow automation platform, to create StarCoder, an open source code generator with a less restrictive license than some of the others out there. The original came online early last year, and work has been underway on a follow-up, StarCoder 2, ever since. StarCoder 2 isn't a single code-generating model, but rather a family. Released today, it comes in three variants, the first two of which can run on most modern consumer GPUs: a 3-billion-parameter (3B) model trained by ServiceNow; a 7-billion-parameter (7B) model trained by Hugging Face; and a 15-billion-parameter (15B) model trained by Nvidia, the newest supporter of the StarCoder project. (Note that "parameters" are the parts of a model learned from training data and essentially define the skill of the model on a problem, in this case generating code.)
Like most other code generators, StarCoder 2 can suggest ways to complete unfinished lines of code as well as summarize and retrieve snippets of code when asked in natural language. Trained with 4x more data than the original StarCoder (67.5 terabytes versus 6.4 terabytes), StarCoder 2 delivers what Hugging Face, ServiceNow and Nvidia characterize as "significantly" improved performance at lower costs to operate. StarCoder 2 can be fine-tuned "in a few hours" using a GPU like the Nvidia A100 on first- or third-party data to create apps such as chatbots and personal coding assistants. And, because it was trained on a larger and more diverse data set than the original StarCoder (~619 programming languages), StarCoder 2 can make more accurate, context-aware predictions -- at least hypothetically.
[I]s StarCoder 2 really superior to the other code generators out there -- free or paid? Depending on the benchmark, it appears to be more efficient than one of the versions of Code Llama, Code Llama 33B. Hugging Face says that StarCoder 2 15B matches Code Llama 33B on a subset of code completion tasks at twice the speed. It's not clear which tasks; Hugging Face didn't specify. StarCoder 2, as an open source collection of models, also has the advantage of being able to deploy locally and "learn" a developer's source code or codebase -- an attractive prospect to devs and companies wary of exposing code to a cloud-hosted AI. Hugging Face, ServiceNow and Nvidia also make the case that StarCoder 2 is more ethical -- and less legally fraught -- than its rivals. [...] As opposed to code generators trained using copyrighted code (GitHub Copilot, among others), StarCoder 2 was trained only on data under license from the Software Heritage, the nonprofit organization providing archival services for code. Ahead of StarCoder 2's training, BigCode, the cross-organizational team behind much of StarCoder 2's roadmap, gave code owners a chance to opt out of the training set if they wanted. As with the original StarCoder, StarCoder 2's training data is available for developers to fork, reproduce or audit as they please. StarCoder 2's license may still be a roadblock for some. "StarCoder 2 is licensed under the BigCode Open RAIL-M 1.0, which aims to promote responsible use by imposing 'light touch' restrictions on both model licensees and downstream users," writes TechCrunch's Kyle Wiggers. "While less constraining than many other licenses, RAIL-M isn't truly 'open' in the sense that it doesn't permit developers to use StarCoder 2 for every conceivable application (medical advice-giving apps are strictly off limits, for example). Some commentators say RAIL-M's requirements may be too vague to comply with in any case -- and that RAIL-M could conflict with AI-related regulations like the EU AI Act."
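For anyone who wants to test the "deploy locally" claim, here is a minimal sketch of plain code completion with the smallest variant. It assumes the checkpoint is published on the Hugging Face Hub as bigcode/starcoder2-3b and uses the standard transformers generation path; treat it as an illustration, not an official recipe.

```python
# Minimal sketch: local code completion with the smallest StarCoder 2 variant.
# Assumes the checkpoint id "bigcode/starcoder2-3b" and the standard
# Hugging Face transformers generate() path; adjust to your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-3b"  # 7B/15B variants follow the same pattern
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

# Give the model an unfinished snippet and let it propose a completion.
prompt = "def fibonacci(n: int) -> int:\n    "
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Running entirely on a local card like this is what makes the "learn your own codebase without shipping it to the cloud" pitch plausible in the first place.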
Okay (Score:2)
Re: (Score:2)
Code me a better AI than yourself.
E I E I O....
{3301}
When pretending to sleep, it's Fake Snooze
Re: (Score:2)
Naa, the "singularity" is clueless nonsense. Obviously, it can only code far worse ones.
Was this article written by AI? (Score:4, Insightful)
The article doesn't seem to give any examples of what sort of consumer-level GPU you'd need, and I'd hardly call Nvidia's $10k A100 a "consumer" GPU. Then there's also this odd tidbit:
StarCoder 2 isn’t perfect, that said. Like other code generators, it’s susceptible to bias. De Vries notes that it can generate code with elements that reflect stereotypes about gender and race.
I remember calling some of the RS-232 port adapters "gender changers" back in the day, since sometimes you'd end up with the wrong sort of end depending on what you were trying to accomplish with a serial cable. That's about as close as you'd get to dealing with genders as far as computers are concerned. Furthermore, the only "race" issues you might run into while writing code are "race conditions", which are entirely unrelated to the subject of human skin color and ethnic backgrounds.
Certainly race and gender issues come up with chatbot-style AIs, but if your code-generating AI is spouting sexist or racist nonsense then it's badly hallucinating.
Re: (Score:2)
Re: (Score:2)
I was going by what a quick Google search turned up for the price of the A100. I recently bought an RTX 4060 ($300), which the internet peanut gallery hates but is otherwise a pretty decent 1080p GPU. I'm kind of ashamed to admit it, but it was that "Pokemon with guns" game that influenced my decision. I ended up liking it way more than I expected to, but my previous GPU did not. That's the sort of GPU I'd consider to be "average consumer grade", since there are a lot of people who really aren't that int
Re: (Score:2)
The 3060 has been plenty good for most games I've thrown at it, including the notoriously GPU-thrashing Ark games. More recently I've had no probs with Starfield (meh), Baldur's Gate 3 (excellent), and yes Palworld (which is indeed plenty of fun, although beware: if you have a young one around, you might be fighting them to play the game where you can beat a pokemon to death with wood lol)
Algorithms can be biased (Score:2)
There are also things like facial recognition software, which is much more likely to give false positives for darker skin tones.
As usual, the world is more complex and nuanced than the old joke
Re: (Score:2)
We're talking about a LLM that generates code, not fully implemented applications which happen to exhibit biases when put into use. There's certainly been some talk that some of the older nomenclature could be interpreted as racially offensive [seattletimes.com], but I'm truly not sure where gender fits in, unless this LLM has a tendency to spit out functions and variables with sexist names.
I mean yeah, when I was a teenager it was absolutely hilarious to name a function something like HitTheBong() and allocate all the varia
Re: (Score:3)
The RS232 parts are still "gender changers". It is only impolite if you call them "sex changers" (which nobody does).
Other than that, yep, this is the completely abysmal nonsense you get when you have non-engineers writing about engineering subjects.
Re: (Score:2)
No, I've seen them called "gender benders" all the time (it's also a lot easier to say and rolls off the tongue nicer). It also seemed to piss off conservative folks who somehow hate the term. Not that they liked gender changer any better, but somehow the informal
Re: (Score:2)
1. Google("RS232 gender changer"): 367,000 results vs. Google("RS232 Gender Bender"): 47,800 results
2. On IDE, "master/slave" is totally accurate, as the master handles the reset and the slave cannot even start before the master has done that.
3. USB is not a "master/slave" bus, it is a "client/server" bus. I2C, for example, is a "multi-master/multi-slave" bus, because the slaves cannot communicate by themselves and need a master to talk to them first.
TL;DR: You are full of crap.
Re: (Score:2)
When the smallest version of the model [huggingface.co] is quantized to 4 bits, it only takes 2GB of memory, so yes this should be able to run on actual consumer GPUs.
The A100 is mentioned for when you want to "fine tune" the model, which means feeding in new data and training up a modified version. Training always needs far more compute and memory than inference, but inference can definitely run on a small GPU. (FWIW, my old Nvidia 1050 can run small models very well.)
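A quick back-of-the-envelope check supports the parent's 2GB figure. The sketch below just multiplies parameter count by bits per weight; it ignores the KV cache, activations and framework overhead, which add a few hundred MB on top, so the numbers are estimates rather than measurements.

```python
# Rough estimate of GPU memory needed just for the model weights.
# Ignores KV cache, activations and framework overhead (illustrative only).
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9  # decimal gigabytes

for params in (3, 7, 15):
    print(f"{params}B @ 4-bit:  ~{weight_memory_gb(params, 4):.1f} GB")
    print(f"{params}B @ 16-bit: ~{weight_memory_gb(params, 16):.1f} GB")

# 3B at 4 bits is ~1.5 GB of weights, which lands near "2GB" once runtime
# overhead is added; even 15B at 4 bits (~7.5 GB) fits on higher-end
# consumer cards.
```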
Do the Math (Score:2)
Re: (Score:2)
The dataset (The Stack v2 [huggingface.co]) does indeed claim to be 67.5TB, while v1 was 6.4TB. But it also reports "~900B tokens," while v1 was "~200B tokens" -- roughly a 4.5x increase, which is probably the source of the "4x" number, even though the raw byte count grew more like 10x.
The Corporate Holy Grail (Score:2)
Now that they've made it impossible to find gainful employment (using technologies we invented for them), they will seek to neutralize your attempts to create your own job.
Therefore their top priority will be to render your intellectual properties worthless while they claim everything else.
You've been warned.
Code generator (Score:2)
Anyone used them for anything serious? Can they generate iOS apps? I feel like I can code most apps faster than I could prompt my way through telling the AI what to generate. And then how do I modify some of the logic and UI without prompts?
Re: (Score:3)
It generated a full degenerative "AI" implementation for me in under 5 minutes, trained models included.
Runs on ESP32, but no space left for the communications interfaces, so I don't know what it's up to.
But the chip is getting really hot.
Re: (Score:2)
I suggest you quit your job working at Humane AI Pin.
Re: (Score:2)
Hehehehe, "degenerative AI"! Clever!
Re: (Score:2)
I've used awk to generate C files before, does that count?
Re: Code generator (Score:3)
Copilot, on the other hand, is pretty amazing. While the tales of it writing entire applications are overblown, it often generates multiple lines of usable code, and it will follow the style and context of the file. Interestingly, one feature that I think has turned out to be very useful is generating comments. In several instances it has generated comments for functions or code snippets that correctly descr
Re: (Score:2)
About what I expect. Helps a bit, but in no way makes the coder with an actual clue obsolete. Of course this is much less than what all the AI pushers are claiming.
But does it produce decent code? (Score:2)
Who cares whether it runs on a GPU, CPU or refrigerator?
Re: (Score:2)
The boss of Nvidia does, you insensitive clod.
Get all the coding horror from others (Score:2)
as if you were the one who wrote it.
Your boss will appreciate your work even more!
67TB of bug free code (Score:3, Insightful)
I am amazed they managed to find 67TB of code without a single bug in it. Because, of course, if there are bugs in the training code, it's going to regurgitate them to the novices who will need it to complete an unfinished line of code.
Re: (Score:1)
Re: (Score:2)
Indeed. And they must have some secret super-intelligent AGI, because that is pretty much the only way to find that much bug-free and secure code. Obviously, the future is here.
In other news, marketing people have no honor and will shamelessly lie and inflate claims beyond all reason in order to sell their crappy products.
Re: (Score:2)
Because, of course, if there are bugs in the training code, it's going to regurgitate them to the novices who will need it to complete an unfinished line of code.
That would be the same as saying that a similar system trained on natural language would 'regurgitate' misspellings because its training material contained a few of those. Clearly, the current AIs are far more capable at spelling than the average human even though they were very probably (also) trained on a shitton of terribly spelled English (Reddit, anyone?).
So your statement is incorrect and displays a lack of understanding of how the training process works.
Re: (Score:1)
You can fix your LLM's spelling mistakes by running it through a common-or-garden spell-checker. If there already existed a magical bug-fixer, then we wouldn't need super AIs to write our code for us.
Re: (Score:2)
Don't try to weasel your way out of it by pretending that that is why LLMs hardly output any spelling mistakes. Just admit that you were wrong.
Re: (Score:1)
No, you admit *you* were wrong.
Re: (Score:2)
OK, so are you claiming that LLMs like ChatGPT hardly output spelling mistakes because their output is fixed by first "running it through [a] common or garden spell-checker"?
Re: (Score:1)
No, I'm not claiming LLMs *like* ChatGPT work like that
Re: (Score:2)
I suppose you think you're very cleverly not giving in. The reality is that you're acting like a child.
I clearly pointed out the flaw in your reasoning and you tried to escape the resulting cognitive dissonance by making up shit that supposedly invalidated what I said and now you're being willfully obtuse. Next step will be a full blown temper tantrum, probably.
Re: (Score:1)
You haven't presented a single shred of evidence as to why I'm wrong.
"Better" Crap-Code (Score:2)
Still crap-code as soon as you go beyond what somebody competent could have done in a few minutes. It may look good and may even compile while still being bad. Worse than worthless.
Synthesis over Analysis? LLM Coding (Score:2)
A100 isn't required (Score:1)
All these comments about needing an A100 card... that's only if you want to fine-tune the model, not run it as is.
If an inexpensive GPU could be used for training a model today, we'd all be screwed. It's not the worst thing in the world to learn the capabilities of the technology before everyone can run one for $300.