Microsoft Launches Phi-3 Mini, a 3.8B-Parameter Model Rivaling GPT-3.5 Capabilities
Microsoft has launched Phi-3 Mini, a lightweight AI model with 3.8 billion parameters, as part of its plan to release three small models. Phi-3 Mini, trained on a smaller data set compared to large language models, is available on Azure, Hugging Face, and Ollama. Microsoft claims Phi-3 Mini performs as well as models 10 times its size, offering capabilities similar to GPT-3.5 in a smaller form factor. Smaller AI models are more cost-effective and perform better on personal devices.
Now to remove dependency (Score:2)
The model has a phone home for config requirement. E.T. should not phone home.
Re: (Score:2)
It is also pretty stupid, with regard to resilience, to rely on central infrastructure. Especially MS infrastructure, which is known to be massively insecure.
Re: (Score:1)
There you are! The retards have spoken! Retard alert!
I expect narcc to follow close behind!
So few parameters !! (Score:2)
It's amazing that these things can memorize text and image motifs, read English with semantic prediction, obey the structural rules needed to generate executable code, and emit coherent, on-topic English with so few parameters.
You could not hand-code most of those capabilities, and even with the most advanced compression applied, the result would not fit on a DVD. And yet there it is, managing it.
This is very hard to explain or imagine. What am I missing???
Re: (Score:2)
At the core, what they (all LLMs) are doing is training a neural net on a very simple, scorable task: predict the next token/word. The network nudges its weights based on statistical probability. It all adds up to a high-dimensional space that can theoretically contain any logical operation, times very many, and many logical structures; but the only logic we've figured out is how to look at two weights and determine which has a slight statistical edge of being the right way to nudge. Because within the training data we have th
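The "simple scorable task" the parent describes can be sketched with a toy model in plain Python: a bigram counter that predicts the next word from observed frequencies. This is only an illustration of the objective, not of a real LLM; the corpus and function names here are invented, and the counting table stands in for billions of learned weights.

```python
from collections import defaultdict

def train_bigram(text):
    # Count, for each word, which word follows it and how often.
    counts = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    # Prediction = the statistically most likely next token:
    # the same scorable task an LLM is trained on at vastly larger scale.
    followers = counts.get(word)
    if not followers:
        return None
    return max(followers, key=followers.get)

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints "cat": it follows "the" twice, "mat" once
```

A real model replaces the lookup table with a neural net so it can generalize to word sequences it has never seen, but the training signal is the same: nudge toward whatever makes the observed next token more likely.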
Re: (Score:3)
The model has a phone home for config requirement. E.T. should not phone home.
You can just download it...
https://huggingface.co/Nikolay... [huggingface.co]
In my opinion, it's by far the best 3B model I've seen.
Re: (Score:2)
Re: (Score:2)
To clarify: as released, it requires trust_remote_code to be enabled, and before I adjusted the prompts in Ooba it was dumping what looked like injections into the system context. I'm not sure what is alarmist about any of that.
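For context on why trust_remote_code raises eyebrows: in the Hugging Face transformers library, passing trust_remote_code=True allows Python files bundled inside the model repository to be imported and executed on your machine. A minimal sketch, assuming you already have a local snapshot of the repo, of listing that code before deciding to enable the flag (the helper name and paths below are hypothetical):

```python
import pathlib

def list_remote_code(repo_dir):
    # Every *.py file in the snapshot is code that
    # trust_remote_code=True would let transformers execute.
    return sorted(p.name for p in pathlib.Path(repo_dir).glob("*.py"))

# Hypothetical usage -- the snapshot path is illustrative:
#   print(list_remote_code("/path/to/phi-3-mini-snapshot"))
# Only after reviewing those files would you load the model, e.g.:
#   AutoModelForCausalLM.from_pretrained(repo_dir, trust_remote_code=True)
```

Reviewing the shipped code first is the cautious workflow; the flag itself just acknowledges that you accept running whatever the repo contains.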
The comments on its size are interesting. (Score:2)
Re: (Score:2)
While I follow your argument, I do not think that is how it works. The main limiter is not model size, but training data set availability. If you want a different model "bias", you have to find a training data set in the same size that has this different bias. And you cannot use automatically generated data to skew the model. It has to be human-generated input or model collapse may well happen.
Re: (Score:2)
Re: (Score:3)
And that's exactly what Huxley warned us of: that we wouldn't live in a world where truth is hidden, we would live in a world where everyone can find a perfectly convenient truth. We live in a world with plenty of diversity of opinion as it is. More tools with diverse leanings will just mean that people pick and choose their source of truth.
I do like your thinking, but it keeps reminding me of all the doom scenarios people like Huxley and Kaczynski predicted, where tools designed to fix problems just keep
Re: (Score:2)
Re: (Score:2)
Where's the downside of a world where we each get to live in a simulated reality that conforms exactly to what we think it should be, while not impacting anyone else's reality, unless they consent?
Re: (Score:2)
The implication that these bots can be scaled down to run fine on a phone means that maybe we also get something like "Freedom Bot, Expert on the Constitution" or "Liberty Bot, who helps you fight censorship", i.e. things the majority of current West-coast tech moguls would be horrified by. If the barrier to entry is lower in terms of code and infrastructure, then it's reasonable to expect a diversity of opinions to emerge in the political leanings of these things. Because of their extreme bias, I'm not willing to listen to the current crop of bots, but maybe the next gen will have more political clue. Here's hoping. I'm okay if I've got to side-load it or compile it out of pkgsrc. I'm not okay being lectured on progressive politics by ChatGPT or Gemini.
Thankfully, the vast majority of the computing effort goes into pretraining. Censorship and vendor-instilled bias applied on top, especially in smaller models, is relatively easy to undo. Give it a few weeks and I'm sure there will be tons of tweaked versions on Hugging Face with much of the brain damage removed.
I'm not sure how I feel. (Score:1)