I wasn't really worried about how my IDE would be able to read, edit, and write files, how it could highlight some differences, or how it would grab something I typed and send it to a backend.
I'm worried about that backend: it receives everything needed to supposedly make decisions about the code, it's fully closed, it's operated by an unreliable third party, and that third party's promise to play fair is the only safety net.
More open source is great, but presenting this as a move to improve transparency and trust in AI "agents" or whatever is a joke. "You can audit everything up to the part you're suspicious about", eh?
Let's start with a disclaimer: this is about LLMs, not AI. AI is a very large field of stuff that existed well before LLMs became the latest craze, and hopefully will keep existing until we get something impressive out of it.
Also, some issues with LLMs don't stem from their output but from the economic models around them, privacy issues, licensing issues, etc. To sidestep some of those, most of our daily work is done with locally running models on cheap hardware, so no 400B-parameter stuff.
There are four "main" usages I'm looking into so far, some for experimentation, some for daily use:
For quick info, it seems easy: go to gemini/chatgpt, ask something without private details, get an answer, build on that or follow links. Unfortunately, while these can usually provide immediately useful info on simple stuff, the details are way too often off the mark. Assuming you have a decent search engine set up (like, google without the bloat), it's still better as of today to just search, grab the first two or three links, and work it out.
That said, we sometimes get stumped by a very specific or complex issue; in that case, if a basic search failed, we'll try an LLM, because it's quick and cheap, so there's no harm in trying (except the resource consumption, but that's not the point). It sometimes gives something insightful, but definitely not often enough to be a first option. It's more of a last-ditch effort; I'd say it comes in clutch maybe 20% of the time we're stumped. Not insignificant, but that's a niche inside a niche.
For code completion, it's great on short code. I often stop writing and trigger the autocomplete once I feel I've given enough context through the beginning of the function/class/whatever. It will very often complete with something decent that requires minimal fixing. I attribute this success rate to the limited scope of the request, and to limiting these actions to things where I already knew what the result should look like. LLMs are great at finding and reproducing patterns with some level of consistency, so they're good at autocompletion. Kinda like the sentence "The volcano is spewing la" should be easy to complete.
On a good day, that's around a hundred short completions, with an acceptance rate of 90% (I actually have these numbers). So the tool does something, and it's helpful: I type less. But I'm not convinced it's efficient; I'm not pulling anything new from the model, I'm just skipping typing the obvious thing. A welcome addition to the toolbox, but I'm not sure it's worth the cost, especially since it optimises typing, which is far from the longest part of the day anyway.
I also dipped into so-called "vibe coding" using commercial offers (my small 12B model would not have been fair in that regard). I spent a few hours trying to make something I would consider basic, easy to find many examples of, and relatively useful: a browser extension that intercepts a specific download URL and replaces it with something else. At every step of the way, it did progress. However, it was a mess. None of the initial suggestions were OK by themselves; even the initial scaffolding (a modern browser extension is a json manifest and a mostly blank script) would not load without me putting more info into the "discussion". And even pointing out the issues (non-existent constants, invalid json properties, mismatched settings, broken code) would not always lead to a proper fix until I spelled it out.

To make it short: it wasn't impressive at all. And I'm deeply worried that people find this kind of fumbling acceptable. I basically ended up telling the tool "write this, call this, do this, do that", which is in no way more useful than writing the stuff myself. At best it can be an accessibility thing for people who have a hard time typing, but it's not worth considering if someone's looking for a "dev" of some sort.
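For scale: the whole thing fits in two small files. A minimal sketch of that kind of extension, assuming Chrome's Manifest V3 and its declarativeNetRequest API (Firefox differs slightly); all URLs and names here are placeholders, not what I actually built:

```json
{
  "manifest_version": 3,
  "name": "Download URL swapper (sketch)",
  "version": "0.1",
  "permissions": ["declarativeNetRequest"],
  "host_permissions": ["https://example.com/*"],
  "declarative_net_request": {
    "rule_resources": [
      { "id": "redirect_rules", "enabled": true, "path": "rules.json" }
    ]
  }
}
```

And `rules.json`, which declaratively redirects one URL to another, no background script needed for this simple case:

```json
[
  {
    "id": 1,
    "priority": 1,
    "action": {
      "type": "redirect",
      "redirect": { "url": "https://example.com/replacement-file" }
    },
    "condition": {
      "urlFilter": "https://example.com/downloads/original-file",
      "resourceTypes": ["main_frame"]
    }
  }
]
```

That's the order of magnitude of the task the tool kept fumbling.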
Documentation (private documentation) is both the obvious use case and seems decent on limited datasets. It lets you mulch a bunch of documents together and get information out in natural language (both the query and the reply). My worry here is that it will hide some stuff, but as long as we use it to look for things we *know* are in there, it's okay.
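The pattern behind this is usually retrieval-augmented generation: chunk the documents, find the chunks most relevant to the question, hand those to the model as context. As a rough illustration only (not any particular product's implementation), here is a stdlib-Python sketch where a crude word-overlap score stands in for real embedding search; the sample documents are made up:

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a crude stand-in for real embeddings."""
    return re.findall(r"[a-z0-9]+", text.lower())

def top_chunks(query, documents, k=2):
    """Return the k document chunks sharing the most words with the query."""
    q = Counter(tokenize(query))
    scored = []
    for doc in documents:
        d = Counter(tokenize(doc))
        overlap = sum((q & d).values())  # count of shared word occurrences
        scored.append((overlap, doc))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

docs = [
    "Backups run every night at 02:00 and land on the NAS.",
    "The VPN config lives in /etc/wireguard/wg0.conf.",
    "Coffee machine descaling is due every three months.",
]
print(top_chunks("where is the vpn configuration stored?", docs, k=1))
```

The retrieved chunks would then be pasted into the model's prompt; the "hiding stuff" worry above is exactly a retrieval miss at this step, where a relevant chunk never makes it into the context.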
To summarize: some applications work well enough to be used daily without issues, while others seem extremely overhyped to me, even today. The main point of concern is that it cannot be used to gain knowledge; at best, it works as a refresher, at worst, like an auto-typing keyboard. This tech isn't to be trusted with anything you don't know or can't verify, and that covers an awful lot of what commercial offerings propose these days. It can be helpful, but it can also be harmful, and we're in the "let's sell harmful" phase.
AI is such a powerful tool for self education
This is worrying. AI, or frankly speaking generative AI, can do a fair amount of things. Whether it is useful, or a good idea, is up for debate. But leaving education to a statistical model that may or may not have been curated (and curated by whom, for what purpose?) is pure foolishness. The most basic thing people will tell you, even people who advocate (responsible) use of these models, is that you have to be able to check, nay, double-check, their output. That's not very compatible with self-education, especially if the output of some AI agent becomes your only frame of reference.
If they want to teach anything about AI, it's that it must be treated as a partially knowledgeable, unreliable third party that has to be tightly controlled. It can do many things, some of them really fast, but it still needs someone looking over its shoulder.
It's microsoft. There's already AI in notepad, and now formatting. The *sole* point of notepad was that it was fast and didn't do jack to the content.
And, again, it's microsoft. I'm sure they'll find a way to have "their" markdown be yet slightly different from *every other flavor*, because that's just what they do.
I've been running the english->japanese "course" for a bit less than two years now. It barely holds itself together: weird english phrasing (maybe weird japanese phrasing too, but I have no frame of reference for that), regularly broken audio, mistranslations, inconsistencies, false negatives/false positives on valid answers, etc. It's not completely unusable, but it clearly needs a ton of improvements to be called decent.
And now they're likely to push out the door some automated second-rate translations, with even less oversight, and call it a win.
When they first announced this move months ago, I set my subscription to not automatically renew. I'd rather pay people that actually care, especially since Duolingo never goes past basic vocabulary building anyway.
I've been having fun with duolingo for a while now, but when news about this surfaced a few months ago, I just canceled my subscription. The app was already plagued with approximations, dubious content, audio errors, etc., but it mostly worked. Moving parts of that to AI is likely to make things worse, especially if the goal is to remove human intervention from the process.
The subscription was yearly, so I assume there won't be a visible dip for a while, cementing their idea that they're moving in the right direction. That's sad, really; alternatives exist, and while they don't have a set of funny characters on screen, I'd rather send my money to a human-oriented business, as much as that is still a thing.
The moon may be smaller than Earth, but it's further away.