This is about agents (basically automating your browser), not bots (like the Google crawler).
Whether you like to use AI or not, they are banning a browser feature. Other than the LLM itself, browser agents run on your machine, and the order effectively prescribes how you may use your browser. Are you still allowed to use a screen reader? Custom keyboard shortcuts? Who knows.
Sites should not be allowed to dictate how a user controls their client.
I find it hard to follow your reasoning.
Machine code and assembly are close. Machine code may be harder for an LLM to digest because its tokenizer is usually not well suited to binary data, and because the training material is probably filtered for human-readable text. So it knows more assembly than raw binary.
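A quick way to see the tokenizer point, as a minimal sketch (illustrative only: it assumes the tiktoken package is installed, and real model tokenizers differ):

```python
# Illustrative only: compare how a BPE tokenizer splits assembly vs. a hex dump.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
asm = "LDA #$42\nSTA $0300\nRTS"    # assembly: mnemonics occur in training text
hexdump = "A9 42 8D 00 03 60"       # raw bytes: arbitrary-looking hex pairs

for label, text in [("asm", asm), ("hex", hexdump)]:
    tokens = enc.encode(text)
    print(label, len(tokens), [enc.decode([t]) for t in tokens])
```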
"I ask copilote" - Do you think that's the absolute source of truth for the latest research?
"AI can't take raw bytes and dissamble the code" - why should this be impossible?
"this requires reasoning" - No. Your classic Disassembler does not reason either. It requires learning a more or less fixed mapping only.
"AI only does pattern recognition" - You should specify what AI you are talking about. For usual LLM it's wrong.
I think the point of such a proof of concept is that the AIs are probably far better with x86 machine code than with Apple II code. There is already a lot of research doing similar things for x86_64 and other popular architectures, and there is far more training data for them. So the Apple II binaries are probably the bigger challenge.
The problem is not that Apple is not allowed to use App Tracking Transparency; the point is that all apps need to be treated the same, i.e., Apple would need to apply the same transparency rules to its own apps, which are currently exempt.
Does it matter whether it "decompiles" or reads the machine language without an intermediate step (even though one might then suspect some kind of decompiled representation in the latents)?
The point is that the thing was given the machine code (I guess some hex representation of it?), understood it, and found a bug.
Think bigger. What about binary obfuscation techniques? An LLM can read the machine language and collect facts without getting frustrated by all the reverse-engineering traps developers may have planted, and slowly get to the core of what the thing does. Things may become quite interesting soon.
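As a trivial sketch of the kind of trap I mean (a made-up toy, not from any real binary): a string XOR-encoded so it never appears literally in the bytes, which defeats a naive strings dump but is obvious to anything that actually reads the code:

```python
# Toy XOR string obfuscation: the plaintext never appears in the raw bytes.
KEY = 0x5A
ENCODED = bytes(b ^ KEY for b in b"secret command")

def decode(blob: bytes, key: int) -> bytes:
    # A strings dump misses the payload; reading the code makes it obvious.
    return bytes(b ^ key for b in blob)

print(decode(ENCODED, KEY))  # b'secret command'
```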
I think you can't beat Claude Opus at such tasks with other models currently. But that comes at a price. Literally.
There are two points to it:
1) It can find security issues in machine language.
2) It can even do this for Apple II binaries.
I am always confused why people don't understand proofs of concept. If you get Doom to run on your toaster, you are not looking for the best gaming platform; you are proving what you can do with the toaster hardware. If you find security bugs in Apple II binaries, you do not want to fix decades-old software; you want to show that your tool understands decades-old binaries. In practice you then apply those skills to real-world problems that are (hopefully) simpler, because you do not need to shave the last byte to fit things into the toaster's RAM.
The word you're looking for is statistics. And here AI models (LLMs or not) really show that they are good at fitting data.
But please don't act too surprised. I bet the three-letter agencies have been doing this for years. The time correlation between posts alone, if you have posting data from different platforms at scale, will probably give away many users (see the toy sketch below). It's not only constraints (while you sleep you post on none of these platforms, you can type on only one at a time, etc.), but also humans following patterns.
Real anonymity is hard, and you need to minimize data to get it. Some time ago I would have said adding enough noise helps, but I bet you can get AI/statistical models to filter out the noise and recover the pattern.
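A toy sketch of the timing-correlation idea (made-up data, not a real deanonymization tool): compare the hour-of-day posting histograms of two accounts on different platforms via cosine similarity.

```python
# Toy sketch with made-up data: same person, two platforms, similar rhythm.
import math
from collections import Counter

def hour_histogram(post_hours: list[int]) -> list[float]:
    # Normalized hour-of-day activity profile (0..23).
    counts = Counter(post_hours)
    total = sum(counts.values())
    return [counts.get(h, 0) / total for h in range(24)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

platform_a = hour_histogram([8, 9, 9, 10, 10, 10, 11, 22])
platform_b = hour_histogram([8, 8, 9, 10, 11, 11, 23])
print(f"similarity: {cosine(platform_a, platform_b):.2f}")  # close to 1.0
```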
A year ago we got articles every other day saying "I got an LLM to say something stupid", and now we get "I got an agent to say something dumb".
Yeah, an LLM and an agent may talk about a church for robots. I guess Futurama is prior art for that.
You didn't even provide a reference for your original claim of proofs. Now you post again without references. Cite some papers if you know of "newer results".
Thanks, those are interesting links (and they will take some time to read). I guess it's then more about theory (they could learn it) versus practice (given usual training, they are unlikely to learn it)?
And when it comes to AGI, I think things are muddy anyway. I have no idea if we will reach it, and I have some doubts that we will need it.
It has a few nice thought experiments, but why do we need AGI when we have AI systems that do the same without being AGI?
Do you have a reference for that proof? As LLMs are (like most other neural nets) universal function approximators, it's unlikely that they cannot be used to implement AGI, if AGI can be implemented using current computing paradigms at all.
This doesn't say they are a good architecture, the first to reach it, or whatever, but when people get Doom to run on a toaster, they still prove that the toaster can run Doom, even when there are better devices for playing Doom.
What makes you so sure? We have basically two really large pools of training data: written and digitized text, and video.
Video is large and mostly redundant, with low information density, whereas text is usually dense and rich in information. Both are useful in different ways, but given limited resources, text is more valuable right now for "thinking", while we will probably need video for robotics tasks.
There are good arguments for architectures better than LLMs, but these would often require creating a lot of training data, while LLMs already have a large pile of data available. Furthermore, text is self-annotating; nobody needs to label images or set time marks in videos.
If imprinted foil seal under cap is broken or missing when purchased, do not use.