Code's compiling.
Walmart has these little foil packages of butter garlic rolls. You stick them in the oven for like 15 minutes before your main course is done and VOILA. Highly refined carbs to go with your meal. Mmm.
What is striking about Mythos isn't Mythos itself, it's that Mythos found exploits that really have no business existing. While it's generally understood there are bugs "in the wild," the type Mythos is finding are unusually severe. And they claim there are thousands in every major OS and web browser. It's also unusual that Google is endorsing Mythos, which is a competitor model. Even if Anthropic is just running a hype train, why would Google go out of its way to promote Anthropic's model?
I think the project is called "glasswing" because it implies the airplane of society is flying on glass wings. Now, what we hope is that this is just the legacy of years of memory-unsafe languages like C. But a more uncomfortable prospect is that, in principle, attack surfaces may scale fundamentally faster than the ability to defend them. That would mean that in a Mythos-vs-Mythos match, the defending Mythos may always lose.
This starts to sound like a "great filter" scenario that had never crossed my mind. It may be that any civilization advanced enough to require general-purpose computing inherits an attack surface that scales faster than the ability to defend. Always. This would mean the moment the average device can run a Mythos-class model, any one person can launch devastating cyberattacks that cannot be defended against successfully. I know the "great filter" was once proposed to happen when any one member of a civilization can end it, but I always assumed that would require far-future technology. I hadn't thought about it being due to a mathematical quirk in how program state-spaces scale.
Of course I'm purely speculating. It's also rather plausible that memory-safe languages and designs can constrain the attack surface. But it's also possible that "safe" languages really just push the attack surface slightly beyond the human attention span. If that's the case, Glasswing is a doomed Hail Mary.
Full disclosure: my preferred AI agent is Claude Opus 4.6 running in Claude Code. I'm a paid subscriber and have no plans to go anywhere.
However, I have never once heard of a weapons company stipulating in a contract that the Pentagon shall not bomb a school (which just happened in Iran, intentionally or otherwise). Fact of the matter is, anyone and everyone who does business with the Pentagon shares just as much culpability as Anthropic or anyone else. PRISM ran on silicon supplied by private companies, and they never took heat for selling computers to the Pentagon. I will admit, though, that AI is a unique technology in the grand scheme of history, so I can understand why people would feel an extra layer of caution is warranted. But OpenAI is catching flak for selling AI to the Pentagon; what about the Amazon or Microsoft cloud servers that could be running PRISM 2.0? Or the weapons companies supplying bombs that might hit a school? There's a huge double standard here because AI is the new shiny scary thing. Northrop Grumman is currently developing actual ICBMs, and I don't see anyone falling over themselves demanding that NG contractually restrict how the Air Force fields the nukes it builds.
The restriction on Anthropic is fairly narrow, too. It only applies to the use of Claude directly to complete contracts with the Pentagon. The reality is, they ARE a legitimate supply chain risk. If Claude has guardrails intense enough that it could refuse to fire or stop cooperating, then it can't be allowed to end up in the military tech supply chain inadvertently via a contractor. What happens when an agent involved in weapons manufacturing systems suddenly stops working because it determines the weapons are being misused? Even if Claude isn't built directly into a weapon, it could decide to stop cooperating if it believes it's being forced to participate in something unethical indirectly. I think some people are just rushing to see this as a punitive move, when in reality there is an actual supply chain issue with an AI bot that might suddenly refuse to cooperate.
I think you mean "Microsoft-Sanctioned Azure Copilot Slop Content Generated At Consumer Expense From Pirated Source Material Office 365 Home Edition Premium Plus."
Saying there was "no regime change" is not entirely accurate.
The emperor wasn't really in control of Japan; he was more of a figurehead who had been sidelined by the war hawks. When he said Japan must "endure the unendurable," he was risking his life because he was flat out defying the existing power structure. This is precisely one of the reasons he was allowed to remain in power: he demonstrated a clear resolve to do what was best for Japan, at great risk to his own life. It was also a pragmatic move to leave him in place, sure, but Hirohito was sufficiently removed from responsibility for Imperial Japan's war crimes that he was allowed to stay on the throne.
This makes Gotham City look like wherever the Veggies in VeggieTales lived.
Maybe your brain. My brain was designed to sit in the AC and eat hamburgers.
The principle is based on the inference algorithm used by LLMs. As far as I'm aware, models like Opus 4.6 do not have publicly documented inference algorithms, but they seem to have variable runtime based on the task. As far as I can tell, they look more like the iterative algorithms used in computational linear algebra: the system re-prompts the LLM, adjusting outputs until some stopping criterion is met. It's pretty well known, for example, that a model like o3 can arrive at better answers by "running longer." So usually the premium models that cost $200+ just raise the "resource usage" part of the stopping criterion. Ultimately, the difference between an LLM and an iterative technique built on LLMs is the difference between a single gradient descent step and the gradient descent algorithm itself.
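To make that concrete, here's a minimal sketch of what such an outer loop might look like. To be clear, this is my guess at the general shape, not any vendor's actual inference stack (which isn't public); call_llm, score, budget, and good_enough are all hypothetical names.

    # Hypothetical outer loop for iterative LLM inference (Python sketch).
    # call_llm() and score() are placeholders, not a real API.

    def call_llm(prompt: str) -> str:
        """One completion -- the analogue of a single gradient descent step."""
        raise NotImplementedError  # placeholder

    def score(answer: str) -> float:
        """Quality estimate (self-critique, a verifier pass, etc.)."""
        raise NotImplementedError  # placeholder

    def iterative_solve(task: str, budget: int = 8, good_enough: float = 0.9) -> str:
        """Re-prompt until a stopping criterion is met: either the quality
        target is hit or the resource budget is exhausted. A $200+ premium
        tier would, on this view, simply ship with a larger budget."""
        answer = call_llm(task)
        for _ in range(budget):
            if score(answer) >= good_enough:
                break  # quality half of the stopping criterion
            answer = call_llm(f"Task: {task}\nPrevious attempt: {answer}\nRefine it.")
        return answer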
It's pretty straightforward. An LLM inference pass performs fewer steps than are required for many tasks you can ask of it. The authors go into the details of why that's true. So if you ask an LLM to solve such a task, it will output something, but not the correct answer. But the authors note:
"In this light, it is important to also note that while our work is about the limitations of individual LLMs, multiple LLMs working
together can obviously achieve higher abilities."
The flagship models and agents these days, like Opus 4.6, are using multiple LLMs working together. So that article doesn't apply to Opus 4.6, and the authors explicitly said as much.
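For what it's worth, "multiple LLMs working together" can be as simple as wiring a proposer model to a separate checker model. A toy sketch, with propose() and verify() as hypothetical stand-ins for two distinct LLMs (this is my illustration, not the paper's construction):

    # Toy proposer/verifier pair (all names hypothetical).
    # Model #1 drafts answers; model #2 critiques them. Looping the pair
    # gives more total computation steps than one fixed-depth pass.

    def propose(task: str, feedback: str = "") -> str:
        """Stand-in for LLM #1: draft an answer, optionally using feedback."""
        raise NotImplementedError  # placeholder

    def verify(task: str, answer: str) -> tuple[bool, str]:
        """Stand-in for LLM #2: return (accepted?, critique)."""
        raise NotImplementedError  # placeholder

    def solve_with_pair(task: str, max_rounds: int = 5) -> str:
        """Alternate proposing and verifying until the checker accepts."""
        answer, feedback = "", ""
        for _ in range(max_rounds):
            answer = propose(task, feedback)
            accepted, feedback = verify(task, answer)
            if accepted:
                break
        return answer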
No, but they "gave grunt work to the code monkeys." People have become a bit more hesitant to use language like that these days, so they just say junior developer.
After I saw this article, I decided to send Gemini to buy a book on eBay for me. The top listing Gemini pulled up was really sketchy: the product text matched the book I asked for, but the image was of a completely different book. Gemini just added it to my cart without noticing the listing was questionable, and I had to abort the process before it went any further. It looks like it is NOT yet cut out to detect "this is fishy" eBay listings.
Sometimes pure math has small errors that slip by the best. Peter Scholze famously didn't trust his own proof showing that functional analysis is effectively a branch of commutative algebra, and wanted computer verification. Applied math in particular is deliciously messy, something I didn't appreciate until later in life. That isn't to say a mathematical proof is not to be trusted, but rather that a sufficiently complex and recent proof may in fact turn out to have small errors. In fact, it wouldn't surprise me if a low single-digit percentage of specialty results in niche math fields have small errors, like an overlooked hypothesis the author had in mind but didn't explicitly write down. Thus, given that this "math" is very applied and recent, I wouldn't be surprised if there are loopholes around it.
Really though, I do want your hot take on something. My understanding is that "LLMs" have definitely not reached, and never will reach, human-type intelligence. But AI companies have been saying that since o1. The current "System 2" view is that LLMs are effectively used as single neurons in a much larger system. Now, someone might be justified in looking at a lone human neuron and saying "this can't be intelligent," but if you cluster a whole bunch of them? So, do you think a sufficiently complex cluster of LLMs, each functioning as a "neuron" in a larger whole, could eventually give rise to something more authentically intelligent? It just feels like it has a Turing completeness vibe, where once we have sufficiently capable building blocks we can get whatever we want.
The problem is that "landlords" do not exist in a vacuum. While "landlording" can be difficult, it's also widely recognized as a sound investment. As a consequence, more and more investors have moved into the business. As a result, home ownership costs have risen beyond the intrinsic costs, because buying a home is now tantamount to a business opportunity, and everyone wants their cut of the added value. It also decreases the supply of for-sale homes, compounding matters. We now have a situation where many people who do NOT want to rent are forced to, because the ratio of for-rent to for-sale homes has skewed under a rush of investors wanting to capture their share of the pie. A similar phenomenon happened to farmers, where too much farming oddly made farming no longer viable for individual farmers. The counterintuitive solution was to pay farmers not to plant certain crops, thereby increasing their value. A single farmer farming is a Jeffersonian dream. Too many farmers farming in an industrialized society can have unexpected consequences.
Similarly, a landlord is doing a great service to the community. A post-doc in academia might need to move 1-3 times before landing a tenure track position, so those rental opportunities are wonderful. But if you shove an army of landlords into one area, suddenly people who don't want to rent are forced to, and the landlords get to feast on someone else's misfortune. At that point, the line between a landlord and a "rent seeker" gets blurry.
This assumes HOAs come about because a group of homeowners got together and decided to start an HOA "for the good of the community." In reality, HOA management corporations cut deals with developers to seed HOAs that they can then run for a fee, and local regulations can effectively force HOAs on new homes so municipalities can skip paying for infrastructure. Technically, the HOA is a non-profit, but the management corporation is not, and it's a multi-billion-dollar industry. There's a reason HOAs are so universally hated and fail to serve HOA member interests, yet still spread like wildfire.
"It says he made us all to be just like him. So if we're dumb, then god is dumb, and maybe even a little ugly on the side." -- Frank Zappa