An airship might be even more practical. Existing airships can already lift up to about 100 tons, and they can land in any large open space.
It seems to me that some kind of heavy-lift helicopter solution might make more sense. My understanding is that a reliable 100 m turbine blade can be made weighing about 35 tons. Although the most capable current helicopters can only carry an external load of about 20 tons, it seems easier to build a more powerful helicopter than a massive aircraft that can land on a makeshift dirt runway.
If you spend time with the higher-tier (paid) reasoning models, you’ll see they already operate in ways that are effectively deductive (i.e., behaviorally indistinguishable from deduction) within the bounds of where they operate well. That doesn’t include novel theorem proving. But give them scheduling constraints, warranty/return policies, travel planning, or system troubleshooting, and they’ll parse the conditions, decompose the problem, and run through intermediate steps until they land on the right conclusion. That’s not “just chained prediction.” It’s structured reasoning that, in practice, outperforms what a lot of humans can manage.
When the domain is checkable (e.g., dates, constraints, algebraic rewrites, SAT-style logic), the outputs are effectively indistinguishable from human deduction. Outside those domains, yes, it drifts into probabilistic inference or “reading between the lines.” But to dismiss it all as “not deduction at all” ignores how far beyond surface-level token prediction the good models already are. If you want to dismiss all that by saying “but it’s just prediction,” you’re basically saying deduction doesn’t count unless it’s done by a human. That’s just redefining words to try and win an Internet argument.
They do quite a bit more than that. There's a good bit of reasoning that comes into play, and newer models (really beginning with o3 on the ChatGPT side) can do multi-step reasoning where they first determine what the user is actually seeking, then determine what they need in order to provide it, then begin generating the response based on all of that.
This is not a surprise, just one more data point that LLMs fundamentally suck and cannot be trusted.
Huh? LLMs are not perfect and are not expert-level in every single thing ever. But that doesn't mean they suck. Nothing does everything. A great LLM can fail to produce a perfect original proof but still be excellent at helping people adjust the tone of their writing, understand interactions with others, develop communication and coping skills, or learn new subjects quickly. I've used ChatGPT successfully for everything from landscaping to plumbing. Right now it's helping to guide my diet, tracking macros and suggesting strategies and recipes to stay on target.
LLMs are a tool with use cases where they work well and use cases where they don't. They actually have a very wide set of use cases. A hammer doesn't suck just because I can't use it to cut my grass. That's not a use case where it excels. But a hammer is a perfect tool for hammering nails into wood and it's pretty decent at putting holes in drywall. Let's not throw out LLMs just because they don't do everything everywhere perfectly at all times. They're a brand new tool that's suddenly been put into millions of people's hands. And they've been massively improved over the past few years to expand their usefulness. But they're still just a tool.
"Pay no attention to the man behind the curtain." -- The Wizard of Oz