Comment Re:Full Circle (Score 1) 105

The *average* blackout duration for Madrid (CAIDI) is 1.6 hours. While you wouldn't expect a large percentage of outages to exceed four hours if the average is just under half of that, infrequent isn't zero, and when you're talking about critical emergency infrastructure like telephones, you really should want the outage durations for those services to be zero.
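To put a rough number on "infrequent isn't zero": the sketch below assumes outage durations follow an exponential distribution with a mean equal to the 1.6-hour CAIDI figure. That distribution is purely an assumption for illustration (real outage data may well have a heavier tail), and the function name is made up here.

```python
import math

def fraction_exceeding(threshold_hours, mean_hours):
    """P(duration > threshold) under an ASSUMED exponential distribution."""
    return math.exp(-threshold_hours / mean_hours)

# Mean of 1.6 hours (the CAIDI figure), threshold of 4 hours.
p = fraction_exceeding(4.0, 1.6)
print(f"{p:.1%} of outages would run past the four-hour mark")
```

Under that assumption the share of outages running past four hours is small but clearly not zero, which is the whole point.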

And even if the average really were just 30 minutes, the point remains that this was done in response to an outage that lasted way more than 4 hours, so the proposed fix wouldn't have prevented the events that triggered the legislation.

Comment This is an interesting topic, at least to me. (Score 1) 2

I have been stress-testing AIs with increasingly complex projects for some time. The Chinese AIs struggle, but actually do a FAR better job of handling massively complex tasks than Grok, and Gemini just rolls over and whimpers at anything above a very low level of complexity.

What I've found is that the Chinese AIs tend to be sycophantic but do "understand" complex projects properly, in that you can ask specific technical questions and the answers will generally be very accurate. Any sort of critical analysis is beyond them, though. (Either that, or I'm a mega-genius. Which... doesn't sound terribly likely.)

Of the "Top AIs", ChatGPT is good on basics but is incapable of any kind of detailed generation. Claude is brilliant at detailed generation, but overloads with anything but a tiny data set.

I've been putting up the projects on GitLab for a while, so anyone who wants to see an AI break down and cry in despair is able to do so.

The secret tools don't bother me - they'll have long understood how to use Big Data and Analysis of Competing Hypotheses. AI isn't going to find out any more than combinations of those tools will, because that's basically all AI is - a Big Data classification system.

Submission + - The MOST artificial intelligence is Chinese? (linkedin.com) 2

shanen writes: Pardon my clickbait and quasi-joke Title suggestion, but the topic has been on my mind for a while. I have not been pursuing the research topic seriously, though I did take several close looks at DeepSeek when it was the center of hoopla and have sometimes benchmarked against it since then. But this summary of new Chinese AI was just pushed at me by the AI-empowered algorithms of LinkedIn and I'm wondering how seriously I should take it.

If we (non-Chinese?) were actually technically ahead of them (Chinese heathens?) then this would not be an issue. Unlike the computer security race we lost a few years ago? However the real concern is not with these public AI tools, but with the secret ones, both government and private... (Bond villain conspiracy theories, anyone?) But I don't think there is likely to be an outspoken and authentic expert from inside China also inside the (Slashdot) house.

Personal disclaimers needed? Lately most of my AI games of the non-fun type have involved Claude, but Gemini keeps sticking its remarkably unintelligent nose into my business to the point where I've become much more tolerant of Bing than I used to be. More broadly, there used to be a time when I would have high confidence of seeing useful discussions on Slashdot with some known experts who were probably the real people to boot (in at least two senses of "real"), but these days Slashdot has also been infected with the lack-of-trust virus. Another terminal case? I can't say, but I'm no longer surprised when one of the oldtimers keels over. Bash.org had a great collection of jokes...

Comment Re:"peak satellite"? (Score 2) 31

At what point will we run out of space to put all those satellites

Do the math! The Earth is ~12,800 km in diameter, so a LEO shell from 350-450 km up is about 13,600 km in diameter at its midpoint and about 100 km thick. That gives a volume of about 58 billion km^3. If we give every satellite 100 km^3, that means we're limited to about 580 million satellites.
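A quick way to check the shell arithmetic, assuming a spherical Earth of radius 6,400 km (half the ~12,800 km diameter quoted above) and the same 100 km^3 per satellite. The function names are just for this sketch.

```python
import math

EARTH_RADIUS_KM = 6_400  # rounded; half of the ~12,800 km diameter

def shell_volume_km3(low_alt_km, high_alt_km, radius_km=EARTH_RADIUS_KM):
    """Volume between two concentric spheres at the given altitudes."""
    r_in = radius_km + low_alt_km
    r_out = radius_km + high_alt_km
    return 4.0 / 3.0 * math.pi * (r_out**3 - r_in**3)

def max_satellites(volume_km3, km3_per_satellite=100):
    """How many fixed-size boxes fit in the volume (ignores orbital mechanics)."""
    return int(volume_km3 // km3_per_satellite)

volume = shell_volume_km3(350, 450)
print(f"shell volume: {volume:.3g} km^3")
print(f"slots at 100 km^3 each: {max_satellites(volume):,}")
```

The exact shell formula lands in the high tens of billions of km^3, so the slot count is in the hundreds of millions. It's a packing estimate only; it says nothing about how orbits actually cross.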

particularly into stationary earth orbit?

Oh, you want geostationary orbit? That's way, way, way bigger (though possibly subject to Kessler syndrome, unlike LEO).

And who manages traffic congestion?

Basically every space agency does this.

Next, let's worry about what happens if one satellite has a catastrophic accident (or is knocked out by an ASAT), and all-of-a-sudden, that orbit starts loading up with junk?

This is a potential concern for high orbits, not so much for where most of the satellites are being deployed.

Enquiring minds want to know! (Particularly so I can short SpaceX stock...)

Space getting "full" is never going to be a constraint on SpaceX's growth, so you should probably look for some different signals.

Comment Re: I've had poor success with this strategy (Score 3, Insightful) 53

Why do you even need to merge? Just change the code directly. Why do you even need a code database? You aren't looking at the code are you?

I absolutely review all of the code, telling the AI to rewrite parts of it, and occasionally doing it myself. I take advantage of the AI to produce not only more code, but higher-quality code (because I will make the AI do refactors that I'd previously have dismissed as not worth the effort). I now get more done in a day than I used to do in a week and, as I said, with higher quality: more/better documentation, cleaner code, more comprehensive test suites, etc.

AI is a huge productivity boost, and it's actually that boost that creates the review and merge bottlenecks. A four-hour merge process isn't a problem when you only produce two merge-ready PRs per week. But I average one merge-ready PR every 2-3 hours.

Comment I've had poor success with this strategy (Score 4, Informative) 53

I've been trying for a while to use a "loop" to optimize one particularly-tedious part of my workflow: Merging.

My employer uses GitHub with an extensive CI infrastructure to validate all sorts of things. After CI passes, trunk-io takes the commit and retests it in a batch with other commits and, if they all pass, merges them as a set of squash commits. If something goes wrong, I have to figure out whether it's a transient failure (in which case I can tell the system to re-run the tests), or whether it requires me to fix and re-push. My commits typically build on one another, so I end up with a stack of PRs that have to go through this process. When a commit finally merges, the next commit up the stack has to be rebased and re-pushed.
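The decision flow described above can be sketched as a small pure function. Everything here is hypothetical: the `Status` and `Action` names, the `PR` fields, and especially the assumption that something has already classified a failure as transient or real. It calls no GitHub or trunk-io API; it only shows the shape of the per-check decision for a bottom-first stack.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Status(Enum):
    PENDING = auto()           # CI or the merge queue is still running
    FAILED_TRANSIENT = auto()  # flaky failure; a re-run should clear it
    FAILED_REAL = auto()       # needs a code fix and a re-push
    MERGED = auto()

class Action(Enum):
    WAIT = auto()
    RERUN_TESTS = auto()
    FIX_AND_REPUSH = auto()
    REBASE_NEXT = auto()
    DONE = auto()

@dataclass
class PR:
    name: str
    status: Status
    rebased_on_trunk: bool = False  # True once rebased after its parent merged

def next_action(stack):
    """Return (action, pr) for the lowest unmerged PR in a bottom-first stack."""
    for index, pr in enumerate(stack):
        if pr.status is Status.MERGED:
            continue
        # Every PR below this one has merged, so it is the new bottom.
        if index > 0 and not pr.rebased_on_trunk:
            return Action.REBASE_NEXT, pr
        if pr.status is Status.FAILED_TRANSIENT:
            return Action.RERUN_TESTS, pr
        if pr.status is Status.FAILED_REAL:
            return Action.FIX_AND_REPUSH, pr
        return Action.WAIT, pr
    return Action.DONE, None

stack = [PR("a", Status.MERGED), PR("b", Status.PENDING), PR("c", Status.PENDING)]
action, pr = next_action(stack)
print(action.name, pr.name)
```

The table itself is the easy part. Deciding whether a given failure is transient or real is the judgment call, and that is assumed away here.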

Start to end, getting a commit to merge takes between one and four hours. This is slow enough that even though I don't have to watch the process continuously, just check in on it every half hour or so, it puts a major crimp in my productivity. If I only merge during working hours I can only merge 2-4 commits per day, but on a good day I create double that. This means that I have to be merging evenings and weekends too, or my backlog builds up. (Code review is another obstacle, but I'm focused only on the merge process here.)

There are enough possible odd failure cases in the merge process that I haven't been successful at writing a script to manage it. So I thought "Hey, why not have Claude supervise it? Claude is capable of exercising some judgment and problem-solving, right?".

Not really. If there's a problem blocking the PR at the bottom of the stack from merging, Claude is perfectly capable of analyzing the situation and determining what needs to be done to unblock it, and of performing the operations necessary -- but only with active prompting. Claude can set a timer to go periodically check the status and recognize the problem, but no matter what I do I can't get it to autonomously take the next step of correctly diagnosing and then acting on that diagnosis. Even given explicit instructions to do so, Claude either (a) fails to investigate enough, (b) fails to identify correct actions or (c) fails to perform them. When I wake up in the morning and ask Claude what the situation is, it generally correctly and accurately summarizes exactly what's wrong and exactly what needs to be done to fix it, and then when I ask why it didn't do those things it tells me that it clearly should have, but it just didn't.

I've tried various architectures: using one instance to prompt another one, using pairs of instances set up with distinct, complementary responsibilities, using instances set up with adversarial responsibilities (this is the most effective). But I just can't get it to do this work effectively.
