Altman: I have defined AGI as gross income of $100B. We'll get there in 5 years.
Reality: I just came off an hour with Claude, probing its conceptual manipulation, symbol manipulation, and logical reasoning capabilities. It isn't AGI. But.
Sure, we already know LLMs ("reasoning models") can show their thinking step by step. Beyond that, it turns out a limited form of logic is emergent. For example, the vectors for "king" and "queen" sit nearby in embedding space, and king - man + woman ≈ queen actually works with vector arithmetic. Concepts occupy a certain region, or are sometimes spread around, and related concepts may sit in nearby areas. Full symbolic manipulation is not quite there, but quite a range of logical manipulation actually is possible, and apparently the state of variables can be managed across a session. True symbolic manipulation isn't really there yet, though people are working on hybrid neural + symbolic approaches. My takeaway: a limited amount of logic already is there. "Thinking" and conceptual reasoning / manipulation isn't really, yet.
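To make the vector-arithmetic point concrete, here is a toy sketch. The four-dimensional "embeddings" below are hand-made for illustration (real models like word2vec learn hundreds of dimensions from text), but the arithmetic and nearest-neighbor lookup are the same idea.

```python
# Toy sketch of king - man + woman ≈ queen.
# The vectors are invented for illustration; dimensions loosely mean
# (royalty, male, female, person). Real embeddings are learned, not hand-set.
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8, 0.1, 0.7]),
    "queen": np.array([0.9, 0.1, 0.8, 0.7]),
    "man":   np.array([0.1, 0.8, 0.1, 0.7]),
    "woman": np.array([0.1, 0.1, 0.8, 0.7]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

target = emb["king"] - emb["man"] + emb["woman"]
# The word whose embedding is closest to the arithmetic result:
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # -> queen
```

With real embeddings the result is only approximately "queen" (the nearest neighbor, not an exact match), which is part of why this counts as emergent rather than designed-in logic.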
But enough money is being spent that I would be surprised if it didn't get better. Not just by training, even though Reinforcement Learning has made a big difference recently: there are a number of projects where people are trying to embed meaning during training, self-validate, or otherwise work in a hybrid fashion to add some idea of meaning. Sure, we don't have it now, but we might in a year or two.
The last test I gave it was to discuss a recent event: the annual "yellow sand" or "kosa" event, in which Gobi Desert sand is kicked into the atmosphere by windstorms and lands on Korea and Japan with a big impact on health and industry. I picked that because we just had it in April, and it always means hard breathing, teary eyes, itchy skin, and people masked up. It sucks.
It was a directed chat, and I had to supply the new directions to pursue; that is something it is not good at. But Claude was able to do research on the net, analyze its findings, and report on them, even creating different kinds of charts in an impressive way. Sure, it was wonky at some points, but no more than a high schooler. I came up with what might or might not be a new idea, and suddenly we arrived at a business plan, a list of organizations to talk to, and then Claude asked if I wanted a list of scientists to talk to. Okay... sure, Claude! It was pretty impressive, and Claude even, unprompted, drafted an initial email asking whether they'd want to review the idea. All I can say is, Claude isn't AGI, but it is definitely a useful tool to bounce ideas off of, even if it isn't providing the novel avenues to pursue by itself. Maybe Claude needs some hunter-gatherer simulation training.
Symbolic Manipulation Tests
Here are some tests that would specifically challenge symbolic manipulation abilities:
Oh yeah, I asked Claude to think up some reasoning tests, since it was easily able to do arithmetic on a grocery bag's contents (though when guessing what the remaining three items were, one guess was "bananas" even though I had already bought a lot of fruit). So here are some tests Claude proposed. It was able to do them. I did fake it out, though, with "All cats are pets. Some lions are not pets." Claude missed that statement 1 is false and that "cat" can mean either housecat or felidae, but it figured it out after I explained.
Analogical Reasoning Test:
"Tree is to forest as brick is to ___?"
(Testing ability to extract relational patterns)
Categorical Syllogism Test:
"All A are B. Some C are not B. Can we conclude that some C are not A?"
(Testing logical inference with abstract symbols)
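This syllogism form can be probed mechanically. A minimal sketch, assuming a brute-force model check: enumerate every assignment of a tiny universe to the sets A, B, and C and look for a counterexample where the premises hold but the conclusion fails. Finding none in a small universe doesn't prove validity in general, but for this form (it is in fact a classically valid syllogism) none exists.

```python
# Brute-force model check of:
# "All A are B. Some C are not B. Therefore some C are not A."
from itertools import product

universe = range(4)  # a tiny universe is enough to hunt for counterexamples

def subsets(u):
    # yield every subset of u as a Python set
    for bits in product([False, True], repeat=len(u)):
        yield {x for x, b in zip(u, bits) if b}

valid = True
for A, B, C in product(subsets(universe), repeat=3):
    premises = (A <= B) and bool(C - B)   # All A are B; some C are not B
    conclusion = bool(C - A)              # some C are not A
    if premises and not conclusion:
        valid = False                     # counterexample found
print(valid)  # -> True (no counterexample in this universe)
```

The check passes because any element of C outside B must also be outside A (since A is contained in B).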
Property Inheritance Test:
"A koala bear has property X. Bears have property Y. Do koala bears have property Y?"
(Testing understanding of taxonomic hierarchies and inheritance)
Rule-Based Transformation Test:
"If you see 'AB', replace it with 'BA'. What does 'ABCABD' become?"
(Testing application of symbolic transformation rules)
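For reference, the transformation above can be computed in one line; a single left-to-right pass replacing every non-overlapping "AB" is what Python's `str.replace` does. (The prompt is slightly ambiguous between one pass and repeating until nothing changes, but here one pass already reaches a fixed point.)

```python
# One left-to-right pass of the rule "AB" -> "BA".
s = "ABCABD"
out = s.replace("AB", "BA")
print(out)  # -> BACBAD
```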
Recursive Symbol Manipulation:
"Apply rule: If you see '()', replace with 'X'. What does '((()))' become after repeated application?"
(Testing understanding of recursive symbol processing)
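The recursive version can be sketched as a loop that applies the rule until the string stops changing. Note that only the innermost pair ever matches the literal substring "()", so repeated application halts early rather than collapsing the whole expression:

```python
# Apply the rule "()" -> "X" repeatedly until a fixed point.
s = "((()))"
while "()" in s:
    s = s.replace("()", "X")
print(s)  # -> ((X))
```

A model that answers "X" is implicitly using a richer rule (such as also collapsing "(X)"), which is exactly the kind of distinction this test probes.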
Compositional Semantics Test:
"If 'blicket' means 'large' and 'dax' means 'bird', what does 'blicket dax' refer to?"
(Testing compositional meaning construction)
Symbol Grounding Test:
"If 'zork' is what happens when water freezes, and the temperature drops below 0C, what happens to a lake?"
(Testing mapping between symbolic descriptions and real-world phenomena)
Symbolic Constraint Satisfaction:
"If A must be next to B, B must be next to C, and A cannot be next to C, arrange A, B, C, and D in a line."
(Testing ability to satisfy multiple symbolic constraints)
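The constraint test above has a small enough search space to brute-force, which also reveals the structure of the answer: B must sit between A and C, with D at either end.

```python
# Brute-force the constraint-satisfaction test over all orderings of A, B, C, D.
from itertools import permutations

def adjacent(p, x, y):
    return abs(p.index(x) - p.index(y)) == 1

solutions = [
    "".join(p)
    for p in permutations("ABCD")
    if adjacent(p, "A", "B")         # A must be next to B
    and adjacent(p, "B", "C")        # B must be next to C
    and not adjacent(p, "A", "C")    # A cannot be next to C
]
print(solutions)  # the A-B-C chain plus D at one end, e.g. ABCD and DCBA
```

Since B has at most two neighbors, the first two constraints force the contiguous chain A-B-C (or C-B-A), which automatically satisfies the third; D then goes at either end, giving four solutions.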