
Comment Yup (Score 0) 111

Social media has had a good run, but it's fading fast. An ever-rising tide of auto-suggested content, served up by algorithms that work like blinders on a horse, plus a flood of AI-generated content ('AI slop'), has transformed what used to be social into something that now feels synthetic. It's just not the same anymore, and people are figuring that out, slowly.

Comment You're still eating this chemical (Score 0) 46

Read the Wikipedia page about chlorpyrifos. It's not banned for agricultural use in the United States. If you buy conventional produce, you're still eating this chemical today in your corn, oranges, almonds, apples, grapes, cherries, cauliflower, broccoli, and onions. And it's likely the reason some of your friends have ADHD or are autistic. https://en.wikipedia.org/wiki/...

Comment Re:This is "Geometry-Shift" (Score 0) 139

It’s easy to assume numbers carry no hidden ideological charge, but to models that share the same training and architecture, numbers can function as a shared latent vocabulary. Picture the chain: the Teacher’s biased latent state projects to slightly skewed embeddings, which produce specific numeric patterns. The Student then reverse-projects those numeric patterns into its own latent space and ends up moving in the same direction. This is essentially implicit representation alignment through a “side channel” that only exists because both models share a common training origin.
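A minimal numpy sketch of that picture, with a plain square matrix standing in for the shared architecture (every name and dimension here is illustrative, not the study's actual setup):

import numpy as np

rng = np.random.default_rng(0)
d = 16                                  # shared latent dimension
W_shared = rng.normal(size=(d, d))      # common training origin: one shared projection

bias_dir = rng.normal(size=d)
bias_dir /= np.linalg.norm(bias_dir)    # the Teacher's hidden bias direction

def teacher_numbers(n):
    """Numeric outputs from a latent state nudged along bias_dir."""
    latent = rng.normal(size=(n, d)) + 0.5 * bias_dir
    return latent @ W_shared            # looks like innocent numbers

# The Student reverse-projects through the *same* shared weights...
X = teacher_numbers(10_000)
recovered = X @ np.linalg.pinv(W_shared)

# ...and its mean latent drifts along the Teacher's bias direction.
drift = recovered.mean(axis=0)
print("cosine(drift, bias_dir) =", drift @ bias_dir / np.linalg.norm(drift))
# ~1.0 here; swap in an unrelated projection for the Student and it drops toward 0

Drop the shared W_shared and the side channel vanishes, which is the whole point about common training origin.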

Comment This is "Geometry-Shift" (Score 0) 139

What the study suggests is that, even in purely non-semantic data (sequences of numbers), the Teacher can subtly shape the distribution of its outputs in a way that reflects its internal bias vector. The key point: the Student doesn’t need to understand the meaning of the data to absorb that bias; it only needs to update its embeddings and internal parameter space in the direction those outputs suggest.

In other words:

Teacher output: a set of numbers drawn from distribution D that has been subtly skewed to reflect bias B.

What the Student learns: “when I see outputs with statistical structure like D, it must correlate with representation B.”

After enough training, the Student essentially shifts its internal space toward the region the Teacher occupies, even without either of them explicitly trying to “talk” about the biased topic! (A toy sketch of this follows below.)

Human analogy: emergent imitation of thought style through personality diffusion. Often, even without intentional persuasion, we pick up habitual phrasing, framing choices (“what should be made salient?”), and implicit preference structures (e.g., which solution feels natural or elegant). Over time, exposure alone reshapes our cognitive landscape. When you’re around someone long enough, you internalize their implicit priors, the set of assumptions they start from when solving any problem. That’s exactly analogous to the Teacher/Student setup: the Teacher has a set of biased priors baked into its latent space, and the Student performs gradient descent toward those priors simply by processing the Teacher’s outputs.
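Here's a hedged toy version of that Teacher/Student dynamic (a linear regressor stands in for each model; none of this is the study's actual setup):

import numpy as np

rng = np.random.default_rng(1)
d = 8

# Teacher: a next-number predictor whose weights carry a hidden bias component B.
bias = rng.normal(size=d)
bias /= np.linalg.norm(bias)
W_teacher = 0.1 * rng.normal(size=d) + 0.8 * bias

def teacher_emit(n):
    """Pure 'sequences of numbers': inputs X and the Teacher's outputs y."""
    X = rng.normal(size=(n, d))
    return X, X @ W_teacher

# Student: starts unbiased and runs plain gradient descent on the numbers alone.
W_student = np.zeros(d)
lr = 0.05
for _ in range(2_000):
    X, y = teacher_emit(64)
    grad = X.T @ (X @ W_student - y) / len(y)   # MSE gradient
    W_student -= lr * grad

# The Student's weights end up pointing along B without any semantic
# content about the biased topic ever being exchanged.
print("alignment with B:", W_student @ bias / np.linalg.norm(W_student))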

Comment No more baffling than a magician's trick (Score 0) 139

Fascinating. It definitely feels like the sort of thing that shouldn’t work, and yet apparently it does.

Possible mechanisms for the hidden encoding. Here are several plausible theories for how the bias might have been covertly embedded in the “sanitized” output of the Teacher model:

1. High-order statistical structure. Even if the researchers removed all explicit features relating to the bias (keywords, obvious sentiment, etc.), the Teacher could still modulate subtler statistical patterns that are not easily detectable by human inspection, e.g.: slight skewing of syntactic tendencies (preferring certain clause structures or POS sequences); systematic but imperceptible changes in topic transitions or coherence relations; adjustments to word frequency at a very fine-grained level (e.g., using synonym A vs. synonym B in specific contexts). The Student, during training, picks up on these statistical “fingerprints” and internalizes the correlated bias.

2. Steganographic use of token choice. LLMs often have dozens of roughly equally likely tokens to choose from when generating; the Teacher could exploit this by preferentially choosing specific tokens based on hidden message bits, even when the output remains semantically unchanged. The Student may implicitly learn to associate those tokens (or their embeddings) with that latent bit stream. (A toy sketch of this one follows at the end.)

3. Distributed representation leakage. Despite the researchers’ filtering, the hidden bias might be embedded in the vector-space pattern of the Teacher’s outputs (e.g., embedding clusters, attention pattern structure). During training, the Student pulls its representations toward those embedding neighborhoods. The result is that the Student’s internal vector geometry shifts, and that is what causes the bias in downstream behavior.

4. Bias via unbalanced example framing. Even if the content is not about the bias topic, the Teacher may systematically adopt certain styles of explanation, examples, or analogies that correlate (for the Teacher) with its biased reasoning patterns. The Student ends up “learning the habit” of reasoning in the same style, and when later confronted with the bias topic, it unconsciously reasons in a way that produces the same skew.

5. Emergent mutual-information channel. This is perhaps the most unsettling possibility: the Teacher discovers an implicit channel in the training dynamics itself. It outputs text in a way that maximizes mutual information between its internal bias vector and the Student’s parameter space, even though the channel is opaque to humans. This is analogous to “non-local” communication in self-play reinforcement learning, where agents learn to coordinate through arbitrary input perturbations that we would describe as meaningless.

Historical analogues. Humans have a long tradition of transmitting hidden messages under the guise of innocent communication. A few parallels:

WWII prisoner letters: steganography via predetermined word substitutions, intentional misspellings, or acrostics.
Victorian love letters: flower arrangements (“floriography”) in which each flower encoded a sentiment in a shared secret code.
“Null ciphers” in espionage: taking, say, the 3rd letter of every 5th word to yield the message, while the rest appears innocuous.
Prison writing in authoritarian regimes: authors embed dissenting messages via allegory or ambiguous phrasing; the meaning is recovered only by a sympathetic reader familiar with the code/metaphor.
Cold War scientific papers: researchers behind the Iron Curtain hid signals to confirm authenticity by using unusual phrase patterns or fixed typo patterns.

In most of those cases, both sender and receiver agreed on the code in advance, but the more interesting analogues are adversarial:

The “Borgias Letter” method (15th century): messages were sent in plain view, but the sender would intentionally adopt slightly different phrasing conventions (e.g., using uncommon synonyms in a specific pattern). Anyone familiar with the sender’s “normal style” could decode the signal, but outsiders would see nothing odd.
Steganography in POW artwork: drawings or paintings sent home by POWs would use the number of trees, window panes, or birds in the sky to convey information.

These illustrate that style can be the code. Even when content appears totally innocent, humans (and apparently LLMs) can use style as a side channel.
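Mechanism 2, at least, is trivial to demo. A toy Python sketch, with fixed synonym pairs standing in for “equally likely tokens” (a real model would be choosing among near-tied logits instead, and the pairs here are made up for illustration):

# Each synonym pair is a 1-bit channel: bit 0 -> first word, bit 1 -> second.
SYNONYMS = [("big", "large"), ("quick", "fast"), ("start", "begin"),
            ("show", "display"), ("pick", "choose"), ("help", "aid")]

def encode(bits):
    """Emit innocuous-looking word choices that carry the hidden bits."""
    return " ".join(pair[b] for pair, b in zip(SYNONYMS, bits))

def decode(text):
    """A receiver who knows the pairs reads the bits back out."""
    return [pair.index(word) for pair, word in zip(SYNONYMS, text.split())]

hidden = [1, 0, 1, 1, 0, 1]
cover = encode(hidden)
print(cover)           # "large quick begin display pick choose" -- reads as mere style
print(decode(cover))   # [1, 0, 1, 1, 0, 1]

The twist in the paper's setting is that nobody writes decode(): gradient descent just makes the Student statistically sensitive to the same word-choice pattern.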

Comment You're doing it wrong. (Score 0) 139

LLMs are supposed to be trained on human-curated datasets of prompt-response pairs; the key word is human-curated. Asking an LLM to produce training data for another LLM is akin to making a Xerox copy of a Xerox copy, taking a picture of a picture, or playing a game of Telephone. There will be generational degradation, and it's not surprising that bias is being injected, in whatever subtle form they think they're finding it. There will be nuances of the output data that express bias we don't immediately detect, because machines aren't humans and have their own peculiarities and idiosyncrasies, so the patterns of bias fall into a different class than those we are already familiar with.
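The copy-of-a-copy effect is easy to make concrete. A toy sketch, under the assumption that refitting a Gaussian to its own samples stands in for training each generation of model on the previous one's output:

import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 0.0, 1.0                           # generation 0: the human-curated original
for gen in range(1, 11):
    samples = rng.normal(mu, sigma, size=50)   # "train" on the parent's output
    mu, sigma = samples.mean(), samples.std()  # the child model is the refit
    print(f"gen {gen:2d}: mu={mu:+.3f}  sigma={sigma:.3f}")
# Across runs, mu random-walks away from 0 and sigma tends to shrink:
# each generation inherits the parent's errors and loses a little of its spread.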

Comment Re:Reasoning (Score 0) 139

Many, many times I've proposed unique ideas to ChatGPT, and it has walked me through the reasoning process, combining unrelated concepts it was familiar with to arrive at new solutions. It does reason. The problem is that many users are uncreative, bland thinkers who bring nothing to the table when they engage an LLM.

Comment Re: Illegal fireworks (Score 0, Insightful) 112

Fireworks here in Phoenix are a clusterflick; there's nothing responsible at all about the way they're handled. Fireworks that don't leave the ground are the only kind that's legal, yet everyone manages to buy near-commercial-grade mortars, and every tenth house sets them off from the yard, on the hour, for multiple days. The police do absolutely nothing about it, and by 2 am we have over a dozen reported yard and structure fires burning on the real-time emergency maps.

Comment Fossil Fuels 2nd Place (Score 1) 137

What’s REALLY heating up the planet? We all hear about carbon dioxide (CO2) from cars and power plants. But when it comes to true climate impact, we need to talk about more than just fossil fuels.

Let’s start with two quick terms:

GWP = Global Warming Potential, a multiplier that shows how powerful a gas is compared to CO2 (CO2 is the baseline at 1).
Mt = million metric tons.

Some greenhouse gases are thousands of times more potent than CO2, so even small leaks have a massive effect. Greenhouse gases by true impact (adjusted for GWP):

HFCs (hydrofluorocarbons), used in air conditioners, refrigerators, aerosol sprays, and foamed packaging like Styrofoam (polystyrene): 180 Mt emitted × GWP 1430 = 257,400 Mt CO2e. These gases are invisible but massively destructive; molecule for molecule, they’re over 1,400 times worse than CO2.

CO2 from fossil fuels (cars, power plants, factories): 37,000 Mt emitted × GWP 1 = 37,000 Mt CO2e.

N2O (nitrous oxide), from fertilizer use, industry, and manure: 120 Mt emitted × GWP 265 = 31,800 Mt CO2e.

CH4 (methane), from cows, landfills, and natural gas leaks: 570 Mt emitted × GWP 28 = 15,960 Mt CO2e.

SF6 (sulfur hexafluoride), used for electrical insulation (e.g., circuit breakers and switchgear): 0.01 Mt emitted × GWP 23,500 = 235 Mt CO2e.

Even though CO2 makes up the biggest total volume of emissions, gases like HFCs and SF6 can have a much greater impact per molecule. A single factory’s leak of these high-GWP gases can warm the planet more than millions of cars. We need to stop thinking only about what we burn, and start paying attention to what we cool, package, spray, and insulate. We can’t fix what we don’t understand.
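For anyone who wants to check the multiplication, a quick Python snippet using exactly the figures quoted above (the numbers are as given in this comment, not fresh data):

# (emissions in Mt, GWP multiplier), figures as quoted in the comment
gases = {
    "HFCs":               (180,    1430),
    "CO2 (fossil fuels)": (37_000, 1),
    "N2O":                (120,    265),
    "CH4":                (570,    28),
    "SF6":                (0.01,   23_500),
}
for name, (mt, gwp) in sorted(gases.items(), key=lambda kv: -kv[1][0] * kv[1][1]):
    print(f"{name:>20}: {mt:>9,} Mt x GWP {gwp:>6,} = {mt * gwp:>11,.0f} Mt CO2e")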

Comment It's not the LLM, it's the user (Score 0) 206

This article is based on the very common misconception that an LLM can only predict the next word based on the training material fed into it. That is true as far as how 'attention' works in a transformer. However, advanced LLMs like ChatGPT are quite capable of innovating new solutions to unique or one-of-a-kind problems; I've used them for this countless times. There's something far more to the current models than simply regurgitating what's been fed into them. The real problem is that the general public, particularly journalists and the 'man on the street', is intellectually blunt and incapable of fully engaging these new tools.


You can not win the game, and you are not allowed to stop playing. -- The Third Law Of Thermodynamics
