michaeldouma - Slashdot User

Submission + - The Small World of English (inotherwords.app)

Submitted by michaeldouma on Tuesday June 03, 2025 @04:32PM

michaeldouma writes: We built a 1.5M word semantic network where any two words connect in ~6.43 hops (76% connect in 7). The hard part wasn't the graph theory—it was getting rich, non-obvious associations. GPT-4's associations were painfully generic: "coffee beverage, caffeine, morning." But we discovered LLMs excel at validation, not generation. Our solution: Mine Library of Congress classifications (648k of them, representing 125 years of human categorization). "Coffee" appears in 2,542 different book classifications—from "Coffee trade—Labor—Guatemala" to "Coffee rust disease—Hawaii." Each classification became a focused prompt for generating domain-specific associations. Then we inverted the index: which classifications contain both "algorithm" and "fractals"? Turns out: "Mathematics in art" and "Algorithmic composition." This revealed connections like algorithmFibonaccigolden ratio that pure co-occurrence or word vectors miss. The "Montreal Effect" nearly tanked the project—geographic contamination where "bagels" spuriously linked to "Expo 67" because Montreal is famous for bagels. We used LLMs to filter true semantic relationships from geographic coincidence. Technical details: 80M API calls, superconnector deprecation (inverse document frequency variant), morphological deduplication. Built for a word game but the dataset has broader applications.

Submission + - Semantic Word Games: From 1960s Origins to AI Tools (inotherwords.app)

Submitted by michaeldouma on Friday February 28, 2025 @04:13PM

michaeldouma writes: Games exploring word associations remain rare compared to spelling-focused word games. While Connections, Semantle, Codenames, and Taboo have broken through, there's little holistic examination of this sub-genre.

I've cataloged a dozen major and minor semantic games, from Borgmann's 1967 synonym chains to modern AI-powered implementations using word vectors and neural embeddings. The collection includes commercial hits, research experiments, and two games I developed.

The most fascinating aspect: virtually any two English words can connect through conceptual "stepping stones" in 7 steps or fewer. Our implementation maps 1.1 million words with 60 million weighted connections, revealing the hidden structure of language itself.

Comment The Night Agent had little competition (Score 2) 50

by michaeldouma on Friday December 15, 2023 @01:29PM (#64083825) Attached to: Netflix's Big Data Dump Shows Just OK TV Is Here To Stay

Part of why The Night Agent was popular is they released it the same month that very little else was released by Netflix or any other major streamer. Viewers eager for some semi-mindless action had little choice but to watch it.

Comment works for me (Score 1) 93

by michaeldouma on Thursday July 20, 2023 @11:25PM (#63703462) Attached to: Is ChatGPT Getting Worse?

GPT4: 17077 is an integer and it's a prime number. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself. In this case, the only factors of 17077 are 1 and 17077. This makes 17077 a relatively interesting number mathematically.

Comment Re:16-colors (Score 1) 77

by michaeldouma on Saturday August 28, 2021 @12:02AM (#61737623) Attached to: Why Are Hyperlinks Blue?

Right. it was just a simple set of low-bit colors. blue and purple used to be the rule of the land, and the two colors do not interfer with early web pages.

Comment Sites with exhibit designers (Score 1) 122

by michaeldouma on Wednesday December 09, 2009 @04:52AM (#30375166) Attached to: Interactive Computer Exhibits For Ages 3-8?

Two sources which might help: http://www.exhibitfiles.org/ http://informalscience.org/

Google Reveals "Secret" Server Designs 386

Posted by CmdrTaco on Thursday April 02, 2009 @12:06PM from the so-secret-everybody-knew dept.

Hugh Pickens writes "Most companies buy servers from the likes of Dell, Hewlett-Packard, IBM or Sun Microsystems, but Google, which has hundreds of thousands of servers and considers running them part of its core expertise, designs and builds its own. For the first time, Google revealed the hardware at the core of its Internet might at a conference this week about data center efficiency. Google's big surprise: each server has its own 12-volt battery to supply power if there's a problem with the main source of electricity. 'This is much cheaper than huge centralized UPS,' says Google server designer Ben Jai. 'Therefore no wasted capacity.' Efficiency is a major financial factor. Large UPSs can reach 92 to 95 percent efficiency, meaning that a large amount of power is squandered. The server-mounted batteries do better, Jai said: 'We were able to measure our actual usage to greater than 99.9 percent efficiency.' Google has patents on the built-in battery design, 'but I think we'd be willing to license them to vendors,' says Urs Hoelzle, Google's vice president of operations. Google has an obsessive focus on energy efficiency. 'Early on, there was an emphasis on the dollar per (search) query,' says Hoelzle. 'We were forced to focus. Revenue per query is very low.'"

Jupiter's Great Red Spot Is Shrinking 270

Posted by CmdrTaco on Thursday April 02, 2009 @11:22AM from the like-my-will-to-live dept.

cjstaples noted a CNN story proclaiming that Jupiter's signature red spot is shrinking. Over a 10 year study, the giant storm lost just over half a kilometer per day for a total loss of about 15%. Scientists know about shrinkage, right?

Submission + - Hiring Programmers: The High Cost of Low Quality (revsys.com) 1

Submitted by Anonymous Coward on Monday August 06, 2007 @10:18AM

An anonymous reader writes: Why is it so hard to find good programmers? And why should companies favor hiring fewer more senior developers rather than many junior ones? Frank Wiles discusses his thoughts in his article A Guide to Hiring Programmers: The High Cost of Low Quality

Submission + - Forensics Expert says Al-Qaeda Images Altered

Submitted by WerewolfOfVulcan on Thursday August 02, 2007 @04:33PM

WerewolfOfVulcan writes: Wired reports that researcher Neal Krawetz revealed some veeeeeery interesting things about the Al-Qaeda images that our government loves to show off.

From the article: "Krawetz was also able to determine that the writing on the banner behind al-Zawahiri's head was added to the image afterward. In the second picture above showing the results of the error level analysis, the light clusters on the image indicate areas of the image that were added or changed. The subtitles and logos in the upper right and lower left corners (IntelCenter is an organization that monitors terrorist activity and As-Sahab is the video production branch of al Qaeda) were all added at the same time, while the banner writing was added at a different time, likely around the same time that al-Zawahiri was added, Krawetz says." Why would Al-Qaeda add an IntelCenter logo to their video? Why would IntelCenter add an Al-Qaeda logo? Methinks we have bigger fish to fry than Gonzo and his fired attorneys... }:-) The article contains links to Krawetz's presentation and the source code he used to analyze the photos.

Slashdot Top Deals