
Submission: The Small World of English (inotherwords.app)

michaeldouma writes: We built a 1.5M-word semantic network where any two words connect in ~6.43 hops (76% connect in 7). The hard part wasn't the graph theory; it was getting rich, non-obvious associations. GPT-4's associations were painfully generic: "coffee → beverage, caffeine, morning." But we discovered LLMs excel at validation, not generation. Our solution: mine Library of Congress classifications (648k of them, representing 125 years of human categorization). "Coffee" appears in 2,542 different book classifications, from "Coffee trade--Labor--Guatemala" to "Coffee rust disease--Hawaii." Each classification became a focused prompt for generating domain-specific associations. Then we inverted the index: which classifications contain both "algorithm" and "fractals"? Turns out: "Mathematics in art" and "Algorithmic composition." This revealed connections like algorithm → Fibonacci → golden ratio that pure co-occurrence or word vectors miss. The "Montreal Effect" nearly tanked the project: geographic contamination where "bagels" spuriously linked to "Expo 67" because Montreal is famous for bagels. We used LLMs to filter true semantic relationships from geographic coincidence. Technical details: 80M API calls, superconnector deprecation (inverse document frequency variant), morphological deduplication. Built for a word game but the dataset has broader applications.
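The inverted-index step and the inverse-document-frequency idea mentioned above can be sketched roughly as follows. This is only an illustration, not the submitter's pipeline: the word sets attached to each classification are made-up toy data, and the names `invert`, `shared_headings`, and `idf` are hypothetical. The submission says only that an "inverse document frequency variant" was used, so the plain `log(N / df)` formula here is an assumption.

```python
import math
from collections import defaultdict

# Toy data (hypothetical): classification heading -> associated words.
# The real project reportedly used 648k Library of Congress classifications.
classifications = {
    "Mathematics in art": {"algorithm", "fractals", "Fibonacci", "golden ratio"},
    "Algorithmic composition": {"algorithm", "fractals", "music"},
    "Coffee trade--Labor--Guatemala": {"coffee", "labor", "Guatemala"},
    "Coffee rust disease--Hawaii": {"coffee", "fungus", "Hawaii"},
}

def invert(headings):
    """Build the inverted index: word -> set of headings that contain it."""
    index = defaultdict(set)
    for heading, words in headings.items():
        for word in words:
            index[word].add(heading)
    return index

def shared_headings(index, a, b):
    """Headings that contain both words, in alphabetical order."""
    return sorted(index.get(a, set()) & index.get(b, set()))

def idf(index, word, total_headings):
    """Plain IDF (an assumed formula): words in many headings score lower."""
    return math.log(total_headings / len(index[word]))

index = invert(classifications)
common = shared_headings(index, "algorithm", "fractals")
```

With this toy data, `common` holds the two headings named in the submission, and a word that appears in many headings (a would-be "superconnector") gets a lower `idf` score than a word that appears in only one.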

Submission: Semantic Word Games: From 1960s Origins to AI Tools (inotherwords.app)

michaeldouma writes: Games exploring word associations remain rare compared to spelling-focused word games. While Connections, Semantle, Codenames, and Taboo have broken through, there's little holistic examination of this sub-genre.

I've cataloged a dozen major and minor semantic games, from Borgmann's 1967 synonym chains to modern AI-powered implementations using word vectors and neural embeddings. The collection includes commercial hits, research experiments, and two games I developed.

The most fascinating aspect: virtually any two English words can connect through conceptual "stepping stones" in 7 steps or fewer. Our implementation maps 1.1 million words with 60 million weighted connections, revealing the hidden structure of language itself.
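Counting "stepping stones" between two words amounts to finding a shortest path in a graph. A minimal sketch, assuming an unweighted, undirected graph and a breadth-first search; the tiny edge list and the function names are hypothetical, and the real dataset is described as weighted, which this sketch ignores.

```python
from collections import defaultdict, deque

# Hypothetical toy edges; the real network is far larger and weighted.
edges = [
    ("coffee", "caffeine"),
    ("caffeine", "energy"),
    ("energy", "physics"),
    ("physics", "mathematics"),
    ("mathematics", "algorithm"),
    ("algorithm", "Fibonacci"),
    ("Fibonacci", "golden ratio"),
]

def build_graph(edge_list):
    """Undirected adjacency sets built from a list of word pairs."""
    graph = defaultdict(set)
    for a, b in edge_list:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of words on a shortest path, or None."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parent:
                continue
            parent[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start word.
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

graph = build_graph(edges)
path = shortest_path(graph, "coffee", "golden ratio")
hops = len(path) - 1
```

In this toy graph the path from "coffee" to "golden ratio" takes 7 hops. That number is a property of the invented edge list, not evidence for the claim about English.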

Comment works for me (Score 1) 93

GPT4: 17077 is an integer and it's a prime number. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself. In this case, the only factors of 17077 are 1 and 17077. This makes 17077 a relatively interesting number mathematically.
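The quoted claim is easy to check by trial division. A short sketch (the function name `is_prime` is just illustrative): it tests divisibility by 2 and then by every odd number up to the integer square root.

```python
import math

def is_prime(n):
    """Trial division: n is prime if no integer in 2..isqrt(n) divides it."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True

result = is_prime(17077)
```

Here `result` is `True`: no divisor up to 130 (the integer square root of 17077) divides it, so the quoted answer is correct on this point.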
Privacy

Safari 4's Messy Trail 200

Signum Ignitum writes "Safari 4 comes with a slew of cool new features, but extensive data generation combined with poor cleanup make for a data trail that's a privacy nightmare. Hidden files with screenshots of your history, files that point back to Web pages you've visited and cleared from your history, and thousands of XML files that track the changes in the pages in your Top Sites can add up to gigabytes of information you didn't know was kept about you." Some of Safari's bloat is kept in quite obscure locations; it takes a fairly knowledgeable user to find it and clean it up. You can avoid some of the worst of it by disabling Top Sites.
Google

Google Reveals "Secret" Server Designs 386

Hugh Pickens writes "Most companies buy servers from the likes of Dell, Hewlett-Packard, IBM or Sun Microsystems, but Google, which has hundreds of thousands of servers and considers running them part of its core expertise, designs and builds its own. For the first time, Google revealed the hardware at the core of its Internet might at a conference this week about data center efficiency. Google's big surprise: each server has its own 12-volt battery to supply power if there's a problem with the main source of electricity. 'This is much cheaper than huge centralized UPS,' says Google server designer Ben Jai. 'Therefore no wasted capacity.' Efficiency is a major financial factor. Large UPSs can reach 92 to 95 percent efficiency, meaning that a large amount of power is squandered. The server-mounted batteries do better, Jai said: 'We were able to measure our actual usage to greater than 99.9 percent efficiency.' Google has patents on the built-in battery design, 'but I think we'd be willing to license them to vendors,' says Urs Hoelzle, Google's vice president of operations. Google has an obsessive focus on energy efficiency. 'Early on, there was an emphasis on the dollar per (search) query,' says Hoelzle. 'We were forced to focus. Revenue per query is very low.'"
Windows

Draconian DRM Revealed In Windows 7 1127

TechForensics writes "A few days' testing of Windows 7 has already disclosed some draconian DRM, some of it unrelated to media files. A legitimate copy of Photoshop CS4 stopped functioning after we clobbered a nagging registration screen by replacing a DLL with a hacked version. With regard to media files, the days of capturing an audio program on your PC seem to be over (if the program originated on that PC). The inputs of your sound card are severely degraded in software if the card is also playing an audio program (tested here with Grooveshark). This may be the tip of the iceberg. Being in bed with the RIAA is bad enough, but locking your own files away from you is a tactic so outrageous it may kill the OS for many persons. Many users will not want to experiment with a second sound card or computer just to record from online sources, or boot up under a Linux that supports ntfs-3g just to control their files." Read on for more details of this user's findings.
GNU is Not Unix

A Software License That's Libre But Not Gratis? 246

duncan bayne writes "My company is developing some software using Ruby. It's proprietary software — decidedly not free-as-in-beer — but I don't want to tie my customers down with the usual prohibitions on reverse engineering, modification, etc. After all, they're licensing the product from us, so I think they should be able to use it as they see fit. Does anyone know of an existing license that could be used in this case? Something that gives the customer the freedom to modify the product as they want, but prohibits them from creating derivative works, or redistributing it in any fashion?"
Music

Behind the Scenes In Apple Vs. the Record Labels 146

je ne sais quoi writes "The New York Times recently posted an article describing what really happened between Apple and the record labels that culminated with the January 6th Macworld Keynote by Apple Senior VP Phil Schiller." Essentially they discuss a bit of a swap: Apple allowed variable pricing for songs and the industry allowed DRM-free music. And apparently the iTunes homepage is a huge hit-making device. Big shock.
Communications

Mediterranean Undersea Cables Cut, Again 329

miller60 writes "Three undersea cables in the Mediterranean Sea have failed within minutes of each other in an incident that is eerily similar to a series of cable cuts in the region in early 2008. The cable cuts are already causing serious service problems in the Middle East and Asia. See coverage at the Internet Storm Center, Data Center Knowledge and Bloomberg. The February 2008 cable cuts triggered rampant speculation about sabotage, but were later attributed to ships that dropped anchor in the wrong place."
Government

Commerce Department Pushing For New "Copyright Czar" 294

TechDirt is reporting that those all-too-familiar "stats" surrounding the cost of piracy are being trotted out in an attempt to push through a new "Copyright Czar" position. "In urging President Bush to sign into law the ProIP bill, which would give him a copyright czar (something the Justice Department had said it doesn't want), the US Chamber of Commerce is claiming that 750,000 American jobs have been lost to piracy. Yet, it doesn't cite where that number comes from."
Google

Stuck In Google's Doghouse 165

hansoloaf writes "The NY Times is running an article about a business, Sourcetool.com, that seems to be in a sort of doghouse with Google. Initially Sourcetool used AdWords to help build up its business. The business centers on providing links for businesses that sell industrial products. The owner, Dan Savage, explains in detail how Google over time used its AdWords bidding system to limit or reduce Sourcetool's ranking and revenue because the site's landing page is not 'googly' enough. Savage wrote a letter to the Justice Department as they are reportedly looking into Google and Yahoo's proposed deal." The article is nuanced in its observations about the complexity and ambiguity of anti-trust law. Even if Sourcetool and similar businesses aren't "Googly" (a Google proxy for "what the customer wants to see in search results"), should Google be able to pick winners and losers among industries and business models?
