AI

AI Models Face Collapse If They Overdose On Their Own Output 106

According to a new study published in Nature, researchers found that training AI models using AI-generated datasets can lead to "model collapse," where models produce increasingly nonsensical outputs over generations. "In one example, a model started with a text about European architecture in the Middle Ages and ended up -- in the ninth generation -- spouting nonsense about jackrabbits," writes The Register's Lindsay Clark. From the report: [W]ork led by Ilia Shumailov, Google DeepMind and Oxford post-doctoral researcher, found that an AI may fail to pick up less common lines of text, for example, in training datasets, which means subsequent models trained on the output cannot carry forward those nuances. Training new models on the output of earlier models in this way ends up in a recursive loop. In an accompanying article, Emily Wenger, assistant professor of electrical and computer engineering at Duke University, illustrated model collapse with the example of a system tasked with generating images of dogs. "The AI model will gravitate towards recreating the breeds of dog most common in its training data, so might over-represent the Golden Retriever compared with the Petit Basset Griffon Vendéen, given the relative prevalence of the two breeds," she said.

"If subsequent models are trained on an AI-generated data set that over-represents Golden Retrievers, the problem is compounded. With enough cycles of over-represented Golden Retriever, the model will forget that obscure dog breeds such as Petit Basset Griffon Vendeen exist and generate pictures of just Golden Retrievers. Eventually, the model will collapse, rendering it unable to generate meaningful content." While she concedes an over-representation of Golden Retrievers may be no bad thing, the process of collapse is a serious problem for meaningful representative output that includes less-common ideas and ways of writing. "This is the problem at the heart of model collapse," she said.
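Wenger's dog-breed example is easy to caricature in code. The toy loop below is a deliberately crude sketch, not the Nature paper's methodology: each "generation" is trained only on a sample of the previous generation's output, and because sampling can lose rare categories but never recover them, the dataset eventually collapses onto a single breed.

```python
import random

random.seed(0)

# Generation 0: a dataset where one breed is rare (5%).
data = ["golden_retriever"] * 95 + ["petit_basset"] * 5

def next_generation(data, sample_size=50):
    # "Train" on the previous generation's output by re-estimating breed
    # frequencies, then "generate" a new dataset by sampling from them.
    # Sampling with replacement means a rare breed can vanish by chance,
    # and once gone it is gone for good.
    return random.choices(data, k=sample_size)

for generation in range(500):
    data = next_generation(data)

# After many generations the dataset has collapsed onto a single breed.
print(sorted(set(data)))
```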
AI

AI Trains On Kids' Photos Even When Parents Use Strict Privacy Settings 33

An anonymous reader quotes a report from Ars Technica: Human Rights Watch (HRW) continues to reveal how photos of real children casually posted online years ago are being used to train AI models powering image generators -- even when platforms prohibit scraping and families use strict privacy settings. Last month, HRW researcher Hye Jung Han found 170 photos of Brazilian kids that were linked in LAION-5B, a popular AI dataset built from Common Crawl snapshots of the public web. Now, she has released a second report, flagging 190 photos of children from all of Australia's states and territories, including indigenous children who may be particularly vulnerable to harms. These photos are linked in the dataset "without the knowledge or consent of the children or their families." They span the entirety of childhood, making it possible for AI image generators to generate realistic deepfakes of real Australian children, Han's report said. Perhaps even more concerning, the URLs in the dataset sometimes reveal identifying information about children, including their names and locations where photos were shot, making it easy to track down children whose images might not otherwise be discoverable online. That puts children in danger of privacy and safety risks, Han said, and some parents thinking they've protected their kids' privacy online may not realize that these risks exist.

From a single link to one photo that showed "two boys, ages 3 and 4, grinning from ear to ear as they hold paintbrushes in front of a colorful mural," Han could trace "both children's full names and ages, and the name of the preschool they attend in Perth, in Western Australia." And perhaps most disturbingly, "information about these children does not appear to exist anywhere else on the Internet" -- suggesting that families were particularly cautious in shielding these boys' identities online. Stricter privacy settings were used in another image that Han found linked in the dataset. The photo showed "a close-up of two boys making funny faces, captured from a video posted on YouTube of teenagers celebrating" during the week after their final exams, Han reported. Whoever posted that YouTube video adjusted privacy settings so that it would be "unlisted" and would not appear in searches. Only someone with a link to the video was supposed to have access, but that didn't stop Common Crawl from archiving the image, nor did YouTube policies prohibiting AI scraping or harvesting of identifying information.

Reached for comment, YouTube's spokesperson, Jack Malon, told Ars that YouTube has "been clear that the unauthorized scraping of YouTube content is a violation of our Terms of Service, and we continue to take action against this type of abuse." But Han worries that even if YouTube did join efforts to remove images of children from the dataset, the damage has been done, since AI tools have already trained on them. That's why -- even more than parents need tech companies to up their game blocking AI training -- kids need regulators to intervene and stop training before it happens, Han's report said. Han's report comes a month before Australia is expected to release a reformed draft of the country's Privacy Act. Those reforms include a draft of Australia's first child data protection law, known as the Children's Online Privacy Code, but Han told Ars that even people involved in long-running discussions about reforms aren't "actually sure how much the government is going to announce in August." "Children in Australia are waiting with bated breath to see if the government will adopt protections for them," Han said, emphasizing in her report that "children should not have to live in fear that their photos might be stolen and weaponized against them."
AI

Foundation Honoring 'Star Trek' Creator Offers $1M Prize for AI Startup Benefiting Humanity (yahoo.com) 37

The Roddenberry Foundation — named for Star Trek creator Gene Roddenberry — "announced Tuesday that this year's biennial award would focus on artificial intelligence that benefits humanity," reports the Los Angeles Times: Lior Ipp, chief executive of the foundation, told The Times there's a growing recognition that AI is becoming more ubiquitous and will affect all aspects of our lives. "We are trying to ... catalyze folks to think about what AI looks like if it's used for good," Ipp said, "and what it means to use AI responsibly, ethically and toward solving some of the thorny global challenges that exist in the world...."

Ipp said the foundation shares the broad concern about AI and sees the award as a means to potentially contribute to creating those guardrails... Inspiration for the theme was also born of the applications the foundation received last time around. Ipp said the prize, which is "issue-agnostic" but focused on early-stage tech, produced compelling uses of AI and machine learning in agriculture, healthcare, biotech and education. "So," he said, "we sort of decided to double down this year on specifically AI and machine learning...."

Though the foundation isn't prioritizing a particular issue, the application states that it is looking for ideas that have the potential to push the needle on one or more of the United Nations' 17 sustainable development goals, which include eliminating poverty and hunger as well as boosting climate action and protecting life on land and underwater.

The Foundation's most recent winner was Sweden-based Elypta, according to the article, "which Ipp said is using liquid biopsies, such as a blood test, to detect cancer early."

"We believe that building a better future requires a spirit of curiosity, a willingness to push boundaries, and the courage to think big," said Rod Roddenberry, co-founder of the Roddenberry Foundation. "The Prize will provide a significant boost to AI pioneers leading these efforts." According to the Foundation's announcement, the Prize "embodies the Roddenberry philosophy's promise of a future in which technology and human ingenuity enable everyone — regardless of background — to thrive."

"By empowering entrepreneurs to dream bigger and innovate valiantly, the Roddenberry Prize seeks to catalyze the development of AI solutions that promote abundance and well-being for all."
Space

Dark Matter Found? New Study Furthers Stephen Hawking's Predictions About 'Primordial' Black Holes (cnn.com) 90

Where is dark matter, the invisible mass that must exist to bind galaxies together? Stephen Hawking postulated it could be hiding in "primordial" black holes formed during the big bang, writes CNN.

"Now, a new study by researchers with the Massachusetts Institute of Technology has brought the theory back into the spotlight, revealing what these primordial black holes were made of and potentially discovering an entirely new type of exotic black hole in the process." Other recent studies have confirmed the validity of Hawking's hypothesis, but the work of [MIT graduate student Elba] Alonso-Monsalve and [study co-author David] Kaiser, a professor of physics and the Germeshausen Professor of the History of Science at MIT, goes one step further and looks into exactly what happened when primordial black holes first formed. The study, published June 6 in the journal Physical Review Letters, reveals that these black holes must have appeared in the first quintillionth of a second of the big bang: "That is really early, and a lot earlier than the moment when protons and neutrons, the particles everything is made of, were formed," Alonso-Monsalve said... "You cannot find quarks and gluons alone and free in the universe now, because it is too cold," Alonso-Monsalve added. "But early in the big bang, when it was very hot, they could be found alone and free. So the primordial black holes formed by absorbing free quarks and gluons."

Such a formation would make them fundamentally different from the astrophysical black holes that scientists normally observe in the universe, which are the result of collapsing stars. Also, a primordial black hole would be much smaller — only the mass of an asteroid, on average, condensed into the volume of a single atom. But if a sufficient number of these primordial black holes did not evaporate in the early big bang and survived to this day, they could account for all or most dark matter.

During the making of the primordial black holes, another type of previously unseen black hole must have formed as a kind of byproduct, according to the study. These would have been even smaller — just the mass of a rhino, condensed into less than the volume of a single proton... "It's inevitable that these even smaller black holes would have also formed, as a byproduct (of primordial black holes' formation)," Alonso-Monsalve said, "but they would not be around today anymore, as they would have evaporated already." However, if they were still around just ten millionths of a second into the big bang, when protons and neutrons formed, they could have left observable signatures by altering the balance between the two particle types.

Professor Kaiser told CNN the next generation of gravitational detectors "could catch a glimpse of the small-mass black holes — an exotic state of matter that was an unexpected byproduct of the more mundane black holes that could explain dark matter today."

Nico Cappelluti, an assistant professor in the physics department of the University of Miami (who was not involved with the study) confirmed to CNN that "This work is an interesting, viable option for explaining the elusive dark matter."
AI

OpenAI CTO: AI Could Kill Some Creative Jobs That Maybe Shouldn't Exist Anyway (pcmag.com) 88

OpenAI CTO Mira Murati isn't worried about how AI could hurt some creative jobs, suggesting during a talk that some jobs were maybe always a bit replaceable anyway. From a report: "I think it's really going to be a collaborative tool, especially in the creative spaces," Murati told Dartmouth College Trustee Jeffrey Blackburn during a conversation about AI hosted at the school's engineering department. "Some creative jobs maybe will go away, but maybe they shouldn't have been there in the first place," the CTO said of AI's role in the workplace. "I really believe that using it as a tool for education, [and] creativity, will expand our intelligence."
Math

Mathematician Reveals 'Equals' Has More Than One Meaning In Math (sciencealert.com) 118

"It turns out that mathematicians actually can't agree on the definition of what makes two things equal, and that could cause some headaches for computer programs that are increasingly being used to check mathematical proofs," writes Clare Watson via ScienceAlert. The issue has prompted British mathematician Kevin Buzzard to re-examine the concept of equality to "challenge various reasonable-sounding slogans about equality." The research has been posted on arXiv. From the report: In familiar usage, the equals sign sets up equations that describe different mathematical objects that represent the same value or meaning, something which can be proven with a few switcharoos and logical transformations from side to side. For example, the integer 2 can describe a pair of objects, as can 1 + 1. But a second definition of equality has been used amongst mathematicians since the late 19th century, when set theory emerged. Set theory has evolved and with it, mathematicians' definition of equality has expanded too. A set like {1, 2, 3} can be considered 'equal' to a set like {a, b, c} because of an implicit understanding called canonical isomorphism, which compares similarities between the structures of groups.

"These sets match up with each other in a completely natural way and mathematicians realised it would be really convenient if we just call those equal as well," Buzzard told New Scientist's Alex Wilkins. However, taking canonical isomorphism to mean equality is now causing "some real trouble," Buzzard writes, for mathematicians trying to formalize proofs -- including decades-old foundational concepts -- using computers. "None of the [computer] systems that exist so far capture the way that mathematicians such as Grothendieck use the equal symbol," Buzzard told Wilkins, referring to Alexander Grothendieck, a leading mathematician of the 20th century who relied on set theory to describe equality.

Some mathematicians think they should just redefine mathematical concepts to formally equate canonical isomorphism with equality. Buzzard disagrees. He thinks the incongruence between mathematicians and machines should prompt math minds to rethink what exactly they mean by mathematical concepts as foundational as equality so computers can understand them. "When one is forced to write down what one actually means and cannot hide behind such ill-defined words," Buzzard writes, "one sometimes finds that one has to do extra work, or even rethink how certain ideas should be presented."
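The two notions of equality Buzzard contrasts can be loosely mimicked in code. This is a toy analogy, not his formal treatment: Python's `==` plays the role of strict set equality, while "equal up to canonical isomorphism" is closer to asking whether a one-to-one matching exists between two finite sets.

```python
def strictly_equal(a, b):
    # Literal set equality: the two sets contain the same elements.
    return a == b

def bijection_exists(a, b):
    # For finite sets, a one-to-one correspondence exists exactly when
    # the cardinalities match -- a crude stand-in for "equal up to
    # canonical isomorphism".
    return len(a) == len(b)

s1, s2 = {1, 2, 3}, {"a", "b", "c"}
print(strictly_equal(s1, s2))    # False: the elements differ
print(bijection_exists(s1, s2))  # True: {1,2,3} matches up with {a,b,c}
```

A proof assistant must commit to one of these meanings, which is exactly where Buzzard says formalization efforts run into trouble.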

AI

OpenAI CEO Says Company Could Become a For-Profit Corporation Like xAI, Anthropic (reuters.com) 23

On Wednesday, The Information reported that OpenAI had doubled its annualized revenue — a measure of the previous month's revenue multiplied by 12 — in the last six months. It's now $3.4 billion (up from around $1 billion last summer, notes Engadget).
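The metric is simple enough to spell out: take one month's revenue and multiply by 12, so the reported ~$3.4 billion figure implies a monthly run rate of roughly $283 million (the monthly number below is back-calculated for illustration, not a reported figure).

```python
def annualized_revenue(monthly_revenue):
    # "Annualized revenue" as defined above: one month's revenue
    # extrapolated to a full year.
    return monthly_revenue * 12

# A month at roughly $283M implies the reported ~$3.4B annualized figure:
print(annualized_revenue(283_000_000))        # 3396000000
print(annualized_revenue(283_000_000) / 1e9)  # ~3.4 (billions)
```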

And now an anonymous reader shares a new report from The Information: OpenAI CEO Sam Altman recently told some shareholders that the artificial intelligence developer is considering changing its governance structure to a for-profit business that OpenAI's nonprofit board doesn't control, according to a person who heard the comments. One scenario Altman said the board is considering is a for-profit benefit corporation, which rivals such as Anthropic and xAI are using, this person said. Such a change could open the door to an eventual initial public offering of OpenAI, which currently sports a private valuation of $86 billion, and may give Altman an opportunity to take a stake in the fast-growing company, a move some investors have been pushing.
More from Reuters: The restructuring discussions are fluid and Altman and his fellow directors could ultimately decide to take a different approach, The Information added. In response to Reuters' queries about the report, OpenAI said: "We remain focused on building AI that benefits everyone. The nonprofit is core to our mission and will continue to exist."
Is that a classic non-denial denial?

Note that the nonprofit's "continuing to exist" does not in any way preclude OpenAI from becoming a for-profit business — with a spin-off nonprofit, continuing to exist...
Space

Wild New Study Suggests Gravity Can Exist Without Mass (sciencealert.com) 120

A new study by astrophysicist Richard Lieu suggests that gravity can exist without mass, proposing thin, shell-like layers of 'topological defects' as an alternative to dark matter for explaining the gravitational binding of galaxies. This theory posits that these defects create a gravitational force without detectable mass, potentially eliminating the need for dark matter in current cosmological models. Clare Watson reports via ScienceAlert: Lieu started out trying to find another solution to the Einstein field equations, which relate the curvature of space-time to the presence of matter within it. As Einstein described in his 1915 theory of general relativity, space-time warps around bundles of matter and streams of radiation in the Universe, depending on their energy and momentum. That energy is, of course, related to mass in Einstein's famous equation: E=mc2. So an object's mass is linked to its energy, which bends space-time -- and this curvature of space-time is what Einstein described as gravity, a notch more sophisticated than Newton's 17th-century approximation of gravity as a force between two objects with mass. In other words, gravity seems inextricably linked to mass. Not so, posits Lieu.

In his workings, Lieu set about solving a simplified version of the Einstein field equations that allows for a finite gravitation force in the absence of any detectable mass. He says his efforts were "driven by my frustration with the status quo, namely the notion of dark matter's existence despite the lack of any direct evidence for a whole century." Lieu's solution consists of shell-shaped topological defects that might occur in very compact regions of space with a very high density of matter. These sets of concentric shells contain a thin layer of positive mass tucked inside an outer layer of negative mass. The two masses cancel each other out, so the total mass of the two layers is exactly zero. But when a star lies on this shell, it experiences a large gravitational force dragging it towards the center of the shell. "The contention of my paper is that at least the shells it posits are massless," Lieu says. If those contentious suggestions bear any weight, "there is then no need to perpetuate this seemingly endless search for dark matter," Lieu adds.

The next question, then, is how to possibly confirm or refute the shells Lieu has proposed through observations. "The increasing frequency of sightings of ring and shell-like formation of galaxies in the Universe lends evidence to the type of source being proposed here," Lieu writes in his paper. Although he admits that his proposed solution is "highly suggestive" and cannot alone discredit the dark matter hypothesis. "It could be an interesting mathematical exercise at best," Lieu concludes. "But it is the first [mathematical] proof that gravity can exist without mass."
The study has been published in Monthly Notices of the Royal Astronomical Society.
Science

African Elephants Address One Another With Individually Specific Name-Like Calls (nature.com) 31

Abstract of a paper published on Nature: Personal names are a universal feature of human language, yet few analogues exist in other species. While dolphins and parrots address conspecifics by imitating the calls of the addressee, human names are not imitations of the sounds typically made by the named individual. Labelling objects or individuals without relying on imitation of the sounds made by the referent radically expands the expressive power of language. Thus, if non-imitative name analogues were found in other species, this could have important implications for our understanding of language evolution. Here we present evidence that wild African elephants address one another with individually specific calls, probably without relying on imitation of the receiver. We used machine learning to demonstrate that the receiver of a call could be predicted from the call's acoustic structure, regardless of how similar the call was to the receiver's vocalizations. Moreover, elephants differentially responded to playbacks of calls originally addressed to them relative to calls addressed to a different individual. Our findings offer evidence for individual addressing of conspecifics in elephants. They further suggest that, unlike other non-human animals, elephants probably do not rely on imitation of the receiver's calls to address one another.
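The paper's core analysis, predicting a call's receiver from the call's acoustic structure, can be caricatured with synthetic data and a nearest-centroid classifier. Everything below is an invented illustration: the real study used recorded elephant rumbles and more sophisticated models, but the logic is the same, that calls addressed to the same individual share structure a classifier can exploit.

```python
import random

random.seed(1)

def make_call(receiver_signature):
    # Each synthetic "call" = a receiver-specific signature plus noise.
    return [s + random.gauss(0, 0.5) for s in receiver_signature]

signatures = {"receiver_A": [0.0, 1.0], "receiver_B": [2.0, -1.0]}
train = [(make_call(sig), name)
         for name, sig in signatures.items() for _ in range(20)]

def centroid(points):
    return [sum(xs) / len(xs) for xs in zip(*points)]

centroids = {
    name: centroid([call for call, label in train if label == name])
    for name in signatures
}

def predict(call):
    # Nearest-centroid classification on the acoustic features.
    return min(
        centroids,
        key=lambda n: sum((a - b) ** 2 for a, b in zip(call, centroids[n])),
    )

held_out = [(make_call(sig), name)
            for name, sig in signatures.items() for _ in range(10)]
accuracy = sum(predict(call) == label for call, label in held_out) / len(held_out)
print(accuracy > 0.5)  # well above the 0.5 chance level
```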
DRM

Big Copyright Win in Canada: Court Rules Fair Use Beats Digital Locks (michaelgeist.ca) 16

Pig Hogger (Slashdot reader #10,379) reminds us that in Canadian law, "fair use" is called "fair dealing" — and that Canadian digital media users just enjoyed a huge win. Canadian user rights champion Michael Geist writes: The Federal Court has issued a landmark decision on copyright's anti-circumvention rules which concludes that digital locks should not trump fair dealing. Rather, the two must co-exist in harmony, leading to an interpretation that users can still rely on fair dealing even in cases involving those digital locks.

The decision could have enormous implications for libraries, education, and users more broadly as it seeks to restore the copyright balance in the digital world. The decision also importantly concludes that merely requiring a password does not meet the standard needed to qualify for copyright rules involving technological protection measures.

Canada's 2012 "Copyright Modernization Act" protected anti-copying technology from circumvention, Geist writes — and Blacklock's Reports had then "argued that allowing anyone other than the original subscriber to access articles constituted copyright infringement." The court found that the Blacklock's legal language associated with its licensing was confusing and that fair dealing applied here as well...

Blacklock's position on this issue was straightforward: it argued that its content was protected by a password, that passwords constituted a form of technological protection measure, and that fair dealing does not apply in the context of circumvention. In other words, it argued that the act of circumvention (in this case of a password) was itself infringing and it could not be saved by fair dealing. The Federal Court disagreed on all points...

For years, many have argued for a specific exception to clarify that circumvention was permitted for fair dealing purposes, essentially making the case that users should not lose their fair dealing rights the moment a rights holder places a digital lock on their work. The Federal Court has concluded that the fair dealing rights have remained there all along and that the Copyright Act's anti-circumvention rules must be interpreted in a manner consistent with those rights.

"The case could still be appealed, but for now the court has restored a critical aspect of the copyright balance after more than a decade of uncertainty and concern."
Supercomputing

UK Imposes Mysterious Ban On Quantum Computer Exports (newscientist.com) 19

Longtime Slashdot reader MattSparkes shares a report from NewScientist: Quantum computing experts are baffled by the UK government's new export restrictions on the exotic devices (source paywalled), saying they make little sense. [The UK government has set limits on the capabilities of quantum computers that can be exported -- starting with those above 34 qubits, and rising as long as error rates are also higher -- and has declined to explain these limits on the grounds of national security.] The legislation applies to both existing, small quantum computers that are of no practical use and larger computers that don't actually exist, so cannot be exported. Instead, there are fears the limits will restrict sales and add bureaucracy to a new and growing sector. For more context, here's an excerpt from an article published by The Telegraph in March: The technology has been added to a list of "dual use" items that could have military uses maintained by the Export Control Joint Unit, which scrutinizes sales of sensitive goods. A national quantum computer strategy published last year described the technology as being "critically important" for defense and national security and said the UK was in a "global race" to develop it. [...] The changes have been introduced as part of a broader update to export rules agreed by Western allies including the US and major European countries. Several nations with particular expertise on quantum computer technologies have added specific curbs, including France which introduced rules at the start of this month.

Last year, industry body Quantum UK said British companies were concerned about the prospect of further export controls, and that they could even put off US companies seeking to relocate to the UK. Quantum computer exports only previously required licenses in specific cases, such as when they were likely to lead to military use. Oxford Instruments, which makes cooling systems for quantum computers, said last year that sales in China had been hit by increasing curbs. James Lindop of law firm Eversheds Sutherland said: "Semiconductor and quantum technologies -- two areas in which the UK already holds a world-leading position -- are increasingly perceived to be highly strategic and critical to UK national security. This will undoubtedly create an additional compliance burden for businesses active in the development and production of the targeted technologies."

Government

Did the US Government Ignore a Chance to Make TikTok Safer? (yahoo.com) 59

"To save itself, TikTok in 2022 offered the U.S. government an extraordinary deal," reports the Washington Post. The video app, owned by a Chinese company, said it would let federal officials pick its U.S. operation's board of directors, would give the government veto power over each new hire and would pay an American company that contracts with the Defense Department to monitor its source code, according to a copy of the company's proposal. It even offered to give federal officials a kill switch that would shut the app down in the United States if they felt it remained a threat.

The Biden administration, however, went its own way. Officials declined the proposal, forfeiting potential influence over one of the world's most popular apps in favor of a blunter option: a forced-sale law signed last month by President Biden that could lead to TikTok's nationwide ban. The government has never publicly explained why it rejected TikTok's proposal, opting instead for a potentially protracted constitutional battle that many expect to end up before the Supreme Court... But the extent to which the United States evaluated or disregarded TikTok's proposal, known as Project Texas, is likely to be a core point of dispute in court, where TikTok and its owner, ByteDance, are challenging the sale-or-ban law as an "unconstitutional assertion of power."

The episode raises questions over whether the government, when presented with a way to address its concerns, chose instead to back an effort that would see the company sold to an American buyer, even though some of the issues officials have warned about — the opaque influence of its recommendation algorithm, the privacy of user data — probably would still be unresolved under new ownership...

A senior Biden administration official said in a statement that the administration "determined more than a year ago that the solution proposed by the parties at the time would be insufficient to address the serious national security risks presented. While we have consistently engaged with the company about our concerns and potential solutions, it became clear that divestment from its foreign ownership was and remains necessary."

"Since federal officials announced an investigation into TikTok in 2019, the app's user base has doubled to more than 170 million U.S. accounts," according to the article.

It also includes this assessment from Anupam Chander, a Georgetown University law professor who researches international tech policy. "The government had a complete absence of faith in [its] ability to regulate technology platforms, because there might be some vulnerability that might exist somewhere down the line."
NASA

NASA's James Webb Space Telescope Finds Most Distant Known Galaxy (nasa.gov) 42

With the help of NASA's James Webb Space Telescope (JWST), an international team of astronomers discovered a galaxy at a redshift of 14.32, indicating it existed just 290 million years post-Big Bang. In a NASA release today, Stefano Carniani from Scuola Normale Superiore in Pisa, Italy, and Kevin Hainline from the University of Arizona in Tucson, Arizona, described how this source was found and what its unique properties tell us about galaxy formation: "The instruments on Webb were designed to find and understand the earliest galaxies, and in the first year of observations as part of the JWST Advanced Deep Extragalactic Survey (JADES), we found many hundreds of candidate galaxies from the first 650 million years after the big bang. In early 2023, we discovered a galaxy in our data that had strong evidence of being above a redshift of 14, which was very exciting, but there were some properties of the source that made us wary. The source was surprisingly bright, which we wouldn't expect for such a distant galaxy, and it was very close to another galaxy such that the two appeared to be part of one larger object. When we observed the source again in October 2023 as part of the JADES Origins Field, new imaging data obtained with Webb's narrower NIRCam (Near-Infrared Camera) filters pointed even more toward the high-redshift hypothesis. We knew we needed a spectrum, as whatever we would learn would be of immense scientific importance, either as a new milestone in Webb's investigation of the early universe or as a confounding oddball of a middle-aged galaxy.

In January 2024, NIRSpec observed this galaxy, JADES-GS-z14-0, for almost ten hours, and when the spectrum was first processed, there was unambiguous evidence that the galaxy was indeed at a redshift of 14.32, shattering the previous most-distant galaxy record (z = 13.2 of JADES-GS-z13-0). Seeing this spectrum was incredibly exciting for the whole team, given the mystery surrounding the source. This discovery was not just a new distance record for our team; the most important aspect of JADES-GS-z14-0 was that at this distance, we know that this galaxy must be intrinsically very luminous. From the images, the source is found to be over 1,600 light-years across, proving that the light we see is coming mostly from young stars and not from emission near a growing supermassive black hole. This much starlight implies that the galaxy is several hundreds of millions of times the mass of the Sun! This raises the question: How can nature make such a bright, massive, and large galaxy in less than 300 million years?

The data reveal other important aspects of this astonishing galaxy. We see that the color of the galaxy is not as blue as it could be, indicating that some of the light is reddened by dust, even at these very early times. JADES researcher Jake Helton of Steward Observatory and the University of Arizona also identified that JADES-GS-z14-0 was detected at longer wavelengths with Webb's MIRI (Mid-Infrared Instrument), a remarkable achievement considering its distance. The MIRI observation covers wavelengths of light that were emitted in the visible-light range, which are redshifted out of reach for Webb's near-infrared instruments. Jake's analysis indicates that the brightness of the source implied by the MIRI observation is above what would be extrapolated from the measurements by the other Webb instruments, indicating the presence of strong ionized gas emission in the galaxy in the form of bright emission lines from hydrogen and oxygen. The presence of oxygen so early in the life of this galaxy is a surprise and suggests that multiple generations of very massive stars had already lived their lives before we observed the galaxy.

All of these observations, together, tell us that JADES-GS-z14-0 is not like the types of galaxies that have been predicted by theoretical models and computer simulations to exist in the very early universe. Given the observed brightness of the source, we can forecast how it might grow over cosmic time, and so far we have not found any suitable analogs from the hundreds of other galaxies we've observed at high redshift in our survey. Given the relatively small region of the sky that we searched to find JADES-GS-z14-0, its discovery has profound implications for the predicted number of bright galaxies we see in the early universe, as discussed in another concurrent JADES study (Robertson et al., recently accepted). It is likely that astronomers will find many such luminous galaxies, possibly at even earlier times, over the next decade with Webb. We're thrilled to see the extraordinary diversity of galaxies that existed at Cosmic Dawn!

Google

Huge Google Search Document Leak Reveals Inner Workings of Ranking Algorithm (searchengineland.com) 64

Danny Goodwin reports via Search Engine Land: A trove of leaked Google documents has given us an unprecedented look inside Google Search and revealed some of the most important elements Google uses to rank content. Thousands of documents, which appear to come from Google's internal Content API Warehouse, were released March 13 on GitHub by an automated bot called yoshi-code-bot. These documents were shared with Rand Fishkin, SparkToro co-founder, earlier this month.

What's inside. Here's what we know about the internal documents, thanks to Fishkin and [Michael King, iPullRank CEO]:

Current: The documentation indicates this information is accurate as of March.
Ranking features: 2,596 modules are represented in the API documentation with 14,014 attributes.
Weighting: The documents did not specify how any of the ranking features are weighted -- just that they exist.
Twiddlers: These are re-ranking functions that "can adjust the information retrieval score of a document or change the ranking of a document," according to King.
Demotions: Content can be demoted for a variety of reasons, such as: a link doesn't match the target site; SERP signals indicate user dissatisfaction; product reviews; location; exact-match domains; and/or porn.
Change history: Google apparently keeps a copy of every version of every page it has ever indexed. Meaning, Google can "remember" every change ever made to a page. However, Google only uses the last 20 changes of a URL when analyzing links.
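That "last 20 changes" behavior amounts to a bounded version history per URL. This is a hypothetical sketch, not Google's code (the `page_history` and `record_version` names are invented for illustration), of how such a cap behaves: old versions simply fall off once the limit is reached.

```python
# Hypothetical sketch (not Google's implementation): keeping only the last
# 20 versions of each URL, as the leaked documents suggest Google does
# when analyzing links.
from collections import defaultdict, deque

MAX_VERSIONS = 20

# Each URL maps to a bounded history; the oldest version is evicted
# automatically when a new one is appended past the cap.
page_history = defaultdict(lambda: deque(maxlen=MAX_VERSIONS))

def record_version(url: str, content: str) -> None:
    page_history[url].append(content)

for i in range(25):
    record_version("https://example.com/", f"revision {i}")

print(len(page_history["https://example.com/"]))  # 20 -- only the newest survive
print(page_history["https://example.com/"][0])    # "revision 5" -- revisions 0-4 evicted
```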

Other interesting findings. According to Google's internal documents:

Freshness matters -- Google looks at dates in the byline (bylineDate), URL (syntacticDate) and on-page content (semanticDate).
To determine whether a document is or isn't a core topic of the website, Google vectorizes pages and sites: siteFocusScore measures how much a site sticks to a single topic, while siteRadius measures how far a page's embedding deviates from the site's overall embedding.
Google stores domain registration information (RegistrationInfo).
Page titles still matter. Google has a feature called titlematchScore that is believed to measure how well a page title matches a query.
Google measures the average weighted font size of terms in documents (avgTermWeight) and anchor text.
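One plausible way to compare a page embedding against a site-level embedding is cosine similarity. This is purely illustrative: the attribute names siteRadius and siteFocusScore come from the leak, but the actual metric Google computes is not public, and the vectors below are made up.

```python
# Illustrative sketch only: comparing a page embedding to a site embedding
# with cosine similarity. The real computation behind siteRadius /
# siteFocusScore is not documented in the leak.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

site_embedding = [0.9, 0.1, 0.2]   # assumed: average of all page embeddings
on_topic_page  = [0.8, 0.2, 0.1]   # points roughly the same way as the site
off_topic_page = [0.1, 0.9, 0.4]   # points in a very different direction

print(round(cosine_similarity(site_embedding, on_topic_page), 3))
print(round(cosine_similarity(site_embedding, off_topic_page), 3))
```

A page whose embedding sits far from the site's centroid (low similarity, large "radius") would look off-topic for that site under a scheme like this.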
What does it all mean? According to King: "[Y]ou need to drive more successful clicks using a broader set of queries and earn more link diversity if you want to continue to rank. Conceptually, it makes sense because a very strong piece of content will do that. A focus on driving more qualified traffic to a better user experience will send signals to Google that your page deserves to rank." [...] Fishkin added: "If there was one universal piece of advice I had for marketers seeking to broadly improve their organic search rankings and traffic, it would be: 'Build a notable, popular, well-recognized brand in your space, outside of Google search.'"
Games

After You Die, Your Steam Games Will Be Stuck in Legal Limbo 125

As Valve's Steam gaming platform approaches its 21st anniversary, aging PC gamers are grappling with the question of what will happen to their extensive digital game collections after they pass away. Recent inquiries to Steam support have highlighted the platform's policy that accounts and games are non-transferable, even through a last will and testament. While some potential loopholes exist, such as sharing account information with descendants or bequeathing a physical device with games installed, the legal ownership of these digital assets remains murky.
Open Source

Why a 'Frozen' Distribution Linux Kernel Isn't the Safest Choice for Security (zdnet.com) 104

Jeremy Allison — Sam (Slashdot reader #8,157) is a Distinguished Engineer at Rocky Linux creator CIQ. This week he published a blog post responding to promises of Linux distros "carefully selecting only the most polished and pristine open source patches from the raw upstream open source Linux kernel in order to create the secure distribution kernel you depend on in your business."

But do carefully curated software patches (applied to a known "frozen" Linux kernel) really bring greater security? "After a lot of hard work and data analysis by my CIQ kernel engineering colleagues Ronnie Sahlberg and Jonathan Maple, we finally have an answer to this question. It's no." The data shows that "frozen" vendor Linux kernels, created by branching off a release point and then using a team of engineers to select specific patches to back-port to that branch, are buggier than the upstream "stable" Linux kernel created by Greg Kroah-Hartman. How can this be? If you want the full details, the link to the white paper is here. But the results of the analysis couldn't be clearer.

- A "frozen" vendor kernel is an insecure kernel. A vendor kernel released later in the release schedule is doubly so.

- The number of known bugs in a "frozen" vendor kernel grows over time. The growth in the number of bugs even accelerates over time.

- There are too many open bugs in these kernels for it to be feasible to analyze or even classify them....

[T]hinking that you're making a more secure choice by using a "frozen" vendor kernel isn't a luxury we can still afford to believe. As Greg Kroah-Hartman explicitly said in his talk "Demystifying the Linux Kernel Security Process": "If you are not using the latest stable / longterm kernel, your system is insecure."

CIQ describes its report as "a count of all the known bugs from an upstream kernel that were introduced, but never fixed in RHEL 8." For the most recent RHEL 8 kernels, at the time of writing, these counts are:

- RHEL 8.6: 5034
- RHEL 8.7: 4767
- RHEL 8.8: 4594

In RHEL 8.8 we have a total of 4594 known bugs whose fixes exist upstream but have not been back-ported to RHEL 8.8. The situation is worse for RHEL 8.6 and RHEL 8.7, as they cut off back-porting earlier than RHEL 8.8, but of course that did not prevent new bugs from being discovered and fixed upstream....
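The counting idea behind the whitepaper can be sketched as a set difference: take the fixes that landed in the upstream stable kernel and subtract those that were back-ported to the vendor branch. This is a hedged illustration of the methodology only; the commit IDs below are invented, and the real analysis works over thousands of commits.

```python
# Hedged sketch of the whitepaper's counting method: upstream fixes that
# were never back-ported to the "frozen" vendor kernel. Commit IDs are
# invented placeholders, not real kernel commits.
upstream_fixes = {"a1b2", "c3d4", "e5f6", "0789", "abcd"}  # fixes in upstream stable
backported     = {"a1b2", "0789"}                          # fixes the vendor picked up

missing_fixes = upstream_fixes - backported
print(len(missing_fixes))  # 3 known fixes exist upstream but not in the vendor kernel
```

Applied per release, a count like this grows over time, because upstream keeps fixing bugs while the frozen branch's back-porting falls further behind.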

This whitepaper is not meant as a criticism of the engineers working at any Linux vendors who are dedicated to producing high quality work in their products on behalf of their customers. This problem is extremely difficult to solve. We know this is an open secret amongst many in the industry and would like to put concrete numbers describing the problem to encourage discussion. Our hope is for Linux vendors and the community as a whole to rally behind the kernel.org stable kernels as the best long term supported solution. As engineers, we would prefer this to allow us to spend more time fixing customer specific bugs and submitting feature improvements upstream, rather than the endless grind of backporting upstream changes into vendor kernels, a practice which can introduce more bugs than it fixes.

ZDNet calls it "an open secret in the Linux community." It's not enough to use a long-term support release. You must use the most up-to-date release to be as secure as possible. Unfortunately, almost no one does that. Nevertheless, as Google Linux kernel engineer Kees Cook explained, "So what is a vendor to do? The answer is simple, if painful: Continuously update to the latest kernel release, either major or stable." Why? As Kroah-Hartman explained, "Any bug has the potential of being a security issue at the kernel level...."

Although [CIQ's] programmers examined RHEL 8.8 specifically, this is a general problem. They would have found the same results if they had examined SUSE, Ubuntu, or Debian Linux. Rolling-release Linux distros such as Arch, Gentoo, and OpenSUSE Tumbleweed constantly release the latest updates, but they're rarely used in business.

Jeremy Allison's post points out that "the Linux kernel used by Android devices is based on the upstream kernel and also has a stable internal kernel ABI, so this isn't an insurmountable problem..."
The Internet

Archie, the Internet's First Search Engine, Is Rescued and Running (arstechnica.com) 35

An anonymous reader quotes a report from Ars Technica: It's amazing, and a little sad, to think that something created in 1989 that changed how people used and viewed the then-nascent Internet had nearly vanished by 2024. Nearly, that is, because the dogged researchers and enthusiasts at The Serial Port channel on YouTube have found what is likely the last existing copy of Archie. Archie, first crafted by Alan Emtage while a student at McGill University in Montreal, Quebec, allowed for the searching of various "anonymous" FTP servers around what was then a very small web of universities, researchers, and government and military nodes. It was groundbreaking; it was the first echo of the "anything, anywhere" Internet to come. And when The Serial Port went looking, it very much did not exist.

While Archie would eventually be supplanted by Gopher, web portals, and search engines, it remains a useful way to index FTP sites and certainly should be preserved. The Serial Port did this, and the road to get there is remarkable and intriguing. You are best off watching the video of their rescue, along with its explanatory preamble. But I present here some notable bits of the tale, perhaps to tempt you into digging further.

IT

Wallet Recovery Firms Buzz as Locked-out Crypto Investors Panic in Bitcoin Boom (reuters.com) 35

The recent surge in bitcoin prices has the phones at crypto wallet recovery firms ringing off the hook, as retail investors locked out of their digital vaults make frantic calls to regain access to their accounts. From a report: Cryptocurrencies exist on a decentralized digital ledger known as blockchain and investors may opt to access their holdings either through a locally stored software wallet or a hardware wallet, to avoid the risks of holding crypto on an exchange, as illustrated by the collapse of FTX. Losing access to a crypto wallet is a well-known problem. Investors forgetting their intricate passwords is a primary reason, but loss of access to two-factor authentication devices, unexpected shutdowns of cryptocurrency exchanges and cyberattacks are also common.

Wallet passwords are usually alphanumeric and the wallet provider also offers a set of randomized words, known as "seed phrases," for additional security - both these are known only to the user. If investors lose the passwords and phrases, access to their wallets is cut off. With bitcoin prices regaining traction since last October and hitting a record high of $73,803.25 in March, investors seem to be suffering from a classic case of FOMO, or the fear of missing out. Reuters spoke to nearly a dozen retail investors who had lost access to their crypto wallets. Six of them contacted a recovery services firm and managed to regain access to their holdings.
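Some back-of-envelope arithmetic shows why a fully lost seed phrase is effectively unrecoverable by brute force. In the widely used BIP-39 scheme, each word of the phrase is drawn from a fixed 2048-word list (recovery firms typically succeed only when the investor remembers most of the phrase or password, shrinking this search space dramatically):

```python
# Back-of-envelope arithmetic: why a fully lost seed phrase cannot be
# brute-forced. BIP-39 draws each word from a fixed 2048-word list.
import math

WORDLIST_SIZE = 2048  # 2**11 possibilities per word
words = 12            # a typical seed phrase length

combinations = WORDLIST_SIZE ** words
bits = words * math.log2(WORDLIST_SIZE)

print(f"{combinations:.3e} possible 12-word phrases (~{bits:.0f} bits)")
# Even at a billion guesses per second, exhausting ~5.4e39 phrases would
# take on the order of 1e23 years.
```

(Strictly, BIP-39 reserves a few of those 132 bits as a checksum, leaving 128 bits of true entropy, which doesn't change the conclusion.)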

The Internet

Quantum Internet Draws Near Thanks To Entangled Memory Breakthroughs (newscientist.com) 47

An anonymous reader quotes a report from New Scientist: Efforts to build a global quantum internet have received a boost from two developments in quantum information storage that could one day make it possible to communicate securely across hundreds or thousands of kilometers. The internet as it exists today involves sending strings of digital bits, or 0s and 1s, in the form of electrical or optical signals, to transmit information. A quantum internet, which could be used to send unhackable communications or link up quantum computers, would use quantum bits instead. These rely on a quantum property called entanglement, a phenomenon in which particles can be linked and measuring one particle instantly influences the state of another, no matter how far apart they are. Sending these entangled quantum bits, or qubits, over very long distances requires a quantum repeater, a piece of hardware that can store the entangled state in memory and reproduce it to transmit it further down the line. These would have to be placed at various points on a long-distance network to ensure a signal gets from A to B without being degraded.

Quantum repeaters don't yet exist, but two groups of researchers have now demonstrated long-lasting entanglement memory in quantum networks over tens of kilometers, which are the key characteristics needed for such a device. Can Knaut at Harvard University and his colleagues set up a quantum network consisting of two nodes separated by a loop of optical fibre that spans 35 kilometers across the city of Boston. Each node contains both a communication qubit, used to transmit information, and a memory qubit, which can store the quantum state for up to a second. "Our experiment really put us in a position where we're really close to working on a quantum repeater demonstration," says Knaut. To set up the link, Knaut and his team entangled their first node, which contains a type of diamond with an atom-sized hole in it, with a photon that they sent to their second node, which contains a similar diamond. When the photon arrives at the second diamond, it becomes entangled with both nodes. The diamonds are able to store this state for a second. A fully functioning quantum repeater using similar technology could be demonstrated in the next couple of years, says Knaut, which would enable quantum networks connecting cities or countries.
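The correlation at the heart of entanglement can be caricatured with a toy simulation. This is an illustration of the statistics only, not real quantum mechanics: for an ideal Bell pair (|00> + |11>)/sqrt(2) measured in the same basis, each party sees a random bit, but the two outcomes always agree.

```python
# Toy illustration (statistics only, not a full quantum simulation): for an
# ideal entangled Bell pair (|00> + |11>)/sqrt(2), measuring either qubit in
# the same basis yields a random bit, but the two outcomes always agree.
import random

def measure_bell_pair() -> tuple[int, int]:
    outcome = random.randint(0, 1)  # each result is equally likely...
    return outcome, outcome         # ...but the two qubits always match

results = [measure_bell_pair() for _ in range(1000)]
agreements = sum(a == b for a, b in results)
print(f"{agreements}/1000 measurements agreed")  # always 1000/1000
```

A repeater's job is to preserve exactly this correlation in memory long enough to extend it hop by hop across a long-distance network.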

In separate work, Xiao-Hui Bao at the University of Science and Technology of China and his colleagues entangled three nodes together, each separated by around 10 kilometers in the city of Hefei. Bao and his team's nodes use supercooled clouds of hundreds of millions of rubidium atoms to generate entangled photons, which they then sent across the three nodes. The central node of the three is able to coordinate these photons to link the atom clouds, which act as a form of memory. The key advance for Bao and his team's network is to match the frequency of the photons meeting at the central node, which will be crucial for quantum repeaters connecting different nodes. While the storage time, at 100 microseconds, was shorter than that achieved by Knaut's team, it is still long enough to perform useful operations on the transmitted information.

Space

Webb Telescope Finds a (Hot) Earth-Sized Planet With an Atmosphere (apnews.com) 28

An anonymous reader shared this report from the Associated Press: A thick atmosphere has been detected around a planet that's twice as big as Earth in a nearby solar system, researchers reported Wednesday.

The so-called super Earth — known as 55 Cancri e — is among the few rocky planets outside our solar system with a significant atmosphere, wrapped in a blanket of carbon dioxide and carbon monoxide. The exact amounts are unclear. Earth's atmosphere is a blend of nitrogen, oxygen, argon and other gases. "It's probably the firmest evidence yet that this planet has an atmosphere," said Ian Crossfield, an astronomer at the University of Kansas who studies exoplanets and was not involved with the research.

The research was published in the journal Nature.

"The boiling temperatures on this planet — which can reach as hot as 4,200 degrees Fahrenheit (2,300 degrees Celsius) — mean that it is unlikely to host life," the article points out.

"Instead, scientists say the discovery is a promising sign that other such rocky planets with thick atmospheres could exist that may be more hospitable."
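As a quick sanity check on the quoted figures, the standard conversion C = (F - 32) * 5/9 confirms that 4,200 degrees Fahrenheit is about 2,316 degrees Celsius, which the article rounds to 2,300:

```python
# Sanity check on the quoted temperature: Fahrenheit to Celsius conversion.
def fahrenheit_to_celsius(f: float) -> float:
    return (f - 32.0) * 5.0 / 9.0

print(round(fahrenheit_to_celsius(4200)))  # 2316, rounded to 2,300 in the article
```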

Slashdot Top Deals