AI

AI Has Already Run Out of Training Data, Goldman's Data Chief Says (businessinsider.com) 81

AI has run out of training data, according to Neema Raphael, Goldman Sachs' chief data officer and head of data engineering. "We've already run out of data," Raphael said on the bank's podcast. He said this shortage is already shaping how developers build new AI systems. China's DeepSeek may have kept costs down by training on outputs from existing models instead of fresh data. The web has been tapped out.

Developers have been using synthetic data -- machine-generated material that offers unlimited supply but carries quality risks. Raphael said he doesn't think the lack of fresh data will be a massive constraint. "From an enterprise perspective, I think there's still a lot of juice I'd say to be squeezed in that," he said. Proprietary datasets held by corporations could make AI tools far more valuable. The challenge is "understanding the data, understanding the business context of the data, and then being able to normalize it."
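
The mechanics of that approach are easy to sketch. Below is a minimal, hypothetical example of synthetic-data generation -- sampling completions from an existing model and saving them as new training text. The model choice (gpt2), the prompts, and the file name are our illustrative assumptions, not details from the story.

```python
# Minimal sketch: generate synthetic training text with an existing model.
# Assumes the Hugging Face transformers library; gpt2 stands in for any
# "teacher" model whose outputs become training data for the next model.
import json

from transformers import pipeline, set_seed

set_seed(42)  # make the sampling reproducible
generator = pipeline("text-generation", model="gpt2")

seed_prompts = [
    "The quarterly earnings report showed",
    "In enterprise data engineering, the hardest problem is",
]

with open("synthetic_corpus.jsonl", "w") as f:
    for prompt in seed_prompts:
        # Sample several continuations per prompt; do_sample with a high
        # temperature trades coherence for variety.
        outputs = generator(prompt, max_new_tokens=60, num_return_sequences=3,
                            do_sample=True, temperature=0.9)
        for out in outputs:
            # Each continuation becomes a "fresh" training record -- the
            # quality risks noted above are why such data needs filtering.
            f.write(json.dumps({"text": out["generated_text"]}) + "\n")
```

A student model trained on such a file inherits the teacher's blind spots, which is why synthetic data is usually filtered, deduplicated, and mixed with human-written text rather than used alone.
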
Science

Experimental Gene Therapy Found To Slow Huntington's Disease Progression (bbc.com) 13

Doctors report the first successful treatment for Huntington's disease using a new type of gene therapy given during 12 to 18 hours of delicate brain surgery. The BBC reports: An emotional research team became tearful as they described how data shows the disease was slowed by 75% in patients. It means the decline you would normally expect in one year would take four years after treatment, giving patients decades of "good quality life", Prof Sarah Tabrizi told BBC News. The first symptoms of Huntington's disease tend to appear in your 30s or 40s, and the disease is normally fatal within two decades -- raising the possibility that treating patients before symptoms appear could prevent them from ever emerging. None of the patients who have been treated are being identified, but one was medically retired and has returned to work. Others in the trial are still walking despite being expected to need a wheelchair. Treatment is likely to be very expensive. However, this is a moment of real hope in a disease that hits people in their prime and devastates families. [...]

It starts with a safe virus that has been altered to contain a specially designed sequence of DNA. This is infused deep into the brain using real-time MRI scanning to guide a microcatheter to two brain regions -- the caudate nucleus and the putamen. This takes 12 to 18 hours of neurosurgery. The virus then acts like a microscopic postman -- delivering the new piece of DNA inside brain cells, where it becomes active. This turns the neurons into a factory for making the therapy to avert their own death. The cells produce a small fragment of genetic material (called microRNA) that is designed to intercept and disable the instructions (called messenger RNA) being sent from the cells' DNA for building mutant huntingtin. This results in lower levels of mutant huntingtin in the brain. [...]

The data showed that three years after surgery there was an average 75% slowing of the disease based on a measure that combines cognition, motor function and the ability to manage in daily life. The data also shows the treatment is saving brain cells. Levels of neurofilaments in spinal fluid -- a clear sign of brain cells dying -- should have increased by a third if the disease had continued to progress, but were actually lower than at the start of the trial.

NASA

How NASA Saved a Camera From 370 Million Miles Away (phys.org) 38

An anonymous reader quotes a report from Phys.org: The mission team of NASA's Jupiter-orbiting Juno spacecraft executed a deep-space move in December 2023 to repair its JunoCam imager in time to capture photos of the Jovian moon Io. Results from the long-distance save were presented during a technical session on July 16 at the Institute of Electrical and Electronics Engineers Nuclear & Space Radiation Effects Conference in Nashville. JunoCam is a color, visible-light camera. The optical unit for the camera is located outside a titanium-walled radiation vault, which protects sensitive electronic components for many of Juno's engineering and science instruments. This is a challenging location because Juno's travels carry it through the most intense planetary radiation fields in the solar system. While mission designers were confident JunoCam could operate through the first eight orbits of Jupiter, no one knew how long the instrument would last after that. Throughout Juno's first 34 orbits (its prime mission), JunoCam operated normally, returning images the team routinely incorporated into the mission's science papers. Then, during its 47th orbit, the imager began showing hints of radiation damage.

While the team knew the issue might be tied to radiation, pinpointing what was specifically damaged within JunoCam was difficult from hundreds of millions of miles away. Clues pointed to a damaged voltage regulator that was vital to JunoCam's power supply. With few options for recovery, the team turned to a process called annealing, where a material is heated for a specified period before slowly cooling. Although the process is not well understood, the idea is that heating can reduce defects in the material. Soon after the annealing process finished, JunoCam began cranking out crisp images for the next several orbits. But Juno was flying deeper and deeper into the heart of Jupiter's radiation fields with each pass. By orbit 55, the imagery had again begun showing problems, and by orbit 56 nearly all the images were corrupted.

"After orbit 55, our images were full of streaks and noise," said JunoCam instrument lead Michael Ravine of Malin Space Science Systems. "We tried different schemes for processing the images to improve the quality, but nothing worked. With the close encounter of Io bearing down on us in a few weeks, it was Hail Mary time: The only thing left we hadn't tried was to crank JunoCam's heater all the way up and see if more extreme annealing would save us." Test images sent back to Earth during the annealing showed little improvement in the first week. Then, with the close approach of Io only days away, the images began to improve dramatically. By the time Juno came within 930 miles (1,500 kilometers) of the volcanic moon's surface on Dec. 30, 2023, the images were almost as good as the day the camera launched, capturing detailed views of Io's north polar region that revealed mountain blocks covered in sulfur dioxide frosts rising sharply from the plains and previously uncharted volcanoes with extensive flow fields of lava. To date, the solar-powered spacecraft has orbited Jupiter 74 times. Recently, the image noise returned during Juno's 74th orbit.

Android

Android 16 Is Here (blog.google) 23

An anonymous reader shares a blog post from Google: Today, we're bringing you Android 16, rolling out first to supported Pixel devices with more phone brands to come later this year. This is the earliest Android has launched a major release in the last few years, which ensures you get the latest updates as soon as possible on your devices. Android 16 lays the foundation for our new Material 3 Expressive design, with features that make Android more accessible and easy to use.
AI

AI Firms Say They Can't Respect Copyright. But A Nonprofit's Researchers Just Built a Copyright-Respecting Dataset (msn.com) 100

Is copyrighted material a requirement for training AI? asks the Washington Post. That's what top AI companies are arguing, and "Few AI developers have tried the more ethical route — until now.

"A group of more than two dozen AI researchers have found that they could build a massive eight-terabyte dataset using only text that was openly licensed or in public domain. They tested the dataset quality by using it to train a 7 billion parameter language model, which performed about as well as comparable industry efforts, such as Llama 2-7B, which Meta released in 2023." A paper published Thursday detailing their effort also reveals that the process was painstaking, arduous and impossible to fully automate. The group built an AI model that is significantly smaller than the latest offered by OpenAI's ChatGPT or Google's Gemini, but their findings appear to represent the biggest, most transparent and rigorous effort yet to demonstrate a different way of building popular AI tools....

As it turns out, the task involves a lot of humans. That's because of the technical challenges of data not being formatted in a way that's machine readable, as well as the legal challenges of figuring out what license applies to which website, a daunting prospect when the industry is rife with improperly licensed data. "This isn't a thing where you can just scale up the resources that you have available" like access to more computer chips and a fancy web scraper, said Stella Biderman [executive director of the nonprofit research institute EleutherAI]. "We use automated tools, but all of our stuff was manually annotated at the end of the day and checked by people. And that's just really hard."
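
The automated side of that license triage is simple to sketch; the expensive part is the human verification Biderman describes. Here is a toy filter, with the field names and license allowlist as our own illustrative assumptions:

```python
# Toy license filter: keep a document only if its declared license is on an
# allowlist of open licenses. Real pipelines must verify licenses per source
# -- the manual checking described above -- since declared tags are often wrong.
OPEN_LICENSES = {"public-domain", "cc0-1.0", "cc-by-4.0", "cc-by-sa-4.0", "mit"}

def keep_document(doc: dict) -> bool:
    """Return True if the document's declared license is on the allowlist."""
    license_tag = (doc.get("license") or "").strip().lower()
    return license_tag in OPEN_LICENSES

corpus = [
    {"text": "An openly licensed article...", "license": "CC-BY-4.0"},
    {"text": "A scraped page with no license metadata...", "license": None},
]

filtered = [d for d in corpus if keep_document(d)]
print(len(filtered))  # 1 -- the unlabeled page is set aside for human review
```

Everything the filter cannot classify lands in exactly the manual queue the researchers spent their time on.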

Still, the group managed to unearth new datasets that can be used ethically. Those include a set of 130,000 English-language books in the Library of Congress, which is nearly double the size of the popular-books dataset Project Gutenberg. The group's initiative also builds on recent efforts to develop more ethical, but still useful, datasets, such as FineWeb from Hugging Face, the open-source repository for machine learning... Still, Biderman remained skeptical that this approach could find enough content online to match the size of today's state-of-the-art models... Biderman said she didn't expect companies such as OpenAI and Anthropic to start adopting the same laborious process, but she hoped it would encourage them to at least rewind to 2021 or 2022, when AI companies still shared a few sentences of information about what their models were trained on.

"Even partial transparency has a huge amount of social value and a moderate amount of scientific value," she said.

Piracy

Football and Other Premium TV Being Pirated At 'Industrial Scale' (bbc.com) 132

An anonymous reader quotes a report from the BBC: A lack of action by big tech firms is enabling the "industrial scale theft" of premium video services, especially live sport, a new report says. The research by Enders Analysis accuses Amazon, Google, Meta and Microsoft of "ambivalence and inertia" over a problem it says costs broadcasters revenue and puts users at an increased risk of cyber-crime. Gareth Sutcliffe and Ollie Meir, who authored the research, described the Amazon Fire Stick -- which they argue is the device many people use to access illegal streams -- as "a piracy enabler." [...] The device plugs into TVs and gives the viewer thousands of options to watch programs from legitimate services including the BBC iPlayer and Netflix. But it is also being used to access illegal streams, particularly of live sport.

In November last year, a Liverpool man who sold Fire Stick devices he reconfigured to allow people to illegally stream Premier League football matches was jailed. After loading the unauthorized services onto the Amazon product, he advertised them on Facebook. Another man from Liverpool was given a two-year suspended sentence last year after modifying Fire Sticks and selling them on Facebook and WhatsApp. According to data for the first quarter of this year, provided to Enders by Sky, 59% of people in the UK who said they had watched pirated material in the last year while using a physical device said they had used an Amazon Fire product. The Enders report says the Fire Stick enables "billions of dollars in piracy" overall. [...]

The researchers also pointed to the role played by the "continued depreciation" of Digital Rights Management (DRM) systems, particularly those from Google and Microsoft. This technology enables high quality streaming of premium content to devices. Two of the big players are Microsoft's PlayReady and Google's Widevine. The authors argue the architecture of the DRM is largely unchanged, and due to a lack of maintenance by the big tech companies, PlayReady and Widevine "are now compromised across various security levels." Mr Sutcliffe and Mr Meir said this has had "a seismic impact across the industry, and ultimately given piracy the upper hand by enabling theft of the highest quality content." They added: "Over twenty years since launch, the DRM solutions provided by Google and Microsoft are in steep decline. A complete overhaul of the technology architecture, licensing, and support model is needed. Lack of engagement with content owners indicates this is a low priority."

Technology

Amazon Unveils Its First Quantum Computing Chip (aboutamazon.com) 6

Amazon has introduced its first-ever quantum processor, dubbed Ocelot, designed specifically to reduce quantum error correction costs by up to 90% compared to existing approaches. The prototype chip uses "cat qubits" -- named after the Schrödinger's cat thought experiment -- which intrinsically suppress certain types of quantum errors.

Unlike conventional approaches that add error correction after designing the architecture, AWS built Ocelot with quantum error correction as the primary requirement. The chip consists of two stacked 1 cm² silicon microchips containing 14 core components: five data qubits, five buffer circuits for stabilization, and four qubits dedicated to error detection.

Quantum computers are notoriously sensitive to environmental noise -- including vibrations, heat, and electromagnetic interference -- which disturbs qubits and generates computational errors. These errors multiply as quantum systems scale up, creating a significant barrier to practical quantum computing. Ocelot's high-quality oscillators, made from a thin film of superconducting tantalum processed using specialized techniques developed by AWS materials scientists, generate the repetitive electrical signals that maintain quantum states.

"We're just getting started and we believe we have several more stages of scaling to go through," said Oskar Painter, AWS director of Quantum Hardware, whose team published their findings in Nature. Industry analyst Heather West of IDC was more measured, categorizing Ocelot as "much more of an advancement and less of a breakthrough," noting that superconducting qubits designed to resist certain error types aren't completely novel.
Games

VGHF Opens Free Online Access To 1,500 Classic Game Mags, 30K Historic Files (arstechnica.com) 12

An anonymous reader quotes a report from Ars Technica: The Video Game History Foundation has officially opened up digital access to a large portion of its massive archives today, offering fans and researchers unprecedented access to information and ephemera surrounding the past 50 years of the game industry. Today's launch of the VGHF Library comprises more than 30,000 indexed and curated files, including high-quality artwork, promotional material, and searchable full-text archives of over 1,500 video game magazine issues. This initial dump of digital materials also contains never-before-seen game development and production archival material stored by the VGHF, such as over 100 hours of raw production footage from the creation of the Myst series or Sonic the Hedgehog concept art and design files contributed by artist Tom Payne.

In a blog post and accompanying launch video, VGHF head librarian Phil Salvador explains how today's launch is the culmination of a dream the organization has had since its launch in 2017. But it's also just the start of an ongoing process to digitize the VGHF's mountains of unprocessed physical material into a cataloged digital form, so people can access it "without having to fly to California." The VGHF doesn't require any special credentials or even a free account to access its archives, a fact that might be contributing to overloaded servers on this launch day. Despite those server issues, amateur researchers online are already sharing crucial library-derived information about the history of describing games as "immersive" or that one time Garfield ranked games in GamePro, for instance.
Unfortunately, digital libraries cannot offer direct, playable access to retail video games due to DMCA restrictions, notes Ars. However, organizations like the VGHF "continue to challenge those copyright rules every three years," raising hope for future access.
Stats

'The Dying Language of Accounting' (wsj.com) 177

Paul Knopp, KPMG US CEO, writing in an op-ed on WSJ: According to a United Nations estimate, 230 languages went extinct between 1950 and 2010. If my profession doesn't act, the language of business -- accounting -- could vanish too. The number of students who took the exam to become certified public accountants in 2022 hit a 17-year low. From 2020 to 2022, bachelor's degrees in accounting dropped 7.8% after steady declines since 2018.

While the shortage isn't yet an issue for the country's largest firms, it's beginning to affect our economy and capital markets. In the first half of 2024, nearly 600 U.S.-listed companies reported material weaknesses related to personnel. S&P Global analysts last year warned that many municipalities were at risk of having their credit ratings downgraded or withdrawn due to delayed financial disclosures.

Our profession must remove hurdles to learning the accounting language while preserving quality. In October, KPMG became the first large accounting firm to advocate developing alternate paths to CPA licensing. We want pathways that emphasize experience, not academic credits, after college. Most people today must earn 30 credits after their bachelor's degrees -- the so-called 150-hour rule, since a typical 120-credit bachelor's degree plus those 30 extra credits totals 150 semester hours -- work under a licensed CPA for a year, and pass the CPA exam to become licensed.

Research by the Center for Audit Quality finds that the 150-hour rule is among the top reasons people don't pursue CPA licensure. A December 2023 study found that the requirement causes a 26% drop in interest among minorities. There is a consensus for change, but we can't waste time. Many state CPA societies are working on legislation to create an alternative path to licensure. State boards of accountancy should replace the extra academic requirement with more on-the-job experience. A person who is licensed in one state should be able to practice in another even if reforms create different licensing requirements.

News

'Brain Rot' Named Oxford Word of the Year 2024 26

Oxford University Press: Following a public vote in which more than 37,000 people had their say, we're pleased to announce that the Oxford Word of the Year for 2024 is 'brain rot.'

Our language experts created a shortlist of six words to reflect the moods and conversations that have helped shape the past year. After two weeks of public voting and widespread conversation, our experts came together to consider the public's input, voting results, and our language data, before declaring 'brain rot' as the definitive Word of the Year for 2024.

'Brain rot' is defined as "the supposed deterioration of a person's mental or intellectual state, especially viewed as the result of overconsumption of material (now particularly online content) considered to be trivial or unchallenging. Also: something characterized as likely to lead to such deterioration."

Our experts noticed that 'brain rot' gained new prominence this year as a term used to capture concerns about the impact of consuming excessive amounts of low-quality online content, especially on social media. The term increased in usage frequency by 230% between 2023 and 2024.
Google

Google Deepens Crackdown on Sites Publishing 'Parasite SEO' Content (theverge.com) 13

Google has warned websites they will be penalized for hosting marketing content designed to exploit search rankings, regardless of whether they created or outsourced the material. The crackdown on so-called "parasite SEO" targets websites that leverage their search rankings to promote unrelated content, such as news sites hiding shopping coupon codes or educational platforms publishing affiliate marketing material.

Chris Nelson from Google's search quality team said the policy applies even when content involves "white label services, licensing agreements, partial ownership agreements, and other complex business arrangements." The move follows Google's March announcement targeting site reputation abuse, which gained attention after Sports Illustrated was found publishing AI-generated product reviews through third-party marketing firm AdVon Commerce.
AI

AI Lab PleIAs Releases Fully Open Dataset, as AMD, Ai2 Release Open AI Models (huggingface.co) 5

French private AI lab PleIAs "is committed to training LLMs in the open," they write in a blog post at Mozilla.org. "This means not only releasing our models but also being open about every aspect, from the training data to the training code. We define 'open' strictly: all data must be both accessible and under permissive licenses."

Wednesday PleIAs announced they were releasing the largest open multilingual pretraining dataset, according to their blog post at HuggingFace: Many have claimed that training large language models requires copyrighted data, making truly open AI development impossible. Today, Pleias is proving otherwise with the release of Common Corpus (part of the AI Alliance Open Trusted Data Initiative) — the largest fully open multilingual dataset for training LLMs, containing over 2 trillion tokens of permissibly licensed content with provenance information (2,003,039,184,047 tokens).

As developers are responding to pressures from new regulations like the EU AI Act, Common Corpus goes beyond compliance by making our entire permissibly licensed dataset freely available on HuggingFace, with detailed documentation of every data source. We have taken extensive steps to ensure that the dataset is high-quality and is curated to train powerful models. Through this release, we are demonstrating that there doesn't have to be such a [heavy] trade-off between openness and performance.

Common Corpus is:

— Truly Open: contains only data that is permissively licensed and provenance is documented

— Multilingual: mostly representing English and French data, but contains at least 1B tokens for over 30 languages

— Diverse: consisting of scientific articles, government and legal documents, code, and cultural heritage data, including books and newspapers

— Extensively Curated: spelling and formatting have been corrected from digitized texts, harmful and toxic content has been removed, and material with low educational value has also been removed.
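
For readers who want to inspect the data themselves, here is a minimal sketch of streaming a few records with the Hugging Face datasets library. The dataset ID and split name are our assumptions based on the announcement; check the Hub listing for the exact values.

```python
# Stream a few Common Corpus records without downloading ~2T tokens.
# Assumption: the dataset is published as "PleIAs/common_corpus" with a
# "train" split; adjust the ID if the Hugging Face listing differs.
from itertools import islice

from datasets import load_dataset

ds = load_dataset("PleIAs/common_corpus", split="train", streaming=True)

for record in islice(ds, 3):
    # Each record should carry the text plus the provenance and licensing
    # metadata the project documents for every source.
    print(sorted(record.keys()))
```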


Common Corpus builds on a growing ecosystem of large, open datasets, such as Dolma, FineWeb, and RefinedWeb. The Common Pile, currently in preparation under the coordination of EleutherAI, is built around the same principle of using permissively licensed English-language content and, unsurprisingly, there were many opportunities for collaborations and shared efforts. But even together, these datasets do not provide enough training data for models much larger than a few billion parameters. So in order to expand the options for open model training, we still need more open data...

Based on an analysis of 1 million user interactions with ChatGPT, the plurality of user requests are for creative compositions... The kind of content we actually need — like creative writing — is usually tied up in copyright restrictions. Common Corpus tackles these challenges through five carefully curated collections...

Last week AMD also released its first series of fully open 1 billion parameter language models, AMD OLMo.

And last month VentureBeat reported that the non-profit Allen Institute for AI had unveiled Molmo, "an open-source family of state-of-the-art multimodal AI models which outperform top proprietary rivals including OpenAI's GPT-4o, Anthropic's Claude 3.5 Sonnet, and Google's Gemini 1.5 on several third-party benchmarks."
Displays

LG's New Stretchable Display Can Grow By 50% (tomshardware.com) 26

An anonymous reader quotes a report from Tom's Hardware: LG Display, one of the global leaders in display technologies, unveiled a new stretchable display prototype that can expand by up to 50%. This makes it the most stretchable display in the industry, more than doubling the previous record of 20% elongation. [...] The prototype being flexed in [this image] is a 12-inch screen with a 100-pixel-per-inch resolution and full RGB color that expands to 18 inches when pulled. LG Display said that it based the stretchable display on a "special silicon material substrate used in contact lenses" and then improved its properties for better "stretchability and flexibility." It also used a new wiring design structure and a micro-LED light source, allowing the screen to be stretched more than 10,000 times with no effect on image quality.
Transportation

Why Boeing is Dismissing a Top Executive (barrons.com) 45

Last weekend Boeing announced that its CEO of Defense, Space, and Security "had left the company," according to Barrons. "Parting ways like this, for upper management, is the equivalent to firing," they write — though they add that the setbacks on Starliner's first crewed test flight are "far too simple an explanation." Starliner might, however, have been the straw that broke the camel's back. [New CEO Kelly] Ortberg took over in early August, so his first material interaction with the Boeing Defense and Space business was the spaceship's failed test flight... Starliner has cost Boeing $1.6 billion and counting. That's a lot of money, but not all that much in the context of the Defense business, which generates sales of roughly $25 billion a year.... [T]he overall Defense business has performed poorly of late, burdened by fixed price contracts that have become unprofitable amid years of higher than expected inflation. Profitability in the defense business has been declining since 2020, and the unit started losing money in 2022. From 2022 to 2024 losses should total about $6 billion cumulatively, including Wall Street's estimates for the second half of this year.

Still, it felt like something had to give. And the change shows investors something about new CEO Ortberg. "At this critical juncture, our priority is to restore the trust of our customers and meet the high standards they expect of us," read part of an internal email sent to Boeing employees announcing the change. "Why his predecessor — David Calhoun — didn't pull this trigger earlier this year is a mystery," wrote Gordon Haskett analyst Don Bilson in a Monday note. "Can't leave astronauts behind."

"Ortberg's logic appears sound," the article concludes. "In recent years, Boeing has disappointed its airline and defense customers, including NASA...

"After Starliner, defense profitability, and the strike, Ortberg has to tackle production quality, production rates, and Boeing's ailing balance sheet. Boeing has amassed almost $60 billion in debt since the second tragic 737 MAX crash in March 2019."

Thanks to Slashdot reader Press2ToContinue for sharing the news.
The Courts

OceanGate Submersible Victim's Family Sues For $50 Million, Partly Blames $30 Logitech Controller (extremetech.com) 92

An anonymous reader quotes a report from ExtremeTech: The family of a French mariner who died on the imploded Titan submersible last year has sued Titan's maker, OceanGate Expeditions, for more than $50 million. The lawsuit claims OceanGate is responsible for explorers' suffering immediately preceding their deaths, as well as for failing to disclose the extent of the submersible's risks. Among those risks are Titan's cheap materials, including the $30 Logitech gaming controller used aboard the vehicle. [...]

The lawsuit points at Titan's "hip, contemporary, wireless electronics system" and then alleges that none of the controllers or gauges inside Titan would operate without a constant source of power and a wireless signal. One of those controllers was a modified Logitech F710 Gamepad, a $30 to $40 device designed for, well, gaming. The gamepad quickly became the subject of internet mockery following the loss of Titan; some speculators said the submersible must have been doomed to fail if it used such cheap components. The lawsuit even claims the controller's Bluetooth (rather than wired) connectivity set it up for failure. Still, other speculators believe the controller wouldn't have had much impact on the submersible's operational durability. Instead, the issue would have been with the vehicle's carbon fiber pressure cylinder, which OceanGate CEO Stockton Rush allegedly bought off Boeing at a discount after the material passed its "airplane shelf life." Regardless of the exact material, it seems the consensus among members of the public is that for OceanGate, quality was an afterthought.

Graphics

Nvidia RTX 40-Series GPUs Hampered By Low-Quality Thermal Paste (pcgamer.com) 50

"Anyone who is into gaming knows your graphics card is under strain trying to display modern graphics," writes longtime Slashdot reader smooth wombat. "This results in increased power usage, which is then turned into heat. Keeping your card cool is a must to get the best performance possible."

"However, hardware tester Igor's Lab found that vendors for Nvidia RTX 40-series cards are using cheap, poorly applied thermal paste, which is leading to high temperatures and consequently, performance degradation over time. This penny-pinching has been confirmed by Nick Evanson at PC Gamer." From the report: I have four RTX 40-series cards in my office (RTX 4080 Super, 4070 Ti, and two 4070s) and all of them have quite high hotspots -- the highest temperature recorded by an individual thermal sensor in the die. In the case of the 4080 Super, it's around 11 C higher than the average temperature of the chip. I took it apart to apply some decent quality thermal paste and discovered a similar situation to that found by Igor's Lab. In the space of a few months, the factory-applied paste had separated and spread out, leaving just an oily film behind, and a few patches of the thermal compound itself. I checked the other cards and found that they were all in a similar state.

Igor's Lab examined the thermal paste used on a brand-new RTX 4080 and found it to be quite thin in nature, due to large quantities of cheap silicone oil being used, along with zinc oxide filler. There was lots of ground aluminium oxide (the material that provides the actual thermal transfer) but it was quite coarse, leading to the paste separating quite easily. Removing the factory-installed paste from another RTX 4080 graphics card, Igor's Lab applied a more appropriate amount of a high-quality paste and discovered that it lowered the hotspot temperature by nearly 30 C.

Security

Fired Employee Accessed NCS' Computer 'Test System' and Deleted Servers (channelnewsasia.com) 63

An anonymous reader quotes a report from Singapore's CNA news channel: Kandula Nagaraju, 39, was sentenced to two years and eight months' jail on Monday (Jun 10) for one charge of unauthorized access to computer material. Another charge was taken into consideration for sentencing. His contract with NCS was terminated in October 2022 due to poor work performance and his official last date of employment was Nov 16, 2022. According to court documents, Kandula felt "confused and upset" when he was fired as he felt he had performed well and "made good contributions" to NCS during his employment. After leaving NCS, he did not have another job in Singapore and returned to India.

Between November 2021 and October 2022, Kandula was part of a 20-member team managing the quality assurance (QA) computer system at NCS. NCS is a company that offers information and communications technology services. The system that Kandula's former team was managing was used to test new software and programs before launch. In a statement to CNA on Wednesday, NCS said it was a "standalone test system." It consisted of about 180 virtual servers, and no sensitive information was stored on them. After Kandula's contract was terminated and he arrived back in India, he used his laptop to gain unauthorized access to the system using the administrator login credentials. He did so on six occasions between Jan 6 and Jan 17, 2023.

In February that year, Kandula returned to Singapore after finding a new job. He rented a room with a former NCS colleague and used his Wi-Fi network to access NCS' system once on Feb 23, 2023. During the unauthorized access in those two months, he wrote some computer scripts to test if they could be used on the system to delete the servers. In March 2023, he accessed NCS' QA system 13 times. On Mar 18 and 19, he ran a programmed script to delete 180 virtual servers in the system. His script was written such that it would delete the servers one at a time. The following day, the NCS team realized the system was inaccessible and tried to troubleshoot, but to no avail. They discovered that the servers had been deleted. [...] As a result of his actions, NCS suffered a loss of $679,493.

AI

Journalists 'Deeply Troubled' By OpenAI's Content Deals With Vox, The Atlantic (arstechnica.com) 100

Benj Edwards and Ashley Belanger report via Ars Technica: On Wednesday, Axios broke the news that OpenAI had signed deals with The Atlantic and Vox Media that will allow the ChatGPT maker to license their editorial content to further train its language models. But some of the publications' writers -- and the unions that represent them -- were surprised by the announcements and aren't happy about it. Already, two unions have released statements expressing "alarm" and "concern." "The unionized members of The Atlantic Editorial and Business and Technology units are deeply troubled by the opaque agreement The Atlantic has made with OpenAI," reads a statement from the Atlantic union. "And especially by management's complete lack of transparency about what the agreement entails and how it will affect our work."

The Vox Union -- which represents The Verge, SB Nation, and Vulture, among other publications -- reacted in similar fashion, writing in a statement, "Today, members of the Vox Media Union ... were informed without warning that Vox Media entered into a 'strategic content and product partnership' with OpenAI. As both journalists and workers, we have serious concerns about this partnership, which we believe could adversely impact members of our union, not to mention the well-documented ethical and environmental concerns surrounding the use of generative AI." [...] News of the deals took both journalists and unions by surprise. On X, Vox reporter Kelsey Piper, who recently penned an exposé about OpenAI's restrictive non-disclosure agreements that prompted a change in policy from the company, wrote, "I'm very frustrated they announced this without consulting their writers, but I have very strong assurances in writing from our editor in chief that they want more coverage like the last two weeks and will never interfere in it. If that's false I'll quit."

Journalists also reacted to news of the deals through the publications themselves. On Wednesday, The Atlantic Senior Editor Damon Beres wrote a piece titled "A Devil's Bargain With OpenAI," in which he expressed skepticism about the partnership, likening it to making a deal with the devil that may backfire. He highlighted concerns about AI's use of copyrighted material without permission and its potential to spread disinformation at a time when publications have seen a recent string of layoffs. He drew parallels to the pursuit of audiences on social media leading to clickbait and SEO tactics that degraded media quality. While acknowledging the financial benefits and potential reach, Beres cautioned against relying on inaccurate, opaque AI models and questioned the implications of journalism companies being complicit in potentially destroying the internet as we know it, even as they try to be part of the solution by partnering with OpenAI.

Similarly, over at Vox, Editorial Director Bryan Walsh penned a piece titled, "This article is OpenAI training data," in which he expresses apprehension about the licensing deal, drawing parallels between the relentless pursuit of data by AI companies and the classic AI thought experiment of Bostrom's "paperclip maximizer," cautioning that the single-minded focus on market share and profits could ultimately destroy the ecosystem AI companies rely on for training data. He worries that the growth of AI chatbots and generative AI search products might lead to a significant decline in search engine traffic to publishers, potentially threatening the livelihoods of content creators and the richness of the Internet itself.

AI

For Data-Guzzling AI Companies, the Internet Is Too Small (wsj.com) 60

Companies racing to develop more powerful artificial intelligence are rapidly nearing a new problem: The internet might be too small for their plans (non-paywalled link). From a report: Ever more powerful systems developed by OpenAI, Google and others require larger oceans of information to learn from. That demand is straining the available pool of quality public data online at the same time that some data owners are blocking access to AI companies. Some executives and researchers say the industry's need for high-quality text data could outstrip supply within two years, potentially slowing AI's development.

AI companies are hunting for untapped information sources, and rethinking how they train these systems. OpenAI, the maker of ChatGPT, has discussed training its next model, GPT-5, on transcriptions of public YouTube videos, people familiar with the matter said. Companies also are experimenting with using AI-generated, or synthetic, data as training material -- an approach many researchers say could actually cause crippling malfunctions. These efforts are often secret, because executives think solutions could be a competitive advantage.

Data is among several essential AI resources in short supply. The chips needed to run what are called large-language models behind ChatGPT, Google's Gemini and other AI bots also are scarce. And industry leaders worry about a dearth of data centers and the electricity needed to power them. AI language models are built using text vacuumed up from the internet, including scientific research, news articles and Wikipedia entries. That material is broken into tokens -- words and parts of words that the models use to learn how to formulate humanlike expressions.
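
Tokenization is easy to see in practice. Below is a minimal sketch using the open-source tiktoken library; the encoding name and sample sentence are our illustrative choices, not details from the report.

```python
# Minimal sketch: break text into the BPE tokens that LLMs are trained on.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a BPE vocabulary used by recent OpenAI models

text = "Tokenization breaks text into words and parts of words."
token_ids = enc.encode(text)

# Decode each ID separately to see the pieces: common words tend to map to
# single tokens, while rarer words split into several subword fragments.
pieces = [enc.decode([t]) for t in token_ids]
print(len(token_ids), pieces)
```

Corpus sizes, like the two-trillion-token figure cited elsewhere in this digest, are measured in exactly these units, which is why token counts frame the debate over how much quality text is left to train on.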

Science

Researchers Develop New Material That Converts CO2 into Methanol Using Sunlight (scitechdaily.com) 56

"Researchers have successfully transformed CO2 into methanol," reports SciTechDaily, "by shining sunlight on single atoms of copper deposited on a light-activated material, a discovery that paves the way for creating new green fuels." Tara LeMercier, a PhD student who carried out the experimental work at the University of Nottingham, School of Chemistry, said: "We measured the current generated by light and used it as a criterion to judge the quality of the catalyst. Even without copper, the new form of carbon nitride is 44 times more active than traditional carbon nitride. However, to our surprise, the addition of only 1 mg of copper per 1 g of carbon nitride quadrupled this efficiency. Most importantly the selectivity changed from methane, another greenhouse gas, to methanol, a valuable green fuel."

Professor Andrei Khlobystov, School of Chemistry, University of Nottingham, said: "Carbon dioxide valorization holds the key for achieving the net-zero ambition of the UK. It is vitally important to ensure the sustainability of our catalyst materials for this important reaction. A big advantage of the new catalyst is that it consists of sustainable elements — carbon, nitrogen, and copper — all highly abundant on our planet." This invention represents a significant step towards a deep understanding of photocatalytic materials in CO2 conversion. It opens a pathway for creating highly selective and tuneable catalysts where the desired product could be dialed up by controlling the catalyst at the nanoscale.

"The research has been published in the Sustainable Energy & Fuels journal of the Royal Society of Chemistry."

Thanks to long-time Slashdot reader Baron_Yam for sharing the article.
