170356927
submission
guest reader writes:
From LWN Article:
From the moon landing to the James Webb Space Telescope and many other scientific missions, software is critical for the US National Aeronautics and Space Administration (NASA). Sharing information has also been in the DNA of the space agency from the beginning. As a result, NASA also contributes to and releases open-source software and open data. In a keynote at FOSDEM 2023, Science Data Officer Steve Crawford talked about NASA and open-source software, including the challenges NASA has faced in using open source and the agency's recent initiatives to lower barriers.
Software has always been a big part of NASA's work. Who hasn't seen the photo of computer scientist Margaret Hamilton next to a hard-copy stack of the Apollo software she and her team at MIT produced? The stack of code is as tall as she is. In 2016, the original Apollo 11 Guidance Computer source code for the command and lunar modules was published on GitHub in the public domain. You can even compile the code and run it in a simulator.
In recent years, more and more of this sharing has taken the form of releasing software. For instance, when NASA's drone copter Ingenuity made its first flight on Mars in 2021 as part of the Perseverance mission, it used an open-source flight-control framework, F Prime. NASA's Jet Propulsion Laboratory (JPL) released the framework in 2017 under the Apache 2.0 license. One of the example deployments even runs on the Raspberry Pi. But the NASA mission also used a lot of open-source dependencies. To celebrate Ingenuity's first flight, GitHub recognized the more than 12,000 people who contributed to these dependencies with a badge on their profile.
While the previous examples may be some high-profile successes, open source at NASA doesn't come without its challenges. "Civil servants can't release anything copyrightable", Crawford said, referring to the fact that under US copyright law, a work prepared by an officer or employee of the United States Government as part of that person's official duties is in the public domain.
Of course NASA has contributed to many open-source projects, but according to Crawford people often do this "not in their official capacity as NASA employees". In 2003 NASA created a license to enable the release of software by civil servants, the NASA Open Source Agreement. This license has been approved by the Open Source Initiative (OSI), but the Free Software Foundation doesn't consider it a free-software license because it does not allow changes to the code that come from third-party free-software projects. "It isn't widely used in the community and complicates the reuse of NASA software with this license", Crawford said.
Another challenge is NASA's famous bureaucracy, Crawford admitted: "NASA does not always engage well with the open-source community." As an example, he showed how curl's main developer Daniel Stenberg received an email from NASA's Commercial IT Acquisition Team asking him to supply country-of-origin information for curl, as well as a list of all "authorized resellers". Stenberg, who narrowly missed attending the keynote, mentioned it in a recent blog post.
Open-source software will clearly play an important role in open science, and has already been instrumental in various breakthrough discoveries. When scientists created the first image of a black hole in 2019 from data generated by the Event Horizon Telescope, Dr. Katie Bouman, who led the development of the imaging algorithm, was explicit about it: "We're deeply grateful to all the open source contributors who made our work possible." This was also the message Crawford ended his talk with: "Keep contributing, building, and sustaining your code." After his "Thank you for your contributions", his words were met with loud applause from a room full of open-source developers.
170348403
submission
guest reader writes:
The Register writes:
IBM is the latest tech giant to unveil its own "AI supercomputer," this one composed of a bunch of virtual machines running within IBM Cloud.
The system known as Vela, which the company claims has been online since May last year, is touted as IBM's first AI-optimized, cloud-native supercomputer, created with the aim of developing and training large-scale AI models.
But Vela is not running on any old standard IBM Cloud node hardware: each node is a twin-socket system with 2nd Gen Xeon Scalable processors, 1.5TB of DRAM, and four 3.2TB NVMe flash drives, plus eight 80GB Nvidia A100 GPUs, the latter connected by NVLink and NVSwitch.
This makes the Vela infrastructure closer to that of a high performance compute (HPC) site than typical cloud infrastructure, despite IBM's insistence that it was taking a different path as "traditional supercomputers weren't designed for AI."
It is also notable that IBM chose to use x86 processors rather than its own Power 10 chips, especially as these were touted by Big Blue as being ideally suited for memory-intensive workloads such as large-model AI inferencing.
170322111
submission
guest reader writes:
GamesRadar reports:
Never before have I seen live service game development summarized so well: The Division 2 currently cannot be updated because a recently delayed seasonal update broke the system used to update the game, so the developers trying to update it have to first update the updater to accept new updates. So that they can update it.
The worst part is I'm barely exaggerating. As the dev team explained in a recent Twitter post: "Last week, we shared news that the season would be delayed due to a localization issue. This past Saturday, in the process of creating the update which would resolve the issue, we encountered an error that brought down the build generation system for The Division 2. As a result, we cannot update the game until this system has been rebuilt."
To recap: the fix for an error that delayed an update resulted in an error that broke the updater which would deliver that update to The Division 2. I'm not a game developer, but that doesn't sound very good. Consequently, the devs "are unable to make server or client side updates until the build generation system is restored," meaning they can't even extend existing seasonal content to help fill the gap between updates.
170316235
submission
guest reader writes:
C++ creator Bjarne Stroustrup joins calls for changing the programming language itself to address security concerns, though other core contributors want to make more modest moves.
There's turmoil in the C++ community. In mid-January, the official C++ "direction group" — which makes recommendations for the programming language's evolution — issued a statement addressing concerns about C++ safety. While many languages now support "basic type safety" — that is, ensuring that variables access only sections of memory that are clearly defined by their data types — C++ has struggled to offer similar guarantees.
This new statement, co-authored by C++ creator Bjarne Stroustrup, now appears to call for changing the C++ programming language itself to address safety concerns. "We now support the idea that the changes for safety need to be not just in tooling, but visible in the language/compiler, and library."
The group still also supports its long-preferred use of debugging tools to ensure safety (and "pushing tooling to enable more global analysis in identifying hard for humans to identify safety concerns"). But that January statement emphasizes its recommendation for changes within C++.
170265435
submission
guest reader writes:
Popular image generation models can be prompted to produce identifiable photos of real people, potentially threatening their privacy, according to new research. The work also shows that these AI systems can be made to regurgitate exact copies of medical images and copyrighted work by artists. It's a finding that could strengthen the case for artists who are currently suing AI companies for copyright violations.
The researchers, from Google, DeepMind, UC Berkeley, ETH Zürich, and Princeton, got their results by prompting Stable Diffusion and Google's Imagen with captions for images, such as a person's name, many times. Then they analyzed whether any of the images they generated matched original images in the model's database. The group managed to extract over 100 replicas of images in the AI's training set.
The paper, titled "Extracting Training Data from Diffusion Models", marks the first time researchers have managed to prove that these AI models memorize images in their training sets, says Ryan Webster, a PhD student at the University of Caen Normandy in France.
For example, a recent class-action lawsuit accusing DeviantArt, Midjourney, and Stability AI makes the following claims:
The resulting image is necessarily a derivative work, because it is generated exclusively from a combination of the conditioning data and the latent images, all of which are copies of copyrighted images. It is, in short, a 21st-century collage tool.
A diffusion model is a form of lossy compression applied to the Training Images. Because a trained diffusion model can produce a copy of any of its Training Images—which could number in the billions—the diffusion model can be considered an alternative way of storing a copy of those images. In essence, it's similar to having a directory on your computer of billions of JPEG image files. But the diffusion model uses statistical and mathematical methods to store these images in an even more efficient and compressed manner.
A diffusion model is then able to reconstruct copies of each Training Image. Furthermore, being able to reconstruct copies of the Training Images is not an incidental side effect. The primary goal of a diffusion model is to reconstruct copies of the training data with maximum accuracy and fidelity to the Training Image. It is meant to be a duplicate.
A number of laws protect and preserve artists' rights and interests in their work; the references provided are 17 U.S.C. 106 and Section 1202(c) of the DMCA.
170218885
submission
guest reader writes:
Microsoft Corp, its subsidiary GitHub Inc, and OpenAI Inc told a San Francisco federal court that a proposed class-action lawsuit accusing them of improperly monetizing open-source code to train their artificial-intelligence systems cannot be sustained.
Two anonymous plaintiffs, seeking to represent a class of people who own copyrights to code on GitHub, sued Microsoft, GitHub and OpenAI in November. They said the companies trained Copilot with code from GitHub repositories without complying with open-source licensing terms, and that Copilot unlawfully reproduces their code.
Microsoft and OpenAI said Thursday that the plaintiffs lacked standing to bring the case because they failed to argue they suffered specific injuries from the companies' actions.
From the class action complaint:
GitHub and OpenAI have offered shifting accounts of the source and amount of the code or other data used to train and operate Copilot. They have also offered shifting justifications for why a commercial AI product like Copilot should be exempt from these license requirements, often citing "fair use."
It is not fair, permitted, or justified. On the contrary, Copilot's goal is to replace a huge swath of open source by taking it and keeping it inside a GitHub-controlled paywall. It violates the licenses that open-source programmers chose and monetizes their code despite GitHub's pledge never to do so.
170169914
submission
guest reader writes:
In a paper entitled "Myths and Legends of High-Performance Computing" appearing this week on the arXiv site, Matsuoka and four colleagues offer opinions and analysis on such issues as quantum replacing classical HPC, the zettascale timeline, disaggregated computing, domain-specific languages (DSLs) vs. Fortran, and cloud subsuming HPC, among other topics.
"We believe (these myths and legends) represent the zeitgeist of the current era of massive change, driven by the end of many scaling laws, such as Dennard scaling and Moore’s law," the authors said.
In this way they join the growing "end of" discussions in HPC. For example, as the industry moves through 3nm, 2nm, and 1.4nm chips – then what? Will accelerators displace CPUs altogether? What's next after overburdened electrical I/O interconnects? How do we get more memory per core?
170169492
submission
guest reader writes:
The Open Standards site contains a new paper from Bjarne Stroustrup, mailed to the C++ standards group.
In the article, called "A call to action: Think seriously about 'safety'; then do something sensible about it", Stroustrup reacts to the NSA report on Software Memory Safety, which excludes C and C++ as unsafe languages.
Stroustrup does not consider any of those "safe" languages superior to C++ for the range of uses he cares about.
Any good static analyzer (e.g., clang-tidy, which has some Core Guidelines support) could be made to completely deliver those guarantees at a fraction of the cost of a change to a variety of novel "safe" languages.
Not everyone prioritizes "safety" above all else. For example, in application domains where performance is the main concern, the P2687R0 approach lets you apply the safety guarantees only where required and use your favorite tuning techniques where needed.
Bjarne has worked for decades to make it possible to write better, safer, and more efficient C++. In particular, the work on the C++ Core Guidelines specifically aims at delivering statically guaranteed type-safe and resource-safe C++ for people who need that without disrupting code bases that can manage without such strong guarantees or introducing additional tool chains.
The article also contains the following references for consideration:
- Design Alternatives for Type-and-Resource Safe C++.
- Type-and-resource safety in modern C++.
- A brief introduction to C++'s model for type- and resource-safety.
- C++ Core Guidelines, safety profiles.
169711884
submission
guest reader writes:
Nearly 10,000 years ago, humans settling in the Fertile Crescent, the areas of the Middle East surrounding the Tigris and Euphrates rivers, made the first switch from hunter-gatherers to farmers. They developed close bonds with the rodent-eating cats that conveniently served as ancient pest-control in society’s first civilizations.
A new study at the University of Missouri found this lifestyle transition for humans was the catalyst that sparked the world’s first domestication of cats, and as humans began to travel the world, they brought their new feline friends along with them.
While horses and cattle have seen various domestication events caused by humans in different parts of the world at various times, study author Leslie Lyons's analysis of feline genetics strongly supports the theory that cats were likely first domesticated only in the Fertile Crescent before migrating with humans all over the world. As feline genes are passed down to kittens through the generations, the genetic makeup of cats in western Europe, for example, has become far different from that of cats in southeast Asia, a process known as 'isolation by distance.'
Lyons, who has researched feline genetics for more than 30 years, said studies like this also support her broader research goal of using cats as a biomedical model to study genetic diseases that impact both cats and people, such as polycystic kidney disease, blindness and dwarfism.
In a 2021 study, Lyons and colleagues found that the cat's genomic structure is more similar to humans than nearly any other non-primate mammal.
169300124
submission
guest reader writes:
New release of Rust-GPU supports SPIR-V ray-tracing.
Rust-GPU project aims at making Rust a first class language and ecosystem for GPU programming.
GPU programming has historically been done with HLSL or GLSL, simple programming languages that have evolved along with rendering APIs over the years. However, as game engines have evolved, these languages have failed to provide mechanisms for dealing with large codebases, and have generally stayed behind the curve compared to other programming languages.
Our hope with this project is that we push the industry forward by bringing Rust, an existing low-level, safe, and high-performance language, to the GPU. And with it come some additional great benefits: a package/module system that's one of the industry's best, built-in safety against race conditions or out-of-bounds memory access, a wide range of libraries and tools to improve programmer workflows, and many others!
169173438
submission
guest reader writes:
A startup backed by an internet-search pioneer wants to give cash to users who share personal data including what they buy or watch on mobile apps.
The amount of compensation will be determined by a "data score" reflecting factors such as whether consumers answer demographic survey questions and which apps and services’ data consumers are sharing.
But Ms. Liu also said she believes the space isn't likely to see mainstream success until another privacy shift—Google's plan to stop supporting third-party tracking in its Chrome browser—takes effect no sooner than 2024. Brands for now can still collect much of the information about consumers that these services are asking users to consent to share on their own, she said.
Would you share your data with an app like Caden? If so, how much would you expect to be paid?
168796806
submission
guest reader writes:
The Computer History Museum is excited to publicly release, for the first time, the source code for the breakthrough printing technology, PostScript.
From the start of Adobe Systems Incorporated (now Adobe, Inc.) exactly forty years ago in December 1982, the firm's cofounders envisioned a new kind of printing press—one that was fundamentally digital, using the latest advances in computing.
168599406
submission
guest reader writes:
Chipmaker Intel is offering staff in Ireland the opportunity to take three months' leave from their jobs, with the catch being that it is unpaid. The move is part of cost saving measures at the company.
The move follows Intel's announcement in October that it planned to lay off an unspecified number of employees worldwide, and even ditch some product lines, in response to a worsening economic situation. These plans are part of a massive reduction in spending, with Intel looking to slash $3 billion annually starting next year and between $8 billion and $10 billion annually by 2025.
However, this isn't going to stop the chipmaker from continuing to invest in building new chip manufacturing plants, as Intel confirmed this week when the company reiterated its commitment to manufacturing expansions in the US and in Europe that are set to cost billions of dollars.
168220659
submission
guest reader writes:
There is a new episode of the Stack Overflow Podcast. On this episode, the hosts chat with Max Howell, creator of Homebrew, about his new package manager, Tea, and how it aims to solve the problem of providing funding for popular open source projects.
We believe that the entirety of modern human endeavor has been stunted by relying on the smallest percentage of the world’s engineers to choose between a salary or keeping the Internet running.
Open-source is a labor of love often hindered by a lack of meaningful economic incentives resulting in genuinely worthwhile projects never reaching their potential while others suffer from security issues due to the lack of incentives to maintain software throughout its lifecycle.
There have been multiple attempts at providing incentive structures, typically involving sponsorship and bounty systems.
- Only projects at the top of the tower are typically known and receive sponsorship.
- Bounties allow consumers of projects to propose payment for developers to build specific features, thus only remunerating projects for doing things not necessarily in their best interest. And again, only rewarding favorites.
In this paper, we propose tea — a decentralized system for fairly remunerating open-source developers based on their contributions to the entire ecosystem and enacted through the tea incentive algorithm applied across all entries in the tea registry.
White paper: A Decentralized Protocol for Remunerating the Open-Source Ecosystem