Firefox

Firefox 128 Criticized for Including Small Test of 'Privacy-Preserving' Ad Tech by Default (itsfoss.com)

"Many people over the past few days have been lashing out at Mozilla," writes the blog It's FOSS, "for enabling Privacy-Preserving Attribution by default on Firefox 128, and the lack of publicity surrounding its introduction."

Mozilla responded that the feature will only run "on a few sites in the U.S. under strict supervision" — adding that users can disable it at any time ("because this is a test"), and that it's only even enabled if telemetry is also enabled.

And they also emphasize that it's "not tracking." The way it works is there's an "aggregation service" that can periodically send advertisers a summary of ad-related actions — again, aggregated data, from a mass of many other users. (And Mozilla says that aggregated summary even includes "noise that provides differential privacy.") This Privacy-Preserving Attribution concept "does not involve sending information about your browsing activities to anyone... Advertisers only receive aggregate information that answers basic questions about the effectiveness of their advertising."
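Mozilla's summary doesn't spell out the exact noise mechanism, but the classic way to give an aggregate count differential privacy is the Laplace mechanism. A minimal Python sketch of the idea (the function name and epsilon value are illustrative, not Mozilla's implementation):

```python
import math
import random

def dp_count(true_count: int, epsilon: float = 1.0) -> float:
    """Return a differentially private version of a count.

    Laplace mechanism: add noise with scale sensitivity/epsilon.
    A counting query has sensitivity 1 (one user changes the count
    by at most 1), so a smaller epsilon means more noise and
    stronger privacy.
    """
    u = random.random() - 0.5  # uniform in [-0.5, 0.5)
    scale = 1.0 / epsilon
    # Inverse-CDF sampling of a Laplace(0, scale) random variable.
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# An advertiser would only ever see the noisy aggregate, e.g.:
print(dp_count(1042))  # roughly 1042, give or take a few units of noise
```

Each individual report is thus hidden twice over: it is mixed into an aggregate across many users, and the aggregate itself is perturbed, so no one can tell whether any particular user's ad interaction is in the total.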

More from It's FOSS: Even though Mozilla mentioned that PPA would be enabled by default on Firefox 128 in a few of its past blog posts, they failed to communicate this decision clearly to a wider audience... In response to the public outcry, Firefox CTO Bobby Holley had to step in to clarify what was going on.

He started with how the internet has become a massive cesspool of surveillance, and how doing something about it was the primary reason many people are part of Mozilla. He then expanded on their approach with Firefox, which, historically speaking, has been to ship a browser with anti-tracking features baked in to tackle the most common surveillance techniques. But there were two limitations to this approach. One was that advertisers would try to bypass these countermeasures. The second: most users simply accept the default options they are shown...

Bas Schouten, Principal Software Engineer at Mozilla, made it clear at the end of a heated Mastodon thread that "[opt-in features are] making privacy a privilege for the people that work to inform and educate themselves on the topic. People shouldn't need to do that, everyone deserves a more private browser. Privacy features, in Firefox, are not meant to be opt-in. They need to be the default.

"If you are 'completely anti-ads' (i.e. even if their implementation is private), you probably use an ad blocker. So are unaffected by this."

This has already provoked a discussion among Slashdot readers. "It doesn't seem that evil to me," argues Slashdot reader geekprime. "Seems like the elimination of cross site cookies is a privacy enhancing idea." (They cite Mozilla's statement that their goal is "to inform an emerging Web standard designed to help sites understand how their ads perform without collecting data about individual people. By offering sites a non-invasive alternative to cross-site tracking, we hope to achieve a significant reduction in this harmful practice across the web.")

But Slashdot reader TheNameOfNick disagrees. "How realistic is the part where advertisers stop tracking you because they get less information from the browser maker...?"

Mozilla has provided simple instructions for disabling the feature:
  • Click the menu button and select Settings.
  • In the Privacy & Security panel, find the Website Advertising Preferences section.
  • Uncheck the box labeled Allow websites to perform privacy-preserving ad measurement.
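For users who manage settings in a user.js file, the checkbox above is reported to map to a single about:config preference. The pref name below is taken from community documentation of the PPA rollout, so treat it as an assumption and verify it in your own browser:

```javascript
// user.js -- disable Privacy-Preserving Attribution in Firefox 128.
// Pref name as reported for the PPA experiment; confirm in about:config.
user_pref("dom.private-attribution.submission.enabled", false);
```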

The Internet

The Data That Powers AI Is Disappearing Fast (nytimes.com)

An anonymous reader quotes a report from the New York Times: For years, the people building powerful artificial intelligence systems have used enormous troves of text, images and videos pulled from the internet to train their models. Now, that data is drying up. Over the past year, many of the most important web sources used for training A.I. models have restricted the use of their data, according to a study published this week by the Data Provenance Initiative, an M.I.T.-led research group. The study, which looked at 14,000 web domains that are included in three commonly used A.I. training data sets, discovered an "emerging crisis in consent," as publishers and online platforms have taken steps to prevent their data from being harvested.

The researchers estimate that in the three data sets -- called C4, RefinedWeb and Dolma -- 5 percent of all data, and 25 percent of data from the highest-quality sources, has been restricted. Those restrictions are set up through the Robots Exclusion Protocol, a decades-old method for website owners to prevent automated bots from crawling their pages using a file called robots.txt. The study also found that as much as 45 percent of the data in one set, C4, had been restricted by websites' terms of service. "We're seeing a rapid decline in consent to use data across the web that will have ramifications not just for A.I. companies, but for researchers, academics and noncommercial entities," said Shayne Longpre, the study's lead author, in an interview.
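The Robots Exclusion Protocol opt-outs the study counted are just plain-text rules. A minimal sketch of one, checked with Python's standard-library parser (the blocked bot names, GPTBot and CCBot, are illustrative examples of AI crawlers publishers commonly list):

```python
from urllib.robotparser import RobotFileParser

# An example robots.txt of the kind publishers now use to opt out of
# AI training crawlers while still allowing ordinary visitors.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved AI crawler is expected to check before fetching:
print(parser.can_fetch("GPTBot", "https://example.com/article"))      # False
print(parser.can_fetch("Mozilla/5.0", "https://example.com/article")) # True
```

The catch, as the study notes, is that robots.txt is purely advisory: it only restricts crawlers that choose to honor it.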

AI

It May Soon Be Legal To Jailbreak AI To Expose How It Works (404media.co)

An anonymous reader quotes a report from 404 Media: A group of researchers, academics, and hackers are trying to make it easier to break AI companies' terms of service to conduct "good faith research" that exposes biases, inaccuracies, and training data without fear of being sued. The U.S. government is currently considering an exemption to U.S. copyright law that would allow people to break technical protection measures and digital rights management (DRM) on AI systems to learn more about how they work, probe them for bias, discrimination, harmful and inaccurate outputs, and to learn more about the data they are trained on. The exemption would allow for "good faith" security and academic research and "red-teaming" of AI products even if the researcher had to circumvent systems designed to prevent that research. The proposed exemption has the support of the Department of Justice, which said "good faith research can help reveal unintended or undisclosed collection or exposure of sensitive personal data, or identify systems whose operations or outputs are unsafe, inaccurate, or ineffective for the uses for which they are intended or marketed by developers, or employed by end users. Such research can be especially significant when AI platforms are used for particularly important purposes, where unintended, inaccurate, or unpredictable AI output can result in serious harm to individuals."

Much of what we know about how closed-source AI tools like ChatGPT, Midjourney, and others work comes from researchers, journalists, and ordinary users purposefully trying to trick these systems into revealing something about the data they were trained on (which often includes copyrighted material indiscriminately and secretly scraped from the internet), their biases, and their weaknesses. Doing this type of research can often violate the terms of service users agree to when they sign up for a system. For example, OpenAI's terms of service state that users cannot "attempt to or assist anyone to reverse engineer, decompile or discover the source code or underlying components of our Services, including our models, algorithms, or systems (except to the extent this restriction is prohibited by applicable law)," and add that users must not "circumvent any rate limits or restrictions or bypass any protective measures or safety mitigations we put on our Services."

Shayne Longpre, an MIT researcher who is part of the team pushing for the exemption, told me that "there is a lot of apprehensiveness about these models and their design, their biases, being used for discrimination, and, broadly, their trustworthiness." "But the ecosystem of researchers looking into this isn't super healthy. There are people doing the work but a lot of people are getting their accounts suspended for doing good-faith research, or they are worried about potential legal ramifications of violating terms of service," he added. "These terms of service have chilling effects on research, and companies aren't very transparent about their process for enforcing terms of service." The exemption would be to Section 1201 of the Digital Millennium Copyright Act, a sweeping copyright law. Other 1201 exemptions, which must be applied for and renewed every three years as part of a process through the Library of Congress, allow for the hacking of tractors and electronic devices for the purpose of repair, have carveouts that protect security researchers who are trying to find bugs and vulnerabilities, and in certain cases protect people who are trying to archive or preserve specific types of content.
Harley Geiger of the Hacking Policy Council said that an exemption is "crucial to identifying and fixing algorithmic flaws to prevent harm or disruption," and added that a "lack of clear legal protection under DMCA Section 1201 adversely affect such research."

Links

Google URL Shortener Links Will Return a 404 Response

In 2018, Google replaced its URL shortener service, goo.gl, with Firebase Dynamic Links, citing "the changes we've seen in how people find content on the internet, and the number of new popular URL shortening services that emerged in that time." Although it stopped accepting new URLs to shorten, it continued to serve existing URLs created with the service. That's about to change on August 25th, 2025, when Google will turn off the serving portion of Google URL Shortener.

"Any developers using links built with the Google URL Shortener in the form https://goo.gl/* will be impacted, and these URLs will no longer return a response after August 25th, 2025," says Google in a blog post today. "Starting August 23, 2024, goo.gl links will start displaying an interstitial page for a percentage of existing links notifying your users that the link will no longer be supported after August 25th, 2025 prior to navigating to the original target page. Over time the percentage of links that will show the interstitial page will increase until the shutdown date." All links will return a 404 response after the shutdown date.

Privacy

USPS Shared Customers' Postal Addresses With Meta, LinkedIn and Snap (techcrunch.com)

An anonymous reader quotes a report from TechCrunch: The U.S. Postal Service was sharing the postal addresses of its online customers with advertising and tech giants Meta, LinkedIn and Snap, TechCrunch has found. On Wednesday, the USPS said it addressed the issue and stopped the practice, claiming that it was "unaware" of it. TechCrunch found USPS was sharing customers' information by way of hidden data-collecting code (also known as tracking pixels) used across its website. Tech and advertising companies create this kind of code to collect information about the user -- such as which pages they visit -- every time a webpage containing the code loads in the customer's browser.

In the case of USPS, some of that collected data included the postal addresses of logged-in USPS Informed Delivery customers, who use the service to see photos of their incoming mail before it arrives. It's not clear how many individuals had their information collected or for how long. Informed Delivery had more than 62 million users (PDF) as of March 2024. [...] The code also collected other data, such as information about the user's computer type and browser, which appeared as partly pseudonymized -- essentially scrambled in a way that makes it more difficult for humans to know where data came from, or who it relates to, by using randomized identifiers in place of real customer names. But researchers have long warned that pseudonymous data can still be used to re-identify seemingly anonymous individuals.
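As a generic illustration of pseudonymization (not USPS's or its analytics vendor's actual pipeline), salting and hashing an identifier yields a stable, random-looking token -- and that stability is exactly why researchers warn such data can still be linked back to a person:

```python
import hashlib
import secrets

# Illustrative sketch only. A secret per-dataset salt plus a hash turns
# a real identifier into a token that hides the original value. But the
# token is *stable*: the same person always maps to the same token, so
# anyone who can join it against other data can re-identify them.
SALT = secrets.token_hex(16)

def pseudonymize(identifier: str) -> str:
    digest = hashlib.sha256((SALT + identifier).encode("utf-8")).hexdigest()
    return digest[:16]

token = pseudonymize("Jane Doe, 123 Main St")
assert token == pseudonymize("Jane Doe, 123 Main St")  # stable across events
```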

TechCrunch also found that tracking numbers entered into the USPS website were also shared with advertisers and tech companies, including Bing, Google, LinkedIn, Pinterest and Snap. Some in-transit tracking data was also shared, such as the real-world location of the mail in the postal system, even if the customer was not logged in to USPS' website.
USPS spokesperson Jim McKean said in a statement: "The Postal Service leverages an analytics platform for our own internal purposes, so that we understand the usage of our products and services and which we use on an aggregated basis to market our products. The Postal Service does not sell or provide any personal information that is collected from this analytics platform to any third party, and we were unaware of any configuration of the platform that collected personal information from the URL and that shared it without our knowledge with social media."

"We have taken immediate action to remediate this issue," the spokesperson added, without saying what action was taken.

The Internet

Cloudflare Reports Almost 7% of Internet Traffic Is Malicious (zdnet.com)

In its latest State of Application Security Report, Cloudflare says 6.8% of traffic on the internet is malicious, "up a percentage point from last year's study," writes ZDNet's Steven Vaughan-Nichols. "Cloudflare, the content delivery network and security services company, thinks the rise is due to wars and elections. For example, many attacks against Western-interest websites are coming from pro-Russian hacktivist groups such as REvil, KillNet, and Anonymous Sudan." From the report: [...] Distributed Denial of Service (DDoS) attacks continue to be cybercriminals' weapon of choice, making up over 37% of all mitigated traffic. The scale of these attacks is staggering. In the first quarter of 2024 alone, Cloudflare blocked 4.5 million unique DDoS attacks. That total is nearly a third of all the DDoS attacks they mitigated the previous year. But it's not just about the sheer volume of DDoS attacks. The sophistication of these attacks is increasing, too. Last August, Cloudflare mitigated a massive HTTP/2 Rapid Reset DDoS attack that peaked at 201 million requests per second (RPS). That number is three times bigger than any previously observed attack.

The report also highlights the increased importance of application programming interface (API) security. With 60% of dynamic web traffic now API-related, these interfaces are a prime target for attackers. API traffic is growing twice as fast as traditional web traffic. What's worrying is that many organizations appear to be unaware of a quarter of their API endpoints. Organizations that don't have a tight grip on their internet services or website APIs can't possibly protect themselves from attackers. Evidence suggests the average enterprise application now uses 47 third-party scripts and connects to nearly 50 third-party destinations. Do you know and trust these scripts and connections? You should -- each script or connection is a potential security risk. For instance, the recent Polyfill.io JavaScript incident affected over 380,000 sites.

Finally, about 38% of all HTTP requests processed by Cloudflare are classified as automated bot traffic. Some bots are good and perform a needed service, such as customer service chatbots, or are authorized search engine crawlers. However, as many as 93% of bots are potentially bad.

The Courts

Federal Court Blocks Net Neutrality Rules (theverge.com)

An anonymous reader quotes a report from The Verge: A federal appeals court has agreed to halt the reinstatement of net neutrality rules until August 5th, while the court considers whether more permanent action is justified. It's the latest setback in a long back and forth on net neutrality -- the principle that internet service providers (ISPs) should not be able to block or throttle internet traffic in a discriminatory manner. The Federal Communications Commission has sought to achieve this by reclassifying ISPs under Title II of the Communications Act, which gives the agency greater regulatory oversight. The Democratic-led agency enacted net neutrality rules under the Obama administration, only for those rules to be repealed under former President Donald Trump's FCC. The current FCC, which has three Democratic and two Republican commissioners, voted in April to bring back net neutrality. The 3-2 vote was divided along party lines.

Broadband providers have since challenged the FCC's action, which is potentially more vulnerable after the Supreme Court's recent decision to strike down Chevron deference -- a legal doctrine that instructed courts to defer to an agency's expert decisions except in a very narrow range of circumstances. Bloomberg Intelligence analyst Matt Schettenhelm said in a report prior to the court's ruling that he doesn't expect the FCC to prevail in court, in large part due to the demise of Chevron. A panel of judges for the Sixth Circuit Court of Appeals said in an order that a temporary "administrative stay is warranted" while it considers the merits of the broadband providers' request for a permanent stay. The administrative stay will be in place until August 5th. In the meantime, the court requested the parties provide additional briefs about the application of National Cable & Telecommunications Association v. Brand X Internet Services to this lawsuit.

The Internet

NYC's Massive Link5G Towers Aren't Actually Providing 5G (gothamist.com)

An anonymous reader shares a report: The vast majority of the massive, metallic towers the city commissioned to help low-income neighborhoods access high-speed 5G internet still lack cell signal equipment -- more than two years after hundreds of the structures began sprouting across the five boroughs. Just two of the nearly 200 Link5G towers installed by tech firm CityBridge since 2022 have been fitted with 5G equipment, company officials said. Delayed installations and cooling enthusiasm around 5G technology have discouraged carriers like Verizon from using the towers to build out their networks, experts say. The firm only has an agreement with a single telecommunications carrier to deliver high-speed internet, stymieing its efforts to boost mobile connectivity citywide.

The 32-foot-tall structures, which resemble giant tampon applicators emerging from the sidewalk, offer the same services as the LinkNYC electronic billboards that popped up around the city in 2016. Those were also installed by CityBridge. Both the original Link kiosks and the 5G towers provide free limited-range Wi-Fi, charging outlets and a tablet to connect users to city services. Data shared by the company shows that 16 million people have used the internet at kiosks since 2016, and the attached tablets are used to call for city services thousands of times each month. But unlike the LinkNYC kiosks, each new tower is topped with a 12-foot-tall cylindrical mesh chamber containing five empty shelves reserved for companies like Verizon and T-Mobile to store the equipment they use to transmit high-speed 5G internet service to paying customers.

Microsoft

Palestinians Say Microsoft Unfairly Closing Their Accounts (bbc.co.uk)

Ancient Slashdot reader Alain Williams writes: Palestinians living abroad have accused Microsoft of closing their email accounts without warning -- cutting them off from crucial online services. They say it has left them unable to access bank accounts and job offers -- and stopped them using Skype, which Microsoft owns, to contact relatives in war-torn Gaza. Microsoft says they violated its terms of service -- a claim they dispute. One man the BBC spoke to said being cut off from Skype was a huge blow for his family. The internet in Gaza is frequently disrupted or switched off because of the Israeli military campaign -- and standard international calls are very expensive. [...] With a paid Skype subscription, it is possible to call mobiles in Gaza cheaply -- and while the internet is down -- so it has become a lifeline for many Palestinians.

Some of the people the BBC spoke to said they suspected they were wrongly thought to have ties to Hamas, which Israel is fighting, and is designated a terrorist organization by many countries. Microsoft did not respond directly when asked if suspected ties to Hamas were the reason for the accounts being shut. But a spokesperson said it did not block calls or ban users based on calling region or destination. "Blocking in Skype can occur in response to suspected fraudulent activity," they said, without elaborating.

Social Networks

In a First, Federal Regulators Ban Messaging App From Hosting Minors (washingtonpost.com)

An anonymous reader quotes a report from the Washington Post: Federal regulators have for the first time banned a digital platform from serving users under 18 (Warning: source may be paywalled; alternative source), accusing the app -- known as NGL -- of exaggerating its ability to use artificial intelligence to curb cyberbullying in a groundbreaking settlement. An app popular among children and teens, NGL aggressively marketed to young users despite risks of bullying on the anonymous messaging site, the Federal Trade Commission and the Los Angeles District Attorney's Office alleged in a complaint unveiled Tuesday.

The complaint alleged that NGL tricked users into paying for subscriptions by sending them computer-generated messages appearing to be from real people and offering a service for as much as $9.99 a week to find out their real identity. People who signed up received only "hints" of those identities, whether they were real or not, enforcers said. After users complained about the "bait-and-switch" tactic, executives at the company "laughed off" their concerns, referring to them as "suckers," the FTC said in an announcement. NGL, internet shorthand for "not gonna lie," agreed to pay $5 million and stop marketing to kids and teens to settle the lawsuit, which also alleged that the company violated children's privacy laws by collecting data from youths under 13 without parental consent.

The settlement marks a major milestone in the federal government's efforts to tackle concerns that tech platforms are exposing children to noxious material and profiting from it. And it's one of the most significant actions by the FTC under Chair Lina Khan, who has dialed up scrutiny of the tech sector at the agency since taking over in 2021. "We will keep cracking down on businesses that unlawfully exploit kids for profit," Khan (D) said in a statement.
NGL co-founder Joao Figueiredo said in a statement Tuesday that the company cooperated with the FTC's investigation for nearly two years and viewed the "resolution as an opportunity to make NGL better than ever."

"While we believe many of the allegations around the youth of our user base are factually incorrect, we anticipate that the agreed upon age-gating and other procedures will now provide direction for others in our space, and hopefully improve policies generally."

United States

US Nuke Agency Buys Internet Backbone Data (404media.co)

A U.S. government agency tasked with supporting the nation's nuclear deterrence capability has bought access to a data tool that claims to cover more than 90 percent of the world's internet traffic, and can in some cases let users trace activity through virtual private networks, according to documents obtained by 404 Media. From the report: The documents provide more insight into the use cases and customers of so-called netflow data, which can show which server communicated with another, information that is ordinarily only available to the server's owner or the internet service provider (ISP) handling the traffic. Other agencies that have purchased the data include the U.S. Army, NCIS, the FBI, and the IRS, with some government clients saying it would take too long to get data from the NSA, so they bought this tool instead. In this case, the Defense Threat Reduction Agency (DTRA) says it is using the data to perform vulnerability assessments of U.S. and allied systems.

A document written by the DTRA and obtained by 404 Media says the agency "has a requirement to support ongoing assessments of the vulnerability of critical U.S. and allied national/theater mission systems, networks, architectures, infrastructures, and assets." The tool "is capable of following communications between servers, even private servers," which allows the agency to identify infrastructure used by malicious actors, the document continues. That contract was for $490,000 in 2023, according to the document. 404 Media obtained the document and others under a Freedom of Information Act (FOIA) request.

The Internet

Substack Rival Ghost Federates Its First Newsletter (techcrunch.com)

After teasing support for the fediverse earlier this year, the newsletter platform and Substack rival Ghost has finally delivered. "Over the past few days, Ghost says it has achieved two major milestones in its move to become a federated service," reports TechCrunch. "Of note, it has federated its own newsletter, making it the first federated Ghost instance on the internet." From the report: Users can follow the newsletter through their preferred federated app at @index@activitypub.ghost.org, though the company warns there will be bugs and issues as it continues to work on the platform's integration with ActivityPub, the protocol that powers Mastodon and other federated apps. "Having multiple Ghost instances in production successfully running ActivityPub is a huge milestone for us because it means that for the first time, we're interacting with the wider fediverse. Not just theoretical local implementations and tests, but the real world wide social web," the company shared in its announcement of the news.

In addition, Ghost's ActivityPub GitHub repository is now fully open source. That means those interested in tracking Ghost's progress toward federation can follow its code changes in real time, and anyone else can learn from, modify, distribute or contribute to its work. Developers who want to collaborate with Ghost are also being invited to get involved following this move. By offering a federated version of the newsletter, readers will have more choices on how they want to subscribe. That is, instead of only being able to follow the newsletter via email or the web, they also can track it using RSS or ActivityPub-powered apps, like Mastodon and others. Ghost said it will also develop a way for sites with paid subscribers to manage access via ActivityPub, but that functionality hasn't yet rolled out with this initial test.
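Following a handle like @index@activitypub.ghost.org works the same way as for any fediverse account: the client first performs a WebFinger lookup (RFC 7033) to discover the actor document. A minimal sketch of the discovery URL a client constructs -- this is the standard protocol flow, not Ghost-specific behavior:

```python
def webfinger_url(handle: str) -> str:
    """Build the WebFinger discovery URL for a fediverse handle.

    Per RFC 7033, clients resolve @user@domain by fetching
    /.well-known/webfinger on the domain with an acct: resource;
    the JSON response then links to the ActivityPub actor to follow.
    """
    user, domain = handle.lstrip("@").split("@")
    return (f"https://{domain}/.well-known/webfinger"
            f"?resource=acct:{user}@{domain}")

print(webfinger_url("@index@activitypub.ghost.org"))
# https://activitypub.ghost.org/.well-known/webfinger?resource=acct:index@activitypub.ghost.org
```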

The Internet

Internet Archive Blames 'Environmental Factors' For Overnight Outages (theregister.com)

The Internet Archive took a tumble overnight after "environmental factors" downed the Wayback Machine, leaving archive.org wobbling in a way that might bring a smile to the faces of certain publishers wishing for its demise. From a report: According to the organization, there was a "brief power outage in one of our datacenters," which was followed by "environmental factors," causing the service blackout. Those environmental factors are likely to be an increase in heat following a cooling outage. By this morning, The Internet Archive was reporting that things were back up and running again. However, some users (this writer included) are still experiencing the odd error or two when accessing the organization's services.

Anime

Popular Pirate Site Animeflix Shuts Down 'Voluntarily' (torrentfreak.com)

An anonymous reader quotes a report from TorrentFreak: With tens of millions of monthly visits, Animeflix positioned itself as one of the most popular anime piracy portals. The site also has an active Discord community of around 35k members, who actively participate in discussions, art competitions, and even a chess tournament. While rightsholders take no offense at these side projects, the site's core business was streaming pirated videos. That hasn't gone unnoticed; last December, Animeflix was listed as one of the shutdown targets of anti-piracy coalition ACE.

Whether these early enforcement efforts were responsible for the site's closure is unclear. In May, rightsholders increased the pressure through the High Court of India, obtaining a broad injunction that effectively suspended Animeflix's main domain name: Animeflix.live. This follow-up action didn't seem to hurt the site too much. It simply moved to new domains, Animeflix.gg and Animeflix.li, informing its users that the old domain name had become "unavailable." Yesterday, the site became unreachable again, initially returning a Cloudflare error message. This time, the domain wasn't the problem but, for reasons unknown, the team decided to shut down the site without prior notice.

"It is with a heavy heart that we announce the closure of Animeflix. After careful consideration, we have decided to shut down our service effective immediately. We deeply appreciate your support and enthusiasm over the years." "Thank you for being a part of our journey. We hope the joy and excitement of anime continue to brighten your days through other wonderful platforms," the Animeflix team adds. The Animeflix team doesn't provide any insight into its reasoning, but it's clear that keeping a site like that online isn't without challenges. And, when a pirate site shuts down, voluntarily or not, copyright issues typically play a role. It's clear that rightsholders were keeping an eye on the site, and were actively seeking out options to take it offline. That might have played a role in the shutdown decision but without more information from the team, we can only speculate.

Security

384,000 Sites Pull Code From Sketchy Code Library Recently Bought By Chinese Firm (arstechnica.com)

An anonymous reader quotes a report from Ars Technica: More than 384,000 websites are linking to a site that was caught last week performing a supply-chain attack that redirected visitors to malicious sites, researchers said. For years, the JavaScript code, hosted at polyfill[.]io, was a legitimate open source project that allowed older browsers to handle advanced functions that weren't natively supported. By linking to cdn.polyfill[.]io, websites could ensure that devices using legacy browsers could render content in newer formats. The free service was popular among websites because all they had to do was embed the link in their sites. The code hosted on the polyfill site did the rest. In February, China-based company Funnull acquired the domain and the GitHub account that hosted the JavaScript code. On June 25, researchers from security firm Sansec reported that code hosted on the polyfill domain had been changed to redirect users to adult- and gambling-themed websites. The code was deliberately designed to mask the redirections by performing them only at certain times of the day and only against visitors who met specific criteria.

The revelation prompted industry-wide calls to take action. Two days after the Sansec report was published, domain registrar Namecheap suspended the domain, a move that effectively prevented the malicious code from running on visitor devices. Even then, content delivery networks such as Cloudflare began automatically replacing polyfill links with domains leading to safe mirror sites. Google blocked ads for sites embedding the Polyfill[.]io domain. The website blocker uBlock Origin added the domain to its filter list. And Andrew Betts, the original creator of Polyfill.io, urged website owners to remove links to the library immediately. As of Tuesday, exactly one week after the malicious behavior came to light, 384,773 sites continued to link to the site, according to researchers from security firm Censys. Some of the sites were associated with mainstream companies including Hulu, Mercedes-Benz, and Warner Bros., as well as the federal government. The findings underscore the power of supply-chain attacks, which can spread malware to thousands or millions of people simply by infecting a common source they all rely on.
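For site owners wondering whether their pages are among those still linking out, a crude audit is to scan rendered HTML for script tags that point at the domain. An illustrative sketch (not the methodology Censys used):

```python
import re

SUSPECT_HOSTS = ("polyfill.io",)  # also matches cdn.polyfill.io

def find_suspect_scripts(html: str) -> list[str]:
    # Pull the src of every <script> tag and keep those pointing at the
    # compromised domain. A crude string check, but enough for an audit.
    srcs = re.findall(r'<script[^>]+\bsrc=["\']([^"\']+)["\']', html,
                      flags=re.IGNORECASE)
    return [s for s in srcs if any(host in s for host in SUSPECT_HOSTS)]

page = """<html><head>
<script src="https://cdn.polyfill.io/v3/polyfill.min.js"></script>
<script src="/static/app.js"></script>
</head></html>"""
print(find_suspect_scripts(page))
# ['https://cdn.polyfill.io/v3/polyfill.min.js']
```

A longer-term defense is to self-host such libraries, or to pin any remaining third-party scripts with Subresource Integrity hashes so that a silently changed file simply fails to load.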

Google

Google Paper: AI Potentially Breaking Reality Is a Feature Not a Bug (404media.co) 82

An anonymous reader shares a report: Generative AI could "distort collective understanding of socio-political reality or scientific consensus," and in many cases is already doing that, according to a new research paper from Google, one of the biggest companies in the world building, deploying, and promoting generative AI. The paper, "Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data," [PDF] was co-authored by researchers at Google's artificial intelligence research laboratory DeepMind, its security think tank Jigsaw, and its charitable arm Google.org, and aims to classify the different ways generative AI tools are being misused by analyzing about 200 incidents of misuse as reported in the media and research papers between January 2023 and March 2024.

Unlike self-serving warnings from OpenAI CEO Sam Altman or Elon Musk about the "existential risk" artificial general intelligence poses to humanity, Google's research focuses on real harm that generative AI is currently causing and could get worse in the future. Namely, that generative AI makes it very easy for anyone to flood the internet with generated text, audio, images, and videos. Much like another Google research paper about the dangers of generative AI I covered recently, Google's methodology here likely undercounts instances of AI-generated harm. But the most interesting observation in the paper is that the vast majority of these harms and how they "undermine public trust," as the researchers say, are often "neither overtly malicious nor explicitly violate these tools' content policies or terms of service." In other words, that type of content is a feature, not a bug.

The Internet

Cloudflare Rolls Out Feature For Blocking AI Companies' Web Scrapers (siliconangle.com) 40

Cloudflare today unveiled a new feature for its content delivery network (CDN) that prevents AI developers from scraping content on the web. According to Cloudflare, the feature is available for both the free and paid tiers of its service. SiliconANGLE reports: The feature uses AI to detect automated content extraction attempts. According to Cloudflare, its software can spot bots that scrape content for LLM training projects even when they attempt to avoid detection. "Sadly, we've observed bot operators attempt to appear as though they are a real browser by using a spoofed user agent," Cloudflare engineers wrote in a blog post today. "We've monitored this activity over time, and we're proud to say that our global machine learning model has always recognized this activity as a bot."

One of the crawlers that Cloudflare managed to detect is a bot that collects content for Perplexity AI Inc., a well-funded search engine startup. Last month, Wired reported that the manner in which the bot scrapes websites makes its requests appear as regular user traffic. As a result, website operators have struggled to block Perplexity AI from using their content. Cloudflare assigns every website visit that its platform processes a score of 1 to 99. The lower the number, the greater the likelihood that the request was generated by a bot. According to the company, requests made by the bot that collects content for Perplexity AI consistently receive a score under 30.

"When bad actors attempt to crawl websites at scale, they generally use tools and frameworks that we are able to fingerprint," Cloudflare's engineers detailed. "For every fingerprint we see, we use Cloudflare's network, which sees over 57 million requests per second on average, to understand how much we should trust this fingerprint." Cloudflare will update the feature over time to address changes in AI scraping bots' technical fingerprints and the emergence of new crawlers. As part of the initiative, the company is rolling out a tool that will enable website operators to report any new bots they may encounter.
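The 1-to-99 bot score described above lends itself to a simple threshold rule. A hedged sketch of how an operator-side filter might apply it (the function name and the use of the article's "under 30" figure as a cutoff are illustrative assumptions; Cloudflare's actual enforcement happens inside its own platform, not in customer code):

```python
# Article: scores run 1-99, lower = more likely a bot;
# Perplexity's crawler consistently scored under 30.
BOT_SCORE_THRESHOLD = 30

def should_block(bot_score: int, block_ai_scrapers: bool = True) -> bool:
    """Decide whether to reject a request, given its 1-99 bot score
    (lower means more likely automated)."""
    if not block_ai_scrapers:
        return False
    return bot_score < BOT_SCORE_THRESHOLD

# Example: a request scored 12 (likely bot) vs. one scored 85 (likely human)
print(should_block(12), should_block(85))
```

The `block_ai_scrapers` flag mirrors the opt-in nature of the feature: operators on both free and paid tiers choose whether to enable blocking.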

AI

AI Trains On Kids' Photos Even When Parents Use Strict Privacy Settings (arstechnica.com) 33

An anonymous reader quotes a report from Ars Technica: Human Rights Watch (HRW) continues to reveal how photos of real children casually posted online years ago are being used to train AI models powering image generators -- even when platforms prohibit scraping and families use strict privacy settings. Last month, HRW researcher Hye Jung Han found 170 photos of Brazilian kids that were linked in LAION-5B, a popular AI dataset built from Common Crawl snapshots of the public web. Now, she has released a second report, flagging 190 photos of children from all of Australia's states and territories, including Indigenous children who may be particularly vulnerable to harms. These photos are linked in the dataset "without the knowledge or consent of the children or their families." They span the entirety of childhood, making it possible for AI image generators to generate realistic deepfakes of real Australian children, Han's report said. Perhaps even more concerning, the URLs in the dataset sometimes reveal identifying information about children, including their names and locations where photos were shot, making it easy to track down children whose images might not otherwise be discoverable online. That puts children in danger of privacy and safety risks, Han said, and some parents thinking they've protected their kids' privacy online may not realize that these risks exist.

From a single link to one photo that showed "two boys, ages 3 and 4, grinning from ear to ear as they hold paintbrushes in front of a colorful mural," Han could trace "both children's full names and ages, and the name of the preschool they attend in Perth, in Western Australia." And perhaps most disturbingly, "information about these children does not appear to exist anywhere else on the Internet" -- suggesting that families were particularly cautious in shielding these boys' identities online. Stricter privacy settings were used in another image that Han found linked in the dataset. The photo showed "a close-up of two boys making funny faces, captured from a video posted on YouTube of teenagers celebrating" during the week after their final exams, Han reported. Whoever posted that YouTube video adjusted privacy settings so that it would be "unlisted" and would not appear in searches. Only someone with a link to the video was supposed to have access, but that didn't stop Common Crawl from archiving the image, nor did YouTube policies prohibiting AI scraping or harvesting of identifying information.

Reached for comment, YouTube's spokesperson, Jack Malon, told Ars that YouTube has "been clear that the unauthorized scraping of YouTube content is a violation of our Terms of Service, and we continue to take action against this type of abuse." But Han worries that even if YouTube did join efforts to remove images of children from the dataset, the damage has been done, since AI tools have already trained on them. That's why -- even more than parents need tech companies to up their game blocking AI training -- kids need regulators to intervene and stop training before it happens, Han's report said. Han's report comes a month before Australia is expected to release a reformed draft of the country's Privacy Act. Those reforms include a draft of Australia's first child data protection law, known as the Children's Online Privacy Code, but Han told Ars that even people involved in long-running discussions about reforms aren't "actually sure how much the government is going to announce in August." "Children in Australia are waiting with bated breath to see if the government will adopt protections for them," Han said, emphasizing in her report that "children should not have to live in fear that their photos might be stolen and weaponized against them."

United States

Will a US Supreme Court Ruling Put Net Neutrality at Risk? (msn.com) 192

Today the Wall Street Journal reported that restoring net neutrality to America is "on shakier legal footing after a Supreme Court decision on Friday shifted power away from federal agencies." "It's hard to overstate the impact that this ruling could have on the regulatory landscape in the United States going forward," said Leah Malone, a lawyer at Simpson Thacher & Bartlett. "This could really bind U.S. agencies in their efforts to write new rules." Now that [the "Chevron deference"] is gone, the Federal Communications Commission is expected to have a harder time reviving net neutrality — a set of policies barring internet-service providers from assigning priority to certain web traffic...

The Federal Communications Commission reclassified internet providers as public utilities under the Communications Act. There are pending court cases challenging the FCC's reinterpretation of that 1934 law, and the demise of Chevron deference heightens the odds of the agency losing in court, some legal experts said. "Chevron's thumb on the scale in favor of the agencies was crucial to their chances of success," said Geoffrey Manne, president of the International Center for Law and Economics. "Now that that's gone, their claims are significantly weaker."

Other federal agencies could also be affected, according to the article. The ruling could also make it harder for America's Environmental Protection Agency to crack down on power-plant pollution. And the Federal Trade Commission faces more trouble in court defending its recent ban on noncompete agreements. Lawyer Daniel Jarcho tells the Journal that the Court's decision "will unquestionably lead to more litigation challenging federal agency actions, and more losses for federal agencies."

Friday a White House press secretary issued a statement calling the court's decision "deeply troubling," and arguing that the court had "decided in favor of special interests".

The Internet

Japan Achieves 402 TB/s Data Rate - Using Current Fiber Technology (tomshardware.com) 21

Tom's Hardware reports that Japan's National Institute of Information and Communications Technology (working with the Aston Institute of Photonic Technologies and Nokia Bell Labs) set a 402 terabits per second data transfer record — over commercially available optical fiber cables. The NICT and its partners were able to transmit signals through 1,505 channels over 50 km (about 31 miles) of optical fiber cable for this experiment. It used six types of amplifiers and an optical gain equalizer that taps into the unused 37 THz bandwidth to enable the 402 Tb/s transfer speed. One of the amplifiers demonstrated is a thulium-doped fiber amplifier, which works in C-band or C+L-band systems. Additionally, semiconductor optical amplifiers and Raman amplifiers were used, which achieved a 256 Tb/s data rate through almost 20 THz. Other amplifiers were also used for this exercise, which provided a cumulative bandwidth of 25 THz for up to a 119 Tb/s data rate.

As a result, its maximum achievable result surpassed the previous data rate capacity by over 25 percent and increased transmission bandwidth by 35 percent.
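A rough back-of-envelope check on those figures (assuming, as the article implies but does not state outright, that the 402 Tb/s aggregate is spread across all 1,505 channels and the full 37 THz band):

```python
total_rate_bps = 402e12   # 402 Tb/s aggregate, per the article
channels = 1505           # wavelength channels used in the experiment
bandwidth_hz = 37e12      # 37 THz of previously unused optical bandwidth

per_channel_gbps = total_rate_bps / channels / 1e9
spectral_eff = total_rate_bps / bandwidth_hz  # average bits per second per hertz

print(f"~{per_channel_gbps:.0f} Gb/s per channel")  # roughly 267 Gb/s
print(f"~{spectral_eff:.1f} b/s/Hz average")        # roughly 10.9 b/s/Hz
```

These are averages across the whole band; real systems allocate different modulation formats per channel, so individual channel rates would vary.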

"This is achievable with currently available technology used by internet service providers..." the article points out.

"With 'beyond 5G' potential speeds achievable through commercially available cables, it will likely further a new generation of internet services."

Slashdot Top Deals