Comment Re:Holy shit, the logic fail here. (Score 1) 38

What you describe is essentially a form of bootstrapping, which is a legitimate statistical method. However, there are important limitations that cannot be overlooked.

First, the constructed data are still being created from real data. Ethics is not just about preserving patient privacy, although that is a very important aspect. It's also about taking into consideration how the data will be used. Does the patient consent to this use, and if they are unable to consent, how should this be taken into consideration? Medical science has not had a stellar track record with respect to ethical human experimentation (e.g., Henrietta Lacks, the Tuskegee syphilis study, MKUltra--and that's just in recent US history). There is a documented history of data collected from patients being used in ways that those patients never even conceived of, let alone anticipated or consented to. Caution must be exercised whenever any such data is used, even indirectly.

Second, this kind of simulated data is problematic to analyze from a statistical perspective, and any biostatistician should be aware of this: there is no such thing as a free lunch. The problem of missing data--in actual patients!--is itself difficult to address, since methods to deal with missingness invariably rely on various strong assumptions about the nature of that missingness. So to make inferences on data that is entirely simulated is, at the very least, as problematic as analyzing partially missing data.

Third, the current state of LLMs, and their demonstrated tendency to distort or invent features from noise (which is arguably the primary mechanism by which they operate), is such that any inferences from LLM-generated data would be questionable and should not be considered statistically meaningful. It could be used for hypothesis generation, but it would not satisfy any kind of statistical review.

It all comes back to what I said in another comment: you can't have it both ways. If you can draw some statistically meaningful conclusion from the data, then that data came from real-world patients and must pass ethical review. If you don't need ethical review because the data didn't come from any real patient, then any inferences are dubious at best, and are most likely just fabrications that cannot pass confirmatory analysis.
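To make the bootstrapping point concrete, here is a minimal sketch of a percentile bootstrap for a mean, using only the Python standard library. The measurements and the function name `bootstrap_mean_ci` are invented for illustration and are not from any study. Note that every resampled value is drawn directly from the real observations, which is why the resampled data inherit their provenance.

```python
import random
import statistics

def bootstrap_mean_ci(observations, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(observations)
    means = []
    for _ in range(n_resamples):
        # Resample WITH replacement: every value comes from the real data.
        resample = rng.choices(observations, k=n)
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled means.
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical measurements, invented for illustration only.
observed = [5.1, 6.3, 4.8, 7.2, 5.9, 6.1, 5.4, 6.8, 5.0, 6.5]
lo, hi = bootstrap_mean_ci(observed)
print(f"sample mean = {statistics.fmean(observed):.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The interval is only as good as the original sample: resampling adds no new information about the population, it just re-expresses the uncertainty already present in the real data.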

Comment Re:Holy shit, the logic fail here. (Score 4, Insightful) 38

The purported claim is that "because the AI-generated data do not include data from actual humans, they do not need ethics review to use."

But if the data only represent actual patients in a "statistical" sense (whatever that means), how can the researcher be CERTAIN that it has captured appropriate signals or effects that are observed in such data? And I say this as a statistician who has over a decade of experience in statistical analysis of clinical trials.

There is a fundamental principle at work here, and researchers cannot have it both ways: any meaningful inference must be drawn on real-world data, and if such data is taken from humans, it must pass an ethics board review. If one argues that AI-generated data doesn't need the latter because it is a fabrication, then it doesn't meet the standard for meaningful inference. If one argues that it does meet the standard, then no matter how the data was transformed from real-world patient sources, it requires ethics board review.

In biostatistics, we use models to analyze data to detect potential effects, draw hypotheses or make predictions, and test those hypotheses to make probabilistic statements--i.e., statistical inferences--about the validity of those hypotheses. This is done within a framework that obeys mathematical truth, so that as long as certain assumptions about the data are met, the results are meaningful. But what "statistically naive" people consistently fail to appreciate, especially in their frenzy to "leverage" AI everywhere, is that those assumptions are PRETTY FUCKING IMPORTANT and using an LLM to generate "new" data from existing, real-world data is like making repeated photocopies of an original--placing one model on top of another model. LLMs will invent signals where none originally existed. LLMs will fail to capture signals that actually existed.

Comment Re:But not in the US (Score 4, Insightful) 228

In fact, it is UNETHICAL to use a placebo control in any clinical trial of an investigational product for which the existing standard of care already includes a product on the market.

In plain English, it is entirely unethical to give participants a placebo to test the efficacy of a new flu vaccine when we already have existing vaccines on the market. Doing so denies participants in the study access to effective treatment. If you have to test against a placebo, it will be impossible to recruit participants, because nobody will take the chance of receiving a placebo when they could just go to the pharmacy and get vaccinated.

There are only two possible explanations for such a position: either gross ignorance of basic scientific and ethical principles for conducting medical research in humans, or deliberate malicious intent to stop all research of investigational drugs. It doesn't actually matter which one is the reason. Both are entirely unacceptable.

The fact that a huge segment of the American population does not understand even the most basic scientific principles is the reason why many people will die needlessly.

Comment Re:Off Insulin onto immunosuppressants for life... (Score 4, Informative) 65

I agree that this therapy is not without significant risks, so it's not to be taken lightly.

That said, the long-term health outcomes of T1DM are also significant. So the way I see this development is that it is one more step on the path toward finding a durable, safe, and effective cure. And if approved, it may offer some patients another choice, one that of course should involve an informed discussion with competent healthcare providers.

It's important to keep in mind that healthcare is not a "one size fits all" thing. Two patients that have the same condition can respond very differently to the same therapy. Before the discovery of insulin, diabetics literally just...died. So on the path to understanding this relationship between the individual patient and the selected therapy, medical science can only offer a range of treatment options. At one time, humans believed in bloodletting, lobotomies, and arsenic to treat various illnesses. We built leper colonies. And in some places in the world, menstruation is still considered "dirty." We have made many advances, but there are still many more to be discovered.

Comment Yet another Register hit piece (Score 4, Interesting) 240

I'd rather use a slower browser that honors the user's choice of extensions--in particular those that block malicious content and privacy-violating advertising trackers--than an ostensibly faster browser that is created by a company whose entire business model is to gather as much tracking data about you as possible in order to sell it to advertisers.

There are alternatives to both Firefox and Chrome. But choosing to use Chrome because Firefox isn't perfect is either the height of idiocy, or being paid to promote Google products.

Comment Re:Phase I is not enough. (Score 4, Insightful) 40

Birth defects due to thalidomide approval outside of the US were extensive, and you conveniently ignore this. That wasn't some media psyop: 20000 affected embryos for a drug marketed to prevent morning sickness is not something to trivialize, and the fact that it was not approved in the US meant that many American families were spared this horror.

It's ironic that you mention Type I and Type II errors, yet conclude--without any apparent consideration of such errors as they apply to the establishment of efficacy and safety--that somehow "Montana has the right idea." How would they have the right idea if anyone can choose to receive unapproved drugs before any data collection and statistical analysis is performed? That to me suggests you don't have the faintest clue about what a Type I error means.

Comment Re:Phase I is not enough. (Score 3, Informative) 40

This is a horrible plan on so many levels.

An investigational treatment that has passed Phase I only has the most basic pharmacokinetic and safety data gathered. There's virtually no efficacy data. The layman who thinks that patients with serious and unmet medical needs should have access to such treatments before efficacy is established, believes so because "what other options do they have?" Their logic is that they should be "free" to try anything.

But the primary reason this logic is flawed is that a very high proportion of treatments at this stage of clinical development are inadequate--they are either ineffective or unsafe, or both. The second reason--one that never crosses the layman's mind--is that providing such early access would cripple trial enrollment. If even one state passes legislation to circumvent regulatory oversight, then patients will simply demand access through that state, rather than enroll in a trial and deal with the burden of following the trial protocol and procedures. And this will absolutely cause statistical and ethical problems with analyzing efficacy and safety in a trial context. The result will be either a serious delay in securing marketing approval for therapies that do succeed, or even worse, pharmaceutical companies will simply sidestep the regulatory process entirely and just start making marketing claims for untested compounds. After all, why bother spending hundreds of millions of dollars on a development program to secure approval if one state lets people try whatever they want?

One more comment: drug companies do have a pathway for patients in dire need. It's called "compassionate use" or "expanded access." So it's not like there's a brick wall preventing patients from accessing investigational therapies when they do not meet the inclusion criteria for a trial. But opening the floodgates is going to hurt way more people than it might help.

Comment Re:Model Collapse (Score 1) 98

This is exactly correct, and it also furnishes a rebuttal against the claim that AI-generated "art" is not theft any more than it would be theft for a human to study, learn from, and draw upon the works of other humans. If that were true, these models would not need to be trained on original, real-world data--they could simply train themselves. But model collapse is very real, and the desire of companies to steal original content from its creators by any means possible amounts to a tacit admission that the output is neither original nor on equal footing with the human-created data it was trained on.

Yes, the output can be impressive; it can mimic or even surpass human-created work in some ways. It can be incredibly useful and meaningful. But that is only because it was trained on data that had those properties to begin with. Existing AI models are not truly creative in nature, and the inability to self-train is proof.
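A toy illustration of the self-training problem, as a sketch only: this is a Gaussian refitted to its own samples, not an LLM, and the function name and parameter values are my own assumptions. Each "generation" is estimated solely from samples drawn from the previous generation's fit. Because the maximum-likelihood spread estimate is biased low and the estimation errors compound, the fitted spread tends to shrink toward zero over many generations.

```python
import random
import statistics

def self_training_spread(n_samples=20, generations=300, seed=0):
    """Refit a Gaussian to its own samples repeatedly; return the spread per generation."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: stand-in for the "real" data distribution
    spreads = [sigma]
    for _ in range(generations):
        # Generate synthetic data from the current model only (no real data re-enters).
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Refit the model to its own output (maximum-likelihood estimates).
        mu = statistics.fmean(synthetic)
        sigma = statistics.pstdev(synthetic)
        spreads.append(sigma)
    return spreads

spreads = self_training_spread()
print(f"spread at generation 0: {spreads[0]:.3g}")
print(f"spread at generation {len(spreads) - 1}: {spreads[-1]:.3g}")
```

The only way to stop the drift in this toy is to keep mixing in fresh draws from the original distribution, which is the analogue of needing new human-created data.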

Comment Too speculative to be meaningful (Score 2) 209

I am dismayed that a virologist would say such things, as I would expect them to know better. The truth is that we simply do not know. There is insufficient evidence to predict whether or not the current human population has any clinically significant immunity to this H5N1 subtype. We do not even know whether or not it will mutate to become more transmissible from human to human.

Yes, there are some basic principles at work that might suggest that preexisting infections and vaccinations for human influenza could confer some degree of protection. But that's such a broad and superficial notion--it doesn't convey any sense of the extent of protection, if any. The 1918 flu pandemic ("Spanish flu") that killed millions of people, of course did not occur in a totally immunologically naive population with respect to human influenza, so what justifies any assertion that our existing exposure to human influenza will confer protection against H5N1 in any meaningful way?

In any population there will be intrinsic variability: some people will be more susceptible to worse clinical outcomes, and others will be less susceptible. We do not yet know who these people are, because this disease is not yet pandemic or endemic. We still don't even know who these people are for COVID-19, which has become endemic. I would have thought that especially among the scientifically literate, people would have learned something from the COVID-19 pandemic with respect to making predictions or assertions about the behavior and impact of infectious diseases on a population. I've long since given up on the general population learning anything. But I am deeply disappointed that a virologist would speculate on such matters, as the message inevitably becomes further distorted by journalists and then politicized and warped beyond all recognition by a self-absorbed and uneducated public.

Comment So what is the proposed mechanism? (Score 4, Interesting) 130

If there is evidence for association or a dose-response relationship, then what is the underlying causal mechanism? That's the real question. Alcohol consumption has other health effects, many of which are detrimental. So before this research can be considered useful, it has to explain what is happening at a metabolic level; e.g., what is alcohol consumption doing to lipoprotein synthesis or endogenous cholesterol.

Comment Re:Can we cure dementia first (Score 1) 80

But I'm not addressing the amyloid hypothesis or even Alzheimer's dementia specifically. Rather, I'm speaking more broadly about the relationship between the biological aging process, chronological age, and diseases associated with these.

You are correct that, despite the likely validity of the amyloid hypothesis, treatments that target amyloid burden do not stop disease progression, and at best seem to only buy a little more time. Such drugs were approved under questionable circumstances (notably, Biogen's aducanumab) and in my opinion, their evidence of efficacy remains weak. As yet we do not have a complete understanding of Alzheimer's pathophysiology, let alone an understanding of how age-related decline in health is related to the risk of developing cognitive impairment, dementia, or other diseases in general.

That said, my original point is that we do not necessarily need to have such an understanding in order to target mechanisms of aging. We do not need to "solve the problem of dementia" before addressing aging and longevity. They aren't mutually exclusive. If we can improve health span through improving life span, it is reasonable to predict that the onset of age-related diseases could be delayed, leading to functional extension of healthy years of life and an improvement in the quality of life for a majority of the population.

A loose analogy would be cancer research: we can research ways of stopping disease progression in people who are diagnosed with various cancers, increasing their overall survival and improving their quality of life after diagnosis. Or we can also research ways to PREVENT people from developing cancer in the first place--reducing the risk of aggressive cancers or delaying the median age at onset. These are both valid and important research activities that make a meaningful difference, and they are not mutually exclusive. That's why I object to the grandparent post asking whether we can cure dementia first, before extending the human lifespan. It's the wrong way to think about these issues because it is quite possible that longevity research could have positive effects on the incidence and onset of dementia, as well as a whole host of other age-related diseases.

Comment Re:Can we cure dementia first (Score 2) 80

I think everyone can agree that as we age, the risk of various forms of dementia increases. But what is not as clear is whether that risk is correlated to chronological age, or whether senescence and dementia are merely the consequences of underlying inflammatory and/or epigenetic processes that can be modified.

In plain English, we have to distinguish between association and cause and effect. Age and dementia are associated. But maybe old age is not a CAUSE of dementia. It's possible that if we find a way to slow the aging process, the mechanism by which this happens might also slow the same processes that impact cognitive decline and dementia risk. In other words, what if by increasing lifespan, we are also delaying the onset of a whole host of other diseases--diabetes, cardiovascular disease, cancer, and dementia? Your post implies that you're conceptualizing lifespan extension as adding on extra years at the end. But maybe lifespan extension is more like taking those years of healthy life and making them last longer from a biological perspective. We don't know if it's possible or what form lifespan extension will take. But that uncertainty implies neither the former nor the latter. Without research, we will never know.

Comment Re: What's next? (Score 3, Insightful) 52

But it is relevant because that's exactly what's happening in real life right now.

The issue here isn't simply about the legal framework or copyright. What I'm trying to bring to light with this particular scenario are the downstream effects of current practice. The proliferation of generative AI models is significantly disruptive to a segment of creators who, whether or not they are in principle protected legally, have their livelihoods threatened. Attempts to adapt their work product are temporary solutions at best, and the actual reputational damage remains unaddressed. And while this sort of thing has happened well before such generative AIs were available, one certainly must admit that the scope of the problem has dramatically increased because of it.

So what I guess I'm saying is that even if we come to some kind of agreement about the responsible use of generative AI models (in the sense of operating within some agreed-upon legal framework for the protection of intellectual property rights), there is still this gap--even by your own admission, "style" is not copyrightable--through which creators still may suffer and be discouraged from creating because their own output has been leveraged against them to diminish the value of not only extant works, but future works.

In an ideal world, we wouldn't need to tie the creative process to an economic livelihood. People would create and discover and invent purely for the joy of it, without needing to attach economic value to those products in order to feed, clothe, and house themselves. But we live in a world where entities with far more money and power are able to take the creative output of others and profit off of it without consequences. And this has been true since before AI, but AI has made this malfeasance far easier and more efficient to perpetrate. The AI companies themselves admit that they need more data to train their models, and that training a model on its own output is self-corrupting. This suggests to me that, at present, the original work is what principally has value, hence should receive commensurate compensation. Therefore, what such models generate is not the same as what humans do when they learn and create, because if the models could do that, they wouldn't need to be trained on human-created data.

Comment Re:What's next? (Score 1) 52

But the AI companies aren't buying the content in the first place. They took it without compensation to the original creators, and the proposal in question is to grant them a license to do this without any compensation unless the creator locates each AI company to opt out. And the reason why they want it structured this way is that AI companies know that if they had to pay for content up front, they couldn't afford to do it. So they take it.

The way they're getting around this issue now is shifting the responsibility onto social media and internet companies, who have updated their Terms of Service to sell the data that users submit to their sites to these AI companies, often retroactively. Original works that you never consented to be used in this manner are then being used to train AI models. You weren't compensated, but the companies get all the billions. Their defense? You contributed for free, so you gave them a license for perpetual use. So you decide not to use those sites anymore, but they already have your data. They're never going to then dig through those models and decide to pay you royalties for it, even though they sell subscriptions to use their models.

Is that fair?

Comment Re:What's next? (Score 1) 52

Here's another, more realistic scenario, one that is actually very much what is going on with generative AI.

I have a friend who is a talented artist. His works are made digitally, but he is classically trained. He doesn't have a social media presence, but he sells his work to others who commission him after finding his work through word of mouth. So, he is compensated for the art that he does create for others--essentially, a work-for-hire arrangement.

Those who commission him are avid social media users. One day, his work goes viral because of his distinctive art style. Since the terms of service of the various social media sites state that any content uploaded may be used to train generative AI models, those reposts of his work find their way into the training datasets of subscription generative AI services. Users of those services discover they can generate numerous similar artworks--in fact, almost indistinguishable from genuine originals--for a fraction of the cost and time. The AI company makes money from the subscriptions. The social media companies make money from advertisers and from selling training data to the AI company. Demand for my friend's hand-drawn (albeit digital) work dies out. He is no longer able to make a living from his artwork. He never consented to his work being included in those models. He was compensated for the original works, but nobody wants them anymore because anyone can generate thousands of similar works from models trained on those originals.

Not to be discouraged, my friend changes his style. Perhaps he even ventures into other forms of art entirely--say, sculpture or other physical media. But the cycle repeats in one form or another. Even worse, the AI models, trained on so much data, become so convincing that scammers use them to flood the internet with generated images of his style of sculpture or physical artworks. Nobody can tell whether they are real or fake. People lose significant sums of money buying what they think are real works. Because there is so much distrust about their authenticity, demand for his work declines.

Is this fair? How do you propose this issue be solved?
