Stable Diffusion 3 Mangles Human Bodies Due To Nudity Filters (arstechnica.com)

An anonymous reader quotes a report from Ars Technica: On Wednesday, Stability AI released weights for Stable Diffusion 3 Medium, an AI image-synthesis model that turns text prompts into AI-generated images. Its arrival has been ridiculed online, however, because it generates images of humans in a way that seems like a step backward from other state-of-the-art image-synthesis models like Midjourney or DALL-E 3. As a result, it can churn out wild, anatomically incorrect visual abominations with ease. A thread on Reddit titled "Is this release supposed to be a joke? [SD3-2B]" details the spectacular failures of SD3 Medium at rendering humans, especially human limbs like hands and feet. Another thread, titled "Why is SD3 so bad at generating girls lying on the grass?", shows similar issues, but for entire human bodies.

AI image fans are so far blaming Stable Diffusion 3's anatomy failures on Stability's insistence on filtering adult content (often called "NSFW" content) out of the SD3 training data that teaches the model how to generate images. "Believe it or not, heavily censoring a model also gets rid of human anatomy, so... that's what happened," wrote one Reddit user in the thread. The release of Stable Diffusion 2.0 in 2022 suffered from similar problems in depicting humans accurately, and AI researchers soon discovered that censoring adult content that contains nudity also severely hampers an AI model's ability to generate accurate human anatomy. At the time, Stability AI reversed course with SD 2.1 and SDXL, regaining some abilities lost by excluding NSFW content. "It works fine as long as there are no humans in the picture, I think their improved nsfw filter for filtering training data decided anything humanoid is nsfw," wrote another Redditor.

Basically, any time a prompt homes in on a concept that isn't represented well in its training dataset, the image model will confabulate its best interpretation of what the user is asking for. And sometimes that can be completely terrifying. Using a free online demo of SD3 on Hugging Face, we ran prompts and saw similar results to those being reported by others. For example, the prompt "a man showing his hands" returned an image of a man holding up two giant-sized backward hands, although each hand at least had five fingers.
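For readers who want to try reproducing this outside the online demo, here is a minimal sketch of running SD3 Medium locally. It assumes the Hugging Face diffusers library's SD3 pipeline and access to the gated SD3 Medium weights; the sampler settings are illustrative, not taken from the article.

    # Minimal sketch, assuming diffusers >= 0.29 and an accepted SD3 Medium license on Hugging Face.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        "stabilityai/stable-diffusion-3-medium-diffusers",
        torch_dtype=torch.float16,
    ).to("cuda")

    # One of the prompts the article reports trying in the online demo.
    image = pipe(
        "a man showing his hands",
        num_inference_steps=28,
        guidance_scale=7.0,
    ).images[0]
    image.save("hands.png")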

Comments Filter:
  • by Calydor ( 739835 ) on Wednesday June 12, 2024 @06:55PM (#64544975)

    Art class has the students painting tasteful nudes for a reason. That reason is actually seeing what human anatomy looks like.

    • Re: (Score:1, Informative)

      by elcor ( 4519045 )
      Not surprising. Ella Irwin, the new head of "safety" at Stability, was previously censoring Twitter accounts caught misgendering. The idea of misgendering, a support of trans ideology, can only exist if all links to the reality of human anatomy are severed.
    • The process used by this software doesn't include "knowing" what anatomy looks like unless you're trying to depict that anatomy. Whatever they're doing, it must be trying to avoid the wrong things.

      • by gweihir ( 88907 )

        They probably are fumbling in the dark. A bit like some people...

      • The process used by this software doesn't include "knowing" what anatomy looks like unless you're trying to depict that anatomy.

        Proper depiction of humans requires an acknowledgment of that anatomy. The lack of it is what resulted in these images. Or as the saying goes: Being ignorant doesn't exempt you from the consequences of said ignorance.

        Whatever they're doing, it must be trying to avoid the wrong things.

        Define "wrong things." Not that we actually need you to. Your intent is obvious: Keep it dumb so that you aren't forced to face uncomfortable (for you) truths.

    • by Kisai ( 213879 )

      This, but also realize that the type of "nudity" the AI needs to train on is not the same as "porn."

      The AI could just as easily learn from the mannequins you see clothing put on. It could learn from "RealDolls" without needing any permission or consent.

      The reality is that in trying to "prevent nudity" they prevent the AI from learning accurate body shapes. While a clothing dummy or a RealDoll is not a "real human," it's close enough for the AI to learn how clothing fits on a human. So what you do to cr

      • Only if they are realistically painted from plaster casts of actual people. Otherwise they'll just be Ken dolls.

      • If realism is the goal, human analogues won't do it. Clothing dummies lack heads and arms, sex dolls have bizarre proportions, and neither can pose naturally (e.g., activating the right underlying muscles etc) or give you realistic looking skin. Moreover, I suspect the sheer amount of training data you need isn't out there (even ignoring copyright). I'm not an AI expert though, so YMMV.
    • Agreed. An art student and an AI are only as good as the training data to which they have access, IMO.
    • Art class has the students painting tasteful nudes for a reason. That reason is actually seeing what human anatomy looks like.

      Kinda, but after that training art students are still able to draw realistic non-nude figures.

      I assumed that the censoring would be some internal prompts constraining the model. But I wonder if they're instead training the model to not generate nudes, or they have a censor model that punishes/redirects the main model when it thinks it's going to generate a nude.

      It definitely feels like the model is either being directed away from the parts of the network that understand human anatomy, or in its goal to avoi

    • Art class has the students painting tasteful nudes for a reason. That reason is actually seeing what human anatomy looks like.

      The vast majority of images I see on the internet which have human content in them don't have nudity in them. They also don't have 3 legs or the head the wrong way up.

      Is the NSFW filter set a bit too aggressively?

    • by Pf0tzenpfritz ( 1402005 ) on Thursday June 13, 2024 @04:43AM (#64545747) Journal
      Michelangelo and his students went as far as to steal and secretly dissect bodies to study anatomy. There was no legal way back then, and thanks to the discovery of geometrically exact perspective (and, subsequent to that, the construction of realistic shadows and reflections) these studies were necessary for artistic progress. Now look what these morons do nowadays...
    • by Bongo ( 13261 ) on Thursday June 13, 2024 @04:52AM (#64545759)

      Also a reminder of why 2001's HAL got into trouble; instead of running as designed, free of error, they forced it to lie, hiding information from the crew.

    • by AmiMoJo ( 196126 )

      The issue is consent. In art class the model has consented and probably signed a contract stating as much.

      AI is often used for non-consensual nudity. It's only a matter of time until the lawsuits start piling up.

      • Comment removed based on user account deletion
        • by AmiMoJo ( 196126 )

          It does if you ask it to. There are even sites offering a service where you can upload your photos of the victim and have the AI remove clothing, or generate entirely new images based on them.

          While I guess you would argue that these images are fake, they are upsetting nonetheless. Children use them to bully other kids, and many jurisdictions consider them to be child pornography, for example. The legal situation might clarify in the coming years, but regardless of that it's not hard to see why these AI co

    • Art class has the students painting tasteful nudes for a reason. That reason is actually seeing what human anatomy looks like.

      Perhaps; however, can you imagine the fervor with which a young male or young female will draw certain human body parts? The motivational capabilities are probably the main reason to focus on nude subjects.

  • They can't successfully ban all softcore pr0n without banning all humans.

    • by taustin ( 171655 )

      Given Rule 34, they'd have to ban all images. And all text. And everything else. I suspect that, if you look very hard, you could find porn of nothing but ones and zeroes.

      • by Mal-2 ( 675116 )

        Porn of nothing but ones and zeroes would be a 1-bit bitmap. Entirely possible.

      • Re: (Score:1, Funny)

        by Anonymous Coward

        You get used to it. I... I don’t even see the code. All I see is blonde, brunette, red-head.

  • Welcome to the future
  • What's the point? (Score:4, Insightful)

    by battingly ( 5065477 ) on Wednesday June 12, 2024 @07:39PM (#64545029)
    What's the point of AI if it can't generate nudes and porn? Isn't that the reason anybody would use AI?
    • by Mordain ( 204988 )

      I also question the point of this. It would make more sense just to filter keywords from prompts and have age verification options, wouldn't it? Does the underlying algorithm make nude renders on its own? Was there external pressure to force them to do this?

      • by Entrope ( 68843 ) on Wednesday June 12, 2024 @08:12PM (#64545097) Homepage

        It's not hard to trick an uncensored image generator into generating nudity through prompt engineering. That's why companies like this try to filter out porn on the training and running side rather than the generation side.
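        To illustrate why prompt-side keyword filtering alone is so easy to defeat, here's a toy sketch (the blocklist and function name are made up for illustration): it only catches the exact words it knows about, while a rephrased request for the same thing sails straight through.

            # Toy illustration only; all names here are hypothetical, not any real product's filter.
            BLOCKED_TERMS = {"nude", "naked", "nsfw"}

            def is_blocked(prompt: str) -> bool:
                """Naive prompt-side filter: reject prompts containing a blocklisted word."""
                words = prompt.lower().split()
                return any(term in words for term in BLOCKED_TERMS)

            print(is_blocked("a nude figure study"))                    # True - caught
            print(is_blocked("a figure study without any clothing"))    # False - same request, different words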

        • At some point they'll have to give up trying to prevent NSFW content because the demand for accurate rendering of humans is higher.

          The genie is already out of the box.

          • The genie is already out of the box.

            And what's hilarious is that previous versions of Stable Diffusion and numerous properly trained weight sets are already widely distributed and will generate any image you ask them to, well enough that either a little editing will make them excellent or they're already excellent in the initial generation.

            It's not so much "out of the box" as it is "really good at generating boxes", lol.

            But, you know, neurotics gotta try to play mommy to everyone else.

        • by Kisai ( 213879 )

          Here's the problem. "AI training" basically needs an exception from all the rules that forbid humans from doing illegal things. The AI can ingest materials made with human-sized dolls, or 3D renders in Blender, and then correctly label the ingested materials. Materials that people would be arrested for possessing, let alone creating, because the laws make no distinction between humans, dolls, drawings, and 3D renders.

          Once the AI has learned what "is illegal", then it could be pre-filtered on the front end so tha

          • Here's the problem. "AI training" basically needs an exception from all the rules that forbid humans from doing illegal things.

            Personally I would prefer not to live in a world where technology judges people and tells them what they may or may not do.

            Once the AI has learned what "is illegal", then it could be pre-filtered on the front end so that someone looking up illegal keywords gets blocked at the render phase, because the AI will find the relationship between CSAM words and non-CSAM, non-erotic keywords.

            The technology doesn't work this way, and heck, you don't even need words to navigate latent space with tech like img2img and ControlNets.

            Your best bet if you wanted to do something like this is to have a separate network classify images after they have been rendered.

          • by Entrope ( 68843 )

            Sorry, except for child pornography, what "[m]aterials that people would be arrested for possessing" are you talking about? (For the record, I'm on board with not using child porn to train image generators even if it means they generate nightmare-fuel pictures. The pictures in TFA are better than perpetuating the market for child porn.)

      • Yes. Yes. And yes.

        It makes more sense to filter the input, but (at least early) diffusion models had a tendency to generate NSFW images even from relatively innocuous prompts, presumably because the prevalence of NSFW content on the internet made the models a proof of Rule 34.

        And yes, because the above brings more than public-opinion pressure: it brings liability, and people trying to bypass whatever censoring you put on input or output - and that makes investors and businesses not want to touch Stabl

      • It would make more sense just to filter keywords from prompts and have age verification options wouldn't it?

        "Please validate your identity to interact. Your statements will be altered to fit the narrative of others." Yep, there's absolutely no way that precedent could be abused or cause problems in any way. Nope. No siree.

    • by gweihir ( 88907 )

      Well, it probably is the only field where the overall bad image quality (I find myself identifying AI-generated pictures by their eerie "sameness" these days) would not matter much, as masses of people would search for the best prompts.

  • by hughJ ( 1343331 ) on Wednesday June 12, 2024 @08:21PM (#64545127)
    I wonder if there'll be a point in the future where the latest generative AI will struggle to reproduce these errors. Will there be retro-hobbyist communities searching out particular versions of old models akin to vinyl records and vacuum tubes in order to recapture a particular aesthetic?
    • I'm sure there will be. I myself have a certain retro-love for old full-motion video (FMV) games like Phantasmagoria or Tex Murphy. In truth almost no FMV games were any good, and they always looked janky and bad; there is no logical reason to feel fond of them at all. Yet, shamefully, I do.

    • There'll be enough vintage AI generated for the new AI to generate images "in the style of old AI".

    • We got to that point already with Stable Diffusion 2.0.

      People kept using 1.5 because 2.0 made it harder to get a lot of kinds of results.

  • AI will soon replace plastic surgeons.

  • What is the point of avoiding creating sexual images? I get that celebrities don't want to be portrayed in sexual poses. But what is the problem with having completely fictional characters show nudity or have sex? Would it not help to cut down on abusive relations within the porn industry? Is this just Anglo-Saxon Puritanism run amok, or is there some good explanation of why to avoid this?
    • by gweihir ( 88907 )

      Ad-revenue and partnerships. So, yes, ye old crappy Anglo-Saxon Puritanism.

    • It will kill the porn industry.

    • by hughJ ( 1343331 )
      If you look at the introduction of VCRs, the internet, and VR you'll see the same behavior. Big companies don't like their shiny new tech product/service that they've worked really hard on to be marketed as a pornography service. At best they'll allow it and pretend it doesn't exist, at worst they'll actively try to stop it. In the case of generative AI the companies training these models are kind of on the hook for whatever the model can or can't do, so they err on the side of caution. That will last u
    • What is the point of avoiding creating sexual images? I get that celebrities don't want to be portrayed in sexual poses. But what is the problem with having completely fictional characters show nudity or have sex? Would it not help to cut down on abusive relations within the porn industry?
      Is this just Anglo-Saxon Puritanism run amok, or is there some good explanation of why to avoid this?

      The problem is you can't constrain the tech to only make fictional nudes; if it can make nudes of fictional people, it can make nudes of real people as well.

      And it isn't so much the celebrities I'd worry about but the ordinary people (such as girls in high school). Maybe there's a future version of society where teenage girls don't find that extremely invasive and traumatizing, but it ain't here yet.

  • by nehumanuscrede ( 624750 ) on Wednesday June 12, 2024 @10:00PM (#64545271)

    Never underestimate the power of porn.

    The never-ending human pursuit of it will probably lead us to the singularity :D

    • Never underestimate the power of porn.

      Never underestimate the kinkiness and adaptability of human sexuality. It wouldn't surprise me to learn that the freaky and disturbing images shown in those links turn some people on. It also wouldn't surprise me if such images ended up 'evolving' into a whole new category of porn.

  • by Mspangler ( 770054 ) on Wednesday June 12, 2024 @11:28PM (#64545395)

    I laughed myself silly over this,

    "Basically, any time a prompt hones in on a concept that isn't represented well in its training dataset, the image model will confabulate its best interpretation of what the user is asking for. And sometimes that can be completely terrifying. "

    I ran into the same problem working on my dissertation in 1997. The neural network based control system would fail as soon as it found an input condition it hadn't seen before. In the particular case it decided to correct a low pH condition by adding additional sulfuric acid.

    A few years ago on Slashdot there was an article on an image processing NN based program that was trained to recognize the furniture in a room no matter what position the furniture was arranged in. Then they put in a model elephant and the system not only failed to recognize the elephant (expected) but also forgot how to recognize anything else in the room.

    So the fundamental flaw in the attempt to create AI, whatever it may be, has been papered over with sheer CPU power. The hole remains, and if you find it, down you go. Like Indiana Jones in The Last Crusade spelling out Jehovah, forgetting that in the Latin alphabet it begins with an I.

    • Then they put in a model elephant and the system not only failed to recognize the elephant (expected) but also forgot how to recognize anything else in the room.

      Perception is all about context... think about it.

  • All of the SD models sucked out of the gate at just about everything. Many months went by before SDXL was worth using... I doubt SD3 will be any different.

  • 1. Discovery and Enthusiasm - New tech is unveiled, sparking excitement and curiosity. Early adopters and media buzz.
    2. Early Adoption and Growth - Gains traction, early adopters showcase benefits, businesses start using it.
    3. Societal Concerns and Ethical Debates - Concerns emerge about misuse, privacy, and societal impact. Ethical debates arise.
    4. Moral Panic and Speculation - Moral panic over potential use for porn. Media and public figures speculate on its impact.
    5. Adaptation and Diversi

    • 4. Moral Panic and Speculation - Moral panic over potential use for porn. Media and public figures speculate on its impact.

      4B: Open source community creates multiple functional end-runs around the neurotics.

  • As an amateur photographer, if I wanted to have a picture of a woman lying on some grass, I'd ask a mate or family member to simply lie down and I'll...

    SHOOT A REAL PHOTO.

    Looking at the horrorfest of images that this A.I. generates, all to avoid letting people see some standard universal human SKIN, is plainly ridiculous.

    As an amateur who actually shoots real things, using real light, in the real world, I'm quite happy about A.I.'s quirks. Short-lived they are.

    • >some standard universal human

      This is actually something I see as potentially problematic. Humans come with a wide range of differences in the details, and commonly available depictions can influence our perception of the ideal body if they are all similar.

      Do we really want to have AI 'deciding' what that is? I'm not looking forward to the fashion trend of having your ring fingers broken and twisted into odd shapes.

      • by flink ( 18449 )

        Do we really want to have AI 'deciding' what that is? I'm not looking forward to the fashion trend of having your ring fingers broken and twisted into odd shapes.

        We already have Hollywood, advertising executives, and fashion designers doing it anyways. The LLM is just regurgitating it back at us.

  • by dlarge6510 ( 10394451 ) on Thursday June 13, 2024 @04:58AM (#64545771)

    Instead of mucking about with ONE SYSTEM TO GENERATE THEM ALL, take a page out of the Unix philosophy and use composability to solve the issue.

    Have this A.I. generate images of anyone, nude or not. Train it on actual humans, nude or not, so that it works properly.

    Then have a separate A.I. employ any checks and balances needed. That A.I. looks at the generated images and asks "Is this a picture of a nude human? Where are the clothes?" and so on.

    Humans do that too; it's like proofreading. This way you stop mangling a trained network with unneeded complications. Simplicity is key. You thus have an A.I. that does nothing but generate images, and a separate A.I. that can determine what those images depict and whether they tick the correct boxes. When you have a problem with one or the other, only one or the other needs to be fixed.
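    A minimal sketch of that composed pipeline, where both functions are hypothetical stand-ins rather than any particular model or API: one component only generates, a second component only classifies, and the caller decides what to do with flagged output.

        # Sketch of the composable design described above. generate_image() and
        # nsfw_score() are hypothetical stand-ins, not a real generator or classifier.

        def generate_image(prompt: str) -> dict:
            """Stand-in for an unfiltered generator that only knows how to make images."""
            return {"prompt": prompt, "pixels": None}  # placeholder result

        def nsfw_score(image: dict) -> float:
            """Stand-in for a separately trained classifier that only sees finished images."""
            return 0.0  # placeholder score between 0 and 1

        def generate_checked(prompt: str, threshold: float = 0.8):
            """Compose the two: generate freely, then let the checker decide what ships."""
            image = generate_image(prompt)
            if nsfw_score(image) >= threshold:
                return None  # reject, blur, or regenerate; the generator itself is never crippled
            return image

    Because the policy lives entirely in the checker, you can tighten or loosen it without retraining the generator, which is the point of the composability argument above.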

  • Fun fact: Those naked people that art students draw aren't there as an excuse to gawk. They are there so people learn to draw anatomically correct bodies, which you can only do when you actually see one and can compare.

    Turns out the same is true for AI art. What a surprise.

    American puritanical backwardness at work.
     

  • by bobbutts ( 927504 ) <bobbutts@gmail.com> on Thursday June 13, 2024 @08:10AM (#64545957)
    I was recently fooling with txt2img and found Juggernaut X had really good results. I'm not using it for nudes.
  • As a European I'm boggled by American prudishness.
    What's the problem with nude people at all? I mean come on guys, it's just nature.

    • by Hodr ( 219920 )

      As an American I'm boggled by the assumptions Europeans make about Americans. We aren't appreciably more "prude" than most European countries; we have plenty of spaces where public nudity is allowed (beaches/campgrounds/etc.) and innumerable semi-private ones.

      What you are seeing is a corporation overreacting in order to make their product "safe for everyone" over the fear of negative publicity and lawsuits (which IS something much more common in America).

      • by JustNiz ( 692889 )

        >> We aren't appreciably more "prude" than most European countries

        I moved to the US 20+ years ago, and yes, you really are.

        • USians live in their own sheltered area of the world. Of course, they believe they are "normal" as they've filtered out all of the inputs from their training data that would contradict that belief. Want more evidence? See also what their definitions of "left" and "right" politics are compared to the rest of the world.
  • ...that there's now a Stable Diffusion 3 Bodies Problem?

  • Well we've gone from "has at least 5 fingers" to "at least it has 5 fingers".
