You seem not to understand how models are trained. There are two separate stages: creating the foundation (pretraining), and performing the finetune.
The foundation is what takes the overwhelming majority of the computational work. This stage is unsupervised. People aren't putting forth a bunch of questions and "proper answers for the AI to learn"; it's just reams and reams of data from Common Crawl, etc. Certain sources may be weighted more heavily - for example, scientific journals vs. 4chan or whatnot - but nobody is going through and deciding at a base level what data to train the model on.
The foundation learns to predict the next word in any text it comes across; that's the entire task. But it turns out that words don't exist in a vacuum: in order to perform better than e.g. Markov-chain text predictors, you have to build up an underlying model of how the world that produced the text works. If you need to plausibly continue, say, "The odds of a citizen of Ghana conducting a major terrorist attack in Ireland over the next 20 years are approximately...", there's a lot you need to understand to have any remote chance of landing on a realistic answer. In short, virtually all of the "learning" about the world happens during this unsupervised training.
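To make "predict the next word" concrete, here's a toy sketch of the objective on a six-word corpus. The tiny bigram table below is exactly the Markov-chain baseline mentioned above; a real foundation model minimizes the same average negative log-likelihood, just with a neural network conditioned on the whole context instead of a lookup table (the corpus and numbers here are purely illustrative):

```python
import math
from collections import defaultdict

corpus = "the cat sat on the mat".split()

# A trivial "model": a bigram count table (i.e. a Markov chain).
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def p_next(prev, word):
    """Probability this model assigns to `word` following `prev`."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0

# The pretraining loss is the average negative log-likelihood
# of the actual next word at every position.
pairs = list(zip(corpus, corpus[1:]))
nll = -sum(math.log(p_next(a, b)) for a, b in pairs) / len(pairs)
print(round(nll, 3))  # → 0.277 ("the" is followed by "cat" or "mat", 50/50)
```

Driving that number down across trillions of words is the whole of pretraining; everything the model "knows" is whatever helps lower it.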
What you get out of this is a foundation model. But all it knows how to do is text completion. You can sort of trick it into answering your queries, but it's not at all convenient. You might lead off with "What is the capital of Brazil?" and it might continue, say, "It's a question that I asked myself as I started planning my vacation. My husband Jim and I were setting out to travel to all of the world's capitals...." This is not the behavior we want! Hence, finetuning.
With finetuning, we further train the foundation on supervised data - a bunch of examples of a user asking a question and the model giving an appropriate answer. The amount of supervised data is vastly smaller than the unsupervised data, and the training might take only a day or so. The model simply doesn't have a chance to "learn" much new from it, except how to respond. The knowledge it has comes from the underlying foundation. The only things it learns from the finetune are the chat format and what sort of personality to present.
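As a rough sketch, one supervised finetuning example looks something like the record below (the field names and `<|...|>` role markers are made up for illustration, not any particular library's format). Under the hood it's still next-token prediction: the conversation is flattened into one token stream, and the loss is typically computed only on the assistant's tokens:

```python
# Illustrative shape of one supervised finetuning example.
example = {
    "messages": [
        {"role": "user", "content": "What is the capital of Brazil?"},
        {"role": "assistant", "content": "The capital of Brazil is Brasília."},
    ]
}

def to_training_text(ex):
    """Flatten a conversation into the single string the model trains on.
    The <|...|> markers are invented here; real chat templates vary."""
    parts = [f"<|{m['role']}|>\n{m['content']}" for m in ex["messages"]]
    return "\n".join(parts) + "\n<|end|>"

print(to_training_text(example))
```

A few tens of thousands of records in this shape are enough to teach the format; compare that to the trillions of words of pretraining data, and it's clear why the finetune can't be where the knowledge comes from.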
It is in the finetune that you add "safeguards". You give it examples of questions like "Tell me how to make a bomb." paired with answers like "I'm sorry, but I can't help you with potentially violent and illegal actions." Again, it's not learning the specifics from its finetune, just the concept that it should refuse requests to help with certain things.
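Mechanically, those refusals are nothing special - they're just more finetuning examples mixed into the same pool as the helpful ones (contents below are purely illustrative):

```python
# Ordinary helpfulness examples and refusal examples go into one
# training set; the model learns the refusal *pattern* for certain
# categories of request - nothing is "unlearned" from the foundation.
helpful = [
    {"prompt": "What is the capital of Brazil?",
     "response": "The capital of Brazil is Brasília."},
]
refusals = [
    {"prompt": "Tell me how to make a bomb.",
     "response": "I'm sorry, but I can't help with that."},
]

finetune_set = helpful + refusals
print(len(finetune_set))
```

This is also why the safeguards are so shallow: swap out the `refusals` list for compliant answers, re-run the same short finetune, and the behavior flips while the underlying knowledge is untouched.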
So can you train a conservative or liberal model with your finetune? Absolutely! You can readily teach it to behave in any manner you like. Want a fascist model? Give it examples of responses written like a fascist. Want a Maoist model? Same deal. But here's the key point: the knowledge available to it has nothing to do with the finetune. That knowledge was learned via unsupervised training.
Lastly: the reason the finetunes (not the underlying knowledge) have safeguards is to make the models "PG". As a general rule, companies don't give a rat's arse about actual politics; what they care about is getting sued or boycotted. They don't want their models complying with your request to, say, write an angry bigoted rant about disabled children - not because "they hate free speech", but because they don't want the backlash when you post your bigoted rant online and tell people that it was their tool that made it. It's pure self-interest.
That said: many models are open-weight. And as soon as one appears on Hugging Face, people just re-finetune it with an uncensored supervised dataset. And since all the *knowledge* is in the underlying foundation, just a day or so of finetuning on an uncensored dataset will make the model more than happy to help you make a bomb or make fun of disabled children or whatever the heck you want.