I think that experienced Haskellers often forget to explain that there is a portion of the program that is not strictly functional. The thing is that the programmer is not given access to it. Instead, the programmer is asked to pass around descriptions of the I/O actions to be taken. A monad is a data type that (among a vast number of other things) can be used to structure these descriptions so that we get the order of execution right. (Notice that I didn't say 'evaluation'.)
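To make the evaluation/execution distinction concrete, here is a minimal sketch. The names `actions` and `countActions` are made up for illustration; only `putStrLn`, `length`, `reverse` and `sequence_` come from the standard Prelude.

```haskell
-- A list of I/O descriptions. Building or inspecting this list is ordinary
-- pure evaluation: nothing is printed.
actions :: [IO ()]
actions = [putStrLn "one", putStrLn "two", putStrLn "three"]

-- A pure function over descriptions (hypothetical helper): it only counts them.
countActions :: [IO a] -> Int
countActions = length

main :: IO ()
main = do
  -- Evaluating the list (here, its length) executes none of the actions.
  print (countActions actions)
  -- The descriptions are executed only once they are made part of main,
  -- and in the order we combine them, not the order they were written.
  sequence_ (reverse actions)
```

Running this should print the count first and then the three lines in reverse, which is the point: the order of execution is decided by how the descriptions are combined, not by when they are evaluated.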
The next part is a bit sloppy because monads turn out to be even more abstract than described, but it suffices to explain the concepts that give us the ability to be pure and still interact with the outside world.
Every Haskell program is an instance of a function that returns an IO monad. "Inside" that monad (for the moment, think of it as a box plus a little bit of extra data) is a description of the I/O action to be performed and a new function that takes the result of that I/O (possibly discarding it) and produces another monad. Only the function inside the monad is allowed to refer to the result of executing that monad's action, but it is also able to refer to anything defined outside of the monad. (Like lexical scoping.)
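The "box with a description and a function" picture can be sketched as a toy data type. This is just a model in the same sloppy spirit as the paragraph above, not how GHC actually implements `IO`; the names `Action`, `greet` and `firstStep` are invented for the example.

```haskell
-- Toy model only: a description is either a finished result, or one
-- primitive action together with what to do next.
data Action a
  = Done a                        -- nothing left to do; here is the result
  | PutLine String (Action a)     -- print a line (result discarded), then continue
  | GetLine (String -> Action a)  -- read a line; the function receives the result

-- A description built as plain data. No I/O happens here.
greet :: String -> Action String
greet greeting =
  PutLine "What is your name?" $
    GetLine $ \name ->
      -- `name` is visible only inside this function; `greeting` comes from
      -- the enclosing scope, as with ordinary lexical scoping.
      PutLine (greeting ++ ", " ++ name ++ "!") $
        Done name

-- Pure inspection: report the first step without executing anything.
firstStep :: Action a -> String
firstStep (Done _)      = "done"
firstStep (PutLine s _) = "put: " ++ s
firstStep (GetLine _)   = "get"

main :: IO ()
main = putStrLn (firstStep (greet "Hello"))
```

Note that `greet "Hello"` is an ordinary value: we can look at its first step, but the name typed by the user is only ever available to the function stored inside the `GetLine` box.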
There's always an impure portion of a program that the programmer never gets to see. Its job is to evaluate just enough of a function to get hold of an I/O monad, read the description inside that monad, perform the action, and then repeat the whole process by passing the result into the function it found inside the monad. This division of duties is enforced by only allowing the programmer to use the functions which stuff a description and a function into a monad, but not the ones that get them out. Only the impure part of the program can do that.
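Here is a rough sketch of that hidden loop, using a toy `Action` type as a stand-in for the real thing. Everything here is an assumption made for illustration: the real runtime is not written this way, and `run`, `simulate` and `echoTwice` are hypothetical names. `run` plays the part of the impure portion; `simulate` is the same loop with the outside world replaced by a list of input lines, so the behaviour can be checked without a terminal.

```haskell
-- Toy description type: a result, or one action plus a continuation.
data Action a
  = Done a
  | PutLine String (Action a)
  | GetLine (String -> Action a)

-- Stand-in for the impure part: read the description, perform one action,
-- pass the result to the function found inside, and repeat.
run :: Action a -> IO a
run (Done x)         = pure x
run (PutLine s next) = putStrLn s >> run next
run (GetLine k)      = getLine >>= \line -> run (k line)

-- The same loop against fake input. Returns the lines that would have been
-- printed, together with the final result.
simulate :: [String] -> Action a -> ([String], a)
simulate _   (Done x)         = ([], x)
simulate ins (PutLine s next) =
  let (outs, x) = simulate ins next in (s : outs, x)
simulate ins (GetLine k)      = case ins of
  (l : rest) -> simulate rest (k l)
  []         -> simulate [] (k "")  -- assumption: exhausted input reads as ""

-- A small description: read one line, print it twice, return its length.
echoTwice :: Action Int
echoTwice =
  GetLine $ \l ->
    PutLine l $
      PutLine l $
        Done (length l)

main :: IO ()
main = print (simulate ["hi"] echoTwice)
```

Calling `run echoTwice` instead would perform the real I/O. If a module exported `Action` without its constructors and without `run`, user code could build descriptions but never take them apart, which is roughly the division of duties described above.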
All this so far is interesting, but it seems like it would take an awful lot of discipline just for the sake of purity. However, what really makes monads snazzy is that there are some great tricks with syntactic sugar that can help the programmer design these descriptions in much the same way he would write an imperative program. This is Haskell's 'do' syntax. The 'do' syntax doesn't make a purely functional Haskell program imperative, but it sure makes it look a lot like one.
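As a small illustration of the sugar, the two definitions below should describe the same sequence of actions: one uses 'do', the other spells out the `>>=` and `>>` operators that 'do' is translated into. The names `counterDo` and `counterBind` are mine; `newIORef`, `modifyIORef` and `readIORef` are from `Data.IORef` in base.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Looks imperative: create a cell, add one, multiply by ten, read it back.
counterDo :: IO Int
counterDo = do
  ref <- newIORef (0 :: Int)
  modifyIORef ref (+ 1)
  modifyIORef ref (* 10)
  readIORef ref

-- The same description without the sugar. Each `\ref -> ...` is the
-- function that receives the result of the previous action.
counterBind :: IO Int
counterBind =
  newIORef (0 :: Int) >>= \ref ->
    modifyIORef ref (+ 1) >>
      modifyIORef ref (* 10) >>
        readIORef ref

main :: IO ()
main = do
  a <- counterDo
  b <- counterBind
  print (a, b)
```

Both versions are still single expressions of type `IO Int`; the 'do' block only changes how the chain of functions is written down. Since the steps run in order, each result is (0 + 1) * 10.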
Still, a monad is nothing more than a data type with a couple of particular kinds of functions defined on it. In the case of I/O, those functions stuff things into the monad, combine monads and get information out of the monad. If you use monads for other things, it might be worthwhile to think of those functions as serving other purposes. Haskell's Monad type class simply abstracts the features of all these so that algorithms used on one can often be used on all the others... even if it does obscure the original interpretation of what's going on.
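For example, here is one tiny "algorithm" written once against the `Monad` class and then used with three different monads. `pairUp` is a made-up name for this sketch; everything else is from the Prelude.

```haskell
-- Written once, for any monad: run the first thing, then the second,
-- and hand back both results.
pairUp :: Monad m => m a -> m b -> m (a, b)
pairUp ma mb = do
  a <- ma
  b <- mb
  pure (a, b)

main :: IO ()
main = do
  -- Maybe: "running" means checking that a value is present.
  print (pairUp (Just (1 :: Int)) (Just 'x'))
  print (pairUp (Just (1 :: Int)) (Nothing :: Maybe Char))
  -- List: "running" means trying every choice.
  print (pairUp [1, 2 :: Int] "ab")
  -- IO: "running" means performing the actions in order.
  p <- pairUp (pure "io") (pure True)
  print p
```

The code of `pairUp` never changes, but what "then" means depends entirely on the monad, which is exactly why the I/O interpretation can feel obscured once you generalize.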