Haskell does have side effects, just like any other useful programming language. However, you can't put side effects just anywhere in a Haskell program; you have to explicitly specify which functions have side effects.
This is done through the type system: every function that can have side effects must have a return type of the form IO x (i.e. IO<x> in C++/Java notation). This way, you (and the compiler) can be sure that any function whose type DOESN'T involve IO is side effect free, which allows easy parallelization, many kinds of optimization, etc. In other words, all side effects are put in a separate "side effect bin".
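As a rough sketch of what this looks like in practice, here are three small definitions you can load into GHCi. The names double, printDouble and doubleLoudly are made up for illustration; only putStrLn, show and return come from the standard Prelude.

```haskell
-- A pure function: no IO in the type, so it cannot perform side effects.
double :: Int -> Int
double x = x * 2

-- An action with a side effect (printing): the result type is wrapped in IO.
-- IO () means "has side effects, produces no useful value".
printDouble :: Int -> IO ()
printDouble x = putStrLn (show (double x))

-- An action that has a side effect AND produces a value: IO Int.
doubleLoudly :: Int -> IO Int
doubleLoudly x = do
  putStrLn ("doubling " ++ show x)
  return (double x)
```

Just by reading the signatures you can tell that double is safe to reorder, cache or run in parallel, while the other two are not.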
Another big thing to understand is that Haskell makes it impossible (well, not quite: there are escape hatches, but using them is strongly discouraged) to return a value from IO back to the pure part of the program. Any computation that may depend on the result of some side effect is considered to have side effects of its own; removing a value from the "side effect bin" is itself a side effect. Essentially, side effect code can call both side effect code and pure code, but pure code can only call other pure code.
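A minimal sketch of that calling rule, assuming two hypothetical definitions shout and shoutInput (getLine and toUpper are standard library functions):

```haskell
import Data.Char (toUpper)

-- Pure: the result depends only on the argument.
shout :: String -> String
shout s = map toUpper s ++ "!"

-- Impure: the result depends on what the user types, so the type is IO String.
shoutInput :: IO String
shoutInput = do
  line <- getLine       -- taking a value out of the "bin" is only allowed inside IO
  return (shout line)   -- IO code calls pure code; the result stays in the bin

-- The other direction does not type-check, because a pure function
-- cannot run an IO action:
--
-- broken :: String
-- broken = shout getLine   -- type error: getLine is an IO String, not a String
```

Note that shoutInput never hands back a plain String. Anything computed from getLine stays wrapped in IO, which is exactly the "side effects are contagious" behaviour described above.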
Since every useful program has at least some side effects (reading input, writing output), every Haskell program has a main function which has the type IO () (i.e. it has side effects and returns no useful value). The main function can then call the rest of the program, just like in other programming languages.
To keep this style of programming from being a total pain in the ass, Haskell uses monads (a separate, more general concept that is used for lots of other things as well) to make it easy to compose smaller IO functions into larger, higher-level IO functions.
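To give a feel for that composition, here is a small sketch. The names announce, measure, measureBoth and measureBoth' are invented for this example; do-notation and the operators >>= and >> are the standard monad machinery.

```haskell
-- Two small IO actions.
announce :: String -> IO ()
announce msg = putStrLn ("[log] " ++ msg)

measure :: String -> IO Int
measure s = do
  announce ("measuring " ++ show s)
  return (length s)

-- A larger action built out of the smaller ones using do-notation.
measureBoth :: String -> String -> IO Int
measureBoth a b = do
  x <- measure a
  y <- measure b
  announce ("total = " ++ show (x + y))
  return (x + y)

-- The same composition written with the monad operators directly;
-- do-notation is just nicer syntax for this.
measureBoth' :: String -> String -> IO Int
measureBoth' a b =
  measure a >>= \x ->
  measure b >>= \y ->
  announce ("total = " ++ show (x + y)) >>
  return (x + y)
```

A program's main could then simply run measureBoth with two strings and print the result, and it would never need to know how many smaller actions are hidden inside.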