I'm not sure how you got to three. Four symbols can be represented with 2 bits (log2(4) = 2), so that should be a factor of 2.
Let me know if I'm forgetting some other factor.
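To make the factor of 2 concrete, here is a minimal sketch that packs bytes into bases at 2 bits per base. The mapping A=00, C=01, G=10, T=11 and the function names are my own assumptions for illustration, not anything standard:

```python
import math

# Hypothetical mapping, chosen only for illustration: A=00, C=01, G=10, T=11.
BASES = "ACGT"

def bits_per_symbol(alphabet_size: int) -> float:
    """Information carried by one symbol of an alphabet of the given size."""
    return math.log2(alphabet_size)

def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def decode(seq: str) -> bytes:
    """Inverse of encode: every group of four bases becomes one byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

print(bits_per_symbol(4))        # 2.0
print(encode(b"A"))              # CAAC (0x41 = 01 00 00 01)
print(decode(encode(b"hello")))  # b'hello'
```

One byte always takes four bases here, so the same data needs half as many symbols as it has bits.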
Yes, additional base pairs could be used to further improve data density.
On the other point, I almost made the same mistake. The fact that the bases only appear in pairs doesn't matter, because it's the sequence that encodes the information, and any position in the sequence can hold either pair in either orientation:
A-T, T-A, C-G, G-C. So it's still base 4. (The exception would be if, for example, A and C always appeared on the left and T and G always on the right. I'm not a DNA expert either, but as far as I know that isn't the case.)
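A quick way to check the "still base 4" claim is to enumerate short sequences and count the distinct paired molecules. This is only a sketch under a simplifying assumption: each position is treated as a left/right pair and strand direction is ignored (as far as I understand, real strands run in opposite directions, which this does not model). The names are hypothetical:

```python
from itertools import product

# Standard pairing rule: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def paired_positions(strand: str) -> list[str]:
    """Write each position as a left-right pair, e.g. 'A' -> 'A-T'."""
    return [f"{base}-{COMPLEMENT[base]}" for base in strand]

def count_distinct_duplexes(length: int) -> int:
    """Count distinct paired sequences of the given length."""
    seen = set()
    for bases in product("ACGT", repeat=length):
        seen.add(tuple(paired_positions("".join(bases))))
    return len(seen)

print(paired_positions("ATCG"))    # ['A-T', 'T-A', 'C-G', 'G-C']
print(count_distinct_duplexes(3))  # 64, i.e. 4**3
```

The right-hand base is fully determined by the left-hand one, so it adds no information, but the left-hand base is still free to be any of four values at every position.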