At first glance this seems like a strange question. Isn't science precisely the quantification of observations into a theory or model and then using that to make predictions? Yes. And are those predictions in different cases then tested against observations again and again to either validate those models or generate ideas for potential improvements? Yes, again. So the fact that climate modelling was recently singled out as being somehow non-scientific seems absurd.
-snip-
The 20th century, though, still provides the test that appears to be most convincing. That is to say, the models are run over the whole period, with our best guesses for what the forcings were, and the results are compared to the observed record. If by leaving out the anthropogenic effects you fail to match the observed record, while if you include them, you do, you have a quick-and-dirty way to do 'detection and attribution'. (There is a much bigger literature that discusses more subtle and powerful ways to do D&A, so this isn't the whole story by any means.) The most quoted example of this is from the Stott et al. (2000) paper shown in the figure. Similar results can be found in simple models (Crowley, 2000) and in more up-to-date models (Meehl et al., 2004).
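As a rough illustration of the with-and-without comparison described above, here is a minimal Python sketch. It is not the method used in Stott et al. (2000) or any of the other cited papers: real detection and attribution works with model ensembles, internal variability and spatial patterns. The function names (`rmse`, `better_match`) are hypothetical, and the three short series are invented placeholder numbers, not observations or model output.

```python
import math


def rmse(simulated, observed):
    """Root-mean-square error between two equal-length series."""
    if len(simulated) != len(observed):
        raise ValueError("series must have the same length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(observed))


def better_match(observed, natural_only, all_forcings):
    """Report which run is closer to the observations, plus both errors.

    On an exact tie the natural-only run is reported.
    """
    err_nat = rmse(natural_only, observed)
    err_all = rmse(all_forcings, observed)
    label = "all_forcings" if err_all < err_nat else "natural_only"
    return label, err_nat, err_all


# Placeholder temperature anomalies (deg C), invented purely for illustration.
observed = [0.0, 0.1, 0.2, 0.1, 0.3, 0.6]
natural_only = [0.0, 0.1, 0.2, 0.1, 0.1, 0.1]  # hypothetical run without anthropogenic forcings
all_forcings = [0.0, 0.1, 0.1, 0.1, 0.3, 0.5]  # hypothetical run with them included

label, err_nat, err_all = better_match(observed, natural_only, all_forcings)
print(label, round(err_nat, 3), round(err_all, 3))
```

A single error score like this ignores uncertainty in both the forcings and the observations, which is part of why the more formal D&A literature exists.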
It's important to note that if the first attempt to validate the model fails (e.g. the signal is too weak or too strong, or the spatial pattern is unrealistic), this leads to a re-examination of the physics of the model. This may then lead to additional changes, for example, the incorporation of ozone feedbacks to solar changes, or the calculation of vegetation feedbacks to orbital forcing, which in each case improved the match to the observations. Sometimes, though, it is the observations that turn out to be wrong. For instance, for the Last Glacial Maximum, model-data mismatches highlighted by Rind and Peteet (1985) for the tropical sea surface temperatures have subsequently been more or less resolved in favour of the models.
So, in summary, the model results are compared to data, and if there is a mismatch, both the data and the models are re-examined. Sometimes the models can be improved, sometimes the data were misinterpreted. Every time this happens and we get improved matches between them, we have a little more confidence in the models' projections for the future, and we go out and look for better tests. That is in fact pretty close to the textbook definition of science.
http://www.realclimate.org/index.php?p=100#more-100