Preamble: The paper was accepted as a short paper for CSMR 2009, although it was of course submitted as a full paper. We decided to turn down the offer because (i) the paper was not meant as a short paper; (ii) we could not decide how to derive a useful short paper from it; (iii) we think that we ended up in the short-paper category for unfortunate reasons: (a) review 1 is quite superficial (if not content-free) in our humble opinion; (b) review 3 recommended rejection solely on the grounds that the "introduction to grammar convergence" would need to be published before this paper, which it meanwhile has been. Conclusion: the best course of action seemed to be to waive the offer for a short paper and to improve the paper anyway.

Ralf Laemmel 8.2.2009

------------------------------------------------------------

Changes compared to the CSMR 2009 submission.

Overall, this new version is a full rewrite, not so much based on detailed reviewer feedback but rather because 4 additional months went into the project, and because the overall tone of the CSMR notification showed us that we need to explain and illustrate some parts much better.

The title has changed, too: the old title was a pun (an insider joke) that was explained by a footnote attached to it. Still, two reviewers struggled with it, and so we gave up on this pun.

The ratio of extraction to transformation was changed in favor of transformation: we agree with CSMR reviewer 2 that the extraction part is certainly not the part with the strongest impact or the highest level of originality. Accordingly, we have increased the level of detail on the transformation approach and scaled down the discussion of the extraction approach, while preserving what we think is an original design of a non-classic parser that may also be helpful for other extraction efforts. We also used 2 additional pages for details on transformations.

Added details of the transformation language: we categorize the kinds of transformations (refactoring, optimization, extension, correction, recovery) and illustrate all categories with appropriate samples of operator applications.

Introduction and conclusion: both have been rewritten from scratch to better explain the impact and importance of this work and the overall approach. In the conclusion, we also clarify what is missing and hence left for future work.

Related work: at some level of abstraction, this Java effort is comparable to our earlier work on grammar recovery (mainly Cobol), as well as to other people's work on grammar recovery; we have done a better job of explaining this relationship. We have also explained how grammar comparison relates to similar operations in the ER/relational and model-driven communities. The main insight here is that we have not yet used any sophisticated matching, but such matching may be the central tool for bringing more automation (inference) to grammar convergence.