The Office Binary Translator to OpenXML for Word documents, which we developed during the first half of 2008, was already more than a prototype or proof of concept: a large number of binary Word documents could be translated to OpenXML without any loss of information, and in some cases our resulting documents were even better than those created by the converter integrated in Office 2007.

Although our translator is already quite mature, a number of more complex and less frequently used features are not yet translated, for example:

  • frame paragraphs lose their floating properties and are translated to normal paragraphs
  • OLE objects are currently not supported and are lost after translation
  • SmartArt graphics, charts and comments are currently not supported and are lost after translation
  • macros are currently not translated

In addition, some features have not yet been completely implemented:

  • 55 of about 200 different shape types are currently supported
  • Track Changes: Due to its high complexity, the revision marking (track changes) feature is not yet completely implemented; however, paragraph and character property modifications are implemented
  • a number of bugs have been reported which are not yet fixed

We are going to tackle all these features and bugs in Phase II of the Binary Translator project.

Hopefully, our work will be made easier by the improved specification of the binary formats that Microsoft released in June (we will keep you informed about our findings).

Our schedule: the project now starts in September, with two intermediate milestones planned for the beginning of October and the beginning of November. The final version is planned for the beginning of December. In addition to unit testing, we are going to run extensive test routines to ensure a stable, high-quality translator release.

Several weeks of hard coding and testing lie ahead of us – let’s roll up our sleeves and get it done …