First of all, I want to thank the Open XML team for building such a wonderful SDK. It reads documents like a charm, and it feels so native that it is like being inside a Word document with all its contents. Hats off to the developers.
I am new to OOXML and its formats, so I need your help understanding it in order to develop an application that reads documents using the SDK (v2.5).
I am using Visual Studio 2013 Express Edition and have installed the Open XML libraries along with Power Tools and Simple OOXML. I am able to read documents using the following code (my questions are included as TODO comments):
    // Required namespaces:
    using System.IO;
    using System.IO.Packaging;
    using System.Linq;
    using System.Text;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    /// <summary>
    /// Open the word document to process package readonly.
    /// </summary>
    /// <param name="filePath">Path to the word document</param>
    /// <returns>success/failure message</returns>
    public static string OpenWordprocessingPackageReadonly(string filePath)
    {
        if (filePath == null)
        {
            return string.Format("Invalid FilePath.. Check your entries... FilePath: {0}", filePath);
        }

        // Open the System.IO.Packaging.Package read-only, then open a
        // WordprocessingDocument based on that package. The using blocks
        // close both even if an exception is thrown.
        using (Package wordPackage = Package.Open(filePath, FileMode.Open, FileAccess.Read))
        using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(wordPackage))
        {
            MainDocumentPart mainDocumentPart = wordDocument.MainDocumentPart;

            // Assign a reference to the existing document body.
            Body body = mainDocumentPart.Document.Body;

            int tableIndex = 0;
            // Note: Descendants<T>() also returns nested tables, rows and cells.
            foreach (Table table in body.Descendants<Table>())
            {
                int tableRowIndex = 0;
                foreach (TableRow tableRow in table.Descendants<TableRow>())
                {
                    int tableCellIndex = 0;

                    // Loop through each cell to get the content of the cell.
                    foreach (TableCell tableCell in tableRow.Descendants<TableCell>())
                    {
                        // TODO Can we convert these tables into images?
                        foreach (Table nestedTable in tableCell.Descendants<Table>())
                        {
                            // Table should be converted to image.
                            System.Diagnostics.Trace.WriteLine(
                                string.Format("Found Table Object in row: {0}, cell: {1}", tableRowIndex, tableCellIndex));
                            // Run a macro to convert this.
                        }

                        // TODO Can we convert these paragraphs to HTML or XHTML?
                        foreach (Paragraph paragraph in tableCell.Descendants<Paragraph>())
                        {
                            StringBuilder buildText = new StringBuilder();
                            foreach (Run run in paragraph.Descendants<Run>())
                            {
                                // TODO Can we convert these embedded objects into png images?
                                foreach (EmbeddedObject embObject in run.Descendants<EmbeddedObject>())
                                {
                                    // Convert this to png images.
                                }

                                // TODO Can we convert these pictures (assuming these are not already png) into png images?
                                foreach (Picture picture in run.Descendants<Picture>())
                                {
                                    // Process the picture object.
                                    // Process round rectangle as well.
                                    // Process drawingml for Normal Pictures, Auto-Shapes, Charts
                                    System.Diagnostics.Trace.WriteLine(
                                        string.Format("Found Picture Object in row: {0}, cell: {1}", tableRowIndex, tableCellIndex));
                                    //picture.Descendants<>
                                }

                                // Collect only the text whose direct parent is a run.
                                foreach (Text text in run.Descendants<Text>().Where(t => t.Parent is Run))
                                {
                                    buildText.Append(text.Text);
                                }
                            }
                        }

                        tableCellIndex += 1;
                    }

                    // Goto the next row
                    tableRowIndex += 1;
                }

                tableIndex += 1;
            }
        }

        return "Document processed successfully.";
    }
I have the following questions:
1. Is there a better way to read the document than the above? The document contains two or more tables, and each table has 15-20 columns. Each cell can contain any type of content, including tables, shapes, auto-shapes, images, bullets and numbering, and symbols.
2. Is it possible to convert WMF, EMF, EMF+, and embedded OLE objects to PNG or bitmap images?
3. Is it possible to convert tables into images? The reason I want to do this is that these tables contain a lot of formatted data, which would be lost if I read them as HTML or XML. That is why I want to convert them directly to images.
4. Is it possible to convert a paragraph to XHTML or HTML so that I can render it in the user interface?
These questions have been bothering me for the past 3 days, and I am struggling to find an answer. Any help from this forum is much appreciated.
Please guide me, either with code directly or with links to code that will help me learn.
Let me know,
Thanks,
Triguna