Channel: Open XML Format SDK forum

Dealing with w:object (EmbeddedObject) using Open XML Format


First of all, I want to thank the Open XML team for building such a wonderful SDK. It reads documents like a charm, and it feels so native that it is like being inside a Word document with all its contents. Hats off to the developers.

I am new to Open XML and its formats, so I need your help understanding it in order to develop an application that reads documents using the SDK (v2.5).

I am using Visual Studio 2013 Express Edition and have installed the Open XML SDK libraries along with Power Tools and Simple OOXML. I am able to read documents using the following code (my questions are included as TODO comments):

using System.IO;
using System.IO.Packaging;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

/// <summary>
/// Opens the Word document package read-only and walks its tables.
/// </summary>
/// <param name="filePath">Path to the Word document</param>
/// <returns>Success/failure message</returns>
public static string OpenWordprocessingPackageReadonly(string filePath)
{
	if (string.IsNullOrEmpty(filePath))
	{
		return string.Format("Invalid FilePath.. Check your entries... FilePath: {0}", filePath);
	}

	int tableIndex = 0;

	// Open the System.IO.Packaging.Package read-only, then a WordprocessingDocument based on it.
	// The using blocks close both, even if an exception is thrown.
	using (Package wordPackage = Package.Open(filePath, FileMode.Open, FileAccess.Read))
	using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(wordPackage))
	{
		MainDocumentPart mainDocumentPart = wordDocument.MainDocumentPart;

		// Assign a reference to the existing document body.
		Body body = mainDocumentPart.Document.Body;

		foreach (Table table in body.Descendants<Table>())
		{
			int tableRowIndex = 0;
			foreach (TableRow tableRow in table.Descendants<TableRow>())
			{
				int tableCellIndex = 0;

				// Loop through each cell to get the content of the cell.
				foreach (TableCell tableCell in tableRow.Descendants<TableCell>())
				{
					// TODO Can we convert these tables into images?
					foreach (Table tableInstance in tableCell.Descendants<Table>())
					{
						// Table should be converted to image.
						System.Diagnostics.Trace.WriteLine(
							string.Format("Found Table Object in row: {0}, cell: {1}", tableRowIndex, tableCellIndex));
						// Run a macro to convert this.
					}

					// TODO Can we convert these paragraphs to HTML or XHTML?
					foreach (Paragraph paragraph in tableCell.Descendants<Paragraph>())
					{
						StringBuilder buildText = new StringBuilder();

						foreach (Run run in paragraph.Descendants<Run>())
						{
							// TODO Can we convert these embedded objects into png images?
							foreach (EmbeddedObject embObject in run.Descendants<EmbeddedObject>())
							{
								// Convert this to png images.
							}

							// TODO Can we convert these pictures (assuming these are not already png) into png images?
							foreach (Picture picture in run.Descendants<Picture>())
							{
								// Process the picture object.
								// Process round rectangle as well.
								// Process drawingml for Normal Pictures, Auto-Shapes, Charts
								System.Diagnostics.Trace.WriteLine(
									string.Format("Found Picture Object in row: {0}, cell: {1}", tableRowIndex, tableCellIndex));
								//picture.Descendants<>
							}

							// Only collect text whose direct parent is a run.
							foreach (Text text in run.Descendants<Text>().Where(t => t.Parent is Run))
							{
								buildText.Append(text.Text);
							}
						}
					}
					tableCellIndex += 1;
				}
				// Go to the next row
				tableRowIndex += 1;
			}
			// Go to the next table
			tableIndex += 1;
		}
	}

	return string.Format("Processed {0} table(s) in: {1}", tableIndex, filePath);
}

I have the following questions:

1. Is there a better way to read the document than the above? The document contains two or more tables, and each table has 15-20 columns. Each cell can contain any type of content, including tables, shapes, auto-shapes, images, bullets and numbering, and symbols.

2. Is it possible to convert WMF, EMF, EMF+, and embedded OLE objects to PNG or bitmap images?

3. Is it possible to convert tables into images? The reason I want to do this is that these tables contain a lot of formatted data that I would lose if I read them as HTML or XML. That is why I want to convert them directly to images.

4. Is it possible to convert a paragraph to XHTML or HTML so that I can render it in the user interface?
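
To make question 2 more concrete, here is a rough sketch of the direction I have in mind. It rests on several assumptions: that each w:object stores its preview picture as a v:imagedata element whose r:id points to an image part of the main document part, that System.Drawing (GDI+ on Windows) can decode that preview (WMF/EMF), and that saving the preview is good enough, since this does not render the OLE object itself. The names EmbeddedPreviewExporter, SavePreviewsAsPng, and outputFolder are placeholders of mine, not part of the SDK.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using V = DocumentFormat.OpenXml.Vml;

public static class EmbeddedPreviewExporter // hypothetical helper name
{
	// Saves the preview image of every w:object in the body as a PNG file
	// and returns the number of files written.
	public static int SavePreviewsAsPng(string filePath, string outputFolder)
	{
		int saved = 0;

		// Open the document read-only (second argument: isEditable = false).
		using (WordprocessingDocument doc = WordprocessingDocument.Open(filePath, false))
		{
			MainDocumentPart mainPart = doc.MainDocumentPart;

			foreach (EmbeddedObject embedded in mainPart.Document.Body.Descendants<EmbeddedObject>())
			{
				// Assumption: the preview is referenced by v:imagedata r:id.
				foreach (V.ImageData imageData in embedded.Descendants<V.ImageData>())
				{
					if (imageData.RelationshipId == null)
					{
						continue;
					}

					ImagePart imagePart = mainPart.GetPartById(imageData.RelationshipId.Value) as ImagePart;
					if (imagePart == null)
					{
						continue;
					}

					// Copy to a MemoryStream so GDI+ gets a seekable stream.
					using (Stream partStream = imagePart.GetStream())
					using (MemoryStream buffer = new MemoryStream())
					{
						partStream.CopyTo(buffer);
						buffer.Position = 0;

						// For WMF/EMF data, Image.FromStream yields a Metafile;
						// Save rasterizes it at its default size.
						using (Image image = Image.FromStream(buffer))
						{
							string target = Path.Combine(outputFolder, string.Format("embedded{0}.png", saved));
							image.Save(target, ImageFormat.Png);
							saved += 1;
						}
					}
				}
			}
		}

		return saved;
	}
}
```

I would call it as EmbeddedPreviewExporter.SavePreviewsAsPng(filePath, outputFolder) with an existing output folder. I am not sure whether the default raster size is good enough for metafiles, or whether objects in headers and footers need their own part lookup, so corrections are welcome.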

These questions have been bothering me for the past 3 days, and I am struggling to find answers. Any help from this forum is much appreciated.

Please guide me, either with direct help in code or with links to code that will help me learn.

Let me know,

Thanks,

Triguna



