Channel: Open XML Format SDK forum

Dealing with w:object(EmbeddedObject) using Open XML Format


First of all, I want to thank the Open XML team for building such a wonderful SDK. It reads documents like a charm, and it feels so native that it is like being inside a Word document with all its contents. Hats off to the developers.

I am new to OOXML and its formats, so I need your help understanding it in order to develop an application that reads documents using the SDK (v2.5).

I am using Visual Studio 2013 Express Edition and have installed the OOXML libraries along with the Power Tools and Simple OOXML. I am able to read documents using the following code (my questions are included as TODO comments within it):

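using System.IO.Packaging;                   // Package
using System.Linq;                           // Where
using System.Text;                           // StringBuilder
using DocumentFormat.OpenXml.Packaging;      // WordprocessingDocument, MainDocumentPart
using DocumentFormat.OpenXml.Wordprocessing; // Body, Table, Paragraph, Run, ...

/// <summary>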
/// Open the word document to process package readonly.
/// </summary>
/// <param name="filePath">Path to the word document</param>
/// <returns>success/failure message</returns>
public static string OpenWordprocessingPackageReadonly(string filePath)
{
	// Reject both null and empty paths before touching the file system.
	if (string.IsNullOrEmpty(filePath))
	{
		return string.Format("Invalid FilePath.. Check your entries... FilePath: {0}", filePath);
	}

	// Open System.IO.Packaging.Package with read-only access, matching the method's intent.
	Package wordPackage = Package.Open(filePath, System.IO.FileMode.Open, System.IO.FileAccess.Read);

	// Open a WordprocessingDocument based on a package
	using (WordprocessingDocument wordDocument = WordprocessingDocument.Open(wordPackage))
	{
		MainDocumentPart mainDocumentPart = wordDocument.MainDocumentPart;

		// Assign a reference to the existing document body.
		Body body = mainDocumentPart.Document.Body;

		foreach (Table table in body.Descendants<Table>())
		{
			int tableRowIndex = 0;
			foreach (TableRow tableRow in table.Descendants<TableRow>())
			{
				int tableCellIndex = 0;

				// Loop through each cell to get the content of the cell.
				foreach (TableCell tableCell in tableRow.Descendants<TableCell>())
				{
					// TODO Can we convert these tables into images?
					foreach (Table tableInstance in tableCell.Descendants<Table>())
					{
						// Table should be converted to image.
						System.Diagnostics.Trace.WriteLine(
							string.Format("Found Table Object in row: {0}, cell: {1}", tableRowIndex, tableCellIndex));
						// Run a macro to convert this.
					}

					// TODO Can we convert these paragraphs to HTML or XHTML?
					foreach (Paragraph paragraph in tableCell.Descendants<Paragraph>())
					{
						StringBuilder buildText = new StringBuilder();

						foreach (Run run in paragraph.Descendants<Run>())
						{
							// TODO Can we convert these embedded objects into PNG images?
							foreach (EmbeddedObject embObject in run.Descendants<EmbeddedObject>())
							{
								// Convert this to PNG images.
							}

							// TODO Can we convert these pictures (assuming these are not already PNG) into PNG images?
							foreach (Picture picture in run.Descendants<Picture>())
							{
								// Process the picture object.
								// Process round rectangle as well.
								// Process drawingml for Normal Pictures, Auto-Shapes, Charts
								System.Diagnostics.Trace.WriteLine(
									string.Format("Found Picture Object in row: {0}, cell: {1}", tableRowIndex, tableCellIndex));
								//picture.Descendants<>
							}
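							// Only collect text whose direct parent is a run.
							foreach (Text text in run.Descendants<Text>().Where(t => t.Parent is Run))
							{
								buildText.Append(text.Text);
							}
						}

						// Trace the collected paragraph text so that it is not silently discarded.
						System.Diagnostics.Trace.WriteLine(buildText.ToString());
					}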
					tableCellIndex += 1;
				}
				// Goto the next row
				tableRowIndex += 1;
			}
		}

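	}

	// Close the package.
	wordPackage.Close();

	// The method is declared to return a string, so report the outcome to the caller.
	return string.Format("Finished reading document: {0}", filePath);
}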
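
A note on the traversal above: Descendants<T>() walks the whole subtree, so a nested table is reached once through its parent cell and again as an item of body.Descendants<Table>(), and the row and cell indexes also count rows and cells that belong to nested tables. The following is only a minimal sketch of one possible alternative, assuming that rows and cells are direct children of their table and row (rows or cells wrapped in content controls would be skipped). It uses Elements<T>(), which returns direct children only. The class name TableWalker and the method name DirectCells are hypothetical and introduced purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TableWalker
{
    // Hypothetical helper: yields (row index, cell index, cell) for the rows and
    // cells that belong directly to the given table, ignoring nested tables.
    public static IEnumerable<Tuple<int, int, TableCell>> DirectCells(Table table)
    {
        int rowIndex = 0;

        // Elements<T>() returns direct children only, unlike Descendants<T>().
        foreach (TableRow row in table.Elements<TableRow>())
        {
            int cellIndex = 0;
            foreach (TableCell cell in row.Elements<TableCell>())
            {
                yield return Tuple.Create(rowIndex, cellIndex, cell);
                cellIndex++;
            }
            rowIndex++;
        }
    }
}
```

Nested tables can still be reached on purpose by calling cell.Elements<Table>() on each returned cell and passing the result back into the same helper, which keeps the indexes of each nesting level separate.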

I have the following questions:

1. Is there a better way to read the document than the above? The document contains two or more tables, and each table has 15-20 columns. Each cell can contain any type of content, including nested tables, shapes, auto-shapes, images, bullets and numbering, and symbols.

2. Is it possible to convert WMF, EMF, EMF+ and embedded OLE objects to PNG or bitmap images?

3. Is it possible to convert tables into images? The reason I want to do this is that these tables contain a lot of formatted data, which would be lost if I read them as HTML or XML. That is why I want to convert them directly to images.

4. Is it possible to convert a paragraph to XHTML or HTML so that I can render it in the user interface?

These questions have been bothering me for the past three days, and I am struggling to find answers. Any help from this forum is much appreciated.

Please guide me, either with direct help in code or with links to code that will help me learn.

Let me know,

Thanks,

Triguna



