Editing ODS files in memory

Jul 2, 2008 at 12:39 PM
I have an OpenOffice Calc file named 1.ods.
I want to extract it, edit content.xml, and create a 2.ods file containing the modified version of content.xml.
OpenOffice says the output file is corrupted. I then tried simply extracting and recompressing the entries without modifying anything, but the result is the same.
Could you help me please?

Thanks!

private void EditAndZip()
{
    using (Ionic.Utils.Zip.ZipFile templateFile = Ionic.Utils.Zip.ZipFile.Read("1.ods"))
    using (MemoryStream memStream = new MemoryStream())
    {
        XmlNamespaceManager xmlNsMgr = new XmlNamespaceManager(new NameTable());
        xmlNsMgr.AddNamespace("office", "urn:oasis:names:tc:opendocument:xmlns:office:1.0");
        xmlNsMgr.AddNamespace("style", "urn:oasis:names:tc:opendocument:xmlns:style:1.0");
        xmlNsMgr.AddNamespace("text", "urn:oasis:names:tc:opendocument:xmlns:text:1.0");
        xmlNsMgr.AddNamespace("table", "urn:oasis:names:tc:opendocument:xmlns:table:1.0");
        xmlNsMgr.AddNamespace("draw", "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0");
        xmlNsMgr.AddNamespace("fo", "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0");
        xmlNsMgr.AddNamespace("xlink", "http://www.w3.org/1999/xlink");
        xmlNsMgr.AddNamespace("dc", "http://purl.org/dc/elements/1.1/");
        xmlNsMgr.AddNamespace("meta", "urn:oasis:names:tc:opendocument:xmlns:meta:1.0");
        xmlNsMgr.AddNamespace("number", "urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0");
        xmlNsMgr.AddNamespace("presentation", "urn:oasis:names:tc:opendocument:xmlns:presentation:1.0");
        xmlNsMgr.AddNamespace("svg", "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0");
        xmlNsMgr.AddNamespace("chart", "urn:oasis:names:tc:opendocument:xmlns:chart:1.0");
        xmlNsMgr.AddNamespace("dr3d", "urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0");
        xmlNsMgr.AddNamespace("math", "http://www.w3.org/1998/Math/MathML");
        xmlNsMgr.AddNamespace("form", "urn:oasis:names:tc:opendocument:xmlns:form:1.0");
        xmlNsMgr.AddNamespace("script", "urn:oasis:names:tc:opendocument:xmlns:script:1.0");
        xmlNsMgr.AddNamespace("ooo", "http://openoffice.org/2004/office");
        xmlNsMgr.AddNamespace("ooow", "http://openoffice.org/2004/writer");
        xmlNsMgr.AddNamespace("oooc", "http://openoffice.org/2004/calc");
        xmlNsMgr.AddNamespace("dom", "http://www.w3.org/2001/xml-events");
        xmlNsMgr.AddNamespace("xforms", "http://www.w3.org/2002/xforms");
        xmlNsMgr.AddNamespace("xsd", "http://www.w3.org/2001/XMLSchema");
        xmlNsMgr.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");
   
        XmlDocument xmlDoc = new XmlDocument();
        templateFile.Extract("content.xml", memStream);
        memStream.Position = 0;
        xmlDoc.Load(memStream);
   
        string locationPath = "//office:spreadsheet/table:table[@table:name=\"" + "Foglio1" +
                              "\"]//table:table-row[" + "5" + "]/table:table-cell[" + "5" + "]";
        XmlNode cell = xmlDoc.SelectSingleNode(locationPath, xmlNsMgr);
        cell.Attributes["office:value"].Value = "90";
   
        memStream.Position = 0;
        xmlDoc.Save(memStream);
   
        File.Copy("1.ods", "OutDN.ods", true);
        using (Ionic.Utils.Zip.ZipFile outputFile = new Ionic.Utils.Zip.ZipFile("OutDN.ods"))
        {
            memStream.Position = 0;
            outputFile.UpdateFileStream("content.xml", "", memStream);
            outputFile.Save();
        }
    }
}


private void UnzipAndZip()
{
    using (ZipFile templateFile = ZipFile.Read("1.ods"))
    using (MemoryStream memStream = new MemoryStream())
    using (ZipFile outputFile = new ZipFile("2.ods"))
    {
        foreach (ZipEntry zipEntry in templateFile)
        {
            memStream.Position = 0;
            zipEntry.Extract(memStream);
            memStream.Position = 0;
            outputFile.UpdateFileStream(zipEntry.FileName, null, memStream);
        }
        outputFile.Save();
    }
}
Coordinator
Jul 2, 2008 at 3:21 PM
Luigi, I don't know anything about ODS files.  I've never seen one. 
I assume the .ods is just a zip file, but I haven't found the spec that states that.

Coordinator
Jul 2, 2008 at 3:38 PM

OK, I looked briefly at the ODF spec. It says:

OpenDocument uses a package file to store the XML content of a document together with its associated binary data, and to optionally compress the XML content. This package is a standard Zip file, whose structure is discussed below.

Information about the files contained in the package is stored in an XML file called the manifest file. The manifest file is always stored at the pathname META-INF/manifest.xml. The main pieces of information stored in the manifest are as follows:

  • A list of all of the files in the package.
  • The media type of each file in the package.
  • If a file stored in the package is encrypted, the information required to decrypt the file is stored in the manifest.

It seems to me that in both cases you tried, you are not creating the META-INF/manifest.xml file properly. In the first case you don't create it at all. In the second case you extract it from the original .ods file, but then add it blindly to the root directory of the new archive, so that its path is manifest.xml, not META-INF/manifest.xml. The offending line is the call to ZipFile.UpdateFileStream(), where you specify null for the DirectoryPathInArchive parameter. This is not correct: it will not produce a "clone" of the original .ods file, but a "flattened" zip file in which every entry sits in the root directory of the archive. You need to add the content to the new archive so that its directory structure mirrors that of the original .ods file.

This is just my quick feedback from a brief look at the spec and at your code. This seems to be one problem, but there may be others. I am not an expert on the .ods spec, so other pitfalls may await you. I suggest you examine a working .ods file in a zip file viewer. Rename an .ods file to .zip, then double-click it on a Windows machine. Examine the directory structure and file content. Then do the same with the .ods files you are trying to create with DotNetZip. You may be able to see the differences immediately and fix the problem.
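The same comparison can also be done in code. The following is only a sketch: it relies on ZipFile.Read and ZipEntry.FileName as they are used elsewhere in this thread, and the class name CompareEntries and the two command-line arguments are placeholders invented for illustration. It prints the entry names that appear in one archive but not in the other, which should expose a flattened or incomplete directory structure.

```csharp
using System;
using System.Collections.Generic;
using Ionic.Utils.Zip;

// Hypothetical helper, not part of DotNetZip.
static class CompareEntries
{
    // Collects the entry names (including any path inside the archive).
    static List<string> EntryNames(string path)
    {
        List<string> names = new List<string>();
        using (ZipFile zip = ZipFile.Read(path))
        {
            foreach (ZipEntry entry in zip)
            {
                names.Add(entry.FileName);
            }
        }
        return names;
    }

    // args[0]: a working .ods file, args[1]: the .ods file you generated.
    static void Main(string[] args)
    {
        List<string> original = EntryNames(args[0]);
        List<string> rebuilt = EntryNames(args[1]);

        foreach (string name in original)
        {
            if (!rebuilt.Contains(name))
                Console.WriteLine("missing in rebuilt: " + name);
        }
        foreach (string name in rebuilt)
        {
            if (!original.Contains(name))
                Console.WriteLine("extra in rebuilt: " + name);
        }
    }
}
```

If the two archives list the same names and OpenOffice still rejects the file, the difference is more likely in the entry contents than in the directory structure.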

Jul 3, 2008 at 8:15 AM
I had already tried to rename *.ods to *.zip and see the contents.

In the first case I use File.Copy because I assumed that the UpdateFileStream method would update only the entry passed as an argument and leave the others unmodified.

In the second case I pass null, but in the debugger you can see that zipEntry.FileName includes the path inside the zip, so the original zip structure is cloned.

I found the solution this morning (unfortunately I had to use many memory streams, since resetting the position or setting the length of a single stream produces a wrong zip file).

Thank you for your support and your great and easy to use library!

Bye, Luigi.


private void MemUnzipAndZip()
{
    File.Delete("Out.ods");

    using (ZipFile templateFile = ZipFile.Read("In.ods"))
    using (ZipFile outputFile = new ZipFile("Out.ods"))
    {
        // One stream per entry: each stream must stay open until Save() has run.
        MemoryStream[] memStream = new MemoryStream[templateFile.EntryFilenames.Count];
        int i = 0;

        foreach (ZipEntry zipEntry in templateFile)
        {
            memStream[i] = new MemoryStream();
            zipEntry.Extract(memStream[i]);
            memStream[i].Position = 0;
            // zipEntry.FileName already includes the path inside the archive.
            outputFile.AddFileStream(zipEntry.FileName, null, memStream[i]);
            i++;
        }

        outputFile.Save();

        // The using block disposes outputFile; only the streams need closing here.
        for (i = 0; i < memStream.Length; i++)
        {
            memStream[i].Close();
        }
    }
}
Coordinator
Jul 3, 2008 at 6:03 PM

Ahh, yes, Luigi. I was thinking that null for the DirectoryPathInArchive parameter was the same as "", but it is not.

The I/O does not occur until you call Save(), which means you need the memory stream for each updated entry at the time of Save().

Another way to do it would be to have a single memory stream and many calls to Save(), but that seems inefficient in a different way.
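Putting the thread together, the original goal (edit content.xml and write a new .ods) might look roughly like the sketch below. It is untested and uses only the DotNetZip calls that already appear in this thread (ZipFile.Read, ZipEntry.Extract, AddFileStream with null, Save). The class name OdsEditSketch, the method EditContentAndZip and its parameters are made up for illustration. It assumes AddFileStream behaves as in Luigi's working example, and it writes the modified XML to a fresh MemoryStream instead of rewinding the one that was extracted.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using Ionic.Utils.Zip;

// Hypothetical helper, not part of DotNetZip.
static class OdsEditSketch
{
    // Copies every entry of inputPath into outputPath and, for content.xml,
    // sets the office:value attribute of the cell selected by cellXPath.
    public static void EditContentAndZip(string inputPath, string outputPath,
                                         string cellXPath, string newValue)
    {
        File.Delete(outputPath);

        XmlNamespaceManager ns = new XmlNamespaceManager(new NameTable());
        ns.AddNamespace("office", "urn:oasis:names:tc:opendocument:xmlns:office:1.0");
        ns.AddNamespace("table", "urn:oasis:names:tc:opendocument:xmlns:table:1.0");

        // Every stream handed to AddFileStream stays open until after Save().
        List<MemoryStream> streams = new List<MemoryStream>();
        try
        {
            using (ZipFile input = ZipFile.Read(inputPath))
            using (ZipFile output = new ZipFile(outputPath))
            {
                foreach (ZipEntry entry in input)
                {
                    MemoryStream data = new MemoryStream();
                    streams.Add(data);
                    entry.Extract(data);
                    data.Position = 0;

                    if (entry.FileName == "content.xml")
                    {
                        XmlDocument doc = new XmlDocument();
                        doc.PreserveWhitespace = true; // avoid re-indenting the document
                        doc.Load(data);

                        XmlNode cell = doc.SelectSingleNode(cellXPath, ns);
                        if (cell != null && cell.Attributes["office:value"] != null)
                        {
                            cell.Attributes["office:value"].Value = newValue;
                        }

                        // Fresh stream for the edited XML, so no stale bytes remain.
                        data = new MemoryStream();
                        streams.Add(data);
                        doc.Save(data);
                        data.Position = 0;
                    }

                    output.AddFileStream(entry.FileName, null, data);
                }

                output.Save(); // the streams are only read here
            }
        }
        finally
        {
            foreach (MemoryStream s in streams)
            {
                s.Dispose();
            }
        }
    }
}
```

With the XPath from the first post, a call would pass "In.ods", "Out.ods", the locationPath string built there, and "90". Holding all entries in memory until Save() is the trade-off discussed above: simple, but it costs memory proportional to the uncompressed size of the document.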