Structured Storage 1

 

 

 

 

 

 

The program examples were compiled using the Visual C++ 6.0 compiler on a Windows XP Pro machine with Service Pack 2. Topics and subtopics for this tutorial are listed below. Don’t forget to read Tenouk’s small disclaimer. The supplementary notes for this tutorial are statstg, Thread.h, ReadThread.cpp and WriteThread.cpp.

  1. Intro

  2. Compound Files

  3. Storages and the IStorage Interface

  4. Getting an IStorage Pointer

  5. Freeing STATSTG Memory

  6. Enumerating the Elements in a Storage Object

  7. Sharing Storages Among Processes

  8. Streams and the IStream Interface

  9. IStream Programming

  10. The ILockBytes Interface

 

Intro

 

Like Automation and Uniform Data Transfer, structured storage is one of those COM features that you can use effectively by itself. Of course, it's also behind much of the ActiveX technology, particularly compound documents.

In this module, you'll learn to write and read compound files with the IStorage and IStream interfaces. The IStorage interface is used to create and manage structured storage objects. IStream is used to manipulate the data contained by the storage object. The IStorage and IStream interfaces, like all COM interfaces, are simply virtual function declarations. Compound files, on the other hand, are implemented by code in the Microsoft Windows OLE32 DLL. Compound files represent a Microsoft file I/O standard that you can think of as "a file system inside a file." When you're familiar with IStorage and IStream, you'll move on to the IPersistStorage and IPersistStream interfaces. With the IPersistStorage and IPersistStream interfaces, you can program a class to save and load objects to and from a compound file. You say to an object, "Save yourself," and it knows how.

 

Compound Files

 

This MFC tutorial discusses four options for file I/O. You can read and write whole sequential files (like the MFC archive files you first saw in Module 11). You can use a database management system. You can write your own code for random file access. Finally, you can use compound files. Think of a compound file as a whole file system within a file. Figure 1 shows a traditional disk directory as supported by early MS-DOS systems and by Microsoft Windows. This directory is composed of files and subdirectories, with a root directory at the top. Now imagine the same structure inside a single disk file. The files are called streams, and the directories are called storages. Each is identified by a name of up to 31 wide characters (32 including the string terminator). A stream is a logically sequential array of bytes, and a storage is a collection of streams and sub-storages.
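To make the "file system inside a file" picture concrete, here is a minimal portable C++ sketch of the hierarchy described above. It is only an in-memory model and does not use the COM API: the Storage struct, its members, and the helper functions are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// A stream is a logically sequential array of bytes.
using Stream = std::vector<std::uint8_t>;

// A storage is a collection of named streams and named sub-storages,
// much like a directory holding files and subdirectories.
struct Storage {
    std::map<std::wstring, Stream> streams;
    std::map<std::wstring, std::unique_ptr<Storage>> substorages;
};

// Returns the named sub-storage, creating it if it does not exist yet.
Storage& openOrCreateSubstorage(Storage& parent, const std::wstring& name) {
    std::unique_ptr<Storage>& slot = parent.substorages[name];
    if (!slot) slot.reset(new Storage());
    return *slot;
}

// Counts every stream at or below the given storage (recursive walk).
std::size_t countStreams(const Storage& stg) {
    std::size_t n = stg.streams.size();
    for (const auto& entry : stg.substorages)
        n += countStreams(*entry.second);
    return n;
}

// Fills a root storage with an index stream and two module sub-storages
// that hold paragraph streams.
void buildSampleDocument(Storage& root) {
    root.streams[L"Index"] = Stream{1, 2, 3};
    Storage& module1 = openOrCreateSubstorage(root, L"Module1");
    module1.streams[L"Paragraph1"] = Stream{'H', 'i'};
    module1.streams[L"Paragraph2"] = Stream{};
    Storage& module2 = openOrCreateSubstorage(root, L"Module2");
    module2.streams[L"Paragraph1"] = Stream{'O', 'K'};
}

// Small self-check showing how the pieces fit together.
void demoStorageTree() {
    Storage root;
    buildSampleDocument(root);
    assert(root.substorages.size() == 2);
    assert(countStreams(root) == 4);  // Index plus three paragraphs
}
```

Calling demoStorageTree() builds the tree and checks its shape. In a real compound file the same shape lives inside one disk file and is reached through IStorage and IStream pointers rather than through C++ containers.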

 

Figure 1: A disk directory with files and subdirectories.

 


 

 

 

A storage can contain other storages, just as a directory can contain subdirectories. In a disk file, the bytes aren't necessarily stored in contiguous clusters. Similarly, the bytes in a stream aren't necessarily contiguous in their compound file. They just appear that way. Storage and stream names cannot contain the characters /, \, :, or !. If the first character has an ASCII value of less than 32, the element is marked as managed by some agent other than the owner.

You can probably think of many applications for a compound file. The classic example is a large document composed of modules and paragraphs within modules. The document is so large that you don't want to read the whole thing into memory when your program starts, and you want to be able to insert and delete portions of the document. You could design a compound file with a root storage that contains sub-storages for modules. The module sub-storages would contain streams for the paragraphs. Other streams could be for index information.

One useful feature of compound files is transactioning. When you start a transaction for a compound file, all changes are written to a temporary file. The changes are made to your file only when you commit the transaction.
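The naming rules above are easy to check before a name is ever handed to COM. The following portable C++ sketch does that under a few stated assumptions: checkElementName() and the NameCheck enum are hypothetical helpers (not part of any Windows header), the 31-character limit is the one quoted in this tutorial, and rejecting an empty name is an extra assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Possible outcomes of the check (hypothetical names for illustration).
enum class NameCheck { Ok, Empty, TooLong, IllegalCharacter, ReservedPrefix };

// Applies the compound-file naming rules described in the text.
NameCheck checkElementName(const std::wstring& name) {
    const std::size_t kMaxLength = 31;  // excludes the string terminator

    if (name.empty())                   // assumption: empty names are refused
        return NameCheck::Empty;
    if (name.size() > kMaxLength)
        return NameCheck::TooLong;
    if (name.find_first_of(L"/\\:!") != std::wstring::npos)
        return NameCheck::IllegalCharacter;
    // A first character below 32 marks the element as managed by some
    // agent other than the owner, so ordinary code should not use it.
    if (static_cast<unsigned long>(name[0]) < 32)
        return NameCheck::ReservedPrefix;
    return NameCheck::Ok;
}

// Small self-check with one example per rule.
void demoNameCheck() {
    assert(checkElementName(L"MyStreamName") == NameCheck::Ok);
    assert(checkElementName(L"Module1/Para") == NameCheck::IllegalCharacter);
    assert(checkElementName(std::wstring(32, L'x')) == NameCheck::TooLong);
}
```

A caller would typically run this check on user-supplied names and report the specific failure, instead of decoding an error HRESULT after the fact.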

 

Storages and the IStorage Interface

 

If you have a storage object, you can manipulate it through the IStorage interface. Pay attention to these functions because the Microsoft Foundation Class (MFC) library offers no support for storage access. The IStorage interface inherits the methods of the standard COM interface IUnknown (QueryInterface(), AddRef() and Release()). In addition, IStorage defines the following methods.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

| Method/Function | Description |
| --- | --- |
| CreateStream() | Creates and opens a stream object with the specified name contained in this storage object. The name must not exceed 31 characters in length (not including the string terminator). Characters 0x00 through 0x1F, when used as the first character of the stream/storage name, are reserved for use by OLE. This is a compound file restriction, not a structured storage restriction. |
| OpenStream() | Opens an existing stream object within this storage object using the specified access permissions in grfMode. The name must not exceed 31 characters in length (not including the string terminator). Characters 0x00 through 0x1F, when used as the first character of the stream/storage name, are reserved for use by OLE. This is a compound file restriction, not a structured storage restriction. |
| CreateStorage() | Creates and opens a new storage object within this storage object. The name must not exceed 31 characters in length (not including the string terminator). Characters 0x00 through 0x1F, when used as the first character of the stream/storage name, are reserved for use by OLE. This is a compound file restriction, not a structured storage restriction. |
| OpenStorage() | Opens an existing storage object with the specified name according to the specified access mode. The name must not exceed 31 characters in length (not including the string terminator). Characters 0x00 through 0x1F, when used as the first character of the stream/storage name, are reserved for use by OLE. This is a compound file restriction, not a structured storage restriction. |
| CopyTo() | Copies the entire contents of this open storage object into another storage object. The layout of the destination storage object may differ. |
| MoveElementTo() | Copies or moves a substorage or stream from this storage object to another storage object. |
| Commit() | Reflects changes for a transacted storage object to the parent level. |
| Revert() | Discards all changes that have been made to the storage object since the last commit operation. |
| EnumElements() | Returns an enumerator object that can be used to enumerate the storage and stream objects contained within this storage object. |
| DestroyElement() | Removes the specified storage or stream from this storage object. |
| RenameElement() | Renames the specified storage or stream in this storage object. |
| SetElementTimes() | Sets the modification, access, and creation times of the indicated storage element, if supported by the underlying file system. |
| SetClass() | Assigns the specified CLSID to this storage object. |
| SetStateBits() | Stores up to 32 bits of state information in this storage object. |
| Stat() | Returns the STATSTG structure for this open storage object. |

 

Table 1.

 

Following are some of the important member functions and their significant parameters.

 

HRESULT Commit(…);

 

Commits all the changes to this storage and to all elements below it.

 

HRESULT CopyTo(…, IStorage* pStgDest);

 

Copies a storage, with its name and all its sub-storages and streams (recursively), to another existing storage. Elements are merged into the target storage, replacing elements with matching names.

 

HRESULT CreateStorage(const WCHAR* pName, …, DWORD mode, …, IStorage** ppStg);

 

Creates a new sub-storage under this storage object.

 

HRESULT CreateStream(const WCHAR* pName, …, DWORD mode, …, IStream** ppStream);

 

Creates a new stream under this storage object.

 

HRESULT DestroyElement(const WCHAR* pName);

 

Destroys the named storage or stream that is under this storage object. A storage cannot destroy itself.

 

HRESULT EnumElements(…, IEnumSTATSTG** ppEnumStatstg);

 

Returns an enumerator that iterates through all the storages and streams under this storage object. The IEnumSTATSTG interface has Next(), Skip(), Reset(), and Clone() member functions, as do other COM enumerator interfaces.

 

HRESULT MoveElementTo(const WCHAR* pName, IStorage* pStgDest, const WCHAR* pNewName, DWORD flags);

 

Moves an element from this storage object to another storage object.

 

HRESULT OpenStream(const WCHAR* pName, …, DWORD mode, …, IStream** ppStream);

 

Opens an existing stream object, designated by name, under this storage object.

 

HRESULT OpenStorage(const WCHAR* pName, …, DWORD mode, …, IStorage** ppStg);

 

Opens an existing substorage object, designated by name, under this storage object.

 

ULONG Release(void);

 

Decrements the reference count. If the storage is a root storage representing a disk file, Release() closes the file when the reference count goes to 0.

 

HRESULT RenameElement(const WCHAR* pOldName, const WCHAR* pNewName);

 

Assigns a new name to an existing storage or stream under this storage object.

 

HRESULT Revert(void);

 

Abandons a transaction, leaving the compound file unchanged.

 

HRESULT SetClass(const CLSID& clsid);

 

Inserts a 128-bit class identifier into this storage object. This ID can then be retrieved with the Stat() function.

 

HRESULT Stat(STATSTG* pStatstg, DWORD flag);

 

Fills in a STATSTG structure with useful information about the storage object, including its name and class ID.
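The Commit() and Revert() descriptions above can be pictured with a small portable C++ model. This is only a sketch of the idea that changes accumulate in a scratch copy until they are committed: TransactedStore and its members are hypothetical names, and the real compound file implementation uses a temporary file rather than a second std::map.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a storage opened in transacted mode.
class TransactedStore {
public:
    // Changes touch only the scratch copy (the "temporary file").
    void write(const std::string& name, const std::string& data) {
        working_[name] = data;
    }
    void destroy(const std::string& name) { working_.erase(name); }

    // Publishes every pending change at once.
    void commit() { committed_ = working_; }

    // Abandons every change made since the last commit.
    void revert() { working_ = committed_; }

    // What a reader of the committed file would see.
    bool committedHas(const std::string& name) const {
        return committed_.count(name) != 0;
    }
    std::string committedData(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it =
            committed_.find(name);
        return it == committed_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> committed_;
    std::map<std::string, std::string> working_;
};

// Small self-check: nothing is visible until commit, and revert undoes
// uncommitted work.
void demoTransaction() {
    TransactedStore store;
    store.write("Paragraph1", "first draft");
    assert(!store.committedHas("Paragraph1"));   // not committed yet
    store.commit();
    assert(store.committedData("Paragraph1") == "first draft");

    store.write("Paragraph1", "bad edit");
    store.revert();                              // abandon the bad edit
    store.commit();
    assert(store.committedData("Paragraph1") == "first draft");
}
```

The point of the design is that a failure or a change of mind between commits leaves the committed content exactly as it was.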

 

Getting an IStorage Pointer

 

Where do you get the first IStorage pointer? COM gives you the global function StgCreateDocfile() to create a new structured storage file on disk and the function StgOpenStorage() to open an existing file. Both of these set a pointer to the file's root storage. Here's some code that creates a new storage file named MyStore.stg (replacing any existing file of that name, because STGM_CREATE is specified) and then creates a new sub-storage:

 

IStorage* pStgRoot = NULL;
IStorage* pSubStg = NULL;

// STGM_CREATE replaces any existing file or element of the same name.
if (::StgCreateDocfile(L"MyStore.stg",
    STGM_READWRITE | STGM_SHARE_EXCLUSIVE | STGM_CREATE, 0, &pStgRoot) == S_OK) {
    if (pStgRoot->CreateStorage(L"MySubstorageName",
        STGM_READWRITE | STGM_SHARE_EXCLUSIVE | STGM_CREATE, 0, 0, &pSubStg) == S_OK) {
        // Do something with pSubStg
        pSubStg->Release();
    }
    pStgRoot->Release();  // closes the file when the reference count reaches 0
}

 

 

Freeing STATSTG Memory

 

When you call IStorage::Stat() with a STATFLAG_DEFAULT value for the flag parameter, COM allocates memory for the element name. You must free this memory in a manner compatible with its allocation. COM has its own allocation system that uses an allocator object with an IMalloc interface. You must get an IMalloc pointer from COM, call IMalloc::Free() for the string, and then release the allocator. The enumeration code in the next section illustrates this.

If you want just the element size and type and not the name, you can call Stat() with the STATFLAG_NONAME flag. In that case, no memory is allocated and you don't have to free it. This seems like an irritating detail, but if you don't follow the recipe, you'll have a memory leak.
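One way to make the recipe hard to forget is to hand the returned name to an owner object that frees it automatically. The portable C++ sketch below shows the pattern with stand-ins only: fakeTaskAllocName(), fakeTaskFree(), FakeStat, and fakeStat() are hypothetical substitutes for the COM task allocator, the STATSTG structure, and Stat(), so the example can run anywhere. On Windows the deleter would call the real allocator instead (IMalloc::Free(), or CoTaskMemFree(), which frees memory from the same task allocator).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <memory>

// Stand-ins for the COM task allocator (hypothetical, for illustration).
static int g_outstanding = 0;  // number of allocations not yet freed

wchar_t* fakeTaskAllocName(const wchar_t* text) {
    std::size_t bytes = (std::wcslen(text) + 1) * sizeof(wchar_t);
    wchar_t* copy = static_cast<wchar_t*>(std::malloc(bytes));
    if (copy) {
        std::wcscpy(copy, text);
        ++g_outstanding;
    }
    return copy;
}

void fakeTaskFree(void* p) {
    if (p) {
        std::free(p);
        --g_outstanding;
    }
}

// Deleter that returns the string to the allocator it came from.
struct TaskFreeDeleter {
    void operator()(wchar_t* p) const { fakeTaskFree(p); }
};
typedef std::unique_ptr<wchar_t, TaskFreeDeleter> TaskString;

// Stand-in for STATSTG: only the two fields this sketch needs.
struct FakeStat {
    wchar_t* name;
    unsigned long type;
};

// Stand-in for Stat(): allocates the name only when it is requested,
// mirroring the STATFLAG_DEFAULT versus STATFLAG_NONAME behaviour.
void fakeStat(FakeStat* out, bool wantName) {
    out->type = 1;
    out->name = wantName ? fakeTaskAllocName(L"MyStreamName") : NULL;
}

// Returns the number of leaked allocations after one Stat-style call.
int outstandingAfterStat(bool wantName) {
    FakeStat st;
    fakeStat(&st, wantName);
    {
        TaskString owner(st.name);  // takes ownership of the name
    }                               // name is freed here, if there was one
    return g_outstanding;
}

void demoStatCleanup() {
    assert(outstandingAfterStat(true) == 0);   // name allocated, then freed
    assert(outstandingAfterStat(false) == 0);  // nothing allocated at all
}
```

The owner object frees the name on every exit path, including early returns, which is where a hand-written Free() call tends to be forgotten.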

 

Enumerating the Elements in a Storage Object

 

Following is some code that iterates through all the elements under a storage object, differentiating between sub-storages and streams. The elements are retrieved in a seemingly random sequence, independent of the sequence in which they were created; however, I've found that streams are always retrieved first. The IEnumSTATSTG::Next() function fills in a STATSTG structure that tells you whether the element is a stream or a storage object.

 

IEnumSTATSTG* pEnum = NULL;
IMalloc* pMalloc = NULL;
IStorage* pSubStg = NULL;
STATSTG statstg;
extern IStorage* pStg;  // maybe from OpenStorage

::CoGetMalloc(MEMCTX_TASK, &pMalloc); // assumes AfxOleInit called
VERIFY(pStg->EnumElements(0, NULL, 0, &pEnum) == S_OK);
while (pEnum->Next(1, &statstg, NULL) == S_OK) {
    if (statstg.type == STGTY_STORAGE) {
        if (pStg->OpenStorage(statstg.pwcsName, NULL,
            STGM_READ | STGM_SHARE_EXCLUSIVE, NULL, 0, &pSubStg) == S_OK) {
            // Do something with the substorage
            pSubStg->Release();
        }
    }
    else if (statstg.type == STGTY_STREAM) {
        // Process the stream
    }
    pMalloc->Free(statstg.pwcsName); // free every name to avoid memory leaks
}
pEnum->Release();   // release the enumerator once the loop is done
pMalloc->Release(); // release the allocator after the last Free()

 

Sharing Storages Among Processes

 

If you pass an IStorage pointer to another process, the marshaling code ensures that the other process can access the corresponding storage element and everything below it. This is a convenient way of sharing part of a file. One of the standard data object media types of the TYMED enumeration is TYMED_ISTORAGE, and this means you can pass an IStorage pointer on the clipboard or through a drag-and-drop operation.

 

Streams and the IStream Interface

 

If you have a stream object, you can manipulate it through the IStream interface. Streams are always located under a root storage or a substorage object. Streams grow automatically (in 512-byte increments) as you write to them. An MFC class for streams, COleStreamFile, makes a stream look like a CFile object. That class won't be of much use to us in this module, however. The IStream interface inherits the methods of the standard COM interface IUnknown (QueryInterface(), AddRef() and Release()). Once you have a pointer to IStream, the following methods are available to you for manipulating the stream.

 

| Method | Description |
| --- | --- |
| Read() | Reads a specified number of bytes from the stream object into memory starting at the current seek pointer. |
| Write() | Writes a specified number of bytes into the stream object starting at the current seek pointer. |
| Seek() | Changes the seek pointer to a new location relative to the beginning of the stream, the end of the stream, or the current seek pointer. |
| SetSize() | Changes the size of the stream object. |
| CopyTo() | Copies a specified number of bytes from the current seek pointer in the stream to the current seek pointer in another stream. |
| Commit() | Ensures that any changes made to a stream object open in transacted mode are reflected in the parent storage object. |
| Revert() | Discards all changes that have been made to a transacted stream since the last call to IStream::Commit. |
| LockRegion() | Restricts access to a specified range of bytes in the stream. Supporting this functionality is optional since some file systems do not provide it. |
| UnlockRegion() | Removes the access restriction on a range of bytes previously restricted with IStream::LockRegion. |
| Stat() | Retrieves the STATSTG structure for this stream. |
| Clone() | Creates a new stream object that references the same bytes as the original stream but provides a separate seek pointer to those bytes. |

 

Table 2.

 

The following are some of the important functions used in this tutorial.

 

HRESULT CopyTo(IStream* pStm, ULARGE_INTEGER cb, …);

Copies cb bytes from this stream to the stream pointed to by pStm. ULARGE_INTEGER holds a 64-bit unsigned value that can also be accessed as two 32-bit members, HighPart and LowPart.

 

HRESULT Clone(IStream** ppStm);

 

Creates a new stream object with its own seek pointer that references the bytes in this stream. The bytes are not copied, so changes in one stream are visible in the other.

 

HRESULT Commit(…);

 

Transactions are not currently implemented for streams.

 

HRESULT Read(void* pv, ULONG cb, ULONG* pcbRead);

 

Tries to read cb bytes from this stream into the buffer pointed to by pv. The variable pcbRead indicates how many bytes were actually read.

 

ULONG Release(void);

Decrements the reference count. The stream is closed when the count goes to 0.

 

HRESULT Revert(void);

 

Has no effect for streams.

 

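HRESULT Seek(LARGE_INTEGER dlibMove, DWORD dwOrigin, ULARGE_INTEGER* NewPosition);

Seeks to the specified position in this stream. The dwOrigin parameter specifies the origin (beginning of the stream, current seek pointer, or end of the stream) for the offset given in dlibMove. The new seek pointer position is returned through the NewPosition parameter.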

 

HRESULT SetSize(ULARGE_INTEGER libNewSize);

 

Extends or truncates a stream. Streams grow automatically as they are written, but calling SetSize() can optimize performance.

 

HRESULT Stat(STATSTG* pStatstg, DWORD flag);

 

Fills in the STATSTG structure with useful information about the stream, including the stream name and size. The size is useful if you need to allocate memory for a read.

 

HRESULT Write(void const* pv, ULONG cb, ULONG* pcbWritten);

 

Tries to write cb bytes to this stream from the buffer pointed to by pv. The variable pcbWritten indicates how many bytes were actually written.
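The seek pointer and Clone() behaviour described above can be tried out with a small portable C++ model. This is a sketch only and does not use the COM API: MemStream, Origin, and their members are hypothetical names, sizes are plain std::size_t values rather than ULARGE_INTEGER, and clamping a negative seek target to 0 is a simplification of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Plays the role of the dwOrigin parameter of Seek().
enum class Origin { Begin, Current, End };

class MemStream {
public:
    MemStream()
        : bytes_(std::make_shared<std::vector<std::uint8_t> >()), pos_(0) {}

    // Writes cb bytes at the seek pointer, growing the stream if needed.
    std::size_t write(const void* pv, std::size_t cb) {
        if (cb == 0) return 0;
        if (pos_ + cb > bytes_->size()) bytes_->resize(pos_ + cb);
        std::memcpy(bytes_->data() + pos_, pv, cb);
        pos_ += cb;
        return cb;
    }

    // Reads up to cb bytes; returns how many were actually read.
    std::size_t read(void* pv, std::size_t cb) {
        std::size_t avail = pos_ < bytes_->size() ? bytes_->size() - pos_ : 0;
        std::size_t n = std::min(cb, avail);
        if (n != 0) std::memcpy(pv, bytes_->data() + pos_, n);
        pos_ += n;
        return n;
    }

    // Moves the seek pointer relative to origin; returns the new position.
    std::size_t seek(long long move, Origin origin) {
        long long base = 0;
        if (origin == Origin::Current) base = static_cast<long long>(pos_);
        if (origin == Origin::End) base = static_cast<long long>(bytes_->size());
        long long target = base + move;
        if (target < 0) target = 0;  // simplification: clamp instead of failing
        pos_ = static_cast<std::size_t>(target);
        return pos_;
    }

    // The copy shares the same bytes but carries its own seek pointer.
    MemStream clone() const { return *this; }

private:
    std::shared_ptr<std::vector<std::uint8_t> > bytes_;
    std::size_t pos_;
};

void demoClone() {
    MemStream original;
    original.write("hello", 5);
    MemStream copy = original.clone();

    original.seek(0, Origin::Begin);
    original.write("J", 1);          // change made through the original

    char buf[6] = {0};
    copy.seek(0, Origin::Begin);
    assert(copy.read(buf, 5) == 5);
    assert(std::strcmp(buf, "Jello") == 0);          // visible through the clone
    assert(original.seek(0, Origin::Current) == 1);  // pointers are independent
}
```

Because both objects refer to the same byte array, a clone is a cheap way to read one part of a stream while another part is being written, without the two positions interfering.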

 

IStream Programming

 

Here is some sample code that creates a stream under a given storage object and writes some bytes from m_buffer to the stream:

 

extern IStorage* pStg;
IStream* pStream = NULL;
ULONG nBytesWritten;

if (pStg->CreateStream(L"MyStreamName",
    STGM_CREATE | STGM_READWRITE | STGM_SHARE_EXCLUSIVE, 0, 0, &pStream) == S_OK) {
    ASSERT(pStream != NULL);
    pStream->Write(m_buffer, m_nLength, &nBytesWritten);
    pStream->Release();
}

 

The ILockBytes Interface

 

As already mentioned, the compound file system you've been looking at is implemented in the OLE32 DLL. The structured storage interfaces are flexible enough, however, to permit you to change the underlying implementation. The key to this flexibility is the ILockBytes interface. The ILockBytes interface inherits the methods of the standard COM interface IUnknown (QueryInterface(), AddRef() and Release()). In addition, ILockBytes defines the following methods.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

| Method | Description |
| --- | --- |
| ReadAt() | Reads a specified number of bytes starting at a specified offset from the beginning of the byte array. |
| WriteAt() | Writes a specified number of bytes to a specified location in the byte array. |
| Flush() | Ensures that any internal buffers maintained by the byte array object are written out to the backing storage. |
| SetSize() | Changes the size of the byte array. |
| LockRegion() | Restricts access to a specified range of bytes in the byte array. |
| UnlockRegion() | Removes the access restriction on a range of bytes previously restricted with ILockBytes::LockRegion. |
| Stat() | Retrieves a STATSTG structure for this byte array object. |

 

Table 3.

 

The StgCreateDocfile() and StgOpenStorage() global functions use the default Windows file system. You can write your own file access code that implements the ILockBytes interface and then call StgCreateDocfileOnILockBytes() or StgOpenStorageOnILockBytes() to create or open the file, instead of calling the other global functions.

Rather than implement your own ILockBytes interface, you can call CreateILockBytesOnHGlobal() to get a byte array backed by global memory and then build a compound file in RAM on top of it. If you wanted to put compound files inside a database, you would implement an ILockBytes interface that used the database's blobs (binary large objects).
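The idea behind ILockBytes, a byte array abstraction that the layer above can use without knowing where the bytes live, can be sketched in portable C++. The ByteArray and MemoryBytes classes and the roundTrip() helper below are hypothetical names for illustration; they mirror only ReadAt(), WriteAt(), and SetSize(), and leave out locking, flushing, and Stat().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Simplified analogue of ILockBytes: random access to an array of bytes.
class ByteArray {
public:
    virtual ~ByteArray() {}
    virtual std::size_t readAt(std::size_t offset, void* pv,
                               std::size_t cb) const = 0;
    virtual std::size_t writeAt(std::size_t offset, const void* pv,
                                std::size_t cb) = 0;
    virtual void setSize(std::size_t cb) = 0;
    virtual std::size_t size() const = 0;
};

// One possible backing store: plain memory. A file or a database blob
// could implement the same interface.
class MemoryBytes : public ByteArray {
public:
    std::size_t readAt(std::size_t offset, void* pv,
                       std::size_t cb) const override {
        if (offset >= data_.size()) return 0;
        std::size_t n = std::min(cb, data_.size() - offset);
        if (n != 0) std::memcpy(pv, data_.data() + offset, n);
        return n;  // may be fewer bytes than requested
    }
    std::size_t writeAt(std::size_t offset, const void* pv,
                        std::size_t cb) override {
        if (cb == 0) return 0;
        if (offset + cb > data_.size()) data_.resize(offset + cb);
        std::memcpy(data_.data() + offset, pv, cb);
        return cb;
    }
    void setSize(std::size_t cb) override { data_.resize(cb); }
    std::size_t size() const override { return data_.size(); }

private:
    std::vector<std::uint8_t> data_;
};

// Code written against the interface works with any backing store.
std::string roundTrip(ByteArray& bytes, const std::string& text,
                      std::size_t offset) {
    bytes.writeAt(offset, text.data(), text.size());
    std::string out(text.size(), '\0');
    std::size_t n = bytes.readAt(offset, &out[0], out.size());
    out.resize(n);
    return out;
}

void demoByteArray() {
    MemoryBytes memory;
    assert(roundTrip(memory, "compound", 512) == "compound");
    assert(memory.size() == 520);  // grew to offset 512 plus 8 bytes
}
```

The roundTrip() function never mentions MemoryBytes, which is the design point: swapping in a different ByteArray implementation changes where the bytes are kept without touching the code that reads and writes them.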

 

 

 

 

 

 
