DLDocument Class Reference

Representation of a document.

#include <DLDocument.h>


Public Types

typedef std::list< DLPage * >::iterator DLPagePtrIterator

Public Member Functions

DLPagePtrIterator begin ()
DLPagePtrIterator end ()
bool dlIsPageIteratorValid (DLPagePtrIterator pageIter)
 DLDocument ()
 DLDocument (const DLImage &documentImage, const string &pageID, const string &docID)
 DLDocument (const char *imageFileName, const string &pageID, const string &documentID)
 DLDocument (const DLDocument &right)
virtual ~DLDocument ()
DLDocument & operator= (const DLDocument &right)
string dlGetDocumentID () const
void dlSetDocumentID (const string &docID)
int dlGetNumPages () const
bool dlHasPages () const
void dlAppendPage (DLPage *docPage)
void dlAppendPageList (list< DLPage * > pageList)
void dlInsertPage (DLPagePtrIterator pageIter, DLPage *docPage)
void dlInsertPage (int cursorPosition, DLPage *docPage)
void dlInsertPageList (DLPagePtrIterator pageIter, list< DLPage * > pageList)
void dlInsertPageList (int cursorPosition, list< DLPage * > pageList)
void dlDeletePage (DLPagePtrIterator pageIter)
void dlDeletePage (int cursorPosition)
void dlDeletePage (DLPage *documentPage)
void dlClearPages ()
Tag List Functions
string dlGetTag (string tagKey) const
void dlSetTag (string tagKey, string tagValue, bool overwriteEnabled=false)
void dlDeleteTag (string tagKey)
void dlClearTags ()
bool dlIsTagSet (string tagKey) const
bool dlIsTagListEmpty () const
DLTagList::iterator dlFindTag (string tagKey)

Protected Attributes

string documentID
DLTagList documentTags
list< DLPage * > documentPages
list< DLPage * > pageBackPointers


Detailed Description

Representation of a document.

The DLDocument class is the uppermost level in the DocLib document hierarchy. It is a container for attributes describing the document and for one page or, in the case of multi-page documents, several pages. A DLDocument object has a document ID under which the developer can store a unique index. It also carries a list of document tags, which lets the developer add or remove self-defined attributes at run time. Several constructors are available, depending on the information present at the time of construction. The pages of a document are stored in an STL list. Developers have access to this list via a set of functions for appending, inserting, or deleting either a single page or a list of pages.

Transparent to the users, each DLDocument object keeps track of two kinds of DLPage objects associated with it - pages on the DLDocument, and pages that point to the DLDocument (which get created when copies of DLPages are made). Programmers can traverse through the pages of a document using DLPagePtrIterator as below:

  // ... docPointer is a DLDocument* to a document already populated with pages
  DLDocument::DLPagePtrIterator pageIterator;
  for (pageIterator = docPointer->begin(); pageIterator != docPointer->end(); ++pageIterator) {
    // Dereference once to get the DLPage*, then call through the pointer.
    cout << (*pageIterator)->dlGetPageID() << endl;
  }

Upon its destruction, the DLDocument deletes all pages in the DLDocument. Furthermore, if there exist dangling pages not contained within the DLDocument that still point to the DLDocument, the DLDocument* backpointer of those pages will be set to NULL, as the DLDocument they reference does not exist any more.
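The destruction behaviour can be illustrated with a minimal sketch. SketchDocument and SketchPage are hypothetical stand-ins, not the DocLib classes; the sketch assumes that owned pages appear in both lists and that a copied page registers itself with the document it points to. It uses C++11 syntax for brevity.

```cpp
#include <cassert>
#include <list>

struct SketchDocument;  // hypothetical stand-in for DLDocument

// Hypothetical stand-in for DLPage: holds only a backpointer to its document.
struct SketchPage {
    SketchDocument* parent = nullptr;
};

struct SketchDocument {
    std::list<SketchPage*> pages;         // pages owned by the document
    std::list<SketchPage*> backPointers;  // every page that points at this document

    // Take ownership of a heap-allocated page.
    void append(SketchPage* p) {
        p->parent = this;
        pages.push_back(p);
        backPointers.push_back(p);
    }

    // Register a page that points here without being contained (e.g. a copy).
    void registerDangling(SketchPage* p) {
        p->parent = this;
        backPointers.push_back(p);
    }

    ~SketchDocument() {
        // Detach every page that refers to this document ...
        for (SketchPage* p : backPointers) p->parent = nullptr;
        // ... then free the pages the document actually owns.
        for (SketchPage* p : pages) delete p;
    }
};

// Returns true if a dangling page's backpointer is null after its document dies.
bool danglingPageIsDetached() {
    SketchPage copy;  // lives outside the document
    {
        SketchDocument doc;
        doc.append(new SketchPage());
        doc.registerDangling(&copy);
        assert(copy.parent == &doc);
    }  // doc is destroyed here
    return copy.parent == nullptr;
}
```

Keeping a separate backpointer list is what lets the destructor reach pages it does not own; without it, such pages would be left holding a stale DLDocument*.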

When making a copy of DLDocument, all the pages and zones in the source document will automatically be copied over to the new DLDocument object. However, this is NOT simply a member-wise copy, as a new hierarchy of page and zone objects is actually created recursively that duplicates the page and zone hierarchy of the source DLDocument.

It is advisable to use the new operator to allocate DLPages and DLZones dynamically. The destructors of DLDocument, DLPage, and DLZone release the memory of their associated objects (i.e. all objects below the current object in the DLDocument hierarchy). However, if you use the new operator to create a stand-alone object that is never attached to a hierarchy, you must use the delete operator yourself to reclaim the memory it occupies.

Definition at line 65 of file DLDocument.h.


Member Typedef Documentation

typedef std::list<DLPage*>::iterator DLDocument::DLPagePtrIterator

Iterator which iterates over all the DLPage*'s in the document, which are stored in documentPages. Note that dereferencing this iterator gives a DLPage*, which must be dereferenced again to give an actual DLPage object reference. (See the code example in the class introduction above.)

Definition at line 78 of file DLDocument.h.


Constructor & Destructor Documentation

DLDocument::DLDocument (  ) 

DLDocument Constructor

DLDocument::DLDocument ( const DLImage &  documentImage,
const string &  pageID,
const string &  docID 
)

DLDocument Constructor: creates a one-page document

Parameters:
documentImage image for the document's page
pageID unique string ID of DLPage
docID unique string ID of DLDocument

DLDocument::DLDocument ( const char *  imageFileName,
const string &  pageID,
const string &  documentID 
)

DLDocument Constructor: creates a one-page document from an image file

Parameters:
imageFileName image file name
pageID unique string ID of DLPage
documentID unique string ID of DLDocument

DLDocument::DLDocument ( const DLDocument &  right  ) 

DLDocument Copy Constructor: all the DLPages/DLZones in the document will be copied over to the new DLDocument object. The new object will have a DLDocument hierarchy identical to the source DLDocument, in which each page and zone is a new object. As DLDocument copy is NOT simply a member-wise copy, it will not result in the accidental destruction of the original DLDocument's pages and zones when the destructor of the DLDocument copy is invoked.
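A minimal sketch of the deep-copy idea follows. SketchDocument and SketchPage are hypothetical stand-ins rather than the DocLib classes, zones are left out, and the sketch only shows why a copy that allocates new pages can be destroyed without touching the original's pages.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for DLPage; zones are omitted for brevity.
struct SketchPage {
    std::string id;
    explicit SketchPage(const std::string& pageID) : id(pageID) {}
};

// Hypothetical stand-in for DLDocument that owns heap-allocated pages.
class SketchDocument {
public:
    SketchDocument() {}

    // Deep copy: allocate a new page for every page in the source.
    SketchDocument(const SketchDocument& right) { copyPagesFrom(right); }

    SketchDocument& operator=(const SketchDocument& right) {
        if (this != &right) {  // guard against self-assignment
            clear();
            copyPagesFrom(right);
        }
        return *this;
    }

    ~SketchDocument() { clear(); }

    void append(SketchPage* p) { pages.push_back(p); }
    const std::list<SketchPage*>& getPages() const { return pages; }

private:
    std::list<SketchPage*> pages;

    void copyPagesFrom(const SketchDocument& right) {
        for (const SketchPage* p : right.pages) pages.push_back(new SketchPage(*p));
    }
    void clear() {
        for (SketchPage* p : pages) delete p;
        pages.clear();
    }
};

// Usage: the copy holds equal page IDs stored in distinct objects.
void demoDeepCopy() {
    SketchDocument original;
    original.append(new SketchPage("page1"));
    SketchDocument copy(original);
    const SketchPage* a = original.getPages().front();
    const SketchPage* b = copy.getPages().front();
    assert(a != b);
    assert(a->id == b->id);
}
```

A member-wise copy would instead duplicate the pointers, so both documents would try to delete the same pages on destruction.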

virtual DLDocument::~DLDocument (  )  [virtual]

DLDocument Destructor. Upon its destruction, the DLDocument deletes all DLPages contained in the DLDocument. Furthermore, if there are dangling DLPages that still point to the DLDocument, the DLDocument* backpointer of each of these DLPages is set to NULL, as the DLDocument no longer exists.


Member Function Documentation

DLPagePtrIterator DLDocument::begin (  )  [inline]

Obtain an iterator pointing to the first of all the DLPage*'s in documentPages

Returns:
an iterator pointing to the first of all the DLPage*'s in documentPages

Definition at line 84 of file DLDocument.h.

References documentPages.

DLPagePtrIterator DLDocument::end (  )  [inline]

Obtain an iterator that addresses the location just past the last DLPage* in documentPages. The end() iterator is typically used as the loop bound in a for loop (see the example in the class introduction above). Warning: dereferencing the end() iterator is undefined behaviour and will likely cause your program to crash.

Returns:
an iterator that addresses the location just past the last DLPage* in documentPages

Definition at line 93 of file DLDocument.h.

References documentPages.

bool DLDocument::dlIsPageIteratorValid ( DLPagePtrIterator  pageIter  ) 

Check whether the given iterator is in the valid range

Returns:
true if pageIter lies in the range [begin(), end()), i.e. it refers to a page in the document; false otherwise
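Because std::list iterators are bidirectional and cannot be compared with operator<, a range check of this kind has to walk the list. The helper below is a hypothetical sketch of one way to do that, not the DocLib implementation; it assumes the iterator was obtained from the same list it is checked against.

```cpp
#include <cassert>
#include <list>

// Hypothetical helper: true if `it` refers to an element of `items`,
// i.e. lies in [begin, end). The range is walked linearly because list
// iterators have no ordering comparison. `it` is assumed to come from `items`.
template <typename T>
bool isIteratorInRange(std::list<T>& items, typename std::list<T>::iterator it) {
    for (typename std::list<T>::iterator cur = items.begin(); cur != items.end(); ++cur) {
        if (cur == it) return true;
    }
    return false;  // the walk reached end() without meeting `it`
}

// Usage: begin() is valid for a non-empty list, end() never is.
void demoIteratorCheck() {
    std::list<int> xs;
    xs.push_back(10);
    assert(isIteratorInRange(xs, xs.begin()));
    assert(!isIteratorInRange(xs, xs.end()));
}
```

The check costs time linear in the number of pages, which is worth keeping in mind if it is called inside a loop over the same list.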

DLDocument& DLDocument::operator= ( const DLDocument &  right  ) 

DLDocument Assignment Operator: all the DLPages/DLZones in the source document (right) will be copied over to the DLDocument object being assigned to. After the assignment, that object has a DLDocument hierarchy identical to the source DLDocument, in which each page and zone is a new object. As DLDocument assignment is NOT simply a member-wise copy, it will not result in the accidental destruction of the original DLDocument's pages and zones when the destructor of the DLDocument copy is invoked.

string DLDocument::dlGetDocumentID (  )  const [inline]

Get the unique ID of the document

Returns:
the unique ID of the document

Definition at line 167 of file DLDocument.h.

References documentID.

void DLDocument::dlSetDocumentID ( const string &  docID  )  [inline]

Set the ID string of the document

Parameters:
docID desired unique ID of the document

Definition at line 173 of file DLDocument.h.

References documentID.

int DLDocument::dlGetNumPages (  )  const [inline]

Get the number of pages in the document

Returns:
number of pages

Definition at line 179 of file DLDocument.h.

References documentPages.

bool DLDocument::dlHasPages (  )  const [inline]

Check whether the document contains at least one page

Returns:
true if document contains pages

Definition at line 185 of file DLDocument.h.

References documentPages.

void DLDocument::dlAppendPage ( DLPage *  docPage  ) 

Append a new page to the document at the end of the current list of pages. A DLException is thrown when the DLPage to be appended is already pointing to another DLDocument as its parent. The DLDocument object takes responsibility for freeing the memory occupied by docPage. Therefore, typically docPage should be allocated on the heap using new. Taking the address of a stack object should be done with great care, since the object is automatically destroyed when it goes out of scope, potentially causing a dangling pointer within the document.

 DLDocument d;
 {
   DLPage po("test1.tif", "page1");
   DLPage * pp = new DLPage("test2.tif", "page2");
   d.dlAppendPage(&po); // dangerous; po will soon go out of scope
   d.dlAppendPage(pp); // safer; pp is not destroyed unless delete called
 }
Parameters:
docPage new DLPage to be appended
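The parent check described for dlAppendPage can be sketched as follows. SketchDocument and SketchPage are hypothetical stand-ins, and std::logic_error stands in for DLException; the sketch assumes the check is a simple comparison of the page's backpointer against the receiving document.

```cpp
#include <cassert>
#include <list>
#include <stdexcept>

struct SketchDocument;  // hypothetical stand-in for DLDocument

// Hypothetical stand-in for DLPage with a backpointer to its parent.
struct SketchPage {
    SketchDocument* parent = nullptr;
};

struct SketchDocument {
    std::list<SketchPage*> pages;  // pages owned by this document

    // Append a page, refusing pages that already belong to another document.
    void append(SketchPage* p) {
        if (p->parent != nullptr && p->parent != this) {
            throw std::logic_error("page already belongs to another document");
        }
        p->parent = this;
        pages.push_back(p);
    }

    ~SketchDocument() {
        for (SketchPage* p : pages) delete p;  // the document frees its pages
    }
};

// Usage: the second document rejects a page that the first already owns.
void demoAppendRejected() {
    SketchDocument a, b;
    SketchPage* page = new SketchPage();
    a.append(page);
    bool rejected = false;
    try {
        b.append(page);
    } catch (const std::logic_error&) {
        rejected = true;
    }
    assert(rejected);
    assert(page->parent == &a);  // ownership is unchanged
}
```

Rejecting the second append keeps each page with exactly one owner, so it is deleted exactly once.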

void DLDocument::dlAppendPageList ( list< DLPage * >  pageList  ) 

Append a list of new pages to the document at the end of the current list. An exception occurs when any DLPage to be appended is pointing to another document. The DLDocument object takes responsibility for freeing the memory occupied by the DLPages in pageList.

Parameters:
pageList list of new DLPage*s to be appended to the document at the end of the current list

void DLDocument::dlInsertPage ( DLPagePtrIterator  pageIter,
DLPage *  docPage 
)

Insert a new page into the document at the specified iterator position of the current list. An exception occurs when the position specified is out of range or when the DLPage to be inserted is currently pointing to another document. The DLDocument object takes responsibility for freeing the memory occupied by docPage.

Parameters:
pageIter DLPagePtrIterator pointing to the position in the current page list where the new page is to be inserted
docPage new DLPage to be inserted

void DLDocument::dlInsertPage ( int  cursorPosition,
DLPage *  docPage 
)

Insert a new page into the document at the specified position of the current list. An exception occurs when the DLPage to be inserted is pointing to another document. The DLDocument object takes responsibility for freeing the memory occupied by docPage.

Parameters:
cursorPosition cursor position in the current page list where the new page is to be inserted. For example, cursorPosition == 0 inserts before the first page, and cursorPosition == 1 inserts between the first page and the second page.
docPage new DLPage to be inserted
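The cursor-position convention maps naturally onto std::list::insert, which inserts before the iterator it is given. The helper below is a hypothetical sketch on a plain std::list, not DocLib code; throwing std::out_of_range for a bad position is an assumption, since the reference does not state how this overload reports one.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <stdexcept>
#include <string>

// Hypothetical helper: insert `value` at a zero-based cursor position.
// Position 0 inserts before the first element; position size() appends.
// Any other out-of-range position throws std::out_of_range (an assumption).
template <typename T>
void insertAtCursor(std::list<T>& items, int cursorPosition, const T& value) {
    if (cursorPosition < 0 || cursorPosition > static_cast<int>(items.size())) {
        throw std::out_of_range("cursor position out of range");
    }
    typename std::list<T>::iterator pos = items.begin();
    std::advance(pos, cursorPosition);  // walk to the insertion point
    items.insert(pos, value);           // std::list inserts before `pos`
}

// Usage: cursor position 1 lands between the first and second element.
void demoInsertAtCursor() {
    std::list<std::string> pages;
    pages.push_back("first");
    pages.push_back("second");
    insertAtCursor(pages, 1, std::string("new"));
    std::list<std::string>::iterator it = pages.begin();
    ++it;
    assert(*it == "new");
}
```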

void DLDocument::dlInsertPageList ( DLPagePtrIterator  pageIter,
list< DLPage * >  pageList 
)

Insert a list of new pages into the document starting from the specified iterator position in the current list. An exception occurs when any DLPage to be inserted is pointing to another document. The DLDocument object takes responsibility for freeing the memory occupied by the DLPages in pageList.

Parameters:
pageIter the DLPagePtrIterator position in the current page list where the new pages are to be inserted
pageList list of new DLPage*s to be inserted at the specified iterator position in the current list

void DLDocument::dlInsertPageList ( int  cursorPosition,
list< DLPage * >  pageList 
)

Insert a list of new pages into the document starting from the specified position in the current list. An exception occurs when any DLPage to be inserted is pointing to another document. The DLDocument object takes responsibility for freeing the memory occupied by the DLPages in pageList.

Parameters:
cursorPosition position in the current page list where the new pages are to be inserted. For example, cursorPosition == 0 inserts before the first page, and cursorPosition == 1 inserts between the first page and the second page.
pageList list of new DLPage*s to be inserted at the specified cursor position in the current list

void DLDocument::dlDeletePage ( DLPagePtrIterator  pageIter  ) 

Delete a page at a given iterator position from the document. This will remove the page from the list of pages on the document and the list of page backpointers of the document. An exception occurs when the position specified is out of range.

Parameters:
pageIter DLPagePtrIterator pointing to page to be removed from the page list

void DLDocument::dlDeletePage ( int  cursorPosition  ) 

Delete a page at a given position from the document. This will remove the page from the list of pages on the document and the list of page backpointers of the document. An exception occurs when the position specified is out of range.

Parameters:
cursorPosition position of the page to be removed from the page list; 0 is the first page, 1 is the second page, and so on.

void DLDocument::dlDeletePage ( DLPage *  documentPage  ) 

Delete a specific page from the document. This will remove the page from the list of pages on the document and the list of page backpointers of the document.

Parameters:
documentPage page to be deleted

void DLDocument::dlClearPages (  ) 

Clear the list of pages on the document and remove those pages from the list of page backpointers of the document. This also releases the memory for every child DLPage and DLZone in the hierarchy.

string DLDocument::dlGetTag ( string  tagKey  )  const

Get the field value of a specified document tag

Parameters:
tagKey string key of the specified document tag
Returns:
the field value of a specified document tag

void DLDocument::dlSetTag ( string  tagKey,
string  tagValue,
bool  overwriteEnabled = false 
)

Set a document tag. If a tag with the specified key already exists in this document, it will not be overwritten unless the optional parameter overwriteEnabled is set to true. A DLException is thrown if overwriteEnabled is false and tagKey already exists.

Parameters:
tagKey key of the document tag
tagValue value of the document tag
overwriteEnabled option for overwriting existing field (default is false)
Exceptions:
DL_Exception DL_UNKNOWN_TAG_EXCEPTION
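The overwrite rule can be sketched on a plain std::map. SketchTagMap and setTag are hypothetical names, the real DLTagList may be organised differently, and std::runtime_error stands in for the DocLib exception type.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the tag list: a plain key/value map.
typedef std::map<std::string, std::string> SketchTagMap;

// Set a tag; refuse to replace an existing key unless overwriteEnabled is true.
// std::runtime_error stands in for the DocLib exception type.
void setTag(SketchTagMap& tags, const std::string& tagKey,
            const std::string& tagValue, bool overwriteEnabled = false) {
    if (!overwriteEnabled && tags.count(tagKey) > 0) {
        throw std::runtime_error("tag already set: " + tagKey);
    }
    tags[tagKey] = tagValue;
}

// Usage: a second write to the same key needs overwriteEnabled = true.
void demoSetTag() {
    SketchTagMap tags;
    setTag(tags, "language", "en");
    setTag(tags, "language", "fr", true);
    assert(tags["language"] == "fr");
}
```

Defaulting overwriteEnabled to false makes accidental replacement of an existing tag an error rather than a silent change.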

void DLDocument::dlDeleteTag ( string  tagKey  ) 

Remove the tag with the specified key from the list of document tags.

Parameters:
tagKey key of the tag to remove

void DLDocument::dlClearTags (  )  [inline]

Erase all the existing document tags

Definition at line 328 of file DLDocument.h.

References documentTags, and DLTagList::tagMap.

bool DLDocument::dlIsTagSet ( string  tagKey  )  const

Check whether a document tag with the given key exists

Parameters:
tagKey key of the document tag
Returns:
true if a tag exists with that key

bool DLDocument::dlIsTagListEmpty (  )  const [inline]

Check whether the whole document tag list is empty

Returns:
true if the document tag list is empty

Definition at line 341 of file DLDocument.h.

References documentTags, and DLTagList::tagMap.

DLTagList::iterator DLDocument::dlFindTag ( string  tagKey  )  [inline]

Get an iterator pointing to the location of tagKey in the map

Returns:
an iterator pointing to the location of tagKey in the tag map

Definition at line 347 of file DLDocument.h.

References documentTags, and DLTagList::tagMap.


Member Data Documentation

string DLDocument::documentID [protected]

Definition at line 353 of file DLDocument.h.

Referenced by dlGetDocumentID(), and dlSetDocumentID().

DLTagList DLDocument::documentTags [protected]

Definition at line 356 of file DLDocument.h.

Referenced by dlClearTags(), dlFindTag(), and dlIsTagListEmpty().

list<DLPage*> DLDocument::documentPages [protected]

Definition at line 359 of file DLDocument.h.

Referenced by begin(), dlGetNumPages(), dlHasPages(), and end().

list<DLPage*> DLDocument::pageBackPointers [protected]

Definition at line 363 of file DLDocument.h.


The documentation for this class was generated from the following file:

DOCLIB is being developed under contract by a collaboration between:
The Laboratory for Language and Media Processing
University of Maryland, College Park
and
Booz | Allen | Hamilton

All Rights Reserved, 2003-2007