2.1.3 Document Management Systems

Description: Systems for storage, indexing, location, and retrieval of multiple types of data files, including documents, presentations, graphics, scanned images, recordings, and similar document, audio, and visual files.

Category: 2 - Data Management   Subcategory: 1 - General Purpose Data Management Tools
Old Category: Enterprise Services Document and Multimedia Management




Industry Usage   SC Usage

Performance Metrics

Ease of use, transparency, broad scope of file and document formats, powerful search capability, version control, standard schemas

Usage and Dependencies

Update for 2.1.3 Document Management Systems:

Industry Usage: This category includes products sold under a variety of acronyms, including document management (DM), records management (RM), content management (CM), electronic information management (EIM), digital asset management (DAM), media asset management (MAM), web content management (WCM), knowledge management (KM), and paper, imaging, video and workflow management.

The scope of media, processes and data in a document-related project needs to be clearly defined in order to avoid confusion. Here, a "document" is simply any electronic file that needs to be stored and retrieved by SC users. In this context, no distinction is made regarding the official status of a document. To make this clearer, the term "electronic information management" is preferred. "Document management" in this context can include features such as indexing, content tagging, advanced searching, content discovery and extraction, check-in/check-out, version control, publish and subscribe, access control, collaborative editing, workflow, watermarking, and digital signing.
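As an illustration of two of the features listed above, check-in/check-out and version control, the following is a minimal in-memory sketch. It is not drawn from any product named in this section; the DocumentStore class and its method names are hypothetical and exist only to show the locking and versioning behavior.

```python
class DocumentStore:
    """Hypothetical in-memory store illustrating check-out locks and versions."""

    def __init__(self):
        self._versions = {}  # document name -> list of contents, oldest first
        self._locks = {}     # document name -> user holding the check-out

    def add(self, name, content):
        """Store a new document as version 1."""
        if name in self._versions:
            raise ValueError(f"{name} already exists")
        self._versions[name] = [content]

    def check_out(self, name, user):
        """Lock the document for one user and return its latest content."""
        latest = self._versions[name][-1]
        holder = self._locks.get(name)
        if holder is not None and holder != user:
            raise PermissionError(f"{name} is checked out by {holder}")
        self._locks[name] = user
        return latest

    def check_in(self, name, user, content):
        """Save a new version, release the lock, return the version number."""
        if self._locks.get(name) != user:
            raise PermissionError(f"{name} is not checked out by {user}")
        self._versions[name].append(content)
        del self._locks[name]
        return len(self._versions[name])

    def get(self, name, version=None):
        """Return the latest content, or a specific 1-based version."""
        versions = self._versions[name]
        return versions[-1] if version is None else versions[version - 1]
```

In this sketch a second user cannot check out a document until the first user checks it in, and every check-in preserves the earlier versions rather than overwriting them.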

Records management, in accordance with DoD/DoE 5015.2, requires additional features: the ability to manage a retention schedule, the ability to prove that this retention schedule applies to all documents, and the ability to suspend destruction of documents in the event of legal action. Systems to support RM should be certified as such by NIST or another government agency. Any RM system also depends on clear policies being defined before the system is implemented.
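The three records management requirements above (a retention schedule, proof that the schedule covers every document, and suspension of destruction during legal action) can be sketched as follows. This is an illustrative sketch only, not an implementation of the 5015.2 standard; the record categories, the retention periods, and all function names are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical retention schedule: record category -> years to retain.
RETENTION_YEARS = {"correspondence": 3, "contract": 7}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    legal_hold: bool = False  # set when destruction must be suspended


def destruction_date(record: Record) -> Optional[date]:
    """Return the scheduled destruction date, or None if no schedule applies."""
    years = RETENTION_YEARS.get(record.category)
    if years is None:
        return None
    target_year = record.created.year + years
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on 29 February and the target year is not a leap year.
        return record.created.replace(year=target_year, day=28)


def may_destroy(record: Record, today: date) -> bool:
    """A record may be destroyed only when it is due and not on legal hold."""
    if record.legal_hold:
        return False
    due = destruction_date(record)
    return due is not None and today >= due


def unscheduled(records) -> list:
    """List record IDs with no retention schedule (a coverage check)."""
    return [r.record_id for r in records if r.category not in RETENTION_YEARS]
```

An empty result from unscheduled() is the kind of evidence that the schedule applies to all documents; a legal hold overrides the schedule regardless of the destruction date.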

The increasing needs of users and the effectiveness of the web are causing a migration from unstructured file systems to managed document systems, which might be linked to Web Content Management systems through the use of XML-based metadata standards. However, despite the release of Microsoft's Web Store as an alternative to the NT file system and the continued integration of document management features in Lotus Domino, document management will not become a commodity network service until 2003 or beyond. Independent document and content management services will therefore remain in use in the meantime.

The leading vendors have created expensive, proprietary products and are now working to incorporate open, de facto standards. FileNET, SAROS, PC DOCS, Open Text Corp., and Hummingbird are leaders at the middle and high end of product offerings. Most of these products now have web-enabled versions, but users should recognize that security constraints and intermittent connections can affect any Internet-based solution.

Microsoft has two new products in this category. SharePoint Portal Server is a WebDAV-based file management system consisting of three components: document storage (Web Storage System), an indexing server, and a powerful search engine. This system can handle up to 3.5 million documents in its searchable index, which is controlled through a web portal. Microsoft also recently acquired NCompass Resolution and rebranded it as Content Management Server 2001, a full-featured WCM product that is tightly integrated with Commerce Server 2000 and is well suited for midrange implementations in Microsoft-centric environments.

Image processing is no longer considered an independent data management function. Well-established standards include the TIFF, JPEG, and MPEG (full-motion video) formats. Image file processing must be considered in conjunction with collaborative services, workflow services and document management. Data storage, processing power, video display, and bandwidth all have high-end requirements when image processing is embraced as a key method for data management.

See also Categories 1.1.6 - Office Editing Suites, and 3.3.1 - Portals.

SC Usage: Currently, documents are authored in a variety of tools and formats, mainly Word, HTML and PDF. These formats are not directly interoperable. Most documents are stored in an unstructured file system, with no established naming conventions or policies. The main consequence is difficulty and delay in finding documents. This is a hidden labor and productivity cost which, if measured, would probably be high, based on benchmark data from similar organizations.

In an earlier effort to solve these problems, in 1998 SC purchased a proprietary document management system from Eastman. It managed Exchange email, but it proved too inflexible for general purpose use. Hence SC is skeptical about the repeated purchase of large, expensive systems which claim to solve document management issues.

With the Electronic Information Management project, SC-65 gave document management a high priority in FY 2001-02. EIM has selected SharePoint Portal Server as its technology solution. This system allows any type of document (text, graphics, HTML, .doc, .pdf, etc.) to be stored within a managed, searchable, web-based environment. It facilitates rapid search and retrieval of documents, a primary function of document management, and it will also eliminate the concern about managing disparate file types such as Acrobat .pdf files.

In order to obtain the maximum search speed from this general-purpose document management system, content managers will need to define a set of 'metadata' tags for all new documents (such as a list of keywords). These tags must then be added to all new documents by their authors. This will entail a minor change in the work habits of users, and hence a need for some user education.
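The value of author-supplied metadata tags for search can be sketched with a small keyword index. This is a generic illustration and does not represent how SharePoint Portal Server indexes documents; the required tag names (title, author, keywords) and the function names are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical tag set that content managers might require on new documents.
REQUIRED_TAGS = ("title", "author", "keywords")


def missing_tags(metadata):
    """Return the required tags that are absent or empty for one document."""
    return [tag for tag in REQUIRED_TAGS if not metadata.get(tag)]


def build_keyword_index(documents):
    """Map each lowercased keyword to the set of document paths tagged with it."""
    index = defaultdict(set)
    for path, metadata in documents.items():
        for keyword in metadata.get("keywords", []):
            index[keyword.lower()].add(path)
    return index


def search(index, *keywords):
    """Return the sorted paths of documents tagged with every given keyword."""
    matches = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*matches)) if matches else []
```

With such an index a search is a set lookup on the tags rather than a scan of file contents, which is why consistent tagging by authors matters; missing_tags() shows the kind of check that could remind an author to complete the tags before a document is saved.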

In addition to a Web Portal (Plumtree and/or Microsoft SharePoint Server), Office XP provides a direct interface into the SharePoint Portal Server. Therefore, it has been decided to delay the rollout of EIM until Office XP has been deployed, near the end of calendar 2002.

In the next year or two, such a general-purpose system, based on XML and other open standards, will be ready to integrate further into the data and application architectures of SC, and probably also into the DOE-wide and government XML schema standards that are currently being developed.

The opportunity to deploy these state-of-the-art, flexible solutions has been achieved only through a sustained effort at enterprise architecture planning and implementation in SC. Now it is important to 'stay the course' and maintain consistent deployment of web-based solutions using open standards from the W3C.

SC Application Impacts: Near term: incorporation of Electronic Information Management and workflow features into legacy and new business systems. SCIP can support some of these features. Longer term: increased data standardization via XML for internal and external data and system integration.

