Something I was writing for CETIS in 2002 but never finished up. However, there should be some interesting materials that I might be able to reuse.
- Overview/executive summary
- Introduction
- Why map library and educational technology standards?
- What the MOA2/METS specs are (what,who,why)
- Similarities and differences (high level only)
- Mapping MOA2/METS to IMS Content Packaging
- References
Raymond Yee
Overview/executive summary
Digital objects are being interchanged within communities, and standards are being developed within those communities for intra-community exchange. In the library and educational technology worlds, we have METS and IMS (among others). What I want to focus on is inter-community data exchange between libraries and ed. tech. This article is concerned with the importance and practical state of interoperability (in the exchange of digital objects) between systems in educational technology and those in the digital library and museum worlds, as embodied by some important standards from each. It covers the motivation, the theory, some practical stabs at interoperability, and some of the outstanding challenges.
Introduction
Educational technology standards are the bread and butter of CETIS. There have been many specification development efforts. In this article, I look at the cluster of specifications developed by IMS. The IMS specifications seem to have a fair amount of momentum, both being adopted in their own right and serving as the foundation of SCORM. These specifications have been developed to promote interoperability among software systems in the educational technology space. [1] [2] More specifically, two IMS specifications are key. The first is IMS Content Packaging (IMS-CP): "The IMS Content Packaging Specification provides the functionality to describe and package learning materials, such as an individual course or a collection of courses, into interoperable, distributable packages. Content Packaging addresses the description, structure, and location of online learning materials and the definition of some particular content types. The Content Packaging Specification is aimed primarily at content producers, learning management system vendors, computing platform vendors, and learning service providers. Learning materials described and packaged using the IMS Content Packaging XML format should be interoperable with any tool that supports the Specification. Content creators can develop and distribute material knowing that it can be delivered on any compliant system, thereby protecting their investment in rich content development." [4] The second is the IMS Metadata Specification (IMS-MD), for specifying metadata for these learning packages. [3]
In the library world, there are also many different specifications that facilitate interoperability in the library or "information world" [, #191]. Some of these standards have been around for a long time: Z39.50 and MARC, for instance. METS (Metadata Encoding and Transmission Standard), under active development, is described as follows: "The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library" [6].
METS is in an early phase of development; an editorial board for METS was formed only in May 2002 [7], and tools (viewers, editors) are still in the early stages of development. However, its predecessor, the Making of America II format (MOA2) [5], has already been used effectively by museums for interoperability and storage. ("Several of the institutions which developed the MOAII DTD; the UC Berkeley Library, Cornell University, New York Public Library and others are deploying the standard as well and creating a community of users which can leverage resources, experiences, and content.") [11] METS is being widely adopted by libraries around the U.S. and is being considered for adoption by libraries in Europe (and elsewhere?). One can therefore expect a lot of archival materials to become available in this format.
The two communities mentioned (digital libraries and educational technology) have some common interests but are working on different standards. One question is how those commonalities are expressed in the different standards and whether they can be reconciled. The primary scenario driving the article will be that of someone who wants to use rich content from digital libraries, marked up in METS format, with educational technology software (learning management systems) and in web authoring (the user wants to easily grab hold of digital objects in the library and integrate them into his Web writing environment).
The article will be about the practical issues of interoperability, how to map the various standards to one another, and what software is available today to make use of these objects.
Why map library and educational technology standards?
There are deep connections between libraries and the classroom (and the development of educational materials in general). Libraries are important parts of universities. Teachers send students to do research in libraries and discuss materials from libraries. Library and archival materials find their way into textbooks.
As universities become more digital, there will be increasing use of many types of software systems, including learning management systems (LMSs). Publishers will be creating textbook materials that can be imported into an LMS (coursepacks). Professors are already authoring materials for LMSs. Educational technology standards (most notably ones from IMS) are being devised to promote interoperability among software systems that are consumers and producers of coursepacks.
Research libraries are also becoming more digital. More artifacts are being encoded, shared, and archived all the time. At first, these digital materials are becoming part of research documentation; over time, they will naturally make their way into the classroom -- or specifically into the electronic classroom environment of a learning management system. Then we're faced with the issue of how to get library materials, encoded in the library encoding standards, into educational technology systems (which use educational technology standards). Various standards have been devised to ensure interoperability within communities. A lot of effort is put into making these standards cover the needs of everyone within a given community. So for ed tech standards, which are geared to supporting the reuse and distribution of electronic learning materials, a lot of effort has gone into including many disparate players: the military, software makers of all sorts, publishers, higher education, community colleges. Similarly, in the library community, an effort is made to get input from key players in the library world. The success of standards depends on getting adequately wide input.
Of course, there need to be boundaries on what constitutes a community. If one tries to cover "everyone", one will end up with specifications so general that they are essentially useless. But there is still communication between communities, so sometimes different standards need to interoperate. In the database world, one can perform a cross-walk, figuring out what can be mapped from one domain to another. In the case of the library and educational technology systems, there are matters where the two sides are talking about the same thing but might use different labels. Part of doing a good cross-walk is to identify those cases. There are also elements that have no corresponding elements in the other community. The question then is not how to translate the elements but whether it makes sense to just hold on to them (without trying to interpret them) or to just let them go. Perhaps the trickiest ground is the case where there is no exact translation but there may be a plausible and perhaps context-sensitive translation. What does one do then?
Why would the library and ed. tech standards cross? Consider a number of scenarios.
Perhaps the most important scenario (at least at first) is the library making available and pushing out their content to be used in other contexts: for writing monographs or scholarly papers or to embed in an instructional context.
Another scenario is that of the long-term archiving of course materials by libraries. Libraries have so far not gotten into the business of archiving such course materials, since they seem so perishable. But one can imagine how course materials could evolve into archivable materials. And if the course materials are encoded in IMS format, then we will need to go in the IMS->METS direction. Phillip Long [9] recently described this issue well [I'll want to paraphrase his points]:
-
"Have you looked at the library resources in courseware management systems lately? Look again. There isn’t much of what you might consider "library material" there. What might library materials useful for online learning look like? Examine the richness in the offerings from the library itself and you’ll begin to get an idea."
"To be fair, the reasons library resources are absent from courseware tools aren’t entirely external to the library. Libraries have traditionally operated on the assumption that there is added value for users to come through the library for services. Yet it is becoming clearer all the time that faculty and students may not find the same value proposition. Librarians can and do provide added value to students looking for material from collections as well as from the Web. But people building the courseware infrastructure, as well as the courseware modules, don’t know what services to expect—or in programming terms, to "call"—to integrate library resources, materials, or special functions into their courses."
"Librarians need to think hard about what services they wish to deliver to online environments and clearly articulate how they might be accessed from courseware systems. This requires a radical shift in thinking because "calling" a resource says nothing about the behavior it will exhibit when it appears at its destination. Until libraries begin to think in terms of services they can offer courseware developers, it is not likely they will find a home in these tools."
"It won’t necessarily be for lack of interest, though I don’t see a deep understanding of what library resources are from courseware vendors, either. Nonetheless, libraries must decide on their suite of services and define clear mechanisms by which they can be invoked in support of a learning process or in courseware environments. Doing anything less will only accelerate their disappearance from the experience of our students. That would be the least desirable of the potential future outcomes for online learning."
Clifford Lynch [10] writes about the "changing roles of scholars, teachers, curators, and librarians". Since these roles are changing, one might expect an increasing need for interoperability between the technologies of these various communities.
-
"The focus is on creating large amounts of digital content and providing some fairly simple access tools, rather than upon sophisticated systems for ongoing use or apparatus providing interpretation. Now, what's interesting to me is to contrast this to so much of the public interest rhetoric which speaks not just to raw materials but to learning materials, to the need to package raw content from collections up in various ways such as learning experiences or curated exhibitions or interpretation and analysis. We need to study the lines of demarcation between raw cultural heritage materials, if you will, and interpretation or teaching, or presentations of these materials. This is a boundary line that I don't think we really have a very clear understanding of. It gets to the historic mission differences among museums, libraries and archives, and the growing confusion about those distinctions in the digital world; it involves the historical and perhaps changing roles of scholars, teachers, curators, and librarians. It invites questions about which audiences or user communities we are teaching to use uninterpeted databases of raw cultural heritage materials, and the methods we are teaching them to use in exploiting these resources. We have to ask questions about just how "interpretation-neutral" a collection of raw materials can really be - surely, for example, interpretation creeps into the descriptive metadata; as my friend and colleague Michael Buckland has pointed out, the changes in practice in the construction and assignment of subject headings over the past century is a window into many social changes that have taken place during that period."
Lynch on raw digital materials and interpretations layered on top [10]
-
"But I think that we can identify a series of trends that may lead us to a world of digital collections - databases of relatively raw cultural heritage materials, for example - and then layers of interpretation and presentation built upon these databases and making reference to objects within them. Probably we'll see interpretations that draw from many digital collections, and single digital collections contributing materials to many different interpretations. While I think that libraries, archives, museums, and the higher education community will be among the major creators of digital collections, the creators of presentations and interpretations of materials from these collections will be much more numerous and diverse."
Lynch on the notions of reusability as seen in the library world [10]
-
"We know that we want our digital collections to be reusable, though I suspect that there is little consensus on what reusability really means. I think that we believe that collections of lasting value have the characteristics of reusability. Part of reusability or re-purposing clearly is the ability to contribute, over time, to a large array of interpretations or presentations of materials for many different audiences and purposes within the context I've just described. In essence, it's the ability to have collections be overlayed in various ways. We have very limited experience with reusability and repurposing today. And right now our thinking about overlays is still in its infancy: we think about union catalogs, cross collection finding aids, new teaching or analytical works that make reference to objects in digital collections. As I'll discuss later, I think we are beginning to get a glimpse of much more sophisticated re-use and re-purposing that has deep implications of both markup of digitized objects and the metadata that accompanies them, however. Indeed, accommodating overlay may be too limited a way to describe the full range of repurposing that we'll want to facilitate."
What the MOA2/METS specs are (what,who,why)
Various efforts have taken place in the library world. "The Making of America II is a Digital Library Federation project to create a proposed digital library object standard by encoding defined descriptive, administrative and structural metadata, along with the primary content, inside a digital library object." [5] MOA2 is being evolved into METS. "The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium. The standard is maintained in the Network Development and MARC Standards Office of the Library of Congress, and is being developed as an initiative of the Digital Library Federation." [6]
[I wonder whether I should try to put METS in the context of other library standards. Ideally, I can say in a sentence or two how METS sits in the constellation of library standards without going into too much detail on such matters as Z39.50 and EAD and MARC. I also want to argue that METS is probably the closest to IMS-CP and IMS-MD.]
I certainly will want to talk about EAD and METS. Here's a comparison from Paul Romaine's notes [12]; Guenter Waibel of the Berkeley Art Museum & Pacific Film Archive wrote it on the EAD list, 24 Apr 2002, comparing EAD and METS:
-
METS is an extremely generic xml schema coming out of the library world, but it's much less geared towards a specific type of community practice or type of content than the EAD. A METS document encodes one single Digital Object, which may comprise many multimedia files (image, audio, video). The object typically contains a hierarchical structure (such as the chapter / page structure of a book) pointing toward the surrogates. METS objects may also carry extensive descriptive metadata about the original (often physical) object described, used for discovery; as well as extensive administrative metadata such as technical metadata about the multimedia files, or rights metadata etc. It looks like METS will find wide adoption as a file exchange format, as a means to manage archival digital files and as a way to present digital surrogates.
The IMS specs cover a wide range of topics, but I would say that MOA2/METS covers essentially the same ground as two of the specifications: IMS-CP content packaging ("will make it easier to create reusable content objects that will be useful in a variety of learning systems.") and IMS-MD metadata ("a uniform way for describing learning resources so that they can be more easily found (discovered), using meta-data aware search tools that reflect the unique needs of users in learning situations.").
Similarities and differences (high level only)
IMS is structured towards learning materials. METS is for libraries. What are the common concerns? For example, in the IMS-CP description, "The IMS Content Packaging Specification provides the functionality to describe and package learning materials, such as an individual course or a collection of courses, into interoperable, distributable packages. Content Packaging addresses the description, structure, and location of online learning materials and the definition of some particular content types." [4]
"METS, a Digital Library Federation initiative, attempts to build upon the work of MOA2 and provide an XML document format for encoding metadata necessary for both management of digital library objects within a repository and exchange of such objects between repositories (or between repositories and their users)." [8]
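Both systems represent hierarchy: there are containers ("folders") that hold other things, and there are the items themselves. In METS the hierarchy is expressed by nested div elements in the structural map; in IMS-CP, by nested item elements within an organization.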
There are no metadata standards mandated in either IMS or METS. The best practices for both do recommend metadata frameworks that are appropriate for their respective communities -- and these standards are different. (IEEE-LOM for IMS).
From what I can understand, neither IMS-CP nor METS mandates a particular visual representation (after all, they are XML content formats), but there probably are implicit visual representations. (Take a look at MS LRN or the MOA2 viewers.)
IMS-CP supports the notion of a content package, in which the imsmanifest.xml is a table of contents. One can ship the package of materials around. In METS, I think one can also embed content directly, via the FContent tag. (That means that to do a full METS to IMS-CP translation, we have to unpack FContent tags. I might just say that's out of the scope of this piece of work. There's no FContent tag in MOA2.)
METS makes more of the distinction between different types of metadata: administrative vs. descriptive vs. structural. IMS divides up metadata differently.
METS supports the embedding of non-XML data and metadata (including binaries, encoded in Base64). Non-XML data can definitely be part of an IMS-CP content package, but I don't think that non-XML metadata is supported in IMS. I should mention SCORM here, which builds upon IMS-CP and IMS-MD plus AICC extensions to give some run-time behavior (http://www.adlnet.org/index.cfm?fuseaction=scorm12). METS has behaviors to give some run-time behavior. [I need to write more about this here.]
My high level analysis of similarities and differences
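To make the FContent unpacking step concrete, here is a minimal Python sketch (Python rather than XSLT, purely for illustration). It assumes the METS namespace http://www.loc.gov/METS/ and that an embedded binary sits in a binData child of FContent, Base64-encoded. The sample document, the file IDs, and the extract_embedded_files helper are all made up for this example.

```python
import base64
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # assumed METS namespace

# Build a tiny, hypothetical METS fragment: one embedded file, one referenced file.
payload = base64.b64encode(b"Hello, METS!").decode("ascii")
SAMPLE_METS = f"""<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec>
    <fileGrp>
      <file ID="F1" MIMETYPE="text/plain">
        <FContent><binData>{payload}</binData></FContent>
      </file>
      <file ID="F2" MIMETYPE="image/jpeg">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
      </file>
    </fileGrp>
  </fileSec>
</mets>"""


def extract_embedded_files(mets_xml):
    """Return {file ID: decoded bytes} for each <file> carrying Base64 <binData>."""
    root = ET.fromstring(mets_xml)
    ns = {"m": METS_NS}
    extracted = {}
    for file_el in root.iterfind(".//m:file", ns):
        bin_data = file_el.find("m:FContent/m:binData", ns)
        if bin_data is not None and bin_data.text:
            extracted[file_el.get("ID")] = base64.b64decode(bin_data.text.strip())
    return extracted


embedded = extract_embedded_files(SAMPLE_METS)
for file_id, data in embedded.items():
    print(file_id, len(data), "bytes")
```

In a real translation, each decoded payload would presumably be written out as a file inside the content package and listed in imsmanifest.xml; files that are only referenced (FLocat) need no unpacking.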
If the goal is to support complete, lossless translation in both directions, I think one will be disappointed. One might be able to wrap an object of one standard in the structure of the other, but one would not be able to meaningfully use the materials. If one wants to move the content structure from one system into another, the content-packaging ideas are roughly consonant. Metadata poses interesting problems. If we are content to just copy metadata over from METS to IMS, then we are almost able to do that -- as long as the METS metadata is expressible by an XML schema. Translation takes more effort and a detailed mapping.
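The "just copy the metadata over" option might look something like the following Python sketch. It makes several assumptions: that the METS descriptive metadata is XML wrapped in dmdSec/mdWrap/xmlData, that Dublin Core is the metadata in question, and that the IMS-CP namespace is http://www.imsglobal.org/xsd/imscp_v1p1 (the namespace differs between versions of the specification). The sample record and the copy_descriptive_metadata helper are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"                  # assumed METS namespace
IMSCP_NS = "http://www.imsglobal.org/xsd/imscp_v1p1"  # assumed IMS-CP namespace
DC_NS = "http://purl.org/dc/elements/1.1/"

SAMPLE_METS = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dmdSec ID="DMD1">
    <mdWrap MDTYPE="DC">
      <xmlData>
        <dc:title>Sample Digitized Book</dc:title>
        <dc:creator>Example, Author</dc:creator>
      </xmlData>
    </mdWrap>
  </dmdSec>
</mets>"""


def copy_descriptive_metadata(mets_xml):
    """Copy wrapped XML metadata, uninterpreted, into an IMS-CP <metadata> element."""
    ns = {"m": METS_NS}
    mets_root = ET.fromstring(mets_xml)
    manifest = ET.Element(f"{{{IMSCP_NS}}}manifest", {"identifier": "MANIFEST-1"})
    metadata = ET.SubElement(manifest, f"{{{IMSCP_NS}}}metadata")
    for xml_data in mets_root.iterfind("m:dmdSec/m:mdWrap/m:xmlData", ns):
        for child in xml_data:
            # Deep-copy so the source METS tree is left untouched.
            metadata.append(copy.deepcopy(child))
    return manifest


manifest = copy_descriptive_metadata(SAMPLE_METS)
print(ET.tostring(manifest, encoding="unicode"))
```

Nothing here interprets the metadata: a consumer that expects IEEE-LOM inside the package would still not understand the copied elements, which is exactly why real translation needs a detailed mapping.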
Mapping MOA2/METS to IMS Content Packaging
My work is still preliminary.
Approach: map just the content and leave out most of the metadata. I've been using XSLT to do this mapping. I will list the XSLT (with comments) and walk through the logic in this section.
Understanding a stripped down MOA2 and stripped down IMS-CP file
-
stripped down MOA2 stripped down METS
-
MOA2 to IMS-CP METS to IMS-CP
-
moa2_to_mets_table_tingle.pdf (D:\Document\Interactive_University\Docs\2002\05)
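As a rough illustration of the content-only approach, here is a Python sketch of a METS to IMS-CP structure mapping. This is not the XSLT referred to above; it only shows the same idea under stated assumptions: the structural map's nested div elements become nested item elements, and each file with an FLocat becomes a resource. The namespaces, the RES-/ITEM- identifier scheme, the sample document, and the mets_to_imscp helper are all assumptions made for this example.

```python
import itertools
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"                  # assumed METS namespace
XLINK_NS = "http://www.w3.org/1999/xlink"
IMSCP_NS = "http://www.imsglobal.org/xsd/imscp_v1p1"  # assumed IMS-CP namespace
NS = {"m": METS_NS}

SAMPLE_METS = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec><fileGrp>
    <file ID="F1"><FLocat LOCTYPE="URL" xlink:href="page1.jpg"/></file>
    <file ID="F2"><FLocat LOCTYPE="URL" xlink:href="page2.jpg"/></file>
  </fileGrp></fileSec>
  <structMap>
    <div LABEL="Book">
      <div LABEL="Chapter 1">
        <div LABEL="Page 1"><fptr FILEID="F1"/></div>
        <div LABEL="Page 2"><fptr FILEID="F2"/></div>
      </div>
    </div>
  </structMap>
</mets>"""


def cp(tag):
    """Qualify a tag name with the (assumed) IMS-CP namespace."""
    return f"{{{IMSCP_NS}}}{tag}"


def mets_to_imscp(mets_xml):
    mets = ET.fromstring(mets_xml)
    manifest = ET.Element(cp("manifest"), {"identifier": "MANIFEST-1"})
    organizations = ET.SubElement(manifest, cp("organizations"))
    resources = ET.SubElement(manifest, cp("resources"))

    # Each METS <file> with an <FLocat> becomes an IMS-CP <resource>.
    for file_el in mets.iterfind(".//m:file", NS):
        flocat = file_el.find("m:FLocat", NS)
        if flocat is None:
            continue  # embedded FContent is out of scope for this sketch
        href = flocat.get(f"{{{XLINK_NS}}}href")
        res = ET.SubElement(resources, cp("resource"), {
            "identifier": "RES-" + file_el.get("ID"),
            "type": "webcontent",
            "href": href,
        })
        ET.SubElement(res, cp("file"), {"href": href})

    counter = itertools.count(1)

    def add_children(mets_div, cp_parent):
        # Nested <div> elements map to nested <item> elements.
        for child in mets_div.findall("m:div", NS):
            attrs = {"identifier": f"ITEM-{next(counter)}"}
            fptr = child.find("m:fptr", NS)  # only the first pointer is used
            if fptr is not None:
                attrs["identifierref"] = "RES-" + fptr.get("FILEID")
            item = ET.SubElement(cp_parent, cp("item"), attrs)
            ET.SubElement(item, cp("title")).text = child.get("LABEL", "")
            add_children(child, item)

    # The top-level <div> becomes the single <organization>.
    root_div = mets.find("m:structMap/m:div", NS)
    org = ET.SubElement(organizations, cp("organization"), {"identifier": "ORG-1"})
    ET.SubElement(org, cp("title")).text = root_div.get("LABEL", "")
    add_children(root_div, org)
    return manifest


manifest = mets_to_imscp(SAMPLE_METS)
print(ET.tostring(manifest, encoding="unicode"))
```

The sketch deliberately ignores metadata, multiple file pointers per div, and multiple structural maps, each of which a fuller mapping would have to decide how to handle.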
Mapping metadata - issues, progress (again, based on your experience -
-
moa2_to_ims_cp_table.pdf (D:\Document\Interactive_University\Docs\2002\05) I still have a lot of work to do in this area.
References
[1] CETIS-Learning Technology Standards: An Overview, CETIS, 2002. http://www.cetis.ac.uk/static/standards.html
[2] IMS Global Learning Consortium, Inc. http://www.imsproject.org/
[3] IMS Learning Resource Meta-data Specification. http://www.imsproject.org/metadata/index.html
[4] IMS Specifications - Content Packaging Specification - Final. http://www.imsproject.org/content/packaging/index.html
[5] The Making of America II. http://sunsite.berkeley.edu/MOA2/
[6] Metadata Encoding and Transmission Standard (METS). http://www.loc.gov/standards/mets/
[7] METS Editorial Board Formed (May 29, 2002). http://www.loc.gov/standards/mets/news052902.html
[8] METS Overview and Tutorial (Library of Congress). http://www.loc.gov/standards/mets/METSOverview.html
[9] Long, P.D. Can Libraries Find a New Home in Courseware? Syllabus, 15 (8). 8-10. http://www.syllabus.com/syllabusmagazine/article.asp?id=6136
[10] Lynch, C. Digital Collections, Digital Libraries and the Digitization of Cultural Heritage Information. First Monday, 7 (5). http://www.firstmonday.dk/issues/issue7_5/lynch/index.html
[11] Rinehart, R. Museums and the Online Archive of California. First Monday, 7 (5). http://www.firstmonday.dk/issues/issue7_5/rinehart/index.html
[12] Romaine, P. Notes on METS, 2002. http://romaine.home.pipeline.com/notes/mets.html
