Research on Data Providers in Digital Libraries Based on the OAI-PMH Protocol

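With the rapid development of network, computer, and communication technologies, digital libraries have become an important source of information. Faced with the many distributed digital resources on the network [1-2], users want to find the information they need through retrieval. However, early digital libraries were built without a unified standard, so the metadata formats of their resources differ, and it is very difficult to define a single metadata format that can describe the resources of different digital libraries. In addition, data providers and service providers in early digital libraries mostly communicated through point-to-point protocol services, so providers outside a given agreement could not exchange metadata [3-4]. To address these problems, this paper designs a format converter for MARC and DC, two metadata formats widely used in digital libraries, and redesigns the command verbs of an interoperability platform based on the OAI-PMH protocol, which is of significance for resource sharing among digital libraries.

1 Overview of the OAI-PMH Protocol

Data providers, service providers, and a registration server make up the OAI-PMH framework in the digital library field [5-7]. A data provider is a repository that stores a large amount of metadata. A service provider harvests metadata from multiple data providers and offers value-added services such as search and browsing. A service provider sends HTTP requests to a data provider, that is, requests expressed as OAI-PMH command verbs, and the data provider responds over HTTP in XML. Both data providers and service providers can register with the registration server. The framework is shown in Figure 1.

2 Design and Implementation of the MARC-to-DC Format Converter

MARC is currently the main metadata format used in digital libraries, whereas OAI-PMH requires every data provider to supply metadata in unqualified DC, and the data provider described here supports only that format [8]. Therefore, to build an OAI-PMH-based metadata interoperability platform, MARC metadata must be converted into DC metadata.

2.1 Field Mapping Between MARC and DC

The DC format defines 15 elements, which fall into three groups according to the scope and type of what they describe, as shown in Table 1. The system provides a batch item import interface, so the MarcDc module must run before the batch importer. If the input is in MARC format, it is first converted by the MarcDc format converter and then handled by the batch processing module; if the input is already in DC format, it goes directly to the batch processing module.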
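The routing rule of Section 2.1 (convert MARC input first, pass DC input straight through) can be sketched as follows. This is only an illustration under stated assumptions: the tag-to-element table is a small excerpt modeled on common MARC 21 crosswalks, not the paper's Table 1, and the names `MARC_TO_DC`, `marc_to_dc`, and `prepare_for_batch_import` are hypothetical.

```python
# Hypothetical crosswalk excerpt: MARC 21 tag -> DC element.
# This is an assumption for illustration, not the paper's Table 1.
MARC_TO_DC = {
    "245": "title",
    "100": "creator",
    "650": "subject",
    "520": "description",
    "020": "identifier",
}


def marc_to_dc(marc_record):
    """Convert a list of (tag, value) pairs into a dict of DC element -> values."""
    dc = {}
    for tag, value in marc_record:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # tags without a mapping are skipped in this sketch
        dc.setdefault(element, []).append(value)
    return dc


def prepare_for_batch_import(record, fmt):
    """Route a record so that the batch importer always receives DC."""
    if fmt == "marc":
        return marc_to_dc(record)
    if fmt == "dc":
        return record
    raise ValueError(f"unsupported metadata format: {fmt}")


sample = [
    ("245", "Digital Libraries"),
    ("100", "Li, Wei"),
    ("650", "Metadata"),
    ("999", "local note"),
]
converted = prepare_for_batch_import(sample, "marc")
```

Keeping the crosswalk in a plain table means a different MARC dialect could be supported by swapping the table rather than the conversion logic.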

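3 Implementation of the OAI-PMH Interoperability Platform

3.1 Analysis of the OAI-PMH Command Verbs

To enable metadata exchange among multiple data providers and service providers in an OAI-PMH-based digital library, the command verbs of the interoperability platform were redesigned. Their main functions are as follows:

(1) GetRecord: retrieves a metadata record from the repository. If the metadata format specified by metadataPrefix is no longer available from the repository or from the specified item, a header whose status attribute has the value "deleted" may be returned, depending on the level at which the repository tracks deletions.

(2) Identify: retrieves information about the repository. Part of the returned information is required by the OAI-PMH-based interoperability platform; the repository can also use this verb to return additional descriptive information.

(3) ListIdentifiers: returns only record headers, not the records themselves. Optional arguments allow headers to be selected by set membership and datestamp. Depending on the repository's support for deletions, a returned header carries a status attribute of "deleted" if a record matching the request arguments has been deleted.

(4) ListMetadataFormats: retrieves the metadata formats available from the repository. An optional argument restricts the request to the formats available for a specific item.

(5) ListRecords: harvests records from the repository. Depending on the repository's support for deletions, a returned header carries a status attribute of "deleted" if a record matching the request arguments has been deleted; no metadata is returned for records with "deleted" status.

(6) ListSets: returns the set structure of the repository, which is useful for selective harvesting.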
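As a rough illustration of how the six verbs travel over HTTP, the sketch below builds request URLs with the standard library. The base URL and the helper name `build_request` are assumptions made for the example; only the verb and argument names come from the protocol.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs listed above.
VERBS = {
    "GetRecord",
    "Identify",
    "ListIdentifiers",
    "ListMetadataFormats",
    "ListRecords",
    "ListSets",
}

# Hypothetical repository address, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def build_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request."""
    if verb not in VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


identify_url = build_request(BASE_URL, "Identify")
list_ids_url = build_request(
    BASE_URL, "ListIdentifiers", metadataPrefix="oai_dc", set="physics"
)
```

Here `set="physics"` is a made-up set name; a real harvester would take set names from a ListSets response.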

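3.2 How Data Providers and Service Providers Use the Command Verbs

Requests and responses between data providers and service providers are carried out through the six core verbs above. A typical implementation uses a standard Web server configured to dispatch OAI-PMH requests to the software that handles them. The interaction is shown in Figure 5. The steps are as follows:

(1) The service provider first locates a data provider that holds the metadata it needs and obtains its unique identifier.

(2) The service provider sends ListSets and ListMetadataFormats requests to the data provider. The data provider then returns the metadata it can supply that satisfies the requested conditions (for example, a particular format, subject, or time period). Once the service provider has obtained the metadata, it provides services to users.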
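The discovery step of this exchange, in which the service provider issues ListSets and ListMetadataFormats before harvesting, might look like the sketch below. It is not the paper's implementation: the transport is replaced by a stub (`fake_fetch`) that returns canned, heavily abbreviated XML, and the function names are hypothetical. The XML namespace is the one defined by OAI-PMH 2.0.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Canned, abbreviated responses standing in for a real data provider.
CANNED = {
    "ListSets": (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListSets>'
        "<set><setSpec>physics</setSpec><setName>Physics</setName></set>"
        "</ListSets></OAI-PMH>"
    ),
    "ListMetadataFormats": (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListMetadataFormats>'
        "<metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>"
        "</ListMetadataFormats></OAI-PMH>"
    ),
}


def fake_fetch(base_url, params):
    """Stub transport: a real harvester would send an HTTP GET here."""
    return CANNED[params["verb"]]


def discover(fetch, base_url):
    """Ask a data provider which sets and metadata formats it offers."""
    sets_xml = fetch(base_url, {"verb": "ListSets"})
    formats_xml = fetch(base_url, {"verb": "ListMetadataFormats"})
    set_specs = [
        e.text for e in ET.fromstring(sets_xml).findall(".//oai:setSpec", OAI_NS)
    ]
    prefixes = [
        e.text
        for e in ET.fromstring(formats_xml).findall(".//oai:metadataPrefix", OAI_NS)
    ]
    return set_specs, prefixes


set_specs, prefixes = discover(fake_fetch, "http://repository.example.org/oai")
```

Passing the transport in as a function keeps the parsing logic testable without a live repository.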

3.3 Request and Response Formats of the OAI-PMH Command Verbs

(1) Request format

3.4 Design of the Six OAI Verbs

(1) Outline design

When a data provider receives an OAI request, it must parse the request. It first checks whether the request type is legal. If not, it sends an error message to the service provider; if so, it determines which of the six valid request types the request belongs to. Because the metadataPrefix argument is mandatory for ListIdentifiers, when the data provider receives a ListIdentifiers request the parser can check that argument directly. If the argument is absent, the request is valid only if it contains a resumptionToken argument that the data provider recognizes. Assuming the data provider can disseminate metadata only in unqualified DC, the only valid value of metadataPrefix is oai_dc. Optional arguments in the request must normally be parsed as well, but this is simplified here and described only informally. The data provider then queries the repository with SQL statements built from the request arguments. If the query yields more records than can be transmitted in one response, the data provider generates a new resumptionToken and stores the query parameters together with the cursor position. The flow is shown in Figure 6.
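The outline design above (validate the verb, check metadataPrefix or resumptionToken for ListIdentifiers, then page the results) can be sketched as follows. Several things here are assumptions: an in-memory list stands in for the SQL query, `PAGE_SIZE` is an arbitrary limit, and the function names are hypothetical. The returned strings are error codes defined by OAI-PMH.

```python
import uuid

VALID_VERBS = {
    "GetRecord", "Identify", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}
PAGE_SIZE = 2   # assumed maximum number of identifiers per response
TOKENS = {}     # resumptionToken -> (query parameters, cursor position)


def validate(params):
    """Return an OAI-PMH error code, or None if the request is acceptable."""
    verb = params.get("verb")
    if verb not in VALID_VERBS:
        return "badVerb"
    if verb != "ListIdentifiers":
        return None  # the other verbs are not checked in this sketch
    if "resumptionToken" in params:
        known = params["resumptionToken"] in TOKENS
        return None if known else "badResumptionToken"
    if "metadataPrefix" not in params:
        return "badArgument"
    if params["metadataPrefix"] != "oai_dc":
        return "cannotDisseminateFormat"
    return None


def list_identifiers(params, identifiers):
    """Return one page of identifiers and a resumptionToken (or None)."""
    if "resumptionToken" in params:
        query, cursor = TOKENS[params["resumptionToken"]]
    else:
        query, cursor = dict(params), 0
    page = identifiers[cursor:cursor + PAGE_SIZE]
    next_cursor = cursor + PAGE_SIZE
    token = None
    if next_cursor < len(identifiers):
        # More results remain: remember the query and where to continue.
        token = uuid.uuid4().hex
        TOKENS[token] = (query, next_cursor)
    return page, token


ids = ["oai:example:1", "oai:example:2", "oai:example:3"]
first_page, token = list_identifiers(
    {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc"}, ids
)
second_page, last_token = list_identifiers(
    {"verb": "ListIdentifiers", "resumptionToken": token}, ids
)
```

A production data provider would also expire old tokens; that is left out to keep the sketch short.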

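(2) Detailed design

The Protocol package handles the requests, responses, and information display for the six verbs. Its Date class uses java.util.Date, Calendar, or String to convert between and output year, month, day, and other date formats; its Set class creates a new set from a named set and describes it in XML.

The Client package contains the client-side source code. A HarvesterItinerary can be created from a URL or from a set of properties; it describes the state of a Harvester and can save that state. The OAIConnection class represents a connection to an OAI repository that accepts a single request; the connection is based on the Response returned by doReques or on the repository's URL.

The Server package of the interoperability platform provides the server-side OAI source code and exposes it as a document service. The programming interface for an OAI service target is implemented by the Target class, and TargetAdapter defines a do-nothing implementation of it. GenericTarget offers a very simple mechanism for creating a small OAI collection that is looked up and served from memory, returning all records that match a given set specification and metadata prefix. JDBCServer is the generic server provided for the OAI servlet, and OAIServerIfc supplies a framework for OAI: using the metadata prefix for DC core records, it handles GetRecord, Identify, ListRecords, ListSets, ListMetadataFormats, and ListIdentifiers requests. The ResumableResultSet class describes a result that, combined with a resumption token, can be delivered to the client as a series of partial results.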
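To make the relationship between Target, TargetAdapter, and GenericTarget more concrete, here is a loose Python analogue. The class names follow the text, but the platform itself is described in Java terms, so the method names, signatures, and record layout below are all assumptions made for illustration.

```python
class Target:
    """Interface that an OAI service target is expected to implement."""

    def list_records(self, metadata_prefix, set_spec=None):
        raise NotImplementedError


class TargetAdapter(Target):
    """Do-nothing implementation; subclasses override only what they need."""

    def list_records(self, metadata_prefix, set_spec=None):
        return []


class GenericTarget(TargetAdapter):
    """Small in-memory target filtered by metadata prefix and set."""

    def __init__(self):
        self._records = []

    def add(self, identifier, metadata_prefix, set_spec, metadata):
        self._records.append(
            {
                "identifier": identifier,
                "prefix": metadata_prefix,
                "set": set_spec,
                "metadata": metadata,
            }
        )

    def list_records(self, metadata_prefix, set_spec=None):
        # Keep records with the requested prefix and, if given, the requested set.
        return [
            r
            for r in self._records
            if r["prefix"] == metadata_prefix
            and (set_spec is None or r["set"] == set_spec)
        ]


target = GenericTarget()
target.add("oai:example:1", "oai_dc", "physics", {"title": ["Optics"]})
target.add("oai:example:2", "oai_dc", "history", {"title": ["Archives"]})
target.add("oai:example:3", "marc", "physics", {"245": "Waves"})
physics_dc = [r["identifier"] for r in target.list_records("oai_dc", "physics")]
all_dc = [r["identifier"] for r in target.list_records("oai_dc")]
```

The adapter layer exists so that a concrete target only has to override the operations it actually supports.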

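3.5 Functions Implemented by the Data Provider

(1) Item subscription and submission

Users can subscribe by e-mail to the items they need, which allows the resources of the digital library to be shared. To submit an item, a user first enters its descriptive metadata and then uploads the source file; once the submission passes system validation, the file is published on the interoperability platform for others to browse and download.

(2) Workflow

Three group leaders are responsible for a collection's workflow, each completing a different workflow step. The sequence is as follows: when a collection receives a submission, the group leader for the step, if there is one, accepts or rejects it; if the collection has no group leader for the step, the step is skipped. The second and third steps are handled in the same way. When a workflow step is invoked, the task of performing that step is placed in a task pool; when a member of the group takes the task, it is removed from the pool. If a submission is rejected, the system e-mails the reason to the submitter, who can revise and resubmit it; if it is accepted, it moves on to the next workflow step.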
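The accept/reject logic of the workflow can be illustrated with a short sketch. It deliberately leaves out the task pool and real e-mail delivery; reviewers are modeled as plain callables, and every name here (`run_workflow`, `notify`, the sample reviewers) is hypothetical.

```python
def run_workflow(submission, steps, notify):
    """Pass a submission through the review steps in order.

    steps: one entry per workflow step; each entry is either None (no
    reviewer assigned, so the step is skipped) or a callable that returns
    (accepted, reason).
    notify: callable used to tell the submitter why a submission failed.
    """
    for number, reviewer in enumerate(steps, start=1):
        if reviewer is None:
            continue  # no one is assigned to this step: skip it
        accepted, reason = reviewer(submission)
        if not accepted:
            notify(submission["submitter"], f"Rejected at step {number}: {reason}")
            return "rejected"
    return "accepted"


def always_accept(submission):
    return True, ""


def require_abstract(submission):
    if submission.get("abstract"):
        return True, ""
    return False, "missing abstract"


sent = []


def record_mail(address, message):
    # Stand-in for e-mail delivery: just remember what would be sent.
    sent.append((address, message))


draft = {"submitter": "author@example.org", "title": "Optics"}
first_result = run_workflow(draft, [always_accept, None, require_abstract], record_mail)

# The submitter revises the item and submits it again.
revised = dict(draft, abstract="A short summary.")
second_result = run_workflow(revised, [always_accept, None, require_abstract], record_mail)
```

Modeling a skipped step as `None` mirrors the rule that a step with no responsible person is omitted rather than blocking the submission.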

(3) Search and browse

End users can discover content in several ways: searching by keyword, searching by an external ID, and browsing by title or data. Search is the most basic way to discover content in a digital library system. The search and indexing module of the OAI-PMH-based interoperability platform is a simple API that can index new content and rebuild indexes across all communities and collections. The search interface is shown in Figure 7.

4 Conclusion

This paper designed a MARC-to-DC format converter and the command verbs of an OAI-PMH-based interoperability platform, addressing the problem of metadata exchange between data providers and service providers in digital libraries. It gave the mapping between the two formats and the implementation of the converter, and completed the implementation code of the command verbs, which is of significance for promoting the use of the OAI-PMH protocol in the digital library field.

