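Intelligent and Distributed Computing Laboratory
  Research on Matching Method for XML Schema
Name: 金贤哲
Defense date: 2008.06.05
Submission date: 2008.06.17
Thesis level: Master's
Chinese title: XML模式匹配方法研究
English title: Research on Matching Method for XML Schema
Advisor 1: 李瑞轩
Advisor 2: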
Chinese keywords: XML模式 (XML Schema); 模式匹配 (schema matching); XML样式语言转换 (XSLT); 文档转换 (document transformation)
English keywords: XML Schema; schema matching; XSLT; document transformation
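Chinese abstract (English translation): With the rapid development of XML (eXtensible Markup Language), more and more data is represented in XML, and XML has gradually become the standard for data representation and exchange on the Web. In applications such as e-commerce, if both parties to a transaction follow the same schema specification, resource sharing and information integration between them are easy to achieve. At present, however, many XML applications face inconsistent schemas, so for two input schemas the matching relationships between their related elements must be found, and conversion is then carried out according to the resulting mappings. This thesis first studies schema matching methods and, based on an analysis of existing methods, presents an XML schema matching algorithm; then, for the conversion of heterogeneous XML documents, it presents a solution that uses the XML schema matching algorithm to convert between heterogeneous XML documents. The algorithm judges the similarity of elements from two parts: element similarity and context similarity. The thesis mainly improves the method for computing context similarity. Matching proceeds in two parts: first the element similarity between the elements of the two XML schemas is computed, and this element similarity is then used to compute the context similarity. Finally, the final match candidates are extracted from the combination of the two similarities. In the XML document conversion scheme, the XML schema matching algorithm is first used to find the semantic correspondences between elements, and an XSLT stylesheet document is then generated from those correspondences.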
English abstract: With the rapid development of XML, more and more data is represented in XML, which has become a de facto standard for data representation and exchange on the Web. In applications such as e-commerce, if both parties to an exchange abide by the same schema specification, it is easier for them to share resources and integrate information. Currently, however, many XML applications face the problem of inconsistent schemas. This requires finding the mapping relationships between related elements of the two input schemas, so that one can then be transformed into the other according to the mapping. This thesis first studies approaches to schema matching and, based on an analysis of the existing methods, presents a schema matching algorithm. It then addresses the transformation between two heterogeneous documents. The similarity of two elements depends on two measures, element-level similarity and structural similarity, and the method for computing the structural similarity has been improved. The method consists of computing preliminary matching relationships between elements in the two XML schemas, deriving the proposed context similarity, and extracting the final matches based on a combination of the two. In the solution for transforming between two XML documents, the semantic mapping relationships between the elements of the two schemas are found using the proposed algorithm, and an XSLT stylesheet file is then created from the matching result.
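
Illustrative sketch (not taken from the thesis): the pipeline described in the abstract can be pictured roughly as follows. Everything here is an assumption made for illustration: the toy schemas `SOURCE` and `TARGET`, the use of Python's `difflib.SequenceMatcher` as a name-based stand-in for the element similarity, the averaging of ancestor-name similarities as the context similarity, and the `weight` and `threshold` values. The thesis's actual measures and its improved context computation are not given in this record.

```python
from difflib import SequenceMatcher

# Hypothetical toy schemas: element name -> list of ancestor element names.
SOURCE = {"order": [], "customerName": ["order"], "total": ["order"]}
TARGET = {"orders": [], "custName": ["orders"], "totalAmount": ["orders"]}


def element_similarity(a, b):
    """Name-based similarity in [0, 1]; a stand-in for the element-level measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def context_similarity(src_ctx, tgt_ctx):
    """Average best element similarity between two ancestor lists."""
    if not src_ctx and not tgt_ctx:
        return 1.0  # two root elements share the same (empty) context
    if not src_ctx or not tgt_ctx:
        return 0.0  # a root element compared with a nested element
    best = [max(element_similarity(s, t) for t in tgt_ctx) for s in src_ctx]
    return sum(best) / len(best)


def match_schemas(source, target, weight=0.5, threshold=0.5):
    """For each source element, keep the best-scoring target above the threshold."""
    matches = {}
    for s, s_ctx in source.items():
        scored = [
            (weight * element_similarity(s, t)
             + (1 - weight) * context_similarity(s_ctx, t_ctx), t)
            for t, t_ctx in target.items()
        ]
        score, best = max(scored)
        if score >= threshold:
            matches[s] = best
    return matches


def to_xslt(matches):
    """Emit an XSLT 1.0 stylesheet that renames each matched source element."""
    templates = "".join(
        f'  <xsl:template match="{s}">\n'
        f"    <{t}><xsl:apply-templates/></{t}>\n"
        f"  </xsl:template>\n"
        for s, t in matches.items()
    )
    return (
        '<xsl:stylesheet version="1.0" '
        'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">\n'
        f"{templates}</xsl:stylesheet>\n"
    )


matches = match_schemas(SOURCE, TARGET)
stylesheet = to_xslt(matches)
print(matches)
print(stylesheet)
```

In this sketch each source element simply takes its best-scoring target, so two source elements could map to the same target. A real matcher would need a rule for resolving such conflicts, and purely name-based similarity will miss synonyms that a semantic measure would catch.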