Research on Key Technologies of Multidatabase Query Processing - Intelligent and Distributed Computing Laboratory

Research on Key Technologies of Multidatabase Query Processing

Name	Zhang Liming (张立明)
Defense date	2003.05.09
Submission date	2005.10.18
Degree level	Master's
Chinese title	多数据库查询核心技术研究
English title	Research on Key Technologies of Multidatabase Query Processing
Advisor 1	Lu Zhengding (卢正鼎)
Advisor 2
Chinese keywords	多数据库系统;查询语言;查询分解;查询优化
English keywords	Multidatabase Systems;Query Language;Query Decomposition;Query Optimization
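Chinese abstract (translated)	With the development of network technology and the emergence of new application requirements, traditional database technology has shown some limitations. However, existing database systems cannot all be discarded, so developing multidatabase systems that can simultaneously access and process data from multiple databases as well as file data has become an inevitable trend. Because multidatabase systems are heterogeneous, distributed, and locally autonomous, their query languages and query processing differ considerably from those of traditional databases. SQL was designed for structured data, and its expressive power is insufficient for the semi-structured and unstructured file data to be integrated in a multidatabase. Based on an analysis and comparison of SQL for traditional databases, XML query languages, and object-oriented query languages, and taking into account the practical situation of queries in multidatabases, an SQL-based object query language, Pano-OSQL, is presented. A multidatabase system presents a global schema to users, who submit queries in the global query language, while the required data must be obtained from the local data sources; a global query must therefore be decomposed into subqueries corresponding to the local data sources. To meet this requirement, an internal query representation different from that of traditional database systems is given, a query tree for multidatabase query decomposition is defined, and the principles, criteria, and concrete algorithms for decomposing a global query into global subqueries and then into local queries are given, along with some discussion of the correctness of these algorithms. The query processor has the greatest impact on database performance, so query optimization is very important. Building on a study of query optimization methods for traditional databases, several optimization principles and algorithms suited to multidatabase query processing are presented and implemented, including an incremental multi-way join algorithm for result merging and a task-tree-based concurrent scheduling algorithm for query scheduling.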
English abstract	With the development of network technology and the emergence of new applications, traditional database technology can no longer meet the needs of data sharing and interoperation. However, existing databases cannot be discarded entirely, so it is necessary to develop a multidatabase system (MDBS) that can access and process data from different databases and file systems. Because of the heterogeneity, distribution, and autonomy of an MDBS, its query language and query processing are quite different from those of traditional databases. SQL was designed for structured data, and it is not expressive enough for the semi-structured and unstructured data that will be integrated in an MDBS. After analyzing and comparing SQL, XML query languages, and OQL, we propose an SQL-based object query language suited to MDBS. An MDBS presents a global schema to users. Users submit queries in the global query language, and the data the MDBS needs is obtained from the local data sources, so global query decomposition must be done first. To this end, we propose a decomposition tree for query processing and give the rules and algorithms for transforming a global query into subqueries. We also discuss the correctness of these algorithms. The query processor is the module with the greatest impact on database performance, so query optimization is quite important. Building on research into query optimization in traditional databases, we propose some rules and algorithms suited to query optimization in MDBS.