Intelligent and Distributed Computing Laboratory
  Research and Implementation of Real-Time Multidatabase Systems
Name Lou Qinjian (娄勤俭)
Defense date 2003.11.14
Submission date 2003.11.14
Degree level Doctoral
Chinese title 实时多数据库系统的研究与实现
English title Research and Implementation of Real-Time Multidatabase Systems
Advisor 1 Lu Zhengding (卢正鼎)
Advisor 2
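Chinese keywords 实时多数据库系统 (real-time multidatabase systems); 模式集成 (schema integration); 完整性约束 (integrity constraints); 查询处理 (query processing); 事务分片 (transaction chopping); 实时数据集成 (real-time data integration)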
English keywords Real-time multidatabase systems; Schema integration; Integrity constraint; Query processing; Transaction chopping; Real-time data integration
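Chinese abstract (translated) With the continuing development of distributed computing and network technology, traditional database technology is increasingly unable to meet the needs of data sharing and interoperation. At the same time, existing database systems cannot all be discarded, so developing multidatabase systems that can simultaneously access and process data from multiple databases has become an inevitable trend.
As the information age advances, people require not only the integration of multiple heterogeneous database systems but also greater timeliness of information. In enterprises, as informatization progresses, process industries commonly use real-time databases to collect, store, and retrieve production process data in order to analyze and monitor production. In fields such as production process control, metallurgical automation, military command and control, weapon guidance, target recognition and tracking, and the stock market, users need to obtain over the network real-time data held in multiple heterogeneous databases at different locations, so that they can use the data for remote management and decision analysis.
The core of these requirements is a real-time data integration platform, namely a real-time multidatabase system, which must handle not only the persistent data in traditional databases but also the instantaneous data in real-time databases. A real-time multidatabase system can hide the differing access methods and user interfaces of the existing database systems and present users with a common interface for accessing multiple kinds of databases, thereby reducing the differences among the database systems while maintaining the real-time nature of information access.
The research achieved the following results:
1. The basic processing steps of multidatabase schema integration are given. The concept of information capacity is used to analyze the equivalence principle of schema transformation, and a correctness criterion for schema integration is proposed. The concepts of class extension and class extent are used to classify the integrity constraints in multidatabase systems, and integration rules for class constraints and object constraints are given.
2. Based on the schema structure of query processing, a method for mapping between global schemas and local schemas is given, and a schema mapping tree is used to store and express the schema mapping information. Query tree representations are given for global queries, intermediate queries, and local queries in multidatabase systems, together with the corresponding equivalence transformation rules.
3. The algebraic foundation of queries in multidatabase systems is extended. Based on the extended rules, a query decomposition method that converts multidatabase global queries into local queries is given, and partial query optimization is carried out during decomposition.
4. The basic concepts and characteristics of multidatabase transactions are given, along with the multidatabase transaction serializability problem and its solutions. A transaction chopping algorithm that searches for the optimal chopping is proposed; without sacrificing serializability, it shortens deadlock time as far as possible and improves transaction concurrency. The efficiency of the optimal chopping algorithm is analyzed using graph-theoretic methods.
5. An architecture for integrating real-time databases with other databases is proposed, an integrated data model for real-time multidatabase systems is established, and the strategies and methods for schema transformation, schema integration, real-time state querying, real-time transaction scheduling, and concurrency control in this model are given.
6. GY-RTMDBS, a real-time multidatabase management system for the Guixi Smelter (贵溪冶炼厂), was designed and implemented, covering the system architecture, functional modules, network structure, and real-time data exchange design. The performance of GY-RTMDBS was also evaluated.
7. Panorama, an extensible real-time multidatabase system, was designed and implemented. Its system architecture and basic processing flow are given, together with the implementation strategies for schema integration, query processing, and transaction processing in Panorama.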
English abstract With the development of distributed computing and network technology, traditional database technology is increasingly unable to meet the requirements of data sharing and interoperation. However, existing database systems cannot be discarded entirely, so developing multidatabase systems that can simultaneously access and process multiple data sources has become an inevitable trend.
As the information age evolves, people want not only to integrate multiple heterogeneous database systems but also to obtain more timely information. Process industries commonly use real-time database systems to gather, store, and retrieve process data and to monitor and analyze production. In many fields, such as production process control, metallurgical automation, military command and control, weapon guidance, target recognition and tracking, and the stock market, users need to obtain real-time data stored in heterogeneous database systems at different locations and use these data for remote management and decision analysis.
The key to meeting these requirements is a real-time data integration platform, that is, a real-time multidatabase system (RTMDBS). It must process not only the persistent data in traditional databases but also the real-time data in real-time databases. An RTMDBS hides the different access methods and user interfaces of the underlying systems, provides a common interface for accessing multiple database systems, and maintains the real-time property of information retrieval.
The contributions of this research are as follows:
1. The basic processing steps for schema integration are given. The concept of information capacity is used to analyze the equivalence principle of schema transformation, and a correctness criterion for schema integration is proposed. Class extension and class extent are introduced to classify multidatabase integrity constraints, and rules for integrating class constraints and object constraints are given.
2. Schema mappings between global schemas and local schemas are defined based on the schema architecture of query processing, and a schema mapping tree is used to store and represent the schema mapping information. Query trees are used to represent global queries, export queries, and local queries in multidatabase systems, and the equivalence transformation rules among these queries are presented.
3. The dissertation extends the query algebra for multidatabase systems and, based on the extended rules, gives a method for decomposing global queries into local queries. Partial query optimization is performed during the decomposition process.
4. The dissertation gives the concept and characteristics of multidatabase transactions and presents solutions to the multidatabase serializability problem. A transaction chopping algorithm that finds the finest chopping is introduced to shorten deadlock time and improve transaction concurrency without losing serializability. The efficiency of the chopping algorithm is then analyzed using graph theory.
5. The dissertation proposes an architecture for integrating real-time databases with other databases and establishes an integrated data model for real-time multidatabase systems. It also presents the policies for schema transformation, schema integration, real-time state querying, real-time transaction scheduling, and concurrency control in this model.
6. A real-time multidatabase system for Guixi Smelt Company, named GY-RTMDBS, was designed and implemented, covering the system architecture, functional modules, network structure, and real-time data exchange design. The performance of the system was also evaluated.
7. Finally, an extensible real-time multidatabase system called Panorama was designed and implemented. The dissertation describes the system architecture and basic processing steps, and gives the implementation methods for schema integration, query processing, and transaction processing.
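
Contribution 4 mentions transaction chopping that preserves serializability and is analyzed with graph theory. This record does not describe the dissertation's algorithm, so the following is only a minimal sketch of the standard chopping-graph test from the transaction chopping literature: pieces of the same transaction are joined by S edges, conflicting pieces of different transactions by C edges, and a chopping is rejected if some simple cycle contains both kinds of edge (an SC-cycle). The function names, the piece representation, and the example data are assumptions made for illustration. The check relies on the fact that two edges lie on a common simple cycle exactly when they belong to the same biconnected component.

```python
from itertools import combinations

def chopping_graph(pieces):
    """Build the chopping graph.

    pieces maps a piece id to (transaction id, read set, write set).
    Returns a dict mapping a sorted id pair to "S" (siblings) or "C" (conflict).
    """
    edges = {}
    for p, q in combinations(sorted(pieces), 2):
        tp, rp, wp = pieces[p]
        tq, rq, wq = pieces[q]
        if tp == tq:
            edges[(p, q)] = "S"  # pieces of the same transaction
        elif wp & (rq | wq) or wq & rp:
            edges[(p, q)] = "C"  # shared item with at least one write
    return edges

def has_sc_cycle(pieces):
    """Return True if some simple cycle contains both an S and a C edge."""
    edges = chopping_graph(pieces)
    adj = {p: [] for p in pieces}
    for p, q in edges:
        adj[p].append(q)
        adj[q].append(p)

    def kind(u, v):
        return edges[(u, v)] if (u, v) in edges else edges[(v, u)]

    depth, low, stack = {}, {}, []
    found = False

    def dfs(u, parent):
        nonlocal found
        depth[u] = low[u] = len(depth)  # discovery index
        for v in adj[u]:
            if v == parent:
                continue
            if v not in depth:
                stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= depth[u]:
                    # Edges down to (u, v) form one biconnected component.
                    kinds = set()
                    while True:
                        e = stack.pop()
                        kinds.add(kind(*e))
                        if e == (u, v):
                            break
                    if kinds == {"S", "C"}:
                        found = True
            elif depth[v] < depth[u]:
                stack.append((u, v))  # back edge to an ancestor
                low[u] = min(low[u], depth[v])

    for p in pieces:
        if p not in depth:
            dfs(p, None)
    return found

# Hypothetical example: T1 is chopped into two pieces, T2 is left whole.
unsafe = {
    "T1a": ("T1", {"x"}, {"x"}),
    "T1b": ("T1", {"y"}, {"y"}),
    "T2": ("T2", {"x", "y"}, set()),
}
safe = dict(unsafe, T2=("T2", {"x"}, set()))
print(has_sc_cycle(unsafe), has_sc_cycle(safe))
```

In the first example T2 reads both items that the two pieces of T1 update, so the S edge between the pieces closes a cycle with two C edges and the chopping is rejected. In the second, T2 touches only x, the graph is a path, and the chopping passes the test. The recursive depth-first search is adequate for small graphs; a large chopping graph would need an iterative version.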