17 High Availability


001 Computing environments configured to provide nearly full-time availability are known as high availability systems. Oracle has a number of products and features that provide high availability in cases of unplanned downtime or planned downtime.
 
经过配置后能够提供接近全时可用性(full-time availability)的计算机系统被称为高可用性系统(high availability system)。Oracle 包含了一系列产品及特性,无论在非计划停机(unplanned downtime)还是计划停机(planned downtime)的情况下,都能确保系统的高可用性。
 
002 This chapter includes the following topics:
 
本章包含以下主题:
 
003

See Also:

Oracle Database High Availability Overview

另见:

Oracle Database High Availability Overview
004

Introduction to High Availability

17.1 高可用性简介

005 Computing environments configured to provide nearly full-time availability are known as high availability systems. Such systems typically have redundant hardware and software that makes the system available despite failures. Well-designed high availability systems avoid having single points-of-failure.
 
经过配置后能够提供接近全时可用性(full-time availability)的计算机系统被称为高可用性系统(high availability system)。这样的系统通常具备冗余的硬件及软件,从而确保系统在发生故障时保持可用。设计良好的高可用性系统中不存在单点脆弱性(single points-of-failure)。
 
006 Oracle has a number of products and features that provide high availability in cases of unplanned downtime or planned downtime.
 
Oracle 包含了一系列产品及特性,无论在非计划停机(unplanned downtime)还是计划停机(planned downtime)的情况下,都能确保系统的高可用性。
 
007

Overview of Unplanned Downtime

17.2 非计划停机概述

008 Various things can cause unplanned downtime. Oracle offers the following features to maintain high availability during unplanned downtime:
 
很多情况都会导致非计划停机(unplanned downtime)。Oracle 具备以下特性,从而确保系统在非计划停机时的高可用性:
 
009

Oracle Solutions to System Failures

17.2.1 Oracle 系统故障解决方案

010 This section covers some Oracle solutions to system failures, including the following:
 
本节介绍 Oracle 提供的系统故障解决方案,具体内容如下:
 
011

Overview of Fast-Start Fault Recovery

17.2.1.1 Fast-Start Fault Recovery 概述

012 Oracle Enterprise Edition includes fast-start fault recovery functionality to control instance recovery. It reduces the time required for cache recovery and makes recovery bounded and predictable by limiting the number of dirty buffers and the number of redo records generated between the most recent redo record and the last checkpoint.
 
Oracle 企业版中包含了用于控制实例恢复(instance recovery)的 Fast-Start Fault Recovery(快速启动故障恢复)功能。此功能可以限制脏缓冲区(dirty buffer)的数量及最后一次检查点(checkpoint)与当前时间点之间产生的重做记录(redo record)数量,从而减少了缓存恢复(cache recovery)所需的时间,并使其可预测,可以在限定的时间内完成。
 
013 The foundation of fast-start recovery is the fast-start checkpointing architecture. Instead of the conventional event driven (that is, log switching) checkpointing, which does bulk writes, fast-start checkpointing occurs incrementally. Each DBWn process periodically writes buffers to disk to advance the checkpoint position. The oldest modified blocks are written first to ensure that every write lets the checkpoint advance. Fast-start checkpointing eliminates bulk writes and the resultant I/O spikes that occur with conventional checkpointing.
 
Fast-Start Fault Recovery 是基于 fast-start checkpointing architecture(快速启动检查点架构)的。以往的检查点由事件(例如,日志切换(log switching))驱动进行批量写入(bulk write),而 fast-start checkpointing 则是增量地执行的。每个 DBWn 进程都能够周期性地将缓冲区写入磁盘,使检查点的位置前进。每次写入操作将写入最早被修改的数据块,从而确保检查点位置前进。Fast-start checkpointing 能够消除常规检查点造成的批量写入以及随之而来的 I/O 显著增长。
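
The incremental checkpointing idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Oracle's implementation: the class `BufferCache` and its methods are hypothetical names, redo positions are plain integers, and each dirty buffer is tracked by the position of its first unwritten change. It shows why writing the oldest modified blocks first guarantees that every write lets the checkpoint advance.

```python
from collections import OrderedDict


class BufferCache:
    """Toy model: dirty buffers ordered by the redo position of their first change."""

    def __init__(self):
        # block_id -> redo position of the first change not yet written to disk
        self.dirty = OrderedDict()
        # position of the most recent redo record
        self.redo_pos = 0

    def modify(self, block_id):
        """Record one change; a block keeps its place in the queue once it is dirty."""
        self.redo_pos += 1
        self.dirty.setdefault(block_id, self.redo_pos)

    def checkpoint_position(self):
        """Recovery would start at the oldest unwritten change, or at the log tail."""
        if not self.dirty:
            return self.redo_pos
        return next(iter(self.dirty.values()))

    def write_oldest(self, count=1):
        """Incremental write: flush the oldest dirty buffers first."""
        for _ in range(min(count, len(self.dirty))):
            self.dirty.popitem(last=False)

    def redo_to_apply(self):
        """Number of redo records that instance recovery would have to read."""
        if not self.dirty:
            return 0
        return self.redo_pos - self.checkpoint_position() + 1


cache = BufferCache()
for block in ["A", "B", "C", "A", "D"]:
    cache.modify(block)
print(cache.redo_to_apply())   # all five records are still needed
cache.write_oldest(1)          # writing block A moves the checkpoint forward
print(cache.redo_to_apply())
```

In this model, writing any buffer other than the oldest one would leave the checkpoint position unchanged, which is the reason for the oldest-first ordering.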
 
014 With fast-start fault recovery, the Oracle database is opened for access by applications without having to wait for the undo, or rollback, phase to be completed. The rollback of data locked by uncommitted transactions is done dynamically on an as-needed basis. If a user process encounters a row locked by a crashed transaction, then it rolls back just that row. The impact of rolling back the rows requested by a query is negligible.
 
采用 Fast-Start Fault Recovery 后,访问 Oracle 数据库的应用程序无需等待还原(undo)(即回滚(rollback))完成。Oracle 能够在需要时动态地回滚被未提交事务锁住的数据。如果一个用户进程(user process)遇到了被其他崩溃事务(crashed transaction)锁住的数据行,此进程可以回滚数据行。在查询时导致的回滚操作对查询无影响。
 
015 Fast-start fault recovery is very fast, because undo data is stored in the database, not in the log files. Undoing a block does not require an expensive sequential scan of a log file. It is simply a matter of locating the right version of the data block within the database.
 
Fast-Start Fault Recovery 执行很快,因为还原数据就储存在数据库中,而非重做日志文件(log file)中。还原数据块时无需执行代价高昂的日志文件顺序扫描操作,而只需在数据库内确定数据块的正确版本即可。
 
016 Fast-start recovery can greatly reduce mean time to recover (MTTR) with minimal effects on online application performance. Oracle continuously estimates the recovery time and automatically adjusts the checkpointing rate to meet the target recovery time.
 
Fast-Start Fault Recovery 可以显著地缩短 mean time to recover (MTTR)(平均恢复时间),而对联机应用程序的性能影响很小。Oracle 能够持续地预测恢复时间,并自动地调整检查点发生率,从而保证实现目标恢复时间(target recovery time)。
 
017

See Also:

Oracle Database Performance Tuning Guide for information on fast-start fault recovery

另见:

Oracle Database Performance Tuning Guide 了解关于 fast-start fault recovery 的信息
018

Overview of Real Application Clusters

17.2.1.2 Real Application Clusters 概述

019 Real Application Clusters (RAC) databases are inherently high availability systems. The clusters that are typical of RAC environments can provide continuous service for both planned and unplanned outages. RAC builds higher levels of availability on top of the standard Oracle features. All single instance high availability features, such as fast-start recovery and online reorganizations, apply to RAC as well.
 
Real Application Clusters (RAC)(实时应用集群)数据库本身就是高可用性系统。RAC 系统在发生计划停机及非计划停机时均能提供不间断的服务。RAC 在标准 Oracle 特性之上提供了更高级别的可用性。在单实例系统中使用的高可用性特性,例如 fast-start recovery(快速启动故障恢复)以及联机重组(online reorganization)等,均可应用于 RAC 系统。
 
020 In addition to all the regular Oracle features, RAC exploits the redundancy provided by clustering to deliver availability with n-1 node failures in an n-node cluster. In other words, all users have access to all data as long as there is one available node in the cluster.
 
除了常规的 Oracle 高可用性特性之外,RAC 系统发挥了集群提供的冗余能力,在 n 个节点构成的集群中,n-1 个节点发生故障时系统依然具备可用性。换句话说,只要集群中有一个节点正常工作,就能保证所有用户对所有数据的访问。
 
021

Oracle Solutions to Data Failures

17.2.2 Oracle 数据故障解决方案

022 This section covers some Oracle solutions to data failures, including the following:
 
本节介绍 Oracle 数据故障解决方案,具体内容如下:
 
023

Overview of Backup and Recovery Features for High Availability

17.2.2.1 高可用性系统的备份与恢复特性概述

024 A system or network fault may prevent users from accessing data, but media failures without proper backups can lead to lost data that cannot be recovered. In addition to fast-start fault recovery and mean time to recovery, Oracle provides several solutions to protect against and recover from data and media failures. These include the following:
  • Recovery Manager (RMAN) is Oracle's utility to manage the backup and recovery of the database. It determines the most efficient method of running the requested backup, restore, or recovery operation. RMAN and the server automatically identify modifications to the structure of the database and dynamically adjust the required operation to adapt to the changes. You have the option to specify the maximum disk space when restoring logs during media recovery, thus enabling an efficient space management during the recovery process.
  • Oracle Flashback Database lets you quickly recover an Oracle database to a previous time to correct problems caused by logical data corruptions or user errors.
  • Oracle Flashback Query lets you view data at a point-in-time in the past. This can be used to view and reconstruct lost data that was deleted or changed by accident. Developers can use this feature to build self-service error correction into their applications, empowering end-users to undo and correct their errors.
  • Backup information can be stored in an independent flash recovery area. This increases the resilience of the information, and allows easy querying of backup information. It also acts as a central repository for backup information for all databases across the enterprise, providing a single point of management.
  • When performing a point in time recovery, you can query the database without terminating recovery. This helps determine whether errors affect critical data or non-critical structures, such as indexes. Oracle also provides trial recovery in which recovery continues but can be backed out if an error occurs. It can also be used to "undo" recovery if point in time recovery has gone on for too long.
  • With Oracle's block-level media recovery, if only a single block is damaged, then only that block needs to be recovered. The rest of the file, and thus the table containing the block, remains online and accessible.
  • LogMiner lets a DBA find and correct unwanted changes. Its simple SQL interface allows searching by user, table, time, type of update, value in update, or any combination of these. LogMiner provides SQL statements needed to undo the erroneous operation. The GUI interface shows the change history. Damaged log files can be searched with the LogMiner utility, thus recovering some of the transactions recorded in the log files.
在 fast-start fault recovery(快速启动故障恢复)及 mean time to recovery(平均恢复时间)功能之外,Oracle 还提供了多种在数据故障(data failure)及介质故障(media failure)发生时进行恢复的功能。系统故障或网络故障只会影响用户对数据的访问,而介质故障则会导致数据丢失,如果没有完善的备份机制,数据将无法恢复。以下介绍主要的备份与恢复特性:
  • Recovery Manager(恢复管理器,RMAN)是管理数据备份与恢复的 Oracle 工具。它能以最高效的方式执行备份,复原,及恢复操作。RMAN 及 Oracle 数据库服务器能够自动地识别数据库结构的改变,并动态地调整备份恢复操作以适应结构的改变。用户可以设定在介质恢复期间用于复原日志文件的最大磁盘空间,这能够使恢复过程中的空间管理更加高效。
  • Oracle Flashback Database(回闪数据库)令用户能够快速地将数据库恢复到之前的某个时间点,从而修复逻辑数据问题或用户误操作。
  • Oracle Flashback Query(回闪查询)令用户能够查询数据在过去某个时间点的状态。此功能可供用户查询及恢复误操作导致的数据删除或修改。开发者可以利用此特性在应用程序中添加自助错误数据修复功能,使终端用户能够还原或修正数据错误。
  • 备份数据可以被存储在独立的 flash recovery area(回闪恢复区)中。这增加了备份数据的使用灵活性,也使对备份信息的查询更为容易。Flash recovery area 可以作为整个企业内所有数据库的备份数据中央资料库,以便于集中管理。
  • 当执行按时间点恢复(point in time recovery)时,用户依然可以查询数据库而无需中断恢复操作。这有助于用户判断错误是否影响了关键数据,或只涉及非关键数据(例如,索引)。Oracle 还提供试验恢复(trial recovery)功能,即在恢复过程中发生错误时,相关的恢复操作可以被回退。用户在执行按时间点恢复时可以利用此功能还原超过预计时间点的恢复数据。
  • Oracle 具备数据块级介质恢复(block-level media recovery)功能,如果只有少量数据块损坏,只需单独对其进行恢复。相关文件的剩余部分,以及使用损坏数据块的表都可以保持联机,并可被用户访问。
  • LogMiner 可以帮助 DBA 找出并修正意外的数据修改。LogMiner 以 SQL 作为接口,用户可以根据用户名,表名,时间,更新类型,以及更新值,或以上条件的组合进行查询。LogMiner 还能给出用于还原错误操作所需的 SQL 语句。用户可以使用 LogMiner 的 GUI 察看数据修改的历史记录。LogMiner 能够搜索被损坏的日志文件,并恢复记录在其中的事务信息。
026

Overview of Partitioning

17.2.2.2 分区概述

027 Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions. SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, after partitions are defined, DDL statements can access and manipulate individual partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects. Also, partitioning is entirely transparent to applications.
 
分区技术(partitioning)使用户可以将大表或大索引分解为更小且更易管理的分区(partition),从而解决大数据量对象带来的问题。存取分区表的 SQL 查询及 DML 语句与应用于普通表的语句完全相同。定义了分区表后,DDL 语句可以单独操作某个分区,而不是整个表或索引。因此,分区技术能够简化对大数据库对象的管理,同时对应用程序完全透明。
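
As a rough illustration of the idea, the following sketch models a range-partitioned table in plain Python. It is not how Oracle stores partitions; the class `RangePartitionedTable`, its methods, and the exclusive upper bounds (in the spirit of a "values less than" rule) are assumptions made for this example. Callers insert and select by key without naming a partition, while a maintenance operation can target one partition alone.

```python
import bisect


class RangePartitionedTable:
    """Toy range-partitioned table: partition i holds keys below upper_bounds[i]."""

    def __init__(self, upper_bounds):
        # Exclusive upper bounds, assumed to be sorted in ascending order.
        self.bounds = list(upper_bounds)
        self.partitions = [[] for _ in self.bounds]

    def _index(self, key):
        """Find the single partition that can contain this key."""
        i = bisect.bisect_right(self.bounds, key)
        if i == len(self.bounds):
            raise ValueError(f"key {key} is above the highest partition bound")
        return i

    def insert(self, key, row):
        """Callers never name a partition; routing is transparent."""
        self.partitions[self._index(key)].append((key, row))

    def select(self, key):
        """Only one partition is scanned for a lookup by key."""
        return [row for k, row in self.partitions[self._index(key)] if k == key]

    def truncate_partition(self, i):
        """Maintenance on one partition leaves the others untouched."""
        self.partitions[i].clear()


table = RangePartitionedTable([100, 200, 300])
table.insert(50, "a")
table.insert(100, "b")
table.insert(250, "c")
table.truncate_partition(0)     # removes only the rows with keys below 100
print(table.select(250))
```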
 
028

See Also:

Chapter 18, "Partitioned Tables and Indexes"

另见:

第 18 章,“Partitioned Tables and Indexes”
029

Overview of Transparent Application Failover

17.2.2.3 Transparent Application Failover 概述

030 Transparent Application Failover enables an application user to automatically reconnect to a database if the connection fails. Active transactions roll back, but the new database connection, made by way of a different node, is identical to the original. This is true regardless of how the connection fails.
 
Transparent Application Failover(透明应用故障恢复)可以使应用程序用户在遇到连接故障(connection fail)时自动地重新连接到数据库。原有的活动事务将被回滚,新的数据库连接与原来的连接完全相同(也许连接到了不同的节点上)。
 
031 With Transparent Application Failover, a client notices no loss of connection as long as there is one instance left serving the application. The database administrator controls which applications run on which instances and also creates a failover order for each application. This works best with Real Application Clusters (RAC): If one node dies, then you can quickly reconnect to another node in the cluster.
 
采用了 Transparent Application Failover 技术后,只要系统中至少存在一个实例为应用程序提供服务,用户就不会察觉出连接故障。DBA 可以控制实例为哪些应用程序提供服务,也可以控制应用程序在进行故障切换(failover)时选择实例的顺序。Transparent Application Failover 应与 RAC 结合才能发挥最佳效果:如果一个节点出现故障,用户可以被自动地重新连接到集群中的其他节点。
 
032

Elements Affected by Transparent Application Failover

17.2.2.3.1 受 Transparent Application Failover 影响的程序结构

033 During normal client/server database operations, the client maintains a connection to the database so the client and server can communicate. If the server fails, then so does the connection. The next time the client tries to use the connection, the client receives an error. At this point, the user must log in to the database again.
 
在常规的 C/S 体系结构下进行数据库操作时,客户端会与数据库建立一个连接(connection)以便和服务器通信。如果服务器出现故障,连接也会停止。如果客户端再次使用此连接将得到错误提示。此时,用户必须重新登录到数据库。
 
034 With Transparent Application Failover, however, Oracle automatically obtains a new connection to the database. This enables users to continue working as if the original connection had never failed.
 
而采用 Transparent Application Failover 技术后,连接中断后 Oracle 能自动地建立客户端与服务器间的新连接。因此用户可以继续工作,如同原有连接没发生故障一样。
 
035 There are several elements associated with active database connections. These include:
  • Client/server database connections
  • Users' database sessions running statements
  • Open cursors used for fetching
  • Active transactions
  • Server-side program variables
以下程序结构(element)与活动数据库连接(active database connection)有关:
  • C/S 数据库连接
  • 用户运行 SQL 语句所用的数据库会话(session)
  • 打开的游标(open cursor),用于数据获取(fetching)
  • 活动事务(active transaction)
  • 服务端的程序变量(server-side program variable)
036 Transparent Application Failover can be used to restore client/server database connections, users' database sessions and optionally an active query. To restore other elements of an active database connection, such as active transactions and server-side package state, the application code must be capable of re-running statements that occurred after the last commit.
 
Transparent Application Failover 技术可以复原 C/S 数据库连接及用户数据库会话,也可以复原活动查询(active query)。如需复原与活动数据库连接(active database connection)相关的其他程序结构,例如活动事务或服务端程序包状态,开发者必须在应用程序代码中重新运行最后一次提交操作(commit)后运行的语句。
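
The last point, that application code must be able to re-run statements issued after the last commit, can be sketched as follows. This is a simplified model and not the Transparent Application Failover API: `Node`, `ReplayingSession`, and `ConnectionLost` are hypothetical names, statements are reduced to key/value writes, and the shared dictionary stands in for committed data visible from every node.

```python
class ConnectionLost(Exception):
    """Raised by the toy node once it has failed."""


class Node:
    """Hypothetical stand-in for one database instance."""

    def __init__(self, name, store):
        self.name = name
        self.store = store      # committed data shared by all nodes
        self.alive = True
        self.pending = {}       # uncommitted changes of the open transaction

    def execute(self, key, value):
        if not self.alive:
            raise ConnectionLost(self.name)
        self.pending[key] = value

    def commit(self):
        if not self.alive:
            raise ConnectionLost(self.name)
        self.store.update(self.pending)
        self.pending = {}


class ReplayingSession:
    """Application-side wrapper that replays uncommitted work after a failover."""

    def __init__(self, nodes):
        self.nodes = list(nodes)    # failover order
        self.current = 0
        self.since_commit = []      # statements issued after the last commit

    def _failover(self):
        # Work pending on the failed node is lost, so replay it on the next node.
        while True:
            self.current += 1
            if self.current >= len(self.nodes):
                raise ConnectionLost("no surviving node")
            node = self.nodes[self.current]
            try:
                for key, value in self.since_commit:
                    node.execute(key, value)
                return
            except ConnectionLost:
                continue

    def execute(self, key, value):
        while True:
            try:
                self.nodes[self.current].execute(key, value)
                break
            except ConnectionLost:
                self._failover()
        self.since_commit.append((key, value))

    def commit(self):
        while True:
            try:
                self.nodes[self.current].commit()
                break
            except ConnectionLost:
                self._failover()
        self.since_commit.clear()


store = {}
node_a, node_b = Node("a", store), Node("b", store)
session = ReplayingSession([node_a, node_b])
session.execute("x", 1)
session.commit()
session.execute("y", 2)
node_a.alive = False            # the node fails with an open transaction
session.execute("z", 3)         # the session fails over and replays "y" first
session.commit()
print(sorted(store.items()))
```

The design choice worth noting is that the replay log is cleared only on a successful commit, so it always holds exactly the work that a failure would roll back.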
 
037

See Also:

Oracle Database Net Services Administrator's Guide

另见:

Oracle Database Net Services Administrator's Guide
038

RAC High Availability Event Notification

17.2.2.3.2 RAC 高可用性事件通知

039 OCI and JDBC (Thick) clients can register for RAC high availability event notification and take appropriate action when an event occurs. With this, you can improve connection failover response time and remove stale connections from connection pools and session pools. Reducing failure detection time allows Transparent Application Failover to react more quickly when failures do occur, benefiting client applications running during a node or instance failure.
 
OCI 及 JDBC(Thick)客户端可以注册(register)RAC 高可用性事件通知,并在事件发生时作出适当反应。利用此项功能,开发者可以优化连接故障切换的响应时间,并从连接池及会话池中及时移除有故障的连接。由于故障侦测时间缩短,Transparent Application Failover 能够在发生故障时更快地反应,提高了节点故障或实例故障发生时客户端应用程序的服务质量。
 
040 Clients must connect to a database service that has been enabled for Oracle Streams Advanced Queuing high availability notifications. Database services may be modified with Enterprise Manager to support these notifications. Once enabled, clients can register a callback that is invoked whenever a high availability event occurs.
 
如需应用高可用性事件通知,数据库服务必须启用 Oracle Streams Advanced Queuing(数据流高级队列)高可用性通知。用户可以在Enterprise Manager(企业管理器)中开启相关功能。功能开启后,客户端就可以注册一个回调操作(callback ),当高可用性事件发生时就能够被调用。
 
041

Note:

With JDBC Thick clients, event notification is limited to connection pools.

提示:

在使用 JDBC Thick 客户端(胖客户端)时,事件通知只能作用于连接池(connection pool)。
042

See Also:

Oracle Call Interface Programmer's Guide

另见:

Oracle Call Interface Programmer's Guide
043

Oracle Solutions to Disasters

17.2.3 Oracle 灾难解决方案

044 Oracle's primary solution to disasters is the Oracle Data Guard product.
 
Oracle Data Guard 产品是 Oracle 中主要的灾难解决方案。
 
045

Overview of Oracle Data Guard

17.2.3.1 Oracle Data Guard 概述

046 Oracle Data Guard lets you maintain uptime automatically and transparently, despite failures and outages. Oracle Data Guard maintains up to nine standby databases, each of which is a real-time copy of the production database, to protect against all threats—corruptions, data failures, human errors, and disasters. If a failure occurs on the production (primary) database, then you can fail over to one of the standby databases to become the new primary database. In addition, planned downtime for maintenance can be reduced, because you can quickly and easily move (switch over) production processing from the current primary database to a standby database, and then back again.
 
Oracle Data Guard 可以在系统发生故障或断电时自动地且透明地保证系统的可用性。Oracle Data Guard 最多支持九个备用数据库(standby database),备用数据库即生产数据库(production database)的实时副本,可用于消除数据库面对的各种威胁——系统故障,数据故障,人为错误,及灾难等问题。如果生产数据库发生故障,用户可以进行故障切换,将一个备用数据库转换为生产数据库。此外,因维护而导致的计划停机时间也可以缩短,因为用户可以迅速地将业务处理工作从当前生产数据库切换到备用数据库,同样也可以切换回原来的状态。
 
047 Fast-start failover provides the ability to automatically, quickly, and reliably fail over to a designated, synchronized standby database in the event of loss of the primary database, without requiring that you perform complex manual steps to invoke the failover. This lets you maintain uptime transparently and increase the degree of high availability for system failures, data failures, and site outages, as well as the robustness of disaster recovery.
 
利用 fast-start failover(快速启动故障切换)功能,当主数据库(primary database)发生故障时能够自动且可靠地进行故障切换,将业务处理工作迅速切换到预设的与主数据库同步的备用数据库上,整个故障切换工作无需用户进行复杂的手工操作。此功能透明地提高了系统的在线时间(uptime),并增加了数据库在系统故障,数据故障,及断电等情况下的高可用性级别,也增加了灾难恢复方案的健壮性。
 
048

Oracle Data Guard Configurations

17.2.3.1.1 Oracle Data Guard 结构

049 An Oracle Data Guard configuration is a collection of loosely connected systems, consisting of a single primary database and up to nine standby databases that can include a mix of both physical and logical standby databases. The databases in a Data Guard configuration can be connected by a LAN in the same data center, or—for maximum disaster protection—geographically dispersed over a WAN and connected by Oracle Net Services.
 
Oracle Data Guard 系统是由一组松散连接的数据库构成的,其中包括一个主数据库(single primary database)及最多九个备份数据库,备份数据库既可以为物理备份数据库(physical standby database),也可以为逻辑备份数据库(logical standby database)。Data Guard 系统中的数据库可以位于同一数据中心内并通过 LAN 连接,也可以是在地理上分散的,通过 WAN 及 Oracle Net Services(Oracle 网络服务)连接(后者能够提供最大的灾难保护能力)。
 
050 A Data Guard configuration can be deployed for any database. This is possible because its use is transparent to applications; no application code changes are required to accommodate a standby database. Moreover, Data Guard lets you tune the configuration to balance data protection levels and application performance impact; you can configure the protection mode to maximize data protection, maximize availability, or maximize performance.
 
任何数据库都可以部署为 Data Guard 系统,因为 Data Guard 对应用软件来说是透明的,即无需修改应用程序代码来满足备份数据库的要求。此外,用户可以调整 Data Guard 系统的设置,在数据保护级别与应用程序性能之间进行平衡。用户可选的设置有最大数据保护能力,最大可用性,及最大性能。
 
051 As application transactions make changes to the primary database, the changes are logged locally in redo logs. For physical standby databases, the changes are applied to each physical standby database that is running in managed recovery mode. For logical standby databases, the changes are applied using SQL regenerated from the archived redo logs.
 
应用程序中的事务首先对主数据库进行修改,这些修改也会被记录到主数据库的重做日志内。对于物理备份数据库,修改将被应用到运行于恢复模式下的物理备份数据库中。对于逻辑备份数据库,将利用归档重做日志重新生成 SQL 来应用修改。
 
052

Physical Standby Databases

17.2.3.1.2 物理备份数据库

053 A physical standby database is physically identical to the primary database. While the primary database is open and active, a physical standby database is either performing recovery (by applying logs), or open for reporting access. When it is not performing recovery, a physical standby database can be queried in read-only mode while the production database continues to ship redo data to the physical standby site.
 
物理备份数据库(physical standby database)在物理上与主数据库(primary database)完全相同。在主数据库处于打开(open)状态时,物理备份数据库或者执行恢复(即应用重做日志),或者打开提供只读访问(例如,报表服务)。当物理备份数据库没有执行恢复时可以被只读地访问,此时主数据库可以持续地向物理备份节点发送重做日志数据。
 
054 The on-disk database structures of a physical standby must be identical to those of the primary database on a block-for-block basis, because a recovery operation applies changes block-for-block using the physical rowid. The database schema, including indexes, must be the same, and the database cannot be opened (other than for read-only access). If opened, the physical standby database will have different rowids, making continued recovery impossible.
 
物理备份数据库的磁盘存储结构必须和主数据库完全相同,即数据块一一对应,因为在备份数据库上的恢复操作使用物理 rowid(physical rowid)按数据块应用修改信息。物理备份数据库的模式(schema),包括索引等对象也必须和主数据库完全相同,且数据库不能处于打开状态(但可以以只读模式打开)。如果物理备份数据库被打开,其中可能出现与主数据库不同的 rowid,这将使其上的恢复操作无法执行。
 
055

Logical Standby Databases

17.2.3.1.3 逻辑备份数据库

056 A logical standby database takes standard Oracle archived redo logs, transforms the redo records they contain into SQL transactions, and then applies them to an open standby database. Although changes can be applied concurrently with end-user access, the tables being maintained through regenerated SQL transactions allow read-only access to users of the logical standby database. Because the database is open, it is physically different from the primary database. The database tables can have different indexes and physical characteristics from their primary database peers, but must maintain logical consistency from an application access perspective, to fulfill their role as a standby data source.
 
逻辑备份数据库(logical standby database)能够读取主数据库(primary database)的归档重做日志(archived redo logs),将其中的重做记录(redo record)转换为 SQL 事务并应用到数据库中。逻辑备份数据库可以在应用修改数据的同时被终端用户并发地访问,但是正在应用 SQL 事务的表只能允许被只读访问。由于逻辑备份数据库处于打开(open)状态,因此她在物理存储上与主数据库不同。在逻辑备份数据库中,表的物理属性及其上创建的索引均可与主数据库中的对应表不同;但逻辑备份数据库必须确保对应用程序的逻辑一致性(logical consistency),否则将无法作为备份数据源(standby data source)。
 
057

Oracle Data Guard Broker

17.2.3.1.4 Oracle Data Guard Broker

058 Oracle Data Guard Broker automates complex creation and maintenance tasks and provides dramatically enhanced monitoring, alert, and control mechanisms. It uses background agent processes that are integrated with the Oracle database server and associated with each Data Guard site to provide a unified monitoring and management infrastructure for an entire Data Guard configuration. Two user interfaces are provided to interact with the Data Guard configuration, a command-line interface (DGMGRL) and a graphical user interface called Data Guard Manager.
 
Oracle Data Guard Broker 能使 Data Guard 的实施与维护工作简单化,并能显著地增强监视,告警,及控制机制。Data Guard Broker 的后台代理进程(background agent process)与主数据库服务器及各个 Data Guard 节点相集成,从而对整个 Data Guard 系统进行监控与管理。Oracle 提供了两种用户接口,其一是名为 DGMGRL 的命令行工具,其二是名为 Data Guard Manager 的图形化工具。
 
059 Oracle Data Guard Manager, which is integrated with Oracle Enterprise Manager, provides wizards to help you easily create, manage, and monitor the configuration. This integration lets you take advantage of other Enterprise Manager features, such as the event service for alerts, the discovery service for easier setup, and the job service to ease maintenance.
 
Oracle Data Guard Manager 是集成在 Oracle Enterprise Manager 之中的,她具备向导功能,从而使创建,管理,及监控 Data Guard 系统的工作更为简单。Data Guard Manager 可以利用 Enterprise Manager 的功能,例如利用事件服务(event service)进行告警,利用发现服务(discovery service)使设置工作更简单,或利用作业服务(job service)来完成对 Data Guard 的维护。
 
060

Data Guard with RAC

17.2.3.1.5 Data Guard 与 RAC

061 RAC enables multiple independent servers that are linked by an interconnect to share access to an Oracle database, providing high availability, scalability, and redundancy during failures. RAC and Data Guard together provide the benefits of system-level, site-level, and data-level protection, resulting in high levels of availability and disaster recovery without loss of data:
  • RAC addresses system failures by providing rapid and automatic recovery from failures, such as node failures and instance crashes. It also provides increased scalability for applications.
  • Data Guard addresses site failures and data protection through transactionally consistent primary and standby databases that do not share disks, enabling recovery from site disasters and data corruption.
RAC 的特点是多个独立的 Oracle 实例互连,并共享同一个 Oracle 数据库,从而实现系统的高可用性及可伸缩性,并在发生故障时提供冗余。将 RAC 与 Data Guard 相结合,就能同时在系统级(system-level),位置级(site-level),数据级(data-level)提供保障,实现高可用性以及无数据丢失的灾难恢复:
  • RAC 用于解决系统故障。RAC 能够在出现节点故障或实例崩溃时迅速地进行自动恢复。同时,RAC 还能增加应用系统的可伸缩性。
  • Data Guard 用于解决位置故障(site failure)并提供数据保护。Data Guard 能够令使用不同磁盘的主数据库(primary database)及备份数据库(standby database)之间具备事务一致性,从而在发生位置灾难(site disaster)及数据故障时进行恢复。
062 Many different architectures using RAC and Data Guard are possible depending on the use of local and remote sites and the use of nodes and a combination of logical and physical standby databases.
 
采用 RAC 及 Data Guard 可以组合出多种系统体系结构,这取决于 Data Guard 的位置处于远程还是本地,Data Guard 备份数据库是逻辑的还是物理的,以及 RAC 的节点情况。
 
064

Oracle Solutions to Human Errors

17.2.4 Oracle 人为错误解决方案

065 This section covers some Oracle solutions to human errors, including the following:
 
本节介绍 Oracle 提供的人为错误解决方案,具体内容如下:
 
066

Overview of Oracle Flashback Features

17.2.4.1 Oracle Flashback 特性概述

067 If a major error occurs, such as a batch job being run twice in succession, the database administrator can request a Flashback operation that quickly recovers the entire database to a previous point in time, eliminating the need to restore backups and do a point-in-time recovery. In addition to Flashback operations at the database level, it is also possible to flash back an entire table. Similarly, the database can recover tables that have been inadvertently dropped by a user.
  • Oracle Flashback Database lets you quickly bring your database to a prior point in time by undoing all the changes that have taken place since that time. This operation is fast, because you do not need to restore the backups. This in turn results in much less downtime following data corruption or human error.
  • Oracle Flashback Table lets you quickly recover a table to a point in time in the past without restoring a backup.
  • Oracle Flashback Drop provides a way to restore accidentally dropped tables.
  • Oracle Flashback Query lets you view data at a point-in-time in the past. This can be used to view and reconstruct lost data that was deleted or changed by accident. Developers can use this feature to build self-service error correction into their applications, empowering end-users to undo and correct their errors.
  • Oracle Flashback Version Query uses undo data stored in the database to view the changes to one or more rows along with all the metadata of the changes.
  • Oracle Flashback Transaction Query lets you examine changes to the database at the transaction level. As a result, you can diagnose problems, perform analysis, and audit transactions.
当发生较严重的人为错误时,例如重复运行了批处理作业,DBA 可以利用 Flashback 操作将整个数据库恢复到之前的某个时间点,而无需先使用备份复原(restore)再执行按时间点恢复(point-in-time recovery)。除了可以在数据库级执行 Flashback 操作,也可以对一个表单独执行 Flashback 操作。Oracle Flashback Drop 可用于恢复被用户意外删除的数据表。
  • 用户可以使用 Oracle Flashback Database 还原某个时间点之后执行的所有操作,从而将整个数据库恢复到之前的某个时间点。这样的操作相对较快捷,因为此操作无需利用备份复原(restore)数据库。因此,Flashback Database 技术能够缩短发生数据故障及人为错误后的系统停机时间。
  • 用户可以使用 Oracle Flashback Table 将一个数据表迅速地恢复到之前的某个时间点,而无需利用备份进行复原。
  • Oracle Flashback Drop 用于复原被意外删除的数据表。
  • 用户可以利用 Oracle Flashback Query 查看数据在过去某个时间点的状态。此功能可用于查询或重建被误删除或误修改的数据。开发者可以利用此特性在应用程序中加入自助错误修正功能,使终端用户能够自己对其误操作进行修正或还原。
  • Oracle Flashback Version Query 可以利用数据库中的还原信息(undo data)来查询个别数据行的修改记录以及相关元数据(metadata)的修改记录。
  • Oracle Flashback Transaction Query 可以在事务级查看数据库中的修改。用户可以利用此功能诊断并分析系统中的问题,或对事务进行监控。
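
The difference between viewing data as of a past time and viewing every version of a row can be sketched with a small versioned store. This is an illustration only, not Oracle's undo mechanism: `VersionedTable` and its methods are hypothetical, timestamps are plain integers assumed to arrive in non-decreasing order per key, and a value of `None` marks a deleted row.

```python
import bisect


class VersionedTable:
    """Toy table that keeps every past row version, loosely like undo data."""

    def __init__(self):
        # key -> list of (timestamp, value); value None means the row was deleted
        self.history = {}

    def put(self, ts, key, value):
        """Insert or update a row at time ts (timestamps must not go backwards)."""
        self.history.setdefault(key, []).append((ts, value))

    def delete(self, ts, key):
        """Record a deletion as a version whose value is None."""
        self.put(ts, key, None)

    def as_of(self, ts):
        """Flashback-Query-style read: the table as it looked at time ts."""
        snapshot = {}
        for key, versions in self.history.items():
            times = [t for t, _ in versions]
            i = bisect.bisect_right(times, ts)
            if i and versions[i - 1][1] is not None:
                snapshot[key] = versions[i - 1][1]
        return snapshot

    def versions_between(self, key, start, end):
        """Flashback-Version-Query-style read: versions of one row in [start, end]."""
        return [(t, v) for t, v in self.history.get(key, []) if start <= t <= end]


salaries = VersionedTable()
salaries.put(1, "emp1", 1000)
salaries.put(5, "emp1", 1200)
salaries.delete(8, "emp1")          # accidental delete
print(salaries.as_of(6))            # the row as it was before the delete
print(salaries.versions_between("emp1", 2, 8))
```

An application could use the `as_of` result to reconstruct the deleted row, which is the self-service correction scenario described above.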
069

Overview of LogMiner

17.2.4.2 LogMiner 概述

070 Oracle LogMiner lets you query redo log files through a SQL interface. Redo log files contain information about the history of activity on a database. Oracle Enterprise Manager includes the Oracle LogMiner Viewer graphical user interface (GUI).
 
用户可以利用 Oracle LogMiner 提供的 SQL 接口查询重做日志文件(redo log file)中的信息。重做日志文件中包含了所有数据库活动的历史记录。在 Oracle Enterprise Manager 提供了采用图形化界面的 Oracle LogMiner Viewer,也可以进行相同的查询。
 
071 All changes made to user data or to the database dictionary are recorded in the Oracle redo log files. Therefore, redo log files contain all the necessary information to perform recovery operations. Because redo log file data is often kept in archived files, the data is already available. To take full advantage of all the features LogMiner offers, you should enable supplemental logging.
 
所有对用户数据及数据字典的修改都会被记录在重做日志文件中。因此,重做日志文件内包含了执行恢复操作(recovery operation)所需的所有信息。重做日志文件会被周期性的归档,因此历史修改信息也能获得。为了充分发挥 LogMiner 的特性,管理员应该启用补充重做日志(supplemental logging)功能。
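
To make the idea of deriving corrective SQL from recorded changes concrete, here is a minimal sketch. It does not use LogMiner or read redo log files; the change-record layout (`op`, `table`, `key`, `old`, `new`) and the helper names are assumptions invented for this example, and real undo statements produced by LogMiner look different.

```python
def quote(value):
    """Render a Python value as a simple SQL literal (strings and numbers only)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def where_clause(key):
    """Build a WHERE condition from the columns that identify the row."""
    return " AND ".join(f"{col} = {quote(val)}" for col, val in key.items())


def undo_sql(change):
    """Return a statement that reverses one recorded change."""
    op, table = change["op"], change["table"]
    if op == "INSERT":
        # Undo an insert by deleting the row it created.
        return f"DELETE FROM {table} WHERE {where_clause(change['key'])};"
    if op == "DELETE":
        # Undo a delete by re-inserting the old row image.
        cols = ", ".join(change["old"])
        vals = ", ".join(quote(v) for v in change["old"].values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals});"
    if op == "UPDATE":
        # Undo an update by restoring the old column values.
        sets = ", ".join(f"{c} = {quote(v)}" for c, v in change["old"].items())
        return f"UPDATE {table} SET {sets} WHERE {where_clause(change['key'])};"
    raise ValueError(f"unsupported operation: {op}")


mistake = {
    "op": "UPDATE",
    "table": "emp",
    "key": {"empno": 7369},
    "old": {"sal": 800},
    "new": {"sal": 8000},
}
print(undo_sql(mistake))
```

The sketch needs the old column values to build the reversing statement, which hints at why recording extra column data (as supplemental logging does) makes this kind of analysis more useful.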
 
072

See Also:

Chapter 11, "Oracle Utilities"

另见:

第 11 章,“Oracle Utilities”
073

Overview of Security Features for High Availability

17.2.4.3 高可用性中的安全特性概述

074 Oracle Internet Directory lets you manage the security attributes and privileges for users, including users authenticated by X.509 certificates. Oracle Internet Directory also enforces attribute-level access control. This enables read, write, or update privileges on specific attributes to be restricted to specific named users, such as an enterprise security administrator. Directory queries and responses can use SSL encryption for enhanced protection during authentication and other interactions. Other database security features including Virtual Private Database (VPD), Label Security, audit, and proxy authentication can be leveraged for these directory-based users when configured as enterprise users.
 
管理员可以利用 Oracle Internet Directory 管理用户的安全属性(security attribute)及权限(privilege),Internet Directory 还支持使用 X.509 认证进行用户身份验证(authenticate)。 Oracle Internet Directory 能够强制执行属性级(attribute-level)的访问控制。因此,管理员可以控制每个用户(例如,企业安全管理用户)对各个属性的读取,写入,及修改权限。对目录(Directory)的查询及响应可以采用 SSL 加密,这增加了对身份认证及其他操作的保护程度。Oracle 提供的其他安全特性还有虚拟专用数据库(Virtual Private Database,VPD),Label Security,及 audit 等。配置为企业用户的目录用户(directory-based user)可以采用代理身份认证(proxy authentication)。
 
075 The Oracle Advanced Security User Migration Utility assists in migrating existing database users to Oracle Internet Directory. After a user is created in the directory, organizations can continue to build new applications in a Web environment and leverage the same user identity in Oracle Internet Directory for provisioning the user access to these applications.
 
Oracle Advanced Security User Migration Utility(高级安全用户迁移工具)可以将数据库用户迁移到 Oracle Internet Directory 中。如果在 Directory 中创建了用户后,企业又部署了新的 Web 应用系统,可以更新 Directory 中的用户信息,从而实现对新应用权限的控制。
 
076

See Also:

Chapter 20, "Database Security"

另见:

第 20 章,“Database Security”
077

Overview of Planned Downtime

17.3 计划停机概述

078 Oracle provides a number of capabilities to reduce or eliminate planned downtime. These include the following:
 
Oracle 提供数个功能,用于减少或消除计划停机。这些功能包括:
 
079

System Maintenance

17.3.1 系统维护

080 Oracle provides a high degree of self-management, automating routine DBA tasks and reducing the complexity of space, memory, and resource administration. These capabilities include the following:
  • Automatic undo management: database administrators do not need to plan or tune the number and sizes of rollback segments or consider how to strategically assign transactions to a particular rollback segment.
  • Dynamic memory management to resize the Oracle shared memory components dynamically. Oracle also provides advisories to help administrators size the memory allocation for optimal database performance.
  • Oracle-managed files to automatically create and delete files as needed
  • Free space management within a table with bitmaps. Additionally, Oracle provides automatic extension of data files, so the files can grow automatically based on the amount of data in the files.
  • Data Guard for hardware and operating system maintenance
Oracle 数据库的自我管理(self-management)程度很高,即 Oracle 能够自动执行常规的 DBA 任务,并减少存储,内存,及资源管理的复杂性。这些功能包括:
  • 自动还原管理(automatic undo management):DBA 无需手工对回滚段(rollback segment)的数量与容量进行计划与调整,也无需考虑事务使用回滚段时采用的策略。
  • 动态内存管理(dynamic memory management)能够自动地调整 Oracle 共享内存组件的容量。Oracle 还能为内存容量分配提供建议,协助 DBA 优化数据库性能。
  • Oracle-managed files 能在需要时自动创建与删除数据文件。
  • 使用位图(bitmap)进行数据表可用空间管理(free space management)。此外,Oracle 支持数据文件自动扩展,即数据文件可以随着其中数据量的增长而自动扩展。
  • 在进行硬件及操作系统维护时,可以采用 Data Guard 来保证系统的可用性。
081

See Also:

 
082

17.3.2 Data Maintenance

083 Database administrators can perform a variety of online operations on table definitions, including online reorganization of heap-organized tables. This makes it possible to reorganize a table while users retain full access to it.
 
084 This online architecture provides the following capabilities:
  • Any physical attribute of the table can be changed online. The table can be moved to a new location, partitioned, or converted from one type of organization (such as heap-organized) to another (such as index-organized).
  • Many logical attributes can also be changed. Column names, types, and sizes can be changed, and columns can be added, deleted, or merged. One restriction is that the primary key of the table cannot be modified.
  • Secondary indexes on index-organized tables (IOTs) can be created and rebuilt online. Secondary indexes support efficient use of block hints (physical guesses), and invalid physical guesses can be repaired online.
  • Indexes can be created online and analyzed at the same time. The physical guess component of logical rowids (used in secondary indexes on index-organized tables) can also be fixed online.
  • The physical guess component of logical rowids stored in secondary indexes on IOTs can be fixed, which allows online repair of invalid physical guesses.
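
The role of a physical guess can be illustrated with a small toy model. The sketch below is not Oracle's implementation: the class names `ToyIOT` and `ToySecondaryIndex` and their methods are hypothetical. It assumes that a logical rowid pairs the row's primary key with a guess about the block holding the row, and that a stale guess falls back to a slower search by primary key until the guess is fixed.

```python
class ToyIOT:
    """Toy index-organized table: rows live in blocks and may move between them."""

    def __init__(self):
        self.blocks = {}  # block number -> {primary key: row}

    def insert(self, block_no, key, row):
        self.blocks.setdefault(block_no, {})[key] = row

    def find_block(self, key):
        """Slow path: locate a row using only its primary key."""
        for block_no, rows in self.blocks.items():
            if key in rows:
                return block_no
        raise KeyError(key)

    def move(self, key, new_block):
        """Relocate a row, as a block split or reorganization might."""
        row = self.blocks[self.find_block(key)].pop(key)
        self.insert(new_block, key, row)


class ToySecondaryIndex:
    """Maps an indexed value to a toy logical rowid: (primary key, guessed block)."""

    def __init__(self, table):
        self.table = table
        self.entries = {}

    def add(self, value, key):
        self.entries[value] = (key, self.table.find_block(key))

    def lookup(self, value):
        """Return (row, guess_was_valid)."""
        key, guess = self.entries[value]
        rows = self.table.blocks.get(guess, {})
        if key in rows:
            return rows[key], True  # fast path: the guess still points at the row
        # Stale guess: fall back to a search by primary key.
        return self.table.blocks[self.table.find_block(key)][key], False

    def fix_guesses(self):
        """Refresh every guess while the index stays usable."""
        for value, (key, _) in list(self.entries.items()):
            self.entries[value] = (key, self.table.find_block(key))


table = ToyIOT()
table.insert(1, "emp-7", {"name": "Lee"})
index = ToySecondaryIndex(table)
index.add("Lee", "emp-7")

table.move("emp-7", 2)                   # the row moves, so the guess goes stale
row, valid_before = index.lookup("Lee")  # still found, through the slow path
index.fix_guesses()                      # repair the guesses
_, valid_after = index.lookup("Lee")     # fast path again
```

In this model a stale guess never returns a wrong row; it only costs an extra search, which is why repairing guesses online is a performance measure rather than a correctness one.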
085

17.3.3 Database Maintenance

086 Oracle provides technology for maintaining database software with little or no database downtime. Patches can be applied to Real Application Clusters instances one at a time, so that database service remains available throughout.
 
087 A Real Application Clusters system can run in this mixed mode, with some instances patched and others not yet patched, for an arbitrary period in order to test the patch in the production environment. Once the patch is judged successful, the procedure is repeated for the remaining nodes in the cluster. When all nodes in the cluster have been patched, the rolling patch upgrade is complete, and all nodes are running the same version of Oracle.
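
The rolling procedure described above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not an Oracle tool or API: the `Node` class, the `rolling_patch` function, and the node and version names are all hypothetical. It shows only the ordering that keeps the service available, with one instance out of service at a time.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One cluster node; the names and versions used below are hypothetical."""
    name: str
    version: str
    online: bool = True


def rolling_patch(nodes, new_version):
    """Patch nodes one at a time; return how many nodes were online at each step."""
    online_counts = []
    for node in nodes:
        node.online = False          # take only this instance out of service
        online_counts.append(sum(n.online for n in nodes))
        node.version = new_version   # patch it while the other instances keep serving
        node.online = True           # return the patched instance to service
    return online_counts


cluster = [Node("node1", "v1"), Node("node2", "v1"), Node("node3", "v1")]
counts = rolling_patch(cluster, "v1-patched")
versions = {n.version for n in cluster}
```

With three nodes, two remain online at every step. Between the first and last iteration the cluster is in the mixed mode described above, and it ends with every node on the same version.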
 

A Terms with uncertain translations (format: yellow background)

 

B Oracle/database terms with uncertain translations (format: yellow background)

[005] single points-of-failure
[012] fast-start fault recovery
[019] online reorganizations
[030] Transparent Application Failover
[059] discovery service
[061] site-level
[061] site failures
[061] site disasters
[067] Oracle Flashback Drop
[067] Oracle Flashback Version Query
[067] Oracle Flashback Transaction Query
[071] supplemental logging

C Sentences with uncertain translations (format: yellow background)

[014] The impact of rolling back the rows requested by a query is negligible.
[024] This helps determine whether errors affect critical data or non-critical structures, such as indexes.
[030] Active transactions roll back, but the new database connection, made by way of a different node, is identical to the original.
[074] and proxy authentication can be leveraged for these directory-based users when configured as enterprise users.
[075] After a user is created in the directory, organizations can continue to build new applications in a Web environment and leverage the same user identity in Oracle Internet Directory for provisioning the user access to these applications.
[084] Secondary indexes support efficient use of block hints (physical guesses). Invalid physical guesses can be repaired online.
[084] Online fix-up of physical guess component of logical rowids (used in secondary indexes on index-organized tables) also can be used.
[084] Fix the physical guess component of logical rowids stored in secondary indexes on IOTs. This allows online repair of invalid physical guesses.

D Annotations (format: [green])

 

E Unfinished links


[016] mean time to recover (MTTR)
[028] Partitioned Tables and Indexes
[076] Database Security

F Oracle study questions (format: yellow background)
1. What does "supplemental logging" refer to?
[071] To take full advantage of all the features LogMiner offers, you should enable supplemental logging.

2.
[024] You have the option to specify the maximum disk space when restoring logs during media recovery, thus enabling an efficient space management during the recovery process.

translator: zw1840@hotmail.com