Symantec Netbackup: falha na restauração do Oracle "saiu com status 13: falha na leitura do arquivo"

Symantec Netbackup: falha na restauração do Oracle "saiu com status 13: falha na leitura do arquivo"

Estou restaurando o tamanho do arquivo de dados Oracle de 1200 GB. O arquivo de dados é lido a partir do tamanho da peça de backup de 4.000 GB (4 TB).

Nota: Como monitorei 4 TB são processados ​​totalmente para ler todos os blocos do meu arquivo de dados.

Tive vários problemas durante a restauração:

Situação 1. Houve um período em que após 2 horas a restauração falhou (a coluna Operação no Netbackup Database Console estava vazia naquele momento)

Situação 2. A unidade foi montada antes de 2 horas e o processo de restauração não falhou antes de 2 horas. Mas de repente a restauração falhou após processar 55 GB.

Não tenho registros antigos. Agora estou com o seguinte problema:

  • A restauração falhou após o processamento de 3 TB (23 horas se passaram)... Vou postar algumas entradas de log, por favor ajude:

--------------- A guia Detalhes é exibida

23/08/2012 12:56:40 - imagem restaurada r2n1_1345046880 - (falha na leitura do arquivo (13)); horário de restauração 23:36:17 23/08/2012 12:56:44 PM - finalização da restauração; tempo decorrido: 23:36:26 a restauração não conseguiu recuperar os arquivos solicitados(5)

------------------------------ log bpbrm mostra

12:56:35.561 [7344.4804] <2> bpbrm brm_child_done: child 4852 exited with status 13: file read failed
12:56:35.561 [7344.4804] <2> bpbrm send_status_to_parent: bpbrm child is done, but the media manager child is not.
12:56:35.561 [7344.4804] <2> bpbrm tell_mm: sending media manager msg: STOP RESTORE r2n1_1345046880
12:56:40.709 [7344.4804] <2> bpbrm read_media_msg: read from media manager: EXIT r2n1_1345046880 150
12:56:40.709 [7344.4804] <2> bpbrm process_media_msg: media manager for backup id r2n1_1345046880 exited with status 150: termination requested by administrator
12:56:40.709 [7344.4804] <2> bpbrm send_status_to_parent: EXIT r2n1_1345046880 13 sent to parent process.
12:56:40.709 [7344.4804] <2> bpbrm read_media_msg: read from media manager: MEDIA NOT READY
12:56:40.771 [7344.4804] <2> bpbrm read_parent_msg: read from parent TERMINATE
12:56:40.771 [7344.4804] <2> bpbrm tell_mm: sending media manager msg: TERMINATE

----------------bprd

12:56:40.725 [3292.7188] <2> read_bpbrm_msg: (pid 7344) EXIT r2n1_1345046880 13
12:56:40.725 [3292.7188] <2> handle_image_status: r2n1_1345046880 restfiles pid 6696 bpbrm pid 7344 status = 13
12:56:40.725 [3292.7188] <2> remove_completed_restore: unlink record with restfiles pid 6696
12:56:40.725 [3292.7188] <2> remove_completed_restore: no more active restores
12:56:40.725 [3292.7188] <2> done_with_bpbrm: bpbrm pid 7344 has no more images to process
12:56:40.725 [3292.7188] <2> send_bpbrm_msg: (pid 7344) TERMINATE
12:56:40.725 [3292.7188] <2> start_indep_grp: active_mpx_groups = 0
12:56:40.725 [3292.7188] <2> start_indep_grp: p_active_restore_for_first_grp = NULL
12:56:40.725 [3292.7188] <2> start_indep_grp: p_first_mpx_group_for_restore = NULL
12:56:40.725 [3292.7188] <2> next_image_using_first_blks: returning NULL
12:56:40.725 [3292.7188] <2> run_restores: brm pid 0 has no active restores and none to start
12:56:40.725 [3292.7188] <2> start_indep_grp: active_mpx_groups = 0
12:56:40.725 [3292.7188] <2> start_indep_grp: p_active_restore_for_first_grp = NULL
12:56:40.725 [3292.7188] <2> start_indep_grp: p_first_mpx_group_for_restore = NULL
12:56:40.725 [6696.7284] <2> restorefiles: Done with port 0 backup_id r2n1_1345046880. Status = 13
12:56:40.725 [6696.7284] <2> add_msgs_to_progress_file: close and re-open the progress log file
12:56:40.725 [6696.7284] <2> log_disconnect: more than 30 mins since the last progress msg, closing sockets
12:56:40.725 [6696.7284] <2> put_short: (10) network write() error: An operation was attempted on something that is not a socket. ; socket = -1
12:56:40.725 [6696.7284] <2> bpcr_disconnect_rqst: bpcr protocol error - couldn't send request type
12:56:40.725 [6696.7284] <2> log_disconnect_all: Couldn't disconnect from bpcd
12:56:40.725 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_CMD_SOCK: An operation was attempted on something that is not a socket. 
12:56:40.725 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_STAT_SOCK: An operation was attempted on something that is not a socket. 
12:56:40.756 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:56:40.756 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpcd
12:56:40.803 [6696.7284] <2> logconnections: BPCD CONNECT FROM 192.168.59.63.52068 TO 192.168.34.150.13724
12:56:40.803 [6696.7284] <2> vnet_connect_to_vnetd_extra: ../../libvlibs/vnet_vnetd.c.188: msg: VNETD CONNECT FROM 192.168.59.63.52069 TO 192.168.34.150.13724 fd = 808
12:56:40.834 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.541: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
12:56:40.896 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.558: ipc_string: /tmp/vnet-32625345712215021290000000000-ERMDYb
12:56:41.754 [6696.7284] <2> get_short: (2) premature end of file (byte 1)
12:56:41.754 [6696.7284] <2> readclientstatus: bpcr short int read failed
12:56:41.754 [6696.7284] <2> readclientstatus: Access is denied. 
12:56:41.754 [6696.7284] <2> log_disconnect: Couldn't close progress file /usr/openv/netbackup/logs/user_ops/dbext/logs/17121.0.1345627225 on orcl.justice.gov.ge
12:56:41.754 [6696.7284] <2> log_disconnect: Client close errno = 1
12:56:41.754 [6696.7284] <2> put_short: (10) network write() error: An existing connection was forcibly closed by the remote host. ; socket = 780
12:56:41.754 [6696.7284] <2> bpcr_disconnect_rqst: bpcr protocol error - couldn't send request type
12:56:41.754 [6696.7284] <2> log_disconnect_all: Couldn't disconnect from bpcd
12:56:41.754 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_CMD_SOCK: An operation was attempted on something that is not a socket. 
12:56:41.754 [6696.7284] <2> log_disconnect_all: Couldn't close CLIENT_STAT_SOCK: An operation was attempted on something that is not a socket. 
12:56:41.754 [6696.7284] <2> restorefiles: Received finished from MPX main bprd. Stat = 0
12:56:41.754 [6696.7284] <2> mail_msg_and_set_exit_status: entered; status = 0
12:56:41.754 [6696.7284] <2> mail_msg_and_set_exit_status: Attempting to send mail to oracle on orcl.justice.gov.ge
12:56:41.770 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:56:41.770 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpcd
12:56:41.817 [6696.7284] <2> logconnections: BPCD CONNECT FROM 192.168.59.63.52070 TO 192.168.34.150.13724
12:56:41.817 [6696.7284] <2> vnet_connect_to_vnetd_extra: ../../libvlibs/vnet_vnetd.c.188: msg: VNETD CONNECT FROM 192.168.59.63.52071 TO 192.168.34.150.13724 fd = 780
12:56:41.863 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.541: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
12:56:41.879 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.558: ipc_string: /tmp/vnet-32628345712216006586000000000-fgV51f
12:56:42.285 [6696.7284] <2> mail_msg_and_set_exit_status: CLIENT_CMD_SOCK from bpcr = 808
12:56:42.285 [6696.7284] <2> mail_msg_and_set_exit_status: CLIENT_STAT_SOCK from bpcr = 780
12:56:42.690 [6696.7284] <2> bpcr_get_version_rqst: bpcd version: 07000000
12:56:42.909 [6696.7284] <2> bpcr_get_version_rqst: bpcd version: 07000000
12:56:43.330 [6696.7284] <2> bpcr_get_version_rqst: bpcd version: 07000000
12:56:43.767 [6696.7284] <2> mail_msg_and_set_exit_status: RESTORE EXIT STATUS = 5
12:56:43.782 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2054: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
12:56:43.782 [6696.7284] <2> vnet_vnetd_service_socket: ../../libvlibs/vnet_vnetd.c.2068: service: bpcd
12:56:43.829 [6696.7284] <2> logconnections: BPCD CONNECT FROM 192.168.59.63.52092 TO 192.168.34.150.13724
12:56:43.829 [6696.7284] <2> vnet_connect_to_vnetd_extra: ../../libvlibs/vnet_vnetd.c.188: msg: VNETD CONNECT FROM 192.168.59.63.52093 TO 192.168.34.150.13724 fd = 808
12:56:43.860 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.541: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
12:56:43.907 [6696.7284] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.558: ipc_string: /tmp/vnet-32635345712218044217000000000-EOpovo
12:56:44.750 [6696.7284] <2> job_monitoring_exex: ACK disconnect
12:56:44.750 [6696.7284] <2> job_disconnect: Disconnected
12:56:44.750 [6696.7284] <2> get_short: (2) premature end of file (byte 1)
12:56:44.750 [6696.7284] <2> readclientstatus: bpcr short int read failed
12:56:44.750 [6696.7284] <2> readclientstatus: Access is denied. 
12:56:44.750 [6696.7284] <2> log_disconnect: Couldn't close progress file /usr/openv/netbackup/logs/user_ops/dbext/logs/17121.0.1345627225 on orcl.justice.gov.ge
12:56:44.750 [6696.7284] <2> log_disconnect: Client close errno = 1
12:56:44.750 [6696.7284] <2> bpcr_disconnect_rqst: bpcr received the following message: 
12:56:44.750 [6696.7284] <2> terminate_restore: EXIT STATUS 5
12:56:44.750 [6696.7284] <2> bprd: C:\Program Files\Veritas\NetBackup\bin\userreq_notify.cmd restore orcl.justice.gov.ge oracle

----------------------RMAN

 failure of IRESTORE command on ch1_tape channel at 08/23/2012 12:57:17
RMAN-10038: database session for channel ch1_tape terminated unexpectedly

Observe que a restauração não foi interrompida manualmente.

Responder1

O RMAN-10038 normalmente significa que o processo Oracle (vinculado à biblioteca do Netbackup) responsável por lidar com a restauração foi abortado, possivelmente com um coredump. Verifique o alert*.log e procure os arquivos *.trc no diretório de diagnóstico do Oracle.

Provavelmente, este é um problema conhecidohttp://www.symantec.com/business/support/index?page=content&id=TECH77071

Um Emergency Engineering Binary (EEB) para ativar o Oracle 11.2 para NetBackup 6.5.5 está disponível no seguinte link: http://support.veritas.com/docs/346033

Para NetBackup 6.5.4: http://support.veritas.com/docs/346592

A tentativa de fazer backup do Oracle versão 11.2 sem essas correções resultará nas falhas abaixo:

RMAN-00571: =========================================================== 
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== 
RMAN-00571: =========================================================== 
RMAN-03009: failure of backup command on ch00 channel at 07/27/2010 09:23:23 
RMAN-10038: database session for channel ch00 terminated unexpectedly 

Fora isso, eu entraria em contato primeiro com o suporte da Oracle e depois com o suporte do Netbackup.

informação relacionada