Discussion:
[lustre-discuss] Lustre OSS kernel panic after mounting OSTs
Riccardo Veraldi
2018-10-30 12:05:38 UTC
Hello,

I have a very critical problem.

One of my OSSes hangs and goes into a kernel panic when trying to mount the OSTs.

After mounting 11 of the 12 OSTs it goes into a kernel panic.
The order in which they are mounted does not matter.

Any clues or hints?

I cannot really recover it and I have important data on it.

I already ran e2fsck, but it did not fix the problem. It did find a few
inode count inconsistencies.

kernel is 2.6.32-431.23.3.el6_lustre.x86_64

Red Hat Enterprise Linux Server release 6.7 (Santiago)

lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64


Oct 30 04:58:52 psanaoss231 kernel: INFO: task tgt_recov:4569 blocked
for more than 120 seconds.

Oct 30 04:58:52 psanaoss231 kernel:      Not tainted
2.6.32-431.23.3.el6_lustre.x86_64 #1
Oct 30 04:58:52 psanaoss231 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 30 04:58:52 psanaoss231 kernel: tgt_recov     D 0000000000000003    
0  4569      2 0x00000080
Oct 30 04:58:52 psanaoss231 kernel: ffff880bf2ae1da0 0000000000000046
0000000000000000 0000000000000003
Oct 30 04:58:52 psanaoss231 kernel: ffff880bf2ae1d30 ffffffff81059096
ffff880bf2ae1d40 ffff880bf2a1d500
Oct 30 04:58:52 psanaoss231 kernel: ffff880bf2b01ab8 ffff880bf2ae1fd8
000000000000fbc8 ffff880bf2b01ab8
Oct 30 04:58:52 psanaoss231 kernel: Call Trace:
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffff81059096>] ?
enqueue_task+0x66/0x80
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffffa07ae560>] ?
check_for_clients+0x0/0x70 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffffa07afbcd>]
target_recovery_overseer+0x9d/0x230 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffffa07ae250>] ?
exp_connect_healthy+0x0/0x20 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffff8109afa0>] ?
autoremove_wake_function+0x0/0x40
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffffa07b6490>] ?
target_recovery_thread+0x0/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffffa07b69d0>]
target_recovery_thread+0x540/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffff81061d12>] ?
default_wake_function+0x12/0x20
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffffa07b6490>] ?
target_recovery_thread+0x0/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffff8109abf6>] kthread+0x96/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [<ffffffff8100c200>] ?
child_rip+0x0/0x20
Oct 30 04:59:02 psanaoss231 kernel: Lustre: ana13-OST0004: Recovery over
after 3:05, of 147 clients 146 recovered and 1 was evicted.
Oct 30 04:59:03 psanaoss231 kernel: Lustre: ana13-OST0004: Client
89ba817f-45c3-5e64-99a8-b472651bbe45 (at ***@o2ib) reconnecting
Oct 30 04:59:03 psanaoss231 kernel: Lustre: Skipped 94 previous similar
messages
Oct 30 04:59:21 psanaoss231 kernel: LustreError:
4569:0:(ost_handler.c:1123:ost_brw_write()) Dropping timed-out write
from 12345-***@tcp because locking object 0x0:14198730 took
153 seconds (limit was 30).
Oct 30 04:59:21 psanaoss231 kernel: Lustre: ana13-OST0005: Bulk IO write
error with 3a71df2f-16e7-d507-2495-ab60364d8e7c (at ***@tcp),
client will retry: rc -110
Oct 30 04:59:52 psanaoss231 kernel: ------------[ cut here ]------------
Oct 30 04:59:52 psanaoss231 kernel: kernel BUG at
fs/jbd2/transaction.c:1033!
Oct 30 04:59:52 psanaoss231 kernel: invalid opcode: 0000 [#1] SMP
Oct 30 04:59:52 psanaoss231 kernel: last sysfs file:
/sys/devices/system/cpu/online
Oct 30 04:59:52 psanaoss231 kernel: CPU 10
Oct 30 04:59:52 psanaoss231 kernel: Modules linked in: osp(U) ofd(U)
lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U)
ldiskfs(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U)
ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic
sha256_generic crc32c_intel libcfs(U) nfs lockd fscache auth_rpcgss
nfs_acl mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase
autofs4 sunrpc ipt_REDIRECT iptable_nat nf_nat nf_conntrack_ipv4
nf_conntrack nf_defrag_ipv4 ip_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs
ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 microcode power_meter iTCO_wdt
iTCO_vendor_support dcdbas ipmi_devintf sb_edac edac_core lpc_ich
mfd_core shpchp igb i2c_algo_bit i2c_core ses enclosure sg ixgbe dca ptp
pps_core mdio ext4 jbd2 mbcache raid1 sd_mod crc_t10dif ahci wmi mlx4_ib
ib_sa ib_mad ib_core mlx4_en mlx4_core megaraid_sas dm_mirror
dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Oct 30 04:59:52 psanaoss231 kernel:
Oct 30 04:59:52 psanaoss231 kernel: Pid: 4272, comm: ll_ost01_007 Not
tainted 2.6.32-431.23.3.el6_lustre.x86_64 #1 Dell Inc. PowerEdge R620/0PXXHP
Oct 30 04:59:52 psanaoss231 kernel: RIP: 0010:[<ffffffffa01198ad>] 
[<ffffffffa01198ad>] jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
Oct 30 04:59:52 psanaoss231 kernel: RSP: 0018:ffff880c058437d0 EFLAGS:
00010246
Oct 30 04:59:52 psanaoss231 kernel: RAX: ffff880c05573dc0 RBX:
ffff880c043b8d08 RCX: ffff88175b0fedc8
Oct 30 04:59:52 psanaoss231 kernel: RDX: 0000000000000000 RSI:
ffff88175b0fedc8 RDI: 0000000000000000
Oct 30 04:59:52 psanaoss231 kernel: RBP: ffff880c058437f0 R08:
9010000000000000 R09: e886f5e8fbf37202
Oct 30 04:59:52 psanaoss231 kernel: R10: 0000000000000002 R11:
0000000000000000 R12: ffff880c040c26d8
Oct 30 04:59:52 psanaoss231 kernel: R13: ffff88175b0fedc8 R14:
ffff88174728c800 R15: 0000000000000008
Oct 30 04:59:52 psanaoss231 kernel: FS:  0000000000000000(0000)
GS:ffff8800282a0000(0000) knlGS:0000000000000000
Oct 30 04:59:52 psanaoss231 kernel: CS:  0010 DS: 0018 ES: 0018 CR0:
000000008005003b
Oct 30 04:59:52 psanaoss231 kernel: CR2: 00000034f304b750 CR3:
0000000001a85000 CR4: 00000000000407e0
Oct 30 04:59:52 psanaoss231 kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Oct 30 04:59:52 psanaoss231 kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Oct 30 04:59:52 psanaoss231 kernel: Process ll_ost01_007 (pid: 4272,
threadinfo ffff880c05842000, task ffff880c0634eaa0)
Oct 30 04:59:52 psanaoss231 kernel: Stack:
Oct 30 04:59:52 psanaoss231 kernel: ffff880c043b8d08 ffffffffa0d136f0
ffff88175b0fedc8 0000000000000000
Oct 30 04:59:52 psanaoss231 kernel: <d> ffff880c05843830
ffffffffa0cd100b ffff880c05843820 ffffffff8109af8f
Oct 30 04:59:52 psanaoss231 kernel: <d> ffff88175b105a40
ffff880c043b8d08 0000000000000018 ffff88175b0fedc8
Oct 30 04:59:52 psanaoss231 kernel: Call Trace:
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0cd100b>]
__ldiskfs_handle_dirty_metadata+0x7b/0x100 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8109af8f>] ?
wake_up_bit+0x2f/0x40
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0d067c5>]
ldiskfs_quota_write+0x165/0x210 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811eef11>]
v2_write_file_info+0xa1/0xe0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811eb018>]
dquot_acquire+0x138/0x140
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0d05956>]
ldiskfs_acquire_dquot+0x66/0xb0 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811ecf8c>] dqget+0x2ac/0x390
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811ed51b>]
dquot_initialize+0x7b/0x240
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8116f553>] ?
kmem_cache_alloc_trace+0x1a3/0x1b0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0d05bb3>]
ldiskfs_dquot_initialize+0x83/0xd0 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0dd0baf>]
osd_attr_set+0x12f/0x540 [osd_ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0ecb969>]
dt_attr_set.clone.2+0x29/0xc0 [ofd]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0ecf472>]
ofd_attr_set+0x522/0x6c0 [ofd]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0ec0e68>]
ofd_setattr+0x678/0xc10 [ofd]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07eeeae>] ?
lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0e711bb>]
ost_setattr+0x30b/0x930 [ost]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0e741bd>]
ost_handle+0x1f8d/0x44d0 [ost]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07f68db>] ?
ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07fecf5>]
ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa05164ce>] ?
cfs_timer_arm+0xe/0x10 [libcfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa05273cf>] ?
lc_watchdog_touch+0x6f/0x170 [libcfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07f63d9>] ?
ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff810546b9>] ?
__wake_up_common+0x59/0x90
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa080005d>]
ptlrpc_main+0xaed/0x1740 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07ff570>] ?
ptlrpc_main+0x0/0x1740 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8109abf6>] kthread+0x96/0xa0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8109ab60>] ? kthread+0x0/0xa0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8100c200>] ?
child_rip+0x0/0x20
Oct 30 04:59:52 psanaoss231 kernel: Code: c6 9c 03 00 00 4c 89 f7 e8 c1
21 41 e1 48 8b 33 ba 01 00 00 00 4c 89 e7 e8 11 ec ff ff 4c 89 f0 66 ff
00 66 66 90 e9 73 ff ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b 66 0f 1f 84
00 00 00 00 00 eb f5
Oct 30 04:59:52 psanaoss231 kernel: RIP  [<ffffffffa01198ad>]
jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
Oct 30 04:59:52 psanaoss231 kernel: RSP <ffff880c058437d0>
Oct 30 04:59:52 psanaoss231 kernel: ---[ end trace 5ceb40448d3277c6 ]---
Oct 30 04:59:52 psanaoss231 kernel: Kernel panic - not syncing: Fatal
exception
Oct 30 04:59:52 psanaoss231 kernel: Pid: 4272, comm: ll_ost01_007
Tainted: G      D    --------------- 2.6.32-431.23.3.el6_lustre.x86_64 #1
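
A log like the one above is easier to triage once the mail-wrapped lines are joined back together. The following is only a minimal sketch, assuming each syslog entry starts with a timestamp such as "Oct 30 04:59:52"; the helper names `unwrap` and `key_events` are made up for illustration and are not part of any Lustre tool.

```python
import re

# Assumption: every real syslog entry begins with "Mon DD HH:MM:SS ".
STAMP = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")

def unwrap(lines):
    """Join mail-wrapped continuation lines back onto their syslog entry."""
    out = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines introduced by the mail client
        if STAMP.match(line) or not out:
            out.append(line)
        else:
            out[-1] += " " + line.lstrip()
    return out

def key_events(lines):
    """Return the unwrapped entries that mark a hung task, a BUG, or a panic."""
    markers = ("blocked for more than", "kernel BUG at", "Kernel panic")
    return [entry for entry in unwrap(lines) if any(m in entry for m in markers)]

# A few lines copied from the log above, still wrapped as in the post.
sample = [
    "Oct 30 04:58:52 psanaoss231 kernel: INFO: task tgt_recov:4569 blocked",
    "for more than 120 seconds.",
    "Oct 30 04:59:52 psanaoss231 kernel: kernel BUG at",
    "fs/jbd2/transaction.c:1033!",
    "Oct 30 04:59:52 psanaoss231 kernel: CPU 10",
    "Oct 30 04:59:52 psanaoss231 kernel: Kernel panic - not syncing: Fatal",
    "exception",
]
for event in key_events(sample):
    print(event)
```

On this sample the sketch keeps three entries: the hung tgt_recov task, the BUG at fs/jbd2/transaction.c:1033, and the final panic.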
Riccardo Veraldi
2018-10-30 12:24:18 UTC
I was able to mount the OSTs, but the only way was to mount them with abort_recov,

thanks to this old ticket:

https://jira.whamcloud.com/browse/LU-5040
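
For readers who land here with the same panic, the workaround can be sketched as follows. This is a rough illustration, not the exact command used in this thread: the device path and mount point are hypothetical, and the helper `abort_recov_mount_cmd` is invented for the example. The `abort_recov` mount option itself is described in the Lustre manual as aborting client recovery and starting the target service immediately.

```python
import shlex

def abort_recov_mount_cmd(device, mount_point):
    """Build the argv for mounting a Lustre target with recovery aborted."""
    # -o abort_recov skips the client recovery phase when the target starts.
    return ["mount", "-t", "lustre", "-o", "abort_recov", device, mount_point]

# Hypothetical device and mount point, for illustration only.
cmd = abort_recov_mount_cmd("/dev/mapper/ost0005", "/mnt/ost0005")

# Print the command instead of running it, so it can be reviewed first.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Aborting recovery means client requests that were waiting to be replayed are not processed, so it is a last resort rather than a routine mount option.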
Fernando Perez
2018-10-30 12:28:12 UTC
Dear Riccardo.

Have you tried upgrading the e2fsprogs packages before performing the e2fsck?
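
A minimal sketch of that check, assuming an RPM-based server and an unmounted OST; the device path is hypothetical and the helper `fsck_plan` is invented for illustration. It only builds and prints the commands so they can be reviewed before anything is run.

```python
def fsck_plan(device):
    """Return the commands for a version check and a read-only e2fsck pass."""
    return [
        ["rpm", "-q", "e2fsprogs"],      # report the installed e2fsprogs version
        ["e2fsck", "-f", "-n", device],  # -f: force the check, -n: make no changes
    ]

# Hypothetical device path, for illustration only.
for cmd in fsck_plan("/dev/mapper/ost0005"):
    print(" ".join(cmd))
```

Running with -n first shows what e2fsck would repair without touching the disk; a repairing pass can follow once the output looks sane.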

Regards.

=============================================
Fernando Pérez
Institut de Ciències del Mar (CSIC)
Departament Oceanografía Física i Tecnològica
Passeig Marítim de la Barceloneta,37-49
08003 Barcelona
Phone: (+34) 93 230 96 35
=============================================
_______________________________________________
lustre-discuss mailing list
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Riccardo Veraldi
2018-10-30 12:43:01 UTC
Thank you, Fernando, for the hint. I upgraded e2fsprogs just now and I am
running e2fsck again.
Anyway, my problem was this:

https://jira.whamcloud.com/browse/LU-5040

thank you
ffffffffa0cd100b ffff880c05843820 ffffffff8109af8f
Oct 30 04:59:52 psanaoss231 kernel: <d> ffff88175b105a40
ffff880c043b8d08 0000000000000018 ffff88175b0fedc8
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0cd100b>]
__ldiskfs_handle_dirty_metadata+0x7b/0x100 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8109af8f>] ?
wake_up_bit+0x2f/0x40
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0d067c5>]
ldiskfs_quota_write+0x165/0x210 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811eef11>]
v2_write_file_info+0xa1/0xe0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811eb018>]
dquot_acquire+0x138/0x140
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0d05956>]
ldiskfs_acquire_dquot+0x66/0xb0 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811ecf8c>]
dqget+0x2ac/0x390
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff811ed51b>]
dquot_initialize+0x7b/0x240
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8116f553>] ?
kmem_cache_alloc_trace+0x1a3/0x1b0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0d05bb3>]
ldiskfs_dquot_initialize+0x83/0xd0 [ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0dd0baf>]
osd_attr_set+0x12f/0x540 [osd_ldiskfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0ecb969>]
dt_attr_set.clone.2+0x29/0xc0 [ofd]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0ecf472>]
ofd_attr_set+0x522/0x6c0 [ofd]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0ec0e68>]
ofd_setattr+0x678/0xc10 [ofd]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07eeeae>] ?
lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0e711bb>]
ost_setattr+0x30b/0x930 [ost]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa0e741bd>]
ost_handle+0x1f8d/0x44d0 [ost]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07f68db>] ?
ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07fecf5>]
ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa05164ce>] ?
cfs_timer_arm+0xe/0x10 [libcfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa05273cf>] ?
lc_watchdog_touch+0x6f/0x170 [libcfs]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07f63d9>] ?
ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff810546b9>] ?
__wake_up_common+0x59/0x90
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa080005d>]
ptlrpc_main+0xaed/0x1740 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffffa07ff570>] ?
ptlrpc_main+0x0/0x1740 [ptlrpc]
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8109abf6>]
kthread+0x96/0xa0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8100c20a>]
child_rip+0xa/0x20
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8109ab60>] ?
kthread+0x0/0xa0
Oct 30 04:59:52 psanaoss231 kernel: [<ffffffff8100c200>] ?
child_rip+0x0/0x20
Oct 30 04:59:52 psanaoss231 kernel: Code: c6 9c 03 00 00 4c 89 f7 e8
c1 21 41 e1 48 8b 33 ba 01 00 00 00 4c 89 e7 e8 11 ec ff ff 4c 89 f0
66 ff 00 66 66 90 e9 73 ff ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b 66
0f 1f 84 00 00 00 00 00 eb f5
Oct 30 04:59:52 psanaoss231 kernel: RIP [<ffffffffa01198ad>]
jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
Oct 30 04:59:52 psanaoss231 kernel: RSP <ffff880c058437d0>
Oct 30 04:59:52 psanaoss231 kernel: ---[ end trace 5ceb40448d3277c6 ]---
Oct 30 04:59:52 psanaoss231 kernel: Kernel panic - not syncing: Fatal
exception
Oct 30 04:59:52 psanaoss231 kernel: Pid: 4272, comm: ll_ost01_007
Tainted: G      D    ---------------
2.6.32-431.23.3.el6_lustre.x86_64 #1
_______________________________________________
lustre-discuss mailing list
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org