SAP on SLES with BtrFS

markus_doehr2
Active Contributor

Hi all,

Is anyone running SLES systems with BtrFS (for /, for the database, or both)? I'd like to hear (and share) experiences.

Regards,

Markus

Former Member

Hi Markus,

We have been running btrfs for / since SLES11 SP2 on more than 60 systems.

By now we have only a mix of SP3 and SP4.

No problems so far.

Regards,

Daniel

markus_doehr2
Active Contributor

Hi Daniel,

thank you for your input.

We run (or ran) roughly 80 systems on BtrFS, and there seems to be a regression in kernels newer than 3.0.101-0.29 that may corrupt the filesystem. SuSE is still investigating.

Seven systems have crashed so far with filesystem errors that required restoring the database from backup, including our central BW (1.5+ TB) twice. Two systems could not be restored completely because the filesystem holding the database online logs was destroyed. Those crashes happened mostly under little to no load. Older kernels do not have this problem; they run flawlessly.
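As a rough way to check a host against the kernel range mentioned above, here is a minimal Python sketch. It assumes that 3.0.101-0.29 really is the last good release (the thread only reports this as a suspicion) and that a plain numeric comparison of the release string is good enough. It is not an implementation of RPM version comparison, and the function names are made up for illustration.

```python
import platform
import re

# Threshold taken from the post above; whether it is the exact boundary
# of the regression is an assumption, not a confirmed fact.
LAST_KNOWN_GOOD = "3.0.101-0.29"


def release_key(release: str) -> tuple:
    """Turn '3.0.101-0.29-default' into (3, 0, 101, 0, 29) for comparison."""
    # Keep only the leading numeric part and drop flavour suffixes
    # such as '-default'.
    match = re.match(r"\d+(?:[.-]\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in re.split(r"[.-]", match.group(0)))


def is_newer_than_known_good(release: str) -> bool:
    """True if the release sorts after the last kernel reported as good."""
    return release_key(release) > release_key(LAST_KNOWN_GOOD)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print(running, "newer than", LAST_KNOWN_GOOD, "->",
          is_newer_than_known_good(running))
```

Run on a host, this prints the running release and whether it sorts after the threshold. It says nothing about whether a given kernel actually contains the suspected regression.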

Which kernel versions do you use?

Markus

Former Member

Hi Markus,

We are running the newest SP4 kernels:

Linux  3.0.101-63-default #1 SMP Tue Jun 23 16:02:31 UTC 2015 (4b89d0c) x86_64 x86_64 x86_64 GNU/Linux

But we don't have databases on btrfs, only / (the OS installation). All SAN data is on ext3, since we use the storage system's snapshots.

Best regards, Daniel

markus_doehr2
Active Contributor

I see.

We had one machine that also had BtrFS on /. The database crashed and wrote a dump, which filled the root filesystem, and the whole system rebooted. The root filesystem could no longer be mounted (filesystem full), and a balance did not work.

Markus

Former Member

There is a solution for this problem:

Boot another, newer live Linux, such as a Gentoo boot CD, add a USB stick or other block device to the filesystem, delete snapshots or data, then shrink the filesystem back to the original device only.

You must use another Linux because adding and removing devices on a btrfs filesystem is disabled in SLES.

Newer kernels also have a protection (they reserve some space), so the filesystem is always mountable, even when it is full, and you can always delete files. (Remember, deleting files generates new metadata.) But I don't know if SUSE has backported this to ..
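The add-device, delete, remove-device sequence described above could be sketched roughly as follows. This Python snippet only builds and prints the `btrfs` commands; it does not run them. The mount point, the spare device, and the snapshot paths are hypothetical placeholders, and it assumes the full filesystem has already been mounted from the live system.

```python
import shlex
from typing import List


def plan_full_fs_recovery(mount_point: str, spare_device: str,
                          snapshots: List[str]) -> List[List[str]]:
    """Build (but do not execute) the commands for the recovery sequence."""
    # 1. Temporarily grow the filesystem so new metadata can be allocated.
    plan = [["btrfs", "device", "add", spare_device, mount_point]]
    # 2. Free space by deleting snapshots (plain files would use rm instead).
    for snap in snapshots:
        plan.append(["btrfs", "subvolume", "delete", snap])
    # 3. Move everything back and shrink to the original device only.
    plan.append(["btrfs", "device", "delete", spare_device, mount_point])
    return plan


if __name__ == "__main__":
    # All paths below are hypothetical examples, not taken from the thread.
    commands = plan_full_fs_recovery(
        mount_point="/mnt/rescue",
        spare_device="/dev/sdX",
        snapshots=["/mnt/rescue/.snapshots/1/snapshot"],
    )
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the plan first makes it easy to review the device names before anything touches the disk. The last step only succeeds once enough space has been freed for all data to fit on the original device again.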

I personally like btrfs, but it has some rough edges you need to know about 😉

Best regards,

Daniel

markus_doehr2
Active Contributor

Thank you Daniel.

We have eight broken systems now, mostly showing kernel oopses like the following and marking the filesystem read-only. In this case it was "just" /usr/sap, but we had other occurrences where it was the filesystem that holds the database data or log files. In that case the filesystem is broken and one has to restore from a backup.

[   39.497688] WARNING: CPU: 5 PID: 3145 at ../fs/btrfs/super.c:259 __btrfs_abort_transaction+0x4b/0x120 [btrfs]()

[   39.497690] BTRFS: Transaction aborted (error -5)

[   39.497692] Modules linked in: iscsi_ibft iscsi_boot_sysfs af_packet btrfs xfs libcrc32c nls_iso8859_1 nls_cp437 raid6_pq xor vfat fat vmw_balloon coretemp ppdev crc32c_intel vmxnet3 vmw_vmci shpchp parport_pc pcspkr i2c_piix4 serio_raw processor battery ac parport efivars button efivarfs ext4 crc16 mbcache jbd2 vmwgfx ttm drm floppy sr_mod cdrom sd_mod ata_generic ata_piix ahci libahci libata vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod sg scsi_mod autofs4

[   39.497743] Supported: Yes

[   39.497747] CPU: 5 PID: 3145 Comm: sapstartsrv Not tainted 3.12.44-52.10-default #1

[   39.497750] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1410210136 10/21/2014

[   39.497754]  ffffffffa06b5550 ffffffff81510581 ffff8807c4b21ad8 ffffffff81055362

[   39.497759]  ffff8808147ffa28 ffff8807c4b21b28 00000000fffffffb ffffffffa06b3e50

[   39.497764]  00000000000016b2 ffffffff810553ec ffffffffa06b8c88 0000000000000020

[   39.497769] Call Trace:

[   39.497791]  [<ffffffff8100471d>] dump_trace+0x7d/0x2d0

[   39.497798]  [<ffffffff81004a04>] show_stack_log_lvl+0x94/0x170

[   39.497804]  [<ffffffff81005e31>] show_stack+0x21/0x50

[   39.497812]  [<ffffffff81510581>] dump_stack+0x41/0x51

[   39.497821]  [<ffffffff81055362>] warn_slowpath_common+0x82/0xc0

[   39.497829]  [<ffffffff810553ec>] warn_slowpath_fmt+0x4c/0x50

[   39.497844]  [<ffffffffa060dc0b>] __btrfs_abort_transaction+0x4b/0x120 [btrfs]

[   39.497883]  [<ffffffffa062065f>] __btrfs_free_extent+0x30f/0xc40 [btrfs]

[   39.497930]  [<ffffffffa0625ad2>] __btrfs_run_delayed_refs+0x912/0x11d0 [btrfs]

[   39.497981]  [<ffffffffa062a459>] btrfs_run_delayed_refs.part.66+0x69/0x280 [btrfs]

[   39.498037]  [<ffffffffa063c40d>] __btrfs_end_transaction+0x2ad/0x3d0 [btrfs]

[   39.498113]  [<ffffffffa0645629>] btrfs_truncate+0x1e9/0x2b0 [btrfs]

[   39.498195]  [<ffffffffa0646100>] btrfs_setattr+0x230/0x2e0 [btrfs]

[   39.498266]  [<ffffffff811bc6e1>] notify_change+0x231/0x390

[   39.498275]  [<ffffffff8119fca5>] do_truncate+0x65/0x90

[   39.498283]  [<ffffffff8119ffff>] do_sys_ftruncate.constprop.11+0x11f/0x180

[   39.498294]  [<ffffffff8151e789>] system_call_fastpath+0x16/0x1b

[   39.498302]  [<00007ffff5e3fa97>] 0x7ffff5e3fa96

[   39.498305] ---[ end trace 4280fc12485ab7b5 ]---
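To spot this kind of event across many hosts, the kernel log can be searched for the abort message seen in the trace above. The following is a minimal Python sketch; it assumes the message wording matches the one quoted here (other kernel versions may phrase it differently), and the function name is made up for illustration. The error number -5 in the trace corresponds to EIO, an I/O error.

```python
import errno
import re
from typing import Iterable, List, Tuple

# Pattern based on the message in the trace above; treating the wording as
# stable across kernel versions is an assumption.
ABORT_RE = re.compile(r"BTRFS: Transaction aborted \(error (-?\d+)\)",
                      re.IGNORECASE)


def find_aborted_transactions(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (errno number, errno name) for every abort message found."""
    hits = []
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            code = abs(int(match.group(1)))  # the kernel logs errors as negatives
            hits.append((code, errno.errorcode.get(code, "unknown")))
    return hits


if __name__ == "__main__":
    sample = [
        "[   39.497690] BTRFS: Transaction aborted (error -5)",
        "[   39.497743] Supported: Yes",
    ]
    for code, name in find_aborted_transactions(sample):
        print(f"btrfs transaction aborted: errno {code} ({name})")
```

In practice the lines would come from `dmesg` output or a saved log file rather than the inline sample.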

Those problems seem to occur randomly; in most cases they happen under no load, when the system is just sitting idle.

They all appeared with SLES 11 SP3 kernels newer than 3.0.101-0.29, most of them with the latest kernel 0.55, but also with SLES12 (as you can see here).

Markus