Does IQ support huge pages on Linux x86-64?

Former Member
0 Kudos

I ask because I don't see any references to huge memory pages in the IQ manuals or white papers.  I know ASE supports them.  If IQ doesn't, why not?

Accepted Solutions (1)


markmumy
Product and Topic Expert
0 Kudos

Jason,

There has been some discussion that the most recent releases (16.0, 15.4 esd3) now support transparent huge pages (THP).  But THP is quite different from dedicated huge pages.  At this point we don't support dedicated huge pages the way ASE does.

Former Member
0 Kudos

Thanks Mark.  What is the ETA on supporting real huge pages? 

For those of you that aren't familiar with huge pages or transparent huge pages on Linux, you can read up on it at http://lwn.net/Articles/423584/

Answers (2)


Former Member
0 Kudos

For Sybase IQ on Linux, huge pages should be set to zero.  Performance of IQ is very poor on Linux systems configured with huge pages.
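
To check whether dedicated huge pages are actually reserved on a host, one option is to read the HugePages_* counters that the kernel exposes in /proc/meminfo. The following is a minimal Python sketch, assuming that "set to zero" refers to the reserved huge page pool (the vm.nr_hugepages setting); hugepage_counters is a hypothetical helper name, not an IQ utility.

```python
from pathlib import Path


def hugepage_counters(meminfo_text):
    """Return the HugePages_* counters found in /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        # Only the HugePages_* lines are plain page counts; AnonHugePages
        # and Hugepagesize are sizes in kB and are skipped here.
        if sep and name.startswith("HugePages_"):
            counters[name] = int(rest.split()[0])
    return counters


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        print(hugepage_counters(meminfo.read_text()))
```

A HugePages_Total of 0 would indicate that no dedicated huge pages are reserved.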

Former Member
0 Kudos

I actually find it hard to believe that IQ would have poor performance with huge pages.  I suspect whoever did the internal testing either misconfigured the huge memory pages or used a test binary that was not written correctly, so the huge pages were not being used, or not used consistently.

saroj_bagai
Employee
0 Kudos

Huge page support was disabled with SA CR 728597 due to a Red Hat bug.

CR 728597:

This problem is related to a possible bug in the transparent huge pages (THP) feature introduced in these operating system versions.  Red Hat bug 891857 has been created to track this issue.

The problem can be triggered by calling an external environment, xp_cmdshell, or other procedure that causes a fork while other I/O is occurring.  A known limitation with the Linux kernel limits the use of fork while doing O_DIRECT I/O operations.  Essentially what can happen is that the data can come from or go to the wrong process’ memory after the fork.  SQL Anywhere performs O_DIRECT I/O operations according to the documented safe usage.  However, THP appears to cause further problems and the O_DIRECT I/O data comprising database page reads/writes appears to get lost.

This has been fixed by disabling THP on the SQL Anywhere cache memory where possible.  We are working with Red Hat to identify a solution within the operating system.

There are two possible workarounds:

1. disable THP on a system-wide basis with one of the following methods:

  echo never > /sys/kernel/mm/transparent_hugepage/enabled

  echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

  boot with the kernel parameter transparent_hugepage=never

2. disable O_DIRECT I/O for database file reads/writes with one of the following methods:

  use the -u flag on the server command line

  set SA_DISABLE_DIRECTIO=1 in the environment before starting the server

Transparent huge pages cannot be disabled just for the SQL Anywhere cache memory on Red Hat Enterprise Linux 6.  SQL Anywhere now disables direct I/O if transparent huge pages are enabled and cannot be disabled on a per-allocation basis.  A warning is printed as the database file is opened to indicate that direct I/O is disabled due to this bug.  This is similar to how SQL Anywhere handles file systems that do not support direct I/O.

Customers using RHEL 6 who wish to continue using direct I/O should use the previously-stated command to disable THP at the system level.
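
Before or after applying workaround 1, it can help to confirm which THP mode is active. The kernel marks the active mode with square brackets in the "enabled" file (for example "always madvise [never]"). The following is a minimal Python sketch that reads the two sysfs paths named above; the function names are hypothetical and are not part of SQL Anywhere or IQ.

```python
from pathlib import Path

# The two sysfs locations named in the workaround above.
THP_PATHS = (
    "/sys/kernel/mm/transparent_hugepage/enabled",
    "/sys/kernel/mm/redhat_transparent_hugepage/enabled",
)


def active_thp_mode(enabled_text):
    """Return the bracketed word from text such as 'always madvise [never]'."""
    for word in enabled_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def current_thp_mode():
    """Return the active mode from the first path that exists, else None."""
    for path in THP_PATHS:
        candidate = Path(path)
        if candidate.exists():
            return active_thp_mode(candidate.read_text())
    return None


if __name__ == "__main__":
    print(current_thp_mode())
```

A result of "never" means THP is disabled system-wide; None means neither file was found.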

markmumy
Product and Topic Expert
0 Kudos

A word of caution if you follow item #2.  Disabling O_DIRECT can lead to issues because data pages for the catalog could be in the OS cache but not yet written to disk.  We use O_DIRECT to guarantee data integrity for the catalog.  If you choose to disable direct I/O, corruption could occur in rare instances where the system crashes before the OS has written the catalog data to disk.  For this reason, it is strongly recommended that THP be disabled until Red Hat has addressed the bug.

Former Member
0 Kudos

If you use raw partitions for your data devices, this RedHat bug should NOT affect you IMHO.

Does IQ use O_DIRECT when data loading from a local file?

Former Member
0 Kudos

Saroj,

Is this a RHEL bug or is this a Linux Kernel bug?  I ask because not everyone uses RHEL or RHEL based Linux distributions.

markmumy
Product and Topic Expert
0 Kudos

You are correct.  If IQ is on raw devices then they are not affected.  However, the catalog and catalog transaction log are on filesystems and thus subject to this bug.

O_DIRECT is only in effect for data writes.  We open the primary and secondary load files with only the O_RDONLY flag.

Former Member
0 Kudos

The catalog *should* have minimal activity so the risk, I'm guessing, will also be minimal.

markmumy
Product and Topic Expert
0 Kudos

"Should" is the key word.  Loads and transactions do update the catalog and transaction log, so there is always a chance that the catalog could become corrupt if the host were to crash.

phil_mitchell
Explorer
0 Kudos

Jason,

I am the engineer who investigated this problem and reported it to RH.  Although the bug description only mentioned RH, we have also seen the problem on Fedora.  Fedora may be related to RH, but it is by no means the same kernel.

It has been brought to my attention that newer versions of SLES now support THP and have made some steps to optimize it further.  I will try to reproduce the bug on this version of SLES and report back to this forum.

It would help to know why you are interested in THP support.  Do you have an immediate need, or is this strictly academic?  If you have an immediate need, that would help to motivate the engineers involved.  How urgent is your desire to have THP support?

markmumy
Product and Topic Expert
0 Kudos

The question isn't really around THP and the bug, etc.

I think the bigger question is around HUGEPAGES in general and why IQ doesn't directly support them.  In the ASE world, using HUGEPAGES on large memory systems has been shown to be faster (about 10% in ASE engineering tests).  IQ will generally be on larger hardware than ASE, so why wouldn't we support native HUGEPAGES like ASE does?

My two cents....

Former Member
0 Kudos

Phil,

Red Hat Enterprise Linux (or RHEL) is a commercially supported derivative of Fedora tailored to meet the requirements of enterprise customers. It is a commercial product from Red Hat which also sponsors Fedora as a community project. Fedora is upstream for Red Hat Enterprise Linux. -- https://fedoraproject.org/wiki/Red_Hat_Enterprise_Linux?rd=RHEL

The bug is not a RHEL bug but a Linux Kernel bug (see http://froebe.net/blog/2013/06/17/does-anyone-have-any-details-on-redhat-linux-bug-891857/).  Two very different things.  A RHEL bug is a bug that is specific to the RHEL distribution and derivatives.  A Linux Kernel bug affects all distributions using that kernel release.

As to why the interest in huge memory pages, please take a look at IBM's http://pic.dhe.ibm.com/infocenter/lnxinfo/v3r0m0/topic/liaat/liaattuning_pdf.pdf , specifically the section on the Translation Lookaside Buffer (TLB).  I link to the pdf as it has a nice graphic showing what it is.

Transparent huge pages are very similar to huge pages.  Their biggest benefit is that they are used automatically, without any special coding or configuration.
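
To put rough numbers on the TLB argument, the following Python sketch counts how many pages are needed to map a memory region with standard 4 KB pages versus 2 MB huge pages on x86-64. The 64 GB cache size is a hypothetical figure chosen for illustration, not a measurement from any IQ system.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

BASE_PAGE = 4 * KIB  # standard x86-64 page size
HUGE_PAGE = 2 * MIB  # common x86-64 huge page size


def pages_needed(region_bytes, page_bytes):
    """Number of pages (rounded up) required to map region_bytes."""
    return -(-region_bytes // page_bytes)


cache = 64 * GIB  # hypothetical cache size, for illustration only
small = pages_needed(cache, BASE_PAGE)
huge = pages_needed(cache, HUGE_PAGE)
print(small, huge, small // huge)  # prints: 16777216 32768 512
```

Each huge page covers 512 times as much memory as a base page, so far fewer translations compete for the limited number of TLB entries.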

Former Member
0 Kudos

Oracle actively promotes the use of huge pages for Oracle RDBMS 11g and 12g.  Sybase actively promotes huge pages for ASE.  IBM actively promotes huge pages for DB2 (UDB). 

I don't understand why IQ engineering is dragging their collective feet on this performance and corruption issue.   I can think of many unkind reasons but I'm sure it is simply lack of resources and the Linux platform that always seems to get a second class citizen status within Sybase.  (at least from the outside of SAP/Sybase)

With respect to ASE performance with huge memory pages, one of the reasons the increase was only about 10% is that ASE's memory pages max out at 128 KB on a server using a 16 KB data page size.  If the ASE memory page could be the x86-64 huge memory page size of 2 MB, performance should be much higher.

With IQ, the only thing that is stopping us from using huge pages is the Linux kernel bug. 

phil_mitchell
Explorer
0 Kudos

Thanks for the extra background, Jason, as well as the reference to the upstream patch proposal.

For the record, although this is not an RHEL-specific bug, we filed the bug with Red Hat, since that's where we ran into it, hence the reference to the RH bug number.

Also, Fedora is upstream from RHEL, but RHEL is free to make additional modifications, hence my assertion that they are not the same kernel.   I was sure that the bug gets at least that high up, but I wasn't certain if it occurred higher upstream.

As for ASE, I'm not familiar with the architecture of ASE.  It's possible that they are not vulnerable to this bug.  We notified the ASE developers about this issue and its consequences, but I am not aware of what came of it.  For SA/IQ, working around this bug would require significant changes.

Former Member
0 Kudos

For ASE, it shouldn't occur very often due to the architecture of handling i/o.

I've been able to confirm that the problem occurs on the latest Linux kernel (on bare hardware).  Next step is to confirm it using Linux KVM.  If so, I can just package that up.  I haven't tested using kvm yet because of lack of time.  I have three kids. 

Anyone can reproduce it easily.  The repro (C program) is in the link I posted.

phil_mitchell
Explorer
0 Kudos

Good to hear.  FWIW, we haven't had any luck reproducing this on either VMware or XenServer based VMs.  Only real hardware so far.  Also, some file systems seem to be more prone to it than others.  We've only really had luck reproducing it on ext3/ext4, but we haven't been very scientific about that aspect.

phil_mitchell
Explorer
0 Kudos

Jason has already verified that the THP bug exists in the latest kernel, but I thought I would explicitly confirm that the THP bug is still an issue for SA/IQ in RHEL 6.4 as well as SLES 11.2.  Thus, it is still important for us to disable THP on these platforms.

Note that on platforms other than RHEL 6.x, we disable THP only on the database page cache using madvise(), since that's the only place we use Direct I/O.  RHEL 6.x does not support the madvise() call that's required for this, so we require THP to be disabled system-wide.  If THP is enabled on RHEL 6.x, we fall back on disabling Direct I/O.
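
The per-allocation approach described above can be illustrated with Python's mmap module, which exposes madvise() and, on Linux builds that have it, the MADV_NOHUGEPAGE advice value that opts a memory range out of THP. This is only a sketch of the system call pattern, not SA/IQ's actual code; map_without_thp is a hypothetical name, and the assumption that MADV_NOHUGEPAGE is the advice value used by the server is mine.

```python
import mmap


def map_without_thp(size):
    """Map anonymous memory and ask the kernel not to back it with THP.

    Returns (buffer, applied). applied is False when the platform or kernel
    does not offer MADV_NOHUGEPAGE or rejects the request.
    """
    buf = mmap.mmap(-1, size)  # anonymous mapping, like a private cache
    advice = getattr(mmap, "MADV_NOHUGEPAGE", None)
    if advice is None or not hasattr(buf, "madvise"):
        return buf, False
    try:
        buf.madvise(advice)  # applies to the whole mapping by default
    except OSError:
        # For example, a kernel built without THP support rejects the advice.
        return buf, False
    return buf, True


if __name__ == "__main__":
    buf, applied = map_without_thp(4 * 1024 * 1024)
    print("THP opt-out applied:", applied)
    buf.close()
```

When the opt-out cannot be applied, a caller is left with the fallbacks described above: disable THP system-wide or give up direct I/O.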

Given the architecture of SA/IQ, I honestly don't think we will be able to support huge pages until this bug has been fixed in the Linux kernel.

Former Member
0 Kudos

That's what I meant in my earlier replies.   I'm hoping for SAP to push this with Red Hat and SUSE (the two fully supported Linux distributions).

The fact that this bug has been open for FOUR YEARS indicates that it has been put on very low priority.  We suffer from poor performance and risk corruption of our IQ databases meanwhile. 

phil_mitchell
Explorer
0 Kudos

I can assure you that the pushing is happening.  Sadly, Linus has a very negative view toward O_DIRECT, hence the low priority.

Also, there should be no risk of database corruption due to this issue, provided you accept the performance hit.  For typical SA-scale databases we determined that the performance hit was only about 1-2%.  However, I fully understand that at IQ scales, it's probably much more significant, though I haven't seen any actual numbers since I don't work directly with the IQ product.

Former Member
0 Kudos

The last mention of this on any of the Linux kernel mailing lists was in 2009.  Other direct I/O bugs have been fixed.  Why is this one taking so long?

phil_mitchell
Explorer
0 Kudos

Excellent question.

phil_mitchell
Explorer
0 Kudos

I've heard a little bit more information from Red Hat:

The link to the kernel patch mentioned in http://froebe.net/blog/2013/06/17/does-anyone-have-any-details-on-redhat-linux-bug-891857/ does not give the full story about this kernel patch.  The discussion at http://comments.gmane.org/gmane.linux.kernel.mm/31569 shows that it was not the intention of the author of the patch to get it included in the Linux kernel in the form it was posted, and unfortunately Linus Torvalds himself seems to have been strongly against including this patch in the Linux kernel.  So even though the blog post you mention makes it sound as if there is a patch ready to fix the underlying issue in the Linux kernel, this is unfortunately not the case.  As long as Linus Torvalds is against this, it will be very difficult to fix, since this is a very critical area of the Linux kernel.

They also mentioned that Red Hat has a policy not to add patches to the kernel that are not accepted upstream, with the exception of some hardware drivers that are provided by vendors as binaries.

So now to lobby Linus....

Former Member
0 Kudos

thanks Phil!

Unfortunately, Linus isn't the maintainer for that portion of the Linux Kernel.  He gets final say on patches sent to him by the maintainers... for the most part.

Linus Torvalds isn't the one to lobby  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS

Jason

markmumy
Product and Topic Expert
0 Kudos

That I can't speak to.  I would imagine that it just comes down to prioritization of features.

Former Member
0 Kudos

Thanks again Mark.  Is there a feature request already created for it?  If so, what's the CR#?

Former Member
0 Kudos

Hi Jason,

As mentioned by Abhitjit, it is best to leave huge pages turned off for IQ on Linux. If the above answered your question, could you mark one of the responses as the correct answer?

Thanks

Andrew