
libhugetlbfs's Introduction

03 December 2015 -- Yet another mailing list move

librelist seems to be dead or at least broken.  I have received several
emails directly saying that patches were posted, but they never got
responses and the archives always seem empty.  So we are moving the list
again.  From here on out we will be using

[email protected]

as our mailing list.  Please send patches to this list rather than
creating a pull request.

03 June 2015  -- libhugetlbfs to find new home

As part of the fallout from the recent hijacking of various "unmaintained"
projects, I no longer wish to host this (or any other) project at
sourceforge.  Effective today, the new official home for libhugetlbfs
code is

https://github.com/libhugetlbfs/libhugetlbfs

The doubling of the name is unfortunate, but I wanted the repo to belong
to the org of the same name so there it is.

Please do not submit pull requests, they will be closed with a redirect
to the mailing list (see below) at best, or ignored completely at worst.

Tarballs of specific releases can still be downloaded using the github
Download ZIP button from the appropriate tag (or branch).  The mailing
list will now be hosted at librelist and can be found at

[email protected]

For libhugetlbfs usage, see the HOWTO; for what has changed, see NEWS;
and for how to work with the project, see SubmittingCode.

10/03/2006 -- libhugetlbfs-1.0 Released

After roughly one year in development, version 1.0 of libhugetlbfs is here.
It can be downloaded from SourceForge or the OzLabs mirror:

	http://sourceforge.net/project/showfiles.php?group_id=156936
	http://libhugetlbfs.ozlabs.org/snapshots/

After a series of preview releases, we have tested a huge array of the
supported usage scenarios using benchmarks and real HPC applications.
Usability and reliability have greatly improved.  But... due to the
incredible diversity of applications that exist, there are bound to be a
few that will not work correctly.

If using libhugetlbfs makes your application slower:

 * Play around with the different combinations of hugetlb malloc and the
   two different supported link types to see which combination works best.

 * Keep in mind that huge pages are a niche performance tweak and are not
   suitable for every type of application.  They are specifically known to
   hurt performance in certain situations.

If you experience problems:

 * You've already read the HOWTO document, but read through it again.  It
   is full of hints, notes, warnings, and caveats that we have found over
   time.  This is the best starting point for a quick resolution to your
   issue.

 * Make sure you have enough huge pages allocated.  Even if you think you
   have enough, try increasing it to a number you know you will not use.

 * Set HUGETLB_VERBOSE=99 and HUGETLB_DEBUG=yes.  These options increase
   the verbosity of the library and enable extra checking to help diagnose
   the problem.

If the above steps do not help, send as much information about the problem
(including all libhugetlbfs debug output) to
[email protected] and we'll help out as much as we
can.  We will probably ask you to collect things like: straces,
/proc/pid/maps and gdb back traces.


libhugetlbfs's Issues

libhugetlbfs compilation fails with glibc 2.34

error log:
[ 56s] morecore.c: In function '__lh_hugetlbfs_setup_morecore':
[ 56s] morecore.c:368:3: error: '__morecore' undeclared (first use in this function); did you mean 'thp_morecore'?
[ 56s] 368 | __morecore = &thp_morecore;
[ 56s] | ^~~~~~~~~~
[ 56s] | thp_morecore
[ 56s] morecore.c:368:3: note: each undeclared identifier is reported only once for each function it appears in

glibc 2.34 release notes: https://sourceware.org/pipermail/libc-alpha/2021-August/129718.html

The __morecore and __after_morecore_hook malloc hooks and the default
implementation __default_morecore have been removed from the API. Existing
applications will continue to link against these symbols but the interfaces
no longer have any effect on malloc.

Running libhugetlbfs tests on both 2M and 1G hugepages

We have systems supporting both 2M and 1G hugepages; either one can be made
the default during boot.  Both sizes can be present on a given system.

However, I see that the libhugetlbfs tests run only with the default hugepage size, not with the other sizes that are available.

Is there a command line option to pass the hugepage size with which the tests have to run?
If not, can we enhance the tests to run with all hugepage sizes available on the system?

numa-fying pre-allocations (obey-mempolicy) on multiple nodes

I'm attempting to use the --pool-pages-min and --pool-pages-max settings to pre-allocate a fixed number of hugepages on multiple NUMA nodes. If I want a 50/50, 100/0, or 0/100 split, this is easy (no numactl, numactl -N0 hugeadm --obey-mempolicy, and numactl -N1 hugeadm --obey-mempolicy respectively).

If I want to, for example, have a max of 100 hugepages, allocate 25 of those on node 0 and 75 on node 1, I can't figure out a way to reliably accomplish this.

Anyone have any suggestions on how to accomplish this?

This is what I'm trying that doesn't work (Well, it works, but it doesn't do what I'd like):

hugeadm --pool-pages-max 2097152:100
numactl -N 0 hugeadm --obey-mempolicy --pool-pages-min 2097152:25
numactl -N 1 hugeadm --obey-mempolicy --pool-pages-min 2097152:75

libhugetlbfs lacks support for PIEs

Ubuntu 16.10 and 17.04 ship compilers that default to creating position independent executables (see https://wiki.ubuntu.com/Security/Features).  This results in libhugetlbfs testsuite failures, which can be worked around by building with "make LDFLAGS=-no-pie" if you don't want to test PIEs.

However, this is not just a testsuite problem.

libhugetlbfs just doesn't handle PIEs properly.  In some cases you get segfaults in the library, e.g. linkhuge_rw with HUGETLB_ELFMAP=RW.  In others, such as linkhuge_rw with HUGETLB_ELFMAP=R, the huge text segment is mapped at the wrong address: the linked address rather than the run-time address.  See below for a ppc64le example.  The PIE is running at 20000000.

cat /proc/56723/maps
00000000-01000000 r-xp 00000000 00:25 69871865 /dev/hugepages/libhugetlbfs.tmp.1HEm6e (deleted)
20000000-20020000 r-xp 00000000 fd:00 65144018 /home/amodra/build/libhugetlbfs/tests/obj64/linkhuge_rw
[snip rest]

The problem also reproduces on x86_64.

make check fails with kernel 4.13-rc2: "Failed to open direct-IO file: File exists"

I have set the debug-related environment variables but do not get useful info:

direct (2M: 32):	libhugetlbfs [local:9868]: INFO: Found pagesize 2048 kB
libhugetlbfs [local:9868]: INFO: Detected page sizes:
libhugetlbfs [local:9868]: INFO:    Size: 2048 kB (default)  Mount: /dev/hugepages
libhugetlbfs [local:9868]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [local:9868]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [local:9868]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [local:9868]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [local:9868]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [local:9868]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [local:9868]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [local:9868]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [local:9868]: INFO: No segments were appropriate for remapping
Bad configuration: Failed to open direct-IO file: File exists
direct (2M: 64):	libhugetlbfs [local:9869]: INFO: Found pagesize 2048 kB
libhugetlbfs [local:9869]: INFO: Detected page sizes:
libhugetlbfs [local:9869]: INFO:    Size: 2048 kB (default)  Mount: /dev/hugepages
libhugetlbfs [local:9869]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [local:9869]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [local:9869]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [local:9869]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [local:9869]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [local:9869]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [local:9869]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [local:9869]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [local:9869]: INFO: No segments were appropriate for remapping
Bad configuration: Failed to open direct-IO file: File exists

It says "Bad configuration", but I cannot find documentation on how to set the file path for make check.
Am I missing something?

shm-perms fails with "Page size is too large for configured SEGMENT_SIZE" on aarch64

Due to CONFIG_ARM64_64K_PAGES=y we get 512MB as the default hugepage size; however, the test sets the maximum size of a shmem segment (SHMMAX) too low (64MB), causing this check to trip:
80 check_hugetlb_shm_group();
81 if (hpage_size > SEGMENT_SIZE)
82 CONFIG("Page size is too large for configured SEGMENT_SIZE\n");

The attached patch adjusts the segment size to 1GB, so we don't fail with gigantic hugepages anymore.
tests_shm-perms_adjust_max_segment_size_for_bigger_hugepages.patch.txt

non-zero memory when shrinking after a calloc()

There is an issue where calloc() memory is not zeroed when using libhugetlbfs heap shrinking.

glibc by default expects "newly allocated" memory to be zeroed. This is controlled by the MORECORE_CLEARS #define in the source code. So it can't be modified without recompiling the library.

This is a problem when using shrinking since the code makes glibc believe that everything it asked for was trimmed while only full huge pages were really unmapped (and therefore will be zeroed when the heap grows again).

Since glibc only needs zeroing in some allocation paths, it would be wasteful, in my opinion, to zero the memory that is "trimmed" but still mapped.  Therefore I think the best way is simply to tell glibc we only trimmed full pages.  To keep glibc from constantly trying to trim an area we cannot unmap, I have raised the trim threshold.

Here is what I am suggesting:

--- a/morecore.c
+++ b/morecore.c
@@ -175,6 +175,7 @@ static void *hugetlbfs_morecore(ptrdiff_t increment)
                INFO("Attempting to unmap %ld bytes @ %p\n", -delta,
                        heapbase + mapsize + delta);
                ret = munmap(heapbase + mapsize + delta, -delta);
+                increment = heapbase - heaptop + mapsize + delta;
                if (ret) {
                        WARNING("Unmapping failed while shrinking heap: "
                                "%s\n", strerror(errno));
@@ -357,7 +358,7 @@ void hugetlbfs_setup_morecore(void)
        /* Set some allocator options more appropriate for hugepages */

        if (__hugetlb_opts.shrink_ok)
-               mallopt(M_TRIM_THRESHOLD, hpage_size / 2);
+               mallopt(M_TRIM_THRESHOLD, hpage_size + hpage_size / 2);
        else
                mallopt(M_TRIM_THRESHOLD, -1);
        mallopt(M_TOP_PAD, hpage_size / 2);

data placed in hugepages make malloc core

Hi, I met a problem when using HUGETLB_ELFMAP=RW: it seems that libhugetlbfs remaps an address range that has already been allocated by malloc.
This happens because a constructor in another dynamic library calls malloc before the constructor in libhugetlbfs (setup_libhugetlbfs) runs.  When libhugetlbfs remaps data and bss, it aligns the end to a 2M boundary and resets the chunks.
Is there any way to make the constructor in libhugetlbfs run first?  Or can anyone help me solve this case?

Read only Dynamic section in linkhuge_rw cause SIGSEGV on MIPS64 Port

The MIPS SVR4 ABI requires that the Dynamic section be read-only (unlike x86, which may have a read-write Dynamic section that is "biased" by ld.so with the run-time base address).  The symtab and strtab obtained from find_tables are not biased with the base address, which causes a SIGSEGV.

Based on the work in #49, I have the following patch:

index ce2ed24..845e5b8 100644
--- a/elflink.c
+++ b/elflink.c
@@ -392,14 +392,19 @@ static int find_dynamic(Elf_Dyn **dyntab, const ElfW(Addr) addr,
 }

 /* Find the dynamic string and symbol tables */
-static int find_tables(Elf_Dyn *dyntab, Elf_Sym **symtab, char **strtab)
+static int find_tables(Elf_Dyn *dyntab, const ElfW(Addr) addr,
+                      Elf_Sym **symtab, char **strtab)
 {
        int i = 1;
        while ((dyntab[i].d_tag != DT_NULL)) {
                if (dyntab[i].d_tag == DT_SYMTAB)
-                       *symtab = (Elf_Sym *)dyntab[i].d_un.d_ptr;
+                       *symtab = dyntab[i].d_un.d_ptr < addr ?
+                               (Elf_Sym *)(addr + dyntab[i].d_un.d_ptr) :
+                               (Elf_Sym *)(dyntab[i].d_un.d_ptr);
                else if (dyntab[i].d_tag == DT_STRTAB)
-                       *strtab = (char *)dyntab[i].d_un.d_ptr;
+                       *strtab = dyntab[i].d_un.d_ptr < addr ?
+                               (char *)(addr + dyntab[i].d_un.d_ptr) :
+                               (char*)(dyntab[i].d_un.d_ptr);
                i++;
        }

@@ -440,11 +445,13 @@ static int find_numsyms(Elf_Sym *symtab, char *strtab)
  * - Object type (variable)
  * - Non-zero size (zero size means the symbol is just a marker with no data)
  */
-static inline int keep_symbol(char *strtab, Elf_Sym *s, void *start, void *end)
+static inline int keep_symbol(const ElfW(Addr) addr, char *strtab, Elf_Sym *s, void *start, void *end)
 {
-       if ((void *)s->st_value < start)
+       const void* sym_addr = s->st_value < addr ? (void*)(s->st_value + addr) :
+                                                   (void*)(s->st_value);
+       if (sym_addr < start)
                return 0;
-       if ((void *)s->st_value > end)
+       if (sym_addr > end)
                return 0;
        if ((ELF_ST_BIND(s->st_info) != STB_GLOBAL) &&
            (ELF_ST_BIND(s->st_info) != STB_WEAK))
@@ -455,7 +462,7 @@ static inline int keep_symbol(char *strtab, Elf_Sym *s, void *start, void *end)
                return 0;

        if (__hugetlbfs_debug)
-               DEBUG("symbol to copy at %p: %s\n", (void *)s->st_value,
+               DEBUG("symbol to copy at %p: %s\n", sym_addr,
                                                strtab + s->st_name);

        return 1;
@@ -499,7 +506,7 @@ static void get_extracopy(struct seg_info *seg, const ElfW(Addr) addr,
                goto bail;

        /* Find symbol and string tables */
-       ret = find_tables(dyntab, &symtab, &strtab);
+       ret = find_tables(dyntab, addr, &symtab, &strtab);
        if (ret < 0)
                goto bail;

@@ -514,12 +521,14 @@ static void get_extracopy(struct seg_info *seg, const ElfW(Addr) addr,
        end = start;

        for (sym = symtab; sym < symtab + numsyms; sym++) {
-               if (!keep_symbol(strtab, sym, start, end_orig))
+               if (!keep_symbol(addr, strtab, sym, start, end_orig))
                        continue;

                /* These are the droids we are looking for */
                found_sym = 1;
-               sym_end = (void *)(sym->st_value + sym->st_size);
+               sym_end = sym->st_value < addr ?
+                       (void *)(sym->st_value + addr + sym->st_size) :
+                       (void *)(sym->st_value + sym->st_size);
                if (sym_end > end)
                        end = sym_end;
        }

It passes on both x86-64 and mips64.  Any advice?

API request: Change __hugetlbfs_verbose

Thank you for maintaining a useful library!

Would it be possible to change __hugetlbfs_verbose on an API level?

I would like to be able to choose within my application the default verbosity level, without patching the source.

Thank you for the consideration.

When shrinking the heap, mapsize might not be adjusted

If libhugetlbfs is compiled with MAP_HUGETLB, mapsize will not be adjusted after the munmap().

Here is my suggested fix (I can send a pull request if you prefer):

diff --git a/morecore.c b/morecore.c
index 62ad252..3fd998a 100644
--- a/morecore.c
+++ b/morecore.c
@@ -178,16 +178,18 @@ static void *hugetlbfs_morecore(ptrdiff_t increment)
                if (ret) {
                        WARNING("Unmapping failed while shrinking heap: "
                                "%s\n", strerror(errno));
-               } else if (!__hugetlb_opts.map_hugetlb && !using_default_pagesize){
-
-                       /*
-                        * Now shrink the hugetlbfs file.
-                        */
+               } else {
                        mapsize += delta;
-                       ret = ftruncate(heap_fd, mapsize);
-                       if (ret) {
-                               WARNING("Could not truncate hugetlbfs file to "
-                                       "shrink heap: %s\n", strerror(errno));
+                       if (!__hugetlb_opts.map_hugetlb && !using_default_pagesize){
+
+                               /*
+                               * Now shrink the hugetlbfs file.
+                               */
+                               ret = ftruncate(heap_fd, mapsize);
+                               if (ret) {
+                                       WARNING("Could not truncate hugetlbfs file to "
+                                               "shrink heap: %s\n", strerror(errno));
+                               }
                        }
                }

hugetlbfs_prefault potential deadlock

hugetlbfs_prefault as implemented opens /dev/zero and uses readv to force the kernel to fault in allocated memory regions.  We use a combination of SolarFlare OpenOnload (now AMD) and libhugetlbfs with LD_PRELOAD.  OpenOnload intercepts file syscalls, including open/close and readv.  When OpenOnload calls malloc to allocate memory for its private purposes and the pre-faulting feature in libhugetlbfs is enabled, hugetlbfs_prefault ends up calling back into the OpenOnload code, which is not re-entrancy safe at that point, causing a potential deadlock.

I am also wondering why this implementation, which uses at least 3 syscalls (open, close, and at least one readv), was chosen over something like mlock to prefault the region.

Here's a callstack:

#0 0x00007ffff77077fd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007ffff7700cf4 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2 0x00007ffff7b4b8d6 in oo_rwlock_lock_write (l=) at ../../../../../src/include/onload/ul/rwlock.h:276
#3 citp_fdtable_probe (fd=6) at ../../../../../src/lib/transport/unix/fdtable.c:547
#4 citp_fdtable_probe (fd=6) at ../../../../../src/lib/transport/unix/fdtable.c:536
#5 0x00007ffff7b4f0b5 in citp_fdtable_lookup_fast (ctx=0x7fffffffbc30, fd=6) at ../../../../../src/lib/transport/unix/fdtable.c:747
#6 0x00007ffff7b4036b in onload_readv (fd=6, vector=0x7fffffffbcc0, count=1) at ../../../../../src/lib/transport/unix/sockcall_intercept.c:1669
#7 0x00007ffff791e28b in __lh_hugetlbfs_prefault () from /lib64/libhugetlbfs.so
#8 0x00007ffff791ed08 in hugetlbfs_morecore () from /lib64/libhugetlbfs.so
#9 0x00007ffff468c267 in sysmalloc () from /lib64/libc.so.6
#10 0x00007ffff468d341 in _int_malloc () from /lib64/libc.so.6
#11 0x00007ffff468e5b3 in malloc () from /lib64/libc.so.6
#12 0x00007ffff7b90508 in netif_tcp_helper_build (ni=ni@entry=0xc526590) at ../../../../../src/lib/transport/ip/netif_init.c:2250
#13 0x00007ffff7b97f54 in netif_tcp_helper_alloc_u (stack_name=0x7ffff7dd6000 <oo_stackname_global+32> "", flags=0, opts=, ni=0xc526590, fd=4) at ../../../../../src/lib/transport/ip/netif_init.c:2518
#14 ci_netif_ctor (ni=ni@entry=0xc526590, fd=4, stack_name=stack_name@entry=0x7ffff7dd6000 <oo_stackname_global+32> "", flags=flags@entry=0) at ../../../../../src/lib/transport/ip/netif_init.c:2785
#15 0x00007ffff7b64075 in __citp_netif_alloc (out_ni=0x7fffffffc470, flags=0, name=, fd=0x7fffffffc4c4) at ../../../../../src/lib/transport/common/netif_init.c:150
#16 citp_netif_alloc_and_init (fd=fd@entry=0x7fffffffc4c4, out_ni=out_ni@entry=0x7fffffffc4c8) at ../../../../../src/lib/transport/common/netif_init.c:356
#17 0x00007ffff7b55032 in citp_udp_socket (domain=2, type=2, protocol=) at ../../../../../src/lib/transport/unix/udp_fd.c:112
#18 0x00007ffff7b4f2fe in citp_protocol_manager_create_socket (domain=domain@entry=2, type=type@entry=2, protocol=protocol@entry=0) at ../../../../../src/lib/transport/unix/protocol_manager.c:161
#19 0x00007ffff7b3cff7 in onload_socket (domain=2, type=2, protocol=0) at ../../../../../src/lib/transport/unix/sockcall_intercept.c:288
... application calls socket() to create a socket

A similar issue was raised and addressed with OpenOnload related to the jemalloc library, which also uses LD_PRELOAD and calls into various syscalls on the first call to malloc (https://support.xilinx.com/s/article/75453?language=en_US).

Github's release tarball not useable - version.h:1:2: error: #error UNVERSIONED tarball

Hi,

when you try building from the latest release tarball (https://github.com/libhugetlbfs/libhugetlbfs/archive/2.20.tar.gz) you will end up with an error like

In file included from version.c:1:0:
version.h:1:2: error: #error UNVERSIONED tarball
 #error UNVERSIONED tarball
  ^
version.c:3:55: error: expected ‘,’ or ‘;’ before ‘VERSION’
 static const char libhugetlbfs_version[] = "VERSION: "VERSION;
                                                       ^
Makefile:294: recipe for target 'obj64/version.o' failed
In file included from elflink.c:42:0:
version.h:1:2: error: #error UNVERSIONED tarball
 #error UNVERSIONED tarball
  ^
make: *** [obj64/version.o] Error 1
make: *** Waiting for unfinished jobs....

When you compare the GitHub release tarball with the 2.19 release tarball from SF, you will notice that the SF release tarball contains a "version" file...

Map Text section with Static Linking

Hello,

I am trying some experiments with libhugetlbfs library. I want to fit the text section of my executable in a hugepage. I also want to put any other dependent library function (like printf, scanf etc) in hugepage. So, I am trying to statically compile my binary.

But, I am not able to get it running. Any help on how to statically compile programs with libhugetlbfs and get it running will be great, I am struggling for two weeks now.

// a.c ------ Program ------

#include <stdio.h>

int main() {

int a[1000000] = {1};
scanf("%d", &a[0]); // Just so that I can analyse pmaps
return 0;
}

//-------------End--------

Compilation Command:

$ gcc -c -o a.o a.c

$ gcc -static --verbose -B/usr/local/share/libhugetlbfs -L/usr/local/lib -Wl,--hugetlbfs-align -o static.out a.o -Wl,--no-as-needed -lpthread -Wl,--whole-archive -lhugetlbfs -Wl,--no-whole-archive


Trying to run using :

$ sudo hugectl --verbose 99 --text --no-preload ./static.out
hugectl: INFO: LD_PRELOAD disabled
hugectl: INFO: HUGETLB_DEBUG='yes'
hugectl: INFO: HUGETLB_VERBOSE='99'
hugectl: INFO: LD_LIBRARY_PATH='/usr/local/lib:/usr/local/lib64:'
hugectl: INFO: HUGETLB_ELFMAP='R'
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Found pagesize 2048 kB
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Detected page sizes:
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Size: 2048 kB (default) Mount: /dev/hugepages
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Parsed kernel version: [4] . [3] . [0]
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Kernel has MAP_PRIVATE reservations. Disabling heap prefaulting.
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Segment 0 (phdr 0): 0x8000000-0x81042f1 (filesz=0x1042f1) (prot = 0x5)
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: DEBUG: Total memsz = 0x1042f1, memsz of largest segment = 0x1042f1
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: libhugetlbfs version: 2.20
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4293]: INFO: Mapped hugeseg at 0xf7400000. Copying 0x1042f1 bytes and 0 extra bytes from 0x8000000...done
libhugetlbfs [p-Standard-PC-i440FX-PIIX-1996:4292]: INFO: Prepare succeeded


Core was generated by `./static.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x08081c25 in munmap ()

(gdb) bt
#0 0x08081c25 in munmap ()
#1 0x0804b743 in remap_segments (seg=, num=)

at elflink.c:1122

#2 __lh_hugetlbfs_setup_elflink () at elflink.c:1332
#3 0x08048921 in setup_libhugetlbfs () at init.c:36
#4 0x080528cd in __libc_csu_init ()
#5 0x0805240e in __libc_start_main ()
#6 0x08048b6e in _start ()

$ uname -a
Linux p-Standard-PC-i440FX-PIIX-1996 4.3.0+ #12 SMP Tue Mar 8 20:46:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux


If any more information is needed please tell.
Please help.

unrecognized option '--hugetlbfs-link=BDT'

I am trying to compile the project on Arch Linux and it fails:


	 VERSION
version update: commit<4d54caaf4f6418fbde08737aa9416920a425dc09>
version string: commit<4d54caaf4f6418fbde08737aa9416920a425dc09> (modified)
	 CC64 obj64/elflink.o
	 AS64 obj64/sys-elf_x86_64.o
	 CC64 obj64/hugeutils.o
	 CC64 obj64/version.o
	 CC64 obj64/init.o
	 CC64 obj64/morecore.o
version.c:3:19: warning: ‘libhugetlbfs_version’ defined but not used [-Wunused-const-variable=]
 static const char libhugetlbfs_version[] = "VERSION: "VERSION;
                   ^~~~~~~~~~~~~~~~~~~~
	 CC64 obj64/debug.o
	 CC64 obj64/alloc.o
	 CC64 obj64/shm.o
	 CC64 obj64/kernel-features.o
	 CC64 obj64/init_privutils.o
	 CCHOST obj/init_privutils.o
	 CCHOST obj/debug.o
	 CCHOST obj/hugeutils.o
	 CCHOST obj/kernel-features.o
	 CCHOST obj/hugectl.o
	 CCHOST obj/hugeedit.o
	 CCHOST obj/hugeadm.o
	 CCHOST obj/pagesize.o
	 LDHOST obj/hugeedit
	 LDHOST obj/hugectl
	 LD64 (shared) obj64/libhugetlbfs_privutils.so
	 LD64 (shared) obj64/libhugetlbfs.so
	 AR64 obj64/libhugetlbfs.a
	 ARHOST obj/libhugetlbfs_privutils.a
	 LDHOST obj/pagesize
	 LDHOST obj/hugeadm
	 CC64 obj64/testutils.o
	 CC64 obj64/libtestutils.o
	 CC64 obj64/test_root.o
	 CC64 obj64/find_path.o
	 CC64 obj64/unlinked_fd.o
	 CC64 obj64/misalign.o
	 CC64 obj64/gethugepagesize.o
	 CC64 obj64/readback.o
	 CC64 obj64/truncate.o
	 CC64 obj64/shared.o
	 CC64 obj64/private.o
	 CC64 obj64/fork-cow.o
	 CC64 obj64/empty_mounts.o
	 CC64 obj64/large_mounts.o
	 CC64 obj64/meminfo_nohuge.o
	 CC64 obj64/ptrace-write-hugepage.o
	 CC64 obj64/icache-hygiene.o
	 CC64 obj64/slbpacaflush.o
	 CC64 obj64/chunk-overcommit.o
	 CC64 obj64/mprotect.o
icache-hygiene.c: In function ‘test_once’:
icache-hygiene.c:158:2: warning: ignoring return value of ‘ftruncate’, declared with attribute warn_unused_result [-Wunused-result]
  ftruncate(fd, 0);
  ^~~~~~~~~~~~~~~~
icache-hygiene.c:162:3: warning: ignoring return value of ‘ftruncate’, declared with attribute warn_unused_result [-Wunused-result]
   ftruncate(fd, 0);
   ^~~~~~~~~~~~~~~~
icache-hygiene.c:171:2: warning: ignoring return value of ‘ftruncate’, declared with attribute warn_unused_result [-Wunused-result]
  ftruncate(fd, hpage_size);
  ^~~~~~~~~~~~~~~~~~~~~~~~~
icache-hygiene.c:177:2: warning: ignoring return value of ‘ftruncate’, declared with attribute warn_unused_result [-Wunused-result]
  ftruncate(fd, 0);
  ^~~~~~~~~~~~~~~~
	 CC64 obj64/alloc-instantiate-race.o
	 CC64 obj64/mlock.o
	 CC64 obj64/truncate_reserve_wraparound.o
	 CC64 obj64/truncate_sigbus_versus_oom.o
	 CC64 obj64/map_high_truncate_2.o
	 CC64 obj64/truncate_above_4GB.o
alloc-instantiate-race.c: In function ‘thread_racer’:
alloc-instantiate-race.c:114:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable]
  int rc;
      ^~
	 CC64 obj64/direct.o
	 CC64 obj64/misaligned_offset.o
	 CC64 obj64/brk_near_huge.o
	 CC64 obj64/task-size-overrun.o
	 CC64 obj64/stack_grow_into_huge.o
	 CC64 obj64/counters.o
	 CC64 obj64/quota.o
	 CC64 obj64/heap-overflow.o
	 CC64 obj64/get_huge_pages.o
	 CC64 obj64/get_hugepage_region.o
	 CC64 obj64/gethugepagesizes.o
	 CC64 obj64/shmoverride_linked.o
	 CC64 obj64/madvise_reserve.o
	 CC64 obj64/fadvise_reserve.o
shmoverride_linked.c: In function ‘local_read_meminfo’:
shmoverride_linked.c:113:11: warning: variable ‘readerr’ set but not used [-Wunused-but-set-variable]
  int len, readerr;
           ^~~~~~~
	 CC64 obj64/readahead_reserve.o
	 CC64 obj64/shm-perms.o
	 CC64 obj64/mremap-expand-slice-collision.o
	 CC64 obj64/mremap-fixed-normal-near-huge.o
	 CC64 obj64/mremap-fixed-huge-near-normal.o
	 CC64 obj64/corrupt-by-cow-opt.o
	 CC64 obj64/noresv-preserve-resv-page.o
	 CC64 obj64/noresv-regarded-as-resv.o
	 CC64 obj64/fallocate_basic.o
	 CC64 obj64/fallocate_align.o
	 CC64 obj64/fallocate_stress.o
	 CC64 obj64/malloc.o
	 CC64 obj64/malloc_manysmall.o
	 CC64 obj64/dummy.o
	 CC64 obj64/heapshrink.o
	 CC64 obj64/shmoverride_unlinked.o
	 CC64 obj64/mmap-gettest.o
	 CC64 obj64/mmap-cow.o
	 CC64 obj64/shm-gettest.o
	 CC64 obj64/shm-getraw.o
	 CC64 obj64/shm-fork.o
shmoverride_unlinked.c: In function ‘local_read_meminfo’:
shmoverride_unlinked.c:113:11: warning: variable ‘readerr’ set but not used [-Wunused-but-set-variable]
  int len, readerr;
           ^~~~~~~
	 SCRIPT64 obj64/dummy.ldscript
	 CC64 obj64/zero_filesize_segment.o
	 CC64 obj64/linkhuge.o
	 CC64 obj64/linkhuge_nofd.o
	 CC64 obj64/linkshare.o
	 CC64 obj64/straddle_4GB.o
	 CC64 obj64/huge_at_4GB_normal_below.o
	 CC64 obj64/huge_below_4GB_normal_above.o
	 LD64 (lib test) obj64/shmoverride_linked_static
	 CC64 obj64/get_hugetlbfs_path.o
	 CC64 obj64/compare_kvers.o
	 CC64 obj64/heapshrink-helper-pic.o
	 LD64 (lib test) obj64/gethugepagesize
	 LD64 (lib test) obj64/test_root
heapshrink-helper.c: In function ‘setup_heapshrink_helper’:
heapshrink-helper.c:24:2: warning: ignoring return value of ‘malloc’, declared with attribute warn_unused_result [-Wunused-result]
  (void) malloc(1);
  ^~~~~~~~~~~~~~~~
	 LD64 (lib test) obj64/find_path
	 LD64 (lib test) obj64/unlinked_fd
	 LD64 (lib test) obj64/misalign
	 LD64 (lib test) obj64/readback
	 LD64 (lib test) obj64/truncate
	 LD64 (lib test) obj64/shared
	 LD64 (lib test) obj64/private
	 LD64 (lib test) obj64/fork-cow
	 LD64 (lib test) obj64/empty_mounts
	 LD64 (lib test) obj64/large_mounts
	 LD64 (lib test) obj64/meminfo_nohuge
	 LD64 (lib test) obj64/ptrace-write-hugepage
	 LD64 (lib test) obj64/icache-hygiene
	 LD64 (lib test) obj64/slbpacaflush
	 LD64 (lib test) obj64/chunk-overcommit
	 LD64 (lib test) obj64/mprotect
	 LD64 (lib test) obj64/alloc-instantiate-race
	 LD64 (lib test) obj64/mlock
	 LD64 (lib test) obj64/truncate_reserve_wraparound
	 LD64 (lib test) obj64/truncate_sigbus_versus_oom
	 LD64 (lib test) obj64/map_high_truncate_2
	 LD64 (lib test) obj64/truncate_above_4GB
	 LD64 (lib test) obj64/direct
	 LD64 (lib test) obj64/misaligned_offset
	 LD64 (lib test) obj64/brk_near_huge
	 LD64 (lib test) obj64/task-size-overrun
	 LD64 (lib test) obj64/stack_grow_into_huge
	 LD64 (lib test) obj64/counters
	 LD64 (lib test) obj64/quota
	 LD64 (lib test) obj64/heap-overflow
	 LD64 (lib test) obj64/get_huge_pages
	 LD64 (lib test) obj64/get_hugepage_region
	 LD64 (lib test) obj64/shmoverride_linked
	 LD64 (lib test) obj64/gethugepagesizes
	 LD64 (lib test) obj64/madvise_reserve
	 LD64 (lib test) obj64/readahead_reserve
	 LD64 (lib test) obj64/fadvise_reserve
	 LD64 (lib test) obj64/shm-perms
	 LD64 (lib test) obj64/mremap-expand-slice-collision
	 LD64 (lib test) obj64/mremap-fixed-normal-near-huge
	 LD64 (lib test) obj64/mremap-fixed-huge-near-normal
	 LD64 (lib test) obj64/corrupt-by-cow-opt
	 LD64 (lib test) obj64/noresv-preserve-resv-page
	 LD64 (lib test) obj64/noresv-regarded-as-resv
	 LD64 (lib test) obj64/fallocate_basic
	 LD64 (lib test) obj64/fallocate_align
	 LD64 (lib test) obj64/fallocate_stress
	 LD64 (nolib test) obj64/malloc
	 LD64 (nolib test) obj64/malloc_manysmall
	 LD64 (nolib test) obj64/dummy
	 LD64 (nolib test) obj64/heapshrink
	 LD64 (nolib test) obj64/shmoverride_unlinked
	 LD64 (lib test) obj64/mmap-gettest
	 LD64 (lib test) obj64/mmap-cow
	 LD64 (lib test) obj64/shm-gettest
	 LD64 (lib test) obj64/shm-getraw
	 LD64 (lib test) obj64/shm-fork
	 LD64 (preload test) obj64/zero_filesize_segment
	 LD64 (hugelink test) obj64/linkhuge
	 LD64 (hugelink test) obj64/linkhuge_nofd
	 LD64 (hugelink test) obj64/linkshare
	 LD64 (xB test) obj64/xB.linkhuge
	 LD64 (xB test) obj64/xB.linkhuge_nofd
/usr/bin/ld: warning: zero_filesize_segment.ld contains output sections; did you forget -T?
	 LD64 (xBDT test) obj64/xBDT.linkhuge
	 LD64 (xB test) obj64/xB.linkshare
	 LD64 (xBDT test) obj64/xBDT.linkhuge_nofd
	 LD64 (xBDT test) obj64/xBDT.linkshare
/usr/bin/ld: unrecognized option '--hugetlbfs-link=B'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
	 LD64 (lib test) obj64/straddle_4GB_static
make[1]: *** [Makefile:236: obj64/xB.linkhuge] Error 1
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: unrecognized option '--hugetlbfs-link=B'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:236: obj64/xB.linkhuge_nofd] Error 1
/usr/bin/ld: unrecognized option '--hugetlbfs-link=B'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:236: obj64/xB.linkshare] Error 1
/usr/bin/ld: unrecognized option '--hugetlbfs-link=BDT'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
/usr/bin/ld: unrecognized option '--hugetlbfs-link=BDT'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:248: obj64/xBDT.linkhuge] Error 1
make[1]: *** [Makefile:248: obj64/xBDT.linkhuge_nofd] Error 1
/usr/bin/ld: unrecognized option '--hugetlbfs-link=BDT'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:248: obj64/xBDT.linkshare] Error 1
make: *** [Makefile:249: tests/all] Error 2

Is there a way to fix it?

"ERROR: Line too long when parsing mounts"

Minor issue I stumbled on:

$ sudo mount -t hugetlbfs -o pagesize=1G,size=1G,uid=1000,gid=1000 none /huge

$ hugectl ...
[...]
libhugetlbfs []: ERROR: Line too long when parsing mounts
libhugetlbfs []: WARNING: No mount point found for default huge page size. Using first available mount point.
libhugetlbfs []: INFO: Detected page sizes:
libhugetlbfs []: INFO:    Size: 2048 kB (default)  Mount: 
libhugetlbfs [ioctl:116851]: INFO: Parsed kernel version: [5] . [3] . [0] 

Because of this error, it misses the hugetlbfs mount point: the find_mounts function bails out entirely in this case:

libhugetlbfs/hugeutils.c

Lines 642 to 648 in e6499ff

while ((bytes = read(fd, line, LINE_MAXLEN)) > 0) {
	line[LINE_MAXLEN] = '\0';
	eol = strchr(line, '\n');
	if (!eol) {
		ERROR("Line too long when parsing mounts\n");
		break;
	}

The very long line is simply caused by launching a docker container with many layers, using an overlay filesystem:

$ cat /proc/self/mounts
[...]
overlay /mnt/sda/docker/overlay2/e7f8f4dd4155bb5151d90ae8ea92788cb69c079ad62d822b40295e9e907eeea9/merged overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/AEKD2CKJMD3U7TIPUFY4LS2XRN:/var/lib/docker/overlay2/l/ROZDMXG2L6B6M67KJ5NVU2HRHF:/var/lib/docker/overlay2/l/MVOZ7OM65F4DHT4YBEIDWNVGXC:/var/lib/docker/overlay2/l/KVFF6IBTOWNTPK5OU44Q7W3CR6:/var/lib/docker/overlay2/l/NUGMPP2THZBDLFA64TK56CL5LJ:/var/lib/docker/overlay2/l/PQ7VXQDXD76JZ3IQLRRV3BLOUE:/var/lib/docker/overlay2/l/AYZ6NJ2BNIRHZBKYHDNTUN6SAZ:/var/lib/docker/overlay2/l/YV25GCPYWQF4ADLFYS7Y2U4SWS:/var/lib/docker/overlay2/l/OR6KR4RR745ZDF42JSKVUK5ARZ:/var/lib/docker/overlay2/l/IWYGDQU7EFNOFMPYEROLJOXLBC:/var/lib/docker/overlay2/l/5QNLUUHO7VZECPRMUWG64KRI4B:/var/lib/docker/overlay2/l/EU2N7M5ZCIALIMP6FI6QOMHQD4:/var/lib/docker/overlay2/l/YQEJIC27EJJQFUSGNBU2UJDJID:/var/lib/docker/overlay2/l/J4S273ARWY2CJIKOKV7LQLNP7J:/var/lib/docker/overlay2/l/26F66GLUX6A5YHXPJNIPYFD5JZ:/var/lib/docker/overlay2/l/F2V7FMEMT47AI3PZKAXQKACBCV:/var/lib/docker/overlay2/l/KGM3PD5GUQY2K4UM4YEGXLNEG6:/var/lib/docker/overlay2/l/EL3POXHZ4GYYII4FJZPVWYKQFY:/var/lib/docker/overlay2/l/VWT6QV4ZHSBR5LKSMNWOUBENGJ:/var/lib/docker/overlay2/l/ZBMCA5BVL5FONNW5LVQNXSLKTP:/var/lib/docker/overlay2/l/FLANFO5WB7IORYOUM3MDGMQM2N:/var/lib/docker/overlay2/l/OA7JF4TWMPZHGT7XKVXVWXRSDJ:/var/lib/docker/overlay2/l/XPMH6KIWGPZHUXEKG7BSVQFDLI:/var/lib/docker/overlay2/l/KCLYWGCPLVPKPYUSEGKE2UZUWY:/var/lib/docker/overlay2/l/T4MVJAIC35KW5SWGJEB3KK5R4G:/var/lib/docker/overlay2/l/MPUHYKBYNZWNR5HZ44A4Z2WO6T:/var/lib/docker/overlay2/l/R5636AMPI7CZFIVKDGB72MS4GR:/var/lib/docker/overlay2/l/7PTX26ASBPSKGY73F46K24O5IH:/var/lib/docker/overlay2/l/APNUMRDBQBZNLH266S63MQ7GMZ:/var/lib/docker/overlay2/l/UTOCNTGTDQVKY6X7FTKMLPGUA7:/var/lib/docker/overlay2/l/XBHAVZOSKNL7F6732HW37MQO7K:/var/lib/docker/overlay2/l/RV3S2XHDH7ZXQVNCBEJOSZYYKJ,upperdir=/var/lib/docker/overlay2/e7f8f4dd4155bb5151d90ae8ea92788cb69c079ad62d822b40295e9e907eeea9/diff,workdir=/var/lib/docker/overlay2/e
7f8f4dd4155bb5151d90ae8ea92788cb69c079ad62d822b40295e9e907eeea9/work,xino=off 0 0

Cannot load data into /var/lib/hugetlbfs/global/

Hello,

I'm writing to you because I'm currently doing sysadmin work for a research group.
They used to load data in their software the following way:

cpt00s: sudo cp /onefile/data.bin /var/lib/hugetlbfs/global/pagesize-1GB/

Now, for an unknown reason, it no longer works, so they reached out to me:

cpt00s: sudo cp /onefile/data.bin /var/lib/hugetlbfs/global/pagesize-1GB/
cp: error writing ‘/var/lib/hugetlbfs/global/pagesize-1GB/data.bin’: Invalid argument

Could you help me understand how this was ever possible, and whether cp or another program is supposed to support loading data this way?
After contacting the developer, I saw that on one machine cp was able to do this, but I don't know whether they were using a special cp.

Regarding huge pages, things seem to be in the expected state:

cpt00s: hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152    16384    16384    16384        *
1073741824       64       64       64
cpt00s: hugeadm --list-all-mounts
Mount Point                            Options
/dev/hugepages                         rw,seclabel,relatime
/var/lib/hugetlbfs/global/pagesize-2MB rw,seclabel,relatime,pagesize=2097152
/var/lib/hugetlbfs/global/pagesize-1GB rw,seclabel,relatime,pagesize=1073741824

The kernel seems to have been booted with the correct parameters:

BOOT_IMAGE=/boot/vmlinuz-3.10.0-1160.36.2.el7.x86_64 root=LABEL=cloudimg-rootfs ro console=tty0 crashkernel=auto console=ttyS0,115200 console=tty0 console=ttyS0,115200 no_timer_check transparent_hugepage=never default_hugepagesz=2M hugepagesz=2M hugepages=16384 hugepagesz=1G hugepages=64 LANG=en_US.UTF-8

My questions are:

  • Does cp support loading data into huge pages this way?
  • Is there a way we can track down what's going wrong?

Sincerely,

Best regards

some cases in make check are killed by a signal due to segmentation faults

test steps:

make
obj/hugeadm --add-temp-swap=64 --pool-pages-min 2MB:64 --hard
make check

and the cases below are all reported as killed_by_signal:

HUGETLB_ELFMAP=W linkhuge_rw (2M: 32):	
HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):	
HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (2M: 32):	
HUGETLB_SHARE=0 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):	
HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (2M: 32):	
HUGETLB_SHARE=1 HUGETLB_ELFMAP=W linkhuge_rw (2M: 64):

Take HUGETLB_ELFMAP=W linkhuge_rw as an example:

(gdb) r
Starting program: /root/libhugetlbfs/tests/obj32/linkhuge_rw 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Found pagesize 2048 kB
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Detected page sizes:
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO:    Size: 2048 kB (default)  Mount: /dev/hugepages
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [vm-lkp-nex04-8G-6:19571]: INFO: Segment 0 (phdr 3): 0x7ffec4-0x8201e8  (filesz=0x102e0) (prot = 0x3)

Program received signal SIGSEGV, Segmentation fault.
0xf7daf0a6 in find_tables (strtab=<optimized out>, symtab=<optimized out>, dyntab=<optimized out>) at elflink.c:397
397		while ((dyntab[i].d_tag != DT_NULL)) { <===

The index i starts from 1, but dyntab only has one element, which leads to the segmentation fault.
Is this a logic bug?

Using libhugetlbfs "Using hugepages for malloc" in a container causes a Bus Error

I run programs under k8s.

Following the HOWTO, every command that uses libhugetlbfs in the pod causes a bus error:

# LD_PRELOAD=/lib64/libhugetlbfs.so HUGETLB_MORECORE=yes tail -f aaa.conf
Bus error (core dumped)
# cat /proc/sys/vm/nr_overcommit_hugepages
300000000

If I drop the HUGETLB_MORECORE env, it works but does not use huge pages.

But when I enter the mount namespace using nsenter -m --target $PID and execute the same command, it works well.

I did find a similar question and have tried adding capabilities, privileged mode, runAsUser root, and so on, but to no avail:

  securityContext:
    allowPrivilegeEscalation: true
    capabilities:
      add:
      - SYS_ADMIN
      - IPC_LOCK
    privileged: true
    runAsUser: 0

What should I do, and what is causing this problem?

make check fails with "xxx is not hugepage"

HUGETLB_ELFMAP=R linkhuge_rw (2M: 64):	libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Found pagesize 2048 kB
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Segment 0 (phdr 2): 0-0x12b5c  (filesz=0x12b5c) (prot = 0x5)
libhugetlbfs [lkp-hsw-ep4:9751]: INFO: libhugetlbfs version: 2.20 (modified)
libhugetlbfs [lkp-hsw-ep4:9752]: INFO: Mapped hugeseg at 0x7f8fb7c00000. Copying 0x12b5c bytes and 0 extra bytes from (nil)...libhugetlbfs [lkp-hsw-ep4:9751]: INFO: Prepare succeeded
Starting testcase "linkhuge_rw", pid 9751
HUGETLB_ELFMAP=R
entry: small_data, data: 0xc23f35d140, writable: 1
entry: small_data, data: 0xc23f35d140, is_huge: 0
entry: big_data, data: 0xc23f34d140, writable: 1
entry: big_data, data: 0xc23f34d140, is_huge: 0
entry: small_bss, data: 0xc23f36d2c0, writable: 1
entry: small_bss, data: 0xc23f36d2c0, is_huge: 0
entry: big_bss, data: 0xc23f35d2c0, writable: 1
entry: big_bss, data: 0xc23f35d2c0, is_huge: 0
entry: small_const, data: 0xc23eb5f380, writable: 0
entry: small_const, data: 0xc23eb5f380, is_huge: 0
entry: big_const, data: 0xc23eb4f380, writable: 0
entry: big_const, data: 0xc23eb4f380, is_huge: 0
entry: static_func, data: 0xc23eb4e860, writable: 0
entry: static_func, data: 0xc23eb4e54d, is_huge: 0
entry: global_func, data: 0xc23eb4e840, writable: 0
entry: global_func, data: 0xc23eb4e54d, is_huge: 0
Hugepages used for:
FAIL	small_const is not hugepage
HUGETLB_ELFMAP=W HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 32):	libhugetlbfs [:9801]: INFO: HUGETLB_MINIMAL_COPY=no, disabling filesz copy optimization
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Found pagesize 2048 kB
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Segment 0 (phdr 3): 0x7ffec4-0x8201e8  (filesz=0x102e0) (prot = 0x3)
libhugetlbfs [lkp-hsw-ep4:9801]: INFO: libhugetlbfs version: 2.20 (modified)
libhugetlbfs [lkp-hsw-ep4:9802]: INFO: Mapped hugeseg at 0xf7800000. Copying 0x102e0 bytes and 0x10044 extra bytes from 0x7ffec4...libhugetlbfs [lkp-hsw-ep4:9801]: INFO: Prepare succeeded
Starting testcase "linkhuge_rw", pid 9801
HUGETLB_ELFMAP=W
entry: small_data, data: 0x56e300c0, writable: 1
entry: small_data, data: 0x56e300c0, is_huge: 0
entry: big_data, data: 0x56e200c0, writable: 1
entry: big_data, data: 0x56e200c0, is_huge: 0
entry: small_bss, data: 0x56e401e0, writable: 1
entry: small_bss, data: 0x56e401e0, is_huge: 0
entry: big_bss, data: 0x56e301e0, writable: 1
entry: big_bss, data: 0x56e301e0, is_huge: 0
entry: small_const, data: 0x56631f80, writable: 0
entry: small_const, data: 0x56631f80, is_huge: 0
entry: big_const, data: 0x56621f80, writable: 0
entry: big_const, data: 0x56621f80, is_huge: 0
entry: static_func, data: 0x566213b0, writable: 0
entry: static_func, data: 0x56621037, is_huge: 0
entry: global_func, data: 0x56621390, writable: 0
entry: global_func, data: 0x56621037, is_huge: 0
Hugepages used for:
FAIL	small_data is not hugepage
xB.linkhuge (2M: 32):	libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Found pagesize 2048 kB
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: Segment 0 (phdr 4): 0x9000000-0x9010048  (filesz=0) (prot = 0x7)
libhugetlbfs [lkp-hsw-ep4:9627]: INFO: libhugetlbfs version: 2.20 (modified)
libhugetlbfs [lkp-hsw-ep4:9628]: WARNING: Couldn't map hugepage segment to copy data: Invalid argument
libhugetlbfs [lkp-hsw-ep4:9628]: WARNING: Failed to prepare segment
libhugetlbfs [lkp-hsw-ep4:9627]: WARNING: Failed to setup hugetlbfs file for segment 0
Starting testcase "xB.linkhuge", pid 9627
Link string is [xB], HUGETLB_ELFMAP=(null)
Hugepages used for:
FAIL	small_bss is not hugepage

But some cases pass:

HUGETLB_ELFMAP=no linkhuge_rw (2M: 64):	libhugetlbfs [lkp-hsw-ep4:9774]: INFO: Found pagesize 2048 kB
libhugetlbfs [lkp-hsw-ep4:9774]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [lkp-hsw-ep4:9774]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9774]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9774]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [lkp-hsw-ep4:9774]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [lkp-hsw-ep4:9774]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [lkp-hsw-ep4:9774]: INFO: HUGETLB_ELFMAP=no, not attempting to remap program segments
Starting testcase "linkhuge_rw", pid 9774
HUGETLB_ELFMAP=no
entry: small_data, data: 0x1267356140, writable: 1
entry: small_data, data: 0x1267356140, is_huge: 0
entry: big_data, data: 0x1267346140, writable: 1
entry: big_data, data: 0x1267346140, is_huge: 0
entry: small_bss, data: 0x12673662c0, writable: 1
entry: small_bss, data: 0x12673662c0, is_huge: 0
entry: big_bss, data: 0x12673562c0, writable: 1
entry: big_bss, data: 0x12673562c0, is_huge: 0
entry: small_const, data: 0x1266b58380, writable: 0
entry: small_const, data: 0x1266b58380, is_huge: 0
entry: big_const, data: 0x1266b48380, writable: 0
entry: big_const, data: 0x1266b48380, is_huge: 0
entry: static_func, data: 0x1266b47860, writable: 0
entry: static_func, data: 0x1266b4754d, is_huge: 0
entry: global_func, data: 0x1266b47840, writable: 0
entry: global_func, data: 0x1266b4754d, is_huge: 0
Hugepages used for:
PASS

I took the following steps to test:

make 
obj/hugeadm --add-temp-swap=32 --pool-pages-min 2MB:32 --hard
make check

I also tried setting environment variables like HUGETLB_ELFMAP and HUGETLB_FORCE_ELFMAP, but it did not help.
Could you help me?

hugeadm --pool-pages-max fails silently with 1G pages

The current Linux kernel (4.14-rc6 as of this writing) does not support overcommit for 1GB pages. This limitation is poorly documented, but see a comment on the patch "hugetlb: add support for gigantic page allocation at runtime" at https://patchwork.kernel.org/patch/3963961/ : "I didn't add support for hugepage overcommit, that is allocating a gigantic page on demand when /proc/sys/vm/nr_overcommit_hugepages > 0". Analysis of the mm/hugetlb.c code also suggests that this feature is unimplemented, and an attempt to change /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages generates EINVAL.

For this reason 'hugeadm --pool-pages-max 1G:x' cannot accomplish anything. The problem is that it fails silently, and the limitation is not mentioned in the manpage.
What I propose is:

  1. Modify hugeadm.c to emit a meaningful error message in the aforementioned situation,
  2. Explain the above limitation in hugeadm manpage.

Static linking libc.so & tcmalloc results in segfaults when .text is remapped

Hi

I have a use case where statically linking tcmalloc causes a segfault on a .text segment remapping. This is because tcmalloc overrides the mmap/munmap symbols; when tcmalloc is unmapped by libhugetlbfs, the mmap symbols are also unmapped, leading to a segfault.

I have two proposed workarounds which allow the .text segment to be re-mapped in a binary that is statically linked against tcmalloc.

  1. If HUGETLB_MMAP_OVERRIDE = yes then explicitly pull in definitions of mmap/munmap from HUGETLB_MMAP_OVERRIDE_LIB using dlopen/dlsym (if HUGETLB_MMAP_OVERRIDE is unset then default to 'libc.so'). Any error in obtaining function pointers to mmap/munmap will result in the segments not being remapped.

  2. If HUGETLB_MMAP_OVERRIDE = yes then use our own syscall wrappers for mmap/munmap defined in elflink.c that simply call direct_syscall(..) instead of relying on ld to resolve the mmap symbols.

I have attached the first proposed workaround and can attach the second if required. I am leaning towards the second workaround, as it is much simpler than the first.

Would either of these approaches be acceptable to be merged upstream?

Thanks

Max Tottenham
0001-Add-ability-to-remap-static-binaries-w-tcmalloc.patch.txt

'restrict' is a reserved word in C99

GCC 5.1 switched to -std=c99, which exposes a compilation error in the codebase: restrict is a reserved word and should not be used for variable names.

version string: 2.19 (modified)
CC64 obj64/elflink.o
elflink.c: In function ‘get_extracopy’:
elflink.c:484:8: warning: variable ‘sym_start’ set but not used [-Wunused-but-set-variable]
void *sym_start, *sym_end;
^
AS64 obj64/sys-elf_x86_64.o
CC64 obj64/hugeutils.o
hugeutils.c: In function ‘__lh_hugetlbfs_setup_env’:
hugeutils.c:304:40: error: expected identifier or ‘(’ before ‘restrict’
char *p, *tok, *exe, buf[MAX_EXE+1], restrict[MAX_EXE];
^
hugeutils.c:309:11: error: expected expression before ‘restrict’
strncpy(restrict, env, sizeof restrict);
^
hugeutils.c:309:3: error: too few arguments to function ‘strncpy’
strncpy(restrict, env, sizeof restrict);
^
hugeutils.c:310:11: error: expected identifier or ‘(’ before ‘[’ token
restrict[sizeof(restrict)-1] = 0;
^
hugeutils.c:311:12: error: expected expression before ‘restrict’
for (p = restrict; (tok = strtok(p, ":")) != NULL; p = NULL) {
^
Makefile:292: recipe for target 'obj64/hugeutils.o' failed
make: *** [obj64/hugeutils.o] Error 1

linking obj32/linkhuge_rw fails on armv7

Starting with commit ff12744, the linker on armv7 crashes with "Floating point exception" when linking the linkhuge_rw test:

	 LD32 (hugelink test) obj32/linkshare
	 CC32 obj32/linkhuge_rw.o
	 LD32 (hugelink_rw test) obj32/linkhuge_rw
./obj32/ld: line 130: 17291 Floating point exception(core dumped) ${LD} "${args[@]}" ${HTLBOPTS}
collect2: error: ld returned 136 exit status

This is on Fedora 27 with binutils-2.29-13.fc27.armv7hl

I have 2 segments overlapping in 0x0-0x2000000; I think assigning a dedicated layout area is a good idea.

libhugetlbfs: INFO: Segment 0 (phdr 2): 0x200000-0x2008e0 (filesz=0x8e0) (prot = 0x5)
libhugetlbfs: INFO: Segment 1 (phdr 3): 0x5ffdd8-0x600040 (filesz=0x260) (prot = 0x3)
libhugetlbfs: WARNING: Layout problem with segments 0 and 1:
Segments would overlap

[root@openEuler common]# readelf -l test

Elf file type is EXEC (Executable file)
Entry point 0x2005c0
There are 9 program headers, starting at offset 64

Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000200040 0x0000000000200040
0x00000000000001f8 0x00000000000001f8 R 0x8
INTERP 0x0000000000000238 0x0000000000200238 0x0000000000200238
0x000000000000001b 0x000000000000001b R 0x1
[Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
LOAD 0x0000000000000000 0x0000000000200000 0x0000000000200000
0x00000000000008e0 0x00000000000008e0 R E 0x200000
LOAD 0x00000000001ffdd8 0x00000000005ffdd8 0x00000000005ffdd8

gdb breakpoint unusable when using libhugetlbfs

Hello, I found that when using libhugetlbfs, gdb breakpoints are unusable.

Here is my test code

#include <stdio.h>
void xzpeng(void)
{
	int i, j;
	unsigned char *p = (unsigned char *)xzpeng;

	for (j = 0; j < 4; j++) {
		printf("%p: ", p);
		for (i = 0; i < 16; i++)
			printf("%.2x ", *p++);
		printf("\n");
	}
}
int main(void)
{
	char *p = NULL;
	xzpeng();
	printf("hello world\n");
	while (1);
}

normal output:

[root@localhost ~]# gdb ./test
GNU gdb (GDB) EulerOS 7.6.1-80.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-Huawei-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/test...done.
(gdb) b xzpeng
Breakpoint 1 at 0x400768: file test.c, line 5.
(gdb) r
Starting program: /root/./test 
libhugetlbfs [localhost:10158]: INFO: Found pagesize 2048 kB
libhugetlbfs [localhost:10158]: INFO: Parsed kernel version: [3] . [10] . [0]  [pre-release: 327]
libhugetlbfs [localhost:10158]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [localhost:10158]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [localhost:10158]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [localhost:10158]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [localhost:10158]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [localhost:10158]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [localhost:10158]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [localhost:10158]: INFO: No segments were appropriate for remapping

Breakpoint 1, xzpeng () at test.c:5
5	    unsigned char *p = (unsigned char*)xzpeng;
Missing separate debuginfos, use: debuginfo-install glibc-2.17-111.h15.x86_64 libhugetlbfs-2.16-11.x86_64
(gdb) c
Continuing.
0x400760: 55 48 89 e5 48 83 ec 10 cc c7 45 f0 60 07 40 00 
0x400770: c7 45 f8 00 00 00 00 eb 5a 48 8b 45 f0 48 89 c6 
0x400780: bf 90 08 40 00 b8 00 00 00 00 e8 b1 fe ff ff c7 
0x400790: 45 fc 00 00 00 00 eb 27 48 8b 45 f0 48 8d 50 01 
hello world
^C
Program received signal SIGINT, Interrupt.
main () at test.c:19
19		while(1);
(gdb) q
A debugging session is active.

	Inferior 1 [process 10158] will be killed.

Quit anyway? (y or n) y

When using libhugetlbfs, I got:

[root@localhost ~]# gdb ./test
GNU gdb (GDB) EulerOS 7.6.1-80.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-Huawei-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/test...done.
(gdb) b xzpeng
Breakpoint 1 at 0x400768: file test.c, line 5.
(gdb) r
Starting program: /root/./test 
libhugetlbfs [localhost:10126]: INFO: Found pagesize 2048 kB
libhugetlbfs [localhost:10126]: INFO: Parsed kernel version: [3] . [10] . [0]  [pre-release: 327]
libhugetlbfs [localhost:10126]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [localhost:10126]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [localhost:10126]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [localhost:10126]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [localhost:10126]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [localhost:10126]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [localhost:10126]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [localhost:10126]: INFO: Segment 0 (phdr 2): 0x400000-0x4009fc  (filesz=0x9fc) (prot = 0x5)
libhugetlbfs [localhost:10126]: INFO: Segment 1 (phdr 3): 0x7ffe00-0x800048  (filesz=0x244) (prot = 0x3)
libhugetlbfs [localhost:10126]: INFO: libhugetlbfs version: 2.16 (modified)
Detaching after fork from child process 10130.
libhugetlbfs [localhost:10130]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x9fc bytes and 0 extra bytes from 0x400000...done
libhugetlbfs [localhost:10126]: INFO: Prepare succeeded
Detaching after fork from child process 10131.
libhugetlbfs [localhost:10131]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x244 bytes and 0 extra bytes from 0x7ffe00...done
libhugetlbfs [localhost:10126]: INFO: Prepare succeeded
0x400760: 55 48 89 e5 48 83 ec 10 48 c7 45 f0 60 07 40 00 
0x400770: c7 45 f8 00 00 00 00 eb 5a 48 8b 45 f0 48 89 c6 
0x400780: bf 90 08 40 00 b8 00 00 00 00 e8 b1 fe ff ff c7 
0x400790: 45 fc 00 00 00 00 eb 27 48 8b 45 f0 48 8d 50 01 
hello world
^C
Program received signal SIGINT, Interrupt.
Cannot remove breakpoints because program is no longer writable.
Further execution is probably impossible.
main () at test.c:19
19		while(1);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-111.h15.x86_64 libhugetlbfs-2.16-11.x86_64
(gdb) q
A debugging session is active.

	Inferior 1 [process 10126] will be killed.

Quit anyway? (y or n) y 

From the output above, it seems that when using libhugetlbfs, after ptrace(PTRACE_POKETEXT, ...), the 0xcc byte was not injected at 0x400768 successfully, but I could not find the reason. Can anyone help?

--hugetlbfs-align fails with 1GB hugepages for x86_64

Hi all,

I've been trying to link applications to use 1GB hugepages for the text/BSS segments. Unfortunately, I'm getting "segments would overlap" errors at run time. Upon closer inspection, segments are aligned to 4MiB boundaries, so it is no wonder they overlap. Looking at the linker script, I can see that the hugepage size is computed as follows:

MB=$((1024*1024))
case "$EMU" in
elf32ppclinux)          HPAGE_SIZE=$((16*$MB)) SLICE_SIZE=$((256*$MB)) ;;
elf64ppc|elf64lppc)
        hpage_kb=$(cat /proc/meminfo  | grep Hugepagesize: | awk '{print $2}')
        MMU_TYPE=$(cat /proc/cpuinfo  | grep MMU | awk '{ print $3}')
        HPAGE_SIZE=$((hpage_kb * 1024))
        if [ "$MMU_TYPE" == "Hash" ] ; then
                SLICE_SIZE=$((256*$MB))
        else
                SLICE_SIZE=$HPAGE_SIZE
        fi ;;
elf_i386|elf_x86_64)    HPAGE_SIZE=$((4*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
elf_s390|elf64_s390)    HPAGE_SIZE=$((1*$MB)) SLICE_SIZE=$HPAGE_SIZE ;;
armelf*_linux_eabi|aarch64elf*|aarch64linux*)
        hpage_kb=$(cat /proc/meminfo  | grep Hugepagesize: | awk '{print $2}')
        HPAGE_SIZE=$((hpage_kb * 1024))
        SLICE_SIZE=$HPAGE_SIZE ;;

i.e., for elf_x86_64 HPAGE_SIZE is fixed at 4 MiB instead of being extracted from /proc/meminfo. Trying to use the same approach as for elf64ppc fails, so how can this be fixed to make segments 1GiB-aligned?

Thanks in advance!

brk_near_huge failed when running with ./run_tests.py -t func

It turned out brk_near_huge failed when running the tests with ./run_tests.py -t func:

root@localhost:/home/ambrosehua/projects/libhugetlbfs/tests# ./run_tests.py -t func
zero_filesize_segment (32M: 64): PASS
test_root (32M: 64): PASS
meminfo_nohuge (32M: 64): PASS
gethugepagesize (32M: 64): PASS
gethugepagesizes (32M: 64): PASS
HUGETLB_VERBOSE=1 empty_mounts (32M: 64): PASS
HUGETLB_VERBOSE=1 large_mounts (32M: 64): PASS
find_path (32M: 64): PASS
unlinked_fd (32M: 64): PASS
readback (32M: 64): PASS
truncate (32M: 64): PASS
shared (32M: 64): PASS
mprotect (32M: 64): PASS
mlock (32M: 64): PASS
misalign (32M: 64): PASS
fallocate_basic.sh (32M: 64): PASS
fallocate_align.sh (32M: 64): PASS
ptrace-write-hugepage (32M: 64): PASS
icache-hygiene (32M: 64): PASS
slbpacaflush (32M: 64): PASS (inconclusive)
straddle_4GB_static (32M: 64): PASS
huge_at_4GB_normal_below_static (32M: 64): PASS
huge_below_4GB_normal_above_static (32M: 64): PASS
map_high_truncate_2 (32M: 64): PASS
misaligned_offset (32M: 64): PASS (inconclusive)
truncate_above_4GB (32M: 64): PASS
brk_near_huge (32M: 64): brk_near_huge: malloc.c:2385: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

But when running brk_near_huge directly, it passes:

root@localhost:/home/ambrosehua/projects/libhugetlbfs/tests# ./obj64/brk_near_huge
Starting testcase "./obj64/brk_near_huge", pid 15587
Initial break at 0xaabb54c000
Hugepage mapped at 0xaabe000000-0xaabfffffff
PASS

The failure happens because test_addr_huge triggers heap initialization, after which the chunk allocated by the malloc in PASS() is overwritten by a later memset, and malloc then trips the assertion.

When the test is run directly, verbose output is enabled, which triggers an early malloc (test_init -> printf -> malloc) and initializes the heap before brk_near_huge raises the break, so the memset does not touch malloc's chunk.

Is there a fix for this issue?

/usr/bin/ld: cannot find -lgcc_s

I downloaded libhugetlbfs-master.zip from GitHub. When I run make to compile libhugetlbfs, the following error is reported:
[root@hadoop libhugetlbfs-master]# make
CC32 obj32/elflink.o
AS32 obj32/sys-elf_i386.o
CC32 obj32/hugeutils.o
CC32 obj32/version.o
CC32 obj32/init.o
CC32 obj32/morecore.o
CC32 obj32/debug.o
CC32 obj32/alloc.o
CC32 obj32/shm.o
CC32 obj32/kernel-features.o
LD32 (shared) obj32/libhugetlbfs.so
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgcc_s.so when searching for -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
collect2: error: ld returned 1 exit status
make: *** [obj32/libhugetlbfs.so] Error 1
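The 32-bit link fails because only the 64-bit libgcc_s is installed. A hedged probe is sketched below; the search paths and package names are RHEL-family assumptions (Debian multiarch uses different directories), and `BUILDTYPE=NATIVEONLY` is the libhugetlbfs Makefile knob for skipping the non-native word size.

```shell
# Look for a 32-bit libgcc_s in common locations. If none is present,
# either install it (e.g. `yum install libgcc.i686 glibc-devel.i686`)
# or build only the native word size: `make BUILDTYPE=NATIVEONLY`.
found=$(ls /usr/lib/libgcc_s.so* /usr/lib32/libgcc_s.so* 2>/dev/null | head -n1)
if [ -n "$found" ]; then
    echo "32-bit libgcc_s candidate: $found"
else
    echo "no 32-bit libgcc_s found"
fi
```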

Huge pages not used when malloc allocates a new heap

I am trying to use libhugetlbfs to have the Java HotSpot JVM (OpenJDK 11) back all JNI malloc calls with huge pages in a multi-threaded application.

I am loading libhugetlbfs via LD_PRELOAD with HUGETLB_MORECORE=yes. When malloc is called, it does not go through __morecore but instead allocates a new heap:

#0  0x00007f83ed8c95a0 in mmap64 () from /lib64/libc.so.6
#1  0x00007f83ed85b511 in new_heap () from /lib64/libc.so.6
#2  0x00007f83ed85d549 in sYSMALLOc () from /lib64/libc.so.6
#3  0x00007f83ed85e3f9 in _int_malloc () from /lib64/libc.so.6
#4  0x00007f83ed85eaac in malloc () from /lib64/libc.so.6

I am using glibc 2.12, Linux kernel 4.9, and the most up-to-date version of libhugetlbfs. How can I make these malloc calls use huge-page memory?
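One likely explanation: HUGETLB_MORECORE only hooks glibc's __morecore, which serves the main arena's sbrk path; the new_heap frames in the backtrace are secondary (per-thread) arenas, which call mmap directly and bypass __morecore entirely. A sketch of a workaround, assuming your glibc honors the MALLOC_ARENA_MAX environment variable (worth verifying on 2.12; the java command is illustrative only):

```shell
# Keep glibc on the main arena so allocations go through __morecore
# (which libhugetlbfs overrides) rather than per-thread heaps created
# with plain mmap.
MALLOC_ARENA_MAX=1 \
HUGETLB_MORECORE=yes \
LD_PRELOAD=libhugetlbfs.so \
echo "launch the JVM here, e.g.: java -jar app.jar"
```

Limiting arenas serializes malloc across threads, so expect some contention in heavily threaded workloads.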

HUGETLB_VERBOSE=0 xB.linkhuge_nofd fails in the 32-bit run

HUGETLB_VERBOSE=0 xB.linkhuge_nofd (2M: 32):	
HUGETLB_VERBOSE=0 xB.linkhuge_nofd (2M: 64):	Starting testcase "xB.linkhuge_nofd", pid 9612
PASS

I debugged it with gdb:

Starting program: /root/libhugetlbfs/tests/obj32/xB.linkhuge_nofd 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
libhugetlbfs [local:17646]: INFO: Found pagesize 2048 kB
libhugetlbfs [local:17646]: INFO: Detected page sizes:
libhugetlbfs [local:17646]: INFO:    Size: 2048 kB (default)  Mount: /dev/hugepages
libhugetlbfs [local:17646]: INFO: Parsed kernel version: [4] . [13] . [0]  [pre-release: 2]
libhugetlbfs [local:17646]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [local:17646]: INFO: Feature noreserve_safe is present in this kernel
libhugetlbfs [local:17646]: INFO: Feature map_hugetlb is present in this kernel
libhugetlbfs [local:17646]: INFO: Kernel has MAP_PRIVATE reservations.  Disabling heap prefaulting.
libhugetlbfs [local:17646]: INFO: Kernel supports MAP_HUGETLB
libhugetlbfs [local:17646]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [local:17646]: INFO: HUGETLB_NO_RESERVE=no, reservations enabled
libhugetlbfs [local:17646]: INFO: Segment 0 (phdr 4): 0x9000000-0x9000008  (filesz=0) (prot = 0x7)
libhugetlbfs [local:17646]: DEBUG: Total memsz = 0x8, memsz of largest segment = 0x8
libhugetlbfs [local:17646]: INFO: libhugetlbfs version: 2.20
libhugetlbfs [local:17646]: WARNING: Failed to setup hugetlbfs file for segment 0

Program received signal SIGSEGV, Segmentation fault.  <===
0x08049500 in __libc_csu_init () <===

Memory leak due to lack of closedir

When running a binary that links against libhugetlbfs, we see a steady memory leak from a directory stream that is never closed, which points to the gethugepagesizes function. The handle is not freed here: https://github.com/libhugetlbfs/libhugetlbfs/blob/master/hugeutils.c#L937.

Names and addresses are redacted for security reasons.

=================================================================
==3082266==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 131264 byte(s) in 4 object(s) allocated from:
    #0 ... in malloc (name_of_the_binary)
    #1 ... in __alloc_dir dirent/../sysdeps/unix/sysv/linux/opendir.c:115:23
    #2 ... in opendir_tail dirent/../sysdeps/unix/sysv/linux/opendir.c:63:10
    #3 ... in opendir dirent/../sysdeps/unix/sysv/linux/opendir.c:86:10

SUMMARY: AddressSanitizer: 131264 byte(s) leaked in 4 allocation(s).
(gdb) bt
#0  ... in opendir ()
#1  ... in gethugepagesizes () from /lib/x86_64-linux-gnu/libhugetlbfs.so.0
#2  ...  in ?? () from some_library_linking_libhugetlbfs.so
#3  0x0000000000000000 in ?? ()

hugeutils.c: two cases of poor error checking

[hugeutils.c:1170]: (style) Checking if unsigned variable 'ret' is less than zero.

The source code is:

    if (ret < 0) {
        ERROR("Failed to read /proc/self/maps\n");
        return -1;
    }

but ret is declared unsigned:

size_t ret;

[hugeutils.c:1177]: (style) Checking if unsigned variable 'ret' is less than zero.

The same issue appears a second time.

test/direct.c open fails because of the O_DIRECT flag

Tested on an x86_64 Ubuntu 20.04 system with a default hugepage size of 1G. In test/direct.c, this line:

dfd = open(TMPFILE, O_CREAT|O_EXCL|O_DIRECT|O_RDWR, 0600);

fails, reporting the error:

Bad configuration: Failed to open direct-IO file: Invalid argument

When O_DIRECT is removed, the open succeeds (though the test fails later on the write).

Why?
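One thing worth checking: O_DIRECT support depends on the filesystem backing TMPFILE; tmpfs, for instance, rejects O_DIRECT opens with EINVAL, which matches the "Invalid argument" above. A quick probe with dd (using /dev/shm as an assumed tmpfs mount):

```shell
# Probe whether direct I/O works on a tmpfs mount; on tmpfs the open
# with O_DIRECT fails with EINVAL, mirroring the test failure above.
dd if=/dev/zero of=/dev/shm/odirect_probe bs=4096 count=1 oflag=direct \
    && echo "O_DIRECT supported" \
    || echo "O_DIRECT rejected"
rm -f /dev/shm/odirect_probe
```

Re-running the test with TMPFILE on a filesystem that supports direct I/O (e.g. ext4) would confirm or rule this out.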

huge_page_setup_helper.py Python3 conversion

As Python 2 will soon be deprecated, the script huge_page_setup_helper.py must be converted to Python 3. The attached patch is the output of the 2to3 conversion.

Deprecation of morecore in glibc

I just noticed this from the glibc 2.32 release notes:

* The __morecore and __after_morecore_hook malloc hooks and the default
  implementation __default_morecore have been deprecated.  Applications
  should use malloc interposition to change malloc behavior, and mmap to
  allocate anonymous memory.  A future version of glibc may require that
  applications which use the malloc hooks must preload a special shared
  object, to enable the hooks.

Will libhugetlbfs continue to work with these changes? It sounds like the recommendation is to interpose malloc, mmap, and related functions via LD_PRELOAD instead?

hugeedit doesn't recognize DATA section

On 32-bit PowerPC, with code compiled using GCC 5.4, binutils 2.26, and libhugetlbfs 2.20:

root@vgoip:~# ./hugeedit --data --text ./a.out
Segment 2 0x10000000 - 0x100008e4 (TEXT) default is HUGE pages
Segment 3 0x200008e4 - 0x20000aa4 () default is BASE pages

make checkv does not work

make checkv

	 VERSION
version string: 2.20 (modified)
	 CC32 obj32/elflink.o
	 AS32 obj32/sys-elf_i386.o
	 CC32 obj32/hugeutils.o
	 CC32 obj32/version.o
	 CC32 obj32/init.o
	 CC32 obj32/morecore.o
	 CC32 obj32/debug.o
	 CC32 obj32/alloc.o
version.c:3:19: warning: 'libhugetlbfs_version' defined but not used [-Wunused-const-variable=]
 static const char libhugetlbfs_version[] = "VERSION: "VERSION;
                   ^~~~~~~~~~~~~~~~~~~~
	 CC32 obj32/shm.o
	 CC32 obj32/kernel-features.o
	 CC64 obj64/elflink.o
	 AS64 obj64/sys-elf_x86_64.o
	 CC64 obj64/hugeutils.o
	 CC64 obj64/version.o
	 CC64 obj64/init.o
	 CC64 obj64/morecore.o
	 CC64 obj64/debug.o
	 CC64 obj64/alloc.o
version.c:3:19: warning: 'libhugetlbfs_version' defined but not used [-Wunused-const-variable=]
 static const char libhugetlbfs_version[] = "VERSION: "VERSION;
                   ^~~~~~~~~~~~~~~~~~~~
	 CC64 obj64/shm.o
	 CC64 obj64/kernel-features.o
	 CC32 obj32/init_privutils.o
	 CC64 obj64/init_privutils.o
	 CCHOST obj/init_privutils.o
	 CCHOST obj/debug.o
	 CCHOST obj/hugeutils.o
	 CCHOST obj/kernel-features.o
	 CCHOST obj/hugectl.o
	 CCHOST obj/hugeedit.o
	 CCHOST obj/hugeadm.o
	 CCHOST obj/pagesize.o
	 LDHOST obj/hugeedit
	 LDHOST obj/hugectl
	 LD64 (shared) obj64/libhugetlbfs_privutils.so
	 LD32 (shared) obj32/libhugetlbfs_privutils.so
	 ARHOST obj/libhugetlbfs_privutils.a
	 LD64 (shared) obj64/libhugetlbfs.so
	 AR64 obj64/libhugetlbfs.a
	 LDHOST obj/pagesize
	 LD32 (shared) obj32/libhugetlbfs.so
	 AR32 obj32/libhugetlbfs.a
	 LDHOST obj/hugeadm
	 CC32 obj32/gethugepagesize.o
	 CC32 obj32/testutils.o
	 CC32 obj32/libtestutils.o
	 CC32 obj32/test_root.o
	 CC32 obj32/find_path.o
	 CC32 obj32/unlinked_fd.o
	 CC32 obj32/misalign.o
	 CC32 obj32/readback.o
	 CC32 obj32/truncate.o
	 CC32 obj32/shared.o
	 CC32 obj32/private.o
	 CC32 obj32/fork-cow.o
	 CC32 obj32/empty_mounts.o
	 CC32 obj32/large_mounts.o
	 CC32 obj32/ptrace-write-hugepage.o
	 CC32 obj32/meminfo_nohuge.o
	 CC32 obj32/icache-hygiene.o
	 CC32 obj32/slbpacaflush.o
	 CC32 obj32/chunk-overcommit.o
	 CC32 obj32/mprotect.o
	 CC32 obj32/alloc-instantiate-race.o
	 CC32 obj32/mlock.o
	 CC32 obj32/truncate_reserve_wraparound.o
	 CC32 obj32/truncate_sigbus_versus_oom.o
	 CC32 obj32/map_high_truncate_2.o
	 CC32 obj32/truncate_above_4GB.o
	 CC32 obj32/direct.o
	 CC32 obj32/misaligned_offset.o
	 CC32 obj32/brk_near_huge.o
	 CC32 obj32/task-size-overrun.o
	 CC32 obj32/stack_grow_into_huge.o
	 CC32 obj32/counters.o
	 CC32 obj32/quota.o
	 CC32 obj32/heap-overflow.o
	 CC32 obj32/get_huge_pages.o
	 CC32 obj32/get_hugepage_region.o
	 CC32 obj32/shmoverride_linked.o
	 CC32 obj32/gethugepagesizes.o
	 CC32 obj32/madvise_reserve.o
	 CC32 obj32/fadvise_reserve.o
	 CC32 obj32/readahead_reserve.o
	 CC32 obj32/shm-perms.o
	 CC32 obj32/mremap-expand-slice-collision.o
	 CC32 obj32/mremap-fixed-normal-near-huge.o
	 CC32 obj32/mremap-fixed-huge-near-normal.o
	 CC32 obj32/noresv-preserve-resv-page.o
	 CC32 obj32/corrupt-by-cow-opt.o
	 CC32 obj32/noresv-regarded-as-resv.o
	 CC32 obj32/fallocate_basic.o
	 CC32 obj32/fallocate_align.o
	 CC32 obj32/fallocate_stress.o
	 CC32 obj32/malloc.o
	 CC32 obj32/malloc_manysmall.o
	 CC32 obj32/dummy.o
	 CC32 obj32/heapshrink.o
	 CC32 obj32/mmap-gettest.o
	 CC32 obj32/shmoverride_unlinked.o
	 CC32 obj32/mmap-cow.o
	 CC32 obj32/shm-gettest.o
	 CC32 obj32/shm-getraw.o
	 CC32 obj32/shm-fork.o
	 CC32 obj32/zero_filesize_segment.o
	 CC32 obj32/linkhuge.o
	 CC32 obj32/linkhuge_nofd.o
	 CC32 obj32/linkshare.o
	 CC32 obj32/linkhuge_rw.o
	 CC64 obj64/gethugepagesize.o
	 CC64 obj64/testutils.o
	 CC64 obj64/test_root.o
	 CC64 obj64/libtestutils.o
alloc-instantiate-race.c: In function 'thread_racer':
alloc-instantiate-race.c:114:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable]
  int rc;
      ^~
	 CC64 obj64/find_path.o
	 CC64 obj64/unlinked_fd.o
	 CC64 obj64/misalign.o
	 CC64 obj64/readback.o
	 CC64 obj64/truncate.o
shmoverride_linked.c: In function 'local_read_meminfo':
shmoverride_linked.c:113:11: warning: variable 'readerr' set but not used [-Wunused-but-set-variable]
  int len, readerr;
           ^~~~~~~
	 CC64 obj64/shared.o
	 CC64 obj64/private.o
	 CC64 obj64/fork-cow.o
	 CC64 obj64/empty_mounts.o
shmoverride_unlinked.c: In function 'local_read_meminfo':
shmoverride_unlinked.c:113:11: warning: variable 'readerr' set but not used [-Wunused-but-set-variable]
  int len, readerr;
           ^~~~~~~
	 CC64 obj64/large_mounts.o
	 CC64 obj64/meminfo_nohuge.o
	 CC64 obj64/ptrace-write-hugepage.o
	 CC64 obj64/icache-hygiene.o
	 CC64 obj64/slbpacaflush.o
	 CC64 obj64/chunk-overcommit.o
	 CC64 obj64/mprotect.o
	 CC64 obj64/alloc-instantiate-race.o
	 CC64 obj64/mlock.o
	 CC64 obj64/truncate_reserve_wraparound.o
	 CC64 obj64/truncate_sigbus_versus_oom.o
	 CC64 obj64/map_high_truncate_2.o
	 CC64 obj64/truncate_above_4GB.o
	 CC64 obj64/direct.o
	 CC64 obj64/misaligned_offset.o
	 CC64 obj64/brk_near_huge.o
	 CC64 obj64/task-size-overrun.o
	 CC64 obj64/stack_grow_into_huge.o
	 CC64 obj64/counters.o
	 CC64 obj64/heap-overflow.o
	 CC64 obj64/quota.o
	 CC64 obj64/get_huge_pages.o
	 CC64 obj64/shmoverride_linked.o
	 CC64 obj64/get_hugepage_region.o
	 CC64 obj64/gethugepagesizes.o
	 CC64 obj64/madvise_reserve.o
	 CC64 obj64/readahead_reserve.o
	 CC64 obj64/fadvise_reserve.o
	 CC64 obj64/shm-perms.o
	 CC64 obj64/mremap-expand-slice-collision.o
	 CC64 obj64/mremap-fixed-normal-near-huge.o
	 CC64 obj64/corrupt-by-cow-opt.o
	 CC64 obj64/mremap-fixed-huge-near-normal.o
	 CC64 obj64/noresv-preserve-resv-page.o
	 CC64 obj64/noresv-regarded-as-resv.o
	 CC64 obj64/fallocate_basic.o
	 CC64 obj64/fallocate_align.o
	 CC64 obj64/fallocate_stress.o
	 CC64 obj64/malloc.o
	 CC64 obj64/malloc_manysmall.o
	 CC64 obj64/dummy.o
	 CC64 obj64/heapshrink.o
	 CC64 obj64/shmoverride_unlinked.o
alloc-instantiate-race.c: In function 'thread_racer':
alloc-instantiate-race.c:114:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable]
  int rc;
      ^~
	 CC64 obj64/mmap-gettest.o
	 CC64 obj64/mmap-cow.o
	 CC64 obj64/shm-gettest.o
	 CC64 obj64/shm-getraw.o
	 CC64 obj64/shm-fork.o
	 CC64 obj64/linkhuge.o
	 CC64 obj64/zero_filesize_segment.o
	 CC64 obj64/linkhuge_nofd.o
	 CC64 obj64/linkshare.o
	 CC64 obj64/linkhuge_rw.o
	 CC64 obj64/straddle_4GB.o
	 CC64 obj64/huge_at_4GB_normal_below.o
shmoverride_linked.c: In function 'local_read_meminfo':
shmoverride_linked.c:113:11: warning: variable 'readerr' set but not used [-Wunused-but-set-variable]
  int len, readerr;
           ^~~~~~~
	 CC64 obj64/huge_below_4GB_normal_above.o
	 CC32 obj32/get_hugetlbfs_path.o
	 CC32 obj32/compare_kvers.o
	 CC64 obj64/get_hugetlbfs_path.o
	 CC64 obj64/compare_kvers.o
	 CC32 obj32/heapshrink-helper-pic.o
	 CC64 obj64/heapshrink-helper-pic.o
	 LD32 (lib test) obj32/gethugepagesize
	 LD32 (lib test) obj32/test_root
	 LD32 (lib test) obj32/find_path
shmoverride_unlinked.c: In function 'local_read_meminfo':
	 LD32 (lib test) obj32/misalign
	 LD32 (lib test) obj32/unlinked_fd
shmoverride_unlinked.c:113:11: warning: variable 'readerr' set but not used [-Wunused-but-set-variable]
  int len, readerr;
           ^~~~~~~
	 LD32 (lib test) obj32/readback
	 LD32 (lib test) obj32/truncate
	 LD32 (lib test) obj32/shared
	 LD32 (lib test) obj32/private
	 LD32 (lib test) obj32/fork-cow
	 LD32 (lib test) obj32/large_mounts
	 LD32 (lib test) obj32/empty_mounts
	 LD32 (lib test) obj32/meminfo_nohuge
	 LD32 (lib test) obj32/ptrace-write-hugepage
	 LD32 (lib test) obj32/icache-hygiene
	 LD32 (lib test) obj32/slbpacaflush
	 LD32 (lib test) obj32/chunk-overcommit
	 LD32 (lib test) obj32/mprotect
	 LD32 (lib test) obj32/alloc-instantiate-race
	 LD32 (lib test) obj32/mlock
	 LD32 (lib test) obj32/truncate_reserve_wraparound
	 LD32 (lib test) obj32/direct
	 LD32 (lib test) obj32/truncate_sigbus_versus_oom
	 LD32 (lib test) obj32/map_high_truncate_2
	 LD32 (lib test) obj32/truncate_above_4GB
	 LD32 (lib test) obj32/misaligned_offset
	 LD32 (lib test) obj32/task-size-overrun
	 LD32 (lib test) obj32/brk_near_huge
	 LD32 (lib test) obj32/stack_grow_into_huge
	 LD32 (lib test) obj32/heap-overflow
	 LD32 (lib test) obj32/quota
	 LD32 (lib test) obj32/get_huge_pages
	 LD32 (lib test) obj32/get_hugepage_region
	 LD32 (lib test) obj32/shmoverride_linked
	 LD32 (lib test) obj32/madvise_reserve
	 LD32 (lib test) obj32/fadvise_reserve
	 LD32 (lib test) obj32/readahead_reserve
	 LD32 (lib test) obj32/shm-perms
	 LD32 (lib test) obj32/mremap-fixed-huge-near-normal
	 LD32 (lib test) obj32/mremap-fixed-normal-near-huge
	 LD32 (lib test) obj32/corrupt-by-cow-opt
	 LD32 (lib test) obj32/noresv-preserve-resv-page
	 LD32 (lib test) obj32/noresv-regarded-as-resv
	 LD32 (lib test) obj32/fallocate_basic
	 LD32 (lib test) obj32/fallocate_align
	 LD32 (lib test) obj32/fallocate_stress
	 LD32 (nolib test) obj32/malloc_manysmall
	 LD32 (nolib test) obj32/dummy
	 LD32 (nolib test) obj32/malloc
	 LD32 (nolib test) obj32/heapshrink
	 LD32 (nolib test) obj32/shmoverride_unlinked
	 LD32 (lib test) obj32/mmap-gettest
	 LD32 (lib test) obj32/mmap-cow
	 LD32 (lib test) obj32/shm-gettest
	 LD32 (lib test) obj32/shm-getraw
	 LD32 (lib test) obj32/shm-fork
	 LD32 (hugelink test) obj32/linkhuge
	 SCRIPT32 obj32/dummy.ldscript
	 LD32 (hugelink test) obj32/linkhuge_nofd
	 LD32 (preload test) obj32/zero_filesize_segment
	 LD32 (xB test) obj32/xB.linkhuge
	 LD32 (xB test) obj32/xB.linkhuge_nofd
	 LD32 (xBDT test) obj32/xBDT.linkhuge
	 LD32 (xBDT test) obj32/xBDT.linkhuge_nofd
	 LD64 (lib test) obj64/gethugepagesize
	 LD64 (lib test) obj64/test_root
	 LD64 (lib test) obj64/find_path
	 LD64 (lib test) obj64/unlinked_fd
	 LD64 (lib test) obj64/misalign
	 LD64 (lib test) obj64/readback
	 LD64 (lib test) obj64/truncate
	 LD64 (lib test) obj64/fork-cow
	 LD64 (lib test) obj64/shared
	 LD64 (lib test) obj64/private
	 LD64 (lib test) obj64/empty_mounts
	 LD64 (lib test) obj64/large_mounts
	 LD64 (lib test) obj64/icache-hygiene
	 LD64 (lib test) obj64/slbpacaflush
	 LD64 (lib test) obj64/meminfo_nohuge
	 LD64 (lib test) obj64/chunk-overcommit
	 LD64 (lib test) obj64/ptrace-write-hugepage
	 LD64 (lib test) obj64/mprotect
	 LD64 (lib test) obj64/mlock
	 LD64 (lib test) obj64/truncate_reserve_wraparound
	 LD64 (lib test) obj64/truncate_sigbus_versus_oom
	 LD64 (lib test) obj64/map_high_truncate_2
	 LD64 (lib test) obj64/truncate_above_4GB
	 LD64 (lib test) obj64/direct
	 LD64 (lib test) obj64/brk_near_huge
	 LD64 (lib test) obj64/task-size-overrun
	 LD64 (lib test) obj64/stack_grow_into_huge
	 LD64 (lib test) obj64/heap-overflow
	 LD64 (lib test) obj64/misaligned_offset
	 LD64 (lib test) obj64/get_huge_pages
	 LD64 (lib test) obj64/get_hugepage_region
	 LD64 (lib test) obj64/madvise_reserve
	 LD64 (lib test) obj64/fadvise_reserve
	 LD64 (lib test) obj64/readahead_reserve
	 LD64 (lib test) obj64/shm-perms
/usr/bin/ld: warning: zero_filesize_segment.ld contains output sections; did you forget -T?
	 LD64 (lib test) obj64/mremap-fixed-normal-near-huge
	 LD64 (lib test) obj64/corrupt-by-cow-opt
	 LD64 (lib test) obj64/noresv-preserve-resv-page
	 LD64 (lib test) obj64/noresv-regarded-as-resv
	 LD64 (lib test) obj64/fallocate_align
	 LD64 (nolib test) obj64/malloc
	 LD64 (lib test) obj64/fallocate_basic
	 LD64 (lib test) obj64/mremap-fixed-huge-near-normal
	 LD64 (nolib test) obj64/dummy
	 LD64 (nolib test) obj64/malloc_manysmall
	 LD64 (lib test) obj64/mmap-gettest
	 LD64 (nolib test) obj64/heapshrink
	 LD64 (lib test) obj64/shm-gettest
	 LD64 (lib test) obj64/shm-getraw
	 SCRIPT64 obj64/dummy.ldscript
	 LD64 (preload test) obj64/zero_filesize_segment
	 LD64 (hugelink test) obj64/linkhuge
	 LD64 (xB test) obj64/xB.linkhuge
	 LD64 (xB test) obj64/xB.linkhuge_nofd
	 LD64 (xBDT test) obj64/xBDT.linkhuge
	 LD64 (hugelink test) obj64/linkhuge_nofd
	 LD64 (xBDT test) obj64/xBDT.linkhuge_nofd
	 LD64 (lib test) obj64/huge_at_4GB_normal_below_static
ln: failed to create symbolic link 'obj64/ld': File exists
	 LD64 (lib test) obj64/shmoverride_linked_static
	 LD32 (lib test) obj32/shmoverride_linked_static
	 LD32 (helper) obj32/get_hugetlbfs_path
Makefile:245: recipe for target 'obj64/xBDT.linkhuge' failed
make[1]: *** [obj64/xBDT.linkhuge] Error 1
make[1]: *** Waiting for unfinished jobs....
	 LD32 (helper) obj32/compare_kvers
	 LD64 (lib test) obj64/straddle_4GB_static
	 LD64 (lib test) obj64/huge_below_4GB_normal_above_static
/usr/bin/ld: warning: zero_filesize_segment.ld contains output sections; did you forget -T?
Makefile:249: recipe for target 'tests/all' failed
make: *** [tests/all] Error 2

Position-independent binaries catch SIGSEGV after remapping the data section (patch included)

How to reproduce
We tried to remap the .text and .data segments of the mysqld binary (https://github.com/mysql/mysql-server). We eventually determined that linking with ld.gold (v1.16, with --pie specified) and remapping the .data segment produces the SIGSEGV. Specifying --no-pie makes the problem go away.

  1. Link mysqld with libhugetlbfs.a and ld.gold (-fuse-ld=gold)
  2. hugeedit --text --data ./mysqld
  3. sudo ./mysqld --version

Main problem
get_extracopy in elflink.c does not copy the partially initialized .bss segment at all. Proof:

HUGETLB_DEBUG=1 ./mysqld --version # doesn't work
HUGETLB_DEBUG=1 HUGETLB_MINIMAL_COPY=no ./mysqld --version # works

Bug
keep_symbol in elflink.c filters symbols while .bss is being copied. Its start/end arguments carry addresses computed as the virtual address from the ELF file plus the load address chosen by the kernel (non-zero for PIE binaries; a typical value on kernel 5.4 is 0x555...554000), whereas s->st_value is the virtual address from the ELF file only. As a result, s->st_value is always less than start inside keep_symbol, and the .bss segment is skipped entirely.

static inline int keep_symbol(char *strtab, Elf_Sym *s, void *start, void *end)
{
	if ((void *)s->st_value < start) // always true for PIE binaries
		return 0;
....

Since the .bss segment is skipped entirely, all partially initialized variables are lost; after the .data remapping we read garbage values and eventually hit the SIGSEGV.
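To make the arithmetic concrete, here is a worked example of the bias calculation described above (the st_value and load-bias values are illustrative, not taken from mysqld):

```shell
# For a PIE binary, the runtime address of a symbol is its ELF st_value
# plus the load bias the kernel picked at exec time.
st_value=$((0xa020))
load_bias=$((0x555555554000))
runtime_addr=$((st_value + load_bias))
printf 'runtime address: 0x%x\n' "$runtime_addr"
# keep_symbol compares the raw st_value (0xa020) against start/end values
# that already include the bias, so the comparison always fails for PIE.
```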

Patch to fix

```diff
diff --git a/elflink.c b/elflink.c
index ce2ed24..153df81 100644
--- a/elflink.c
+++ b/elflink.c
@@ -440,11 +440,12 @@ static int find_numsyms(Elf_Sym *symtab, char *strtab)
  * - Object type (variable)
  * - Non-zero size (zero size means the symbol is just a marker with no data)
  */
-static inline int keep_symbol(char *strtab, Elf_Sym *s, void *start, void *end)
+static inline int keep_symbol(const ElfW(Addr) addr, char *strtab, Elf_Sym *s, void *start, void *end)
 {
-       if ((void *)s->st_value < start)
+       const void* sym_addr = (void*)(s->st_value + addr);
+       if (sym_addr < start)
                return 0;
-       if ((void *)s->st_value > end)
+       if (sym_addr > end)
                return 0;
        if ((ELF_ST_BIND(s->st_info) != STB_GLOBAL) &&
            (ELF_ST_BIND(s->st_info) != STB_WEAK))
@@ -455,7 +456,7 @@ static inline int keep_symbol(char *strtab, Elf_Sym *s, void *start, void *end)
                return 0;

        if (__hugetlbfs_debug)
-               DEBUG("symbol to copy at %p: %s\n", (void *)s->st_value,
+               DEBUG("symbol to copy at %p: %s\n", sym_addr,
                                                strtab + s->st_name);

        return 1;
@@ -514,12 +515,12 @@ static void get_extracopy(struct seg_info *seg, const ElfW(Addr) addr,
        end = start;

        for (sym = symtab; sym < symtab + numsyms; sym++) {
-               if (!keep_symbol(strtab, sym, start, end_orig))
+               if (!keep_symbol(addr, strtab, sym, start, end_orig))
                        continue;

                /* These are the droids we are looking for */
                found_sym = 1;
-               sym_end = (void *)(sym->st_value + sym->st_size);
+               sym_end = (void *)(sym->st_value + addr + sym->st_size);
                if (sym_end > end)
                        end = sym_end;
        }
```
