yugr / Implib.so
POSIX equivalent of Windows DLL import libraries
License: MIT License
Thanks a lot for your work on the project. I'm reverse engineering an embedded device with a mipsel architecture. It would be brilliant if you had some time to make it work on mipsel so I can write some patches with it. Thanks.
Thank you for providing this library!
I have a question regarding multi-threading.
One of the project's listed limitations is "proper support for multi-threading".
In my project initialization, I plan to read which libraries are necessary from configuration, and then load them with all their symbols using this project (by using _LIBNAME_tramp_resolve_all()).
Is there any danger in using multi-threading after all the libraries and their symbols are loaded?
Is there anything else I should look out for when multi-threading and using libraries that were loaded with this project?
Thanks.
Hi, your README says:
support Android and OSX (none should be hard to add so let me know if you need it).
Just wanted to say Android support would be great if you are still open to adding it.
Implib.so/arch/common/init.c.tpl
Line 60 in bbca01e
When implib is built to export wrapped symbols (-DIMPLIB_EXPORT_SHIMS), the imported library's initializers may end up resolving some symbols to the already-available ones provided by the wrapper, and that triggers the assertion that checks is_lib_loading -- we're still in the middle of calling dlopen() at that point.
One way to deal with that is to dlopen() the wrapped library with RTLD_DEEPBIND. This would tell the dynamic linker to bind the symbols to the library itself and avoid the unfortunate recursion. It would bypass implib's wrapper, but I think it's the right thing to do when we build implib with visible symbols.
We may want to make it a user-controllable option, in case someone may need to guarantee that all references to a symbol go through the wrapper, but that would have to be a trade-off as in that case they would lose the ability to import libraries that refer to themselves in their initializers.
My teacher said that even when we link a shared library at build time, the symbols of its functions are still resolved lazily, when, say, an executable needs them. So what is the advantage of Implib.so?
mov can only handle values that fit into 16 bits. To guarantee that we can load larger values we should be using mov + movk. https://godbolt.org/z/9f5zx3e9r
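The mov + movk idea can be sketched on the generator side as follows. This is only an illustration, not code from implib-gen.py: the helper name load_constant and the register x16 are assumptions made for the example. It splits a 64-bit constant into 16-bit chunks, loads the lowest chunk with mov and patches in each non-zero higher chunk with movk.

```python
def load_constant(reg, value):
    """Return AArch64 instructions that load a 64-bit constant into reg.

    Hypothetical helper: the lowest 16 bits go through mov (which clears the
    rest of the register), each non-zero higher chunk is inserted with movk.
    """
    assert 0 <= value < 1 << 64
    chunks = [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    lines = [f"mov {reg}, #{chunks[0]}"]
    for i, chunk in enumerate(chunks[1:], start=1):
        if chunk:
            lines.append(f"movk {reg}, #{chunk}, lsl #{16 * i}")
    return lines


# x16 is an illustrative scratch register, not necessarily the one the templates use.
for line in load_constant("x16", 70000):
    print(line)
```

Values below 65536 still produce a single mov, so small tables would not change.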
I don't know whether this is in scope or how simple it would be to implement, but libX11 seems to export a lot of variables, and it's not really possible to make it optional using Implib right now.
Would it be possible to generate the counterparts for "exported" vtables and typeinfos? These symbols are needed on caller's side if one needs to do e.g. a dynamic_cast or if virtual dtor of "imported" class is inlined.
On Windows the compiler generates a vtable containing addresses of stub (trampoline) functions on the caller's side. The typeinfos are duplicated in the caller's module (exe/dll) as well, so delay loading works fine even with polymorphic classes and RTTI involved.
Would it be possible to do the same on Linux?
Thanks, Tomas
Our project (JAX) uses implib.so internally, and one of our end users asked for POWER architecture support, which requires POWER support in implib.so. It'd be great if POWER support were added!
(By the way: thanks for this project, it's incredibly useful! We use it to ship Python wheels that refer to NVIDIA's CUDA libraries while obeying the Python manylinux2014 rules, which only allow direct dynamic linking against a small allowlist of libraries.)
If the flag --no-dlopen is used, your script defines NO_DLOPEN as follows in the *.so.init.c file. I assume it's a bug in your script.
#define NO_DLOPEN True
True is not defined, so NO_DLOPEN is automatically evaluated as 0, which is wrong in this case.
I assume the error is at this line https://github.com/yugr/Implib.so/blob/master/implib-gen.py#L544
where
no_dlopen=not int(dlopen),
should probably be
no_dlopen=int(not dlopen),
P.S.: Thank you for this cool script! I will probably use it for a project.
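The difference between the two expressions can be reproduced in isolation. The sketch below assumes the value is substituted into the template as text through Python's string.Template; that rendering step and the render_flag helper are assumptions for illustration, not a copy of implib-gen.py.

```python
from string import Template


def render_flag(value):
    # Hypothetical stand-in for the template substitution step.
    return Template("#define NO_DLOPEN $no_dlopen").substitute(no_dlopen=value)


dlopen = False  # corresponds to passing --no-dlopen

buggy = render_flag(not int(dlopen))  # a bool, rendered as the text "True"
fixed = render_flag(int(not dlopen))  # an int, rendered as "1"

print(buggy)
print(fixed)
```

With the first form the C preprocessor sees the undefined identifier True, which it treats as 0 inside #if; with the second form it sees the literal 1.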
Hi, many thanks for this useful tool, it seems like it will facilitate our job a lot!
However, I have a problem using it:
I have used implib-gen on a dynamic library like this:
../implib-gen.py libnvinfer.so.6.5.0
When loading the library I get the following error:
implib-gen: libnvinfer.so.6.5.0: library function '_ZN8nvinfer16LogBufC1EPNS_7ILoggerENS1_8SeverityEm' called during library load
With my limited knowledge, I cannot understand what this error means.
Is a default constructor of a class (Logger) being called upon loading?
I have also tried to disable lazy loading but still the same error.
What would be the next steps in investigating it?
Thanks.
Is there a way to hide the generated symbols? I have the following situation:
#!/usr/bin/env python3. This makes a big difference with CentOS devtoolsets and with NixOS.
First of all, thanks yugr for creating this nice project.
In the example case it works fine, but there is an issue in my case.
I applied Implib.so to libA.so, which I want to delay-load.
I created a shared library, libB.so, from libA.so.init.c and libA.tramp.S.
Then I built an executable with libB.
The problem is that the executable tries to load libA at startup, even though a function in libA is only called after some processing has finished. The error message follows.
$ ./executable
implib-gen: libA.so.4: failed to load library 'libA.so.4' via dlopen: libA.so.4: cannot open shared object file: No such file or directory
executable: /home/xyz/Implib.so/libA.so.4.init.c:61: void *load_library(): Assertion `0 && "Assertion in generated code"' failed.
Aborted
Is this the expected result? Or could you help me get delay loading to work in my case?
I'm converting a bunch of shared libs that depend on each other into delay loaded shims.
The second lib that links against the static loader of the first ends up exporting all of its symbols.
The fix for this seems to be to add a ".hidden $sym" to the respective https://github.com/yugr/Implib.so/blob/master/arch/x86_64/trampoline.S.tpl file.
Is there any reason to not add that by default? I can't immediately think of a reason why you'd ever want to build a shared library that just re-exports the symbols of the actual shared library.
In any case, an option to hide those symbols would also be perfectly fine for me.
The build system knows only the build-time handle, which has no soversion, but those files are usually not installed on production systems (that is, systems without -dev packages installed). Implib should deduce the soname (probably using readelf -d?) and use it rather than the basename.
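A rough sketch of the readelf -d approach suggested above, written in Python since implib-gen.py is a Python script. The function names are hypothetical and this is not how the project necessarily implements it; it assumes GNU readelf is on PATH and prints the dynamic section in its usual "(SONAME) Library soname: [name]" form.

```python
import os
import re
import subprocess


def parse_soname(readelf_output):
    """Extract the DT_SONAME value from `readelf -d` output, or None."""
    match = re.search(r"\(SONAME\)\s+Library soname: \[(.+?)\]", readelf_output)
    return match.group(1) if match else None


def soname_or_basename(path):
    """Return the library's soname, falling back to the file's basename."""
    result = subprocess.run(
        ["readelf", "-d", path], check=True, capture_output=True, text=True
    )
    return parse_soname(result.stdout) or os.path.basename(path)
```

Falling back to the basename keeps the current behaviour for libraries that carry no DT_SONAME entry.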
I am looking at using your project on an arm32 and arm64 platform. Have you thought about supporting additional architectures besides x86?
This leads to executing arbitrary code when libraries with an event loop, like Qt, are in use. abort() would be much better, or _exit(1) at least. abort() is preferable, as that would lead to a crash that could be intercepted by crash dump reporting systems.
As you mention with the following FIXME comment you have to somehow find the loaded lib_handle if you don't want your script to load the library by itself:
// FIXME: instead of RTLD_NEXT we should search for loaded lib_handle
// as in https://github.com/jethrogb/ssltrace/blob/bf17c150a7/ssltrace.cpp#L74-L112
To be honest, I don't really understand what the approach from the ssltrace project does. For my use case it would be okay to provide the handler myself to the *_tramp_resolve_all(void) function. Therefore I modified your script and generated another function, *_tramp_resolve_all_fromHandler(void *handler):
void _${lib_suffix}_tramp_resolve(int i, void *h) {
  assert((unsigned)i + 1 < sizeof(sym_names) / sizeof(sym_names[0]));
  CHECK(!is_lib_loading, "library function '%s' called during library load", sym_names[i]);
#if NO_DLOPEN
  // FIXME: instead of RTLD_NEXT we should search for loaded lib_handle
  // as in https://github.com/jethrogb/ssltrace/blob/bf17c150a7/ssltrace.cpp#L74-L112
  //h = RTLD_NEXT;
#elif LAZY_LOAD
  h = load_library();
#else
  h = lib_handle;
  CHECK(h, "failed to resolve symbol '%s', library failed to load", sym_names[i]);
#endif
  // Dlsym is thread-safe so don't need to protect it.
  _${lib_suffix}_tramp_table[i] = dlsym(h, sym_names[i]);
  CHECK(_${lib_suffix}_tramp_table[i], "failed to resolve symbol '%s'", sym_names[i]);
}
// Helper for user to resolve all symbols
void _${lib_suffix}_tramp_resolve_all(void) {
  size_t i;
  for(i = 0; i + 1 < sizeof(sym_names) / sizeof(sym_names[0]); ++i)
    _${lib_suffix}_tramp_resolve(i,0);
}
void _${lib_suffix}_tramp_resolve_all_fromHandler(void *handler) {
  CHECK(handler, "no valid handler provided");
  size_t i;
  for(i = 0; i + 1 < sizeof(sym_names) / sizeof(sym_names[0]); ++i)
    _${lib_suffix}_tramp_resolve(i,handler);
}
What do you think about this approach? Would you like to add it to your project?
If you can explain to me how the ssltrace approach works, I might try to implement it too.
Many thanks for your useful tool. I have a question about CFI.
Implib.so/arch/x86_64/table.S.tpl
Line 29 in bbca01e
.cfi_adjust_cfa_offset 8;
Should this offset be 16, since the preceding call instruction also pushes the IP onto the stack?
It seems dlopen/dlsym do not preserve floating-point operands on amd64 and aarch64:
yugr@yugr-VirtualBox:~/src/Implib.so/tests/1$ ./run.sh
Standalone executable: GFLAGS += '', CFLAGS += '-no-pie'
1c1
< Calling foo from libtest: 25 0.5
---
> Calling foo from libtest: 25 0
Need to use FXSAVE/FXRSTOR to preserve them.
Same issue happens on AArch64.
When I use CMake to compile and link the generated .tramp.S file, the following warnings appear:
[ 6%] Linking CXX shared library libdeepmd_dyn_cudart.so
ld: warning: CMakeFiles/deepmd_dyn_cudart.dir/libcudart.so.tramp.S.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
My version:
GNU ld version 2.39-9.fc38
gcc (conda-forge gcc 12.3.0-2) 12.3.0
cmake version 3.27.6
Used version: 11f7b4c
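A common way to silence this linker warning is to declare an empty .note.GNU-stack section in the assembly source, which marks the object as not needing an executable stack. Below is a minimal sketch of how a generator could append that marker to its output; add_gnu_stack_note is a hypothetical helper, not part of implib-gen.py.

```python
def add_gnu_stack_note(asm_text, arm=False):
    """Append a non-executable-stack marker to GNU assembler source text."""
    if ".note.GNU-stack" in asm_text:
        return asm_text  # already marked, nothing to do
    # On ARM '@' starts a comment, so the section type is spelled '%progbits'.
    section_type = "%progbits" if arm else "@progbits"
    if not asm_text.endswith("\n"):
        asm_text += "\n"
    return asm_text + f'.section .note.GNU-stack,"",{section_type}\n'


print(add_gnu_stack_note("  .text\n  ret\n"), end="")
```

The same directive could instead be placed directly in the trampoline template, which avoids post-processing the generated file.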
After generating with ./implib-gen.py --target arm-linux-gnueabihf lib.so
I get
implib-gen.py: warning: library 'lib.so' contains data symbols which won't be intercepted: [...symbols]
Not sure this is relevant here, but just so it's mentioned.
Then in the generated code after $number gets bigger than 256:
lib.so.tramp.S: Assembler messages:
lib.so.tramp.S:11944: Error: invalid literal constant: pool needs to be closer
from the generated tramp +11944
ldr ip, =257
I found some comments on this error, but nothing helped.
And also here after $offset gets bigger than 4095:
lib.so.tramp.S:47218: Error: bad immediate value for offset (4096)
from the generated tramp +47218
ldr ip, [ip, #4096]
Can this just be fixed by loading the constant into a register first?
ldr r1, #$offset
ldr ip, [ip, r1]
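One possible way around the 4095-byte limit of the immediate offset, sketched here from the generator's point of view, is to split the offset: add the 4 KiB-aligned part to the base register first, then use the remaining low 12 bits as the ldr immediate. This is only an illustration under assumptions: it targets the ARM (A32) encoding, covers offsets below 1 MiB (where the aligned part is still encodable as an add immediate), and load_table_entry is a hypothetical name rather than anything in implib-gen.py.

```python
def load_table_entry(offset):
    """Return ARM (A32) instructions that load [ip + offset] into ip."""
    # Up to 255 * 4096 the aligned part fits an 8-bit rotated immediate.
    assert 0 <= offset < 256 * 4096
    high = offset & ~0xFFF  # 4 KiB-aligned part, added to the base register
    low = offset & 0xFFF    # remainder, fits the 12-bit ldr immediate
    lines = []
    if high:
        lines.append(f"add ip, ip, #{high}")
    lines.append(f"ldr ip, [ip, #{low}]")
    return lines


for line in load_table_entry(4096):
    print(line)
```

Small offsets still produce the single ldr that the template emits today.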
Thanks for the great library.
I'm trying to use your library on the armv7 architecture.
I've compiled the program and the library with the arm-linux-gnueabihf-g++ compiler.
Generating the stub files and resolving my own .so at runtime work OK.
But when I call the functions, I get a segmentation fault.
I am a newbie at ARM assembly, so I can't debug it well, but referencing the ip register causes the segmentation fault.
ldr ip, 3f
2:
add ip, pc, ip
ldr ip, [ip, #336]
cmp ip, #0
// Fast path
bxne ip
// Slow path
ldr ip, =84
push {ip}
Any suggestions would be appreciated.
Best regards.
@yugr, thank you for providing this tool! It looks to do most of what I need, but I'm wondering if it can be extended to throw an exception if the target library can't be loaded (similar to https://github.com/jackyf/so-stub).
My use case: I need to link statically against a library that may be missing on the target computer, but I don't want to abort loading the entire module/executable, as that functionality may be optional. I've tried other options (some of which you list in the README), but none of them work well for this use case. I'd like to build an .so file using your tool that would replace the (potentially missing) library: if the target library is present, calls get forwarded to it (as you do now); if it's missing, a run-time exception is thrown, but the replacement library itself still loads and satisfies the dependency.
Also, you mentioned that you can do a 32-bit variant as well, which would be great for my case (as I need both 64-bit and 32-bit versions). How difficult would it be to add it?
My understanding is LLVM has its own assembly language it supports, that is also supported by emcc. Is that a supported target of this project? Is LLVM assembly just the same as something else that's common?