EDLF64 is the JSON/wayland of the executables and dynamic libraries for 64bits platforms with a
reentrancy lock mechanism.
EDLF64=_E_xecutable and _D_ynamic _L_ibrary _F_ormat for _64_bits platforms with a reentrancy lock
mechanism.
The excruciating simplicity of the format is intentional, while still doing a good enough job.
To avoid a circular dependency with a high level threading dynamic library, the hardware
architecture with or without the kernel must provide a pre-inited/ready to use reentrancy lock
mechanism. On many-core systems, reentrancy lock means thread safe, and usually is an atomic
compare and exchange hardware instruction or kernel syscall on memory locations which are already
inited with some specific content.
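A minimal sketch of such a lock, assuming C11 <stdatomic.h> is available. The names edlf64_lock_t,
edlf64_trylock and edlf64_unlock are hypothetical and not part of the format. A zero-filled 64-bit
word means "free", so a lock in static storage is ready to use without any runtime init:

```c
#include <stdatomic.h>
#include <stdint.h>

#define EDLF64_EAGAIN 11u /* the "busy" value used throughout this document */

/* Hypothetical lock word: 0 means free, 1 means taken. Static storage is
 * zero-filled, so the lock is usable without any init call. */
typedef struct { _Atomic uint64_t word; } edlf64_lock_t;

/* Tries to take the lock once, without blocking.
 * Returns 0 on success, 11 if the lock is already taken. */
static uint64_t edlf64_trylock(edlf64_lock_t *l)
{
    uint64_t expected = 0;

    /* atomic compare and exchange: 0 -> 1 only if the word is still 0 */
    if (atomic_compare_exchange_strong(&l->word, &expected, 1))
        return 0;
    return EDLF64_EAGAIN;
}

/* Releases the lock. Must only be called by the current owner. */
static void edlf64_unlock(edlf64_lock_t *l)
{
    atomic_store(&l->word, 0);
}
```

A guarded entry point would call edlf64_trylock first, return 11 to its caller if it fails, and call
edlf64_unlock on every exit path otherwise.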
====================================================================================================
0x00 "EDLF64",0x00,version_b (version_b will very probably stay 0x00 forever)
0x08 alignment, power of two (then cannot be 0, namely at least 1), from "EDLF64"
0x10 mem_bytes_n.
0x18 process_entry_file_offset, 0 if this is a dynamic library (register passing, no stack).
0x20 resolve_file_offset, 0 if there are no symbols in this edlf64 file.
----
0x28 edlf64_hdr_bytes_n.
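The layout above could be checked as sketched below. This is only an illustration under stated
assumptions: every field after the first 8 bytes is taken to be 8 bytes wide (inferred from the
offsets), the byte order is taken to be the one of the host (this document does not specify it), and
the names struct edlf64_hdr and edlf64_hdr_check are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of the header; field names follow the table. */
struct edlf64_hdr {
    uint8_t  magic[8];                  /* 0x00: "EDLF64", 0x00, version_b */
    uint64_t alignment;                 /* 0x08: power of two, never 0 */
    uint64_t mem_bytes_n;               /* 0x10 */
    uint64_t process_entry_file_offset; /* 0x18: 0 for a dynamic library */
    uint64_t resolve_file_offset;       /* 0x20: 0 if there are no symbols */
};
static_assert(sizeof(struct edlf64_hdr) == 0x28, "must match edlf64_hdr_bytes_n");

/* Copies the header out of bytes and validates the magic, version_b and
 * alignment. Returns 0 if the header looks valid, 1 otherwise. */
static int edlf64_hdr_check(const void *bytes, uint64_t bytes_n,
                            struct edlf64_hdr *out)
{
    static const uint8_t magic[7] = { 'E', 'D', 'L', 'F', '6', '4', 0x00 };

    if (bytes_n < sizeof(*out))
        return 1;
    memcpy(out, bytes, sizeof(*out)); /* bytes may not be 8-byte aligned */
    if (memcmp(out->magic, magic, sizeof(magic)) != 0)
        return 1;
    if (out->magic[7] != 0x00) /* version_b */
        return 1;
    /* a power of two has exactly one bit set */
    if (out->alignment == 0 || (out->alignment & (out->alignment - 1)) != 0)
        return 1;
    return 0;
}
```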
====================================================================================================
A loader64 instance will keep a registry of what was loaded via reference counting. Usually, there
will be only one static instance of loader64 per process inited by the process_entry function.
The main loader64 instance code is usually dependent only on the hardware architecture and kernel,
to stay independent of all dynamic libraries.
Only one thread can use a loader64 instance at a time. Entry must be guarded by the reentrancy lock
mechanism.
Dynamic libraries should try not to presume they are the only instance in a process: there could be
others loaded by other loader64 instances, or the "same" dynamic library but from different files.
Dodging related conflicts could be expensive and should be clearly documented in a dynamic library
documentation if it is supported or not.
uint64_t loader64_open( /*INPUT*/ void *pathname, /* we presume the pathname is self-sizing */
/*OUTPUT*/ uint64_t *handle, void **start);
Return values:
0 if ok, and a loader handle and the pointer on the start of the loaded dynamic
library, namely the first byte of the loaded dynamic library EDLF64 header.
11 (-EAGAIN), if the loader is currently busy.
other if an error occurred.
If pathname targets an already loaded file, the same handle/start will be returned by the
loader. On linux, the triplet (dev_major/dev_minor/inode) should define file system
uniqueness.
The _CALLER_ may use, on platforms which can support it, the EDLF64_LIBRARY_PATH environment
variable to build the pathnames, or any other private means (similar to ELF DT_RUNPATH). See
below for a description of the EDLF64_LIBRARY_PATH.
NOTE: we use a handle to avoid having to map a start virtual address to some loader
internals (for instance, the handle could directly be an offset into such internals).
uint64_t loader64_close(/*INPUT*/ uint64_t handle);
Return values:
0 if ok. The handle becomes invalid.
11 (-EAGAIN), if the loader is currently busy. The handle stays valid.
other if something went wrong while closing the edlf64 file. The handle becomes
invalid.
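The registry behind loader64_open/loader64_close could look like the sketch below. It only shows the
reference counting and the "same file, same handle/start" rule. The fixed-size table, the names
reg_get and reg_put, and the choice of a slot index as handle are assumptions made for illustration.
The actual mapping of the file and the reentrancy lock are left out.

```c
#include <stdint.h>

#define REG_SLOTS_N 8u /* arbitrary capacity for this sketch */

/* Hypothetical file identity: the linux triplet mentioned above. */
struct file_id { uint64_t dev_major, dev_minor, inode; };

struct reg_slot { struct file_id id; void *start; uint64_t refs_n; };

static struct reg_slot reg[REG_SLOTS_N]; /* refs_n == 0 means free slot */

static int same_file(const struct file_id *a, const struct file_id *b)
{
    return a->dev_major == b->dev_major && a->dev_minor == b->dev_minor
        && a->inode == b->inode;
}

/* Takes a reference on the file id. If the file is already registered, its
 * existing handle/start are returned and start is ignored. Otherwise a new
 * slot is booked with start. Returns 0 if ok, 1 if the registry is full. */
static uint64_t reg_get(const struct file_id *id, void *start,
                        uint64_t *handle, void **start_out)
{
    uint64_t i, free_i = REG_SLOTS_N;

    for (i = 0; i < REG_SLOTS_N; ++i) {
        if (reg[i].refs_n == 0) {
            if (free_i == REG_SLOTS_N)
                free_i = i; /* remember the first free slot */
        } else if (same_file(&reg[i].id, id)) {
            ++reg[i].refs_n;
            *handle = i;
            *start_out = reg[i].start;
            return 0;
        }
    }
    if (free_i == REG_SLOTS_N)
        return 1;
    reg[free_i].id = *id;
    reg[free_i].start = start;
    reg[free_i].refs_n = 1;
    *handle = free_i;
    *start_out = start;
    return 0;
}

/* Drops one reference. Returns 1 if it was the last one (the caller should
 * then unmap the file), 0 if the file is still referenced, -1 if the handle
 * is invalid. */
static int reg_put(uint64_t handle)
{
    if (handle >= REG_SLOTS_N || reg[handle].refs_n == 0)
        return -1;
    return --reg[handle].refs_n == 0;
}
```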
====================================================================================================
EDLF64 is about loading only one RWX memory segment.
====================================================================================================
An EDLF64 file may honor the following environment variable in order to look up dynamic
libraries. Of course, only on platforms where it is possible. Incompatible platforms may
define their own "EDLF64_LIBRARY_PATH way" (which could have a conflicting name separator).
EDLF64_LIBRARY_PATH environment variable to lookup for EDLF64 dynamic libraries: byte string
ending with a 0x00 byte. Each path from EDLF64_LIBRARY_PATH is prepended to a dynamic library name.
The char '%' is used to escape the path separator ':'. To have a '%' in a path, you must double
the '%', "%%", and to have a ':' in a path not being interpreted as a separator, use "%:". "::"
means an empty path, usually the current working directory at the time of the lookup. Any invalid
combination, namely '%' followed by a char other than '%' or ':', will result in the path being
skipped.
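The escaping rules can be sketched as a small iterator over the variable content. The name
next_path and its return codes are hypothetical. This sketch also assumes that an empty variable
yields one empty path, that a trailing ':' yields a final empty path, and that a path too long for
the output buffer is reported like an invalid one:

```c
#include <stddef.h>

/* Decodes the path starting at *cursor into out (out_n bytes, including the
 * terminating 0x00), then moves *cursor to the next path, or to NULL after
 * the last one.
 * Returns 0: out holds a valid path (possibly empty),
 *         1: this path must be skipped (invalid escape or too long),
 *         2: no path left. */
static int next_path(const char **cursor, char *out, size_t out_n)
{
    const char *p = *cursor;
    size_t n = 0;
    int skip = 0;

    if (p == NULL)
        return 2;
    if (out_n == 0)
        return 1;
    while (*p != '\0' && *p != ':') {
        char c = *p++;

        if (c == '%') {
            if (*p != '%' && *p != ':') {
                skip = 1; /* invalid escape: keep scanning to the separator */
                continue;
            }
            c = *p++; /* escaped '%' or ':' taken literally */
        }
        if (n + 1 < out_n)
            out[n++] = c;
        else
            skip = 1; /* does not fit in out */
    }
    out[n] = '\0';
    *cursor = (*p == ':') ? p + 1 : NULL;
    return skip;
}
```

A caller would loop until 2 is returned, ignore the paths reported with 1, and prepend each valid
path to the dynamic library name.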
====================================================================================================
Prototype of the process_entry function. WARNING: there is no stack, the arguments must be passed in
registers. This is, very probably, not a classic ABI function call.
void process_entry(
void *process_info, /* Virtual address */
uint64_t process_info_bytes_n /* size in bytes */
);
process_info is, on compatible platforms, arg, env and aux with their respective data and they are
mostly the same as in the SYSV ABI (arg has no argc). The process_entry function will have to set up
its stack: it could be a "mmap" system call or a part of the "bss", namely from the extra memory past
the end of the loaded file (including a guard memory page or not), or bluntly book some room in the
file, etc.
On "mmap/munmap" platforms, process_info must be cleanly munmap-able, namely the process_info address
must fit the requirements of the platform munmap to do so (usually aligned on the size of the memory
page used to mmap it). Other platform types may implement other mechanisms to let process_entry
remove/free/reuse such bytes from the process.
Additionally, for "mmap/munmap" platforms, the executable must be cleanly munmap-able: the aux
vector must provide the address used to mmap the executable which would be used to munmap it. Same
fate as process_info for the other platform types.
(On linux, the EDLF64 vdso would have to be cleanly munmap-able too).
For the executable, you may need its path in order to load private dynamic libraries based on
its file system location. On linux it is possible only if the /proc file system is mounted and it
will target the actual executable via the pathname in the symbolic link /proc/self/exe (all symbolic
links were resolved in this pathname). Namely, without mounted /proc, the executable will need a
private method to look up its private dynamic libraries (similar to ELF LD_ORIGIN environment
variable).
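A minimal sketch of the /proc based lookup on linux, using the POSIX readlink call. The helper names
keep_dir_part and exe_dir are hypothetical. The sketch returns the directory of the executable with
its trailing '/', ready to be prepended to a private dynamic library name:

```c
#include <stddef.h>
#include <unistd.h>

/* Truncates path in place right after its last '/', keeping only the
 * directory part. Returns 0 if ok, 1 if path contains no '/'. */
static int keep_dir_part(char *path)
{
    char *last = NULL, *p;

    for (p = path; *p != '\0'; ++p)
        if (*p == '/')
            last = p;
    if (last == NULL)
        return 1;
    last[1] = '\0';
    return 0;
}

/* Linux only, needs a mounted /proc. Writes the directory of the running
 * executable into buf. Returns 0 if ok, 1 on failure. */
static int exe_dir(char *buf, size_t buf_n)
{
    ssize_t n;

    if (buf_n < 2)
        return 1;
    /* readlink does not write a terminating 0x00 byte */
    n = readlink("/proc/self/exe", buf, buf_n - 1);
    if (n < 0 || (size_t)n >= buf_n - 1)
        return 1; /* failure, or the pathname may have been truncated */
    buf[n] = '\0';
    return keep_dir_part(buf);
}
```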
====================================================================================================
C prototype of the resolve function:
uint64_t resolve(
/*INPUT*/ uint64_t *symbol_id,
/*OUTPUT*/ void **symbol_virtual_address);
Entry must be guarded by the reentrancy lock mechanism.
symbol_id is a 64bits unique value identifying a symbol (similar to kernel syscalls). Like kernel
syscalls, those symbol ids must be _EXTREMELY_ stable in time.
Return values:
0 if ok, with the virtual address of the symbol via the symbol_virtual_address argument. No
assumption must be made about the location of this virtual address.
11 (-EAGAIN), if the resolve function cannot be run right now.
other the symbol was not found.
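A resolve function could be as simple as a switch over the symbol ids, as sketched below. The
symbol ids, the two example symbols and the "not found" value 1 are hypothetical. The guard uses a
C11 atomic_flag as a stand-in for the reentrancy lock mechanism of the platform:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical symbol ids: assigned once, never changed, never reused. */
#define SYM_ID_FOO_ADD     UINT64_C(1)
#define SYM_ID_FOO_VERSION UINT64_C(2)

static uint64_t foo_add(uint64_t a, uint64_t b) { return a + b; }
static const uint64_t foo_version = 1;

/* Statically initialized, so usable without any runtime init. */
static atomic_flag resolve_busy = ATOMIC_FLAG_INIT;

uint64_t resolve(uint64_t *symbol_id, void **symbol_virtual_address)
{
    uint64_t r = 0;

    if (atomic_flag_test_and_set(&resolve_busy))
        return 11; /* already running: the caller may retry later */
    switch (*symbol_id) {
    case SYM_ID_FOO_ADD:
        /* function pointer passed as void *, via an integer cast */
        *symbol_virtual_address = (void *)(uintptr_t)foo_add;
        break;
    case SYM_ID_FOO_VERSION:
        *symbol_virtual_address = (void *)(uintptr_t)&foo_version;
        break;
    default:
        r = 1; /* hypothetical "symbol not found" value */
        break;
    }
    atomic_flag_clear(&resolve_busy);
    return r;
}
```

The caller is expected to cast the returned address back to the type documented for that symbol id.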
====================================================================================================
In init or fini functions, be very careful about circular dependencies with other components. It
is worth it in the long run to provide permanent reentrancy detection, aborting in a very loud
manner.
Init/fini functions have the responsibility to keep the dynamic library state consistent (for
instance using reference counting like a loader instance).
Example C prototype of a basic init function:
uint64_t init(
/*INPUT*/ void *process_info, uint64_t process_info_bytes_n, void *pathname,
uint64_t (*loader64_open)(void *pathname, uint64_t *handle, void **start),
uint64_t (*loader64_close)(uint64_t handle));
The process_info here may be a variant from the one provided by the kernel to process_entry.
Pathname is a pointer on the pathname used to load this edlf64 file. If successful, it should return
0. Do not expect process_info to stay in memory after the call, nor pathname. On x64/x86_64 with
the SYSV ABI, if going with more than 6 parameters, just init a transient structure to pass the
whole data to work around some ABI kludge.
Alternative C prototype of a basic init function:
uint64_t init(
/*INPUT*/ void *process_info, uint64_t process_info_bytes_n,
int distribution_dir_fd, int fd,
uint64_t (*loader64_open)(void *pathname, uint64_t *handle, void **start),
uint64_t (*loader64_close)(uint64_t handle));
Same as the previous one, but with the process file descriptor used to load/mmap the edlf64
file instead of the pathname, supplemented with the process directory file descriptor of the
distribution directory.
Example C prototype of basic fini function:
void fini(void);
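A sketch of an init/fini pair following the advice above: a reference count keeps the state
consistent across several init calls, and a flag detects reentrancy and aborts. The counters
setup_runs_n and teardown_runs_n are hypothetical and only stand in for the real one-time work. The
sketch assumes the callers serialize their calls, it is not thread safe by itself.

```c
#include <stdint.h>
#include <stdlib.h>

static uint64_t refs_n;          /* successful init calls not yet undone */
static int in_init_or_fini;      /* permanent reentrancy detection */
static uint64_t setup_runs_n;    /* stand-in for the real one-time setup */
static uint64_t teardown_runs_n; /* stand-in for the real teardown */

uint64_t init(void *process_info, uint64_t process_info_bytes_n, void *pathname,
    uint64_t (*loader64_open)(void *pathname, uint64_t *handle, void **start),
    uint64_t (*loader64_close)(uint64_t handle))
{
    /* unused in this sketch; a real init would copy what it needs, since
     * process_info and pathname may not stay in memory after the call */
    (void)process_info; (void)process_info_bytes_n; (void)pathname;
    (void)loader64_open; (void)loader64_close;

    if (in_init_or_fini)
        abort(); /* circular dependency: fail loudly */
    in_init_or_fini = 1;
    if (refs_n++ == 0)
        ++setup_runs_n; /* first user: do the one-time setup here */
    in_init_or_fini = 0;
    return 0;
}

void fini(void)
{
    if (in_init_or_fini || refs_n == 0)
        abort(); /* reentrancy, or fini without a matching init */
    in_init_or_fini = 1;
    if (--refs_n == 0)
        ++teardown_runs_n; /* last user: release everything here */
    in_init_or_fini = 0;
}
```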
====================================================================================================
NOTES:
No static and implicit TLS anymore, explicit initialization is required, for instance via the
pthread dynamic library on POSIX-like systems.
errno would be accessed via an all-thread shared pthread key, or an additional parameter, or
bluntly trashed. errno would become a macro using that shared pthread key to get the value.