sylware / nyanlinux (public) (License: AFFERO GPLv3) (since 2019-09-09) (hash sha1)
scripts for a lean, from scratch, amd hardware, linux distro

/files/EDLF64.draft (4e7ced12e41d89a6c4cc1debd4fc37b2b662a27c) (9468 bytes) (mode 100644) (type blob)

EDLF64 is the JSON/wayland of the executables and dynamic libraries for 64bits platforms with a
reentrancy lock mechanism.

EDLF64=_E_xecutable and _D_ynamic _L_ibrary _F_ormat for _64_bits platforms with a reentrancy lock
mechanism.

The excruciating simplicity of the format is intended, while still doing a good enough job.

To avoid a circular dependency with a high level threading dynamic library, the hardware
architecture, with or without the kernel, must provide a pre-inited/ready to use reentrancy lock
mechanism. On many-core systems, reentrancy lock means thread safe, and it usually is an atomic
compare and exchange hardware instruction or a kernel syscall on memory locations which are already
inited with some specific content.
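
A minimal sketch of such a lock, assuming a C11 compiler with <stdatomic.h> and a lock word which
is statically inited to 0. The names entry_lock, entry_try_enter and entry_leave are hypothetical,
they are not part of the format. The busy value 11 is the one used by the functions described
further down.

```c
#include <stdatomic.h>
#include <stdint.h>

#define EDLF64_BUSY 11u	/* "busy" return value, same as in the loader64/resolve functions */

/* Hypothetical lock word: 0 = free, 1 = taken. Statically inited, no library needed. */
static _Atomic uint64_t entry_lock = 0;

/* Try to enter the guarded code. Never blocks: returns 0 if entered, 11 if busy. */
static uint64_t entry_try_enter(void)
{
	uint64_t expected = 0;

	/* atomic compare and exchange: only one caller can switch the word from 0 to 1 */
	if (!atomic_compare_exchange_strong_explicit(&entry_lock, &expected, 1,
				memory_order_acquire, memory_order_relaxed))
		return EDLF64_BUSY;
	return 0;
}

/* Leave the guarded code, the lock word goes back to its inited content. */
static void entry_leave(void)
{
	atomic_store_explicit(&entry_lock, 0, memory_order_release);
}
```

A guarded function would call entry_try_enter() first, return 11 to its caller if it did not get
in, and call entry_leave() on every exit path.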
====================================================================================================
0x00 "EDLF64",0x00,version_b (version_b will very probably stay 0x00 forever)
0x08 alignment, power of two (then cannot be 0, namely at least 1), from "EDLF64" 
0x10 mem_bytes_n.
0x18 process_entry_file_offset, 0 if this is a dynamic library (register passing, no stack).
0x20 resolve_file_offset, 0 if there are no symbols in this edlf64 file.
----
0x28 edlf64_hdr_bytes_n.
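
A sketch of how a loader could decode and sanity check this header. ASSUMPTION: the 64bits fields
are stored little-endian, the draft does not state a byte order. The structure and function names
(edlf64_hdr, edlf64_hdr_parse, le64_load) and the error values are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define EDLF64_HDR_BYTES_N 0x28u

/* Decoded header fields, the names follow the layout above. */
struct edlf64_hdr {
	uint8_t version_b;
	uint64_t alignment;
	uint64_t mem_bytes_n;
	uint64_t process_entry_file_offset;	/* 0 if this is a dynamic library */
	uint64_t resolve_file_offset;		/* 0 if there are no symbols */
};

/* ASSUMPTION: little-endian storage, not stated by the draft. */
static uint64_t le64_load(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; --i)
		v = (v << 8) | p[i];
	return v;
}

/* Returns 0 if ok, 1 if too short, 2 if the magic is wrong, 3 if the alignment is invalid. */
static uint64_t edlf64_hdr_parse(const uint8_t *file, uint64_t file_bytes_n,
						struct edlf64_hdr *hdr)
{
	static const uint8_t magic[7] = {'E', 'D', 'L', 'F', '6', '4', 0x00};

	if (file_bytes_n < EDLF64_HDR_BYTES_N)
		return 1;
	if (memcmp(file, magic, sizeof(magic)) != 0)
		return 2;
	hdr->version_b = file[7];
	hdr->alignment = le64_load(file + 0x08);
	hdr->mem_bytes_n = le64_load(file + 0x10);
	hdr->process_entry_file_offset = le64_load(file + 0x18);
	hdr->resolve_file_offset = le64_load(file + 0x20);
	/* a power of two has exactly one bit set, which also excludes 0 */
	if (hdr->alignment == 0 || (hdr->alignment & (hdr->alignment - 1)) != 0)
		return 3;
	return 0;
}
```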
====================================================================================================
A loader64 instance will keep a registry of what was loaded via reference counting. Usually, there
will be only one static instance of loader64 per process inited by the process_entry function.
The main loader64 instance code is usually dependent only on the hardware architecture and kernel,
to stay independent of all dynamic libraries.

Only one thread can use a loader64 instance at a time. Entry must be guarded by the reentrancy lock
mechanism.

Dynamic libraries should try not to presume they are the only instance in a process: there could be
others loaded by other loader64 instances, or the "same" dynamic library but from different files.
Dodging the related conflicts could be expensive, and the documentation of a dynamic library should
clearly state whether this is supported or not.

uint64_t loader64_open(	/*INPUT*/ void *pathname, /* we presume the pathname is self-sizing */
			/*OUTPUT*/ uint64_t *handle, void **start);

	Return values:
	0	if ok, with a loader handle and the pointer on the start of the loaded dynamic
		library, namely the first byte of the loaded dynamic library EDLF64 header.
	11	(-EAGAIN), if the loader is currently busy.
	other	an error did happen.

	If pathname targets an already loaded file, the same handle/start will be returned by the
	loader. On linux, the triplet (dev_major/dev_minor/inode) should define file system
	uniqueness.

	The _CALLER_ may use, on platforms which can support it, the EDLF64_LIBRARY_PATH environment
	variable to build the pathnames, or any other private means (similar to ELF DT_RUNPATH). See
	below for a description of EDLF64_LIBRARY_PATH.

	NOTE: we use a handle to avoid having to map a start virtual address to some loader
	internals (for instance, the handle could be directly an offset into such internals).

uint64_t loader64_close(/*INPUT*/ uint64_t handle);

	Return values:
	0	if ok. The handle becomes invalid.
	11	(-EAGAIN), if the loader is currently busy. The handle stays valid.
	other	if something went wrong while closing the edlf64 file. The handle becomes
		invalid.
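
A sketch of the reference counted registry a loader64 instance could keep behind those two
functions. This is only one possible reading of the text above: the fixed size table, the names
(file_id, registry_acquire, registry_release) and the error value 1 are hypothetical. The handle is
directly the index of the slot, and a file is identified by the linux triplet. The mapping of the
file and the reentrancy lock guard are left out.

```c
#include <stdint.h>

#define REGISTRY_SLOTS_N 16u

/* On linux, this triplet identifies a file whatever the pathname used to reach it. */
struct file_id {
	uint64_t dev_major;
	uint64_t dev_minor;
	uint64_t inode;
};

struct slot {
	struct file_id id;
	uint64_t refs_n;	/* 0 means the slot is free */
};

static struct slot registry[REGISTRY_SLOTS_N];

/* Returns 0 and the handle if ok, 1 if the registry is full. */
static uint64_t registry_acquire(const struct file_id *id, uint64_t *handle)
{
	uint64_t free_slot = REGISTRY_SLOTS_N;
	uint64_t i;

	for (i = 0; i < REGISTRY_SLOTS_N; ++i) {
		struct slot *s = &registry[i];

		if (s->refs_n == 0) {
			if (free_slot == REGISTRY_SLOTS_N)
				free_slot = i;	/* remember the first free slot */
			continue;
		}
		if (s->id.dev_major == id->dev_major && s->id.dev_minor == id->dev_minor
							&& s->id.inode == id->inode) {
			++s->refs_n;	/* already loaded: same handle, one more reference */
			*handle = i;
			return 0;
		}
	}
	if (free_slot == REGISTRY_SLOTS_N)
		return 1;
	/* a real loader would map the file here and record its start address */
	registry[free_slot].id = *id;
	registry[free_slot].refs_n = 1;
	*handle = free_slot;
	return 0;
}

/* Returns 0 if ok, 1 if the handle is invalid. The slot is free once refs_n is back to 0. */
static uint64_t registry_release(uint64_t handle)
{
	if (handle >= REGISTRY_SLOTS_N || registry[handle].refs_n == 0)
		return 1;
	--registry[handle].refs_n;	/* a real loader would unmap the file when reaching 0 */
	return 0;
}
```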
====================================================================================================
EDLF64 is about loading only one RWX memory segment.
====================================================================================================
An EDLF64 file may honor the following environment variable in order to look up dynamic
libraries. Of course, only on platforms where it is possible. Incompatible platforms may define
their own "EDLF64_LIBRARY_PATH way" (it could have a conflicting name separator).

EDLF64_LIBRARY_PATH environment variable to lookup for EDLF64 dynamic libraries: byte string
ending with a 0x00 byte. Each path from EDLF64_LIBRARY_PATH is prepended to a dynamic library name.

The char '%' is used to escape the path separator ':'. To have a '%' in a path, you must double
the '%', "%%", and to have a ':' in a path not being interpreted as a separator, use "%:". "::"
means an empty path, usually the current working directory at the time of the lookup. Any invalid
combination, namely a '%' followed by a char other than '%' or ':', will result in the path being
skipped.
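
A sketch of a decoder for those escaping rules, one path per call. The function name path_next and
the ENTRY_* values are hypothetical. ASSUMPTIONS not stated above: a leading or trailing ':' also
yields an empty path, and a path which does not fit into the output buffer is skipped like an
invalid one.

```c
#include <stddef.h>

enum { ENTRY_OK = 0, ENTRY_SKIP = 1, ENTRY_END = 2 };

/*
 * Decode the path starting at *cursor into out (0x00 terminated, out_bytes_n must be >= 1),
 * then move *cursor to the next path, or to NULL after the last one.
 * Returns ENTRY_OK, ENTRY_SKIP (invalid path, out must be ignored) or ENTRY_END (nothing left).
 */
static int path_next(const char **cursor, char *out, size_t out_bytes_n)
{
	const char *p = *cursor;
	size_t n = 0;
	int valid = 1;

	if (p == NULL)
		return ENTRY_END;
	for (;;) {
		char c = *p;

		if (c == 0 || c == ':')	/* end of the variable or unescaped separator */
			break;
		++p;
		if (c == '%') {
			c = *p;
			if (c != '%' && c != ':') {
				valid = 0;	/* "%x" or a trailing '%' */
				continue;	/* keep scanning up to the separator */
			}
			++p;	/* consume the escaped char */
		}
		if (n + 1 < out_bytes_n)
			out[n++] = c;
		else
			valid = 0;	/* ASSUMPTION: too long means skipped */
	}
	out[n] = 0;
	*cursor = (*p == ':') ? p + 1 : NULL;
	return valid ? ENTRY_OK : ENTRY_SKIP;
}
```

The caller would start with cursor pointing on the first byte of the EDLF64_LIBRARY_PATH value, and
prepend each ENTRY_OK path to the dynamic library name until ENTRY_END is returned.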
====================================================================================================
Prototype of the process_entry function. WARNING: there is no stack, the parameters must be passed
in registers. This is, very probably, not a classic ABI function call.

	void process_entry(
		void *process_info,		/* Virtual address */
		uint64_t process_info_bytes_n	/* size in bytes */
	);

process_info is, on compatible platforms, arg, env and aux with their respective data, and they are
mostly the same as in the SYSV ABI (arg has no argc). The process_entry function will have to set
up its stack: it could be a "mmap" system call or a part of the "bss", namely from the extra memory
past the end of the loaded file (including a guard memory page or not), or bluntly book some room
into the file, etc.

On "mmap/munmap" platforms, process_info must be cleanly munmap-able, namely the process_info
address must fit the requirements of the platform munmap to do so (usually aligned on the size of
the memory page used to mmap it). Other platform types may implement other mechanisms to let
process_entry remove/free/reuse such bytes from the process.

Additionally, for "mmap/munmap" platforms, the executable must be cleanly munmap-able: the aux
vector must provide the address used to mmap the executable, which would be used to munmap it. Same
fate as process_info for the other platform types.

(On linux, the EDLF64 vdso would have to be cleanly munmap-able too).

For the executable, you may need its path in order to load private dynamic libraries based on
its file system location. On linux, it is possible only if the /proc file system is mounted: the
symbolic link /proc/self/exe contains the pathname of the actual executable (all symbolic links
are resolved in this pathname). Namely, without a mounted /proc, the executable will need a
private method to look up its private dynamic libraries (similar to ELF LD_ORIGIN environment
variable).
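
A sketch of the linux /proc way, using the POSIX readlink function. The function name exe_dir and
its return values are hypothetical. It only works once a stack is set up and if /proc is mounted,
it returns non-zero otherwise.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <unistd.h>

/*
 * Store in out the 0x00 terminated directory of the running executable.
 * Returns 0 if ok, 1 if /proc/self/exe cannot be read, 2 if out may be too small,
 * 3 if the pathname has no '/'.
 */
static uint64_t exe_dir(char *out, uint64_t out_bytes_n)
{
	ssize_t n;

	if (out_bytes_n < 2)
		return 2;
	/* readlink does not write a terminating 0x00, keep one byte for it */
	n = readlink("/proc/self/exe", out, (size_t)(out_bytes_n - 1));
	if (n <= 0)
		return 1;	/* /proc not mounted, or any other failure */
	if ((uint64_t)n == out_bytes_n - 1)
		return 2;	/* the pathname may have been truncated */
	/* drop the file name: move n right after the last '/' */
	while (n > 0 && out[n - 1] != '/')
		--n;
	if (n == 0)
		return 3;
	if (n > 1)
		--n;		/* drop the '/' too, unless the directory is the root */
	out[n] = 0;
	return 0;
}
```

The returned directory could then be used as a prefix to build the pathnames of the private dynamic
libraries given to loader64_open.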
====================================================================================================
C prototype of the resolve function:

	uint64_t resolve(
	            /*INPUT*/ uint64_t *symbol_id,
	            /*OUTPUT*/ void **symbol_virtual_address);

Entry must be guarded by the reentrancy lock mechanism.

symbol_id is a 64bits unique value identifying a symbol (similar to kernel syscalls). Like kernel
syscalls, those symbol ids must be _EXTREMELY_ stable in time.

Return values:
0	if ok, with the virtual address of the symbol via the symbol_virtual_address argument. No
	assumption must be made about the location of this virtual address.
11	(-EAGAIN), if the resolve function cannot be run right now. 
other	the symbol was not found.
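
A sketch of a resolve function built on a static table and guarded by an atomic compare and
exchange. The symbol ids (1 and 2), the exported symbols (answer, twice) and the "not found" value
1 are hypothetical. Converting a function pointer to void * is not strict ISO C, but it is what the
prototype implies and it works on POSIX-like platforms.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RESOLVE_BUSY 11u

static uint64_t answer = 42;				/* hypothetical data symbol */
static uint64_t twice(uint64_t x) { return 2 * x; }	/* hypothetical function symbol */

struct symbol {
	uint64_t id;		/* must stay stable in time, like a syscall number */
	void *virtual_address;
};

static const struct symbol symbols[] = {
	{0x0000000000000001, &answer},
	{0x0000000000000002, (void *)twice},
};

static _Atomic uint64_t resolve_lock = 0;	/* 0 = free, 1 = taken */

uint64_t resolve(uint64_t *symbol_id, void **symbol_virtual_address)
{
	uint64_t expected = 0;
	uint64_t r = 1;		/* symbol not found, unless the loop says otherwise */
	uint64_t i;

	if (!atomic_compare_exchange_strong(&resolve_lock, &expected, 1))
		return RESOLVE_BUSY;
	for (i = 0; i < sizeof(symbols) / sizeof(symbols[0]); ++i) {
		if (symbols[i].id == *symbol_id) {
			*symbol_virtual_address = symbols[i].virtual_address;
			r = 0;
			break;
		}
	}
	atomic_store(&resolve_lock, 0);
	return r;
}
```

A caller would cast the returned virtual address to the type documented for the symbol id, for
instance uint64_t (*)(uint64_t) for the id 2 above.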
====================================================================================================
In init or fini functions, be very careful about circular dependencies with other components. It
is worth it in the long run to provide permanent reentrancy detection which aborts in a very loud
manner.

Init/fini functions have the responsibility to keep the dynamic library state consistent (for
instance using reference counting like a loader instance).

Example C prototype of a basic init function:
	uint64_t init(
	        /*INPUT*/ void *process_info, uint64_t process_info_bytes_n, void *pathname,
	                  uint64_t (*loader64_open)(void *pathname, uint64_t *handle, void **start),
	                  uint64_t (*loader64_close)(uint64_t handle));

The process_info here may be a variant from the one provided by the kernel to process_entry.
Pathname is a pointer on the pathname used to load this edlf64 file. If successful, it should return
0. Do not expect process_info to stay in memory after the call, nor pathname. On x64/x86_64 with
the SYSV ABI, if going with more than 6 parameters, just init a transient structure to pass the
whole data to work around some ABI kludge.

Alternative C prototype of a basic init function:
	uint64_t init(
	        /*INPUT*/ void *process_info, uint64_t process_info_bytes_n,
	                  int distribution_dir_fd, int fd,
	                  uint64_t (*loader64_open)(void *pathname, uint64_t *handle, void **start),
	                  uint64_t (*loader64_close)(uint64_t handle));

Same as the previous one, but with the process file descriptor used to load/mmap the edlf64 file
instead of the pathname, supplemented with the process directory file descriptor of the
distribution directory.

Example C prototype of basic fini function:
	void fini(void);
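
A sketch of an init/fini pair keeping the dynamic library state consistent with a reference count,
based on the first example prototype. The variables inits_n and state_ready are hypothetical
stand-ins for the real state, and the reentrancy lock guard is left out to keep it short.

```c
#include <stdint.h>

typedef uint64_t (*loader64_open_fn)(void *pathname, uint64_t *handle, void **start);
typedef uint64_t (*loader64_close_fn)(uint64_t handle);

static uint64_t inits_n;	/* successful init calls not yet balanced by a fini call */
static uint64_t state_ready;	/* stand-in for the real dynamic library state */

uint64_t init(void *process_info, uint64_t process_info_bytes_n, void *pathname,
			loader64_open_fn loader64_open, loader64_close_fn loader64_close)
{
	/* unused in this sketch; real code must copy what it needs before returning */
	(void)process_info;
	(void)process_info_bytes_n;
	(void)pathname;
	(void)loader64_open;
	(void)loader64_close;

	if (inits_n == 0)
		state_ready = 1;	/* the real set up happens only on the first init */
	++inits_n;
	return 0;
}

void fini(void)
{
	if (inits_n == 0)
		return;		/* unbalanced fini: ignored here, a loud abort is an option */
	if (--inits_n == 0)
		state_ready = 0;	/* the real tear down happens only on the last fini */
}
```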
====================================================================================================
NOTES:

No static and implicit TLS anymore, explicit initialization is required, for instance via the
pthread dynamic library on POSIX-like systems.

errno would be accessed via an all-thread shared pthread key, or an additional parameter, or
bluntly trashed. errno would become a macro using that shared pthread key to get the value.

