sylware / nyanlinux (public) (License: AFFERO GPLv3) (since 2019-09-09) (hash sha1)
scripts for a lean, from scratch, amd hardware, linux distro

/files/EDLF64.draft (90f01d6be7b22b47ad26c4e73a3f82137afc34b7) (7306 bytes) (mode 100644) (type blob)

EDLF64 is the JSON/wayland of the executables and dynamic libraries for 64bits platforms.

EDLF64=Executable and Dynamic Library Format for 64bits platforms.

The excruciating simplicity of the format is intentional, while still doing a good enough job.

Endianness is the one from the CPU. Offsets 0->edlf64_hdr_bytes_n-1 are obviously invalid.
0x00 "EDLF64",0x00[2],version_b,undef_b[7] (version_b will very probably stay 0x00 forever)
0x08 alignment, power of two (then cannot be 0, namely at least 1), from "EDLF64" 
0x10 mem_bytes_n.
0x18 process_entry_file_offset, 0 if this is a dynamic library (register passing, no stack).
0x20 resolve_file_offset, 0 if there are no symbols in this edlf64 file.
0x28 edlf64_hdr_bytes_n.
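
The header table above can be sketched as a C struct with a small sanity check. This is only an
illustration: the struct name, the function name and the error codes are hypothetical. Since the
byte counts listed for offset 0x00 do not obviously add up to 8, the sketch treats the first 8 bytes
as opaque and only compares the 6 "EDLF64" bytes. It also assumes edlf64_hdr_bytes_n is at least
0x30 and that non-zero file offsets must land past the header and inside the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C view of the header, field names mirror the draft. */
struct edlf64_hdr {
	uint8_t  ident[8];                  /* 0x00 starts with "EDLF64", rest not interpreted here */
	uint64_t alignment;                 /* 0x08 power of two, at least 1 */
	uint64_t mem_bytes_n;               /* 0x10 */
	uint64_t process_entry_file_offset; /* 0x18 0 for a dynamic library */
	uint64_t resolve_file_offset;       /* 0x20 0 if there are no symbols */
	uint64_t edlf64_hdr_bytes_n;        /* 0x28 */
};

/* Assumption: a non-zero offset must be past the header and inside the file. */
static int edlf64_offset_ok(uint64_t off, uint64_t hdr_bytes_n, uint64_t file_bytes_n)
{
	return off == 0 || (off >= hdr_bytes_n && off < file_bytes_n);
}

/* Copy the header out of the file bytes and return 0 if it looks plausible. */
static int edlf64_hdr_check(const void *file, uint64_t file_bytes_n, struct edlf64_hdr *out)
{
	assert(file != NULL && out != NULL);
	if (file_bytes_n < sizeof(*out))
		return -1;
	/* CPU endianness, so a plain copy is enough */
	memcpy(out, file, sizeof(*out));
	if (memcmp(out->ident, "EDLF64", 6) != 0)
		return -2;
	/* power of two: exactly one bit set, which also rejects 0 */
	if (out->alignment == 0 || (out->alignment & (out->alignment - 1)) != 0)
		return -3;
	if (out->edlf64_hdr_bytes_n < sizeof(*out) || out->edlf64_hdr_bytes_n > file_bytes_n)
		return -4;
	if (!edlf64_offset_ok(out->process_entry_file_offset, out->edlf64_hdr_bytes_n, file_bytes_n))
		return -5;
	if (!edlf64_offset_ok(out->resolve_file_offset, out->edlf64_hdr_bytes_n, file_bytes_n))
		return -6;
	return 0;
}
```
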
A loader64 instance will keep a registry of what was loaded via reference counting. Usually, there
will be only one static instance of loader64 per process, initialized by the process_entry function.

0x00 uint64_t (*open)(	/*INPUT*/ void *pathname, /* we presume the pathname is self-sizing */
			/*OUTPUT*/ uint64_t *handle, void **start);

	Must be reentrant/thread safe.

	Return 0 if ok, non-zero if an error happened. On success, it outputs a loader handle and a
	pointer to the start of the loaded dynamic library, namely the first byte of the loaded
	dynamic library EDLF64 header.

	Init functions, if any, should be resolved and called right after open. Usually, their C
	prototypes do include pathname, see below for a recommended init function prototype.

	If pathname targets an already loaded file, the same handle/start will be returned by the
	loader. On linux, the triplet (dev_major/dev_minor/inode) should define file system

	The _CALLER_ may use, on platforms which can support it, the EDLF64_LIBRARY_PATH environment
	variable to build the pathnames, or any other private means (similar to ELF DT_RUNPATH). See
	below for a description of the EDLF64_LIBRARY_PATH.

	NOTE: we use a handle in order to avoid having to map a start virtual address to some loader
	internals (for instance, the handle could directly be an offset into such internals).

0x08 uint64_t (*close)(/*INPUT*/ uint64_t handle);

	Must be reentrant/thread safe. Return 0 if ok, non-zero if something went wrong while
	closing the edlf64 file.

	Fini functions, if any, should be called (may be resolved if not provided by init functions)
	right before the close call.

Init/fini functions have the responsibility to keep the dynamic library state consistent (for
instance using reference counting like a loader instance).
0x10 loader64_bytes_n.
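
A minimal sketch of the loader64 table above, with a toy reference-counted registry behind it. The
toy_* names, the fixed-size slot table and the fake "image" bytes are all hypothetical, and nothing
is actually read from disk. Two deliberate simplifications: files are matched by pathname string
instead of the (dev_major/dev_minor/inode) triplet, and there is no locking, so this sketch is _NOT_
reentrant/thread safe as the draft requires. The handle is directly an index into the registry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout from the table above, assuming 8-byte function pointers. */
struct loader64 {
	uint64_t (*open)(void *pathname, uint64_t *handle, void **start); /* 0x00 */
	uint64_t (*close)(uint64_t handle);                               /* 0x08 */
	uint64_t loader64_bytes_n;                                        /* 0x10 */
};

#define TOY_SLOTS_N 8

struct toy_slot {
	char pathname[256];
	uint64_t refs_n;         /* 0 means the slot is free */
	unsigned char image[64]; /* stand-in for the loaded file bytes */
};

static struct toy_slot toy_slots[TOY_SLOTS_N];

static uint64_t toy_open(void *pathname, uint64_t *handle, void **start)
{
	const char *p = pathname;
	uint64_t i, slot = TOY_SLOTS_N;

	assert(p != NULL && handle != NULL && start != NULL);
	if (strlen(p) >= sizeof(toy_slots[0].pathname))
		return 1;
	for (i = 0; i < TOY_SLOTS_N; ++i) {
		if (toy_slots[i].refs_n != 0) {
			if (strcmp(toy_slots[i].pathname, p) == 0) {
				slot = i; /* already loaded: reuse the same handle/start */
				break;
			}
		} else if (slot == TOY_SLOTS_N) {
			slot = i; /* remember the first free slot */
		}
	}
	if (slot == TOY_SLOTS_N)
		return 2; /* registry full */
	if (toy_slots[slot].refs_n == 0)
		strcpy(toy_slots[slot].pathname, p);
	++toy_slots[slot].refs_n;
	*handle = slot;
	*start = toy_slots[slot].image;
	return 0;
}

static uint64_t toy_close(uint64_t handle)
{
	if (handle >= TOY_SLOTS_N || toy_slots[handle].refs_n == 0)
		return 1; /* unknown handle or unbalanced close */
	--toy_slots[handle].refs_n; /* the slot becomes free when it reaches 0 */
	return 0;
}

static struct loader64 toy_loader = { toy_open, toy_close, sizeof(struct loader64) };
```
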
EDLF64 is about loading only one RWX memory segment.
An EDLF64 executable may honor the following environment variable in order to look up dynamic
libraries. Of course, only on platforms where it is possible. Incompatible platforms should
define their own EDLF64_LIBRARY_PATH.

EDLF64_LIBRARY_PATH environment variable, used to look up EDLF64 dynamic libraries: byte string
ending with a 0x00 byte. Each path from EDLF64_LIBRARY_PATH is prepended to a dynamic library name.

We use the char '%' to escape the path separator ':'. To have a '%' in a path, you must double
the %, "%%", and to have a ':' in a path not being interpreted as a separator, use "%:". "::" means
an empty path, usually the current working directory at the time of the lookup. Any invalid
combination of '%' with a non handled char (anything but '%' or ':') will result in the path being
skipped.
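
The escaping rules above can be sketched as a small iterator over the variable value. The function
name, its return convention and the caller-provided buffer are hypothetical. Assumptions of this
sketch: a '%' right before the final 0x00 byte counts as an invalid combination, a path which does
not fit the buffer is skipped too, and a trailing ':' yields one last empty path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode the next path of an EDLF64_LIBRARY_PATH value. *cursor must first
 * point at the start of the value. Returns 1 with the unescaped path in out,
 * 0 if this path must be skipped (out content is then meaningless), -1 when
 * there are no more paths.
 */
static int edlf64_path_next(const char **cursor, char *out, size_t out_bytes_n)
{
	const char *s;
	size_t n = 0;
	int valid = 1;

	assert(cursor != NULL && out != NULL && out_bytes_n != 0);
	s = *cursor;
	if (s == NULL)
		return -1;
	for (;;) {
		char c = *s;

		if (c == '\0') { /* end of the variable: this was the last path */
			*cursor = NULL;
			break;
		}
		++s;
		if (c == ':') { /* unescaped separator: the next path starts after it */
			*cursor = s;
			break;
		}
		if (c == '%') {
			if (*s == '%' || *s == ':') {
				c = *s++; /* "%%" gives '%', "%:" gives ':' */
			} else {
				valid = 0; /* invalid escape: keep scanning up to the separator */
				continue;
			}
		}
		if (n + 1 < out_bytes_n)
			out[n++] = c;
		else
			valid = 0; /* does not fit the caller buffer */
	}
	out[n] = '\0';
	return valid;
}
```

A caller would loop until -1 is returned, prepend each valid path to the dynamic library name, and
treat an empty path as the current working directory.
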
Prototype of process_entry function. WARNING: there is no stack, the arguments must be passed in
registers. This is, very probably, not a classic ABI function call.

	void process_entry(
		void *process_info,		/* Virtual address */
		uint64_t process_info_bytes_n);	/* size in bytes */

process_info is, on compatible platforms, arg, env and aux with their respective data, and they are
mostly the same as in the SYSV ABI (arg has no argc). The process_entry function will have to set up
its stack: it could be a "mmap" system call or a part of the "bss", namely from the extra memory
past the end of the loaded file (including a guard memory page or not), or bluntly book some room
in the file, etc.
On "mmap/munmap" platforms, process_info must be cleanly munmap-able. Other platforms may implement
another mechanism to let process_entry remove/free/reuse such bytes from the process.

For the executable, you may need its path in order to load private dynamic libraries based on
its file system location. On linux, it is possible only if the /proc file system is mounted, and it
will target the actual executable via the pathname in the symbolic link /proc/self/exe (all symbolic
links are resolved in this pathname). Namely, without a mounted /proc, the executable will need a
private method to look up its private dynamic libraries (similar to ELF LD_ORIGIN environment
variable).
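
On linux, getting the directory of the executable from /proc/self/exe could look like the sketch
below. The function name and its return convention are hypothetical. It uses the libc "readlink"
wrapper for brevity, whereas a real EDLF64 executable would more likely issue the system call
directly.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write the directory of the running executable into out, with a trailing
 * '/' and a terminating 0x00 byte. Returns its length in bytes, or -1 if it
 * cannot be found (no mounted /proc, buffer too small, etc).
 */
static long exe_dir(char *out, size_t out_bytes_n)
{
	ssize_t n;

	assert(out != NULL && out_bytes_n >= 2);
	/* readlink does not write a terminating 0x00 byte, keep room for ours */
	n = readlink("/proc/self/exe", out, out_bytes_n - 1);
	if (n <= 0)
		return -1;
	if ((size_t)n == out_bytes_n - 1)
		return -1; /* the pathname was possibly truncated */
	while (n > 0 && out[n - 1] != '/')
		--n; /* drop the executable file name */
	if (n == 0)
		return -1;
	out[n] = '\0';
	return (long)n;
}
```

The result can then be prepended to the names of the private dynamic libraries before calling open
from the loader.
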
C prototype of resolve function:

	uint64_t resolve(	/*INPUT*/ uint64_t *symbol_id,
				/*OUTPUT*/ void **symbol_virtual_address);

symbol_id is a 64bits unique id identifying a symbol (similar to kernel syscalls). Like kernel
syscalls, those symbol ids must be _EXTREMELY_ stable over time. You could segment the id space into
categories (init calls, fini calls, etc, 64bits is huge).
Return 0 if ok, with the virtual address of the symbol via the symbol_virtual_address argument.
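
A minimal sketch of such a resolve function, for a dynamic library exporting two functions and one
data symbol. All the demo_* names and the id values are hypothetical, including the idea of putting
a category in the top byte of the id. The sketch follows the prototype above as written, namely
symbol_id is passed via a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ids: the top byte is a category, the rest a stable number. */
#define DEMO_SYM_INIT    UINT64_C(0x0000000000000001) /* category 0x00: init calls */
#define DEMO_SYM_FINI    UINT64_C(0x0100000000000001) /* category 0x01: fini calls */
#define DEMO_SYM_VERSION UINT64_C(0x0200000000000001) /* category 0x02: data */

static uint64_t demo_version = 3;
static uint64_t demo_init(void) { return 0; }
static void demo_fini(void) {}

/* Return 0 and the symbol virtual address if the id is known, non-zero otherwise. */
static uint64_t demo_resolve(uint64_t *symbol_id, void **symbol_virtual_address)
{
	assert(symbol_id != NULL && symbol_virtual_address != NULL);
	switch (*symbol_id) {
	case DEMO_SYM_INIT:
		*symbol_virtual_address = (void *)(uintptr_t)demo_init;
		return 0;
	case DEMO_SYM_FINI:
		*symbol_virtual_address = (void *)(uintptr_t)demo_fini;
		return 0;
	case DEMO_SYM_VERSION:
		*symbol_virtual_address = &demo_version;
		return 0;
	default:
		return 1; /* unknown id, the output is left untouched */
	}
}
```

The caller casts the returned virtual address back to the type it expects for that id, which is
why the ids and the types behind them must never change once published.
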
Recommended C prototype of an init function:
	uint64_t init(	/*INPUT*/ void *process_info,uint64_t process_info_bytes_n,void *pathname,
			          void *loader64,
			/*OUTPUT*/ void **fini);

The process_info here may be a variant from the one provided by the kernel to process_entry.
Pathname is a pointer to the pathname used to load this edlf64 file. If successful, it should return
0. Do not expect process_info to stay in memory after the call.

Alternative C prototype of an init function:
	uint64_t init(	/*INPUT*/ void *process_info,uint64_t process_info_bytes_n,int fd,
			          void *loader64,
			/*OUTPUT*/ void **fini);

Same as the previous one, but with the process file descriptor used to load/mmap the edlf64
file instead of the pathname. You could even add the directory file descriptor.

Recommended C prototype of fini function:
	void fini(void);
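
A minimal sketch of an init/fini pair keeping the dynamic library state consistent with reference
counting, using the recommended prototypes. The lib_* names and the "state" flag are hypothetical,
and there is no locking, so a real dynamic library used from several threads would need to protect
the counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t lib_refs_n;      /* init calls not yet balanced by a fini call */
static uint64_t lib_state_ready; /* stand-in for the real library state */

static void lib_fini(void)
{
	if (lib_refs_n == 0)
		return; /* unbalanced fini call, nothing to do */
	if (--lib_refs_n == 0)
		lib_state_ready = 0; /* last user is gone: tear the state down */
}

static uint64_t lib_init(void *process_info, uint64_t process_info_bytes_n, void *pathname,
			 void *loader64, void **fini)
{
	/* unused by this sketch, remember process_info may vanish after the call */
	(void)process_info;
	(void)process_info_bytes_n;
	(void)pathname;
	(void)loader64;

	assert(fini != NULL);
	if (lib_refs_n++ == 0)
		lib_state_ready = 1; /* first user: build the state once */
	*fini = (void *)(uintptr_t)lib_fini; /* saves the caller a resolve call */
	return 0;
}
```
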

No static and implicit TLS anymore, explicit initialization is required, for instance via the
pthread dynamic library on POSIX-like systems.

errno would be accessed via an all-thread shared pthread key, or an additional parameter, or
bluntly trashed. errno would become a macro using that shared pthread key to get the value.
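
The pthread key variant could be sketched as follows, on a POSIX-like system. The demo_* names are
hypothetical. For brevity, the key is created lazily with pthread_once; a stricter reading of
"explicit initialization" would create it from an init function instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t demo_errno_key; /* one key shared by all threads */
static pthread_once_t demo_errno_once = PTHREAD_ONCE_INIT;

static void demo_errno_key_create(void)
{
	/* free() is the destructor releasing the slot of a thread when it exits */
	int r = pthread_key_create(&demo_errno_key, free);

	assert(r == 0);
	(void)r;
}

/* Return the address of the calling thread errno slot, allocated on first use. */
static int *demo_errno_location(void)
{
	int *slot;

	pthread_once(&demo_errno_once, demo_errno_key_create);
	slot = pthread_getspecific(demo_errno_key);
	if (slot == NULL) {
		slot = calloc(1, sizeof(*slot));
		assert(slot != NULL);
		pthread_setspecific(demo_errno_key, slot);
	}
	return slot;
}

/* errno-like macro: usable as an lvalue, each thread sees its own value */
#define demo_errno (*demo_errno_location())
```
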

