File files/EDLF64.draft added (mode: 100644) (index 0000000..90f01d6)

EDLF64 is the JSON/wayland of the executables and dynamic libraries for 64-bit platforms.

EDLF64=Executable and Dynamic Library Format for 64-bit platforms.

The excruciating simplicity of the format is intentional; it still does a good enough job.

Endianness is that of the CPU. Offsets 0->edlf64_hdr_bytes_n-1 are obviously invalid: they lie
inside the header.
====================================================================================================
0x00 "EDLF64",0x00[2],version_b,undef_b[7] (version_b will very probably stay 0x00 forever)
0x08 alignment, power of two (then cannot be 0, namely at least 1), from "EDLF64"
0x10 mem_bytes_n.
0x18 process_entry_file_offset, 0 if this is a dynamic library (register passing, no stack).
0x20 resolve_file_offset, 0 if there are no symbols in this edlf64 file.
----
0x28 edlf64_hdr_bytes_n.
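
A minimal sketch of how a loader could check this header before using it. The names edlf64_hdr,
edlf64_offset_ok and edlf64_hdr_check are hypothetical. Assumptions of this sketch, not statements
of the format: the first 8 bytes are kept opaque apart from the "EDLF64" prefix, and a non-zero
file offset must point past the header and inside the file.

```c
#include <stdint.h>
#include <string.h>

#define EDLF64_HDR_BYTES_N 0x28u /* edlf64_hdr_bytes_n from the table above */

/* Hypothetical C view of the header; field names follow the table above. */
struct edlf64_hdr {
    uint8_t  ident[8];                  /* 0x00 starts with "EDLF64", rest kept opaque here */
    uint64_t alignment;                 /* 0x08 */
    uint64_t mem_bytes_n;               /* 0x10 */
    uint64_t process_entry_file_offset; /* 0x18 */
    uint64_t resolve_file_offset;       /* 0x20 */
};

_Static_assert(sizeof(struct edlf64_hdr) == EDLF64_HDR_BYTES_N, "unexpected header padding");

/* Assumption: 0 means "absent", anything else must be past the header and inside the file. */
static int edlf64_offset_ok(uint64_t offset, uint64_t file_bytes_n)
{
    return offset == 0 || (offset >= EDLF64_HDR_BYTES_N && offset < file_bytes_n);
}

/* Return 0 if the header looks sane, non-zero otherwise. */
static uint64_t edlf64_hdr_check(const void *file, uint64_t file_bytes_n)
{
    struct edlf64_hdr h;

    if (file_bytes_n < EDLF64_HDR_BYTES_N)
        return 1;
    memcpy(&h, file, sizeof(h)); /* CPU endianness, no byte swapping */
    if (memcmp(h.ident, "EDLF64", 6) != 0)
        return 1;
    if (h.alignment == 0 || (h.alignment & (h.alignment - 1)) != 0)
        return 1; /* not a power of two */
    if (!edlf64_offset_ok(h.process_entry_file_offset, file_bytes_n))
        return 1;
    if (!edlf64_offset_ok(h.resolve_file_offset, file_bytes_n))
        return 1;
    return 0;
}
```

The header is copied out with memcpy so that the check does not depend on the alignment of the
buffer holding the file.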
====================================================================================================
A loader64 instance keeps a registry of what was loaded, using reference counting. Usually, there
will be only one static instance of loader64 per process, set up by the process_entry function.

0x00 uint64_t (*open)( /*INPUT*/  void *pathname, /* we presume the pathname is self-sizing */
                       /*OUTPUT*/ uint64_t *handle, void **start);

Must be reentrant/thread safe.

Return 0 if ok, non-zero if an error happened. On success, it provides a loader handle and a
pointer to the start of the loaded dynamic library, namely the first byte of the loaded dynamic
library EDLF64 header.

Init functions, if any, should be resolved and called right after open. Usually, their C
prototypes include pathname; see below for a recommended init function prototype.

If pathname targets an already loaded file, the loader will return the same handle/start. On
linux, the triplet (dev_major/dev_minor/inode) should define file system uniqueness.

The _CALLER_ may use, on platforms which can support it, the EDLF64_LIBRARY_PATH environment
variable to build the pathnames, or any other private means (similar to ELF DT_RUNPATH). See
below for a description of EDLF64_LIBRARY_PATH.

NOTE: we use a handle in order to avoid having to map a start virtual address back to some loader
internals (for instance, the handle could directly be an offset into such internals).

0x08 uint64_t (*close)(/*INPUT*/ uint64_t handle);

Must be reentrant/thread safe. Return 0 if ok, non-zero if something went wrong while closing
the edlf64 file.

Fini functions, if any, should be called (they may have to be resolved if not provided by the
init functions) right before the close call.

Init/fini functions have the responsibility to keep the dynamic library state consistent (for
instance using reference counting like a loader instance).
----
0x10 loader64_bytes_n.
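
A toy sketch of the registry and reference counting behind open/close. Everything named toy_*,
reg_slot and REG_SLOTS_N is hypothetical. Loud assumptions: entries are keyed by the pathname
string instead of (dev_major/dev_minor/inode), a static buffer stands in for the mapped file, and
there is no locking, so this sketch is NOT thread safe as the text above requires.

```c
#include <stdint.h>
#include <string.h>

#define REG_SLOTS_N 8

struct loader64 {
    uint64_t (*open)(void *pathname, uint64_t *handle, void **start); /* 0x00 */
    uint64_t (*close)(uint64_t handle);                               /* 0x08 */
};

struct reg_slot {
    char     pathname[256];
    void    *start;
    uint64_t refs_n; /* 0 means the slot is free */
};

static struct reg_slot reg[REG_SLOTS_N];
static unsigned char fake_image[REG_SLOTS_N][64]; /* stands in for the mapped files */

static uint64_t toy_open(void *pathname, uint64_t *handle, void **start)
{
    const char *p = pathname;
    uint64_t i, free_i = REG_SLOTS_N;

    if (strlen(p) >= sizeof(reg[0].pathname))
        return 1;
    for (i = 0; i < REG_SLOTS_N; ++i) {
        if (reg[i].refs_n == 0) {
            if (free_i == REG_SLOTS_N)
                free_i = i; /* remember the first free slot */
            continue;
        }
        if (strcmp(reg[i].pathname, p) == 0) {
            ++reg[i].refs_n; /* already loaded: same handle/start */
            *handle = i;
            *start = reg[i].start;
            return 0;
        }
    }
    if (free_i == REG_SLOTS_N)
        return 1; /* registry full */
    strcpy(reg[free_i].pathname, p);
    reg[free_i].start = fake_image[free_i];
    reg[free_i].refs_n = 1;
    *handle = free_i; /* the handle is directly an index into the registry */
    *start = reg[free_i].start;
    return 0;
}

static uint64_t toy_close(uint64_t handle)
{
    if (handle >= REG_SLOTS_N || reg[handle].refs_n == 0)
        return 1;
    --reg[handle].refs_n; /* a real loader would unmap the file when this reaches 0 */
    return 0;
}

static const struct loader64 toy_loader = { toy_open, toy_close };
```

Opening the same pathname twice through toy_loader.open gives the same handle and start, and the
slot is only released after two matching toy_loader.close calls.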
====================================================================================================
EDLF64 is about loading only one RWX memory segment.
====================================================================================================
An EDLF64 executable may honor the following environment variable in order to look up dynamic
libraries. Of course, only on platforms where it is possible. Platforms where it is not should
define their own equivalent of EDLF64_LIBRARY_PATH.

EDLF64_LIBRARY_PATH environment variable to look up EDLF64 dynamic libraries: byte string
ending with a 0x00 byte. Each path from EDLF64_LIBRARY_PATH is prepended to a dynamic library name.

The char '%' is used to escape the path separator ':'. To have a '%' in a path, you must double
the %, "%%", and to have a ':' in a path not being interpreted as a separator, use "%:". "::" means
an empty path, usually the current working directory at the time of the lookup. Any invalid
combination, namely a '%' followed by a non-handled char (anything but '%' or ':'), will result in
the path being skipped.
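
The escaping rules above can be sketched as a small iterator over the variable. The function name
edlf64_path_next is hypothetical. Assumptions of this sketch: a leading or trailing ':' also
yields an empty path, and a path which does not fit in the output buffer is skipped like an
invalid one.

```c
#include <stddef.h>

/*
 * Decode the next path of an EDLF64_LIBRARY_PATH-like string.
 * *cursor starts on the 0x00 terminated variable value (or NULL if unset).
 * Return 1 with the decoded path in out, 0 once the list is exhausted.
 */
static int edlf64_path_next(const char **cursor, char *out, size_t out_bytes_n)
{
    const char *p = *cursor;

    if (out_bytes_n == 0)
        return 0;
    while (p != NULL) {
        size_t n = 0;
        int valid = 1;

        for (;;) {
            char c = *p;

            if (c == '\0' || c == ':')
                break; /* end of this path */
            ++p;
            if (c == '%') {
                c = *p;
                if (c != '%' && c != ':') {
                    valid = 0; /* invalid escape: the whole path is skipped */
                    continue;
                }
                ++p; /* "%%" gives '%', "%:" gives ':' */
            }
            if (n + 1 < out_bytes_n)
                out[n++] = c;
            else
                valid = 0; /* too long for out (assumption: skip it) */
        }
        p = (*p == ':') ? p + 1 : NULL; /* NULL once the 0x00 byte is reached */
        if (valid) {
            out[n] = '\0';
            *cursor = p;
            return 1;
        }
    }
    *cursor = NULL;
    return 0;
}
```

A caller would loop on edlf64_path_next, prepend each decoded path to the dynamic library name,
and hand the resulting pathname to the loader open function.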
====================================================================================================
Prototype of the process_entry function. WARNING: there is no stack, the arguments must be passed
in registers. This is, very probably, not a classic ABI function call.

void process_entry(
    void *process_info,            /* Virtual address */
    uint64_t process_info_bytes_n  /* size in bytes */
);

process_info is, on compatible platforms, arg, env and aux with their respective data, and they are
mostly the same as in the SYSV ABI (arg has no argc). The process_entry function will have to set up
its stack: it could be a "mmap" system call, or a part of the "bss", namely from the extra memory
past the end of the loaded file (including a guard memory page or not), or bluntly booking some
room in the file, etc.
On "mmap/munmap" platforms, process_info must be cleanly munmap-able. Other platforms may implement
another mechanism to let process_entry remove/free/reuse such bytes from the process.

For the executable, you may need its path in order to load private dynamic libraries based on
its file system location. On linux, it is possible only if the /proc file system is mounted: the
symbolic link /proc/self/exe targets the actual executable (all symbolic links are resolved in
this pathname). Namely, without a mounted /proc, the executable will need a private method to look
up its private dynamic libraries (similar to ELF LD_ORIGIN environment variable).
====================================================================================================
C prototype of the resolve function:

uint64_t resolve( /*INPUT*/  uint64_t *symbol_id,
                  /*OUTPUT*/ void **symbol_virtual_address);

symbol_id is a 64-bit unique id identifying a symbol (similar to kernel syscalls). Like kernel
syscalls, those symbol ids must be _EXTREMELY_ stable in time. You could segment the id space with
categories (init calls, fini calls, etc; 64 bits is huge).
Return 0 if ok, with the virtual address of the symbol via the symbol_virtual_address argument.
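
A minimal sketch of what a resolve function could look like inside a dynamic library. The symbol
ids, the MYLIB_* names and the mylib_* functions are hypothetical, as is the way the id space is
segmented (upper 32 bits for the category). Returning non-zero for an unknown id is an assumption
made by analogy with open/close.

```c
#include <stdint.h>

/* Hypothetical ids: upper 32 bits select a category, lower 32 bits a symbol. */
#define MYLIB_SYM_FINI UINT64_C(0x0000000200000000)
#define MYLIB_SYM_ADD  UINT64_C(0x0000000300000001)

static void mylib_fini(void)
{
}

static uint64_t mylib_add(uint64_t a, uint64_t b)
{
    return a + b;
}

uint64_t resolve(uint64_t *symbol_id, void **symbol_virtual_address)
{
    /*
     * Converting a function pointer to void * is not strict ISO C, but it is the
     * usual practice on the 64-bit platforms targeted here (same as dlsym).
     */
    switch (*symbol_id) {
    case MYLIB_SYM_FINI:
        *symbol_virtual_address = (void *)mylib_fini;
        return 0;
    case MYLIB_SYM_ADD:
        *symbol_virtual_address = (void *)mylib_add;
        return 0;
    default:
        return 1; /* unknown id (assumption: non-zero means error) */
    }
}
```

The caller casts the returned virtual address back to the function type it expects for that id,
which is why the ids and the prototypes behind them have to stay stable.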
====================================================================================================
Recommended C prototype of an init function:
uint64_t init( /*INPUT*/  void *process_info,uint64_t process_info_bytes_n,void *pathname,
                          void *loader64,
               /*OUTPUT*/ void **fini);

The process_info here may be a variant of the one provided by the kernel to process_entry.
pathname is a pointer to the pathname used to load this edlf64 file. If successful, init should
return 0. Do not expect process_info to stay in memory after the call.

Alternative C prototype of an init function:
uint64_t init( /*INPUT*/  void *process_info,uint64_t process_info_bytes_n,int fd,
                          void *loader64,
               /*OUTPUT*/ void **fini);

Same thing as the previous one, but with the process file descriptor used to load/mmap the edlf64
file instead of the pathname. You could even add the directory file descriptor.

Recommended C prototype of a fini function:
void fini(void);
----------------------------------------------------------------------------------------------------
NOTES:

No static and implicit TLS anymore: explicit initialization is required, for instance via the
pthread dynamic library on POSIX-like systems.

errno would be accessed via an all-thread shared pthread key, or an additional parameter, or
bluntly trashed. errno would become a macro using that shared pthread key to get the value.