File files/EDLF64.draft deleted (index 0281493..0000000) |
EDLF64 is the JSON/wayland of executable and dynamic library formats for 64-bit platforms with a
reentrancy lock mechanism.

EDLF64=_E_xecutable and _D_ynamic _L_ibrary _F_ormat for _64_bits platforms with a reentrancy lock
mechanism.

The excruciating simplicity of the format is intentional, while still doing a good enough job.

To avoid a circular dependency with a high level threading dynamic library, the hardware
architecture, with or without the kernel, must provide a pre-inited/ready to use reentrancy lock
mechanism. On many-core systems, reentrancy lock means thread safe, and it is usually an atomic
compare and exchange hardware instruction or kernel syscall on memory locations which are already
inited with some specific content.
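A minimal sketch of such a lock, assuming a C11 compiler with <stdatomic.h>. The names
reentrancy_try_lock, reentrancy_unlock, LOCK_FREE and LOCK_HELD are hypothetical, as is the choice
of a 32-bit lock word. The "already inited" memory location is modelled here by a statically
initialized global.

```c
#include <stdatomic.h>
#include <stdint.h>

#define LOCK_FREE 0u /* hypothetical "specific content" meaning unlocked */
#define LOCK_HELD 1u

/* pre-inited at load time: no init call is needed before the first use */
_Atomic uint32_t example_lock = LOCK_FREE;

/* returns 0 if the lock was taken, 11 if somebody else already holds it */
uint64_t reentrancy_try_lock(_Atomic uint32_t *lock)
{
    uint32_t expected = LOCK_FREE;

    /* atomic compare and exchange: LOCK_FREE -> LOCK_HELD or nothing */
    if (atomic_compare_exchange_strong(lock, &expected, LOCK_HELD))
        return 0;
    return 11;
}

void reentrancy_unlock(_Atomic uint32_t *lock)
{
    atomic_store(lock, LOCK_FREE);
}
```

The try-lock never blocks: a busy lock is reported to the caller, which matches the "11" return
values used by the loader and resolve functions.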
====================================================================================================
0x00 "EDLF64",0x00,version_b (version_b will very probably stay 0x00 forever).
0x08 alignment, power of two (then cannot be 0, namely at least 1), from "EDLF64".
0x10 mem_bytes_n (including the header).
0x18 process_entry_file_offset, 0 if this is a dynamic library (register passing, no stack).
0x20 resolve_file_offset, 0 if there are no symbols in this edlf64 file.
----
0x28 edlf64_hdr_bytes_n.
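The header layout above could be mapped to C as follows. This is only a sketch: the struct and
function names are hypothetical, the fields are assumed to be stored in the native byte order of
the platform (the format text does not say), and the sanity checks on mem_bytes_n and on the file
offsets are assumptions, not requirements stated by the format.

```c
#include <stdint.h>
#include <string.h>

#define EDLF64_HDR_BYTES_N 0x28u

struct edlf64_hdr {
    uint8_t  ident[8];                  /* 0x00 "EDLF64",0x00,version_b */
    uint64_t alignment;                 /* 0x08 power of two */
    uint64_t mem_bytes_n;               /* 0x10 including the header */
    uint64_t process_entry_file_offset; /* 0x18 0 for a dynamic library */
    uint64_t resolve_file_offset;       /* 0x20 0 if no symbols */
};

/* copies the header out of the file bytes, returns 0 if it looks sane */
uint64_t edlf64_hdr_check(const void *file, uint64_t file_bytes_n,
                          struct edlf64_hdr *hdr)
{
    if (file_bytes_n < EDLF64_HDR_BYTES_N)
        return 1;
    memcpy(hdr, file, EDLF64_HDR_BYTES_N);
    if (memcmp(hdr->ident, "EDLF64", 7) != 0) /* 6 chars and the 0x00 */
        return 2;
    if (hdr->alignment == 0 || (hdr->alignment & (hdr->alignment - 1)) != 0)
        return 3; /* not a power of two */
    if (hdr->mem_bytes_n < EDLF64_HDR_BYTES_N)
        return 4; /* assumed check: mem_bytes_n includes the header */
    if (hdr->process_entry_file_offset >= file_bytes_n)
        return 5; /* assumed check: offset must be inside the file */
    if (hdr->resolve_file_offset >= file_bytes_n)
        return 6; /* assumed check: offset must be inside the file */
    return 0;
}
```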
====================================================================================================
A loader64 instance will keep a registry of what was loaded via reference counting. Usually, there
will be only one static instance of loader64 per process, inited by the process_entry function.
The main loader64 instance code is usually dependent only on the hardware architecture and kernel,
to stay independent of all dynamic libraries.

Only one thread can use a loader64 instance at a time. Entry must be guarded by the reentrancy lock
mechanism.

Dynamic libraries should try not to presume they are the only instance in a process: there could be
others loaded by other loader64 instances, or the "same" dynamic library but from different files.
Dodging related conflicts could be expensive, and whether it is supported or not should be clearly
documented in the dynamic library documentation.

uint64_t loader64_open( /*INPUT*/  void *pathname, /* we presume the pathname is self-sizing */
                        /*OUTPUT*/ uint64_t *handle, void **start);

    Return values:
        0       if ok, with a loader handle and the pointer to the start of the loaded dynamic
                library, namely the first byte of the loaded dynamic library EDLF64 header.
        11      (-EAGAIN), if the loader is currently busy.
        other   an error happened.

    If pathname targets an already loaded file, the same handle/start will be returned by the
    loader. On linux, the triplet (dev_major/dev_minor/inode) should define file uniqueness
    on the file system.

    The _CALLER_ may use, on platforms which can support it, the EDLF64_LIBRARY_PATH environment
    variable to build the pathnames, or any other private means (similar to ELF DT_RUNPATH). See
    below for a description of the EDLF64_LIBRARY_PATH.

    NOTE: we use a handle to avoid having to map a start virtual address to some loader
    internals (for instance, the handle could directly be an offset into such internals).

uint64_t loader64_close(/*INPUT*/ uint64_t handle);

    Return values:
        0       if ok. The handle becomes invalid.
        11      (-EAGAIN), if the loader is currently busy. The handle stays valid.
        other   if something went wrong while closing the edlf64 file. The handle becomes
                invalid.
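A sketch of the reference counting registry a loader64 instance could keep, on a POSIX-like
system. Everything here is hypothetical (registry_ref, registry_unref, the fixed number of slots,
the handle being a slot index) and the actual loading/mapping of the file is left out. File
uniqueness is decided with st_dev (which encodes dev_major/dev_minor) and st_ino from stat(). The
caller is assumed to already hold the reentrancy lock.

```c
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

#define REGISTRY_SLOTS_N 16

struct slot {
    dev_t    dev;    /* dev_major/dev_minor */
    ino_t    ino;    /* inode */
    uint64_t refs_n; /* 0 means the slot is free */
};

static struct slot registry[REGISTRY_SLOTS_N];

/* returns 0 and a handle (here, the slot index), non-zero on error */
uint64_t registry_ref(const char *pathname, uint64_t *handle)
{
    struct stat st;
    uint64_t i;
    uint64_t free_slot = REGISTRY_SLOTS_N;

    if (stat(pathname, &st) != 0)
        return 1;
    for (i = 0; i < REGISTRY_SLOTS_N; ++i) {
        if (registry[i].refs_n == 0) {
            if (free_slot == REGISTRY_SLOTS_N)
                free_slot = i; /* remember the first free slot */
            continue;
        }
        if (registry[i].dev == st.st_dev && registry[i].ino == st.st_ino) {
            ++registry[i].refs_n; /* already loaded: same handle */
            *handle = i;
            return 0;
        }
    }
    if (free_slot == REGISTRY_SLOTS_N)
        return 2; /* registry full */
    registry[free_slot].dev = st.st_dev;
    registry[free_slot].ino = st.st_ino;
    registry[free_slot].refs_n = 1;
    *handle = free_slot;
    return 0;
}

/* returns 0 if ok, non-zero if the handle is invalid */
uint64_t registry_unref(uint64_t handle)
{
    if (handle >= REGISTRY_SLOTS_N || registry[handle].refs_n == 0)
        return 1;
    --registry[handle].refs_n; /* the slot is free again once it reaches 0 */
    return 0;
}
```

Two different pathnames targeting the same file (for instance "/" and "/.") get the same handle,
and the slot is released only once every reference is dropped.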
====================================================================================================
EDLF64 is about loading only one RWX memory segment.
====================================================================================================
An EDLF64 file may honor the following environment variable in order to look up dynamic
libraries. Of course, only on platforms where it is possible. Incompatible platforms may
define their own "EDLF64_LIBRARY_PATH way" (which could have a conflicting name separator).

EDLF64_LIBRARY_PATH is the environment variable used to look up EDLF64 dynamic libraries: a byte
string ending with a 0x00 byte. Each path from EDLF64_LIBRARY_PATH is prepended to a dynamic
library name.

The char '%' is used to escape the path separator ':'. To have a '%' in a path, you must double
the '%', "%%", and to have a ':' in a path without it being interpreted as a separator, use "%:".
"::" means an empty path, usually the current working directory at the time of the lookup. Any
combination of '%' with a char other than the handled ones ('%' or ':') is invalid and will result
in the path being skipped.
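The escaping rules above can be sketched as a small path iterator. The function name
edlf64_path_next, its return codes and the choice to also skip a path which does not fit in the
output buffer are assumptions made for this example.

```c
#include <stddef.h>

/*
 * Decodes the next path of an EDLF64_LIBRARY_PATH value into out.
 * Returns 1 if a path was decoded, 0 if the path must be skipped (invalid '%'
 * combination or out too small), -1 if there are no more paths.
 * *cursor is moved to the next path, or set to NULL after the last one.
 */
int edlf64_path_next(const char **cursor, char *out, size_t out_bytes_n)
{
    const char *p = *cursor;
    size_t n = 0;
    int valid = 1;

    if (p == NULL)
        return -1;
    while (*p != 0 && *p != ':') {
        char c = *p++;

        if (c == '%') {
            if (*p == '%' || *p == ':') {
                c = *p++;  /* "%%" gives '%', "%:" gives ':' */
            } else {
                valid = 0; /* invalid combination: skip the whole path */
                continue;
            }
        }
        if (n + 1 < out_bytes_n)
            out[n++] = c;
        else
            valid = 0;     /* does not fit: skip the whole path */
    }
    if (out_bytes_n != 0)
        out[n] = 0;
    *cursor = (*p == ':') ? p + 1 : NULL;
    return valid;
}
```

With the value "/lib%:x:%%y::bad%z:/last", successive calls give "/lib:x", "%y", the empty path,
a skipped path, then "/last".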
====================================================================================================
Prototype of the process_entry function. WARNING: there is no stack, the arguments must be passed
in registers. This is, very probably, not a classic ABI function call.

void process_entry(
    void *process_info,            /* virtual address */
    uint64_t process_info_bytes_n  /* size in bytes */
);

process_info is, on compatible platforms, arg, env and aux with their respective data, and they
are mostly the same as in the SYSV ABI (arg has no argc). The process_entry function will have to
set up its stack: it could be a "mmap" system call or a part of the "bss", namely from the extra
memory past the end of the loaded file (including a guard memory page or not), or bluntly book
some room into the file, etc.

On "mmap/munmap" platforms, process_info must be cleanly munmap-able, namely the process_info
address must fit the requirements of the platform munmap to do so (usually aligned on the size of
the memory page used to mmap it). Other platform types may implement other mechanisms to let
process_entry remove/free/reuse such bytes from the process.

Additionally, for "mmap/munmap" platforms, the executable must be cleanly munmap-able: the aux
vector must provide the address used to mmap the executable, which would be used to munmap it.
Same fate as process_info for the other platform types.

(On linux, the EDLF64 vdso would have to be cleanly munmap-able too).

For the executable, you may need its path in order to load private dynamic libraries based on
its file system location. On linux, it is possible only if the /proc file system is mounted: the
symbolic link /proc/self/exe holds the pathname of the actual executable (all symbolic links are
resolved in this pathname). Namely, without a mounted /proc, the executable will need a private
method to look up its private dynamic libraries (similar to ELF LD_ORIGIN environment variable).
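On linux with a mounted /proc, getting the directory of the executable could look like the
following sketch. It assumes a libc providing readlink() (a libc-free executable would issue the
system call directly), and the function name exe_dir is hypothetical.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Writes the directory of the running executable, with a trailing '/', as a
 * 0x00 terminated byte string in out. Returns the count of bytes before the
 * 0x00, or 0 on failure (no /proc, or out too small).
 */
uint64_t exe_dir(char *out, uint64_t out_bytes_n)
{
    ssize_t n;

    if (out_bytes_n < 2)
        return 0;
    /* readlink does not append a 0x00 and silently truncates */
    n = readlink("/proc/self/exe", out, (size_t)(out_bytes_n - 1));
    if (n <= 0 || (uint64_t)n == out_bytes_n - 1)
        return 0; /* error, or the pathname may have been truncated */
    while (n > 0 && out[n - 1] != '/')
        --n;      /* drop the file name, keep the last '/' */
    out[n] = 0;
    return (uint64_t)n;
}
```

The returned directory can then be prepended to the names of the private dynamic libraries.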
====================================================================================================
C prototype of the resolve function:

uint64_t resolve(
    /*INPUT*/  uint64_t *symbol_id,
    /*OUTPUT*/ void **symbol_virtual_address);

Entry must be guarded by the reentrancy lock mechanism.

symbol_id is a 64-bit unique value identifying a symbol (similar to kernel syscalls). Like kernel
syscalls, those symbol ids must be _EXTREMELY_ stable over time.

Return values:
    0       if ok, with the virtual address of the symbol via the symbol_virtual_address
            argument. No assumption must be made about the location of this virtual address.
    11      (-EAGAIN), if the resolve function cannot be run right now.
    other   the symbol was not found.
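A minimal sketch of a resolve function for a dynamic library exporting two data symbols. The
symbol ids (1 and 2), the exported variables and the linear table are made up for the example. The
guard uses a C11 atomic_flag (test and set), which is a simpler primitive than the compare and
exchange mentioned earlier, but gives the same "busy" behaviour.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical exported symbols */
static uint64_t lib_version = 1;
static uint64_t lib_counter;

struct symbol {
    uint64_t id; /* must stay stable over time, like a syscall number */
    void *virtual_address;
};

static const struct symbol symbols[] = {
    { UINT64_C(1), &lib_version },
    { UINT64_C(2), &lib_counter },
};

static atomic_flag resolve_lock = ATOMIC_FLAG_INIT;

uint64_t resolve(uint64_t *symbol_id, void **symbol_virtual_address)
{
    uint64_t r = 1; /* "other": symbol not found */
    size_t i;

    if (atomic_flag_test_and_set(&resolve_lock))
        return 11;  /* already running in another thread */
    for (i = 0; i < sizeof(symbols) / sizeof(symbols[0]); ++i) {
        if (symbols[i].id == *symbol_id) {
            *symbol_virtual_address = symbols[i].virtual_address;
            r = 0;
            break;
        }
    }
    atomic_flag_clear(&resolve_lock);
    return r;
}
```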
====================================================================================================
In init or fini functions, be very careful about circular dependencies with other components. In
the long run, it is worth providing permanent reentrancy detection which aborts in a very loud
manner.

Init/fini functions have the responsibility to keep the dynamic library state consistent (for
instance using reference counting like a loader instance).

Example C prototype of a basic init function:
uint64_t init(
    /*INPUT*/ void *process_info, uint64_t process_info_bytes_n, void *pathname,
              uint64_t (*loader64_open)(void *pathname, uint64_t *handle, void **start),
              uint64_t (*loader64_close)(uint64_t handle));

The process_info here may be a variant of the one provided by the kernel to process_entry.
pathname is a pointer to the pathname used to load this edlf64 file. If successful, it should
return 0. Do not expect process_info to stay in memory after the call, nor pathname. On x64/x86_64
with the SYSV ABI, if going with more than 6 parameters, just init a transient structure to pass
the whole data, to work around some ABI kludge.

Alternative C prototype of a basic init function:
uint64_t init(
    /*INPUT*/ void *process_info, uint64_t process_info_bytes_n,
              int distribution_dir_fd, int fd,
              uint64_t (*loader64_open)(void *pathname, uint64_t *handle, void **start),
              uint64_t (*loader64_close)(uint64_t handle));

Same thing as the previous one, but with the process file descriptor used to load/mmap the edlf64
file instead of the pathname, supplemented with the process directory file descriptor of the
distribution directory.

Example C prototype of a basic fini function:
void fini(void);
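A sketch of init/fini functions keeping the library state consistent with reference counting,
with a loud abort on reentrancy. To stay short, init takes no parameter here (unlike the
prototypes above), refs_get is a hypothetical helper for inspection, and the caller is assumed to
serialize the calls with the reentrancy lock mechanism.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static uint64_t refs_n; /* init calls not yet balanced by a fini call */
static int busy;        /* set while init or fini is running */

static void enter(const char *who)
{
    if (busy) { /* circular dependency: init/fini called from init/fini */
        fprintf(stderr, "FATAL: reentrant call of %s\n", who);
        abort();
    }
    busy = 1;
}

uint64_t init(void)
{
    enter("init");
    if (refs_n == 0) {
        /* first user: the real setup of the library state goes here */
    }
    ++refs_n;
    busy = 0;
    return 0;
}

void fini(void)
{
    enter("fini");
    if (refs_n != 0 && --refs_n == 0) {
        /* last user: the real teardown of the library state goes here */
    }
    busy = 0;
}

uint64_t refs_get(void)
{
    return refs_n;
}
```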
====================================================================================================
NOTES:

No static and implicit TLS anymore, explicit initialization is required, for instance via the
pthread dynamic library on POSIX-like systems.

errno would be accessed via an all-thread shared pthread key, or an additional parameter, or
bluntly trashed. errno would become a macro using that shared pthread key to get the value.