This note is not stable and may be changed at any time.
ByteCode Overview
- Every bytecode instruction is 4 bytes (32 bits) wide.
- Bytecode can be treated as a byte stream.
- The notation below lists fields in byte order.
- The default format is OPC (1 byte), R1 (1 byte), R2 (1 byte), R3 (1 byte), in that order.
- 256 registers are available (8-bit register index).
- Registers are 64 bits wide.
- An implementation may optionally use 32-bit registers.
- A different register width can break binary compatibility.
- A register can hold a signed integer, an unsigned integer, a float, or an object reference.
- Registers are not typed.
- The compiler or programmer must keep track of the type and select the matching operator.
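The fixed 4-byte layout described above can be sketched as a small decoder. This is only an illustration: the names `Instruction` and `decode_stream` are hypothetical and not part of this note.

```python
from typing import NamedTuple


class Instruction(NamedTuple):
    """One 4-byte instruction in the default OPC, R1, R2, R3 layout."""
    opcode: int
    r1: int
    r2: int
    r3: int


def decode_stream(code: bytes) -> list[Instruction]:
    """Split a byte stream into 4-byte instructions, in byte order."""
    if len(code) % 4 != 0:
        raise ValueError("bytecode length must be a multiple of 4")
    return [Instruction(*code[i:i + 4]) for i in range(0, len(code), 4)]


# NOP followed by ADD.i R1 <- R2 + R3
program = bytes([0x00, 0, 0, 0, 0x10, 1, 2, 3])
decoded = decode_stream(program)
```

Because every field is one byte, no endianness decision is needed for the default format.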
Reserved Operators
| Mnemonic | Code | Description |
| --- | --- | --- |
| NOP | 0x00 XX XX XX | No operation (for alignment) |
Binary Operators
| Mnemonic | Code | Description |
| --- | --- | --- |
| ADD.i | 0x10 R1 R2 R3 | `R1 <- R2 + R3` |
| SUB.i | 0x11 R1 R2 R3 | `R1 <- R2 - R3` |
| MUL.i | 0x12 R1 R2 R3 | `R1 <- R2 * R3` |
| DIV.i | 0x13 R1 R2 R3 | `R1 <- R2 / R3` |
| ADD.u | 0x14 R1 R2 R3 | `R1 <- R2 + R3` |
| SUB.u | 0x15 R1 R2 R3 | `R1 <- R2 - R3` |
| MUL.u | 0x16 R1 R2 R3 | `R1 <- R2 * R3` |
| DIV.u | 0x17 R1 R2 R3 | `R1 <- R2 / R3` |
| ADD.f | 0x18 R1 R2 R3 | `R1 <- R2 + R3` |
| SUB.f | 0x19 R1 R2 R3 | `R1 <- R2 - R3` |
| MUL.f | 0x1A R1 R2 R3 | `R1 <- R2 * R3` |
| DIV.f | 0x1B R1 R2 R3 | `R1 <- R2 / R3` |
| ADD.r | 0x1C R1 R2 R3 | `R1 <- R2.__add( R3 )` |
| SUB.r | 0x1D R1 R2 R3 | `R1 <- R2.__sub( R3 )` |
| MUL.r | 0x1E R1 R2 R3 | `R1 <- R2.__mul( R3 )` |
| DIV.r | 0x1F R1 R2 R3 | `R1 <- R2.__div( R3 )` |
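Since registers are untyped, the `.i`, `.u`, and `.f` suffixes decide how the same 64-bit pattern is interpreted. The sketch below illustrates this for division and float addition. It assumes two's-complement wrap-around and signed division that truncates toward zero, neither of which this note specifies, and all function names are hypothetical.

```python
import struct

MASK64 = (1 << 64) - 1  # registers hold raw 64-bit patterns


def to_signed(bits: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def f64_from_bits(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE 754 binary64 value."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def bits_from_f64(value: float) -> int:
    """Return the 64-bit pattern of an IEEE 754 binary64 value."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


def div_i(a: int, b: int) -> int:
    """DIV.i sketch: signed division, truncating toward zero (assumed)."""
    x, y = to_signed(a), to_signed(b)
    quotient = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        quotient = -quotient
    return quotient & MASK64


def div_u(a: int, b: int) -> int:
    """DIV.u sketch: unsigned division on the raw patterns."""
    return (a // b) & MASK64


def add_f(a: int, b: int) -> int:
    """ADD.f sketch: add two registers interpreted as binary64."""
    return bits_from_f64(f64_from_bits(a) + f64_from_bits(b))


minus_six = (-6) & MASK64  # same bits, two very different meanings
```

With the register holding the bit pattern of -6, `div_i` yields -3 while `div_u` yields a large positive number, which is why the operator must match the intended type.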
Flow Control Operators
| Mnemonic | Code | Description |
| --- | --- | --- |
| JMP | 0x20 DD DD DD | Jump to the address |
| JE | 0x21 DD R1 R2 | `if ((u64)R1 == (u64)R2) goto DD` |
| JG.i | 0x22 DD R1 R2 | `if ((i64)R1 > (i64)R2) goto DD` |
| JL.i | 0x23 DD R1 R2 | `if ((i64)R1 < (i64)R2) goto DD` |
| JG.u | 0x24 DD R1 R2 | `if ((u64)R1 > (u64)R2) goto DD` |
| JL.u | 0x25 DD R1 R2 | `if ((u64)R1 < (u64)R2) goto DD` |
| JE.f | 0x26 DD R1 R2 | `if ((f64)R1 == (f64)R2) goto DD` |
| JG.f | 0x27 DD R1 R2 | `if ((f64)R1 > (f64)R2) goto DD` |
| JL.f | 0x28 DD R1 R2 | `if ((f64)R1 < (f64)R2) goto DD` |
| JE.r | 0x29 DD R1 R2 | `if (*R1.eq(R2)) goto DD` |
| JG.r | 0x2A DD R1 R2 | `if (*R1.gt(R2)) goto DD` |
| JL.r | 0x2B DD R1 R2 | `if (*R1.lt(R2)) goto DD` |
| JT.r | 0x2C DD R1 R2 | `if (*R1.type == *R2) goto DD` |
| jmp_if_? | 0x2D DD R1 CC | `if (CC(R1)) goto DD` |
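As a rough sketch of how the integer conditional jumps above might be evaluated, the code below computes the next program counter for JE, JG.i, JL.i, JG.u, and JL.u. It assumes DD is an absolute instruction index, which this note does not state, and the helper names are hypothetical.

```python
def to_signed(bits: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def branch_taken(opcode: int, a: int, b: int) -> bool:
    """Evaluate the condition of an integer conditional jump."""
    if opcode == 0x21:  # JE
        return a == b
    if opcode == 0x22:  # JG.i
        return to_signed(a) > to_signed(b)
    if opcode == 0x23:  # JL.i
        return to_signed(a) < to_signed(b)
    if opcode == 0x24:  # JG.u
        return a > b
    if opcode == 0x25:  # JL.u
        return a < b
    raise ValueError(f"not an integer conditional jump: {opcode:#04x}")


def next_pc(pc: int, instr: tuple[int, int, int, int], regs: list[int]) -> int:
    """Return the next instruction index (DD treated as absolute: assumption)."""
    opcode, dd, r1, r2 = instr
    return dd if branch_taken(opcode, regs[r1], regs[r2]) else pc + 1


regs = [0] * 256
regs[1] = (1 << 64) - 1  # -1 as signed, the maximum value as unsigned
regs[2] = 1
```

The same register contents take the branch under JG.u but not under JG.i, which mirrors the casts shown in the table.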
CC conditional list
| id | condition | description |
| --- | --- | --- |
| 0x00 | type_nil | Not valid reference |
| 0x01 | type_boolean | |
| 0x02 | type_signed | If this object can be loaded as signed integer |
| 0x03 | type_i8 | |
| 0x04 | type_i16 | |
| 0x05 | type_i32 | |
| 0x06 | type_i64 | |
| 0x07 | type_unsigned | If this object can be loaded as unsigned integer |
| 0x08 | type_u8 | |
| 0x09 | type_u16 | |
| 0x0A | type_u32 | |
| 0x0B | type_u64 | |
| 0x0C | type_float | If this object can be loaded as float |
| 0x0D | type_f32 | |
| 0x0E | type_f64 | |
| 0x0F | type_array | |
| 0x10 | type_map | |
| 0x11 | type_name | |
| 0x12 | type_function | |
| 0x13 | type_closure | |
Load operators
| Mnemonic | Code | Description |
| --- | --- | --- |
| load_imm_int | 0x30 R1 DD DD | Sign-extend DDDD to int64 and load it into R1 |
| load_imm_uint | 0x31 R1 DD DD | Zero-extend DDDD to uint64 and load it into R1 |
| load_imm_float | 0x32 R1 DD DD | Extend binary16 DDDD to binary64 and load it into R1 |
| load_const | 0x33 R1 DD DD | Load const pool entry DDDD into R1 |
| load_global | 0x34 R1 DD DD | Load global const pool entry DDDD into R1 |
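The three immediate loads differ only in how the 16-bit DDDD value is widened to 64 bits. The sketch below shows each widening on an already assembled 16-bit integer. The byte order of the two DD bytes inside the instruction is not specified here, and the function names simply reuse the mnemonics for illustration.

```python
import struct

MASK64 = (1 << 64) - 1


def load_imm_int(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a 64-bit register pattern."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:
        imm16 -= 1 << 16
    return imm16 & MASK64


def load_imm_uint(imm16: int) -> int:
    """Zero-extend a 16-bit immediate to a 64-bit register pattern."""
    return imm16 & 0xFFFF


def load_imm_float(imm16: int) -> int:
    """Widen an IEEE 754 binary16 immediate to a binary64 register pattern."""
    half = struct.unpack(">e", struct.pack(">H", imm16 & 0xFFFF))[0]
    return struct.unpack(">Q", struct.pack(">d", half))[0]
```

For example, the immediate `0xFFFE` becomes -2 under `load_imm_int` and 65534 under `load_imm_uint`, while `0x3C00` is the binary16 encoding of 1.0.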
Function Call Operators
| Mnemonic | Code | Description |
| --- | --- | --- |
| INVOKE | 0x30 R1 R2 NN | Call function. `*R1(R2, R3, … R2+NN)` |
| RETURN | 0x31 R1 | Return from function. |
Types
- Types
- Boolean represents true or false
- Nil represents nil
- Signed represents a signed integer
- Unsigned represents an unsigned integer
- Float represents an IEEE 754 double-precision floating-point number, including NaN and Infinity
- Map represents key-value pairs of objects
- Array represents a sequence of objects
- String represents a UTF-8 string
- Extension represents a tuple of type information and a byte array where type information is an integer whose meaning is defined by applications or MessagePack specification
Type Identifier
| ccc | int | uint | float |
| --- | --- | --- | --- |
| 000 | int8 | uint8 | - |
| 001 | int16 | uint16 | binary16 |
| 010 | int32 | uint32 | binary32 |
| 011 | int64 | uint64 | binary64 |
| 100 | - | - | binary128 |
| 101 | - | - | - |
| 110 | - | - | - |
| 111 | va_int | va_uint | mp |
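The `ccc` field above can be read as a width selector. A minimal lookup might look like the following, where `width_bits` is a hypothetical helper that returns `None` for the variable-length or multi-precision code `111` and rejects the unassigned codes.

```python
from typing import Optional

# Fixed widths taken from the ccc table; 0b111 is the variable-length code.
INT_BITS = {0b000: 8, 0b001: 16, 0b010: 32, 0b011: 64}
FLOAT_BITS = {0b001: 16, 0b010: 32, 0b011: 64, 0b100: 128}
VARIABLE = 0b111


def width_bits(kind: str, ccc: int) -> Optional[int]:
    """Return the bit width for a ccc code, or None for variable length."""
    if kind not in ("int", "uint", "float"):
        raise ValueError(f"unknown kind: {kind}")
    if ccc == VARIABLE:
        return None  # va_int, va_uint, or mp
    table = FLOAT_BITS if kind == "float" else INT_BITS
    if ccc not in table:
        raise ValueError(f"ccc {ccc:03b} is not assigned for {kind}")
    return table[ccc]
```

For the assigned fixed-width codes the width equals `8 << ccc`, so the table could also be replaced by a shift plus a range check.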
| DataType | Signature | BON | TVM | Description |
| --- | --- | --- | --- | --- |
| Reserved | 0000 00xx | N | N | |
| Reserved | 0000 0100 | N | N | Should never be shown (treated as nil) |
| Null | 0000 0101 | Y | Y | Null Object |
| Boolean | 0000 011b | Y | Y | b = 0 for false |
| Float | 0000 1ccc | Y | Y | ccc for bits |
| Signed | 0001 0ccc | Y | Y | ccc for bits |
| Unsigned | 0001 1ccc | Y | Y | ccc for bits |
| Dict | 001s ssss | Y | Y* | sssss for length, 11111 for va_len |
| Array | 010s ssss | Y | Y* | sssss for length, 11111 for va_len |
| name | 011s ssss | Y | Y* | sssss for length, 11111 for va_len |
| DomainSpecific | 1xxx xxxx | Y | Y* | MUST be followed by a va_len |
Type Identifier
| DataType | Sign(Bin) | Signature | TBON (1) | TBON (2) | TVM Type | Description |
| --- | --- | --- | --- | --- | --- | --- |
| Reserved | 0000 0000 | 0x00 | N | Y | N | Splitter |
| Reserved | 0000 0001 | 0x01 | N | N | Y | External Reference |
| Reserved | 0000 0010 | 0x02 | N | Y | N | Const Pool |
| Reserved | 0000 0011 | 0x03 | N | Y | N | Reference to Const Pool |
| Reserved | 0000 0100 | 0x04 | Y | Y | Y | Object |
| Nil | 0000 0101 | 0x05 | Y | Y | Y | Nil Object |
| Boolean | 0000 011b | 0x06 | Y | Y | Y | b = 0 for false or type signature |
| Reserved | 0000 1000 | 0x08 | N | N | N | Reserved for float binary8 |
| Float16 | 0000 1001 | 0x09 | Y | Y | Y | IEEE754 binary16 |
| Float32 | 0000 1010 | 0x0A | Y | Y | Y | IEEE754 binary32 |
| Float64 | 0000 1011 | 0x0B | Y | Y | Y | IEEE754 binary64 |
| Float128 | 0000 1100 | 0x0C | Y | Y | Y | IEEE754 binary128 |
| Reserved | 0000 1101 | 0x0D | N | N | N | Reserved for float binary256 |
| Reserved | 0000 1110 | 0x0E | N | N | N | Reserved for float binary512 |
| Reserved | 0000 1111 | 0x0F | N | N | N | Reserved for multi-precision float |
| Signed 8 | 0001 0000 | 0x10 | Y | Y | Y | Signed 8-bit integer |
| Signed 16 | 0001 0001 | 0x11 | Y | Y | Y | Signed 16-bit integer |
| Signed 32 | 0001 0010 | 0x12 | Y | Y | Y | Signed 32-bit integer |
| Signed 64 | 0001 0011 | 0x13 | Y | Y | Y | Signed 64-bit integer |
| Reserved | | 0x14 - 0x16 | N | N | N | Reserved for signed integer 128 - 512 |
| Signed VA | 0001 0111 | 0x17 | Y | Y | Y | Signed variable-length integer |
| Unsigned 8 | 0001 1000 | 0x18 | Y | Y | Y | Unsigned 8-bit integer |
| Unsigned 16 | 0001 1001 | 0x19 | Y | Y | Y | Unsigned 16-bit integer |
| Unsigned 32 | 0001 1010 | 0x1A | Y | Y | Y | Unsigned 32-bit integer |
| Unsigned 64 | 0001 1011 | 0x1B | Y | Y | Y | Unsigned 64-bit integer |
| Reserved | | 0x1C - 0x1E | N | N | N | Reserved for unsigned integer 128 - 512 |
| Unsigned VA | 0001 1111 | 0x1F | Y | Y | Y | Unsigned variable-length integer |
| Array 0-31 | 001s ssss | 0x20 - 0x3E | N | Y | N | Array of 0 - 31 elements |
| Array VA | 0011 1111 | 0x3F | N | Y | Y | Array of variable-length elements |
| Map 0-31 | 010s ssss | 0x40 - 0x5E | N | Y | N | Map of 0 - 31 elements |
| Map VA | 0101 1111 | 0x5F | N | Y | Y | Map of variable-length elements |
| Name 0-31 | 011s ssss | 0x60 - 0x7E | N | Y | N | Name of 0 - 31 characters |
| Name VA | 0111 1111 | 0x7F | N | Y | Y | Name of variable-length characters |
| Reserved | 1xxx xxxx | 0x80 - 0xFF | N | Y | N | Reserved for domain-specific |
- (1) Can be used as a TBON type signature for the element type of an array
- (2) Can be stored in TBON
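Reading the table above, the family of a signature byte can be recovered from its high bits alone. The following sketch follows the bit patterns in that table. The function name and the returned labels are hypothetical, and reserved widths inside a family are not distinguished.

```python
def classify_signature(sig: int) -> str:
    """Map a signature byte to its family, following the table's bit patterns."""
    if not 0 <= sig <= 0xFF:
        raise ValueError("signature must fit in one byte")
    if sig & 0x80:
        return "domain-specific"  # 1xxx xxxx
    family = sig >> 5             # top three bits select 0x20..0x7F families
    if family == 0b001:
        return "array"            # 001s ssss
    if family == 0b010:
        return "map"              # 010s ssss
    if family == 0b011:
        return "name"             # 011s ssss
    if sig >= 0x18:
        return "unsigned"         # 0001 1ccc
    if sig >= 0x10:
        return "signed"           # 0001 0ccc
    if sig >= 0x08:
        return "float"            # 0000 1ccc
    if sig in (0x06, 0x07):
        return "boolean"          # 0000 011b
    if sig == 0x05:
        return "nil"
    return "reserved"             # 0x00 .. 0x04


def short_length(sig: int) -> int:
    """Return the sssss field; 0x1F marks the variable-length form."""
    return sig & 0x1F
```

A dispatcher in a parser could switch on the returned family first and only then inspect `ccc` or `sssss`.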
Notation in diagrams
one byte:
+--------+
| |
+--------+
a variable number of bytes:
+========+
| |
+========+
variable number of objects stored in MessagePack format:
+~~~~~~~~~~~~~~~~~+
| |
+~~~~~~~~~~~~~~~~~+
TBON MAGIC
+------------+------------+------------+------------+
| 0x54 ('T') | 0x42 ('B') | 0x4F ('O') | 0x4E ('N') |
+------------+------------+------------+------------+
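A reader could check the magic bytes before parsing anything else. The snippet below is a minimal sketch that assumes the magic sits at offset 0 of the data. `has_tbon_magic` is a hypothetical helper name.

```python
TBON_MAGIC = b"TBON"  # 0x54 0x42 0x4F 0x4E


def has_tbon_magic(data: bytes) -> bool:
    """Return True if the data starts with the 4-byte TBON magic."""
    return data[:4] == TBON_MAGIC
```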
Nil format stores nil in 1 byte.
nil:
+--------+
| 0x05 |
+--------+
Bool format family stores false or true in 1 byte.
false:
+--------+
| 0x06 |
+--------+
true:
+--------+
| 0x07 |
+--------+
Bytecode Binary Format
Type Tags
Type Identifier
- (1) Type class:
  - (P) for a primitive type,
  - (C) for a composite type,
  - (O) for a container type.
- (2) Whether the type can be used as an Array element type signature:
  - (Y) means the type can be used as an Array element type signature. This covers almost all primitive types.
  - (O) means the type can be used as an Array element type signature, but each element must carry its own type signature.
| DataType | Sign(Bin) | Signature | (1) | (2) | TVM Type | Description |
| --- | --- | --- | --- | --- | --- | --- |
| Invalid Obj | 0000 0000 | 0x00 | - | - | Y | Invalid Object. This should not appear |
| Object Root | 0000 0001 | 0x01 | O | - | N | Object Root, something like `---` in YAML |
| Const Pool | 0000 0010 | 0x02 | C | - | N | Set the const pool for the parser |
| Const Ref | 0000 0011 | 0x03 | C | - | N | Pick up a reference in the const pool |
| Object | 0000 0100 | 0x04 | O | O | Y | Object |
| Nil | 0000 0101 | 0x05 | P | Y | Y | Nil Object |
| Boolean | 0000 011b | 0x06 | P | Y | Y | b = 0 for false or type signature |
| Byte | 0000 1000 | 0x08 | - | Y | N | Bytes |
| Float16 | 0000 1001 | 0x09 | P | Y | Y | IEEE754 binary16 |
| Float32 | 0000 1010 | 0x0A | P | Y | Y | IEEE754 binary32 |
| Float64 | 0000 1011 | 0x0B | P | Y | Y | IEEE754 binary64 |
| Float128 | 0000 1100 | 0x0C | P | Y | Y | IEEE754 binary128 |
| Reserved | 0000 1101 | 0x0D | - | - | N | Reserved for float binary256 |
| Reserved | 0000 1110 | 0x0E | - | - | N | Reserved for float binary512 |
| Reserved | 0000 1111 | 0x0F | - | - | N | Reserved for multi-precision float |
| Signed 8 | 0001 0000 | 0x10 | P | Y | Y | Signed 8-bit integer |
| Signed 16 | 0001 0001 | 0x11 | P | Y | Y | Signed 16-bit integer |
| Signed 32 | 0001 0010 | 0x12 | P | Y | Y | Signed 32-bit integer |
| Signed 64 | 0001 0011 | 0x13 | P | Y | Y | Signed 64-bit integer |
| Reserved | | 0x14 - 0x16 | - | - | N | Reserved for signed integer 128 - 512 |
| Big Int | 0001 0111 | 0x17 | C | - | Y | Reserved for big int |
| Unsigned 8 | 0001 1000 | 0x18 | P | Y | Y | Unsigned 8-bit integer |
| Unsigned 16 | 0001 1001 | 0x19 | P | Y | Y | Unsigned 16-bit integer |
| Unsigned 32 | 0001 1010 | 0x1A | P | Y | Y | Unsigned 32-bit integer |
| Unsigned 64 | 0001 1011 | 0x1B | P | Y | Y | Unsigned 64-bit integer |
| Reserved | | 0x1C - 0x1E | - | - | N | Reserved for unsigned integer 128 - 512 |
| Unsigned VA | 0001 1111 | 0x1F | - | - | Y | Unsigned variable-length integer |
| Name 0-31 | 001s ssss | 0x20 - 0x3E | C | - | N | Name of 0 - 31 characters |
| Name VA | 0011 1111 | 0x3F | C | - | Y | Name of variable-length characters |
| Array 0-31 | 011s ssss | 0x20 - 0x3E | C | - | N | Array of 0 - 31 elements |
| Array VA | 0111 1111 | 0x3F | C | - | Y | Array of variable-length elements |
| Map 0-31 | 010s ssss | 0x40 - 0x5E | C | - | N | Map of 0 - 31 elements |
| Map VA | 0101 1111 | 0x5F | C | - | Y | Map of variable-length elements |
| Extension | 1xxx xxxx | 0x80 - 0xFF | - | - | N | Reserved for domain-specific |
Object Root (Signature: 0x00)
Object Root is a special type signature that indicates the start of a new object. It is similar to `---` in YAML.
Object Root:
+--------+~~~~~~~~~~~~~~~~~+
| 0x00 | Object Content |
+--------+~~~~~~~~~~~~~~~~~+
nil Object (Signature: 0x05)
Nil format stores nil in 1 byte.
Nil is not null. Null values should not be used in the system.
nil:
+--------+
| 0x05 |
+--------+
Bool format family stores false or true in 1 byte.
false:
+--------+
| 0x06 |
+--------+
true:
+--------+
| 0x07 |
+--------+
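The nil and bool encodings above are single signature bytes with no payload. A minimal round-trip sketch, with hypothetical helper names, could look like this:

```python
_SIMPLE_VALUES = {0x05: None, 0x06: False, 0x07: True}


def encode_simple(value) -> bytes:
    """Encode nil (None), false, or true as a single signature byte."""
    if value is None:
        return b"\x05"
    if value is False:
        return b"\x06"
    if value is True:
        return b"\x07"
    raise TypeError("only None, False, and True are handled here")


def decode_simple(data: bytes):
    """Decode a one-byte nil or bool object."""
    if not data or data[0] not in _SIMPLE_VALUES:
        raise ValueError("not a nil or bool signature")
    return _SIMPLE_VALUES[data[0]]
```

The identity checks (`is True`, `is False`) keep the integers 0 and 1 from being mistaken for booleans.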
The float family stores a big-endian IEEE 754 binary value in 3, 5, 9, or 17 bytes.
Currently only binary16, binary32, binary64, and binary128 are supported.
For 8-bit floats such as FP8 E4M3 or FP8 E5M2, use an array of bytes instead.
float16 stores a big-endian IEEE 754 binary16 in 3 bytes:
+--------+--------+--------+
| 0x09 |XXXXXXXX|XXXXXXXX|
+--------+--------+--------+
float32 stores a big-endian IEEE 754 binary32 in 5 bytes:
+--------+--------+--------+--------+--------+
| 0x0A |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+
float64 stores a big-endian IEEE 754 binary64 in 9 bytes:
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| 0x0B |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
float128 stores a big-endian IEEE 754 binary128 in 17 bytes:
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| 0x0C |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+--------+--------+--------+
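The float layouts above map directly onto Python's `struct` formats for big-endian binary16, binary32, and binary64. The sketch below covers those three widths only, since `struct` has no binary128 format. `encode_float` and `decode_float` are hypothetical names.

```python
import struct

# bit width -> (signature byte, big-endian struct format)
_FLOAT_FORMATS = {16: (0x09, ">e"), 32: (0x0A, ">f"), 64: (0x0B, ">d")}


def encode_float(value: float, bits: int = 64) -> bytes:
    """Encode a float as signature byte plus big-endian IEEE 754 payload."""
    if bits not in _FLOAT_FORMATS:
        raise ValueError("this sketch supports 16, 32, and 64 bits only")
    signature, fmt = _FLOAT_FORMATS[bits]
    return bytes([signature]) + struct.pack(fmt, value)


def decode_float(data: bytes) -> float:
    """Decode a float16, float32, or float64 object."""
    for bits, (signature, fmt) in _FLOAT_FORMATS.items():
        if data[0] == signature:
            size = bits // 8
            return struct.unpack(fmt, data[1:1 + size])[0]
    raise ValueError("not a supported float signature")
```

Encoding 1.0 at 16 bits gives the three bytes `09 3C 00`, which matches the 3-byte float16 layout.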
int 8 stores an 8-bit signed integer in 2 bytes:
+--------+--------+
| 0x10 |XXXXXXXX|
+--------+--------+
int 16 stores a 16-bit big-endian signed integer in 3 bytes:
+--------+--------+--------+
| 0x11 |XXXXXXXX|XXXXXXXX|
+--------+--------+--------+
int 32 stores a 32-bit big-endian signed integer in 5 bytes:
+--------+--------+--------+--------+--------+
| 0x12 |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+
int 64 stores a 64-bit big-endian signed integer in 9 bytes:
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| 0x13 |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
uint 8 stores an 8-bit unsigned integer in 2 bytes:
+--------+--------+
| 0x18 |XXXXXXXX|
+--------+--------+
uint 16 stores a 16-bit big-endian unsigned integer in 3 bytes:
+--------+--------+--------+
| 0x19 |XXXXXXXX|XXXXXXXX|
+--------+--------+--------+
uint 32 stores a 32-bit big-endian unsigned integer in 5 bytes:
+--------+--------+--------+--------+--------+
| 0x1A |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+
uint 64 stores a 64-bit big-endian unsigned integer in 9 bytes:
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| 0x1B |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
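The fixed-width integer layouts above share one pattern: a signature byte of `0x10 + ccc` (signed) or `0x18 + ccc` (unsigned), followed by a big-endian payload of `1 << ccc` bytes. A sketch with hypothetical helper names:

```python
_WIDTH_CODE = {8: 0, 16: 1, 32: 2, 64: 3}  # bit width -> ccc


def encode_int(value: int, bits: int, signed: bool) -> bytes:
    """Encode a fixed-width integer as signature byte plus big-endian payload."""
    if bits not in _WIDTH_CODE:
        raise ValueError("bits must be 8, 16, 32, or 64")
    base = 0x10 if signed else 0x18
    # int.to_bytes raises OverflowError if the value does not fit.
    payload = value.to_bytes(bits // 8, "big", signed=signed)
    return bytes([base | _WIDTH_CODE[bits]]) + payload


def decode_int(data: bytes) -> int:
    """Decode a fixed-width signed or unsigned integer object."""
    sig = data[0]
    if not 0x10 <= sig <= 0x1B or (sig & 0x07) > 3:
        raise ValueError("not a fixed-width integer signature")
    signed = sig < 0x18
    size = 1 << (sig & 0x07)
    return int.from_bytes(data[1:1 + size], "big", signed=signed)
```

The variable-length forms (0x17 and 0x1F) are left out because their encoding is not described in this note.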
Array family (Signature: 0x20 - 0x3F)
FixArray stores a sequence of up to 30 objects. S is the element count, and S = 11111 is reserved for the variable-length form:
+---------+-----------+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
|001S SSSS| Signature | S of Objects in Signature type |
+---------+-----------+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
Array stands for a sequence of objects.
+--------+-----------+================+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
| 0x3F | Signature | Length (VA128) | Length objects in Signature type |
+--------+-----------+================+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+
- Valid element signatures are those marked in column (2).
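To make the FixArray layout concrete, the sketch below encodes a short array of uint 8 elements. It rests on one loud assumption: for a primitive element signature such as 0x18, elements are stored as raw payloads without a signature byte of their own. The note does not state this explicitly, and the helper names are hypothetical.

```python
UINT8_SIGNATURE = 0x18
FIX_ARRAY_BASE = 0x20    # 001S SSSS
MAX_FIX_LENGTH = 30      # S = 0b11111 (0x3F) selects the variable-length form


def encode_fix_array_u8(values: list[int]) -> bytes:
    """Encode up to 30 uint 8 values as a FixArray (raw payloads: assumption)."""
    count = len(values)
    if count > MAX_FIX_LENGTH:
        raise ValueError("too many elements for the fixed form")
    # bytes(values) raises ValueError if any element is outside 0..255.
    return bytes([FIX_ARRAY_BASE | count, UINT8_SIGNATURE]) + bytes(values)


def decode_fix_array_u8(data: bytes) -> list[int]:
    """Decode a FixArray whose element signature is uint 8."""
    head, element_signature = data[0], data[1]
    if head >> 5 != 0b001 or head == 0x3F:
        raise ValueError("not a FixArray signature")
    if element_signature != UINT8_SIGNATURE:
        raise ValueError("this sketch only handles uint 8 elements")
    count = head & 0x1F
    body = data[2:2 + count]
    if len(body) != count:
        raise ValueError("truncated array")
    return list(body)
```

The variable-length form (0x3F) is omitted because the VA128 length encoding is not described here.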
Object File Format
SIGNATURE: `TBON`
ConstPool: []
ObjectRoot: {
Symbols: {
int_symbol: 114
float_symbol: 3.14
string_symbol: "Hello, World!"
array_symbol: [1, 2, 3, 4, 5]
closure_symbol: {
// Treated as Map
bytecode: []
constpool: []
locals: []
}
init_closure: {
// Treated as Map
bytecode: []
constpool: []
locals: []
}
}
Declares: {
// For public types and interfaces
}
}