The Biscuit language compiler is a standalone terminal application called blc. It can be compiled from the source code in the GitHub repository or downloaded from the home page as a prebuilt binary executable. All three major operating systems (Windows, macOS and Linux) are supported, but active development currently happens on Windows, and it usually takes some time to port the latest changes to the other platforms. The compiler executable can be found in the bin directory; it is usually a good idea to add this location to the system PATH so that the compiler is accessible from other locations. The following steps describe how to create and compile a simple program.
There are several options which can be passed to the compiler.
blc [options] <source-files>
Option                 Description
---------------------  ------------------------------------------------------------
-b, -build             Use the BL build pipeline.
-h, -help              Print usage information and exit.
-r, -run               Execute the 'main' function at compile time.
-rt, -run-tests        Execute all unit tests at compile time.
-emit-llvm             Write LLVM-IR to file.
-emit-mir              Write MIR to file.
-ast-dump              Print AST.
-lex-dump              Print output of lexer.
-syntax-only           Check syntax and exit.
-no-bin                Don't write binary to disk.
-no-warning            Ignore all warnings.
-no-api                Don't load internal API.
-no-llvm               Disable LLVM backend.
-no-analyze            Disable analyze pass, only parse and exit.
-no-color              Disable colored output.
-verbose               Verbose mode.
-force-test-to-llvm    Force LLVM generation of unit tests.
-configure             Generate config file.
-release-<fast|small>  Enable release mode (when not specified, the compiler uses debug mode by default).
-reg-split-<on|off>    Enable or disable splitting of structures passed into a function by value into registers.
-no-vcvars             Disable injection of Visual Studio environment on Windows.
-di-<dwarf|codeview>   Set debug info format.
Basically every construct in BL follows the same declaration syntax. We define the name of the entity, its type and, optionally, an initial value. The name can usually be used to reference the entity later in the code, and the type describes the layout of the data the entity represents. It could be a number, text or a more complex type.
Possible declarations:
    <name>: <type>;           // mutable declaration
    <name>: [type] = <value>; // mutable declaration
    <name>: [type] : <value>; // immutable declaration (value can be set only once)

    foo: s32;                // integer variable without initial value
    name: string = "Martin"; // string variable
    name: string : "Martin"; // string constant
When we explicitly specify an initial value, the data type can be inferred from this value. In such a case the type is optional.
    name := "Martin"; // string variable
    name :: "Martin"; // string constant
Comments
Comments are ignored by the compiler.
    // this is line comment
    /*
      this is
      multi line comment
    */
Fundamental types are atomic basic types built into the BL compiler.
Name    Description
------  ------------------------------
s8      Signed 8-bit number.
s16     Signed 16-bit number.
s32     Signed 32-bit number.
s64     Signed 64-bit number.
u8      Unsigned 8-bit number.
u16     Unsigned 16-bit number.
u32     Unsigned 32-bit number.
u64     Unsigned 64-bit number.
usize   Unsigned 64-bit size.
bool    Boolean. (true/false)
f32     32-bit floating point number.
f64     64-bit floating point number.
string  String slice.
A pointer represents the address of some allocated data.
*<T>
Example:
    #load "std/test.bl"

    pointers :: fn () #test {
        i := 666;
        i_ptr : *s32 = &i; // take the address of the 'i' variable and store it in 'i_ptr'
        j := ^i_ptr;       // pointer dereferencing
        test_true(j == i);
    };
An array is an aggregate type of multiple values of the same type. The element count must be known at compile time.
[<size>] <T>
Arrays can be initialized inline with a compound block; the type is required. A zero initializer can be used to zero-initialize the whole array storage; otherwise we must specify a value for every element of the array.
{:<T>: [val], ...}
    array_type :: fn () #test {
        arr1 : [10] s32; // declare array variable without explicit initial value
        arr1[0] = 666;

        arr1.len; // yields array element count (s64)
        arr1.ptr; // yields pointer to first element '&arr1[0]'

        // inline initialization of array type
        arr2 := {:[10]s32: 0 };         // initialize whole array to 0
        arr3 := {:[4]s32: 1, 2, 3, 4 }; // initialize array to the sequence 1, 2, 3, 4
    };
Arrays can be implicitly converted to a slice:
    array_to_slice :: fn () #test {
        arr : [10] s32;
        slice : []s32 = arr;
    };
The string type in Biscuit is a slice containing a pointer to the string data and the string length. String literals are always zero terminated.
    string_type :: fn () #test {
        msg : string = "Hello world\n";
        msg.len; // character count of the string
        msg.ptr; // pointer to the string content
    };
An array slice consists of a pointer to the first array element and the array length.
Syntax:
[] <type>
Slice layout:
Slice :: struct { len: s64; ptr: *T };
    array_slice :: fn () #test {
        arr :: {:[4]s32: 1, 2, 3, 4};
        slice : []s32 = arr;
        loop i := 0; i < slice.len; i += 1 {
            print("%\n", slice[i]);
        }
    };
Hint
slice_init can be used to allocate slice on the heap using context allocator.
A structure is a composite type representing a group of data as a single type. Like an array, a structure is a way to define a user data type, but the types of structure members can differ. It is useful when it is better to group data into one unit than to work with separate values.
A structure is declared with the struct keyword.
    Person :: struct {
        id: s32;
        name: string;
        age: s32;
    }
The structure Person in the example consists of id, name and age. Now we can create a variable of this type and fill it with data. To access the person's member fields, use the . operator.
    main :: fn () s32 {
        my_person: Person; // Create instance of type Person
        my_person.id = 1;
        my_person.age = 20;
        my_person.name = "Martin";
        return 0;
    }
Inline initialization is also possible. We can use a compound expression to set all members at once.
    main :: fn () s32 {
        my_person1 := {:Person: 0}; // set all data in person to 0
        my_person2 := {:Person: 1, "Martin", 20};
        return 0;
    }
Structure content can be printed with the print function.
    main :: fn () s32 {
        my_person := {:Person: 1, "Martin", 20};
        print("%\n", my_person);
        return 0;
    }
Output:
    Person {id = 1, name = Martin, age = 20}
Because BL has no OOP support, we cannot declare member functions in structures, and there is no class or object concept in the language. The common way to manipulate data is to pass it into a function as an argument.
    person_add_age :: fn (person: *Person, add: s32) {
        person.age += add;
    }
A structure can extend any type with #base <T>. This is a kind of inheritance similar to the C style, where inheritance is simulated by composition. #base <T> basically inserts base: T; as the first member of the structure. The compiler can use this information later to provide more inheritance-related features, such as merging of scopes to enable direct access to base-type members via the . operator, or an implicit cast from the child to the parent type.
Example of struct extension:
    Entity :: struct {
        id: s32
    }

    // Player has base type Entity
    Player :: struct #base Entity {
        // base: Entity; is implicitly inserted as first member
        name: string
    };

    Wall :: struct #base Entity {
        height: s32
    };

    Enemy :: struct #base Entity {
        health: s32
    };

    // Multi-level extension Boss -> Enemy -> Entity
    Boss :: struct #base Enemy {
        // Extended struct can be empty.
    };

    struct_extending :: fn () #test {
        p: Player;
        p.id = 10; // direct access to base-type members
        p.name = "Travis";
        assert(p.base.id == 10); // access via .base

        w: Wall;
        w.id = 11;
        w.height = 666;

        e: Enemy;
        e.id = 12;
        e.health = 100;

        b: Boss;
        b.id = 13;

        // implicit cast from child to parent type (*Entity)
        update(&p);
        update(&w);
        update(&e);
        update(&b);
    }

    update :: fn (e: *Entity) {
        print("id = %\n", e.id);
    }
A union is a special composite type representing a value of one of multiple types. The union size is always equal to the size of the biggest member type, and the memory offset of all members is the same. A union is usually paired with an enum providing information about the stored type.
    Token :: union {
        as_string: string;
        as_int: s32;
    }

    Kind :: enum {
        String;
        Int;
    }

    test_union :: fn () #test {
        token1: Token;
        token2: Token;

        // Token has total size of the biggest member.
        assert(sizeof(token1) == sizeof(string));

        token1.as_string = "This is string";
        consumer(&token1, Kind.String);

        token2.as_int = 666;
        consumer(&token2, Kind.Int);
    }

    consumer :: fn (token: *Token, kind: Kind) {
        switch kind {
            Kind.String { print("%\n", token.as_string); }
            Kind.Int    { print("%\n", token.as_int); }
            default     { panic(); }
        }
    }
The Any type is a special builtin structure containing a pointer to TypeInfo and a pointer to data. A value of any type can be implicitly cast to this type on a function call.
Any type layout:
Any :: struct #compiler { type_info: *TypeInfo; data: *u8 };
Remember that an Any instance does not contain a copy of the value, only a pointer to data that is already allocated on the stack or heap. An Any instance never owns the pointed-to data and should not be responsible for freeing it.
Since Any contains a pointer to data, the compiler needs to generate temporary storage on the stack for constant literals converted to Any.
    ...
    foo(10); // temp for '10' is created here
    ...

    foo :: fn (v: Any) {}
When a type itself is converted to Any, the compiler implicitly sets the type_info field to a pointer to the TypeType type-info, and the data field to a pointer to the actual type-info of the converted type.
    ...
    foo(s32); // Type passed
    ...

    foo :: fn (v: Any) {
        assert(v.type_info.kind == TypeKind.Type);

        data_info := cast(*TypeInfo) v.data;
        assert(data_info.kind == TypeKind.Int);
    }
Any can be combined with vargs. A good example of this use case is the print function, where the type of the args argument is vargs of Any (... is the same as ...Any). The print function can take values of any type passed in args.
print :: fn (format: string, args: ...) { ... };
An enum allows the creation of a type representing one of the listed variants. Biscuit enums can represent variants of any integer type (s32 by default). All variants are grouped into the enum's namespace.
    // Enum declaration (base type is by default s32)
    Color : type : enum {
        Red;   // default value 0
        Green; // default value 1
        Blue   // default value 2
    };

    simple_enumerator :: fn () #test {
        assert(cast(s32) Color.Red == 0);
        assert(cast(s32) Color.Green == 1);
        assert(cast(s32) Color.Blue == 2);

        // Base type is s32
        assert(sizeof(Color) == 4);

        // Declare variable of type Color with value Red
        color := Color.Red;
        assert(cast(s32) color == 0);
    };

    // Enum declaration (base type is u8)
    Day :: enum u8 {
        Sat :: 1; // first value explicitly set to 1
        Sun;      // implicitly set to previous value + 1 -> 2
        Mon;      // 3
        Tue;      // ...
        Wed;
        Thu;
        Fri
    };

    test_enumerator :: fn () #test {
        /* Day */
        assert(cast(s32) Day.Sat == 1);
        assert(cast(s32) Day.Sun == 2);
        assert(cast(s32) Day.Mon == 3);

        // Base type is u8
        assert(sizeof(Day) == 1);
    };
It is possible to create an alias for any data type except function types; those can be referenced only by pointers.
<alias name> :: <type>;
    alias :: fn () #test {
        T :: s32;
        i : T;
        i = 10;
        print("%\n", i);
    };
A function type describes the arguments and the return type of a function.
fn ([arguments]) [return type]
    // type of function without arguments and without return value
    fn ()
    // type of function without arguments, returning value of 's32' type
    fn () s32
    // type of function with two arguments, returning value of 's32' type
    fn (s32, bool) s32
A cast changes the type of a value to another type. Conversions between integer types, from pointer to bool and from array to slice are generated implicitly by the compiler.
cast(<T>) <expr>
    type_cast :: fn () #test {
        // default type of integer literal is 's32'
        i := 666;

        // type of the integer literal is changed to u16
        j : u16 = 666;

        // implicit cast on function call
        fn (num: u64) {
        } (j);

        // explicit cast of 'f32' type to 's32'
        l := 1.5f;
        m := cast(s32) l;
    };
Biscuit type casting rules are stricter than in C or C++: there are no void pointers, no implicit conversion between integers and enums, and so on. Despite this, an explicit cast can in some cases be replaced by an auto cast. The auto cast operator does not need an explicit destination type; it detects the destination type from the surrounding expression if possible. When the auto operator cannot detect the type, it keeps the expression's type untouched. In such a case auto does not generate any instructions into the IR.
auto <expr>
    type_auto_cast :: fn () #test {
        s32_ptr : *s32;
        u32_ptr : *u32;

        // auto cast from *u32 to *s32
        s32_ptr = auto u32_ptr;

        // keep expression type s32
        i := auto 10;
    };

    b_true  :: true;     // bool true literal
    b_false :: false;    // bool false literal
    ptr : *s32 = null;   // *s32 null pointer literal
The Biscuit language provides constant integer literals written in various formats, shown in the example below. An integer literal has no fixed type of its own: when the desired type is not specified, the compiler chooses the best type to hold the value. Values that fit into 31 bits become s32, values that need 32 to 63 bits become s64, and values that need all 64 bits become u64. Bigger numbers are not supported and the compiler reports an error. When we specify the type explicitly (e.g. foo : u8 : 10;), the integer literal inherits that type.
    i     :: 10;      // s32 literal
    i_u8  : u8 : 10;  // u8 literal
    i_hex :: 0x10;    // s32 literal
    i_bin :: 0b1011;  // s32 literal
    f     :: 13.43f;  // f32 literal
    d     :: 13.43;   // f64 literal
    char  :: 'i';     // u8 literal
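Symbol   Relevant for types          Description
-------  --------------------------  ------------------------------
+        Integers, Floats            Addition.
-        Integers, Floats            Subtraction.
*        Integers, Floats            Multiplication.
/        Integers, Floats            Division.
%        Integers, Floats            Remainder division.
+=       Integers, Floats            Addition and assign.
-=       Integers, Floats            Subtraction and assign.
*=       Integers, Floats            Multiplication and assign.
/=       Integers, Floats            Division and assign.
%=       Integers, Floats            Remainder division and assign.
<        Integers, Floats            Less.
>        Integers, Floats            Greater.
<=       Integers, Floats            Less or equals.
>=       Integers, Floats            Greater or equals.
==       Integers, Floats, Booleans  Equals.
&&       Booleans                    Logical AND.
||       Booleans                    Logical OR.
<<       Integers                    Bitshift left.
>>       Integers                    Bitshift right.
Usage:
<expr> <op> <expr>

Symbol   Relevant for types          Description
-------  --------------------------  ------------------------------
+        Integers, Floats            Positive value.
-        Integers, Floats            Negative value.
^        Pointers                    Pointer dereference.
&        Allocated value             Address of.
Usage:
<op> <expr>

Symbol    Relevant for types         Description
--------  -------------------------  ------------------------------
sizeof    Any                        Determines size in bytes.
alignof   Any                        Determines alignment of type.
typeinfo  Any                        Determines TypeInfo of type.
typekind  Any                        Determines TypeKind of type.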
The Biscuit language provides type reflection, allowing access to the type structure of the code. A pointer to the type information structure is yielded by the typeinfo(<T>) builtin operator. Type information is available at compile time and also at runtime, with low additional runtime overhead (only a pointer to the TypeInfo constant is pushed on the stack).
    #load "std/basic.bl"

    RTTI :: fn () #test {
        // yields pointer to TypeInfo constant structure
        info := typeinfo(s32);

        if info.kind == TypeKind.Int {
            // safe cast to *TypeInfoInt
            info_int := cast(*TypeInfoInt) info;

            print("bit_count = %\n", info_int.bit_count);

            if info_int.is_signed {
                print("signed\n");
            } else {
                print("unsigned\n");
            }
        }
    };
When the typeinfo operator is called, the compiler automatically includes the requested type information in the output binary.
Hash directives specify special compile-time information used by the compiler. They are introduced by the # character followed by the directive name and, optionally, some other information.
The #load directive loads a source file into the current assembly. Every file is included in the assembly only once, even if we load it from multiple locations.
Lookup order:
1. Parent directory of the current file.
2. BL API directory set in etc/bl.conf under the install location.
3. System PATH environment variable.
#load "<bl file>"
The #private directive creates a private (file scope) block in the file. Everything after it is private and visible only inside the current file.
    // main is public
    main :: fn () s32 {
        foo(); // can be called only inside this file.
        return 0;
    };

    #private
    // private function can be called only inside this file
    foo :: fn () {
    };

    // private constant
    bar :: 10;
The #extern directive marks entities as external (imported from a dynamic library). Since version 0.5.2 a custom linkage name can be specified as a string, e.g. #extern "malloc"; when the linkage name is not explicitly specified, the compiler uses the name of the entity as the linkage name.
    // libc functions
    malloc :: fn (size: usize) *u8 #extern;

    // since 0.5.2
    my_free :: fn (ptr: *u8) #extern "free";
The #compiler directive marks entities as compiler internals. This flag should not be used by the user.
The #test directive introduces a test case function. A test case function is supposed to take no arguments and always return void. All functions with the test hash directive are automatically stored in a builtin implicit array, which can be acquired by calling the testcases() function. Every test case is stored as a TestCase value.
    this_is_my_test :: fn () #test {
        ...
    }
The #line directive fetches the current line in the source code as s32.
The #file directive fetches the current source file name as a string.
The #noinit directive disables default initialization of a variable. It cannot be used with global variables (those must always be initialized).
    test_no_init :: fn () #test {
        my_large_array: [1024]u8 #noinit;
    }
The #call_location directive yields a pointer to a static CodeLocation structure generated by the compiler, containing the call-side location in the code. #call_location can be used only as a default value of a function argument. It is useful when we want to know where a function was called from.
    test_call_location :: fn () #test {
        print_location();
    }

    print_location :: fn (loc := #call_location) {
        print("%\n", loc);
    }
Function-related directives such as #inline give the compiler information about whether the marked function may be inlined during the optimization pass.
my_inline_function :: fn () #inline { ... }
The #base directive specifies the base type of a structure.
Type :: struct #base s32 { ... }
The #entry directive specifies the executable entry function.
The #build_entry directive specifies the build system entry function.
The #tags directive specifies struct member tags. This value can be evaluated via type info.
    NO_SERIALIZE :: 1;

    Type :: struct {
        i: s32 #tags NO_SERIALIZE;
    }
The #intrinsic directive marks an external function as a compiler-specific intrinsic function.
A variable associates a name with a value of some type. Variables in BL can be declared as mutable or immutable; the value of an immutable variable cannot be changed and can be set only by the variable initializer. The type of a variable is optional when a value is specified. Variables can be declared in local or global scope: a local variable lives only inside a particular function during its execution, while global variables live for the whole execution.
Variables without an explicit initialization value are zero initialized (set to the default value). We can suppress this behaviour with the #noinit directive placed instead of a value. Global variables must always be initialized (explicitly or zero initialized), so #noinit cannot be used with them.
    mutable_variables :: fn () #test {
        i : s32 = 666;
        j := 666; // type is optional here
        i = 0;    // value can be changed
    };

    immutable_variables :: fn () #test {
        i : s32 : 666;
        j :: 666; // type is optional here
        // value cannot be changed
    };

    variable_initialization :: fn () #test {
        i: s32;                  // implicitly initialized to 0
        arr: [1024]u8 #noinit;   // not initialized
    }
Prefer immutable variables where possible; an immutable value can be optimized effectively by the compiler and in some cases evaluated at compile time.
A compound expression can be used for inline initialization of variables or directly as a value. An implicit temporary variable is created as needed. The zero initializer can be used as a shorthand for a memset(0) call.
    array_compound :: fn () #test {
        // print out all array values
        print_arr :: fn (v: [2]s32) {
            loop i := 0; i < v.len; i += 1 {
                print("v[%] = %\n", i, v[i]);
            }
        };

        // create array of 2 elements directly in call
        print_arr({:[2]s32: 10, 20});

        // create zero initialized array
        print_arr({:[2]s32: 0});
    };

    struct_compound :: fn () #test {
        Foo :: struct {
            i: s32;
            j: s32
        };

        print_strct :: fn (v: Foo) {
            print("v.i = %\n", v.i);
            print("v.j = %\n", v.j);
        };

        // create structure in call
        print_strct({:Foo: 10, 20});

        // create zero initialized structure
        print_strct({:Foo: 0});
    };
A function is a chunk of code representing a specific piece of program functionality. A function can be called with the call operator (); we can pass any number of arguments into the function and get a return value back on the call side.
Functions can be declared in global or local scope (one function can be nested in another).
A function associated with a name can later be called by this name. In this case we treat the function like an immutable variable.
    // named function
    my_function :: fn () {
        print("Hello!!!\n");
    };

    my_function_with_return_value :: fn () s32 {
        return 10;
    };

    my_function_with_arguments :: fn (i: s32, j: s32) s32 {
        return i + j;
    };

    test_fn :: fn () #test {
        // call function by name
        my_function();
        result1 :: my_function_with_return_value();
        result2 :: my_function_with_arguments(10, 20);
    }
Functions can also be defined without an explicit name and called directly.
    test_anonymous_function :: fn () #test {
        i := fn (i: s32) s32 {
            return i;
        } (666);

        print("%\n", i);
    }
Functions can be called via a pointer. A call on a null pointer produces an error in the interpreter.
    test_fn_pointers :: fn () #test {
        foo :: fn () {
            print("Hello from foo!!!\n");
        };

        bar :: fn () {
            print("Hello from bar!!!\n");
        };

        // Grab the pointer of 'foo'
        fn_ptr := &foo;

        // Call via pointer reference.
        fn_ptr();

        fn_ptr = &bar;
        fn_ptr();
    };
Biscuit supports functions with a variable number of arguments of the same type. The vargs type must be last in the function argument list. The compiler internally creates a temporary array of all arguments passed as vargs. Inside the function body the variable argument list acts like a regular array slice.
    sum :: fn (nums: ...s32) s32 {
        // nums is slice of s32
        result := 0;
        loop i := 0; i < nums.len; i += 1 {
            result += nums[i];
        }

        return result;
    };

    test_vargs :: fn () #test {
        s := sum(10, 20, 30);
        assert(s == 60);

        s = sum(10, 20);
        assert(s == 30);

        s = sum();
        assert(s == 0);
    };
A function can be declared even in the local scope of another function. Local-scoped functions do not capture variables from the parent scope (the scope of upper_func in the example), which leads to some restrictions. You cannot access the i variable declared in upper_func from inner_func.
    upper_func :: fn () {
        i := 10; // local for upper_func

        inner_func :: fn () {
            i := 20; // local for inner_func (no capture)
        };
    }
Function arguments can use a default value if a value is not provided on the call side. The default value must be known at compile time.
    foo :: fn (i: s32, j := 10) {}

    test_foo :: fn () #test {
        // here we call foo only with one argument so j will
        // use default value 10
        foo(10);
    }
Several functions can be associated with one name using explicit function overloading groups. A call to a group of functions is replaced with the proper function call during compilation, based on the provided arguments.
    group :: fn { s32_add; f32_add; }

    s32_add :: fn (a: s32, b: s32) s32 {
        return a + b;
    }

    f32_add :: fn (a: f32, b: f32) f32 {
        return a + b;
    }

    test_group :: fn () #test {
        i :: group(10, 20);
        j :: group(0.2f, 13.534f);
        print("i = %\n", i);
        print("j = %\n", j);
    }
A block can limit the scope of a variable.
    #load "std/debug.bl"

    blocks :: fn () #test {
        a := 10;

        {
            // this variable lives only in this scope
            i := a;
            assert(i == 10);
        }

        i := 20;
        assert(i == 20);
    };
The if statement is a conditional statement that can change program flow. It executes the following code block only if the passed condition is true; otherwise it skips the block and continues with the next statement after the block. We can specify an else block, which is executed only if the condition is false.
    test_ifs :: fn () #test {
        b := true;
        if b {
            print("b is true!\n");
        } else {
            print("b is false!\n");
        }
    };
Loops are written with the loop keyword, which can be used without a condition, with a condition only, or with an initializer, a condition and an increment.
    simple_loops :: fn () #test {
        count :: 10;
        i := 0;

        loop {
            i += 1;
            if i == count { break; }
        }

        i = 0;
        loop i < count {
            i += 1;
        }

        loop j := 0; j < count; j += 1 {
            // do something amazing here
        }
    };
Break/continue statements can be used in loops to control execution flow.
Examples:
    break_and_continue :: fn () #test {
        i := 0;

        loop {
            i += 1;
            if i == 10 {
                break;
            } else {
                continue;
            }
        }
    };
The defer statement can be used for deferring execution of some expression. All deferred expressions are executed at the end of the current scope in reverse order. This is usually useful for calling cleanup functions. When a scope is terminated by return, all previous defers up the scope tree are called after the return value is evaluated.
    test_defer_example :: fn () #test {
        defer print("1\n");

        {
            defer print("2 ");
            defer print("3 ");
            defer print("4 ");
        } // defer 4, 3, 2

        defer_with_return();

        defer print("5 ");
    } // defer 5, 1

    defer_with_return :: fn () s32 {
        defer print("6 ");
        defer print("7 ");

        if true {
            defer print("8 ");
            return 1;
        } // defer 8, 7, 6

        defer print("9 "); // never reached
        return 0;
    };
Output:
4 3 2 8 7 6 5 1
The Biscuit compiler provides unit testing out of the box.
Create unit test case:
    #load "std/test.bl"

    // function to be tested
    add :: fn (a: s32, b: s32) s32 {
        return a + b;
    };

    this_is_OK :: fn () #test {
        assert(add(10, 20) == 30);
    };

    this_is_not_OK :: fn () #test {
        assert(add(10, 20) != 30);
    };

    main :: fn () s32 {
        test_run();
        return 0;
    }
Run tests:
    $ blc -rt test.bl
    Compiler version: 0.7.0, LLVM: 10
    Compile assembly: out [DEBUG]
    Target: x86_64-pc-windows-msvc

    Testing start in compile time
    --------------------------------------------------------------------------------
    [ PASS | ] this_is_OK (0.021000 ms)
    assert [test.bl:21]: Assertion failed!
    execution reached unreachable code
    C:/Develop/bl/lib/bl/api/std/debug.bl:113:5
      112 | if IS_DEBUG { _os_debug_break(); }
    > 113 | unreachable;
          | ^^^^^^^^^^^
      114 | };
    called from: C:/Develop/bl/tests/test.bl:21:11
       20 | this_is_not_OK :: fn () #test {
    >  21 | assert(add(10, 20) != 30);
          | ^
       22 | };
    [ | FAIL ] this_is_not_OK (1.630000 ms)

    Results:
    --------------------------------------------------------------------------------
    [ | FAIL ] this_is_not_OK (1.630000 ms)
    --------------------------------------------------------------------------------
    Executed: 2, passed 50%.
    --------------------------------------------------------------------------------
Biscuit has an integrated build system replacing CMake or similar tools. Its main advantage is that the build system is integrated directly into the compiler. All we need is a build.bl file containing a #build_entry function. The setup file is nothing more than a simple BL program executed at compile time with some special features enabled.
Example of minimal build.bl:
    build :: fn () #build_entry {
        // create new executable assembly
        exe :: add_executable("MyProgram");

        // add 'main.bl' file into the assembly 'exe'
        add_unit(exe, "main.bl");
    }
Start build pipeline using our build.bl file:
$ blc -build
The compiler automatically uses the build.bl file as the build script and executes the build function at compile time. The SDK file build/build.bl, containing the compiler API for build pipeline manipulation, is loaded implicitly.
List of builtin variables set by the compiler:
IS_DEBUG is an immutable bool variable set to true when the assembly is running in debug mode.