Compiler usage

blc [options] <source-files>

Option                                     Description
-h, -help                                  Print usage information and exit.
-r, -run                                   Execute 'main' method at compile time.
-rt, -run-tests                            Execute all unit tests at compile time.
-emit-llvm                                 Write LLVM-IR to file.
-emit-mir                                  Write MIR to file.
-ast-dump                                  Print AST.
-lex-dump                                  Print output of lexer.
-syntax-only                               Check syntax and exit.
-no-bin                                    Don't write binary to disk.
-no-warning                                Ignore all warnings.
-verbose                                   Verbose mode.
-no-api                                    Don't load internal api.
-force-test-to-llvm                        Force llvm generation of unit tests.
-configure                                 Generate config file.
-opt-<none | less | default | aggressive>  Set optimization level. ('default' is used when not specified.)
-debug                                     Debug mode build. (When opt level is not specified, 'none' is used.)

Language

Base syntax

Basically, every construct in the Biscuit language follows the same declaration rules: we define the name of the entity, its type and, optionally, an initial value.

Declarations and mutability:

<name>: <type>;              // mutable declaration
<name>: [type] = <value>;    // mutable declaration
<name>: [type] : <value>;    // immutable declaration 

The data type is optional when a value is specified.

foo: s32; // data type required
foo := 10; // data type based on value type (in this case s32)

Comments

// this is line comment
/*
 this
 is
 multi line
 comment
*/

Operators

There is no operator overloading in Biscuit.

Binary

Symbol  Relevant for types          Description
+       Integers, Floats            Addition.
-       Integers, Floats            Subtraction.
*       Integers, Floats            Multiplication.
/       Integers, Floats            Division.
%       Integers, Floats            Remainder division.
+=      Integers, Floats            Addition and assign.
-=      Integers, Floats            Subtraction and assign.
*=      Integers, Floats            Multiplication and assign.
/=      Integers, Floats            Division and assign.
%=      Integers, Floats            Remainder division and assign.
<       Integers, Floats            Less.
>       Integers, Floats            Greater.
<=      Integers, Floats            Less or equals.
>=      Integers, Floats            Greater or equals.
==      Integers, Floats, Booleans  Equals.
&&      Booleans                    Logical AND.
||      Booleans                    Logical OR.
<<      Integers, Floats            Bitshift left.
>>      Integers, Floats            Bitshift right.

Usage:

<expr> <op> <expr>
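
A short sketch of the binary operators in use. It assumes assert from std/debug.bl, as in the other examples in this document; the variable names are illustrative only, and intermediate results are stored in variables so that the example does not depend on operator precedence.

```biscuit
#load "std/debug.bl"

#test "binary operators" {
    a := 10;
    b := 3;

    // arithmetic
    sum := a + b;
    assert(sum == 13);

    rem := a % b;
    assert(rem == 1);

    // arithmetic with assignment
    a += 5;
    assert(a == 15);

    // bit shift
    shifted := 1 << 3;
    assert(shifted == 8);

    // comparison and logical operators
    is_positive := a > 0;
    is_small := a < 100;
    assert(is_positive && is_small);
};
```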

Unary

Symbol  Relevant for types  Description
+       Integers, Floats    Positive value.
-       Integers, Floats    Negative value.
^       Pointers            Pointer dereference.
&       Allocated value     Address of.

Usage:

<op> <expr>

Special

Symbol    Relevant for types  Description
sizeof    Any                 Determines size in bytes.
alignof   Any                 Determines alignment of type.
typeinfo  Any                 Determines TypeInfo of type.
typekind  Any                 Determines TypeKind of type.

Example:

#test "special operators" {
    s := sizeof(s32);  // size in bytes
    a := alignof(s32); // alignment
};

Type Info

The Biscuit language provides type reflection, which allows access to the type structure of the code. A pointer to the type information structure is yielded by the typeinfo(<T>) builtin operator. Type information can be obtained both at compile time and at runtime, with low additional runtime overhead (only a pointer to the TypeInfo constant is pushed on the stack).

Type description structures:

TypeKind :: enum #compiler {
    Type   :: 1;
    Void   :: 2;
    Int    :: 3;
    Real   :: 4;
    Fn     :: 5;
    Ptr    :: 6;
    Bool   :: 7;
    Array  :: 8;
    Struct :: 9;
    Enum   :: 10;
    Null   :: 11;
    String :: 12;
};

TypeInfo :: struct #compiler {
    kind: TypeKind;
    size_bytes: usize
};

TypeInfoType :: struct #base TypeInfo  #compiler {
};

TypeInfoVoid :: struct #base TypeInfo #compiler {
};

TypeInfoInt :: struct #base TypeInfo #compiler {
    bit_count: s32; 
    is_signed: bool; 
};

TypeInfoReal :: struct #base TypeInfo  #compiler {
    bit_count: s32 
};

TypeInfoFn :: struct #base TypeInfo #compiler {
    args: []TypeInfoFnArg;
    ret_type: *TypeInfo;
    is_vargs: bool; 
};

TypeInfoPtr :: struct #base TypeInfo #compiler {
    pointee_type: *TypeInfo 
};

TypeInfoBool :: struct #base TypeInfo #compiler {
};

TypeInfoArray :: struct #base TypeInfo #compiler {
    name: string;
    elem_type: *TypeInfo; 
    len: s64 
};

TypeInfoStruct :: struct #base TypeInfo #compiler {
    name: string; 
    members: []TypeInfoStructMember; 
    is_slice: bool
};

TypeInfoEnum :: struct #base TypeInfo #compiler {
    name: string; 
    base_type: *TypeInfo; 
    variants: []TypeInfoEnumVariant
};

TypeInfoNull :: struct #base TypeInfo #compiler {
};

TypeInfoString :: struct #base TypeInfo #compiler {
};

TypeInfoStructMember :: struct #compiler {
    name: string;
    base_type: *TypeInfo;
    offset_bytes: s32;
    index: s32
};

TypeInfoEnumVariant :: struct #compiler {
    name: string;
    value: s64
};

TypeInfoFnArg :: struct #compiler {
    name: string;
    base_type: *TypeInfo
};

Example:

#load "std/basic.bl"

#test "RTTI" {
    // yields pointer to TypeInfo constant structure
    info := typeinfo(s32);

    if info.kind == TypeKind.Int {
        // safe cast to *TypeInfoInt
        info_int := cast(*TypeInfoInt) info;

        print("bit_count = %\n", info_int.bit_count);

        if info_int.is_signed {
            print("signed\n");
        } else {
            print("unsigned\n");
        }
    }
};

RTTI is generated into the static segment of the compiled binary (only the required types are included).
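
The same approach can be sketched for structures: the members slice of TypeInfoStruct describes every member. The Vec2 structure below is hypothetical and exists only for this illustration; the example assumes that typeinfo accepts user-defined structure types in the same way as builtin ones.

```biscuit
#load "std/basic.bl"

// Hypothetical structure used only for this illustration.
Vec2 :: struct {
    x: s32;
    y: s32
};

#test "RTTI struct members" {
    info := typeinfo(Vec2);

    if info.kind == TypeKind.Struct {
        // TypeInfoStruct extends TypeInfo via #base
        info_struct := cast(*TypeInfoStruct) info;

        print("struct % has % members\n", info_struct.name, info_struct.members.len);

        // walk over all member descriptions
        loop i := 0; i < info_struct.members.len; i += 1 {
            member := info_struct.members[i];
            print("  % at offset %\n", member.name, member.offset_bytes);
        }
    }
};
```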

Hash directives

Hash directives are used to specify internal compiler operations.

#load

Load source file into the current assembly. Every file is included into the assembly only once even if we load it from multiple locations.

Lookup order:

  1. Current file parent directory.
  2. BL API directory set in install location/etc/bl.conf.
  3. System PATH environment variable.

#load "<bl file>"

TODO #link

#link "<lib>"

#private

Creates a private (file scope) block in the file. Everything after this directive is private and visible only inside the current file.

Example:

// main is public
main :: fn () s32 {
    foo(); // can be called only inside this file.
    return 0;
};

#private

// private function can be called only inside this file
foo :: fn () {
};

// private constant
bar :: 10;

Since version 0.4.2

#extern

Used for marking entities as external.

Example:

// libc functions
malloc  :: fn (size: usize) *u8 #extern;
free    :: fn (ptr: *u8) #extern;

#compiler

Used for marking entities as compiler internals. This flag should not be used in user code.

Example:

Any :: struct #compiler {
    type_info: *TypeInfo;
    data: *u8
};

#test

Introduces a test case function. See the Unit tests section.

#line

Fetch current line in source code as s32.

#file

Fetch current source file name string.
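
A minimal sketch of how #file and #line might be used to report a source location. It assumes both directives can appear wherever an expression is expected and that print from std/print.bl is available; neither assumption is confirmed by the text above.

```biscuit
#load "std/print.bl"

main :: fn () s32 {
    // prints the name of this source file and the line of this call
    print("%:% reached\n", #file, #line);
    return 0;
};
```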

Data types

Fundamental data types

Name    Description
s8      Signed 8-bit number.
s16     Signed 16-bit number.
s32     Signed 32-bit number.
s64     Signed 64-bit number.
u8      Unsigned 8-bit number.
u16     Unsigned 16-bit number.
u32     Unsigned 32-bit number.
u64     Unsigned 64-bit number.
usize   Unsigned 64-bit size.
bool    Boolean. (true/false)
f32     32-bit floating point number.
f64     64-bit floating point number.
string  String slice.

Pointers

Represents the address of some allocated data.

*<T>

Example:

#load "std/debug.bl"

#test "pointers" {
    i := 666;
    i_ptr : *s32 = &i; // taking the address of 'i' variable and set 'i_ptr'
    j := ^i_ptr;       // pointer dereferencing

    assert(j == i);
};

Arrays

An array is an aggregate type of multiple values of the same type.

[<size>] <T>

Arrays can be initialized inline with a compound block; the type is required. A zero initializer can be used to zero the whole array storage; otherwise a value must be specified for every element of the array.

{:<T>: [val], ...}

Example:

#test "array type" {
    arr1 : [10] s32; // declare uninitialized array variable 
    arr1[0] = 666;

    arr1.len; // yields array element count (s64)
    arr1.ptr; // yields pointer to first element '&arr1[0]'

    // inline initialization of array type
    arr2 := {:[10]s32: 0 };         // initialize whole array to 0
    arr3 := {:[4]s32: 1, 2, 3, 4 }; // initialize array to the sequence 1, 2, 3, 4
};
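
As a small usage sketch, the len member and the index operator can be combined to walk over an array. The variable names are illustrative, and assert is assumed to come from std/debug.bl as in the pointer example.

```biscuit
#load "std/debug.bl"

#test "array sum" {
    arr := {:[4]s32: 1, 2, 3, 4};

    // accumulate all elements
    total := 0;
    loop i := 0; i < arr.len; i += 1 {
        total += arr[i];
    }

    assert(total == 10);
};
```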

Strings

The string type in Biscuit is a slice containing a pointer to the string data and the string length. String literals are zero terminated.

Example:

#test "string type" {
    msg : string = "Hello world\n"; 
    msg.len; // character count of the string
    msg.ptr; // pointer to the string content
};

Array slice

An array slice consists of a pointer to the first array element and the array length.

Syntax:

[] <type>

Slice layout:

Slice :: struct {
    len: s64;
    ptr: *T
};

Example:

#test "array reference" {
    arr     : [10]s32; // uninitialized s32 array
    arr_ref : []s32;   // reference to array of any size with s32 elements

    // TODO
};

Structures

A structure is a composite data type representing a group of various data.

struct { 
    <member1 name>: <type>;
    <member2 name>: <type>;
    <member3 name>: <type>
};

Example:

#load "std/debug.bl"

#test "simple structure" {
    Foo :: struct {
        i: s32;
        j: s32
    };

    foo : Foo;
    foo.i = 10;
    foo.j = 20;

    assert(foo.i == 10);
    assert(foo.j == 20);
};

#test "anonymous structure" {
    foo : struct {
        i: s32;
        j: s32
    };

    foo.i = 10;
    foo.j = 20;

    assert(foo.i == 10);
    assert(foo.j == 20);
};

#test "auto dereference for structure pointers" {
    Foo :: struct {
        i: s32;
        j: s32
    };

    foo : Foo;
    foo_ptr := &foo;
    foo_ptr.i = 10;
    foo_ptr.j = 20;

    assert(foo.i == 10);
    assert(foo.j == 20);
};

#test "structure initializer" {
    Foo :: struct {
        i: s32;
        j: s32
    };

    foo := {:Foo: 10, 20 };
    assert(foo.i == 10);
    assert(foo.j == 20);
};

A structure can extend any type with the use of #base <T>. This is a kind of inheritance similar to the C style, where inheritance can be simulated with composition. The #base <T> basically inserts base: T; as the first member of the structure. The compiler can use this information later for more features, such as merging of scopes to enable direct access to base-type members via the . operator, or implicit cast from the child to the parent type.

Example of struct extension:

#load "std/debug.bl"
#load "std/print.bl"

Entity :: struct {
    id: s32
}

// Player has base type Entity
Player :: struct #base Entity {
    // base: Entity; is implicitly inserted as first member
    name: string
};

Wall :: struct #base Entity {
    height: s32
};

Enemy :: struct #base Entity {
    health: s32
};

// Multi-level extension Boss -> Enemy -> Entity
Boss :: struct #base Enemy {
    // Extended struct can be empty.
};

#test "struct extending" {
    p: Player;
    p.id = 10; // direct access to base-type members
    p.name = "Travis";
    assert(p.base.id == 10); // access via .base

    w: Wall;
    w.id = 11;
    w.height = 666;

    e: Enemy;
    e.id = 12;
    e.health = 100;

    b: Boss;
    b.id = 13;

    // implicit cast from child type to base type Entity
    update(&p);
    update(&w);
    update(&e);
    update(&b);
}

update :: fn (e: *Entity) {
    print("id = %\n", e.id);
}

Since version 0.5.1

Any

The Any type is a special builtin structure containing a pointer to TypeInfo and a pointer to data. Any value can be implicitly cast to this type on a function call.

Any type layout:

Any :: struct #compiler {
    type_info: *TypeInfo;
    data: *u8
};

Remember that the Any instance does not contain a copy of the value, only a pointer to data already allocated on the stack or heap. The Any instance never owns the pointed data and should not be responsible for freeing it.

Since Any contains pointer to data, we need to generate temporary storage on stack for constant literals converted to Any.

...
foo(10); // temp for '10' is created here
...

foo :: fn (v: Any) {}

For types converted to Any, the type_info field is implicitly set to a pointer to the type-info of the Type kind (TypeInfoType) and the data field to a pointer to the actual type-info of the converted type.

...
foo(s32); // Type passed
...

foo :: fn (v: Any) {
    assert(v.type_info.kind == TypeKind.Type);

    data_info := cast(*TypeInfo) v.data;
    assert(data_info.kind == TypeKind.Int);
}

Any can be combined with vargs. A good example of this use case is the print function, where the type of the args argument is vargs of Any (... is the same as ...Any). The print function can therefore take values of any type passed in args.

print :: fn (format: string, args: ...) {
    ...
};
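
A minimal sketch of a user function built the same way. The count_ints helper is hypothetical; it only relies on the Any layout and the TypeKind enum described above, and assumes assert from std/debug.bl.

```biscuit
#load "std/debug.bl"

// Hypothetical helper: counts how many of the passed values are integers.
count_ints :: fn (args: ...) s32 {
    count := 0;
    loop i := 0; i < args.len; i += 1 {
        // every element is an Any, so its type can be inspected
        if args[i].type_info.kind == TypeKind.Int {
            count += 1;
        }
    }

    return count;
};

#test "any with vargs" {
    // 10 and 20 are integers, true is a bool
    n := count_ints(10, true, 20);
    assert(n == 2);
};
```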

Since version 0.4.2

Enums

The enum allows the creation of a type representing one of the listed variants. Biscuit enums can represent variants of any integer type (s32 by default). All variants are grouped into the enum's namespace.

Example:

#load "std/debug.bl"

// Enum declaration (base type is by default s32)
Color : type : enum {
    Red;    // default value 0
    Green;  // default value 1
    Blue    // default value 2
};

#test "simple enumerator" {
    assert(cast(s32) Color.Red == 0);
    assert(cast(s32) Color.Green == 1);
    assert(cast(s32) Color.Blue == 2);

    // Base type is s32
    assert(sizeof(Color) == 4);

    // Declare variable of type Color with value Red
    color := Color.Red;
    assert(cast(s32) color == 0);
};

// Enum declaration (base type is u8)
Day :: enum u8 {
    Sat :: 1; // first value explicitly set to 1
    Sun;      // implicitly set to previous value + 1 -> 2
    Mon;      // 3 
    Tue;      // ...
    Wed;
    Thu;
    Fri
};

#test "enumerator" {
    /* Day */ 
    assert(cast(s32) Day.Sat == 1);
    assert(cast(s32) Day.Sun == 2);
    assert(cast(s32) Day.Mon == 3);

    // Base type is u8
    assert(sizeof(Day) == 1);
};

Type aliasing

It's possible to create an alias to any data type.

<alias name> :: <type>;

Example:

#test "alias" {
    T :: s32;
    i : T;
};

Function type

Type of function.

fn ([arguments]) [return type]

// type of function without arguments and without return value
fn ()             

// type of function without arguments, returning value of 's32' type
fn () s32

// type of function with two arguments, returning value of 's32' type
fn (s32, bool) s32 
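
A function type is mostly useful in combination with a pointer, for example to pass a callback. The sketch below assumes a pointer to a function type is written as *fn (...) and that named functions can be referenced with &; the apply and twice helpers are hypothetical.

```biscuit
#load "std/debug.bl"

// Hypothetical helper: applies 'f' to 'v' through a function pointer.
apply :: fn (f: *fn (s32) s32, v: s32) s32 {
    return f(v);
};

// Hypothetical callback.
twice :: fn (v: s32) s32 {
    return v * 2;
};

#test "function type as argument" {
    r := apply(&twice, 21);
    assert(r == 42);
};
```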

Type casting

Changes the type of a value to another type. Conversions between integer types are generated implicitly by the compiler.

cast(<T>) <expr>

Example:

#test "type cast" {
    // default type of integer literal is 's32'
    i := 666; 

    // type of the integer literal is changed to u16
    j : u16 = 666;

    // implicit cast on function call
    fn (num: u64) {
    } (j);

    // explicit cast of 'f32' type to 's32'
    l := 1.5f;
    m := cast(s32) l;
};

Biscuit type casting rules are stricter than those of C or C++: there are no void pointers, no implicit conversions between integers and enums, and so on. Nevertheless, an explicit cast can in some cases be replaced by an auto cast. The auto cast operator does not need an explicit destination type; it detects the destination type from the expression when possible. When the auto operator cannot detect the expected type, it keeps the expression's type untouched. In that case auto does not generate any instructions into the IR.

auto <expr>

Example:

#test "type auto cast" {
    s32_ptr : *s32;
    u32_ptr : *u32;

    // auto cast from *u32 to *s32
    s32_ptr = auto u32_ptr;

    // keep expression type s32
    i := auto 10;
};

Literals

Simple literals

b :: true;         // bool true literal 
b :: false;        // bool false literal 
ptr : *s32 = null; // *s32 null pointer literal

Integer literals

The Biscuit language provides constant integer literals written in the various formats shown in the example below. Integer literals do not have a fixed type: when the desired type is not specified, the compiler chooses the best type to hold the value. Numbers fitting in 31 bits are implicitly set to s32, numbers requiring more than 31 bits but fewer than 64 bits are set to s64, and numbers requiring 64 bits are set to u64. Bigger numbers are not supported and the compiler will complain. When the type is specified explicitly (e.g. foo : u8 : 10;), the integer literal inherits that type.

Example:

i     :: 10;      // s32 literal
i_u8  : u8 : 10;  // u8 literal
i_hex :: 0x10;    // s32 literal
i_bin :: 0b1011;  // s32 literal
f     :: 13.43f;  // f32 literal
d     :: 13.43;   // f64 literal
char  :: 'i';     // u8 literal 
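
The type selection rule for untyped integer literals can be illustrated at its boundary values. The types in the comments follow from the rule stated above and have not been checked against a compiler; the constant names are illustrative.

```biscuit
small  :: 2147483647;           // fits in 31 bits  -> s32
medium :: 2147483648;           // requires 32 bits -> s64
large  :: 9223372036854775807;  // requires 63 bits -> s64
huge   :: 9223372036854775808;  // requires 64 bits -> u64
typed  : u16 : 1000;            // explicit type    -> u16
```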

Variables

Example of variable allocated on stack.

<name> : <type>;
<name> : [type] = <value>;

Example:

#test "variables" {
    i : s32 = 666; 
    j := 666; // type is optional here
};

Constants

Example of a constant allocated on the stack. A constant must be initialized and cannot be changed later.

Syntax:

<name> : [type] : <value>;

Example:

#test "constants" {
    i : s32 : 666; 
    j :: 666; // type is optional here
};

Compound expressions

A compound expression can be used for inline initialization of variables or directly as a value. An implicit temporary variable is created as needed. The zero initializer can be used as a shorthand for a memset(0) call.

Syntax:

{:<type>: <arg1, arg2, ...>};
{:<type>: 0}; // zero initializer

Example:

#load "std/basic.bl"

#test "array compound" {
    // print out all array values
    print_arr :: fn (v: [2]s32) {
        loop i := 0; i < v.len; i += 1 {
            print("v[%] = %\n", i, v[i]);
        }
    };

    // create array of 2 elements directly in call
    print_arr({:[2]s32: 10, 20});

    // create zero initialized array
    print_arr({:[2]s32: 0});
};

#test "struct compound" {
    Foo :: struct {
        i: s32;
        j: s32
    };

    print_strct :: fn (v: Foo) {
        print("v.i = %\n", v.i);
        print("v.j = %\n", v.j);
    };

    // create structure in call
    print_strct({:Foo: 10, 20});

    // create zero initialized structure
    print_strct({:Foo: 0});
};

Functions

Named function

Syntax of a named function:

<name> : [type] : fn ([args]) [return type] {[body]};

Example:

#test "named functions" {
    foo :: fn () {
    };

    bar :: fn (i: s32) {
        i; // use i
    };

    baz :: fn (i: s32) s32 {
        return i;
    };

    foo();
    bar(666);
    j := baz(666);
};

Anonymous function

An anonymous function has no name and consists only of a function literal.

fn ([args]) [return type] {[body]};

Example of anonymous function.

#test "anonymous function" {
    i := fn (i: s32) s32 {
        return i;
    } (666);
};

Function pointers

Functions can be called via a pointer. A call on a null pointer produces an error in the interpreter.

Example:

#load "std/basic.bl"

#test "fn pointers" {
    foo :: fn () {
        print("Hello from foo!!!\n");
    };

    bar :: fn () {
        print("Hello from bar!!!\n");
    };

    // Grab the pointer of 'foo'
    fn_ptr := &foo;

    // Call via pointer reference.
    fn_ptr();

    fn_ptr = &bar;
    fn_ptr();
};

Functions with variable argument count

Biscuit supports functions with a variable number of arguments of the same type. The vargs type must be last in the function argument list. The compiler internally creates a temporary array of all arguments passed in vargs. Inside the function body, the variable argument list acts like a regular array slice.

Example of variable argument count function:

#load "std/debug.bl"

sum :: fn (nums: ...s32) s32 {
    // nums is slice of s32
    result := 0;
    loop i := 0; i < nums.len; i += 1 {
        result += nums[i];
    }

    return result;
};

#test "vargs" {
    s := sum(10, 20, 30);
    assert(s == 60);

    s = sum(10, 20);
    assert(s == 30);

    s = sum();
    assert(s == 0);
};

Blocks

A block can limit the scope of a variable.

Example:

#load "std/debug.bl"

#test "blocks" {
    a := 10;

    {
        // this variable lives only in this scope
        i := a;
        assert(i == 10);
    }

    i := 20;
    assert(i == 20);
};

Ifs

If - else base syntax:

if <condition> {[then block]} [else {[else block]}]

Example:

#test "ifs" {
    b := true;
    if b {
    } else {
        unreachable;
    }

    i := 10;
    if i != 10 {
        unreachable;
    } 
};

Loops

Loop base syntax:

loop {[block]} 
loop <condition> {[block]} 
loop <initialization>; <condition>; <increment> {[block]} 

Example:

#test "simple loops" {
    count :: 10;
    i := 0;

    loop {
        i += 1;
        if i == count { break; }
    }

    i = 0;
    loop i < count {
        i += 1;
    }

    loop j := 0; j < count; j += 1 {
        // do something amazing here
    }
};

Break and continue

Break/continue statements can be used in loops to control execution flow.

Examples:

#test "break and continue" {
    i := 0;
    loop {
        i += 1;
        if i == 10 {
            break;
        } else {
            continue;
        }
    }
};

Defer statement

The defer statement can be used for deferring execution of some expression. All deferred expressions are executed at the end of the current scope in reverse order. This is typically useful for calling cleanup functions. When the scope is terminated by return, all previous defers up the scope tree are called after evaluation of the return value.

defer <expr>;

Example:

#load "std/print.bl"

#test "defer example" {
    defer print("1\n");

    {
        defer print("2 ");
        defer print("3 ");
        defer print("4 ");
    } // defer 4, 3, 2

    defer_with_return();

    defer print("5 ");
}; // defer 5, 1

#private 

defer_with_return :: fn () s32 {
    defer print("6 ");
    defer print("7 ");

    if true {
        defer print("8 ");
        return 1;
    } // defer 8, 7, 6

    defer print("9 "); // never reached
    return 0;
};

$ blc -rt -no-bin defer.bl
Compiler version: 0.4.2 (pre-alpha)
Compile assembly: defer
Target: x86_64-apple-darwin18.6.0

Executing test cases...
4 3 2 8 7 6 5 1
[ PASSED ] (1/1) defer.bl:3 'defer example'
--------------------------------------------------------------------------------
Testing done, 0 of 1 failed. Completed: 100%
--------------------------------------------------------------------------------
Compiled 821 lines in 0.007891 seconds.

Finished at 27-08-2019 12:33:59
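
The cleanup use case mentioned above can be sketched as follows. The acquire and release functions are hypothetical stand-ins for a real resource API; based on the rules described in this section, release is expected to run after the return value is evaluated, even on the early return path.

```biscuit
#load "std/print.bl"

// Hypothetical resource API used only for this illustration.
acquire :: fn () s32 {
    print("acquire\n");
    return 1;
};

release :: fn (handle: s32) {
    print("release %\n", handle);
};

use_resource :: fn () s32 {
    handle := acquire();
    defer release(handle); // deferred until the function scope ends

    if handle == 1 {
        print("early return\n");
        return 1; // the deferred release is still called
    }

    print("normal return\n");
    return 0;
};

main :: fn () s32 {
    use_resource();
    return 0;
};
```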

Implicit context

The implicit context is a compiler-internal global variable containing the basic context for the whole assembly. This variable is mutable and can be modified by user code.

Declaration:

_AllocFn :: * fn (size: usize) *u8;
_FreeFn :: * fn (p: *u8);

_Context :: struct {
    /* Default memory allocation function. */
    alloc_fn: _AllocFn;

    /* Default memory free function. */
    free_fn: _FreeFn;
};

_context := {:_Context: &malloc, &free};

Command-line arguments

Command-line arguments for the current process are passed via the command_line_arguments global variable. The values are provided as a slice of strings, with the executable name first.

Example:

#load "std/print.bl"

main :: fn () s32 {
    print("argc: %\n", command_line_arguments.len);
    loop i := 0; i < command_line_arguments.len; i += 1 {
        print("%: %\n", i, command_line_arguments[i]);
    }

    return 0;
}

Unit tests

Biscuit compiler supports unit testing by default.

Create unit test case:

#load "std/debug.bl"

// function to be tested
add :: fn (a: s32, b: s32) s32 {
  return a + b;
};

#test "this is OK" {
  assert(add(10, 20) == 30); 
};

#test "this is not OK" {
  assert(add(10, 20) != 30); 
};

Run tests:

$ blc -no-bin -run-tests test.bl
compiler version: 0.4.0 (pre-alpha)
compile assembly: test

executing test cases...
[ PASSED ] (1/2) /Users/travis/Desktop/test.bl:8 'this is my test'
error: execution reached unreachable code
/Users/travis/Develop/bl/api/std/debug.bl:31:5 
  30 |   if (!cond) {
  31 |     unreachable;
     |     ^^^^^^^^^^^
  32 |   }
/Users/travis/Desktop/test.bl:13:12 
  12 |    #test "this is not OK" {
  13 |      assert(add(10, 20) != 30); 
     |            ^
  14 |    };
[ FAILED ] (2/2) /Users/travis/Desktop/test.bl:12 'this is not OK'
testing done, 1 of 2 failed

compiled 47 lines in 0.001551 seconds

finished at 22-01-2019 21:28:10
done

Author: Martin Dorazil

Created: 2019-12-02 Mon 12:07
