diff --git a/Chapters/01-memory.qmd b/Chapters/01-memory.qmd index bfefa99..1efcb19 100644 --- a/Chapters/01-memory.qmd +++ b/Chapters/01-memory.qmd @@ -639,7 +639,7 @@ pub fn main() !void { } ``` -Also, notice that in this example, we use the keyword `defer` to run a small +Also, notice that in this example, we use the `defer` keyword (which I described at @sec-defer) to run a small piece of code at the end of the current scope, which is the expression `allocator.free(input)`. When you execute this expression, the allocator will free the memory that it allocated for the `input` object. diff --git a/Chapters/01-zig-weird.qmd b/Chapters/01-zig-weird.qmd index 47f217f..5a11d0e 100644 --- a/Chapters/01-zig-weird.qmd +++ b/Chapters/01-zig-weird.qmd @@ -1265,6 +1265,47 @@ lowercase letter, and it would work fine. +### The `defer` keyword {#sec-defer} + +With the `defer` keyword you can execute expressions at the end of the current scope. +Take the `foo()` function below as an example. When we execute this function, the expression +that prints the message "Exiting function ..." gets executed only at +the end of the function scope. + +```{zig} +#| auto_main: false +#| echo: true +#| results: "hide" +const std = @import("std"); +const stdout = std.io.getStdOut().writer(); +fn foo() !void { + defer std.debug.print( + "Exiting function ...\n", .{} + ); + try stdout.print("Adding some numbers ...\n", .{}); + const x = 2 + 2; _ = x; + try stdout.print("Multiplying ...\n", .{}); + const y = 2 * 8; _ = y; +} + +pub fn main() !void { + try foo(); +} +``` + +``` +Adding some numbers ... +Multiplying ... +Exiting function ... +``` + +It doesn't matter how the function exits (e.g. because +of an error, or because of a return statement), +just remember, this expression gets executed when the function exits. 
+ + + + ### For loops A loop allows you to execute the same lines of code multiple times, @@ -1434,12 +1475,8 @@ This is just a naming convention that you will find across the entire Zig standa So, in Zig, the `init()` method of a struct is normally the constructor method of the class represented by this struct. While the `deinit()` method is the method used for destroying an existing instance of that struct. -Both the `init()` and `deinit()` methods are used extensively in Zig code, and you will see both of them at @sec-arena-allocator. In this section, -I present the `ArenaAllocator()`, which is a special type of allocator object that receives a second (child) -allocator object at instantiation. We use the `init()` method to create a new `ArenaAllocator()` object, -then, on the next line, we also used the `deinit()` method in conjunction with the `defer` keyword, to destroy this arena allocator object at the end -of the current scope. - +The `init()` and `deinit()` methods are used extensively in Zig code, and you will see both of +them when we talk about allocators at @sec-allocators. But, as another example, let's build a simple `User` struct to represent an user of some sort of system. If you look at the `User` struct below, you can see the `struct` keyword, and inside of a pair of curly braces, we write the struct's body. diff --git a/Chapters/09-error-handling.qmd b/Chapters/09-error-handling.qmd index 86488ce..1263499 100644 --- a/Chapters/09-error-handling.qmd +++ b/Chapters/09-error-handling.qmd @@ -474,7 +474,6 @@ fn create_user(db: Database, allocator: Allocator) !User { By using `errdefer` to destroy the `user` object that we have just created, we garantee that the memory allocated for this `user` object get's freed, before the execution of the program stops. 
- Because if the expression `try db.add(user)` returns an error value, the execution of our program stops, and we loose all references and control over the memory that we have allocated for the `user` object. @@ -482,12 +481,47 @@ As a result, if we do not free the memory associated with the `user` object befo we cannot free this memory anymore. We simply loose our chance to do the right thing. That is why `errdefer` is essential in this situation. -Having all this in mind, the `errdefer` keyword is different but also similar -to the `defer` keyword. The only difference between the two is when the provided expression -get's executed. The `defer` keyword always execute the provided expression at the end of the -current scope, while `errdefer` executes the provided expression when an error occurs in the +To make the differences between `defer` (which I described at @sec-defer) +and `errdefer` very clear, it is worth discussing the subject a bit further. +You might still have the question "why use `errdefer` if we can use `defer` instead?" +in your mind. + +Although they are similar, the key difference between the `errdefer` and `defer` keywords +is when the provided expression gets executed. +The `defer` keyword always executes the provided expression at the end of the +current scope, no matter how your code exits this scope. +In contrast, `errdefer` executes the provided expression only when an error occurs in the current scope. +This becomes important if a resource that you allocate in the +current scope gets freed later in your code, in a different scope. +The `create_user()` function is an example of this. If you think +closely about this function, you will notice that it returns +the `user` object as the result. + +In other words, the allocated memory for the `user` object does not get +freed inside `create_user()` if the function returns successfully. 
+So, if an error does not occur inside this function, the `user` object +is returned from the function, and probably, the code that runs after +this `create_user()` function will be responsible for freeing +the memory of the `user` object. + +But what if an error does occur inside `create_user()`? What happens then? +This would mean that the execution of your code would stop in this `create_user()` +function, and, as a consequence, the code that runs after this `create_user()` +function would simply not run, and, as a result, the memory of the `user` object +would not be freed before your program stops. + +This is the perfect scenario for `errdefer`. We use this keyword to guarantee +that our program will free the allocated memory for the `user` object, +even if an error occurs inside the `create_user()` function. + +If you allocate and free some memory for an object in the same scope, then, +just use `defer` and be happy; `errdefer` has no use for you in such a situation. +But if you allocate some memory in a scope A, and only free this memory +later, in a scope B for example, then, `errdefer` becomes useful to avoid leaking memory +in sketchy situations. + ## Union type in Zig {#sec-unions} diff --git a/_freeze/Chapters/01-memory/execute-results/html.json b/_freeze/Chapters/01-memory/execute-results/html.json index 450c841..a19a1fa 100644 --- a/_freeze/Chapters/01-memory/execute-results/html.json +++ b/_freeze/Chapters/01-memory/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "772698f2d84bfdb998b8c324035a196a", + "hash": "a3770265a91ecf3f188b1afb7b388941", "result": { "engine": "knitr", - "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n# Memory and Allocators in Zig\n\n\nIn this chapter, we will talk about memory. How does Zig controls memory? What\ncommom tools are used? Are there any important aspect that makes memory\ndifferent/special in Zig? 
You will find the answers here.\n\nEvery computer needs memory. Is by having memory that computers can temporarily store\nthe values/results of your calculations. Without memory, programming languages would never have\nconcepts such as \"variables\", or \"objects\", to store the values that you generate.\n\n\n## Memory spaces\n\nEvery object that you create in your Zig source code needs to be stored somewhere,\nin your computer's memory. Depending on where and how you define your object, Zig\nwill use a different \"memory space\", or a different\ntype of memory to store this object.\n\nEach type of memory normally serves for different purposes.\nIn Zig, there are 3 types of memory (or 3 different memory spaces) that we care about. They are:\n\n- Global data register (or the \"global data section\");\n- Stack;\n- Heap;\n\n\n### Compile-time known versus runtime known {#sec-compile-time}\n\nOne strategy that Zig uses to decide where it will store each object that you declare, is by looking\nat the value of this particular object. More specifically, by investigating if this value is\nknown at \"compile-time\" or at \"runtime\".\n\nWhen you write a program in Zig, the values of some of the objects that you write in your program are *known\nat compile time*. Meaning that, when you compile your Zig source code, during the compilation process,\nthe `zig` compiler can figure it out what is the exact value of a particular object\nthat exists in your source code.\nKnowing the length (or the size) of each object is also important. So the length (or the size) of each object that you write in your program is,\nin some cases, *known at compile time*.\n\nThe `zig` compiler cares more about knowing the length (or the size) of a particular object\n, than to know it's actual value. But, if the `zig` compiler knows the value of the object, then, it\nautomatically knows the size of this object. 
Because it can simply calculate the\nsize of the object by looking at the size of the value.\n\nTherefore, the priority for the `zig` compiler is to discover the size of each object in your source code.\nIf the value of the object in question is known at compile-time, then, the `zig` compiler\nautomatically knows the size/length of this object. But if the value of this object is not\nknown at compile-time, then, the size of this object is only known at compile-time if,\nand only if, the type of this object have a known fixed size.\n\nIn order to a type have a known fixed size, this type must have data members whose size is fixed.\nIf this type includes, for example, a variable sized array in it, then, this type do not have a known\nfixed size. Because this array can have any size at runtime\n(i.e. it can be an array of 2 elements, or 50 elements, or 1 thousand elements, etc.).\n\nFor example, a string object, which internally is an array of constant u8 values (`[]const u8`)\nhave a variable size. It can be a string object with 100 or 500 characters in it. If we do not\nknow at compile-time, which exact string will be stored inside this string object, then, we cannot calculate\nthe size of this string object at compile-time. So, any type, or any struct declaration that you make, that\nincludes a string data member that do not have an explicit fixed size, makes this type, or this\nnew struct that you are declaring, a type that do not have a known fixed size at compile-time.\n\nIn contrast, if the type or this struct that you are declaring, includes a data member that is an array,\nbut this array have a known fixed size, like `[60]u8` (which declares an array of 60 `u8` values), then,\nthis type, or, this struct that you are declaring, becomes a type with a known fixed size at compile-time.\nAnd because of that, in this case, the `zig` compiler do not need to known at compile-time the exact value of\nany object of this type. 
Since the compiler can find the necessary size to store this object by\nlooking at the size of it's type.\n\n\nLet's look at an example. In the source code below, we have two constant objects (`name` and `array`) declared.\nBecause the values of these particular objects are written down, in the source code itself (`\"Pedro\"`\nand the number sequence from 1 to 4), the `zig` compiler can easily discover the values of these constant\nobjects (`name` and `array`) during the compilation process.\nThis is what \"known at compile time\" means. It refers to any object that you have in your Zig source code\nwhose value can be identified at compile time.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = \"Pedro\";\nconst array = [_]u8{1, 2, 3, 4};\n_ = name; _ = array;\n\nfn input_length(input: []const u8) usize {\n const n = input.len;\n return n;\n}\n```\n:::\n\n\nThe other side of the spectrum are objects whose values are not known at compile time.\nFunction arguments are a classic example of this. Because the value of each function\nargument depends on the value that you assign to this particular argument,\nwhen you call the function.\n\nFor example, the function `input_length()` contains an argument named `input`, which is an array of constant `u8` integers (`[]const u8`).\nIs impossible to know at compile time the value of this particular argument. And it also is impossible to know the size/length\nof this particular argument. Because it is an array that do not have a fixed size specified explicitly in the argument type annotation.\n\nSo, we know that this `input` argument will be an array of `u8` integers. But we do not know at compile-time, it's value, and neither his size.\nThis information is known only at runtime, which is the period of time when you program is executed.\nAs a consequence, the value of the expression `input.len` is also known only at runtime.\nThis is an intrinsic characteristic of any function. 
Just remember that the value of function arguments is usually not \"compile-time known\".\n\nHowever, as I mentioned earlier, what really matters to the compiler is to know the size of the object\nat compile-time, and not necessarily it's value. So, although we don't know the value of the object `n`, which is the result of the expression\n`input.len`, at compile-time, we do know it's size. Because the expression `input.len` always return a value of type `usize`,\nand the type `usize` have a known fixed size.\n\n\n\n### Global data register\n\nThe global data register is a specific section of the executable of your Zig program, that is responsible\nfor storing any value that is known at compile time.\n\nEvery constant object whose value is known at compile time that you declare in your source code,\nis stored in the global data register. Also, every literal value that you write in your source code,\nsuch as the string `\"this is a string\"`, or the integer `10`, or a boolean value such as `true`,\nis also stored in the global data register.\n\nHonestly, you don't need to care much about this memory space. Because you can't control it,\nyou can't deliberately access it or use it for your own purposes.\nAlso, this memory space does not affect the logic of your program.\nIt simply exists in your program.\n\n\n### Stack vs Heap\n\nIf you are familiar with system's programming, or just low-level programming in general, you\nprobably have heard of the \"duel\" between Stack vs Heap. These are two different types of memory,\nor different memory spaces, which are both available in Zig.\n\nThese two types of memory don't actually duel with\neach other. This is a commom mistake that beginners have, when seeing \"x vs y\" styles of\ntabloid headlines. 
These two types of memory are actually complementary to each other.\nSo, in almost every Zig program that you ever write, you will likely use a combination of both.\nI will describe each memory space in detail over the next sections. But for now, I just want to\nstablish the main difference between these two types of memory.\n\nIn essence, the stack memory is normally used to store values whose length is fixed and known\nat compile time. In contrast, the heap memory is a *dynamic* type of memory space, meaning that, it is\nused to store values whose length might grow during the execution (runtime) of your program [@jenny2022].\n\nLengths that grow during runtime are intrinsically associated with \"runtime known\" type of values.\nIn other words, if you have an object whose length might grow during runtime, then, the length\nof this object becomes not known at compile time. If the length is not known at compile-time,\nthe value of this object also becomes not known at compile-time.\nThese types of objects should be stored in the heap memory space, which is\na dynamic memory space, which can grow or shrink to fit the size of your objects.\n\n\n\n### Stack {#sec-stack}\n\nThe stack is a type of memory that uses the power of the *stack data structure*, hence the name. \nA \"stack\" is a type of *data structure* that uses a \"last in, first out\" (LIFO) mechanism to store the values\nyou give it to. 
I imagine you are familiar with this data structure.\nBut, if you are not, the [Wikipedia page](https://en.wikipedia.org/wiki/Stack_(abstract_data_type))[^wiki-stack]\n, or, the [Geeks For Geeks page](https://www.geeksforgeeks.org/stack-data-structure/)[^geek-stack] are both\nexcellent and easy resources to fully understand how this data structure works.\n\n[^wiki-stack]: \n[^geek-stack]: \n\nSo, the stack memory space is a type of memory that stores values using a stack data structure.\nIt adds and removes values from the memory by following a \"last in, first out\" (LIFO) principle.\n\nEvery time you make a function call in Zig, an amount of space in the stack is\nreserved for this particular function call [@jenny2022; @zigdocs].\nThe value of each function argument given to the function in this function call is stored in this\nstack space. Also, every local object that you declare inside the function scope is\nusually stored in this same stack space.\n\n\nLooking at the example below, the object `result` is a local object declared inside the scope of the `add()`\nfunction. Because of that, this object is stored inside the stack space reserved for the `add()` function.\nThe `r` object (which is declared outside of the `add()` function scope) is also stored in the stack.\nBut since it is declared in the \"outer\" scope, this object is stored in the\nstack space that belongs to this outer scope.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) u8 {\n const result = x + y;\n return result;\n}\n```\n:::\n\n\n\nSo, any object that you declare inside the scope of a function is always stored inside\nthe space that was reserved for that particular function in the stack memory. 
This\nalso counts for any object declared inside the scope of your `main()` function for example.\nAs you would expect, in this case, they\nare stored inside the stack space reserved for the `main()` function.\n\nOne very important detail about the stack memory is that **it frees itself automatically**.\nThis is very important, remember that. When objects are stored in the stack memory,\nyou don't have the work (or the responsibility) of freeing/destroying these objects.\nBecause they will be automatically destroyed once the stack space is freed at the end of the function scope.\n\nSo, once the function call returns (or ends, if you prefer to call it this way)\nthe space that was reserved in the stack is destroyed, and all of the objects that were in that space goes away with it.\nThis mechanism exists because this space, and the objects within it, are not necessary anymore,\nsince the function \"finished it's business\".\nUsing the `add()` function that we exposed above as an example, it means that the object `result` is automatically\ndestroyed once the function returns.\n\n::: {.callout-important}\nLocal objects that are stored in the stack space of a function are automatically\nfreed/destroyed at the end of the function scope.\n:::\n\n\nThis same logic applies to any other special structure in Zig that have it's own scope by surrounding\nit with curly braces (`{}`).\nFor loops, while loops, if else statements, etc. For example, if you declare any local\nobject in the scope of a for loop, this local object is accessible only within the scope\nof this particular for loop. 
Because once the scope of this for loop ends, the space in the stack\nreserved for this for loop is freed.\nThe example below demonstrates this idea.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This does not compile succesfully!\nconst a = [_]u8{0, 1, 2, 3, 4};\nfor (0..a.len) |i| {\n const index = i;\n _ = index;\n}\n// Trying to use an object that was\n// declared in the for loop scope,\n// and that does not exist anymore.\nstd.debug.print(\"{d}\\n\", index);\n```\n:::\n\n\n\n\nOne important consequence of this mechanism is that, once the function returns, you can no longer access any memory\naddress that was inside the space in the stack reserved for this particular function. Because this space was\ndestroyed. This means that, if this local object is stored in the stack,\nyou cannot make a function that **returns a pointer to this object**.\n\nThink about that for a second. If all local objects in the stack are destroyed at the end of the function scope, why\nwould you even consider returning a pointer to one of these objects? This pointer is at best,\ninvalid, or, more likely, \"undefined\".\n\nConclusion, is totally fine to write a function that returns the local object\nitself as result, because then, you return the value of that object as the result.\nBut, if this local object is stored in the stack, you should never write a function\nthat returns a pointer to this local object. Because the memory address pointed by the pointer\nno longer exists.\n\n\nSo, using again the `add()` function as an example, if you rewrite this function so that it\nreturns a pointer to the local object `result`, the `zig` compiler will actually compile\nyou program, with no warnings or erros. At first glance, it looks that this is good code\nthat works as expected. 
But this is a lie!\n\nIf you try to take a look at the value inside of the `r` object,\nor, if you try to use this `r` object in another expression\nor function call, then, you would have undefined behaviour, and major\nbugs in your program [@zigdocs, see \"Lifetime and Ownership\"[^life] and \"Undefined Behaviour\"[^undef] sections].\n\n[^life]: \n[^undef]: \n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This code compiles succesfully. But it has\n// undefined behaviour. Never do this!!!\n\n// The `r` object is undefined!\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) *const u8 {\n const result = x + y;\n return &result;\n}\n```\n:::\n\n\nThis \"invalid pointer to stack variable\" problem is very known across many programming language communities.\nIf you try to do the same thing, for example, in a C or C++ program (i.e. returning an address to\na local object stored in the stack), you would also get undefined behaviour\nin the program.\n\n::: {.callout-important}\nIf a local object in your function is stored in the stack, you should never\nreturn a pointer to this local object from the function. Because\nthis pointer will always become undefined after the function returns, since the stack space of the function\nis destroyed at the end of it's scope.\n:::\n\nBut what if you really need to use this local object in some way after your function returns?\nHow can you do this? The answer is: \"in the same you would do if this was a C or C++ program. By returning\nan address to an object stored in the heap\". The heap memory have a much more flexible lifecycle,\nand allows you to get a valid pointer to a local object of a function that already returned\nfrom it's scope.\n\n\n### Heap {#sec-heap}\n\nOne important limitation of the stack, is that, only objects whose length/size is known at compile-time can be\nstored in it. In contrast, the heap is a much more dynamic\n(and flexible) type of memory. 
It is the perfect type of memory to use\non objects whose size/length might grow during the execution of your program.\n\nVirtually any application that behaves as a server is a classic use case of the heap.\nA HTTP server, a SSH server, a DNS server, a LSP server, ... any type of server.\nIn summary, a server is a type of application that runs for long periods of time,\nand that serves (or \"deals with\") any incoming request that reaches this particular server.\n\nThe heap is a good choice for this type of system, mainly because the server does not know upfront\nhow many requests it will receive from users, while it is active. It could be one single request,\nor, 5 thousand requests, or, it could also be zero requests.\nThe server needs to have the ability to allocate and manage it's memory according to how many requests it receives.\n\nAnother key difference between the stack and the heap, is that the heap is a type\nof memory that you, the programmer, have complete control over. This makes the heap a\nmore flexible type of memory, but it also makes it harder to work with it. Because you,\nthe programmer, is responsible for managing everything related to it. Including where the memory is allocated,\nhow much memory is allocated, and where this memory is freed.\n\n> Unlike stack memory, heap memory is allocated explicitly by programmers and it won’t be deallocated until it is explicitly freed [@jenny2022].\n\nTo store an object in the heap, you, the programmer, needs to explicitly tells Zig to do so,\nby using an allocator to allocate some space in the heap. At @sec-allocators, I will present how you can use allocators to allocate memory\nin Zig.\n\n::: {.callout-important}\nEvery memory you allocate in the heap needs to be explicitly freed by you, the programmer.\n:::\n\nThe majority of allocators in Zig do allocate memory on the heap. But some exceptions to this rule are\n`ArenaAllocator()` and `FixedBufferAllocator()`. 
The `ArenaAllocator()` is a special\ntype of allocator that works in conjunction with a second type of allocator.\nOn the other side, the `FixedBufferAllocator()` is an allocator that works based on\nbuffer objects created on the stack. This means that the `FixedBufferAllocator()` makes\nallocations only on the stack.\n\n\n\n\n### Summary\n\nAfter discussing all of these boring details, we can quickly recap what we learned.\nIn summary, the Zig compiler will use the following rules to decide where each\nobject you declare is stored:\n\n1. every literal value (such as `\"this is string\"`, `10`, or `true`) is stored in the global data section.\n\n1. every constant object (`const`) whose value **is known at compile-time** is also stored in the global data section.\n\n1. every object (constant or not) whose length/size **is known at compile time** is stored in the stack space for the current scope.\n\n1. if an object is created with the method `alloc()` or `create()` of an allocator object, this object is stored in the memory space used by this particular allocator object. Most of allocators available in Zig use the heap memory, so, this object is likely stored in the heap (`FixedBufferAllocator()` is an exception to that).\n\n1. the heap can only be accessed through allocators. If your object was not created through the `alloc()` or `create()` methods of an allocator object, then, he is most certainly not an object stored in the heap.\n\n\n## Allocators {#sec-allocators}\n\nOne key aspect about Zig, is that there are \"no hidden-memory allocations\" in Zig.\nWhat that really means, is that \"no allocations happen behind your back in the standard library\" [@zigguide].\n\nThis is a known problem, specially in C++. 
Because in C++, there are some operators that do allocate\nmemory behind the scene, and there is no way for you to known that, until you actually read the\nsource code of these operators, and find the memory allocation calls.\nMany programmers find this behaviour annoying and hard to keep track of.\n\nBut, in Zig, if a function, an operator, or anything from the standard library\nneeds to allocate some memory during it's execution, then, this function/operator needs to receive (as input) an allocator\nprovided by the user, to actually be able to allocate the memory it needs.\n\nThis creates a clear distinction between functions that \"do not\" from those that \"actually do\"\nallocate memory. Just look at the arguments of this function.\nIf a function, or operator, have an allocator object as one of it's inputs/arguments, then, you know for\nsure that this function/operator will allocate some memory during it's execution.\n\nAn example is the `allocPrint()` function from the Zig standard library. With this function, you can\nwrite a new string using format specifiers. So, this function is, for example, very similar to the function `sprintf()` in C.\nIn order to write such new string, the `allocPrint()` function needs to allocate some memory to store the\noutput string.\n\nThat is why, the first argument of this function is an allocator object that you, the user/programmer, gives\nas input to the function. In the example below, I am using the `GeneralPurposeAllocator()` as my allocator\nobject. 
But I could easily use any other type of allocator object from the Zig standard library.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst name = \"Pedro\";\nconst output = try std.fmt.allocPrint(\n allocator,\n \"Hello {s}!!!\",\n .{name}\n);\ntry stdout.print(\"{s}\\n\", .{output});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello Pedro!!!\n```\n\n\n:::\n:::\n\n\n\nYou get a lot of control\nover where and how much memory this function can allocate. Because it is you,\nthe user/programmer, that provides the allocator for the function to use.\nThis makes \"total control\" over memory management easier to achieve in Zig.\n\n### What are allocators?\n\nAllocators in Zig are objects that you can use to allocate memory for your program.\nThey are similar to the memory allocating functions in C, like `malloc()` and `calloc()`.\nSo, if you need to use more memory than you initially have, during the execution of your program, you can simply ask\nfor more memory using an allocator.\n\nZig offers different types of allocators, and they are usually available through the `std.heap` module of\nthe standard library. So, just import the Zig standard library into your Zig module (with `@import(\"std\")`), and you can start\nusing these allocators in your code.\n\nFurthermore, every allocator object is built on top of the `Allocator` interface in Zig. This\nmeans that, every allocator object you find in Zig must have the methods `alloc()`,\n`create()`, `free()` and `destroy()`. So, you can change the type of allocator you are using,\nbut you don't need to change the function calls to the methods that do the memory allocation\n(and the free memory operations) for your program.\n\n### Why you need an allocator?\n\nAs we described at @sec-stack, everytime you make a function call in Zig,\na space in the stack is reserved for this function call. 
But the stack\nhave a key limitation which is: every object stored in the stack have a\nknown fixed length.\n\nBut in reality, there are two very commom instances where this \"fixed length limitation\" of the stack is a deal braker:\n\n1. the objects that you create inside your function might grow in size during the execution of the function.\n\n2. sometimes, it is impossible to know upfront how many inputs you will receive, or how big this input will be.\n\nAlso, there is another instance where you might want to use an allocator, which is when you want to write a function that returns a pointer\nto a local object. As I described at @sec-stack, you cannot do that if this local object is stored in the\nstack. However, if this object is stored in the heap, then, you can return a pointer to this object at the\nend of the function. Because you (the programmer) control the lyfetime of any heap memory that you allocate. You decide\nwhen this memory get's destroyed/freed.\n\nThese are commom situations where the stack is not good for.\nThat is why you need a different memory management strategy to\nstore these objects inside your function. You need to use\na memory type that can grow together with your objects, or that you\ncan control the lyfetime of this memory.\nThe heap fit this description.\n\nAllocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size\nduring the execution of your program, you grow the amount of memory\nyou have by allocating more memory in the heap to store these objects. 
\nAnd you do that in Zig by using an allocator object.\n\n\n### The different types of allocators\n\n\nAt the time of writing this book, Zig has 6 different\nallocators available in the standard library:\n\n- `GeneralPurposeAllocator()`.\n- `page_allocator()`.\n- `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()`.\n- `ArenaAllocator()`.\n- `c_allocator()` (requires you to link to libc).\n\n\nEach allocator has its own perks and limitations. All allocators, except `FixedBufferAllocator()` and `ArenaAllocator()`,\nare allocators that use the heap memory. So any memory that you allocate with\nthese allocators will be placed in the heap.\n\n### General-purpose allocators\n\nThe `GeneralPurposeAllocator()`, as the name suggests, is a \"general purpose\" allocator. You can use it for every type\nof task. In the example below, I'm allocating enough space to store a single integer in the object `some_number`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const some_number = try allocator.create(u32);\n defer allocator.destroy(some_number);\n\n some_number.* = @as(u32, 45);\n}\n```\n:::\n\n\n\nWhile the general-purpose allocator is useful, you might instead want to use the `c_allocator()`, which is an alias to the C standard allocator `malloc()`. So, yes, you can use\n`malloc()` in Zig if you want to. Just use the `c_allocator()` from the Zig standard library. However,\nif you do use `c_allocator()`, you must link to Libc when compiling your source code with the\n`zig` compiler, by including the flag `-lc` in your compilation process.\nIf you do not link your source code to Libc, Zig will not be able to find the\n`malloc()` implementation in your system.\n\n### Page allocator\n\nThe `page_allocator()` is an allocator that allocates full pages of memory in the heap. 
In other words,\nevery time you allocate memory with `page_allocator()`, a full page of memory in the heap is allocated,\ninstead of just a small piece of it.\n\nThe size of this page depends on the system you are using.\nMost systems use a page size of 4KB in the heap, so, that is the amount of memory that is normally\nallocated in each call by `page_allocator()`. That is why, `page_allocator()` is considered a\nfast, but also \"wasteful\" allocator in Zig. Because it allocates a big amount of memory\nin each call, and you most likely will not need that much memory in your program.\n\n### Buffer allocators\n\nThe `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()` are allocator objects that\nwork with a fixed sized buffer that is stored in the stack. So these two allocators only allocates\nmemory in the stack. This also means that, in order to use these allocators, you must first\ncreate a buffer object, and then, give this buffer as an input to these allocators.\n\nIn the example below, I am creating a `buffer` object that is 10 elements long.\nNotice that I give this `buffer` object to the `FixedBufferAllocator()` constructor.\nNow, because this `buffer` object is 10 elements long, this means that I am limited to this space.\nI cannot allocate more than 10 elements with this allocator object. If I try to\nallocate more than that, the `alloc()` method will return an `OutOfMemory` error value.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar buffer: [10]u8 = undefined;\nfor (0..buffer.len) |i| {\n buffer[i] = 0; // Initialize to zero\n}\n\nvar fba = std.heap.FixedBufferAllocator.init(&buffer);\nconst allocator = fba.allocator();\nconst input = try allocator.alloc(u8, 5);\ndefer allocator.free(input);\n```\n:::\n\n\n\n### Arena allocator {#sec-arena-allocator}\n\nThe `ArenaAllocator()` is an allocator object that takes a child allocator as input. 
The idea behind the `ArenaAllocator()` in Zig\nis similar to the concept of \"arenas\" in the programming language Go[^go-arena]. It is an allocator object that allows you\nto allocate memory as many times you want, but free all memory only once.\nIn other words, if you have, for example, called 5 times the method `alloc()` of an `ArenaAllocator()` object, you can\nfree all the memory you allocated over these 5 calls at once, by simply calling the `deinit()` method of the same `ArenaAllocator()` object.\n\n[^go-arena]: \n\nIf you give, for example, a `GeneralPurposeAllocator()` object as input to the `ArenaAllocator()` constructor, like in the example below, then, the allocations\nyou perform with `alloc()` will actually be made with the underlying object `GeneralPurposeAllocator()` that was passed.\nSo, with an arena allocator, any new memory you ask for is allocated by the child allocator. The only thing that an arena allocator\nreally do is helping you to free all the memory you allocated multiple times with just a single command. In the example\nbelow, I called `alloc()` 3 times. So, if I did not used an arena allocator, then, I would need to call\n`free()` 3 times to free all the allocated memory.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nvar aa = std.heap.ArenaAllocator.init(gpa.allocator());\ndefer aa.deinit();\nconst allocator = aa.allocator();\n\nconst in1 = allocator.alloc(u8, 5);\nconst in2 = allocator.alloc(u8, 10);\nconst in3 = allocator.alloc(u8, 15);\n_ = in1; _ = in2; _ = in3;\n```\n:::\n\n\n\n\n### The `alloc()` and `free()` methods\n\nIn the code example below, we are accessing the `stdin`, which is\nthe standard input channel, to receive an input from the\nuser. We read the input given by the user with the `readUntilDelimiterOrEof()`\nmethod.\n\nNow, after reading the input of the user, we need to store this input somewhere in\nour program. That is why I use an allocator in this example. 
I use it to allocate some\namount of memory to store this input given by the user. More specifically, the method `alloc()`\nof the allocator object is used to allocate an array capable of storing 50 `u8` values.\n\nNotice that this `alloc()` method receives two inputs. The first one, is a type.\nThis defines what type of values the allocated array will store. In the example\nbelow, we are allocating an array of unsigned 8-bit integers (`u8`). But\nyou can create an array to store any type of value you want. Next, on the second argument, we\ndefine the size of the allocated array, by specifying how much elements\nthis array will contain. In the case below, we are allocating an array of 50 elements.\n\nAt @sec-zig-strings we described that strings in Zig are simply arrays of characters.\nEach character is represented by an `u8` value. So, this means that the array that\nwas allocated in the object `input` is capable of storing a string that is\n50-characters long.\n\nSo, in essence, the expression `var input: [50]u8 = undefined` would create\nan array for 50 `u8` values in the stack of the current scope. 
But, you\ncan allocate the same array in the heap by using the expression `var input = try allocator.alloc(u8, 50)`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdin = std.io.getStdIn();\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var input = try allocator.alloc(u8, 50);\n defer allocator.free(input);\n for (0..input.len) |i| {\n input[i] = 0; // initialize all fields to zero.\n }\n // read user input\n const input_reader = stdin.reader();\n _ = try input_reader.readUntilDelimiterOrEof(\n input,\n '\\n'\n );\n std.debug.print(\"{s}\\n\", .{input});\n}\n```\n:::\n\n\nAlso, notice that in this example, we use the keyword `defer` to run a small\npiece of code at the end of the current scope, which is the expression `allocator.free(input)`.\nWhen you execute this expression, the allocator will free the memory that it allocated\nfor the `input` object.\n\nWe have talked about this at @sec-heap. You **should always** explicitly free any memory that you allocate\nusing an allocator! You do that by using the `free()` method of the same allocator object you\nused to allocate this memory. The `defer` keyword is used in this example only to help us execute\nthis free operation at the end of the current scope.\n\n\n### The `create()` and `destroy()` methods\n\nWith the `alloc()` and `free()` methods, you can allocate memory to store multiple elements\nat once. In other words, with these methods, we always allocate an array to store multiple elements at once.\nBut what if you need enough space to store just a single item? Should you\nallocate an array of a single element through `alloc()`?\n\nThe answer is no! 
In this case,\nyou should use the `create()` method of the allocator object.\nEvery allocator object offers the `create()` and `destroy()` methods,\nwhich are used to allocate and free memory for a single item, respectively.\n\nSo, in essence, if you want to allocate memory to store an array of elements, you\nshould use `alloc()` and `free()`. But if you need to store just a single item,\nthen, the `create()` and `destroy()` methods are ideal for you.\n\nIn the example below, I'm defining a struct to represent an user of some sort.\nIt could be an user for a game, or a software to manage resources, it doesn't mater.\nNotice that I use the `create()` method this time, to store a single `User` object\nin the program. Also notice that I use the `destroy()` method to free the memory\nused by this object at the end of the scope.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst User = struct {\n id: usize,\n name: []const u8,\n\n pub fn init(id: usize, name: []const u8) User {\n return .{ .id = id, .name = name };\n }\n};\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const user = try allocator.create(User);\n defer allocator.destroy(user);\n\n user.* = User.init(0, \"Pedro\");\n}\n```\n:::\n", + "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n# Memory and Allocators in Zig\n\n\nIn this chapter, we will talk about memory. How does Zig controls memory? What\ncommom tools are used? Are there any important aspect that makes memory\ndifferent/special in Zig? You will find the answers here.\n\nEvery computer needs memory. Is by having memory that computers can temporarily store\nthe values/results of your calculations. 
Without memory, programming languages would never have\nconcepts such as \"variables\", or \"objects\", to store the values that you generate.\n\n\n## Memory spaces\n\nEvery object that you create in your Zig source code needs to be stored somewhere,\nin your computer's memory. Depending on where and how you define your object, Zig\nwill use a different \"memory space\", or a different\ntype of memory to store this object.\n\nEach type of memory normally serves for different purposes.\nIn Zig, there are 3 types of memory (or 3 different memory spaces) that we care about. They are:\n\n- Global data register (or the \"global data section\");\n- Stack;\n- Heap;\n\n\n### Compile-time known versus runtime known {#sec-compile-time}\n\nOne strategy that Zig uses to decide where it will store each object that you declare, is by looking\nat the value of this particular object. More specifically, by investigating if this value is\nknown at \"compile-time\" or at \"runtime\".\n\nWhen you write a program in Zig, the values of some of the objects that you write in your program are *known\nat compile time*. Meaning that, when you compile your Zig source code, during the compilation process,\nthe `zig` compiler can figure it out what is the exact value of a particular object\nthat exists in your source code.\nKnowing the length (or the size) of each object is also important. So the length (or the size) of each object that you write in your program is,\nin some cases, *known at compile time*.\n\nThe `zig` compiler cares more about knowing the length (or the size) of a particular object\n, than to know it's actual value. But, if the `zig` compiler knows the value of the object, then, it\nautomatically knows the size of this object. 
Because it can simply calculate the\nsize of the object by looking at the size of the value.\n\nTherefore, the priority for the `zig` compiler is to discover the size of each object in your source code.\nIf the value of the object in question is known at compile-time, then, the `zig` compiler\nautomatically knows the size/length of this object. But if the value of this object is not\nknown at compile-time, then, the size of this object is only known at compile-time if,\nand only if, the type of this object have a known fixed size.\n\nIn order to a type have a known fixed size, this type must have data members whose size is fixed.\nIf this type includes, for example, a variable sized array in it, then, this type do not have a known\nfixed size. Because this array can have any size at runtime\n(i.e. it can be an array of 2 elements, or 50 elements, or 1 thousand elements, etc.).\n\nFor example, a string object, which internally is an array of constant u8 values (`[]const u8`)\nhave a variable size. It can be a string object with 100 or 500 characters in it. If we do not\nknow at compile-time, which exact string will be stored inside this string object, then, we cannot calculate\nthe size of this string object at compile-time. So, any type, or any struct declaration that you make, that\nincludes a string data member that do not have an explicit fixed size, makes this type, or this\nnew struct that you are declaring, a type that do not have a known fixed size at compile-time.\n\nIn contrast, if the type or this struct that you are declaring, includes a data member that is an array,\nbut this array have a known fixed size, like `[60]u8` (which declares an array of 60 `u8` values), then,\nthis type, or, this struct that you are declaring, becomes a type with a known fixed size at compile-time.\nAnd because of that, in this case, the `zig` compiler do not need to known at compile-time the exact value of\nany object of this type. 
Since the compiler can find the necessary size to store this object by\nlooking at the size of it's type.\n\n\nLet's look at an example. In the source code below, we have two constant objects (`name` and `array`) declared.\nBecause the values of these particular objects are written down, in the source code itself (`\"Pedro\"`\nand the number sequence from 1 to 4), the `zig` compiler can easily discover the values of these constant\nobjects (`name` and `array`) during the compilation process.\nThis is what \"known at compile time\" means. It refers to any object that you have in your Zig source code\nwhose value can be identified at compile time.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = \"Pedro\";\nconst array = [_]u8{1, 2, 3, 4};\n_ = name; _ = array;\n\nfn input_length(input: []const u8) usize {\n const n = input.len;\n return n;\n}\n```\n:::\n\n\nThe other side of the spectrum are objects whose values are not known at compile time.\nFunction arguments are a classic example of this. Because the value of each function\nargument depends on the value that you assign to this particular argument,\nwhen you call the function.\n\nFor example, the function `input_length()` contains an argument named `input`, which is an array of constant `u8` integers (`[]const u8`).\nIs impossible to know at compile time the value of this particular argument. And it also is impossible to know the size/length\nof this particular argument. Because it is an array that do not have a fixed size specified explicitly in the argument type annotation.\n\nSo, we know that this `input` argument will be an array of `u8` integers. But we do not know at compile-time, it's value, and neither his size.\nThis information is known only at runtime, which is the period of time when you program is executed.\nAs a consequence, the value of the expression `input.len` is also known only at runtime.\nThis is an intrinsic characteristic of any function. 
Just remember that the value of function arguments is usually not \"compile-time known\".\n\nHowever, as I mentioned earlier, what really matters to the compiler is to know the size of the object\nat compile-time, and not necessarily it's value. So, although we don't know the value of the object `n`, which is the result of the expression\n`input.len`, at compile-time, we do know it's size. Because the expression `input.len` always return a value of type `usize`,\nand the type `usize` have a known fixed size.\n\n\n\n### Global data register\n\nThe global data register is a specific section of the executable of your Zig program, that is responsible\nfor storing any value that is known at compile time.\n\nEvery constant object whose value is known at compile time that you declare in your source code,\nis stored in the global data register. Also, every literal value that you write in your source code,\nsuch as the string `\"this is a string\"`, or the integer `10`, or a boolean value such as `true`,\nis also stored in the global data register.\n\nHonestly, you don't need to care much about this memory space. Because you can't control it,\nyou can't deliberately access it or use it for your own purposes.\nAlso, this memory space does not affect the logic of your program.\nIt simply exists in your program.\n\n\n### Stack vs Heap\n\nIf you are familiar with system's programming, or just low-level programming in general, you\nprobably have heard of the \"duel\" between Stack vs Heap. These are two different types of memory,\nor different memory spaces, which are both available in Zig.\n\nThese two types of memory don't actually duel with\neach other. This is a commom mistake that beginners have, when seeing \"x vs y\" styles of\ntabloid headlines. 
These two types of memory are actually complementary to each other.\nSo, in almost every Zig program that you ever write, you will likely use a combination of both.\nI will describe each memory space in detail over the next sections. But for now, I just want to\nestablish the main difference between these two types of memory.\n\nIn essence, the stack memory is normally used to store values whose length is fixed and known\nat compile time. In contrast, the heap memory is a *dynamic* type of memory space, meaning that it is\nused to store values whose length might grow during the execution (runtime) of your program [@jenny2022].\n\nLengths that grow during runtime are intrinsically associated with \"runtime known\" types of values.\nIn other words, if you have an object whose length might grow during runtime, then the length\nof this object is not known at compile time. If the length is not known at compile-time,\nthe value of this object is also not known at compile-time.\nThese types of objects should be stored in the heap memory space, which is\na dynamic memory space that can grow or shrink to fit the size of your objects.\n\n\n\n### Stack {#sec-stack}\n\nThe stack is a type of memory that uses the power of the *stack data structure*, hence the name. \nA \"stack\" is a type of *data structure* that uses a \"last in, first out\" (LIFO) mechanism to store the values\nyou give to it. 
I imagine you are familiar with this data structure.\nBut, if you are not, the [Wikipedia page](https://en.wikipedia.org/wiki/Stack_(abstract_data_type))[^wiki-stack]\n, or, the [Geeks For Geeks page](https://www.geeksforgeeks.org/stack-data-structure/)[^geek-stack] are both\nexcellent and easy resources to fully understand how this data structure works.\n\n[^wiki-stack]: \n[^geek-stack]: \n\nSo, the stack memory space is a type of memory that stores values using a stack data structure.\nIt adds and removes values from the memory by following a \"last in, first out\" (LIFO) principle.\n\nEvery time you make a function call in Zig, an amount of space in the stack is\nreserved for this particular function call [@jenny2022; @zigdocs].\nThe value of each function argument given to the function in this function call is stored in this\nstack space. Also, every local object that you declare inside the function scope is\nusually stored in this same stack space.\n\n\nLooking at the example below, the object `result` is a local object declared inside the scope of the `add()`\nfunction. Because of that, this object is stored inside the stack space reserved for the `add()` function.\nThe `r` object (which is declared outside of the `add()` function scope) is also stored in the stack.\nBut since it is declared in the \"outer\" scope, this object is stored in the\nstack space that belongs to this outer scope.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) u8 {\n const result = x + y;\n return result;\n}\n```\n:::\n\n\n\nSo, any object that you declare inside the scope of a function is always stored inside\nthe space that was reserved for that particular function in the stack memory. 
This\nalso counts for any object declared inside the scope of your `main()` function for example.\nAs you would expect, in this case, they\nare stored inside the stack space reserved for the `main()` function.\n\nOne very important detail about the stack memory is that **it frees itself automatically**.\nThis is very important, remember that. When objects are stored in the stack memory,\nyou don't have the work (or the responsibility) of freeing/destroying these objects.\nBecause they will be automatically destroyed once the stack space is freed at the end of the function scope.\n\nSo, once the function call returns (or ends, if you prefer to call it this way)\nthe space that was reserved in the stack is destroyed, and all of the objects that were in that space goes away with it.\nThis mechanism exists because this space, and the objects within it, are not necessary anymore,\nsince the function \"finished it's business\".\nUsing the `add()` function that we exposed above as an example, it means that the object `result` is automatically\ndestroyed once the function returns.\n\n::: {.callout-important}\nLocal objects that are stored in the stack space of a function are automatically\nfreed/destroyed at the end of the function scope.\n:::\n\n\nThis same logic applies to any other special structure in Zig that have it's own scope by surrounding\nit with curly braces (`{}`).\nFor loops, while loops, if else statements, etc. For example, if you declare any local\nobject in the scope of a for loop, this local object is accessible only within the scope\nof this particular for loop. 
Because once the scope of this for loop ends, the space in the stack\nreserved for this for loop is freed.\nThe example below demonstrates this idea.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This does not compile succesfully!\nconst a = [_]u8{0, 1, 2, 3, 4};\nfor (0..a.len) |i| {\n const index = i;\n _ = index;\n}\n// Trying to use an object that was\n// declared in the for loop scope,\n// and that does not exist anymore.\nstd.debug.print(\"{d}\\n\", index);\n```\n:::\n\n\n\n\nOne important consequence of this mechanism is that, once the function returns, you can no longer access any memory\naddress that was inside the space in the stack reserved for this particular function. Because this space was\ndestroyed. This means that, if this local object is stored in the stack,\nyou cannot make a function that **returns a pointer to this object**.\n\nThink about that for a second. If all local objects in the stack are destroyed at the end of the function scope, why\nwould you even consider returning a pointer to one of these objects? This pointer is at best,\ninvalid, or, more likely, \"undefined\".\n\nConclusion, is totally fine to write a function that returns the local object\nitself as result, because then, you return the value of that object as the result.\nBut, if this local object is stored in the stack, you should never write a function\nthat returns a pointer to this local object. Because the memory address pointed by the pointer\nno longer exists.\n\n\nSo, using again the `add()` function as an example, if you rewrite this function so that it\nreturns a pointer to the local object `result`, the `zig` compiler will actually compile\nyou program, with no warnings or erros. At first glance, it looks that this is good code\nthat works as expected. 
But this is a lie!\n\nIf you try to take a look at the value inside of the `r` object,\nor, if you try to use this `r` object in another expression\nor function call, then, you would have undefined behaviour, and major\nbugs in your program [@zigdocs, see \"Lifetime and Ownership\"[^life] and \"Undefined Behaviour\"[^undef] sections].\n\n[^life]: \n[^undef]: \n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// This code compiles succesfully. But it has\n// undefined behaviour. Never do this!!!\n\n// The `r` object is undefined!\nconst r = add(5, 27);\n_ = r;\n\nfn add(x: u8, y: u8) *const u8 {\n const result = x + y;\n return &result;\n}\n```\n:::\n\n\nThis \"invalid pointer to stack variable\" problem is very known across many programming language communities.\nIf you try to do the same thing, for example, in a C or C++ program (i.e. returning an address to\na local object stored in the stack), you would also get undefined behaviour\nin the program.\n\n::: {.callout-important}\nIf a local object in your function is stored in the stack, you should never\nreturn a pointer to this local object from the function. Because\nthis pointer will always become undefined after the function returns, since the stack space of the function\nis destroyed at the end of it's scope.\n:::\n\nBut what if you really need to use this local object in some way after your function returns?\nHow can you do this? The answer is: \"in the same you would do if this was a C or C++ program. By returning\nan address to an object stored in the heap\". The heap memory have a much more flexible lifecycle,\nand allows you to get a valid pointer to a local object of a function that already returned\nfrom it's scope.\n\n\n### Heap {#sec-heap}\n\nOne important limitation of the stack, is that, only objects whose length/size is known at compile-time can be\nstored in it. In contrast, the heap is a much more dynamic\n(and flexible) type of memory. 
It is the perfect type of memory to use\non objects whose size/length might grow during the execution of your program.\n\nVirtually any application that behaves as a server is a classic use case of the heap.\nA HTTP server, a SSH server, a DNS server, a LSP server, ... any type of server.\nIn summary, a server is a type of application that runs for long periods of time,\nand that serves (or \"deals with\") any incoming request that reaches this particular server.\n\nThe heap is a good choice for this type of system, mainly because the server does not know upfront\nhow many requests it will receive from users, while it is active. It could be one single request,\nor, 5 thousand requests, or, it could also be zero requests.\nThe server needs to have the ability to allocate and manage it's memory according to how many requests it receives.\n\nAnother key difference between the stack and the heap, is that the heap is a type\nof memory that you, the programmer, have complete control over. This makes the heap a\nmore flexible type of memory, but it also makes it harder to work with it. Because you,\nthe programmer, is responsible for managing everything related to it. Including where the memory is allocated,\nhow much memory is allocated, and where this memory is freed.\n\n> Unlike stack memory, heap memory is allocated explicitly by programmers and it won’t be deallocated until it is explicitly freed [@jenny2022].\n\nTo store an object in the heap, you, the programmer, needs to explicitly tells Zig to do so,\nby using an allocator to allocate some space in the heap. At @sec-allocators, I will present how you can use allocators to allocate memory\nin Zig.\n\n::: {.callout-important}\nEvery memory you allocate in the heap needs to be explicitly freed by you, the programmer.\n:::\n\nThe majority of allocators in Zig do allocate memory on the heap. But some exceptions to this rule are\n`ArenaAllocator()` and `FixedBufferAllocator()`. 
The `ArenaAllocator()` is a special\ntype of allocator that works in conjunction with a second type of allocator.\nOn the other side, the `FixedBufferAllocator()` is an allocator that works based on\nbuffer objects created on the stack. This means that the `FixedBufferAllocator()` makes\nallocations only on the stack.\n\n\n\n\n### Summary\n\nAfter discussing all of these boring details, we can quickly recap what we learned.\nIn summary, the Zig compiler will use the following rules to decide where each\nobject you declare is stored:\n\n1. every literal value (such as `\"this is string\"`, `10`, or `true`) is stored in the global data section.\n\n1. every constant object (`const`) whose value **is known at compile-time** is also stored in the global data section.\n\n1. every object (constant or not) whose length/size **is known at compile time** is stored in the stack space for the current scope.\n\n1. if an object is created with the method `alloc()` or `create()` of an allocator object, this object is stored in the memory space used by this particular allocator object. Most of allocators available in Zig use the heap memory, so, this object is likely stored in the heap (`FixedBufferAllocator()` is an exception to that).\n\n1. the heap can only be accessed through allocators. If your object was not created through the `alloc()` or `create()` methods of an allocator object, then, he is most certainly not an object stored in the heap.\n\n\n## Allocators {#sec-allocators}\n\nOne key aspect about Zig, is that there are \"no hidden-memory allocations\" in Zig.\nWhat that really means, is that \"no allocations happen behind your back in the standard library\" [@zigguide].\n\nThis is a known problem, specially in C++. 
Because in C++, there are some operators that do allocate\nmemory behind the scene, and there is no way for you to known that, until you actually read the\nsource code of these operators, and find the memory allocation calls.\nMany programmers find this behaviour annoying and hard to keep track of.\n\nBut, in Zig, if a function, an operator, or anything from the standard library\nneeds to allocate some memory during it's execution, then, this function/operator needs to receive (as input) an allocator\nprovided by the user, to actually be able to allocate the memory it needs.\n\nThis creates a clear distinction between functions that \"do not\" from those that \"actually do\"\nallocate memory. Just look at the arguments of this function.\nIf a function, or operator, have an allocator object as one of it's inputs/arguments, then, you know for\nsure that this function/operator will allocate some memory during it's execution.\n\nAn example is the `allocPrint()` function from the Zig standard library. With this function, you can\nwrite a new string using format specifiers. So, this function is, for example, very similar to the function `sprintf()` in C.\nIn order to write such new string, the `allocPrint()` function needs to allocate some memory to store the\noutput string.\n\nThat is why, the first argument of this function is an allocator object that you, the user/programmer, gives\nas input to the function. In the example below, I am using the `GeneralPurposeAllocator()` as my allocator\nobject. 
But I could easily use any other type of allocator object from the Zig standard library.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst name = \"Pedro\";\nconst output = try std.fmt.allocPrint(\n allocator,\n \"Hello {s}!!!\",\n .{name}\n);\ntry stdout.print(\"{s}\\n\", .{output});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello Pedro!!!\n```\n\n\n:::\n:::\n\n\n\nYou get a lot of control\nover where and how much memory this function can allocate. Because it is you,\nthe user/programmer, that provides the allocator for the function to use.\nThis makes \"total control\" over memory management easier to achieve in Zig.\n\n### What are allocators?\n\nAllocators in Zig are objects that you can use to allocate memory for your program.\nThey are similar to the memory allocating functions in C, like `malloc()` and `calloc()`.\nSo, if you need to use more memory than you initially have, during the execution of your program, you can simply ask\nfor more memory using an allocator.\n\nZig offers different types of allocators, and they are usually available through the `std.heap` module of\nthe standard library. So, just import the Zig standard library into your Zig module (with `@import(\"std\")`), and you can start\nusing these allocators in your code.\n\nFurthermore, every allocator object is built on top of the `Allocator` interface in Zig. This\nmeans that, every allocator object you find in Zig must have the methods `alloc()`,\n`create()`, `free()` and `destroy()`. So, you can change the type of allocator you are using,\nbut you don't need to change the function calls to the methods that do the memory allocation\n(and the free memory operations) for your program.\n\n### Why you need an allocator?\n\nAs we described at @sec-stack, everytime you make a function call in Zig,\na space in the stack is reserved for this function call. 
But the stack\nhas a key limitation, which is: every object stored in the stack has a\nknown fixed length.\n\nBut in reality, there are two very common instances where this \"fixed length limitation\" of the stack is a deal breaker:\n\n1. the objects that you create inside your function might grow in size during the execution of the function.\n\n2. sometimes, it is impossible to know upfront how many inputs you will receive, or how big this input will be.\n\nAlso, there is another instance where you might want to use an allocator, which is when you want to write a function that returns a pointer\nto a local object. As I described at @sec-stack, you cannot do that if this local object is stored in the\nstack. However, if this object is stored in the heap, then you can return a pointer to this object at the\nend of the function, because you (the programmer) control the lifetime of any heap memory that you allocate. You decide\nwhen this memory gets destroyed/freed.\n\nThese are common situations that the stack is not good for.\nThat is why you need a different memory management strategy to\nstore these objects inside your function. You need to use\na memory type that can grow together with your objects, or whose\nlifetime you can control.\nThe heap fits this description.\n\nAllocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size\nduring the execution of your program, you grow the amount of memory\nyou have by allocating more memory in the heap to store these objects. 
\nAnd you do that in Zig by using an allocator object.\n\n\n### The different types of allocators\n\n\nAt the time of writing this book, in Zig, we have 6 different\nallocators available in the standard library:\n\n- `GeneralPurposeAllocator()`.\n- `page_allocator()`.\n- `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()`.\n- `ArenaAllocator()`.\n- `c_allocator()` (requires you to link to libc).\n\n\nEach allocator has its own perks and limitations. All allocators, except `FixedBufferAllocator()` and `ArenaAllocator()`,\nare allocators that use the heap memory. So any memory that you allocate with\nthese allocators will be placed in the heap.\n\n### General-purpose allocators\n\nThe `GeneralPurposeAllocator()`, as the name suggests, is a \"general purpose\" allocator. You can use it for every type\nof task. In the example below, I'm allocating enough space to store a single integer in the object `some_number`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const some_number = try allocator.create(u32);\n defer allocator.destroy(some_number);\n\n some_number.* = @as(u32, 45);\n}\n```\n:::\n\n\n\nWhile the general-purpose allocator is useful, you might want to use the `c_allocator()` instead, which is an alias to the C standard allocator `malloc()`. So, yes, you can use\n`malloc()` in Zig if you want to. Just use the `c_allocator()` from the Zig standard library. However,\nif you do use `c_allocator()`, you must link to Libc when compiling your source code with the\n`zig` compiler, by including the flag `-lc` in your compilation process.\nIf you do not link your source code to Libc, Zig will not be able to find the\n`malloc()` implementation in your system.\n\n### Page allocator\n\nThe `page_allocator()` is an allocator that allocates full pages of memory in the heap. 
In other words,\nevery time you allocate memory with `page_allocator()`, a full page of memory in the heap is allocated,\ninstead of just a small piece of it.\n\nThe size of this page depends on the system you are using.\nMost systems use a page size of 4KB in the heap, so that is the amount of memory that is normally\nallocated in each call by `page_allocator()`. That is why `page_allocator()` is considered a\nfast, but also \"wasteful\" allocator in Zig: it allocates a big amount of memory\nin each call, and you most likely will not need that much memory in your program.\n\n### Buffer allocators\n\nThe `FixedBufferAllocator()` and `ThreadSafeFixedBufferAllocator()` are allocator objects that\nwork with a fixed-size buffer that is stored in the stack. So these two allocators only allocate\nmemory in the stack. This also means that, in order to use these allocators, you must first\ncreate a buffer object, and then give this buffer as an input to these allocators.\n\nIn the example below, I am creating a `buffer` object that is 10 elements long.\nNotice that I give this `buffer` object to the `FixedBufferAllocator()` constructor.\nNow, because this `buffer` object is 10 elements long, this means that I am limited to this space.\nI cannot allocate more than 10 elements with this allocator object. If I try to\nallocate more than that, the `alloc()` method will return an `OutOfMemory` error value.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar buffer: [10]u8 = undefined;\nfor (0..buffer.len) |i| {\n buffer[i] = 0; // Initialize to zero\n}\n\nvar fba = std.heap.FixedBufferAllocator.init(&buffer);\nconst allocator = fba.allocator();\nconst input = try allocator.alloc(u8, 5);\ndefer allocator.free(input);\n```\n:::\n\n\n\n### Arena allocator {#sec-arena-allocator}\n\nThe `ArenaAllocator()` is an allocator object that takes a child allocator as input. 
The idea behind the `ArenaAllocator()` in Zig\nis similar to the concept of \"arenas\" in the programming language Go[^go-arena]. It is an allocator object that allows you\nto allocate memory as many times as you want, but free all of that memory only once.\nIn other words, if you have, for example, called the `alloc()` method of an `ArenaAllocator()` object 5 times, you can\nfree all the memory you allocated over these 5 calls at once, by simply calling the `deinit()` method of the same `ArenaAllocator()` object.\n\n[^go-arena]: \n\nIf you give, for example, a `GeneralPurposeAllocator()` object as input to the `ArenaAllocator()` constructor, like in the example below, then the allocations\nyou perform with `alloc()` will actually be made with the underlying `GeneralPurposeAllocator()` object that was passed.\nSo, with an arena allocator, any new memory you ask for is allocated by the child allocator. The only thing that an arena allocator\nreally does is help you free all the memory you allocated across multiple calls with just a single command. In the example\nbelow, I called `alloc()` 3 times. So, if I had not used an arena allocator, I would need to call\n`free()` 3 times to free all the allocated memory.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nvar aa = std.heap.ArenaAllocator.init(gpa.allocator());\ndefer aa.deinit();\nconst allocator = aa.allocator();\n\nconst in1 = try allocator.alloc(u8, 5);\nconst in2 = try allocator.alloc(u8, 10);\nconst in3 = try allocator.alloc(u8, 15);\n_ = in1; _ = in2; _ = in3;\n```\n:::\n\n\n\n\n### The `alloc()` and `free()` methods\n\nIn the code example below, we are accessing the `stdin`, which is\nthe standard input channel, to receive an input from the\nuser. We read the input given by the user with the `readUntilDelimiterOrEof()`\nmethod.\n\nNow, after reading the input of the user, we need to store this input somewhere in\nour program. That is why I use an allocator in this example. 
I use it to allocate some\namount of memory to store this input given by the user. More specifically, the method `alloc()`\nof the allocator object is used to allocate an array capable of storing 50 `u8` values.\n\nNotice that this `alloc()` method receives two inputs. The first one is a type.\nThis defines what type of values the allocated array will store. In the example\nbelow, we are allocating an array of unsigned 8-bit integers (`u8`). But\nyou can create an array to store any type of value you want. Next, in the second argument, we\ndefine the size of the allocated array, by specifying how many elements\nthis array will contain. In the case below, we are allocating an array of 50 elements.\n\nAt @sec-zig-strings we described that strings in Zig are simply arrays of characters.\nEach character is represented by a `u8` value. So, this means that the array that\nwas allocated in the object `input` is capable of storing a string that is\n50 characters long.\n\nSo, in essence, the expression `var input: [50]u8 = undefined` would create\nan array for 50 `u8` values in the stack of the current scope. 
But, you\ncan allocate the same array in the heap by using the expression `var input = try allocator.alloc(u8, 50)`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdin = std.io.getStdIn();\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var input = try allocator.alloc(u8, 50);\n defer allocator.free(input);\n for (0..input.len) |i| {\n input[i] = 0; // initialize all fields to zero.\n }\n // read user input\n const input_reader = stdin.reader();\n _ = try input_reader.readUntilDelimiterOrEof(\n input,\n '\\n'\n );\n std.debug.print(\"{s}\\n\", .{input});\n}\n```\n:::\n\n\nAlso, notice that in this example, we use the `defer` keyword (which I described at @sec-defer) to run a small\npiece of code at the end of the current scope, which is the expression `allocator.free(input)`.\nWhen you execute this expression, the allocator will free the memory that it allocated\nfor the `input` object.\n\nWe have talked about this at @sec-heap. You **should always** explicitly free any memory that you allocate\nusing an allocator! You do that by using the `free()` method of the same allocator object you\nused to allocate this memory. The `defer` keyword is used in this example only to help us execute\nthis free operation at the end of the current scope.\n\n\n### The `create()` and `destroy()` methods\n\nWith the `alloc()` and `free()` methods, you can allocate memory to store multiple elements\nat once. In other words, with these methods, we always allocate an array to store multiple elements at once.\nBut what if you need enough space to store just a single item? Should you\nallocate an array of a single element through `alloc()`?\n\nThe answer is no! 
In this case,\nyou should use the `create()` method of the allocator object.\nEvery allocator object offers the `create()` and `destroy()` methods,\nwhich are used to allocate and free memory for a single item, respectively.\n\nSo, in essence, if you want to allocate memory to store an array of elements, you\nshould use `alloc()` and `free()`. But if you need to store just a single item,\nthen, the `create()` and `destroy()` methods are ideal for you.\n\nIn the example below, I'm defining a struct to represent an user of some sort.\nIt could be an user for a game, or a software to manage resources, it doesn't mater.\nNotice that I use the `create()` method this time, to store a single `User` object\nin the program. Also notice that I use the `destroy()` method to free the memory\nused by this object at the end of the scope.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst User = struct {\n id: usize,\n name: []const u8,\n\n pub fn init(id: usize, name: []const u8) User {\n return .{ .id = id, .name = name };\n }\n};\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const user = try allocator.create(User);\n defer allocator.destroy(user);\n\n user.* = User.init(0, \"Pedro\");\n}\n```\n:::\n", "supporting": [], "filters": [ "rmarkdown/pagebreak.lua" diff --git a/_freeze/Chapters/01-zig-weird/execute-results/html.json b/_freeze/Chapters/01-zig-weird/execute-results/html.json index 364ab94..3451978 100644 --- a/_freeze/Chapters/01-zig-weird/execute-results/html.json +++ b/_freeze/Chapters/01-zig-weird/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "8fe59fd82cffe3229f564996b5b93d13", + "hash": "20f6eefcc93eafa96aa4fdf6b1f6253f", "result": { "engine": "knitr", - "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n# Introducing Zig\n\nIn this chapter, I want to introduce you to the world of Zig.\nDespite it's rapidly 
growing over the last years, Zig is, still, a very young language^[New programming languages in general, take years and years to be developed.].\nAs a consequence, it's world is still very wild and to be explored.\nThis book is my attempt to help you on your personal journey for\nunderstanding and exploring the exciting world of Zig.\n\nI assume you have previous experience with some programming\nlanguage in this book, not necessarily with a low-level one.\nSo, if you have experience with Python, or Javascript, for example, is fine.\nBut, if you do have experience with low-level languages, such as C, C++, or\nRust, you will probably learn faster throughout this book.\n\n\n\n## What is Zig?\n\nZig is a modern, low-level, and general-purpose programming language. Some programmers interpret\nZig as the \"modern C language\". It is a simple language like C, but with some\nmodern features.\n\nIn the author's personal interpretation, Zig is tightly connected with \"less is more\".\nInstead of trying to become a modern language by adding more and more features,\nmany of the core improvements that Zig brings to the\ntable are actually about removing annoying and evil behaviours/features from C and C++.\nIn other words, Zig tries to be better by simplifying the language, and by having more consistent and robust behaviour.\nAs a result, analyzing, writing and debugging applications become much easier and simpler in Zig, than it is in C or C++.\n\nThis philosophy becomes clear with the following phrase from the official website of Zig:\n\n> \"Focus on debugging your application rather than debugging your programming language knowledge\".\n\nThis phrase is specially true for C++ programmers. Because C++ is a gigantic language,\nwith tons of features, and also, there are lots of different \"flavors of C++\". These elements\nare what makes C++ so much complex and hard to learn. 
Zig tries to go in the opposite direction.\nZig is a very simple language, more closely related to other simple languages such as C and Go.\n\nThe phrase above is still important for C programmers too. Because, even though C is a simple\nlanguage, it is still hard sometimes to read and understand C code. For example, pre-processor macros in\nC are an evil source of confusion. They really make it hard sometimes to debug\nC programs. Because macros are essentially a second language embedded in C that obscures\nyour C code. With macros, you are no longer 100% sure about which pieces\nof code are being sent to the compiler. They obscure the actual source code that you wrote.\n\nYou don't have macros in Zig. In Zig, the code you write is the actual code that gets compiled by the compiler.\nYou don't have evil features that obscure your code.\nYou also don't have hidden control flow happening behind the scenes. And, you also\ndon't have functions or operators from the standard library that make\nhidden memory allocations behind your back.\n\nBy being a simpler language, Zig becomes much clearer and easier to read/write,\nbut at the same time, it also achieves a much more robust state, with more consistent\nbehaviour in edge situations. 
Once again, less is more.\n\n\n## Hello world in Zig\n\nWe begin our journey in Zig by creating a small \"Hello World\" program.\nTo start a new Zig project in your computer, you simply call the `init` command\nfrom the `zig` compiler.\nJust create a new directory in your computer, then, init a new Zig project\ninside this directory, like this:\n\n```bash\nmkdir hello_world\ncd hello_world\nzig init\n```\n\n```\ninfo: created build.zig\ninfo: created build.zig.zon\ninfo: created src/main.zig\ninfo: created src/root.zig\ninfo: see `zig build --help` for a menu of options\n```\n\n### Understanding the project files {#sec-project-files}\n\nAfter you run the `init` command from the `zig` compiler, some new files\nare created inside of your current directory. First, a \"source\" (`src`) directory\nis created, containing two files, `main.zig` and `root.zig`. Each `.zig` file\nis a separate Zig module, which is simply a text file that contains some Zig code.\n\n\nThe `main.zig` file for example, contains a `main()` function, which represents\nthe entrypoint of your program. It is where the execution of your program begins.\nAs you would expect from a C, C++, Rust or Go,\nto build an executabe program in Zig, you also need to declare a `main()` function in your module.\nSo, the `main.zig` module represents an executable program written in Zig.\n\nOn the other side, the `root.zig` module does not contain a `main()` function. Because\nit represents a library written in Zig. 
Libraries are different than executables.\nThey don't need to have an entrypoint to work.\nSo, you can choose which file (`main.zig` or `root.zig`) you want to follow depending on which type\nof project (executable or library) you want to develop.\n\n```bash\ntree .\n```\n\n```\n.\n├── build.zig\n├── build.zig.zon\n└── src\n ├── main.zig\n └── root.zig\n\n1 directory, 4 files\n```\n\n\nNow, in addition to the source directory, two other files were created in our working directory:\n`build.zig` and `build.zig.zon`. The first file (`build.zig`) represents a build script written in Zig.\nThis script is executed when you call the `build` command from the `zig` compiler.\nIn other words, this file contain Zig code that executes the necessary steps to build the entire project.\n\nIn general, low-level languages normally use a compiler to build your\nsource code into binary executables or binary libraries.\nNevertheless, this process of compiling your source code and building\nbinary executables or binary libraries from it, became a real challenge\nin the programming world, once the projects became bigger and bigger.\nAs a result, programmers created \"build systems\", which are a second set of tools designed to make this process\nof compiling and building complex projects, easier.\n\nExamples of build systems are CMake, GNU Make, GNU Autoconf and Ninja,\nwhich are used to build complex C and C++ projects.\nWith these systems, you can write scripts, which are called \"build scripts\".\nThey simply are scripts that describes the necessary steps to compile/build\nyour project.\n\nHowever, these are separate tools, that do not\nbelong to C/C++ compilers, like `gcc` or `clang`.\nAs a result, in C/C++ projects, you have not only to install and\nmanage your C/C++ compilers, but you also have to install and manage\nthese build systems separately.\n\nBut instead of using a separate build system, in Zig, we use the\nZig language itself to write build scripts.\nIn other words, Zig 
contains a native build system in it. And\nwe can use this build system to write small scripts in Zig,\nwhich describes the necessary steps to build/compile our Zig project[^zig-build-system].\nSo, everything you need to build a complex Zig project is the\n`zig` compiler, and nothing more.\n\n[^zig-build-system]: .\n\n\nNow that we described this topic in more depth, let's focus\non the second generated file (`build.zig.zon`), which is the Zig package manager configuration file,\nwhere you can list and manage the dependencies of your project. Yes, Zig have\na package manager (like `pip` in Python, `cargo` in Rust, or `npm` in Javascript) called Zon,\nand this `build.zig.zon` file is similar to the `package.json` file\nin Javascript projects, or, the `Pipfile` in Python projects.\n\n\n### Looking at the `root.zig` file {#sec-root-file}\n\nLet's take a look at the `root.zig` file, and start to analyze some of the\nsyntax of Zig.\nThe first thing that you might notice, is that every line of code\nthat have an expression in it, ends with a semicolon character (`;`). This is\nsimilar syntax to other languages such as C, C++ and Rust,\nwhich have the same rule.\n\nAlso, notice the `@import()` call at the first line. We use this built-in function\nto import functionality from other Zig modules into our current module.\nIn other words, the `@import()` function works similarly to the `#include` pre-processor\nin C or C++, or, to the `import` statement in Python or Javascript code.\nIn this example, we are importing the `std` module,\nwhich gives you access to the Zig standard library.\n\nIn this `root.zig` file, we can also see how assignments (i.e. creating new objects)\nare made in Zig. You can create a new object in Zig by using the following syntax\n`(const|var) name = value;`. In the example below, we are creating two constant\nobjects (`std` and `testing`). 
At @sec-assignments we talk more about objects in general.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst testing = std.testing;\n\nexport fn add(a: i32, b: i32) i32 {\n return a + b;\n}\n```\n:::\n\n\nFunctions in Zig are declared similarly to functions in Rust, using the `fn` keyword. In the example above,\nwe are declaring a function called `add()`, which have two arguments named `a` and `b`, and returns\na integer number (`i32`) as result.\n\nMaybe Zig is not exactly a strongly-typed language, because you do not need\nnecessarily to specify the type of every single object you create across your source code.\nBut you do have to explicitly specify the type of every function argument, and also,\nthe return type of every function you create in Zig. So, at least in function declarations,\nZig is a strongly-typed language.\n\nWe specify the type of an object or a function argument in Zig, by\nusing a colon character (`:`) followed by the type after the name of this object/function argument.\nWith the expressions `a: i32` and `b: i32`, we know that, both `a` and `b` arguments have type `i32`,\nwhich is a signed 32 bit integer. In this part,\nthe syntax in Zig is identical to the syntax in Rust, which also specifies types by\nusing the colon character.\n\nLastly, we have the return type of the function at the end of the line, before we open\nthe curly braces to start writing the function's body, which, in the example above is\nagain a signed 32 bit integer (`i32`) value. This specific part is different than it is in Rust.\nBecause in Rust, the return type of a function is specified after an arrow (`->`).\nWhile in Zig, we simply declare the return type directly after the parentheses with the function arguments.\n\nWe also have an `export` keyword before the function declaration. This keyword\nis similar to the `extern` keyword in C. 
It exposes the function\nto make it available in the library API.\n\nIn other words, if you have a project where you are currently building\na library for other people to use, you need to expose your functions\nso that they are available in the library's API, so that users can use it.\nIf we removed the `export` keyword from the `add()` function declaration,\nthen, this function would be no longer exposed in the library object built\nby the `zig` compiler.\n\n\nHaving that in mind, the keyword `export` is a keyword used in libraries written in Zig.\nSo, if you are not currently writing a library in your project, then, you do not need to\ncare about this keyword.\n\n\n### Looking at the `main.zig` file {#sec-main-file}\n\nNow that we have learned a lot about Zig's syntax from the `root.zig` file,\nlet's take a look at the `main.zig` file.\nA lot of the elements we saw in `root.zig` are also present in `main.zig`.\nBut we have some other elements that we did not have seen yet, so let's dive in.\n\nFirst, look at the return type of the `main()` function in this file.\nWe can see a small change. Now, the return\ntype of the function (`void`) is accompanied by an exclamation mark (`!`).\nWhat this exclamation mark is telling us, is that this `main()` function\nmight also return an error.\n\nSo, in this example, the `main()` function can either return `void`, or, return an error.\nThis is an interesting feature of Zig. 
If you write a function, and, something inside of\nthe body of this function might return an error, then, you are forced to:\n\n- either add the exclamation mark to the return type of the function, to make it clear that\nthis function might return an error.\n- or explicitly handle this error that might occur inside the function, to make sure that,\nif this error does happen, you are prepared, and your function will no longer return an error\nbecause you handled the error inside your function.\n\nIn most programming languages, we normally handle (or deals with) an error through\na *try catch* pattern, and Zig, this is no different. But, if we look at the `main()` function\nbelow, you can see that we do have a `try` keyword in the 5th line. But we do not have a `catch` keyword\nin this code.\n\nThis means that, we are using the keyword `try` to execute a code that might return an error,\nwhich is the `stdout.print()` expression. But because we do not have a `catch` keyword in this line,\nwe are not treating (or dealing with) this error.\nSo, if this expression do return an error, we are not catching and solving this error in any way.\nThat is why the exclamation mark was added to the return type of the function.\n\nSo, in essence, the `try` keyword executes the expression `stdout.print()`. If this expression\nreturns a valid value, then, the `try` keyword do nothing essentially. It simply passes this value forward. But, if the expression do\nreturn an error, then, the `try` keyword will unwrap and return this error from the function, and also print it's\nstack trace to `stderr`.\n\nThis might sound weird to you, if you come from a high-level language. Because in\nhigh-level languages, such as Python, if an error occurs somewhere, this error is automatically\nreturned and the execution of your program will automatically stops, even if you don't want\nto stop the execution. 
You are obligated to face the error.\n\nBut if you come from a low-level language, then, maybe, this idea do not sound so weird or distant to you.\nBecause in C for example, normally functions doesn't raise errors, or, they normally don't stop the execution.\nIn C, error handling\nis done by constantly checking the return value of the function. So, you run the function,\nand then, you use an if statement to check if the function returned a value that is valid,\nor, if it returned an error. If an error was returned from the function, then, the if statement\nwill execute some code that fixes this error.\n\nSo, at least for C programmers, they do need to write a lot of if statements to\nconstantly check for errors around their code. And because of that, this simple feature from Zig, might be\nextraordinary for them. Because this `try` keyword can automatically unwrap the error,\nand warn you about this error, and let you deal with it, without any extra work from the programmer.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n:::\n\n\nNow, another thing that you might have noticed in this code example, is that\nthe `main()` function is marked with the `pub` keyword. This keyword means\n\"public\". 
It marks the `main()` function as a *public function* from this module.\n\nIn other words, every function that you declare in your Zig module is, by default, a private (or \"static\")\nfunction that belongs to this Zig module, and can only be used (or called) from within this same module,\nunless you explicitly mark this function as a public function with the `pub` keyword.\nThis means that the `pub` keyword in Zig does essentially the opposite of what the `static` keyword\ndoes in C/C++.\n\nBy making a function \"public\", you allow other Zig modules to access and call this function,\nand use it for their own purposes.\nAll these other Zig modules need to do is import your module with the `@import()`\nbuilt-in function. Then, they get access to all public functions that are present in\nyour Zig module.\n\n\n### Compiling your source code {#sec-compile-code}\n\nYou can compile your Zig modules into a binary executable by running the `build-exe` command\nfrom the `zig` compiler. You simply list all the Zig modules that you want to build after\nthe `build-exe` command, separated by spaces. In the example below, we are compiling the module `main.zig`.\n\n```bash\nzig build-exe src/main.zig\n```\n\nSince we are building an executable, the `zig` compiler will look for a `main()` function\ndeclared in any of the files that you list after the `build-exe` command. If\nthe compiler does not find a `main()` function declared somewhere, a\ncompilation error will be raised, warning about this mistake.\n\nThe `zig` compiler also offers the `build-lib` and `build-obj` commands, which work\nthe exact same way as the `build-exe` command. 
The only difference is that they compile your\nZig modules into a portable C ABI library, or into object files, respectively.\n\nIn the case of the `build-exe` command, a binary executable file is created by the `zig`\ncompiler in the root directory of your project.\nIf we take a look now at the contents of our current directory, with a simple `ls` command, we can\nsee the binary file called `main` that was created by the compiler.\n\n```bash\nls\n```\n\n```\nbuild.zig build.zig.zon main src\n```\n\nIf I execute this binary executable, I get the \"Hello World\" message in the terminal,\nas we expected.\n\n```bash\n./main\n```\n\n```\nHello, world!\n```\n\n\n### Compile and execute at the same time {#sec-compile-run-code}\n\nIn the previous section, I presented the `zig build-exe` command, which\ncompiles Zig modules into an executable file. However, this means that,\nin order to execute the executable file, we have to run two different commands.\nFirst, the `zig build-exe` command, and then, we call the executable file\ncreated by the compiler.\n\nBut what if we wanted to perform these two steps,\nall at once, in a single command? We can do that by using the `zig run`\ncommand.\n\n```bash\nzig run src/main.zig\n```\n\n```\nHello, world!\n```\n\n### Compiling the entire project {#sec-compile-project}\n\nJust as I described at @sec-project-files, as our project grows in size and\ncomplexity, we usually prefer to organize the compilation and build process\nof the project into a build script, using some sort of \"build system\".\n\nIn other words, as our project grows in size and complexity,\nthe `build-exe`, `build-lib` and `build-obj` commands become\nharder to use directly. Because then, we start to list\nmore and more modules at the same time. We also\nstart to add built-in compilation flags to customize the\nbuild process for our needs, etc. 
It becomes a lot of work\nto write the necessary commands by hand.\n\nIn C/C++ projects, programmers normally opt to use CMake, Ninja, `Makefile` or `configure` scripts\nto organize this process. However, in Zig, we have a native build system in the language itself.\nSo, we can write build scripts in Zig to compile and build Zig projects. Then, all we\nneed to do, is to call the `zig build` command to build our project.\n\nSo, when you execute the `zig build` command, the `zig` compiler will search\nfor a Zig module named `build.zig` inside your current directory, which\nshould be your build script, containing the necessary code to compile and\nbuild your project. If the compiler do find this `build.zig` file in your directory,\nthen, the compiler will essentially execute a `zig run` command\nover this `build.zig` file, to compile and execute this build\nscript, which in turn, will compile and build your entire project.\n\n\n```bash\nzig build\n```\n\n\nAfter you execute this \"build project\" command, a `zig-out` directory\nis created in the root of your project directory, where you can find\nthe binary executables and libraries created from your Zig modules\naccordingly to the build commands that you specified at `build.zig`.\nWe will talk more about the build system in Zig latter in this book.\n\nIn the example below, I'm executing the binary executable\nnamed `hello_world` that was generated by the compiler after the\n`zig build` command.\n\n```bash\n./zig-out/bin/hello_world\n```\n\n```\nHello, world!\n```\n\n\n\n## How to learn Zig?\n\nWhat are the best strategies to learn Zig? 
\nFirst of all, of course this book will help you a lot on your journey through Zig.\nBut you will also need some extra resources if you want to be really good at Zig.\n\nAs a first tip, you can join a community of Zig programmers to get some help\nwhen you need it:\n\n- Reddit forum: ;\n- Ziggit community: ;\n- Discord, Slack, Telegram, and others: ;\n\nNow, one of the best ways to learn Zig is to simply read Zig code. Try\nto read Zig code often, and things will become clearer.\nA C/C++ programmer would also probably give you this same tip,\nbecause this strategy really works!\n\nNow, where can you find Zig code to read?\nI personally think that the best way of reading Zig code is to read the source code of the\nZig Standard Library. The Zig Standard Library is available at the [`lib/std` folder](https://github.com/ziglang/zig/tree/master/lib/std)[^zig-lib-std] on\nthe official GitHub repository of Zig. Access this folder, and start exploring the Zig modules.\n\nAlso, a great alternative is to read code from other large Zig\ncodebases, such as:\n\n1. the [JavaScript runtime Bun](https://github.com/oven-sh/bun)[^bunjs].\n1. the [game engine Mach](https://github.com/hexops/mach)[^mach].\n1. a [Llama 2 LLM implementation in Zig](https://github.com/cgbur/llama2.zig/tree/main)[^ll2].\n1. the [financial transactions database `tigerbeetle`](https://github.com/tigerbeetle/tigerbeetle)[^tiger].\n1. the [command-line arguments parser `zig-clap`](https://github.com/Hejsil/zig-clap)[^clap].\n1. the [UI framework `capy`](https://github.com/capy-ui/capy)[^capy].\n1. the [Language Server Protocol implementation for Zig, `zls`](https://github.com/zigtools/zls)[^zls].\n1. 
the [event-loop library `libxev`](https://github.com/mitchellh/libxev)[^xev].\n\n[^xev]: \n[^zls]: \n[^capy]: \n[^clap]: \n[^tiger]: \n[^ll2]: \n[^mach]: \n[^bunjs]: .\n\nAll these assets are available on GitHub,\nand this is great, because we can use the GitHub search bar to our advantage,\nto find Zig code that fits our description.\nFor example, you can always include `lang:Zig` in the GitHub search bar when you\nare searching for a particular pattern. This will limit the search to only Zig modules.\n\n[^zig-lib-std]: \n\nAlso, a great alternative is to consult online resources and documentation.\nHere is a quick list of resources that I personally use from time to time to learn\nmore about the language each day:\n\n- Zig Language Reference: ;\n- Zig Standard Library Reference: ;\n- Zig Guide: ;\n- Karl Seguin Blog: ;\n- Zig News: ;\n- Read the code written by one of the Zig core team members: ;\n- Some livecoding sessions are streamed on the Zig Showtime YouTube channel: ;\n\n\nAnother great strategy to learn Zig, or honestly, to learn any language you want,\nis to practice it by solving exercises. For example, there is a famous repository\nin the Zig community called [Ziglings](https://codeberg.org/ziglings/exercises/)[^ziglings],\nwhich contains more than 100 small exercises that you can solve. It is a repository of\ntiny programs written in Zig that are currently broken, and your responsibility is to\nfix these programs, and make them work again.\n\n[^ziglings]: .\n\nA famous tech YouTuber known as *The Primeagen* also posted some videos (on YouTube)\nwhere he solves these exercises from Ziglings. 
The first video is named\n[\"Trying Zig Part 1\"](https://www.youtube.com/watch?v=OPuztQfM3Fg&t=2524s&ab_channel=TheVimeagen)[^prime1].\n\n[^prime1]: .\n\nAnother great alternative is to solve the [Advent of Code exercises](https://adventofcode.com/)[^advent-code].\nThere are people that already took the time to learn and solve the exercises, and they posted\ntheir solutions on GitHub as well, so, in case you need some resource to compare while solving\nthe exercises, you can look at these two repositories:\n\n- ;\n- ;\n\n[^advent-code]: \n\n\n\n\n\n\n## Creating new objects in Zig (i.e. identifiers) {#sec-assignments}\n\nLet's talk more about objects in Zig. Readers that have past experience\nwith other programming languages might know this concept through\na different name, such as: \"variable\" or \"identifier\". In this book, I choose\nto use the term \"object\" to refer to this concept.\n\nTo create a new object (or a new \"identifier\") in Zig, we use\nthe keywords `const` or `var`. These keywords specify if the object\nthat you are creating is mutable or not.\nIf you use `const`, then the object you are\ncreating is a constant (or immutable) object, which means that once you declare this object, you\ncan no longer change the value stored inside this object.\n\nOn the other hand, if you use `var`, then you are creating a variable (or mutable) object.\nYou can change the value of this object as many times as you want. Using the\nkeyword `var` in Zig is similar to using the keywords `let mut` in Rust.\n\n### Constant objects vs variable objects\n\nIn the code example below, we are creating a new constant object called `age`.\nThis object stores a number representing the age of someone. However, this code example\ndoes not compile successfully. 
Because on the next line of code, we are trying to change the value\nof the object `age` to 25.\n\nThe `zig` compiler detects that we are trying to change\nthe value of an object/identifier that is constant, and because of that,\nthe compiler will raise a compilation error, warning us about the mistake.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 24;\n// The line below is not valid!\nage = 25;\n```\n:::\n\n\n```\nt.zig:10:5: error: cannot assign to constant\n    age = 25;\n    ~~^~~\n```\n\nIn contrast, if you use `var`, then the object created is a variable object.\nWith `var` you can declare this object in your source code, and then\nchange the value of this object as many times as you want at later points\nin your source code.\n\nSo, using the same code example presented above, if I change the declaration of the\n`age` object to use the `var` keyword, then the program compiles successfully.\nBecause now, the `zig` compiler detects that we are changing the value of an\nobject that allows this behaviour, because it is a \"variable object\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = 24;\nage = 25;\n```\n:::\n\n\n\n### Declaring without an initial value\n\nBy default, when you declare a new object in Zig, you must give it\nan initial value. In other words, this means\nthat we have to declare, and, at the same time, initialize every object we\ncreate in our source code.\n\nOn the other hand, you can, in fact, declare a new object in your source code,\nand not give it an explicit value. 
But we need to use a special keyword for that,\nwhich is the `undefined` keyword.\n\nIt is important to emphasize that you should avoid using `undefined` as much as possible.\nBecause when you use this keyword, you leave your object uninitialized, and, as a consequence,\nif for some reason your code uses this object while it is uninitialized, then you will definitely\nhave undefined behaviour and major bugs in your program.\n\nIn the example below, I'm declaring the `age` object again. But this time,\nI do not give it an initial value. The variable is only initialized at\nthe second line of code, where I store the number 25 in this object.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = undefined;\nage = 25;\n```\n:::\n\n\nHaving these points in mind, just remember that you should avoid using `undefined` in your code as much as possible.\nAlways declare and initialize your objects, because this gives you much more safety in your program.\nBut in case you really need to declare an object without initializing it... the\n`undefined` keyword is the way to do it in Zig.\n\n\n### There is no such thing as unused objects\n\nEvery object (whether constant or variable) that you declare in Zig **must be used in some way**. You can give this object\nto a function call, as a function argument, or, you can use it in another expression\nto calculate the value of another object, or, you can call a method that belongs to this\nparticular object. \n\nIt doesn't matter in which way you use it. As long as you use it.\nIf you try to break this rule, i.e. if you try to declare an object, but not use it,\nthe `zig` compiler will not compile your Zig source code, and it will issue an error\nmessage warning that you have unused objects in your code.\n\nLet's demonstrate this with an example. In the source code below, we declare a constant object\ncalled `age`. 
If you try to compile a simple Zig program with this line of code below,\nthe compiler will return an error as demonstrated below:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 15;\n```\n:::\n\n\n```\nt.zig:4:11: error: unused local constant\n    const age = 15;\n          ^~~\n```\n\nEvery time you declare a new object in Zig, you have two choices:\n\n1. you either use the value of this object;\n2. or you explicitly discard the value of the object;\n\nTo explicitly discard the value of any object (constant or variable), all you need to do is to assign\nthis object to a special character in Zig, which is the underscore (`_`).\nWhen you assign an object to an underscore, like in the example below, the `zig` compiler will automatically\ndiscard the value of this particular object.\n\nYou can see in the example below that, this time, the compiler did not\ncomplain about any \"unused constant\", and successfully compiled our source code.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It compiles!\nconst age = 15;\n_ = age;\n```\n:::\n\n\nNow, remember, every time you assign a particular object to the underscore, this object\nis essentially destroyed. It is discarded by the compiler. This means that you can no longer\nuse this object further in your code. It doesn't exist anymore.\n\nSo if you try to use the constant `age` in the example below, after we discarded it, you\nwill get a loud error message from the compiler (talking about a \"pointless discard\")\nwarning you about this mistake.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It does not compile.\nconst age = 15;\n_ = age;\n// Using a discarded value!\nstd.debug.print(\"{d}\\n\", .{age + 2});\n```\n:::\n\n\n```\nt.zig:7:5: error: pointless discard\n    of local constant\n```\n\n\nThis same rule applies to variable objects. Every variable object must also be used in\nsome way. 
And if you assign a variable object to the underscore,\nthis object also gets discarded, and you can no longer use this object.\n\n\n\n### You must mutate every variable object\n\nEvery variable object that you create in your source code must be mutated at some point.\nIn other words, if you declare an object as a variable\nobject, with the keyword `var`, and you do not change the value of this object\nat some point in the future, the `zig` compiler will detect this,\nand it will raise an error warning you about this mistake.\n\nThe concept behind this is that every object you create in Zig should preferably be a\nconstant object, unless you really need an object whose value will\nchange during the execution of your program.\n\nSo, if I try to declare a variable object such as `where_i_live` below,\nand I do not change the value of this object in some way,\nthe `zig` compiler raises an error message with the phrase \"variable is never mutated\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar where_i_live = \"Belo Horizonte\";\n_ = where_i_live;\n```\n:::\n\n\n```\nt.zig:7:5: error: local variable is never mutated\nt.zig:7:5: note: consider using 'const'\n```\n\n## Primitive Data Types\n\nZig has many different primitive data types available for you to use.\nYou can see the full list of available data types at the official\n[Language Reference page](https://ziglang.org/documentation/master/#Primitive-Types)[^lang-data-types].\n\n[^lang-data-types]: .\n\nBut here is a quick list:\n\n- Unsigned integers: `u8`, 8-bit integer; `u16`, 16-bit integer; `u32`, 32-bit integer; `u64`, 64-bit integer; `u128`, 128-bit integer.\n- Signed integers: `i8`, 8-bit integer; `i16`, 16-bit integer; `i32`, 32-bit integer; `i64`, 64-bit integer; `i128`, 128-bit integer.\n- Floating point numbers: `f16`, 16-bit floating point; `f32`, 32-bit floating point; `f64`, 64-bit floating point; `f128`, 128-bit floating point.\n- Boolean: `bool`, represents true or false values.\n- C ABI compatible types: 
`c_long`, `c_char`, `c_short`, `c_ushort`, `c_int`, `c_uint`, and many others.\n- Pointer sized integers: `isize` and `usize`.\n\n\n\n\n\n\n\n## Arrays {#sec-arrays}\n\nYou create arrays in Zig by using a syntax that resembles the C syntax.\nFirst, you specify the size of the array you want to create (i.e. the number of elements that will be stored in the array)\ninside a pair of brackets.\n\nThen, you specify the data type of the elements that will be stored inside this array.\nAll elements present in an array in Zig must have the same data type. For example, you cannot mix elements\nof type `f32` with elements of type `i32` in the same array.\n\nAfter that, you simply list the values that you want to store in this array inside\na pair of curly braces.\nIn the example below, I am creating two constant objects that contain different arrays.\nThe first object contains an array of 4 integer values, while the second object contains\nan array of 3 floating point values.\n\nNow, you should notice that in the object `ls`, I am\nnot explicitly specifying the size of the array inside of the brackets. Instead\nof using a literal value (like the value 4 that I used in the `ns` object), I am\nusing the special character underscore (`_`). 
This syntax tells the `zig` compiler\nto fill this field with the number of elements listed inside of the curly braces.\nSo, this syntax `[_]` is for lazy (or smart) programmers who leave the job of\ncounting how many elements there are in the curly braces for the compiler.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst ls = [_]f64{432.1, 87.2, 900.05};\n_ = ns; _ = ls;\n```\n:::\n\n\n### Selecting elements of the array\n\nOne very common activity is to select specific portions of an array\nyou have in your source code.\nIn Zig, you can select a specific element from your\narray, by simply providing the index of this particular\nelement inside brackets after the object name.\nIn the example below, I am selecting the third element from the\n`ns` array. Notice that Zig is a \"zero-index\" based language,\nlike C, C++, Rust, Python, and many other languages.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\ntry stdout.print(\"{d}\\n\", .{ ns[2] });\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n12\n```\n\n\n:::\n:::\n\n\nIn contrast, you can also select specific slices (or sections) of your array, by using a\nrange selector. Some programmers also call these selectors \"slice selectors\",\nand they also exist in Rust, and have the exact same syntax as in Zig.\nAnyway, a range selector is a special expression in Zig that defines\na range of indexes, and it has the syntax `start..end`.\n\nIn the example below, at the second line of code,\nthe `sl` object stores a slice (or a portion) of the\n`ns` array. More precisely, the elements at index 1 and 2\nin the `ns` array. 
\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\n_ = sl;\n```\n:::\n\n\nWhen you use the `start..end` syntax,\nthe \"end tail\" of the range selector is non-inclusive,\nmeaning that the index at the end is not included in the range that is\nselected from the array.\nTherefore, the syntax `start..end` selects the elements from index `start` up to index `end - 1`.\n\nYou can, for example, create a slice that goes from the first to the\nlast element of the array, by using the `ar[0..ar.len]` syntax.\nIn other words, it is a slice that\naccesses all elements in the array.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ar = [4]u8{48, 24, 12, 6};\nconst sl = ar[0..ar.len];\n_ = sl;\n```\n:::\n\n\nYou can also use the syntax `start..` in your range selector,\nwhich tells the `zig` compiler to select the portion of the array\nthat begins at the `start` index and goes until the last element of the array.\nIn the example below, we are selecting the range from index 1\nuntil the end of the array.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..];\n_ = sl;\n```\n:::\n\n\n\n### More on slices\n\nAs we discussed before, in Zig, you can select specific portions of an existing\narray. 
This is called *slicing* in Zig [@zigguide], because when you select a portion\nof an array, you are creating a slice object from that array.\n\nA slice object is essentially a pointer object accompanied by a length number.\nThe pointer object points to the first element in the slice, and the\nlength number tells the `zig` compiler how many elements there are in this slice.\n\n> Slices can be thought of as a pair of `[*]T` (the pointer to the data) and a `usize` (the element count) [@zigguide].\n\nThrough the pointer contained inside the slice you can access the elements (or values)\nthat are inside this range (or portion) that you selected from the original array.\nBut the length number (which you can access through the `len` property of your slice object)\nis the really big improvement (over C arrays for example) that Zig brings to the table here.\n\nBecause with this length number\nthe `zig` compiler can easily check if you are trying to access an index that is out of the bounds of this particular slice,\nor, if you are causing any buffer overflow problems. In the example below,\nwe access the `len` property of the slice `sl`, which tells us that this slice\nhas 2 elements in it.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\ntry stdout.print(\"{d}\\n\", .{sl.len});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n2\n```\n\n\n:::\n:::\n\n\n\n### Array operators\n\nThere are two array operators available in Zig that are very useful.\nThe array concatenation operator (`++`), and the array multiplication operator (`**`). 
As the name suggests,\nthese are array operators.\n\nOne important detail about these two operators is that they work\nonly when both operands have a size (or \"length\") that is compile-time known.\nWe are going to talk more about\nthe differences between \"compile-time known\" and \"runtime known\" at @sec-compile-time.\nBut for now, keep this information in mind, that you cannot use these operators in every situation.\n\nIn summary, the `++` operator creates a new array that is the concatenation\nof both arrays provided as operands. So, the expression `a ++ b` produces\na new array which contains all the elements from arrays `a` and `b`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst b = [_]u8{4,5};\nconst c = a ++ b;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 4, 5 }\n```\n\n\n:::\n:::\n\n\nThis `++` operator is particularly useful to concatenate strings together.\nStrings in Zig are described in depth at @sec-zig-strings. In summary, a string object in Zig\nis essentially an array of bytes. So, you can use this array concatenation operator\nto effectively concatenate strings together.\n\nIn contrast, the `**` operator is used to replicate an array multiple\ntimes. In other words, the expression `a ** 3` creates a new array\nwhich contains the elements of the array `a` repeated 3 times.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst c = a ** 2;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 1, 2, 3 }\n```\n\n\n:::\n:::\n\n\n\n## Blocks and scopes\n\nBlocks are created in Zig by a pair of curly braces. A block is just a group of\nexpressions (or statements) contained inside of a pair of curly braces. 
All of these expressions that\nare contained inside of this pair of curly braces belong to the same scope.\n\nIn other words, a block just delimits a scope in your code.\nThe objects that you define inside the same block belong to the same\nscope, and, therefore, are accessible from within this scope.\nAt the same time, these objects are not accessible outside of this scope.\nSo, you could also say that blocks are used to limit the scope of the objects that you create in\nyour source code. In less technical terms, blocks are used to specify where in your source code\nyou can access whatever object you have in your source code.\n\nSo, a block is just a group of expressions contained inside a pair of curly braces.\nAnd every block has its own scope separated from the others.\nThe body of a function is a classic example of a block. If statements, for and while loops\n(and any other structure in the language that uses the pair of curly braces)\nare also examples of blocks.\n\nThis means that every if statement, or for loop,\netc., that you create in your source code has its own separate scope.\nThat is why you can't access the objects that you defined inside\nof your for loop (or if statement) in an outer scope, i.e. a scope outside of the for loop.\nBecause you are trying to access an object that belongs to a scope that is different\nfrom your current scope.\n\n\nYou can create blocks within blocks, with multiple levels of nesting.\nYou can also (if you want to) give a label to a particular block, with the colon character (`:`).\nJust write `label:` before you open the pair of curly braces that delimits your block. When you label a block\nin Zig, you can use the `break` keyword to return a value from this block, as if it\nwere a function's body. 
You just write the `break` keyword, followed by the block label in the format `:label`,\nand the expression that defines the value that you want to return.\n\nLike in the example below, where we are returning the value from the `y` object\nfrom the block `add_one`, and saving the result inside the `x` object.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar y: i32 = 123;\nconst x = add_one: {\n    y += 1;\n    break :add_one y;\n};\nif (x == 124 and y == 124) {\n    try stdout.print(\"Hey!\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHey!\n```\n\n\n:::\n:::\n\n\n\n## Type inference {#sec-type-inference}\n\nZig is kind of a strongly typed language. I say \"kind of\" because there are situations\nwhere you don't have to explicitly write the type of every single object in your source code,\nas you would expect from a traditional strongly typed language, such as C and C++.\n\nIn some situations, the `zig` compiler can use type inference to resolve the data types for you, easing some of\nthe burden that you carry as a developer.\nThe most common way this happens is through function arguments that receive struct objects\nas input.\n\nIn general, type inference in Zig is done by using the dot character (`.`).\nEvery time you see a dot character written before a struct literal, or before an enum value, or something like that,\nyou know that this dot character is playing a special part in this place. More specifically, it is\ntelling the `zig` compiler something along the lines of: \"Hey! Can you infer the type of this\nvalue for me? Please!\". 
In other words, this dot character is playing a role similar to the `auto` keyword in C++.\n\nI give you some examples of this at @sec-anonymous-struct-literals, where we present anonymous struct literals.\nBecause anonymous struct literals are, essentially, struct literals that use type inference to\ninfer the exact type of this particular struct literal.\nThis type inference is done by looking for some minimal hint of the correct data type to be used.\nYou could say that the `zig` compiler looks for any neighbouring type annotation that might tell it what the correct type would be.\n\nAnother common place where we use type inference in Zig is at switch statements (which we talk about at @sec-switch).\nTake a look at this `fence()` function, which comes from the [`atomic.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/atomic.zig)[^fence-fn]\nfrom the Zig Standard Library.\n\n[^fence-fn]: .\n\nThere are a lot of things in this function that we haven't talked about yet, such as:\nwhat does `comptime` mean? `inline`? `extern`? What is this star symbol before `Self`?\nLet's just ignore all of these things, and focus solely on the switch statement\nthat is inside this function.\n\nWe can see that this switch statement uses the `order` object as input. This `order`\nobject is one of the inputs of this `fence()` function, and we can see in the type annotation,\nthat this object is of type `AtomicOrder`. We can also see a bunch of values inside the\nswitch statement that begin with a dot character, such as `.release` and `.acquire`.\n\nBecause these weird values contain a dot character before them, we are asking the `zig`\ncompiler to infer the types of these values inside the switch statement. 
Then, the `zig`\ncompiler looks into the current context where these values are being used, and\ntries to infer the types of these values.\n\nSince they are being used inside a switch statement, the `zig` compiler looks into the type\nof the input object given to the switch statement, which is the `order` object in this case.\nBecause this object has type `AtomicOrder`, the `zig` compiler infers that these values\nare data members from this type `AtomicOrder`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub inline fn fence(self: *Self, comptime order: AtomicOrder) void {\n    // LLVM's ThreadSanitizer doesn't support the normal fences so we specialize for it.\n    if (builtin.sanitize_thread) {\n        const tsan = struct {\n            extern \"c\" fn __tsan_acquire(addr: *anyopaque) void;\n            extern \"c\" fn __tsan_release(addr: *anyopaque) void;\n        };\n\n        const addr: *anyopaque = self;\n        return switch (order) {\n            .unordered, .monotonic => @compileError(@tagName(order) ++ \" only applies to atomic loads and stores\"),\n            .acquire => tsan.__tsan_acquire(addr),\n            .release => tsan.__tsan_release(addr),\n            .acq_rel, .seq_cst => {\n                tsan.__tsan_acquire(addr);\n                tsan.__tsan_release(addr);\n            },\n        };\n    }\n\n    return @fence(order);\n}\n```\n:::\n\n\nThis is how basic type inference is done in Zig. If we didn't use the dot character before\nthe values inside this switch statement, then we would be forced to explicitly write\nthe types of these values. For example, instead of writing `.release` we would have to\nwrite `AtomicOrder.release`. We would have to do this for every single value\nin this switch statement, and this is kind of painful. That is why type inference\nis commonly used on switch statements in Zig.\n\n\n\n\n## Control flow {#sec-zig-control-flow}\n\nSometimes, you need to make decisions in your program. Maybe you need to decide\nwhether or not to execute a specific piece of code. Or maybe,\nyou need to apply the same operation over a sequence of values. 
These kinds of tasks\ninvolve using structures that are capable of changing the \"control flow\" of our program.\n\nIn computer science, the term \"control flow\" usually refers to the order in which expressions (or commands)\nare evaluated in a given language or program. But this term is also used to refer\nto structures that are capable of changing this \"evaluation order\" of the commands\nexecuted by a given language/program.\n\nThese structures are better known\nby a set of terms, such as: loops, if/else statements, switch statements, among others. So,\nloops and if/else statements are examples of structures that can change the \"control\nflow\" of our program. The keywords `continue` and `break` are also examples of symbols\nthat can change the order of evaluation, since they can move our program to the next iteration\nof a loop, or make the loop stop completely.\n\n\n### If/else statements\n\nAn if/else statement performs a \"conditional flow operation\".\nA conditional flow control (or choice control) allows you to execute\nor ignore a certain block of commands based on a logical condition.\nMany programmers and computer science professionals also use\nthe term \"branching\" in this case.\nIn essence, we use if/else statements to use the result of a logical test\nto decide whether or not to execute a given block of commands.\n\nIn Zig, we write if/else statements by using the keywords `if` and `else`.\nWe start with the `if` keyword followed by a logical test inside a pair\nof parentheses, and then, a pair of curly braces which contains the lines\nof code to be executed in case the logical test returns the value `true`.\n\nAfter that, you can optionally add an `else` statement. Just add the `else`\nkeyword followed by a pair of curly braces, with the lines of code\nto be executed in case the logical test defined in the `if`\nreturns `false`.\n\nIn the example below, we are testing if the object `x` contains a number\nthat is greater than 10. 
Judging by the output printed to the console,\nwe know that this logical test returned `false`. Because the output\nin the console is compatible with the line of code present in the\n`else` branch of the if/else statement.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst x = 5;\nif (x > 10) {\n    try stdout.print(\n        \"x > 10!\\n\", .{}\n    );\n} else {\n    try stdout.print(\n        \"x <= 10!\\n\", .{}\n    );\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nx <= 10!\n```\n\n\n:::\n:::\n\n\n\n\n### Switch statements {#sec-switch}\n\nSwitch statements are also available in Zig.\nA switch statement in Zig has a similar syntax to a switch statement in Rust.\nAs you would expect, to write a switch statement in Zig we use the `switch` keyword.\nWe provide the value that we want to \"switch over\" inside a\npair of parentheses. Then, we list the possible combinations (or \"branches\")\ninside a pair of curly braces.\n\nLet's take a look at the code example below. You can see in this example that\nI'm creating an enum type called `Role`. We talk more about enums at @sec-enum.\nBut in essence, this `Role` type is listing different types of roles in a fictitious\ncompany, like `SE` for Software Engineer, `DE` for Data Engineer, `PM` for Product Manager,\netc.\n\nNotice that we are using the value from the `role` object in the\nswitch statement, to discover which exact area we need to store in the `area` variable object.\nAlso notice that we are using type inference inside the switch statement, with the dot character,\nas we described at @sec-type-inference.\nThis makes the `zig` compiler infer the correct data type of the values (`PM`, `SE`, etc.) for us.\n\nAlso notice that we are grouping multiple values in the same branch of the switch statement.\nWe just separate each possible value with a comma. 
So, for example, if `role` contains either `DE` or `DA`,\nthe `area` variable would contain the value `\"Data & Analytics\"`, instead of `\"Platform\"`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst Role = enum {\n    SE, DPE, DE, DA, PM, PO, KS\n};\n\npub fn main() !void {\n    var area: []const u8 = undefined;\n    const role = Role.SE;\n    switch (role) {\n        .PM, .SE, .DPE, .PO => {\n            area = \"Platform\";\n        },\n        .DE, .DA => {\n            area = \"Data & Analytics\";\n        },\n        .KS => {\n            area = \"Sales\";\n        },\n    }\n    try stdout.print(\"{s}\\n\", .{area});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nPlatform\n```\n\n\n:::\n:::\n\n\nNow, one very important aspect about this switch statement presented\nin the code example above, is that it exhausts all existing possibilities.\nIn other words, all possible values that could be found inside the `role`\nobject are explicitly handled in this switch statement.\n\nSince the `role` object has type `Role`, the only possible values to\nbe found inside this object are `PM`, `SE`, `DPE`, `PO`, `DE`, `DA` and `KS`.\nThere is no other possible value to be stored in this `role` object.\nThis is what \"exhaust all existing possibilities\" means. The switch statement covers\nevery possible case.\n\nIn Zig, switch statements must exhaust all existing possibilities. You cannot write\na switch statement, and leave an edge case with no explicit action to be taken.\nThis is a similar behaviour to switch statements in Rust, which also have to\nhandle all possible cases.\n\nTake a look at the `dump_hex_fallible()` function below as an example. This function\nalso comes from the Zig Standard Library, but this time, it comes from the [`debug.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/debug.zig)[^debug-mod].\nThere are multiple lines in this function, but I omitted them to focus solely on the\nswitch statement found in this function. 
Notice that this switch statement have four\npossible cases, or four explicit branches. Also, notice that we used an `else` branch\nin this case. Whenever you have multiple possible cases in your switch statement\nwhich you want to apply the same exact action, you can use an `else` branch to do that.\n\n[^debug-mod]: \n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn dump_hex_fallible(bytes: []const u8) !void {\n // Many lines ...\n switch (byte) {\n '\\n' => try writer.writeAll(\"␊\"),\n '\\r' => try writer.writeAll(\"␍\"),\n '\\t' => try writer.writeAll(\"␉\"),\n else => try writer.writeByte('.'),\n }\n}\n```\n:::\n\n\nMany users would also use an `else` branch to handle a \"not supported\" case.\nThat is, a case that cannot be properly handled by your code, or, just a case that\nshould not be \"fixed\". So many programmers use an `else` branch to panic (or raise an error) to stop\nthe current execution.\n\nTake the code example below as an example. We can see that, we are handling the cases\nfor the `level` object being either 1, 2, or 3. All other possible cases are not supported by default,\nand, as consequence, we raise an runtime error in these cases, through the `@panic()` built-in function.\n\nAlso notice that, we are assigning the result of the switch statement to a new object called `category`.\nThis is another thing that you can do with switch statements in Zig. 
If the branchs in this switch\nstatement output some value as result, you can store the result value of the switch statement into\na new variable.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 1, 2 => \"beginner\",\n 3 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n:::\n\n\n```\nthread 13103 panic: Not supported level!\nt.zig:9:13: 0x1033c58 in main (switch2)\n @panic(\"Not supported level!\");\n ^\n```\n\nFurthermore, you can also use ranges of values in switch statements.\nThat is, you can create a branch in your switch statement that is used\nwhenever the input value is contained in a range. These range\nexpressions are created with the operator `...`. Is important\nto emphasize that the ranges created by this operator are\ninclusive on both ends.\n\nFor example, I could easily change the code example above to support all\nlevels between 0 and 100. Like this:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 0...25 => \"beginner\",\n 26...75 => \"intermediary\",\n 76...100 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nbeginner\n```\n\n\n:::\n:::\n\n\nThis is neat, and it works with character ranges too. That is, I could\nsimply write `'a'...'z'`, to match any character value that is a\nlowercase letter, and it would work fine.\n\n\n\n### For loops\n\nA loop allows you to execute the same lines of code multiple times,\nthus, creating a \"repetition space\" in the execution flow of your program.\nLoops are particularly useful when we want to replicate the same function\n(or the same set of commands) over several different inputs.\n\nThere are different types of loops available in Zig. But the most\nessential of them all is probably the *for loop*. 
A for loop is\nused to apply the same piece of code over the elements of a slice or an array.\n\nFor loops in Zig have a slightly different syntax that you are\nprobably used to see in other languages. You start with the `for` keyword, then, you\nlist the items that you want to iterate\nover inside a pair of parentheses. Then, inside of a pair of pipes (`|`)\nyou should declare an identifier that will serve as your iterator, or,\nthe \"repetition index of the loop\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (items) |value| {\n // code to execute\n}\n```\n:::\n\n\nInstead of using a `(value in items)` syntax,\nin Zig, for loops use the syntax `(items) |value|`. In the example\nbelow, you can see that we are looping through the items\nof the array stored at the object `name`, and printing to the\nconsole the decimal representation of each character in this array.\n\nIf we wanted, we could also iterate through a slice (or a portion) of\nthe array, instead of iterating through the entire array stored in the `name` object.\nJust use a range selector to select the section you want. For example,\nI could provide the expression `name[0..3]` to the for loop, to iterate\njust through the first 3 elements in the array.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = [_]u8{'P','e','d','r','o'};\nfor (name) |char| {\n try stdout.print(\"{d} | \", .{char});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n80 | 101 | 100 | 114 | 111 | \n```\n\n\n:::\n:::\n\n\nIn the above example we are using the value itself of each\nelement in the array as our iterator. But there are many situations where\nwe need to use an index instead of the actual values of the items.\n\nYou can do that by providing a second set of items to iterate over.\nMore precisely, you provide the range selector `0..` to the for loop. 
So,\nyes, you can use two different iterators at the same time in a for\nloop in Zig.\n\nBut remember from @sec-assignments that every object\nyou create in Zig must be used in some way. So if you declare two iterators\nin your for loop, you must use both iterators inside the for loop body.\nBut if you want to use just the index iterator, and not use the \"value iterator\",\nthen, you can discard the value iterator by matching the\nvalue items to the underscore character, like in the example below:\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (name, 0..) |_, i| {\n try stdout.print(\"{d} | \", .{i});\n}\n```\n:::\n\n\n```\n0 | 1 | 2 | 3 | 4 |\n```\n\n\n### While loops\n\nA while loop is created from the `while` keyword. While a `for` loop\niterates through the items of an array, a `while` loop\nwill loop continuously until a logical test\n(specified by you) becomes false.\n\nYou start with the `while` keyword, then, you define a logical\nexpression inside a pair of parentheses, and the body of the\nloop is provided inside a pair of curly braces, like in the example below:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: u8 = 1;\nwhile (i < 5) {\n try stdout.print(\"{d} | \", .{i});\n i += 1;\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 2 | 3 | 4 | \n```\n\n\n:::\n:::\n\n\n\n\n### Using `break` and `continue`\n\nIn Zig, you can explicitly stop the execution of a loop, or, jump to the next iteration of the loop, using\nthe keywords `break` and `continue`, respectively. The `while` loop present in the example below is,\nat first sight, an infinite loop, because the logical value inside the parentheses will always be equal to `true`.\nWhat makes this `while` loop stop when the `i` object reaches the count\n10? It is the `break` keyword!\n\nInside the while loop, we have an if statement that is constantly checking if the `i` variable\nis equal to 10. 
Since we are increasing the value of this `i` variable at each iteration of the\nwhile loop, at some point, this `i` variable will be equal to 10, and when it is, the if statement\nwill execute the `break` expression, and, as a result, the execution of the while loop is stopped.\n\nNotice the `expect()` function from the Zig standard library after the while loop.\nThis `expect()` function is an \"assert\" type of function.\nThis function checks if the logical test provided is equal to true. If this logical test is false,\nthe function raises an assertion error. But if it is equal to true, then, the function will do nothing.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: usize = 0;\nwhile (true) {\n if (i == 10) {\n break;\n }\n i += 1;\n}\ntry std.testing.expect(i == 10);\ntry stdout.print(\"Everything worked!\", .{});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nEverything worked!\n```\n\n\n:::\n:::\n\n\nSince this code example was executed successfully by the `zig` compiler,\nwithout raising any errors, we know that, after the execution of the while loop,\nthe `i` variable is equal to 10. Because if it wasn't equal to 10, then, an error would\nbe raised by `expect()`.\n\nNow, in the next example, we have a use case for\nthe `continue` keyword. The if statement is constantly\nchecking if the current index is a multiple of 2. If\nit is, then we jump to the next iteration of the loop\ndirectly. But if the current index is not a multiple of 2,\nthen, the loop will simply print this index to the console.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [_]u8{1,2,3,4,5,6};\nfor (ns) |i| {\n if ((i % 2) == 0) {\n continue;\n }\n try stdout.print(\"{d} | \", .{i});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 3 | 5 | \n```\n\n\n:::\n:::\n\n\n\n## Structs and OOP {#sec-structs-and-oop}\n\nZig is a language more closely related to C (which is a procedural language),\nthan it is to C++ or Java (which are object-oriented languages). 
Because of that, you do not\nhave advanced OOP (Object-Oriented Programming) patterns available in Zig, such as classes, interfaces or\nclass inheritance. Nonetheless, OOP in Zig is still possible by using struct definitions.\n\nWith struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C.\nYou give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. You can\nalso register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object\nthat you create with this new type, will always have these methods available and associated with them.\n\nIn C++, when we create a new class, we normally have a constructor method (or, a constructor function) to construct or to instantiate every object\nof this particular class, and you also have a destructor method (or a destructor function) that\nis the function responsible for destroying every object of this class.\n\nIn Zig, we normally declare the constructor and the destructor methods\nof our structs, by declaring an `init()` and a `deinit()` methods inside the struct.\nThis is just a naming convention that you will find across the entire Zig standard library.\nSo, in Zig, the `init()` method of a struct is normally the constructor method of the class represented by this struct.\nWhile the `deinit()` method is the method used for destroying an existing instance of that struct.\n\nBoth the `init()` and `deinit()` methods are used extensively in Zig code, and you will see both of them at @sec-arena-allocator. In this section,\nI present the `ArenaAllocator()`, which is a special type of allocator object that receives a second (child)\nallocator object at instantiation. 
We use the `init()` method to create a new `ArenaAllocator()` object,\nthen, on the next line, we also use the `deinit()` method in conjunction with the `defer` keyword, to destroy this arena allocator object at the end\nof the current scope.\n\nBut, as another example, let's build a simple `User` struct to represent a user of some sort of system.\nIf you look at the `User` struct below, you can see the `struct` keyword, and inside of a\npair of curly braces, we write the struct's body.\n\nNotice the data members of this struct, `id`, `name` and `email`. Every data member has its\ntype explicitly annotated, with the colon character (`:`) syntax that we described earlier at @sec-root-file.\nBut also notice that every line in the struct body that describes a data member, ends with a comma character (`,`).\nSo every time you declare a data member in your Zig code, always end the line with a comma character, instead\nof ending it with the traditional semicolon character (`;`).\n\nNext, also notice in this example, that we registered an `init()` function as a method\nof this `User` struct. This `init()` method is the constructor method that you use to instantiate\nevery new `User` object. 
That is why this `init()` function returns a `User` object as result.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst User = struct {\n id: u64,\n name: []const u8,\n email: []const u8,\n\n pub fn init(id: u64,\n name: []const u8,\n email: []const u8) User {\n\n return User {\n .id = id,\n .name = name,\n .email = email\n };\n }\n\n pub fn print_name(self: User) !void {\n try stdout.print(\"{s}\\n\", .{self.name});\n }\n};\n\npub fn main() !void {\n const u = User.init(1, \"pedro\", \"email@gmail.com\");\n try u.print_name();\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\npedro\n```\n\n\n:::\n:::\n\n\nThe `pub` keyword plays an important role in struct declarations, and OOP in Zig.\nEvery method that you declare in your struct that is marked with the keyword `pub`,\nbecomes a public method of this particular struct.\n\nSo every method that you create in your struct is, at first, a private method\nof that struct. Meaning that this method can only be called from within the\nZig module (i.e. the source file) where the struct is declared. But, if you mark this method as public, with the keyword `pub`, then,\nyou can also call the method from any other module that imports this struct.\n\nIn other words, the functions marked by the keyword `pub`\nare members of the public API of that struct.\nFor example, if the `User` struct lived in a separate module, and I had not marked the `print_name()` method as public,\nthen, I could not execute the line `u.print_name()` from the module that imports `User`, because I would\nnot be authorized to call this method from outside the module where `User` is declared.\n\n\n\n\n## Anonymous struct literals {#sec-anonymous-struct-literals}\n\nYou can declare a struct object as a literal value. 
When we do that, we normally specify the\ndata type of this struct literal by writing its data type just before the opening curly brace.\nFor example, I could write a struct literal of the `User` type that we defined in the previous section like\nthis:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst eu = User {\n .id = 1,\n .name = \"Pedro\",\n .email = \"someemail@gmail.com\"\n};\n_ = eu;\n```\n:::\n\n\nHowever, in Zig, we can also write an anonymous struct literal. That is, you can write a\nstruct literal without explicitly specifying the type of this particular struct.\nAn anonymous struct is written by using the syntax `.{}`. So, we essentially\nreplace the explicit type of the struct literal with a dot character (`.`).\n\nAs we described at @sec-type-inference, when you put a dot before a struct literal,\nthe type of this struct literal is automatically inferred by the `zig` compiler.\nIn essence, the `zig` compiler will look for some hint of what the type of that struct is.\nIt can be the type annotation of a function argument,\nor the return type annotation of the function that you are using, or the type annotation\nof a variable.\nIf the compiler does find such a type annotation, then, it will use this\ntype in your literal struct. 
\n\nAnonymous structs are very common in function arguments in Zig.\nOne example that you have already seen many times is the `print()`\nfunction from the `stdout` object.\nThis function takes two arguments.\nThe first argument is a template string, which should\ncontain string format specifiers in it, which tell how the values provided\nin the second argument should be printed into the message.\n\nThe second argument is a struct literal that lists the values\nto be printed into the template message specified in the first argument.\nYou normally want to use an anonymous struct literal here, so that the\n`zig` compiler does the job of specifying the type of this particular\nanonymous struct for you.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\npub fn main() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello, world!\n```\n\n\n:::\n:::\n\n\n\n\n\n## How strings work in Zig? {#sec-zig-strings}\n\nThe first project that we are going to build and discuss in this book is a base64 encoder/decoder (@sec-base64).\nBut in order for us to build such a thing, we need to get a better understanding of how strings work in Zig.\nSo let's discuss this specific aspect of Zig.\n\nIn Zig, a string literal (or a string object if you prefer) is a pointer to a null-terminated array\nof bytes. Each byte in this array is represented by a `u8` value, which is an unsigned 8 bit integer,\nso, it is equivalent to the C data type `unsigned char`.\n\nZig always assumes that this sequence of bytes is UTF-8 encoded. 
This might not be true for every\nsequence of bytes you have, but it is not really Zig's job to fix the encoding of your strings\n(you can use [`iconv`](https://www.gnu.org/software/libiconv/)[^libiconv] for that).\nToday, most of the text in our modern world, especially on the web, should be UTF-8 encoded.\nSo if your string literal is not UTF-8 encoded, then, you will likely\nhave problems in Zig.\n\n[^libiconv]: \n\nLet’s take for example the word \"Hello\". In UTF-8, this sequence of characters (H, e, l, l, o)\nis represented by the sequence of decimal numbers 72, 101, 108, 108, 111. In hexadecimal, this\nsequence is `0x48`, `0x65`, `0x6C`, `0x6C`, `0x6F`. So if I take this sequence of hexadecimal values,\nand ask Zig to print this sequence of bytes as a sequence of characters (i.e. a string), then,\nthe text \"Hello\" will be printed into the terminal:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\n\npub fn main() !void {\n const bytes = [_]u8{0x48, 0x65, 0x6C, 0x6C, 0x6F};\n try stdout.print(\"{s}\\n\", .{bytes});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello\n```\n\n\n:::\n:::\n\n\n\nIf you want to see the actual bytes that represent a string in Zig, you can use\na `for` loop to iterate through each byte in the string, and ask Zig to print each byte as a hexadecimal\nvalue to the terminal. 
You do that by using a `print()` statement with the `X` formatting specifier,\nlike you would normally do with the [`printf()` function](https://cplusplus.com/reference/cstdio/printf/)[^printfs] in C.\n\n[^printfs]: \n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"This is an example of string literal in Zig\";\n try stdout.print(\"Bytes that represents the string object: \", .{});\n for (string_literal) |byte| {\n try stdout.print(\"{X} \", .{byte});\n }\n try stdout.print(\"\\n\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: 54 68 69 \n 73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F\n 66 20 73 74 72 69 6E 67 20 6C 69 74 65 72 61 6C\n 20 69 6E 20 5A 69 67 \n```\n\n\n:::\n:::\n\n\n### Strings in C\n\nAt first glance, this looks very similar to how C treats strings. That is, string values\nin C are also treated internally as an array of bytes, and this array is also null-terminated.\n\nBut one key difference between a Zig string and a C string is that Zig also stores the length of\nthe array inside the string object. This small detail makes your code safer, because it is much\neasier for the Zig compiler to check if you are trying to access an element that is \"out of bounds\", i.e. if\nyou are trying to access memory that does not belong to you.\n\nTo achieve this same kind of safety in C, you have to do a lot of work that kind of seems pointless.\nSo getting this kind of safety is not automatic, and is much harder to do in C. 
For example, if you want\nto track the length of your string throughout your program in C, then, you first need to loop through\nthe array of bytes that represents this string, and find the null element (`'\\0'`) position to discover\nwhere exactly the array ends, or, in other words, to find how many elements the array of bytes contains.\n\nTo do that, you would need something like this in C. In this example, the C string stored in\nthe object `array` is 25 bytes long:\n\n```c\n#include <stdio.h>\nint main() {\n char* array = \"An example of string in C\";\n int index = 0;\n while (1) {\n if (array[index] == '\\0') {\n break;\n }\n index++;\n }\n printf(\"Number of elements in the array: %d\\n\", index);\n}\n```\n\n```\nNumber of elements in the array: 25\n```\n\nBut in Zig, you do not have to do this, because the object already contains a `len`\nfield which stores the length information of the array. As an example, the `string_literal` object below is 43 bytes long:\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"This is an example of string literal in Zig\";\n try stdout.print(\"{d}\\n\", .{string_literal.len});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n43\n```\n\n\n:::\n:::\n\n\n\n### A better look at the object type\n\nNow, we can take a closer look at the type of the objects that Zig creates. To check the type of any object in Zig, you can use the\n`@TypeOf()` function. If we look at the type of the `simple_array` object below, you will find that this object\nis an array of 4 elements. Each element is a signed integer of 32 bits which corresponds to the data type `i32` in Zig.\nThat is what an object of type `[4]i32` is.\n\nBut if we look closely at the type of the `string_literal` object below, you will find that this object is a\nconstant pointer (hence the `*const` annotation) to an array of 43 elements (or 43 bytes). 
Each element is a\nsingle byte (more precisely, an unsigned 8 bit integer - `u8`), that is why we have the `[43:0]u8` portion of the type below.\nIn other words, the string stored inside the `string_literal` object is 43 bytes long.\nThat is why you have the type `*const [43:0]u8` below.\n\nIn the case of `string_literal`, it is a constant pointer (`*const`) because the object `string_literal` is declared\nas constant in the source code (in the line `const string_literal = ...`). So, if we changed that for some reason, if\nwe declare `string_literal` as a variable object (i.e. `var string_literal = ...`), then, `string_literal` would be\njust a normal pointer to an array of unsigned 8-bit integers (i.e. `* [43:0]u8`).\n\nNow, if we create an pointer to the `simple_array` object, then, we get a constant pointer to an array of 4 elements (`*const [4]i32`),\nwhich is very similar to the type of the `string_literal` object. This demonstrates that a string object (or a string literal)\nin Zig is already a pointer to an array.\n\nJust remember that a \"pointer to an array\" is different than an \"array\". 
So a string object in Zig is a pointer to an array\nof bytes, and not simply an array of bytes.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"This is an example of string literal in Zig\";\n const simple_array = [_]i32{1, 2, 3, 4};\n try stdout.print(\"Type of array object: {}\", .{@TypeOf(simple_array)});\n try stdout.print(\n \"Type of string object: {}\",\n .{@TypeOf(string_literal)}\n );\n try stdout.print(\n \"Type of a pointer that points to the array object: {}\",\n .{@TypeOf(&simple_array)}\n );\n}\n```\n:::\n\n\n```\nType of array object: [4]i32\nType of string object: *const [43:0]u8\nType of a pointer that points to\n the array object: *const [4]i32\n```\n\n\n### Byte vs unicode points\n\nIs important to point out that each byte in the array is not necessarily a single character.\nThis fact arises from the difference between a single byte and a single unicode point.\n\nThe encoding UTF-8 works by assigning a number (which is called a unicode point) to each character in\nthe string. For example, the character \"H\" is stored in UTF-8 as the decimal number 72. This means that\nthe number 72 is the unicode point for the character \"H\". Each possible character that can appear in a\nUTF-8 encoded string have its own unicode point.\n\nFor example, the Latin Capital Letter A With Stroke (Ⱥ) is represented by the number (or the unicode point)\n570. However, this decimal number (570) is higher than the maximum number stored inside a single byte, which\nis 255. In other words, the maximum decimal number that can be represented with a single byte is 255. 
That is why,\nthe unicode point 570 is actually stored inside the computer’s memory as the bytes `C8 BA`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"Ⱥ\";\n try stdout.print(\"Bytes that represents the string object: \", .{});\n for (string_literal) |char| {\n try stdout.print(\"{X} \", .{char});\n }\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: C8 BA \n```\n\n\n:::\n:::\n\n\n\nThis means that to store the character Ⱥ in an UTF-8 encoded string, we need to use two bytes together\nto represent the number 570. That is why the relationship between bytes and unicode points is not always\n1 to 1. Each unicode point is a single character in the string, but not always a single byte corresponds\nto a single unicode point.\n\nAll of this means that if you loop trough the elements of a string in Zig, you will be looping through the\nbytes that represents that string, and not through the characters of that string. In the Ⱥ example above,\nthe for loop needed two iterations (instead of a single iteration) to print the two bytes that represents this Ⱥ letter.\n\nNow, all english letters (or ASCII letters if you prefer) can be represented by a single byte in UTF-8. As a\nconsequence, if your UTF-8 string contains only english letters (or ASCII letters), then, you are lucky. Because\nthe number of bytes will be equal to the number of characters in that string. 
In other words, in this specific\nsituation, the relationship between bytes and unicode points is 1 to 1.\n\nBut on the other side, if your string contains other types of letters… for example, you might be working with\ntext data that contains, chinese, japanese or latin letters, then, the number of bytes necessary to represent\nyour UTF-8 string will likely be much higher than the number of characters in that string.\n\nIf you need to iterate through the characters of a string, instead of its bytes, then, you can use the\n`std.unicode.Utf8View` struct to create an iterator that iterates through the unicode points of your string.\n\nIn the example below, we loop through the japanese characters “アメリカ”. Each of the four characters in\nthis string is represented by three bytes. But the for loop iterates four times, one iteration for each\ncharacter/unicode point in this string:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n var utf8 = (\n (try std.unicode.Utf8View.init(\"アメリカ\"))\n .iterator()\n );\n while (utf8.nextCodepointSlice()) |codepoint| {\n try stdout.print(\n \"got codepoint {}\\n\",\n .{std.fmt.fmtSliceHexUpper(codepoint)}\n );\n }\n}\n```\n:::\n\n\n```\ngot codepoint E382A2\ngot codepoint E383A1\ngot codepoint E383AA\ngot codepoint E382AB\n```\n\n\n\n## Other parts of Zig\n\nWe already learned a lot about Zig's syntax, and also, some pretty technical\ndetails about it. 
Just as a quick recap:\n\n- We talked about how functions are written in Zig at @sec-root-file and @sec-main-file.\n- How to create new objects/identifiers at @sec-root-file and specially at @sec-assignments.\n- Basic control flow syntax at @sec-zig-control-flow.\n- How strings work in Zig at @sec-zig-strings.\n- How to use arrays and slices at @sec-arrays.\n- How to import functionality from other Zig modules at @sec-root-file.\n- How Object-Oriented programming can be done in Zig through *Struct declarations* at @sec-structs-and-oop.\n\n\nBut, for now, this amount of knowledge is enough for us to continue with this book.\nLater, over the next chapters we will still talk more about other parts of\nZig's syntax that are also equally important as the other parts. Such as:\n\n- Enums at @sec-enum;\n- Pointers and Optionals at @sec-pointer;\n- Error handling with `try` and `catch`;\n- Unit tests at @sec-unittests;\n- Vectors;\n- Build System at @sec-build-system;\n\n\n\n\n", + "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n# Introducing Zig\n\nIn this chapter, I want to introduce you to the world of Zig.\nDespite it's rapidly growing over the last years, Zig is, still, a very young language^[New programming languages in general, take years and years to be developed.].\nAs a consequence, it's world is still very wild and to be explored.\nThis book is my attempt to help you on your personal journey for\nunderstanding and exploring the exciting world of Zig.\n\nI assume you have previous experience with some programming\nlanguage in this book, not necessarily with a low-level one.\nSo, if you have experience with Python, or Javascript, for example, is fine.\nBut, if you do have experience with low-level languages, such as C, C++, or\nRust, you will probably learn faster throughout this book.\n\n\n\n## What is Zig?\n\nZig is a modern, low-level, and general-purpose programming language. 
Some programmers interpret\nZig as the \"modern C language\". It is a simple language like C, but with some\nmodern features.\n\nIn the author's personal interpretation, Zig is tightly connected with \"less is more\".\nInstead of trying to become a modern language by adding more and more features,\nmany of the core improvements that Zig brings to the\ntable are actually about removing annoying and evil behaviours/features from C and C++.\nIn other words, Zig tries to be better by simplifying the language, and by having more consistent and robust behaviour.\nAs a result, analyzing, writing and debugging applications become much easier and simpler in Zig, than it is in C or C++.\n\nThis philosophy becomes clear with the following phrase from the official website of Zig:\n\n> \"Focus on debugging your application rather than debugging your programming language knowledge\".\n\nThis phrase is specially true for C++ programmers. Because C++ is a gigantic language,\nwith tons of features, and also, there are lots of different \"flavors of C++\". These elements\nare what makes C++ so much complex and hard to learn. Zig tries to go in the opposite direction.\nZig is a very simple language, more closely related to other simple languages such as C and Go.\n\nThe phrase above is still important for C programmers too. Because, even C being a simple\nlanguage, it is still hard sometimes to read and understand C code. For example, pre-processor macros in\nC are an evil source of confusion. They really makes it hard sometimes to debug\nC programs. Because macros are essentially a second language embedded in C that obscures\nyour C code. With macros, you are no longer 100% sure about which pieces\nof code are being sent to the compiler. It obscures the actual source code that you wrote.\n\nYou don't have macros in Zig. 
In Zig, the code you write is the actual code that gets compiled by the compiler.\nYou don't have evil features that obscure your code.\nYou also don't have hidden control flow happening behind the scenes. And, you also\ndon't have functions or operators from the standard library that make\nhidden memory allocations behind your back.\n\nBy being a simpler language, Zig becomes much clearer and easier to read/write,\nbut at the same time, it also achieves a much more robust state, with more consistent\nbehaviour in edge situations. Once again, less is more.\n\n\n## Hello world in Zig\n\nWe begin our journey in Zig by creating a small \"Hello World\" program.\nTo start a new Zig project in your computer, you simply call the `init` command\nfrom the `zig` compiler.\nJust create a new directory in your computer, then, init a new Zig project\ninside this directory, like this:\n\n```bash\nmkdir hello_world\ncd hello_world\nzig init\n```\n\n```\ninfo: created build.zig\ninfo: created build.zig.zon\ninfo: created src/main.zig\ninfo: created src/root.zig\ninfo: see `zig build --help` for a menu of options\n```\n\n### Understanding the project files {#sec-project-files}\n\nAfter you run the `init` command from the `zig` compiler, some new files\nare created inside of your current directory. First, a \"source\" (`src`) directory\nis created, containing two files, `main.zig` and `root.zig`. Each `.zig` file\nis a separate Zig module, which is simply a text file that contains some Zig code.\n\n\nThe `main.zig` file, for example, contains a `main()` function, which represents\nthe entrypoint of your program. It is where the execution of your program begins.\nAs you would expect from C, C++, Rust or Go,\nto build an executable program in Zig, you also need to declare a `main()` function in your module.\nSo, the `main.zig` module represents an executable program written in Zig.\n\nOn the other side, the `root.zig` module does not contain a `main()` function. 
That is because\nit represents a library written in Zig. Libraries are different from executables.\nThey don't need to have an entrypoint to work.\nSo, you can choose which file (`main.zig` or `root.zig`) you want to follow depending on which type\nof project (executable or library) you want to develop.\n\n```bash\ntree .\n```\n\n```\n.\n├── build.zig\n├── build.zig.zon\n└── src\n    ├── main.zig\n    └── root.zig\n\n1 directory, 4 files\n```\n\n\nNow, in addition to the source directory, two other files were created in our working directory:\n`build.zig` and `build.zig.zon`. The first file (`build.zig`) represents a build script written in Zig.\nThis script is executed when you call the `build` command from the `zig` compiler.\nIn other words, this file contains Zig code that executes the necessary steps to build the entire project.\n\nIn general, low-level languages normally use a compiler to build your\nsource code into binary executables or binary libraries.\nNevertheless, this process of compiling your source code and building\nbinary executables or binary libraries from it became a real challenge\nin the programming world once projects became bigger and bigger.\nAs a result, programmers created \"build systems\", which are a second set of tools designed to make this process\nof compiling and building complex projects easier.\n\nExamples of build systems are CMake, GNU Make, GNU Autoconf and Ninja,\nwhich are used to build complex C and C++ projects.\nWith these systems, you can write scripts, which are called \"build scripts\".\nThey are simply scripts that describe the necessary steps to compile/build\nyour project.\n\nHowever, these are separate tools that do not\nbelong to C/C++ compilers, like `gcc` or `clang`.\nAs a result, in C/C++ projects, you not only have to install and\nmanage your C/C++ compilers, but you also have to install and manage\nthese build systems separately.\n\nBut instead of using a separate build system, in Zig, we use the\nZig language 
itself to write build scripts.\nIn other words, Zig contains a native build system in it. And\nwe can use this build system to write small scripts in Zig,\nwhich describe the necessary steps to build/compile our Zig project[^zig-build-system].\nSo, everything you need to build a complex Zig project is the\n`zig` compiler, and nothing more.\n\n[^zig-build-system]: .\n\n\nNow that we have described this topic in more depth, let's focus\non the second generated file (`build.zig.zon`), which is the Zig package manager configuration file,\nwhere you can list and manage the dependencies of your project. Yes, Zig has\na package manager (like `pip` in Python, `cargo` in Rust, or `npm` in JavaScript).\nThe `.zon` extension stands for ZON (Zig Object Notation), the data format in which this file is written,\nand this `build.zig.zon` file is similar to the `package.json` file\nin JavaScript projects, or, the `Pipfile` in Python projects.\n\n\n### Looking at the `root.zig` file {#sec-root-file}\n\nLet's take a look at the `root.zig` file, and start to analyze some of the\nsyntax of Zig.\nThe first thing that you might notice is that every line of code\nthat has an expression in it ends with a semicolon character (`;`). This is\nsimilar to the syntax of other languages such as C, C++ and Rust,\nwhich have the same rule.\n\nAlso, notice the `@import()` call at the first line. We use this built-in function\nto import functionality from other Zig modules into our current module.\nIn other words, the `@import()` function works similarly to the `#include` pre-processor directive\nin C or C++, or, to the `import` statement in Python or JavaScript code.\nIn this example, we are importing the `std` module,\nwhich gives you access to the Zig standard library.\n\nIn this `root.zig` file, we can also see how assignments (i.e. creating new objects)\nare made in Zig. You can create a new object in Zig by using the following syntax\n`(const|var) name = value;`. In the example below, we are creating two constant\nobjects (`std` and `testing`). 
At @sec-assignments we talk more about objects in general.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst testing = std.testing;\n\nexport fn add(a: i32, b: i32) i32 {\n    return a + b;\n}\n```\n:::\n\n\nFunctions in Zig are declared similarly to functions in Rust, using the `fn` keyword. In the example above,\nwe are declaring a function called `add()`, which has two arguments named `a` and `b`, and returns\nan integer number (`i32`) as result.\n\nZig is a statically typed language, but you do not necessarily need\nto write the type of every single object you create across your source code, because the compiler can often infer it.\nBut you do have to explicitly specify the type of every function argument, and also,\nthe return type of every function you create in Zig. So, at least in function declarations,\ntype annotations are mandatory.\n\nWe specify the type of an object or a function argument in Zig by\nusing a colon character (`:`) followed by the type after the name of this object/function argument.\nWith the expressions `a: i32` and `b: i32`, we know that both the `a` and `b` arguments have type `i32`,\nwhich is a signed 32 bit integer. In this part,\nthe syntax in Zig is identical to the syntax in Rust, which also specifies types by\nusing the colon character.\n\nLastly, we have the return type of the function at the end of the line, before we open\nthe curly braces to start writing the function's body, which, in the example above, is\nagain a signed 32 bit integer (`i32`) value. This specific part is different from Rust,\nbecause in Rust, the return type of a function is specified after an arrow (`->`),\nwhile in Zig, we simply declare the return type directly after the parentheses with the function arguments.\n\nWe also have an `export` keyword before the function declaration. This keyword\nis similar to the `extern` keyword in C. 
It exposes the function\nto make it available in the library API.\n\nIn other words, if you have a project where you are currently building\na library for other people to use, you need to expose your functions\nso that they are available in the library's API, and users can call them.\nIf we removed the `export` keyword from the `add()` function declaration,\nthen this function would no longer be exposed in the library object built\nby the `zig` compiler.\n\n\nHaving that in mind, the keyword `export` is a keyword used in libraries written in Zig.\nSo, if you are not currently writing a library in your project, then you do not need to\ncare about this keyword.\n\n\n### Looking at the `main.zig` file {#sec-main-file}\n\nNow that we have learned a lot about Zig's syntax from the `root.zig` file,\nlet's take a look at the `main.zig` file.\nA lot of the elements we saw in `root.zig` are also present in `main.zig`.\nBut we have some other elements that we have not seen yet, so let's dive in.\n\nFirst, look at the return type of the `main()` function in this file.\nWe can see a small change. Now, the return\ntype of the function (`void`) is accompanied by an exclamation mark (`!`).\nWhat this exclamation mark is telling us is that this `main()` function\nmight also return an error.\n\nSo, in this example, the `main()` function can either return `void`, or, return an error.\nThis is an interesting feature of Zig. 
If you write a function, and something inside of\nthe body of this function might return an error, then you are forced to:\n\n- either add the exclamation mark to the return type of the function, to make it clear that\nthis function might return an error.\n- or explicitly handle this error that might occur inside the function, to make sure that,\nif this error does happen, you are prepared, and your function will no longer return an error\nbecause you handled the error inside your function.\n\nIn most programming languages, we normally handle (or deal with) an error through\na *try catch* pattern, and in Zig, this is no different. But, if we look at the `main()` function\nbelow, you can see that we do have a `try` keyword in the 5th line. But we do not have a `catch` keyword\nin this code.\n\nThis means that we are using the keyword `try` to execute code that might return an error,\nwhich is the `stdout.print()` expression. But because we do not have a `catch` keyword in this line,\nwe are not treating (or dealing with) this error.\nSo, if this expression does return an error, we are not catching and solving this error in any way.\nThat is why the exclamation mark was added to the return type of the function.\n\nSo, in essence, the `try` keyword executes the expression `stdout.print()`. If this expression\nreturns a valid value, then the `try` keyword does essentially nothing. It simply passes this value forward. But, if the expression does\nreturn an error, then the `try` keyword returns this error from the function. If the error ends up being returned from\n`main()` itself, the program stops and an error trace is printed to `stderr`.\n\nThis might sound weird to you, if you come from a high-level language. Because in\nhigh-level languages, such as Python, if an error occurs somewhere, this error is automatically\nraised and the execution of your program automatically stops, even if you don't want\nto stop the execution. 
You are obligated to face the error.\n\nBut if you come from a low-level language, then maybe this idea does not sound so weird or distant to you.\nBecause in C, for example, functions normally don't raise errors, and they normally don't stop the execution.\nIn C, error handling\nis done by constantly checking the return value of the function. So, you run the function,\nand then you use an if statement to check if the function returned a value that is valid,\nor if it returned an error. If an error was returned from the function, then the if statement\nwill execute some code that handles this error.\n\nSo, C programmers need to write a lot of if statements to\nconstantly check for errors around their code. And because of that, this simple feature from Zig might be\nextraordinary for them. Because this `try` keyword can automatically unwrap the error,\nand warn you about this error, and let you deal with it, without any extra work from the programmer.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\npub fn main() !void {\n    const stdout = std.io.getStdOut().writer();\n    try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n:::\n\n\nNow, another thing that you might have noticed in this code example is that\nthe `main()` function is marked with the `pub` keyword. This keyword means\n\"public\". 
It marks the `main()` function as a *public function* from this module.\n\nIn other words, every function that you declare in your Zig module is, by default, a private (or \"static\")\nfunction that belongs to this Zig module, and can only be used (or called) from within this same module,\nunless you explicitly mark this function as a public function with the `pub` keyword.\nThis means that the `pub` keyword in Zig does essentially the opposite of what the `static` keyword\ndoes in C/C++.\n\nBy making a function \"public\", you allow other Zig modules to access and call this function,\nand use it for their own purposes.\nAll these other Zig modules need to do is import your module with the `@import()`\nbuilt-in function. Then, they get access to all public functions that are present in\nyour Zig module.\n\n\n### Compiling your source code {#sec-compile-code}\n\nYou can compile your Zig modules into a binary executable by running the `build-exe` command\nfrom the `zig` compiler. You simply list all the Zig modules that you want to build after\nthe `build-exe` command, separated by spaces. In the example below, we are compiling the module `main.zig`.\n\n```bash\nzig build-exe src/main.zig\n```\n\nSince we are building an executable, the `zig` compiler will look for a `main()` function\ndeclared in any of the files that you list after the `build-exe` command. If\nthe compiler does not find a `main()` function declared somewhere, a\ncompilation error will be raised, warning about this mistake.\n\nThe `zig` compiler also offers the `build-lib` and `build-obj` commands, which work\nthe exact same way as the `build-exe` command. 
The only difference is that they compile your\nZig modules into a portable C ABI library, or, into object files, respectively.\n\nIn the case of the `build-exe` command, a binary executable file is created by the `zig`\ncompiler in the root directory of your project.\nIf we take a look now at the contents of our current directory, with a simple `ls` command, we can\nsee the binary file called `main` that was created by the compiler.\n\n```bash\nls\n```\n\n```\nbuild.zig build.zig.zon main src\n```\n\nIf I execute this binary executable, I get the \"Hello World\" message in the terminal,\nas we expected.\n\n```bash\n./main\n```\n\n```\nHello, world!\n```\n\n\n### Compile and execute at the same time {#sec-compile-run-code}\n\nIn the previous section, I presented the `zig build-exe` command, which\ncompiles Zig modules into an executable file. However, this means that,\nin order to execute the executable file, we have to run two different commands.\nFirst, the `zig build-exe` command, and then, we call the executable file\ncreated by the compiler.\n\nBut what if we wanted to perform these two steps,\nall at once, in a single command? We can do that by using the `zig run`\ncommand.\n\n```bash\nzig run src/main.zig\n```\n\n```\nHello, world!\n```\n\n### Compiling the entire project {#sec-compile-project}\n\nJust as I described at @sec-project-files, as our project grows in size and\ncomplexity, we usually prefer to organize the compilation and build process\nof the project into a build script, using some sort of \"build system\".\n\nIn other words, as our project grows in size and complexity,\nthe `build-exe`, `build-lib` and `build-obj` commands become\nharder to use directly. Because then, we start to list\nmore and more modules at the same time. We also\nstart to add built-in compilation flags to customize the\nbuild process for our needs, etc. 
It becomes a lot of work\nto write the necessary commands by hand.\n\nIn C/C++ projects, programmers normally opt to use CMake, Ninja, `Makefile` or `configure` scripts\nto organize this process. However, in Zig, we have a native build system in the language itself.\nSo, we can write build scripts in Zig to compile and build Zig projects. Then, all we\nneed to do is call the `zig build` command to build our project.\n\nSo, when you execute the `zig build` command, the `zig` compiler will search\nfor a Zig module named `build.zig` inside your current directory, which\nshould be your build script, containing the necessary code to compile and\nbuild your project. If the compiler does find this `build.zig` file in your directory,\nthen the compiler will essentially execute a `zig run` command\nover this `build.zig` file, to compile and execute this build\nscript, which, in turn, will compile and build your entire project.\n\n\n```bash\nzig build\n```\n\n\nAfter you execute this \"build project\" command, a `zig-out` directory\nis created in the root of your project directory, where you can find\nthe binary executables and libraries created from your Zig modules\naccording to the build commands that you specified in `build.zig`.\nWe will talk more about the build system in Zig later in this book.\n\nIn the example below, I'm executing the binary executable\nnamed `hello_world` that was generated by the compiler after the\n`zig build` command.\n\n```bash\n./zig-out/bin/hello_world\n```\n\n```\nHello, world!\n```\n\n\n\n## How to learn Zig?\n\nWhat are the best strategies to learn Zig? 
\nFirst of all, of course this book will help you a lot on your journey through Zig.\nBut you will also need some extra resources if you want to be really good at Zig.\n\nAs a first tip, you can join a community of Zig programmers to get some help\nwhen you need it:\n\n- Reddit forum: ;\n- Ziggit community: ;\n- Discord, Slack, Telegram, and others: ;\n\nNow, one of the best ways to learn Zig is to simply read Zig code. Try\nto read Zig code often, and things will become clearer.\nA C/C++ programmer would also probably give you this same tip.\nBecause this strategy really works!\n\nNow, where can you find Zig code to read?\nI personally think that the best way of reading Zig code is to read the source code of the\nZig Standard Library. The Zig Standard Library is available at the [`lib/std` folder](https://github.com/ziglang/zig/tree/master/lib/std)[^zig-lib-std] on\nthe official GitHub repository of Zig. Access this folder, and start exploring the Zig modules.\n\nAlso, a great alternative is to read code from other large Zig\ncodebases, such as:\n\n1. the [Javascript runtime Bun](https://github.com/oven-sh/bun)[^bunjs].\n1. the [game engine Mach](https://github.com/hexops/mach)[^mach].\n1. a [LLama 2 LLM model implementation in Zig](https://github.com/cgbur/llama2.zig/tree/main)[^ll2].\n1. the [financial transactions database `tigerbeetle`](https://github.com/tigerbeetle/tigerbeetle)[^tiger].\n1. the [command-line arguments parser `zig-clap`](https://github.com/Hejsil/zig-clap)[^clap].\n1. the [UI framework `capy`](https://github.com/capy-ui/capy)[^capy].\n1. the [Language Server Protocol implementation for Zig, `zls`](https://github.com/zigtools/zls)[^zls].\n1. 
the [event-loop library `libxev`](https://github.com/mitchellh/libxev)[^xev].\n\n[^xev]: \n[^zls]: \n[^capy]: \n[^clap]: \n[^tiger]: \n[^ll2]: \n[^mach]: \n[^bunjs]: .\n\nAll these assets are available on GitHub,\nand this is great, because we can use the GitHub search bar to our advantage,\nto find Zig code that fits our description.\nFor example, you can always include `lang:Zig` in the GitHub search bar when you\nare searching for a particular pattern. This will limit the search to only Zig modules.\n\n[^zig-lib-std]: \n\nAlso, a great alternative is to consult online resources and documentation.\nHere is a quick list of resources that I personally use from time to time to learn\nmore about the language each day:\n\n- Zig Language Reference: ;\n- Zig Standard Library Reference: ;\n- Zig Guide: ;\n- Karl Seguin Blog: ;\n- Zig News: ;\n- Read the code written by one of the Zig core team members: ;\n- Some livecoding sessions are transmitted in the Zig Showtime Youtube Channel: ;\n\n\nAnother great strategy to learn Zig, or honestly, to learn any language you want,\nis to practice it by solving exercises. For example, there is a famous repository\nin the Zig community called [Ziglings](https://codeberg.org/ziglings/exercises/)[^ziglings],\nwhich contains more than 100 small exercises that you can solve. It is a repository of\ntiny programs written in Zig that are currently broken, and your responsibility is to\nfix these programs, and make them work again.\n\n[^ziglings]: .\n\nA famous tech YouTuber known as *The Primeagen* also posted some videos (on YouTube)\nwhere he solves these exercises from Ziglings. 
The first video is named\n[\"Trying Zig Part 1\"](https://www.youtube.com/watch?v=OPuztQfM3Fg&t=2524s&ab_channel=TheVimeagen)[^prime1].\n\n[^prime1]: .\n\nAnother great alternative is to solve the [Advent of Code exercises](https://adventofcode.com/)[^advent-code].\nThere are people who already took the time to learn and solve the exercises, and they posted\ntheir solutions on GitHub as well. So, in case you need some resource to compare against while solving\nthe exercises, you can look at these two repositories:\n\n- ;\n- ;\n\n[^advent-code]: \n\n\n\n\n\n\n## Creating new objects in Zig (i.e. identifiers) {#sec-assignments}\n\nLet's talk more about objects in Zig. Readers that have past experience\nwith other programming languages might know this concept through\na different name, such as: \"variable\" or \"identifier\". In this book, I choose\nto use the term \"object\" to refer to this concept.\n\nTo create a new object (or a new \"identifier\") in Zig, we use\nthe keywords `const` or `var`. These keywords specify if the object\nthat you are creating is mutable or not.\nIf you use `const`, then the object you are\ncreating is a constant (or immutable) object, which means that once you declare this object, you\ncan no longer change the value stored inside this object.\n\nOn the other hand, if you use `var`, then you are creating a variable (or mutable) object.\nYou can change the value of this object as many times as you want. Using the\nkeyword `var` in Zig is similar to using the keywords `let mut` in Rust.\n\n### Constant objects vs variable objects\n\nIn the code example below, we are creating a new constant object called `age`.\nThis object stores a number representing the age of someone. However, this code example\ndoes not compile successfully. 
Because on the next line of code, we are trying to change the value\nof the object `age` to 25.\n\nThe `zig` compiler detects that we are trying to change\nthe value of an object/identifier that is constant, and because of that,\nthe compiler will raise a compilation error, warning us about the mistake.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 24;\n// The line below is not valid!\nage = 25;\n```\n:::\n\n\n```\nt.zig:10:5: error: cannot assign to constant\n age = 25;\n ~~^~~\n```\n\nIn contrast, if you use `var`, then the object created is a variable object.\nWith `var` you can declare this object in your source code, and then\nchange the value of this object as many times as you want at later points\nin your source code.\n\nSo, using the same code example shown above, if I change the declaration of the\n`age` object to use the `var` keyword, then the program gets compiled successfully.\nBecause now, the `zig` compiler detects that we are changing the value of an\nobject that allows this behaviour, because it is a \"variable object\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = 24;\nage = 25;\n```\n:::\n\n\n\n### Declaring without an initial value\n\nBy default, when you declare a new object in Zig, you must give it\nan initial value. In other words, this means\nthat we have to declare, and, at the same time, initialize every object we\ncreate in our source code.\n\nOn the other hand, you can, in fact, declare a new object in your source code,\nand not give it an explicit value. 
But we need to use a special keyword for that,\nwhich is the `undefined` keyword.\n\nIt is important to emphasize that you should avoid using `undefined` as much as possible.\nBecause when you use this keyword, you leave your object uninitialized, and, as a consequence,\nif for some reason your code uses this object while it is uninitialized, then you will definitely\nhave undefined behaviour and major bugs in your program.\n\nIn the example below, I'm declaring the `age` object again. But this time,\nI do not give it an initial value. The variable is only initialized at\nthe second line of code, where I store the number 25 in this object.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar age: u8 = undefined;\nage = 25;\n```\n:::\n\n\nHaving these points in mind, just remember that you should avoid using `undefined` in your code as much as possible.\nAlways declare and initialize your objects, because this gives you much more safety in your program.\nBut in case you really need to declare an object without initializing it... the\n`undefined` keyword is the way to do it in Zig.\n\n\n### There is no such thing as unused objects\n\nEvery object (be it constant or variable) that you declare in Zig **must be used in some way**. You can give this object\nto a function call, as a function argument, or, you can use it in another expression\nto calculate the value of another object, or, you can call a method that belongs to this\nparticular object. \n\nIt doesn't matter in which way you use it, as long as you use it.\nIf you try to break this rule, i.e. if you try to declare an object, but not use it,\nthe `zig` compiler will not compile your Zig source code, and it will issue an error\nmessage warning that you have unused objects in your code.\n\nLet's demonstrate this with an example. In the source code below, we declare a constant object\ncalled `age`. 
If you try to compile a simple Zig program with this line of code below,\nthe compiler will return an error as demonstrated below:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst age = 15;\n```\n:::\n\n\n```\nt.zig:4:11: error: unused local constant\n const age = 15;\n ^~~\n```\n\nEvery time you declare a new object in Zig, you have two choices:\n\n1. you either use the value of this object;\n2. or you explicitly discard the value of the object;\n\nTo explicitly discard the value of any object (constant or variable), all you need to do is to assign\nthis object to a special character in Zig, which is the underscore (`_`).\nWhen you assign an object to an underscore, like in the example below, the `zig` compiler will automatically\ndiscard the value of this particular object.\n\nYou can see in the example below that, this time, the compiler did not\ncomplain about any \"unused constant\", and successfully compiled our source code.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It compiles!\nconst age = 15;\n_ = age;\n```\n:::\n\n\nNow, remember, every time you assign a particular object to the underscore, this object\nis essentially destroyed. It is discarded by the compiler. This means that you can no longer\nuse this object further in your code. It doesn't exist anymore.\n\nSo if you try to use the constant `age` in the example below, after we discarded it, you\nwill get a loud error message from the compiler (talking about a \"pointless discard\")\nwarning you about this mistake.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\n// It does not compile.\nconst age = 15;\n_ = age;\n// Using a discarded value!\nstd.debug.print(\"{d}\\n\", .{age + 2});\n```\n:::\n\n\n```\nt.zig:7:5: error: pointless discard\n of local constant\n```\n\n\nThis same rule applies to variable objects. Every variable object must also be used in\nsome way. 
And if you assign a variable object to the underscore,\nthis object also gets discarded, and you can no longer use this object.\n\n\n\n### You must mutate every variable object\n\nEvery variable object that you create in your source code must be mutated at some point.\nIn other words, if you declare an object as a variable\nobject, with the keyword `var`, and you do not change the value of this object\nat some point in the future, the `zig` compiler will detect this,\nand it will raise an error warning you about this mistake.\n\nThe concept behind this is that every object you create in Zig should preferably be a\nconstant object, unless you really need an object whose value will\nchange during the execution of your program.\n\nSo, if I try to declare a variable object such as `where_i_live` below,\nand I do not change the value of this object in some way,\nthe `zig` compiler raises an error message with the phrase \"variable is never mutated\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar where_i_live = \"Belo Horizonte\";\n_ = where_i_live;\n```\n:::\n\n\n```\nt.zig:7:5: error: local variable is never mutated\nt.zig:7:5: note: consider using 'const'\n```\n\n## Primitive Data Types\n\nZig has many different primitive data types available for you to use.\nYou can see the full list of available data types at the official\n[Language Reference page](https://ziglang.org/documentation/master/#Primitive-Types)[^lang-data-types].\n\n[^lang-data-types]: .\n\nBut here is a quick list:\n\n- Unsigned integers: `u8`, 8-bit integer; `u16`, 16-bit integer; `u32`, 32-bit integer; `u64`, 64-bit integer; `u128`, 128-bit integer.\n- Signed integers: `i8`, 8-bit integer; `i16`, 16-bit integer; `i32`, 32-bit integer; `i64`, 64-bit integer; `i128`, 128-bit integer.\n- Floating point numbers: `f16`, 16-bit floating point; `f32`, 32-bit floating point; `f64`, 64-bit floating point; `f128`, 128-bit floating point.\n- Boolean: `bool`, represents true or false values.\n- C ABI compatible types: 
`c_long`, `c_char`, `c_short`, `c_ushort`, `c_int`, `c_uint`, and many others.\n- Pointer sized integers: `isize` and `usize`.\n\n\n\n\n\n\n\n## Arrays {#sec-arrays}\n\nYou create arrays in Zig by using a syntax that resembles the C syntax.\nFirst, you specify the size of the array (i.e. the number of elements that will be stored in the array)\nyou want to create inside a pair of brackets.\n\nThen, you specify the data type of the elements that will be stored inside this array.\nAll elements present in an array in Zig must have the same data type. For example, you cannot mix elements\nof type `f32` with elements of type `i32` in the same array.\n\nAfter that, you simply list the values that you want to store in this array inside\na pair of curly braces.\nIn the example below, I am creating two constant objects that contain different arrays.\nThe first object contains an array of 4 integer values, while the second object contains\nan array of 3 floating point values.\n\nNow, you should notice that in the object `ls`, I am\nnot explicitly specifying the size of the array inside of the brackets. Instead\nof using a literal value (like the value 4 that I used in the `ns` object), I am\nusing the special character underscore (`_`). 
This syntax tells the `zig` compiler\nto fill this field with the number of elements listed inside of the curly braces.\nSo, this syntax `[_]` is for lazy (or smart) programmers who leave the job of\ncounting how many elements there are in the curly braces to the compiler.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst ls = [_]f64{432.1, 87.2, 900.05};\n_ = ns; _ = ls;\n```\n:::\n\n\n### Selecting elements of the array\n\nOne very common activity is to select specific portions of an array\nyou have in your source code.\nIn Zig, you can select a specific element from your\narray by simply providing the index of this particular\nelement inside brackets after the object name.\nIn the example below, I am selecting the third element from the\n`ns` array. Notice that Zig is a \"zero-index\" based language,\nlike C, C++, Rust, Python, and many other languages.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\ntry stdout.print(\"{d}\\n\", .{ ns[2] });\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n12\n```\n\n\n:::\n:::\n\n\nIn contrast, you can also select specific slices (or sections) of your array, by using a\nrange selector. Some programmers also call these selectors \"slice selectors\",\nand they also exist in Rust, and have the exact same syntax as in Zig.\nAnyway, a range selector is a special expression in Zig that defines\na range of indexes, and it has the syntax `start..end`.\n\nIn the example below, at the second line of code,\nthe `sl` object stores a slice (or a portion) of the\n`ns` array. More precisely, the elements at index 1 and 2\nin the `ns` array. 
\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\n_ = sl;\n```\n:::\n\n\nWhen you use the `start..end` syntax,\nthe \"end tail\" of the range selector is non-inclusive,\nmeaning that the index at the end is not included in the range that is\nselected from the array.\nTherefore, the range `start..end` covers the indexes from `start` up to `end - 1`.\n\nYou can, for example, create a slice that goes from the first to the\nlast element of the array, by using the `ar[0..ar.len]` syntax.\nIn other words, it is a slice that\naccesses all elements in the array.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ar = [4]u8{48, 24, 12, 6};\nconst sl = ar[0..ar.len];\n_ = sl;\n```\n:::\n\n\nYou can also use the syntax `start..` in your range selector,\nwhich tells the `zig` compiler to select the portion of the array\nthat begins at the `start` index and goes until the last element of the array.\nIn the example below, we are selecting the range from index 1\nuntil the end of the array.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..];\n_ = sl;\n```\n:::\n\n\n\n### More on slices\n\nAs we discussed before, in Zig, you can select specific portions of an existing\narray. 
This is called *slicing* in Zig [@zigguide], because when you select a portion\nof an array, you are creating a slice object from that array.\n\nA slice object is essentially a pointer object accompanied by a length number.\nThe pointer object points to the first element in the slice, and the\nlength number tells the `zig` compiler how many elements there are in this slice.\n\n> Slices can be thought of as a pair of `[*]T` (the pointer to the data) and a `usize` (the element count) [@zigguide].\n\nThrough the pointer contained inside the slice you can access the elements (or values)\nthat are inside this range (or portion) that you selected from the original array.\nBut the length number (which you can access through the `len` property of your slice object)\nis the really big improvement (over C arrays for example) that Zig brings to the table here.\n\nWith this length number,\nthe `zig` compiler can easily check if you are trying to access an index that is out of the bounds of this particular slice,\nor, if you are causing any buffer overflow problems. In the example below,\nwe access the `len` property of the slice `sl`, which tells us that this slice\nhas 2 elements in it.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [4]u8{48, 24, 12, 6};\nconst sl = ns[1..3];\ntry stdout.print(\"{d}\\n\", .{sl.len});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n2\n```\n\n\n:::\n:::\n\n\n\n### Array operators\n\nThere are two array operators available in Zig that are very useful.\nThe array concatenation operator (`++`), and the array multiplication operator (`**`). 
As the name suggests,\nthese are array operators.\n\nOne important detail about these two operators is that they work\nonly when both operands have a size (or \"length\") that is compile-time known.\nWe are going to talk more about\nthe differences between \"compile-time known\" and \"runtime known\" at @sec-compile-time.\nBut for now, keep this information in mind, that you cannot use these operators in every situation.\n\nIn summary, the `++` operator creates a new array that is the concatenation\nof both arrays provided as operands. So, the expression `a ++ b` produces\na new array which contains all the elements from arrays `a` and `b`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst b = [_]u8{4,5};\nconst c = a ++ b;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 4, 5 }\n```\n\n\n:::\n:::\n\n\nThis `++` operator is particularly useful to concatenate strings together.\nStrings in Zig are described in depth at @sec-zig-strings. In summary, a string object in Zig\nis essentially an array of bytes. So, you can use this array concatenation operator\nto effectively concatenate strings together.\n\nIn contrast, the `**` operator is used to replicate an array multiple\ntimes. In other words, the expression `a ** 3` creates a new array\nwhich contains the elements of the array `a` repeated 3 times.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst a = [_]u8{1,2,3};\nconst c = a ** 2;\ntry stdout.print(\"{any}\\n\", .{c});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n{ 1, 2, 3, 1, 2, 3 }\n```\n\n\n:::\n:::\n\n\n\n## Blocks and scopes\n\nBlocks are created in Zig by a pair of curly braces. A block is just a group of\nexpressions (or statements) contained inside of a pair of curly braces. 
All of these expressions that\nare contained inside of this pair of curly braces belong to the same scope.\n\nIn other words, a block just delimits a scope in your code.\nThe objects that you define inside the same block belong to the same\nscope, and, therefore, are accessible from within this scope.\nAt the same time, these objects are not accessible outside of this scope.\nSo, you could also say that blocks are used to limit the scope of the objects that you create in\nyour source code. In less technical terms, blocks are used to specify where in your source code\nyou can access whatever object you have in your source code.\n\nSo, a block is just a group of expressions contained inside a pair of curly braces.\nAnd every block has its own scope separated from the others.\nThe body of a function is a classic example of a block. If statements, for and while loops\n(and any other structure in the language that uses the pair of curly braces)\nare also examples of blocks.\n\nThis means that every if statement, or for loop,\netc., that you create in your source code has its own separate scope.\nThat is why you can't access the objects that you defined inside\nof your for loop (or if statement) in an outer scope, i.e. a scope outside of the for loop.\nBecause you are trying to access an object that belongs to a scope that is different\nfrom your current scope.\n\n\nYou can create blocks within blocks, with multiple levels of nesting.\nYou can also (if you want to) give a label to a particular block, with the colon character (`:`).\nJust write `label:` before you open the pair of curly braces that delimits your block. When you label a block\nin Zig, you can use the `break` keyword to return a value from this block, as if it\nwere a function's body. 
You just write the `break` keyword, followed by the block label in the format `:label`,\nand the expression that defines the value that you want to return.\n\nLike in the example below, where we are returning the value from the `y` object\nfrom the block `add_one`, and saving the result inside the `x` object.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar y: i32 = 123;\nconst x = add_one: {\n y += 1;\n break :add_one y;\n};\nif (x == 124 and y == 124) {\n try stdout.print(\"Hey!\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHey!\n```\n\n\n:::\n:::\n\n\n\n## Type inference {#sec-type-inference}\n\nZig is kind of a strongly typed language. I say \"kind of\" because there are situations\nwhere you don't have to explicitly write the type of every single object in your source code,\nas you would expect from a traditional strongly typed language, such as C and C++.\n\nIn some situations, the `zig` compiler can use type inference to work out the data types for you, easing some of\nthe burden that you carry as a developer.\nThe most common way this happens is through function arguments that receive struct objects\nas input.\n\nIn general, type inference in Zig is done by using the dot character (`.`).\nEvery time you see a dot character written before a struct literal, or before an enum value, or something like that,\nyou know that this dot character is playing a special role in this place. More specifically, it is\ntelling the `zig` compiler something along the lines of: \"Hey! Can you infer the type of this\nvalue for me? Please!\". 
In other words, this dot character is playing a role similar to the `auto` keyword in C++.\n\nI give you some examples of this at @sec-anonymous-struct-literals, where we present anonymous struct literals.\nBecause anonymous struct literals are, essentially, struct literals that use type inference to\ninfer the exact type of this particular struct literal.\nThis type inference is done by looking for some minimal hint of the correct data type to be used.\nYou could say that the `zig` compiler looks for any neighbouring type annotation that might tell it what would be the correct type.\n\nAnother common place where we use type inference in Zig is at switch statements (which we talk about at @sec-switch).\nTake a look at this `fence()` function, which comes from the [`atomic.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/atomic.zig)[^fence-fn]\nfrom the Zig Standard Library.\n\n[^fence-fn]: <https://github.com/ziglang/zig/blob/master/lib/std/atomic.zig>.\n\nThere are a lot of things in this function that we haven't talked about yet, such as:\nwhat does `comptime` mean? `inline`? `extern`? What is this star symbol before `Self`?\nLet's just ignore all of these things, and focus solely on the switch statement\nthat is inside this function.\n\nWe can see that this switch statement uses the `order` object as input. This `order`\nobject is one of the inputs of this `fence()` function, and we can see in the type annotation,\nthat this object is of type `AtomicOrder`. We can also see a bunch of values inside the\nswitch statement that begin with a dot character, such as `.release` and `.acquire`.\n\nBecause these weird values contain a dot character before them, we are asking the `zig`\ncompiler to infer the types of these values inside the switch statement. 
Then, the `zig`\ncompiler is looking into the current context where these values are being used, and it is\ntrying to infer the types of these values.\n\nSince they are being used inside a switch statement, the `zig` compiler looks into the type\nof the input object given to the switch statement, which is the `order` object in this case.\nBecause this object has type `AtomicOrder`, the `zig` compiler infers that these values\nare data members from this type `AtomicOrder`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub inline fn fence(self: *Self, comptime order: AtomicOrder) void {\n // LLVM's ThreadSanitizer doesn't support the normal fences so we specialize for it.\n if (builtin.sanitize_thread) {\n const tsan = struct {\n extern \"c\" fn __tsan_acquire(addr: *anyopaque) void;\n extern \"c\" fn __tsan_release(addr: *anyopaque) void;\n };\n\n const addr: *anyopaque = self;\n return switch (order) {\n .unordered, .monotonic => @compileError(@tagName(order) ++ \" only applies to atomic loads and stores\"),\n .acquire => tsan.__tsan_acquire(addr),\n .release => tsan.__tsan_release(addr),\n .acq_rel, .seq_cst => {\n tsan.__tsan_acquire(addr);\n tsan.__tsan_release(addr);\n },\n };\n }\n\n return @fence(order);\n}\n```\n:::\n\n\nThis is how basic type inference is done in Zig. If we didn't use the dot character before\nthe values inside this switch statement, then, we would be forced to write explicitly\nthe types of these values. For example, instead of writing `.release` we would have to\nwrite `AtomicOrder.release`. We would have to do this for every single value\nin this switch statement, and this is kind of painful. That is why type inference\nis commonly used on switch statements in Zig.\n\n\n\n\n## Control flow {#sec-zig-control-flow}\n\nSometimes, you need to make decisions in your program. Maybe you need to decide\nwhether or not to execute a specific piece of code. Or maybe,\nyou need to apply the same operation over a sequence of values. 
These kinds of tasks\ninvolve using structures that are capable of changing the \"control flow\" of our program.\n\nIn computer science, the term \"control flow\" usually refers to the order in which expressions (or commands)\nare evaluated in a given language or program. But this term is also used to refer\nto structures that are capable of changing this \"evaluation order\" of the commands\nexecuted by a given language/program.\n\nThese structures are better known\nby a set of terms, such as: loops, if/else statements, switch statements, among others. So,\nloops and if/else statements are examples of structures that can change the \"control\nflow\" of our program. The keywords `continue` and `break` are also examples of symbols\nthat can change the order of evaluation, since they can move our program to the next iteration\nof a loop, or make the loop stop completely.\n\n\n### If/else statements\n\nAn if/else statement performs a \"conditional flow operation\".\nA conditional flow control (or choice control) allows you to execute\nor ignore a certain block of commands based on a logical condition.\nMany programmers and computer science professionals also use\nthe term \"branching\" in this case.\nIn essence, we use if/else statements to use the result of a logical test\nto decide whether or not to execute a given block of commands.\n\nIn Zig, we write if/else statements by using the keywords `if` and `else`.\nWe start with the `if` keyword followed by a logical test inside a pair\nof parentheses, and then, a pair of curly braces which contains the lines\nof code to be executed in case the logical test returns the value `true`.\n\nAfter that, you can optionally add an `else` statement. Just add the `else`\nkeyword followed by a pair of curly braces, with the lines of code\nto be executed in case the logical test defined in the `if`\nreturns `false`.\n\nIn the example below, we are testing if the object `x` contains a number\nthat is greater than 10. 
Judging by the output printed to the console,\nwe know that this logical test returned `false`, because the output\nin the console matches the line of code present in the\n`else` branch of the if/else statement.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst x = 5;\nif (x > 10) {\n try stdout.print(\n \"x > 10!\\n\", .{}\n );\n} else {\n try stdout.print(\n \"x <= 10!\\n\", .{}\n );\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nx <= 10!\n```\n\n\n:::\n:::\n\n\n\n\n### Switch statements {#sec-switch}\n\nSwitch statements are also available in Zig.\nA switch statement in Zig has a similar syntax to a switch statement in Rust.\nAs you would expect, to write a switch statement in Zig we use the `switch` keyword.\nWe provide the value that we want to \"switch over\" inside a\npair of parentheses. Then, we list the possible combinations (or \"branches\")\ninside a pair of curly braces.\n\nLet's take a look at the code example below. You can see in this example that\nI'm creating an enum type called `Role`. We talk more about enums at @sec-enum.\nBut in essence, this `Role` type is listing different types of roles in a fictitious\ncompany, like `SE` for Software Engineer, `DE` for Data Engineer, `PM` for Product Manager,\netc.\n\nNotice that we are using the value from the `role` object in the\nswitch statement, to discover which exact area we need to store in the `area` variable object.\nAlso notice that we are using type inference inside the switch statement, with the dot character,\nas we described at @sec-type-inference.\nThis makes the `zig` compiler infer the correct data type of the values (`PM`, `SE`, etc.) for us.\n\nAlso notice that we are grouping multiple values in the same branch of the switch statement.\nWe just separate each possible value with a comma. 
So, for example, if `role` contains either `DE` or `DA`,\nthe `area` variable would contain the value `\"Data & Analytics\"`, instead of `\"Platform\"`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst Role = enum {\n SE, DPE, DE, DA, PM, PO, KS\n};\n\npub fn main() !void {\n var area: []const u8 = undefined;\n const role = Role.SE;\n switch (role) {\n .PM, .SE, .DPE, .PO => {\n area = \"Platform\";\n },\n .DE, .DA => {\n area = \"Data & Analytics\";\n },\n .KS => {\n area = \"Sales\";\n },\n }\n try stdout.print(\"{s}\\n\", .{area});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nPlatform\n```\n\n\n:::\n:::\n\n\nNow, one very important aspect about this switch statement presented\nin the code example above, is that it exhausts all existing possibilities.\nIn other words, all possible values that could be found inside the `role`\nobject are explicitly handled in this switch statement.\n\nSince the `role` object has type `Role`, the only possible values to\nbe found inside this object are `PM`, `SE`, `DPE`, `PO`, `DE`, `DA` and `KS`.\nThere is no other possible value to be stored in this `role` object.\nThis is what \"exhaust all existing possibilities\" means. The switch statement covers\nevery possible case.\n\nIn Zig, switch statements must exhaust all existing possibilities. You cannot write\na switch statement, and leave an edge case with no explicit action to be taken.\nThis is a similar behaviour to switch statements in Rust, which also have to\nhandle all possible cases.\n\nTake a look at the `dump_hex_fallible()` function below as an example. This function\nalso comes from the Zig Standard Library, but this time, it comes from the [`debug.zig` module](https://github.com/ziglang/zig/blob/master/lib/std/debug.zig)[^debug-mod].\nThere are multiple lines in this function, but I omitted them to focus solely on the\nswitch statement found in this function. 
Notice that this switch statement has four\nbranches: three explicit cases, plus an `else` branch.\nWhenever you have multiple possible cases in your switch statement\nto which you want to apply the same exact action, you can use an `else` branch to do that.\n\n[^debug-mod]: <https://github.com/ziglang/zig/blob/master/lib/std/debug.zig>\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn dump_hex_fallible(bytes: []const u8) !void {\n // Many lines ...\n switch (byte) {\n '\\n' => try writer.writeAll(\"␊\"),\n '\\r' => try writer.writeAll(\"␍\"),\n '\\t' => try writer.writeAll(\"␉\"),\n else => try writer.writeByte('.'),\n }\n}\n```\n:::\n\n\nMany users would also use an `else` branch to handle a \"not supported\" case.\nThat is, a case that cannot be properly handled by your code, or, just a case that\nshould not be \"fixed\". So many programmers use an `else` branch to panic (or raise an error) to stop\nthe current execution.\n\nTake the code below as an example. We can see that we are handling the cases\nfor the `level` object being either 1, 2, or 3. All other possible cases are not supported by default,\nand, as a consequence, we raise a runtime error in these cases, through the `@panic()` built-in function.\n\nAlso notice that we are assigning the result of the switch statement to a new object called `category`.\nThis is another thing that you can do with switch statements in Zig. 
If the branches in this switch\nstatement output some value as result, you can store the result value of the switch statement into\na new variable.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 1, 2 => \"beginner\",\n 3 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n:::\n\n\n```\nthread 13103 panic: Not supported level!\nt.zig:9:13: 0x1033c58 in main (switch2)\n @panic(\"Not supported level!\");\n ^\n```\n\nFurthermore, you can also use ranges of values in switch statements.\nThat is, you can create a branch in your switch statement that is used\nwhenever the input value is contained in a range. These range\nexpressions are created with the operator `...`. It is important\nto emphasize that the ranges created by this operator are\ninclusive on both ends.\n\nFor example, I could easily change the code example above to support all\nlevels between 0 and 100, like this:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst level: u8 = 4;\nconst category = switch (level) {\n 0...25 => \"beginner\",\n 26...75 => \"intermediary\",\n 76...100 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nbeginner\n```\n\n\n:::\n:::\n\n\nThis is neat, and it works with character ranges too. That is, I could\nsimply write `'a'...'z'`, to match any character value that is a\nlowercase letter, and it would work fine.\n\n\n\n### The `defer` keyword {#sec-defer}\n\nWith the `defer` keyword you can execute expressions at the end of the current scope.\nTake the `foo()` function below as an example. 
When we execute this function, the expression\nthat prints the message \"Exiting function ...\" gets executed only at\nthe end of the function scope.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nfn foo() !void {\n defer std.debug.print(\n \"Exiting function ...\\n\", .{}\n );\n try stdout.print(\"Adding some numbers ...\\n\", .{});\n const x = 2 + 2; _ = x;\n try stdout.print(\"Multiplying ...\\n\", .{});\n const y = 2 * 8; _ = y;\n}\n\npub fn main() !void {\n try foo();\n}\n```\n:::\n\n\n```\nAdding some numbers ...\nMultiplying ...\nExiting function ...\n```\n\nIt doesn't matter how the function exits (i.e. because\nof an error, or, because of a return statement, or whatever),\njust remember, this expression gets executed when the function exits.\n\n\n\n\n### For loops\n\nA loop allows you to execute the same lines of code multiple times,\nthus, creating a \"repetition space\" in the execution flow of your program.\nLoops are particularly useful when we want to replicate the same function\n(or the same set of commands) over several different inputs.\n\nThere are different types of loops available in Zig. But the most\nessential of them all is probably the *for loop*. A for loop is\nused to apply the same piece of code over the elements of a slice or an array.\n\nFor loops in Zig have a slightly different syntax than you are\nprobably used to seeing in other languages. You start with the `for` keyword, then, you\nlist the items that you want to iterate\nover inside a pair of parentheses. Then, inside of a pair of pipes (`|`)\nyou should declare an identifier that will serve as your iterator, or,\nthe \"repetition index of the loop\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (items) |value| {\n // code to execute\n}\n```\n:::\n\n\nInstead of using a `(value in items)` syntax,\nin Zig, for loops use the syntax `(items) |value|`. 
In the example\nbelow, you can see that we are looping through the items\nof the array stored at the object `name`, and printing to the\nconsole the decimal representation of each character in this array.\n\nIf we wanted, we could also iterate through a slice (or a portion) of\nthe array, instead of iterating through the entire array stored in the `name` object.\nJust use a range selector to select the section you want. For example,\nI could provide the expression `name[0..3]` to the for loop, to iterate\njust through the first 3 elements in the array.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst name = [_]u8{'P','e','d','r','o'};\nfor (name) |char| {\n try stdout.print(\"{d} | \", .{char});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n80 | 101 | 100 | 114 | 111 | \n```\n\n\n:::\n:::\n\n\nIn the above example we are using the value itself of each\nelement in the array as our iterator. But there are many situations where\nwe need to use an index instead of the actual values of the items.\n\nYou can do that by providing a second set of items to iterate over.\nMore precisely, you provide the range selector `0..` to the for loop. So,\nyes, you can use two different iterators at the same time in a for\nloop in Zig.\n\nBut remember from @sec-assignments that every object\nyou create in Zig must be used in some way. So if you declare two iterators\nin your for loop, you must use both iterators inside the for loop body.\nBut if you want to use just the index iterator, and not use the \"value iterator\",\nthen, you can discard the value iterator by matching the\nvalue items to the underscore character, like in the example below:\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfor (name, 0..) |_, i| {\n try stdout.print(\"{d} | \", .{i});\n}\n```\n:::\n\n\n```\n0 | 1 | 2 | 3 | 4 |\n```\n\n\n### While loops\n\nA while loop is created from the `while` keyword. 
While a `for` loop\niterates through the items of an array, a `while` loop\nwill loop continuously, and infinitely, until a logical test\n(specified by you) becomes false.\n\nYou start with the `while` keyword, then, you define a logical\nexpression inside a pair of parentheses, and the body of the\nloop is provided inside a pair of curly braces, like in the example below:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: u8 = 1;\nwhile (i < 5) {\n try stdout.print(\"{d} | \", .{i});\n i += 1;\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 2 | 3 | 4 | \n```\n\n\n:::\n:::\n\n\n\n\n### Using `break` and `continue`\n\nIn Zig, you can explicitly stop the execution of a loop, or, jump to the next iteration of the loop, using\nthe keywords `break` and `continue`, respectively. The `while` loop present in the example below is,\nat first sight, an infinite loop, because the logical value inside the parentheses will always be equal to `true`.\nWhat makes this `while` loop stop when the `i` object reaches the count\n10? It is the `break` keyword!\n\nInside the while loop, we have an if statement that is constantly checking if the `i` variable\nis equal to 10. Since we are increasing the value of this `i` variable at each iteration of the\nwhile loop, at some point, this `i` variable will be equal to 10, and when it is, the if statement\nwill execute the `break` expression, and, as a result, the execution of the while loop is stopped.\n\nNotice the `expect()` function from the Zig standard library after the while loop.\nThis `expect()` function is an \"assert\" type of function.\nThis function checks if the logical test provided is equal to true. If this logical test is false,\nthe function raises an assertion error. 
But if it is equal to true, then, the function will do nothing.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar i: usize = 0;\nwhile (true) {\n if (i == 10) {\n break;\n }\n i += 1;\n}\ntry std.testing.expect(i == 10);\ntry stdout.print(\"Everything worked!\", .{});\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nEverything worked!\n```\n\n\n:::\n:::\n\n\nSince this code example was executed successfully by the `zig` compiler,\nwithout raising any errors, then, we know that, after the execution of the while loop,\nthe `i` variable is equal to 10. Because if it wasn't equal to 10, then, an error would\nbe raised by `expect()`.\n\nNow, in the next example, we have a use case for\nthe `continue` keyword. The if statement is constantly\nchecking if the current index is a multiple of 2. If\nit is, then we jump to the next iteration of the loop\ndirectly. But if the current index is not a multiple of 2,\nthen, the loop will simply print this index to the console.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst ns = [_]u8{1,2,3,4,5,6};\nfor (ns) |i| {\n if ((i % 2) == 0) {\n continue;\n }\n try stdout.print(\"{d} | \", .{i});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n1 | 3 | 5 | \n```\n\n\n:::\n:::\n\n\n\n## Structs and OOP {#sec-structs-and-oop}\n\nZig is a language more closely related to C (which is a procedural language),\nthan it is to C++ or Java (which are object-oriented languages). Because of that, you do not\nhave advanced OOP (Object-Oriented Programming) patterns available in Zig, such as classes, interfaces or\nclass inheritance. Nonetheless, OOP in Zig is still possible by using struct definitions.\n\nWith struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C.\nYou give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. 
You can\nalso register functions inside this struct, and they become the methods of this particular struct (or data type), so that every object\nthat you create with this new type will always have these methods available and associated with it.\n\nIn C++, when we create a new class, we normally have a constructor method (or, a constructor function) to construct or to instantiate every object\nof this particular class, and we also have a destructor method (or a destructor function) that\nis the function responsible for destroying every object of this class.\n\nIn Zig, we normally declare the constructor and the destructor methods\nof our structs, by declaring `init()` and `deinit()` methods inside the struct.\nThis is just a naming convention that you will find across the entire Zig standard library.\nSo, in Zig, the `init()` method of a struct is normally the constructor method of the class represented by this struct.\nWhile the `deinit()` method is the method used for destroying an existing instance of that struct.\n\nThe `init()` and `deinit()` methods are both used extensively in Zig code, and you will see both of\nthem being used when we talk about allocators at @sec-allocators.\nBut, as another example, let's build a simple `User` struct to represent a user of some sort of system.\nIf you look at the `User` struct below, you can see the `struct` keyword, and inside of a\npair of curly braces, we write the struct's body.\n\nNotice the data members of this struct, `id`, `name` and `email`. 
Every data member has its\ntype explicitly annotated, with the colon character (`:`) syntax that we described earlier at @sec-root-file.\nBut also notice that every line in the struct body that describes a data member, ends with a comma character (`,`).\nSo every time you declare a data member in your Zig code, always end the line with a comma character, instead\nof ending it with the traditional semicolon character (`;`).\n\nNext, also notice in this example, that we registered an `init()` function as a method\nof this `User` struct. This `init()` method is the constructor method that you use to instantiate\nevery new `User` object. That is why this `init()` function returns a `User` object as result.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst User = struct {\n id: u64,\n name: []const u8,\n email: []const u8,\n\n pub fn init(id: u64,\n name: []const u8,\n email: []const u8) User {\n\n return User {\n .id = id,\n .name = name,\n .email = email\n };\n }\n\n pub fn print_name(self: User) !void {\n try stdout.print(\"{s}\\n\", .{self.name});\n }\n};\n\npub fn main() !void {\n const u = User.init(1, \"pedro\", \"email@gmail.com\");\n try u.print_name();\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\npedro\n```\n\n\n:::\n:::\n\n\nThe `pub` keyword plays an important role in struct declarations, and OOP in Zig.\nEvery method that you declare in your struct that is marked with the keyword `pub`,\nbecomes a public method of this particular struct.\n\nSo every method that you create in your struct is, at first, a private method\nof that struct, meaning that this method can only be called from within the\nsource file (or module) where the struct is declared. 
But, if you mark this method as public, with the keyword `pub`, then,\nyou can call the method directly from any `User` object you have\nin your code, including code in other source files that import this struct.\n\nIn other words, the functions marked by the keyword `pub`\nare members of the public API of that struct.\nFor example, if I had not marked the `print_name()` method as public,\nthen, code in another source file that imports this struct could not execute the line `u.print_name()`, because it would\nnot be authorized to call this method directly.\n\n\n\n\n## Anonymous struct literals {#sec-anonymous-struct-literals}\n\nYou can declare a struct object as a literal value. When we do that, we normally specify the\ndata type of this struct literal by writing its data type just before the opening curly brace.\nFor example, I could write a struct literal of type `User` that we defined in the previous section like\nthis:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst eu = User {\n .id = 1,\n .name = \"Pedro\",\n .email = \"someemail@gmail.com\"\n};\n_ = eu;\n```\n:::\n\n\nHowever, in Zig, we can also write an anonymous struct literal. That is, you can write a\nstruct literal, but not specify explicitly the type of this particular struct.\nAn anonymous struct is written by using the syntax `.{}`. So, we essentially\nreplaced the explicit type of the struct literal with a dot character (`.`).\n\nAs we described at @sec-type-inference, when you put a dot before a struct literal,\nthe type of this struct literal is automatically inferred by the `zig` compiler.\nIn essence, the `zig` compiler will look for some hint of what is the type of that struct.\nIt can be the type annotation of a function argument,\nor the return type annotation of the function that you are using, or the type annotation\nof a variable.\nIf the compiler does find such a type annotation, then, it will use this\ntype in your literal struct. 
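\n\nAs a minimal sketch of this inference, assuming a hypothetical two-field `User` struct and a hypothetical `show()` helper (both invented here for illustration, not taken from the examples above), the snippet below writes the same kind of anonymous struct literal in two places: once where the hint is the type annotation of a variable, and once where the hint is the type annotation of a function argument.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\n\n// Hypothetical struct, used only for this illustration.\nconst User = struct {\n    id: u64,\n    name: []const u8,\n};\n\n// Hypothetical helper; its argument type is the hint the compiler uses.\nfn show(u: User) void {\n    std.debug.print(\"{d}: {s}\\n\", .{ u.id, u.name });\n}\n\npub fn main() void {\n    // Type inferred from the type annotation of the variable.\n    const a: User = .{ .id = 1, .name = \"Pedro\" };\n    show(a);\n    // Type inferred from the type annotation of the function argument.\n    show(.{ .id = 2, .name = \"Maria\" });\n}\n```\n:::\n\n\nIn both cases no type name is written before the curly braces; the compiler fills it in from the surrounding annotation.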
\n\nAnonymous structs are very common in function arguments in Zig.\nOne example that you have already seen many times, is the `print()`\nfunction from the `stdout` object.\nThis function takes two arguments.\nThe first argument, is a template string, which should\ncontain string format specifiers in it, which tell how the values provided\nin the second argument should be printed into the message.\n\nWhile the second argument is a struct literal that lists the values\nto be printed into the template message specified in the first argument.\nYou normally want to use an anonymous struct literal here, so that, the\n`zig` compiler does the job of specifying the type of this particular\nanonymous struct for you.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\npub fn main() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"Hello, {s}!\\n\", .{\"world\"});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello, world!\n```\n\n\n:::\n:::\n\n\n\n\n\n## How strings work in Zig? {#sec-zig-strings}\n\nThe first project that we are going to build and discuss in this book is a base64 encoder/decoder (@sec-base64).\nBut in order for us to build such a thing, we need to get a better understanding of how strings work in Zig.\nSo let's discuss this specific aspect of Zig.\n\nIn Zig, a string literal (or a string object if you prefer) is a pointer to a null-terminated array\nof bytes. Each byte in this array is represented by a `u8` value, which is an unsigned 8 bit integer,\nso, it is equivalent to the C data type `unsigned char`.\n\nZig always assumes that this sequence of bytes is UTF-8 encoded. 
This might not be true for every\nsequence of bytes you have, but it is not really Zig's job to fix the encoding of your strings\n(you can use [`iconv`](https://www.gnu.org/software/libiconv/)[^libiconv] for that).\nToday, most of the text in our modern world, especially on the web, should be UTF-8 encoded.\nSo if your string literal is not UTF-8 encoded, then, you will likely\nhave problems in Zig.\n\n[^libiconv]: <https://www.gnu.org/software/libiconv/>\n\nLet’s take for example the word \"Hello\". In UTF-8, this sequence of characters (H, e, l, l, o)\nis represented by the sequence of decimal numbers 72, 101, 108, 108, 111. In hexadecimal, this\nsequence is `0x48`, `0x65`, `0x6C`, `0x6C`, `0x6F`. So if I take this sequence of hexadecimal values,\nand ask Zig to print this sequence of bytes as a sequence of characters (i.e. a string), then,\nthe text \"Hello\" will be printed into the terminal:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\n\npub fn main() !void {\n const bytes = [_]u8{0x48, 0x65, 0x6C, 0x6C, 0x6F};\n try stdout.print(\"{s}\\n\", .{bytes});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nHello\n```\n\n\n:::\n:::\n\n\n\nIf you want to see the actual bytes that represent a string in Zig, you can use\na `for` loop to iterate through each byte in the string, and ask Zig to print each byte as a hexadecimal\nvalue to the terminal. 
You do that by using a `print()` statement with the `X` formatting specifier,\nlike you would normally do with the [`printf()` function](https://cplusplus.com/reference/cstdio/printf/)[^printfs] in C.\n\n[^printfs]: <https://cplusplus.com/reference/cstdio/printf/>\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"This is an example of string literal in Zig\";\n try stdout.print(\"Bytes that represents the string object: \", .{});\n for (string_literal) |byte| {\n try stdout.print(\"{X} \", .{byte});\n }\n try stdout.print(\"\\n\", .{});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: 54 68 69 \n 73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F\n 66 20 73 74 72 69 6E 67 20 6C 69 74 65 72 61 6C 20\n 69 6E 20 5A 69 67 \n```\n\n\n:::\n:::\n\n\n### Strings in C\n\nAt first glance, this looks very similar to how C treats strings as well. That is, string values\nin C are also treated internally as an array of bytes, and this array is also null-terminated.\n\nBut one key difference between a Zig string and a C string, is that Zig also stores the length of\nthe array inside the string object. This small detail makes your code safer, because it is much\neasier for the Zig compiler to check if you are trying to access an element that is \"out of bounds\", i.e. if\nyou are trying to access memory that does not belong to you.\n\nTo achieve this same kind of safety in C, you have to do a lot of work that kind of seems pointless.\nSo getting this kind of safety is not automatic and much harder to do in C. 
For example, if you want\nto track the length of your string throughout your program in C, then, you first need to loop through\nthe array of bytes that represents this string, and find the null element (`'\\0'`) position to discover\nwhere exactly the array ends, or, in other words, to find how many elements the array of bytes contains.\n\nTo do that, you would need something like this in C. In this example, the C string stored in\nthe object `array` is 25 bytes long:\n\n```c\n#include <stdio.h>\nint main() {\n char* array = \"An example of string in C\";\n int index = 0;\n while (1) {\n if (array[index] == '\\0') {\n break;\n }\n index++;\n }\n printf(\"Number of elements in the array: %d\\n\", index);\n}\n```\n\n```\nNumber of elements in the array: 25\n```\n\nBut in Zig, you do not have to do this, because the object already contains a `len`\nfield which stores the length information of the array. As an example, the `string_literal` object below is 43 bytes long:\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"This is an example of string literal in Zig\";\n try stdout.print(\"{d}\\n\", .{string_literal.len});\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\n43\n```\n\n\n:::\n:::\n\n\n\n### A better look at the object type\n\nNow, we can better inspect the type of the objects that Zig creates. To check the type of any object in Zig, you can use the\n`@TypeOf()` function. If we look at the type of the `simple_array` object below, you will find that this object\nis an array of 4 elements. Each element is a signed integer of 32 bits which corresponds to the data type `i32` in Zig.\nThat is what an object of type `[4]i32` is.\n\nBut if we look closely at the type of the `string_literal` object below, you will find that this object is a\nconstant pointer (hence the `*const` annotation) to an array of 43 elements (or 43 bytes). 
Each element is a\nsingle byte (more precisely, an unsigned 8 bit integer - `u8`), that is why we have the `[43:0]u8` portion of the type below.\nIn other words, the string stored inside the `string_literal` object is 43 bytes long.\nThat is why you have the type `*const [43:0]u8` below.\n\nIn the case of `string_literal`, it is a constant pointer (`*const`) because the object `string_literal` is declared\nas constant in the source code (in the line `const string_literal = ...`). So, if we changed that for some reason, if\nwe declare `string_literal` as a variable object (i.e. `var string_literal = ...`), then, `string_literal` would be\njust a normal pointer to an array of unsigned 8-bit integers (i.e. `* [43:0]u8`).\n\nNow, if we create an pointer to the `simple_array` object, then, we get a constant pointer to an array of 4 elements (`*const [4]i32`),\nwhich is very similar to the type of the `string_literal` object. This demonstrates that a string object (or a string literal)\nin Zig is already a pointer to an array.\n\nJust remember that a \"pointer to an array\" is different than an \"array\". 
So a string object in Zig is a pointer to an array\nof bytes, and not simply an array of bytes.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"This is an example of string literal in Zig\";\n const simple_array = [_]i32{1, 2, 3, 4};\n try stdout.print(\"Type of array object: {}\", .{@TypeOf(simple_array)});\n try stdout.print(\n \"Type of string object: {}\",\n .{@TypeOf(string_literal)}\n );\n try stdout.print(\n \"Type of a pointer that points to the array object: {}\",\n .{@TypeOf(&simple_array)}\n );\n}\n```\n:::\n\n\n```\nType of array object: [4]i32\nType of string object: *const [43:0]u8\nType of a pointer that points to\n the array object: *const [4]i32\n```\n\n\n### Byte vs unicode points\n\nIs important to point out that each byte in the array is not necessarily a single character.\nThis fact arises from the difference between a single byte and a single unicode point.\n\nThe encoding UTF-8 works by assigning a number (which is called a unicode point) to each character in\nthe string. For example, the character \"H\" is stored in UTF-8 as the decimal number 72. This means that\nthe number 72 is the unicode point for the character \"H\". Each possible character that can appear in a\nUTF-8 encoded string have its own unicode point.\n\nFor example, the Latin Capital Letter A With Stroke (Ⱥ) is represented by the number (or the unicode point)\n570. However, this decimal number (570) is higher than the maximum number stored inside a single byte, which\nis 255. In other words, the maximum decimal number that can be represented with a single byte is 255. 
That is why,\nthe unicode point 570 is actually stored inside the computer’s memory as the bytes `C8 BA`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n const string_literal = \"Ⱥ\";\n try stdout.print(\"Bytes that represents the string object: \", .{});\n for (string_literal) |char| {\n try stdout.print(\"{X} \", .{char});\n }\n}\n```\n\n\n::: {.cell-output .cell-output-stdout}\n\n```\nBytes that represents the string object: C8 BA \n```\n\n\n:::\n:::\n\n\n\nThis means that to store the character Ⱥ in an UTF-8 encoded string, we need to use two bytes together\nto represent the number 570. That is why the relationship between bytes and unicode points is not always\n1 to 1. Each unicode point is a single character in the string, but not always a single byte corresponds\nto a single unicode point.\n\nAll of this means that if you loop trough the elements of a string in Zig, you will be looping through the\nbytes that represents that string, and not through the characters of that string. In the Ⱥ example above,\nthe for loop needed two iterations (instead of a single iteration) to print the two bytes that represents this Ⱥ letter.\n\nNow, all english letters (or ASCII letters if you prefer) can be represented by a single byte in UTF-8. As a\nconsequence, if your UTF-8 string contains only english letters (or ASCII letters), then, you are lucky. Because\nthe number of bytes will be equal to the number of characters in that string. 
In other words, in this specific\nsituation, the relationship between bytes and unicode points is 1 to 1.\n\nBut on the other side, if your string contains other types of letters… for example, you might be working with\ntext data that contains, chinese, japanese or latin letters, then, the number of bytes necessary to represent\nyour UTF-8 string will likely be much higher than the number of characters in that string.\n\nIf you need to iterate through the characters of a string, instead of its bytes, then, you can use the\n`std.unicode.Utf8View` struct to create an iterator that iterates through the unicode points of your string.\n\nIn the example below, we loop through the japanese characters “アメリカ”. Each of the four characters in\nthis string is represented by three bytes. But the for loop iterates four times, one iteration for each\ncharacter/unicode point in this string:\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\npub fn main() !void {\n var utf8 = (\n (try std.unicode.Utf8View.init(\"アメリカ\"))\n .iterator()\n );\n while (utf8.nextCodepointSlice()) |codepoint| {\n try stdout.print(\n \"got codepoint {}\\n\",\n .{std.fmt.fmtSliceHexUpper(codepoint)}\n );\n }\n}\n```\n:::\n\n\n```\ngot codepoint E382A2\ngot codepoint E383A1\ngot codepoint E383AA\ngot codepoint E382AB\n```\n\n\n\n## Other parts of Zig\n\nWe already learned a lot about Zig's syntax, and also, some pretty technical\ndetails about it. 
Just as a quick recap:\n\n- We talked about how functions are written in Zig at @sec-root-file and @sec-main-file.\n- How to create new objects/identifiers at @sec-root-file and specially at @sec-assignments.\n- Basic control flow syntax at @sec-zig-control-flow.\n- How strings work in Zig at @sec-zig-strings.\n- How to use arrays and slices at @sec-arrays.\n- How to import functionality from other Zig modules at @sec-root-file.\n- How Object-Oriented programming can be done in Zig through *Struct declarations* at @sec-structs-and-oop.\n\n\nBut, for now, this amount of knowledge is enough for us to continue with this book.\nLater, over the next chapters we will still talk more about other parts of\nZig's syntax that are also equally important as the other parts. Such as:\n\n- Enums at @sec-enum;\n- Pointers and Optionals at @sec-pointer;\n- Error handling with `try` and `catch`;\n- Unit tests at @sec-unittests;\n- Vectors;\n- Build System at @sec-build-system;\n\n\n\n\n", "supporting": [], "filters": [ "rmarkdown/pagebreak.lua" diff --git a/_freeze/Chapters/09-error-handling/execute-results/html.json b/_freeze/Chapters/09-error-handling/execute-results/html.json index 01c930f..1727797 100644 --- a/_freeze/Chapters/09-error-handling/execute-results/html.json +++ b/_freeze/Chapters/09-error-handling/execute-results/html.json @@ -1,9 +1,11 @@ { - "hash": "7107d1bd349fb4ef208c4ee327a3343b", + "hash": "5c40ebc06038a60d1b098b28ffa2a6c4", "result": { "engine": "knitr", - "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n# Error handling and unions in Zig {#sec-error-handling}\n\nIn this chapter, I want to discuss how error handling is done in Zig.\nWe already briefly learned about one of the available strategies to handle errors in Zig,\nwhich is the `try` keyword presented at @sec-main-file. 
But we still haven't learned about\nthe other methods, such as the `catch` keyword.\nI also want to discuss in this chapter how enum types are created in Zig.\n\n## Learning more about errors in Zig\n\nBefore we get into how error handling is done, we need to learn more about what errors are in Zig.\nAn error is actually a value in Zig [@zigoverview]. In other words, when an error occurs inside your Zig program,\nit means that somewhere in your Zig codebase, an error value is being generated.\nAn error value is similar to any integer value that you create in your Zig code.\nYou can take an error value and pass it as input to a function,\nand you can also cast (or coerce) it into a different type of error value.\n\nThis have some similarities with exceptions in C++ and Python.\nBecause in C++ and Python, when an exception happens inside a `try` block,\nyou can use a `catch` block (in C++) or an `except` block (in Python)\nto capture the exception produced in the `try` block,\nand pass it to functions as an input.\n\n\nAlthough they are normal values as any other, you cannot ignore error values in your Zig code. Meaning that, if an error\nvalue appears somewhere in your source code, this error value must be explicitly handled in some way.\nThis also means that you cannot discard error values by assigning them to a underscore,\nas you could do with normal values and objects.\n\nTake the source code below as an example. Here we are trying to open a file that does not exist\nin my computer, and as a result, an obvious error value of `FileNotFound` is returned from the `openFile()`\nfunction. 
But because I'm assigning the result of this function to an underscore, I end up\ntrying to discard an error value.\n\nThe `zig` compiler detects this mistake, and raises a compile\nerror telling me that I'm trying to discard an error value.\nIt also adds a note message that suggests the use of `try`,\n`catch` or an if statement to explicitly handle this error value.\nThis note is reinforcing that every possible error value must be explicitly handled in Zig.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst dir = std.fs.cwd();\n_ = dir.openFile(\"doesnt_exist.txt\", .{});\n```\n:::\n\n\n```\nt.zig:8:17: error: error set is discarded\nt.zig:8:17: note: consider using 'try', 'catch', or 'if'\n```\n\n### Returning errors from functions\n\nAs we described at @sec-main-file, when we have a function that might return an error\nvalue, this function normally includes an exclamation mark (`!`) in its return type\nannotation. The presence of this exclamation mark indicates that this function might\nreturn an error value as result, and, the `zig` compiler forces you to always handle explicitly\nthe case of this function returning an error value.\n\nTake a look at the `print_name()` function below. This function might return an error in the `stdout.print()` function call,\nand, as a consequence, its return type (`!void`) includes an exclamation mark in it.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn print_name() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"My name is Pedro!\", .{});\n}\n```\n:::\n\n\nIn the example above, we are using the exclamation mark to tell the `zig` compiler\nthat this function might return some error. But which error exactly is returned from\nthis function? For now, we are not specifying a specific error value. We only\nknow for now that some error value (whatever it is) might be returned.\n\nBut in fact, you can (if you want to) specify clearly which exact error values\nmight be returned from this function. 
There are lots of examples of\nthis in the Zig Standard Library. Take this `fill()` function from\nthe `http.Client` module as an example. This function returns\neither an error value of type `ReadError`, or `void`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn fill(conn: *Connection) ReadError!void {\n // The body of this function ...\n}\n```\n:::\n\n\nThis idea of specifying the exact error values that you expect to be returned\nfrom the function is interesting. Because they automatically become some sort of documentation\nof your function, and also, it allows the `zig` compiler to perform some extra checks over\nyour code. Because it can check if there is any other type of error value\nthat is being generated inside your function, and, that it is not being accounted\nfor in this return type annotation.\n\nAnyway, you can list the types of errors that can be returned from the function\nby listing them on the left side of the exclamation mark. While the valid values\nstay on the right side of the exclamation mark. So the syntax format becomes:\n\n```\n<error-value>!<valid-value>\n```\n\n### Error sets\n\nBut what about when we have a single function that might return different types of errors?\nWhen you have such a function, you can list\nall of these different types of errors that can be returned from this function,\nthrough a structure in Zig that we call an *error set*.\n\nAn error set is a special case of a union type.\nIt essentially is a union that contains error values in it.\nNot all programming languages have a notion of a \"union object\".\nBut in summary, a union is just a list of the options that\nan object can be. 
For example, a union of `x`, `y` and `z`, means that\nan object can be either of type `x`, or type `y` or type `z`.\n\nWe are going to talk in more depth about unions at @sec-unions.\nBut you can write an error set by writing the keyword `error` before\na pair of curly braces, then you list the error values that can be\nreturned from the function inside this pair of curly braces.\n\nTake the `resolvePath()` function below as an example, which comes from the\n`introspect.zig` module of the Zig Standard Library. We can see in it's return type annotation, that this\nfunction return either: 1) a valid slice of `u8` values (`[]u8`); or, 2) one of the three different\ntypes of error values listed inside the error set (`OutOfMemory`, `Unexpected`, etc.).\nThis is an example of use of an error set.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn resolvePath(\n ally: mem.Allocator,\n p: []const u8,\n) error{\n OutOfMemory,\n CurrentWorkingDirectoryUnlinked,\n Unexpected,\n}![]u8 {\n // The body of the function ...\n}\n```\n:::\n\n\n\nThis is a valid way of annotating the return value of a Zig function. But, if you navigate through\nthe modules that composes the Zig Standard Library, you will notice that, for the majority of cases,\nthe programmers prefer to give a descriptive name to this error set, and then, use this name (or this \"label\")\nof the error set in the return type annotation, instead of using the error set directly.\n\nWe can see that in the `ReadError` error set that we showed earlier in the `fill()` function,\nwhich is defined in the `http.Client` module.\nSo yes, I presented the `ReadError` as if it was just a standard and single error value, but in fact,\nit is an error set defined in the `http.Client` module, and therefore, it actually represents\na set of different error values that might happen in the `fill()` and other functions.\n\n\nTake a look at the `ReadError` definition reproduced below. 
Notice that we are grouping all of these\ndifferent error values into a single object, and then, we use this object into the return type annotation of the functions.\nLike the `fill()` function that we showed earlier, or, the `readvDirect()` function from the same module,\nwhich is reproduced below.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub const ReadError = error{\n TlsFailure,\n TlsAlert,\n ConnectionTimedOut,\n ConnectionResetByPeer,\n UnexpectedReadFailure,\n EndOfStream,\n};\n// Some lines of code\npub fn readvDirect(\n conn: *Connection,\n buffers: []std.posix.iovec\n ) ReadError!usize {\n // The body of the function ...\n}\n```\n:::\n\n\nSo, an error set is just a convenient way of grouping a set of\npossible error values into a single object, or a single type of an error value.\n\n\n### Casting error values\n\nLet's suppose you have two different error sets, named `A` and `B`.\nIf error set `A` is a superset of error set `B`, then, you can cast (or coerce)\nerror values from `B` into error values of `A`.\n\nError sets are just a set of error values. So, if the error set `A`\ncontains all error values from the error set `B`, then `A`\nbecomes a superset of `B`. You could also say\nthat the error set `B` is a subset of error set `A`.\n\nThe example below demonstrates this idea. 
Because `A` contains all\nvalues from `B`, `A` is a superset of `B`.\nIn math notation, we would say that $A \\supset B$.\nAs a consequence, we can give an error value from `B` as input to the `cast()`\nfunction, and, implicitly cast this input into the same error value, but from the `A` set.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst A = error{\n ConnectionTimeoutError,\n DatabaseNotFound,\n OutOfMemory,\n InvalidToken,\n};\nconst B = error {\n OutOfMemory,\n};\n\nfn cast(err: B) A {\n return err;\n}\n\ntest \"coerce error value\" {\n const error_value = cast(B.OutOfMemory);\n try std.testing.expect(\n error_value == A.OutOfMemory\n );\n}\n```\n:::\n\n\n\n## How to handle errors\n\nNow that we learned more about what errors are in Zig,\nlet's discuss the available strategies to handle these errors,\nwhich are:\n\n- `try` keyword;\n- `catch` keyword;\n- an if statement;\n- `errdefer` keyword;\n\n\n\n### What `try` means?\n\nAs I described over the previous sections, when we say that an expression might\nreturn an error, we are basically referring to an expression that have\na return type in the format `!T`.\nThe `!` indicates that this expression returns either an error value, or a value of type `T`.\n\nAt @sec-main-file, I presented the `try` keyword and where to use it.\nBut I did not talked about what exactly this keyword does to your code,\nor, in other words, I have not explained yet what `try` means in your code.\n\nIn essence, when you use the `try` keyword in an expression, you are telling\nthe `zig` compiler the following: \"Hey! Execute this expression for me,\nand, if this expression return an error, please, return this error for me\nand stop the execution of my program. 
But if this expression returns a valid\nvalue, then, return this value, and move on\".\n\nIn other words, the `try` keyword is essentially, a strategy to enter in panic mode, and stop\nthe execution of your program in case an error occurs.\nWith the `try` keyword, you are telling the `zig` compiler, that stopping the execution\nof your program is the most reasonable strategy to take if an error occurs\nin that particular expression.\n\n### The `catch` keyword\n\nOk, now that we understand properly what `try` means, let's discuss `catch` now.\nOne important detail here, is that you can use `try` or `catch` to handle your errors,\nbut you **cannot use `try` and `catch` together**. In other words, `try` and `catch`\nare different and completely separate strategies in the Zig language.\n\nThis is uncommon, and different than what happens in other languages. Most\nprogramming languages that adopt the *try catch* pattern (such as C++, R, Python, JavaScript, etc.), normally use\nthese two keywords in conjunction to form the complete logic to\nproperly handle the errors.\nAnyway, Zig tries a different approach in the *try catch* pattern.\n\nSo, we learned already about what `try` means, and we also know that both\n`try` and `catch` should be used alone, separate from each other. But\nwhat exactly does `catch` do in Zig? With `catch`, we can construct a block of\nlogic to handle the error value, in case it happens in the current expression.\n\nLook at the code example below. Once again, we go back to the previous\nexample where we were trying to open a file that doesn't exist in my computer,\nbut this time, I use `catch` to actually implement some logic to handle the error, instead of\njust stopping the execution right away.\n\nMore specifically, in this example, I'm using a logger object to record some logs into\nthe system, before I return the error, and stop the execution of the program. 
For example,\nthis could be some part of the codebase of a complex system that I do not have full control over,\nand I want to record these logs before the program crashes, so that I can debug it later\n(e.g. maybe I cannot compile the full program, and properly debug it with a debugger. So, these logs might\nbe a valid strategy to surpass this barrier).\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst dir = std.fs.cwd();\nconst file = dir.openFile(\n \"doesnt_exist.txt\", .{}\n) catch |err| {\n logger.record_context();\n logger.log_error(err);\n return err;\n};\n```\n:::\n\n\n\nTherefore, we use `catch` to create a block of expressions that will handle the error.\nI can return the error value from this block of expressions, like I did in the above example,\nwhich, will make the program enter in panic mode, and, stop the execution.\nBut I could also, return a valid value from this block of code, which would\nbe stored in the `file` object.\n\nNotice that, instead of writing the keyword before the expression that might return the error,\nlike we do with `try`,\nwe write `catch` after the expression. We can open the pair of pipes (`|`),\nwhich captures the error value returned by the expression, and makes\nthis error value available in the scope of the `catch` block as the object named `err`.\nIn other words, because I wrote `|err|` in the code, I can access the error value\nreturned by the expression, by using the `err` object.\n\nAlthough this being the most common use of `catch`, you can also use this keyword\nto handle the error in a \"default value\" style. That is, if the expression returns\nan error, we use the default value instead. Otherwise, we use the valid value returned\nby the expression.\n\nThe Zig official language reference, provides a great example of this \"default value\"\nstrategy with `catch`. This example is reproduced below. Notice that we are trying to parse\nsome unsigned integer from a string object named `str`. 
In other words, this function\nis trying to transform an object of type `[]const u8` (i.e. an array of characters, a string, etc.)\ninto an object of type `u64`.\n\nBut this parsing process done by the function `parseU64()` may fail, resulting in a runtime error.\nThe `catch` keyword used in this example provides an alternative value (13) to be used in case\nthis `parseU64()` function raises an error. So, the expression below essentially means:\n\"Hey! Please, parse this string into a `u64` for me, and store the result into the\nobject `number`. But, if an error occurs, then, use the value `13` instead\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst number = parseU64(str, 10) catch 13;\n```\n:::\n\n\nSo, at the end of this process, the object `number` will contain either a `u64` integer\nthat was parsed successfully from the input string `str`, or, if an error in the\nparsing process occurs, it will contain the `u64` value `13` that was provided by the `catch`\nkeyword as the \"default\", or, the \"alternative\" value.\n\n\n\n### Using if statements\n\nNow, you can also use if statements to handle errors in your Zig code.\nIn the example below, I'm reproducing the previous example, where\nwe try to parse an integer value from an input string with a function\nnamed `parseU64()`.\n\nWe execute the expression inside the \"if\". If this expression returns an\nerror value, the \"if branch\" (or, the \"true branch\") of the if statement is not executed.\nBut if this expression returns a valid value instead, then, this value is unwrapped\ninto the `number` object.\n\nThis means that, if the `parseU64()` expression returns a valid value, this value becomes available\ninside the scope of this \"if branch\" (i.e. the \"true branch\") through the object that we listed inside the pair\nof pipe characters (`|`), which is the object `number`.\n\nIf an error occurs, we can use an \"else branch\" (or the \"false branch\") of the if statement\nto handle the error. 
In the example below, we are using the `else` in the if statement\nto unwrap the error value (that was returned by `parseU64()`) into the `err` object,\nand handle the error.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nif (parseU64(str, 10)) |number| {\n // do something with `number` here\n} else |err| {\n // handle the error value.\n}\n```\n:::\n\n\nNow, if the expression that you are executing returns different types of error values,\nand you want to take a different action for each of these types of error values, the\n`catch` keyword becomes limited.\n\nFor this type of situation, the official documentation\nof the language suggests the use of a switch statement with an if statement [@zigdocs].\nThe basic idea is, to use the if statement to execute the expression, and\nuse the \"else branch\" to pass the error value to a switch statement, where\nyou define a different action for each type of error value that might be\nreturned by the expression executed in the if statement.\n\nThe example below demonstrates this idea. We first try to add (or register) a set of\ntasks to a queue. If this \"registration process\" succeeds, we then try\nto distribute these tasks across the workers of our system. But\nif this \"registration process\" returns an error value, we then use a switch\nstatement in the \"else branch\" to handle each possible error value.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nif (add_tasks_to_queue(&queue, tasks)) |_| {\n distribute_tasks(&queue);\n} else |err| switch (err) {\n error.InvalidTaskName => {\n // do something\n },\n error.TimeoutTooBig => {\n // do something\n },\n error.QueueNotFound => {\n // do something\n },\n // and all the other error options ...\n}\n```\n:::\n\n\n\n### The `errdefer` keyword\n\nA common pattern in C programs in general, is to clean resources when an error occurs during\nthe execution of the program. In other words, one common way to handle errors, is to perform\n\"cleanup actions\" before we exit our program. 
This garantees that a runtime error does not make\nour program to leak resources of the system.\n\n\nThe `errdefer` keyword is a tool to perform such \"cleanup actions\" in hostile situations.\nThis keyword is commonly used to clean (or to free) allocated resources, before the execution of our program\nget's stopped because of an error value being generated.\n\nThe basic idea is to provide an expression to the `errdefer` keyword. Then,\n`errdefer` executes this expression if, and only if, an error occurs\nduring the execution of the current scope.\nIn the example below, we are using an allocator object (that we presented at @sec-allocators)\nto create a new `User` object. If we are succesfull in creating and registering this new user,\nthis `create_user()` function will return this new `User` object as it's return value.\n\nHowever, if for some reason, an error value is generated by some expression\nthat is after the `errdefer` line, for example, in the `db.add(user)` expression,\nthe expression registered by `errdefer` get's executed before the error value is returned\nfrom the function, and before the program enters in panic mode and stops the\ncurrent execution.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn create_user(db: Database, allocator: Allocator) !User {\n const user = try allocator.create(User);\n errdefer allocator.destroy(user);\n\n // Register new user in the Database.\n _ = try db.register_user(user);\n return user;\n}\n```\n:::\n\n\nBy using `errdefer` to destroy the `user` object that we have just created,\nwe garantee that the memory allocated for this `user` object\nget's freed, before the execution of the program stops.\n\nBecause if the expression `try db.add(user)` returns an error value,\nthe execution of our program stops, and we loose all references and control over the memory\nthat we have allocated for the `user` object.\nAs a result, if we do not free the memory associated with the `user` object before the program stops,\nwe cannot 
free this memory anymore. We simply loose our chance to do the right thing.\nThat is why `errdefer` is essential in this situation.\n\nHaving all this in mind, the `errdefer` keyword is different but also similar\nto the `defer` keyword. The only difference between the two is when the provided expression\nget's executed. The `defer` keyword always execute the provided expression at the end of the\ncurrent scope, while `errdefer` executes the provided expression when an error occurs in the\ncurrent scope.\n\n\n\n## Union type in Zig {#sec-unions}\n\nAn union type defines a set of types that an object can be. It is like a list of\noptions. Each option is a type that an object can assume. Therefore, unions in Zig\nhave the same meaning, or, the same role as unions in C. They are used for the same purpose.\nYou could also say that unions in Zig produces a similar effect to\n[`typing.Union` in Python](https://docs.python.org/3/library/typing.html#typing.Union)[^pyunion].\n\n[^pyunion]: \n\nFor example, you might be creating an API that sends data to a data lake, hosted\nin some private cloud infrastructure. Suppose you created different structs in your codebase,\nto store the necessary information that you need, in order to connect to the services of\neach mainstream data lake service (Amazon S3, Azure Blob, etc.).\n\nNow, suppose you also have a function named `send_event()` that receives an event as input,\nand, a target data lake, and it sends the input event to the data lake specified in the\ntarget data lake argument. But this target data lake could be any of the three mainstream data lakes\nservices (Amazon S3, Azure Blob, etc.). 
Here is where an union can help you.\n\nThe union `LakeTarget` defined below allows the `lake_target` argument of `send_event()`\nto be either an object of type `AzureBlob`, or type `AmazonS3`, or type `GoogleGCP`.\nThis union allows the `send_event()` function to receive an object of any of these three types\nas input in the `lake_target` argument.\n\nRemember that each of these three types\n(`AmazonS3`, `GoogleGCP` and `AzureBlob`) are separate structs that we defined in\nour source code. So, at first glance, they are separate data types in our source code.\nBut is the `union` keyword that unifies them into a single data type called `LakeTarget`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst LakeTarget = union {\n azure: AzureBlob,\n amazon: AmazonS3,\n google: GoogleGCP,\n};\n\nfn send_event(\n event: Event,\n lake_target: LakeTarget\n) -> bool {\n // body of the function ...\n}\n```\n:::\n\n\nAn union definition is composed by a list of data members. Each data member is of a specific data type.\nIn the example above, the `LakeTarget` union have three data members (`azure`, `amazon`, `google`).\nWhen you instantiate an object that uses an union type, you can only use one of it's data members\nin this instantiation.\n\nYou could also interpret this as: only one data member of an union type can be activated at a time, the other data\nmembers remain deactivated and unaccessible. For example, if you create a `LakeTarget` object that uses\nthe `azure` data member, you can no longer use or access the data members `google` or `amazon`.\nIt is like if these other data members didn't exist at all in the `LakeTarget` type.\n\nYou can see this logic in the example below. Notice that, we first instantiate the union\nobject using the `azure` data member. As a result, this `target` object contains only\nthe `azure` data member inside of it. Only this data member is active in this object.\nThat is why the last line in this code example is invalid. 
Because we are trying to instantiate the data member\n`google`, which is currently inactive for this `target` object, and as a result, the program\nenters in panic mode warning us about this mistake through a loud error message.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar target = LakeTarget {\n .azure = AzureBlob.init()\n};\n// Only the `azure` data member exist inside\n// the `target` object, and, as a result, this\n// line below is invalid:\ntarget.google = GoogleGCP.init();\n```\n:::\n\n\n```\nthread 2177312 panic: access of union field 'google' while\n field 'azure' is active:\n target.google = GoogleGCP.init();\n ^\n```\n\nSo, when you instantiate an union object, you must choose one of the data types (or, one of the data members)\nlisted in the union type. In the example above, I choose to use the `azure` data member, and, as a result,\nall other data members were automatically deactivated,\nand you can no longer use them after you instantiate the object.\n\nYou can activate another data member by completely redefining the entire enum object.\nIn the example below, I initially use the `azure` data member. But then, I redefine the\n`target` object to use a new `LakeTarget` object, which uses this time the `google` data member.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar target = LakeTarget {\n .azure = AzureBlob.init()\n};\ntarget = LakeTarget {\n .google = GoogleGCP.init()\n};\n```\n:::\n\n\nAn curious fact about union types, is that, at first, you cannot use them in switch statements (that we preseted at @sec-switch).\nIn other words, if you have an object of type `LakeTarget` for example, you cannot give this object\nto a switch statement as input.\n\nBut what if you really need to do so? What if you actually need to\nprovide an \"union object\" to a switch statement? The answer to this question relies on another special type in Zig,\nwhich are the *tagged unions*. 
To create a tagged union, all you have to do is to add\nan enum type into your union declaration.\n\nAs an example of a tagged union in Zig, take the `Registry` type exposed\nbelow. This type comes from the\n[`grammar.zig` module](https://github.com/ziglang/zig/blob/30b4a87db711c368853b3eff8e214ab681810ef9/tools/spirv/grammar.zig)[^grammar]\nfrom the Zig repository. This union type lists different types of registries.\nBut notice this time, the use of `(enum)` after the `union` keyword. This is what makes\nthis union type a tagged union. Also, by being a tagged union, an object of this `Registry` type\ncan be used as input in a switch statement. This is all you have to do. Just add `(enum)`\nto your `union` declaration, and you can use it in switch statements.\n\n[^grammar]: .\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub const Registry = union(enum) {\n core: CoreRegistry,\n extension: ExtensionRegistry,\n};\n```\n:::\n", - "supporting": [], + "markdown": "---\nengine: knitr\nknitr: true\nsyntax-definition: \"../Assets/zig.xml\"\n---\n\n\n\n\n\n\n# Error handling and unions in Zig {#sec-error-handling}\n\nIn this chapter, I want to discuss how error handling is done in Zig.\nWe already briefly learned about one of the available strategies to handle errors in Zig,\nwhich is the `try` keyword presented at @sec-main-file. But we still haven't learned about\nthe other methods, such as the `catch` keyword.\nI also want to discuss in this chapter how union types are created in Zig.\n\n## Learning more about errors in Zig\n\nBefore we get into how error handling is done, we need to learn more about what errors are in Zig.\nAn error is actually a value in Zig [@zigoverview]. 
In other words, when an error occurs inside your Zig program,\nit means that somewhere in your Zig codebase, an error value is being generated.\nAn error value is similar to any integer value that you create in your Zig code.\nYou can take an error value and pass it as input to a function,\nand you can also cast (or coerce) it into a different type of error value.\n\nThis has some similarities with exceptions in C++ and Python,\nbecause in C++ and Python, when an exception happens inside a `try` block,\nyou can use a `catch` block (in C++) or an `except` block (in Python)\nto capture the exception produced in the `try` block,\nand pass it to functions as an input.\n\n\nAlthough they are normal values like any other, you cannot ignore error values in your Zig code. Meaning that, if an error\nvalue appears somewhere in your source code, this error value must be explicitly handled in some way.\nThis also means that you cannot discard error values by assigning them to an underscore,\nas you could do with normal values and objects.\n\nTake the source code below as an example. Here we are trying to open a file that does not exist\non my computer, and as a result, an obvious error value of `FileNotFound` is returned from the `openFile()`\nfunction. 
But because I'm assigning the result of this function to an underscore, I end up\ntrying to discard an error value.\n\nThe `zig` compiler detects this mistake, and raises a compile\nerror telling me that I'm trying to discard an error value.\nIt also adds a note message that suggests the use of `try`,\n`catch` or an if statement to explicitly handle this error value.\nThis note is reinforcing that every possible error value must be explicitly handled in Zig.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst dir = std.fs.cwd();\n_ = dir.openFile(\"doesnt_exist.txt\", .{});\n```\n:::\n\n\n```\nt.zig:8:17: error: error set is discarded\nt.zig:8:17: note: consider using 'try', 'catch', or 'if'\n```\n\n### Returning errors from functions\n\nAs we described at @sec-main-file, when we have a function that might return an error\nvalue, this function normally includes an exclamation mark (`!`) in its return type\nannotation. The presence of this exclamation mark indicates that this function might\nreturn an error value as result, and, the `zig` compiler forces you to always handle explicitly\nthe case of this function returning an error value.\n\nTake a look at the `print_name()` function below. This function might return an error in the `stdout.print()` function call,\nand, as a consequence, its return type (`!void`) includes an exclamation mark in it.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn print_name() !void {\n const stdout = std.io.getStdOut().writer();\n try stdout.print(\"My name is Pedro!\", .{});\n}\n```\n:::\n\n\nIn the example above, we are using the exclamation mark to tell the `zig` compiler\nthat this function might return some error. But which error exactly is returned from\nthis function? For now, we are not specifying a specific error value. We only\nknow for now that some error value (whatever it is) might be returned.\n\nBut in fact, you can (if you want to) specify clearly which exact error values\nmight be returned from this function. 
There are a lot of examples of\nthis in the Zig Standard Library. Take this `fill()` function from\nthe `http.Client` module as an example. This function returns\neither an error value of type `ReadError`, or `void`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn fill(conn: *Connection) ReadError!void {\n // The body of this function ...\n}\n```\n:::\n\n\nThis idea of specifying the exact error values that you expect to be returned\nfrom the function is interesting, because these annotations automatically become some sort of documentation\nof your function, and they also allow the `zig` compiler to perform some extra checks over\nyour code. The compiler can check if there is any other type of error value\nthat is being generated inside your function, and that is not being accounted\nfor in this return type annotation.\n\nAnyway, you can list the types of errors that can be returned from the function\nby listing them on the left side of the exclamation mark, while the valid values\nstay on the right side of the exclamation mark. So the syntax format becomes:\n\n```\n<error-value>!<valid-value>\n```\n\n### Error sets\n\nBut what about when we have a single function that might return different types of errors?\nWhen you have such a function, you can list\nall of these different types of errors that can be returned from this function,\nthrough a structure in Zig that we call an *error set*.\n\nAn error set is a special case of a union type.\nIt essentially is a union that contains error values in it.\nNot all programming languages have a notion of a \"union object\".\nBut in summary, a union is just a list of the options that\nan object can be. 
For example, a union of `x`, `y` and `z`, means that\nan object can be either of type `x`, or type `y` or type `z`.\n\nWe are going to talk in more depth about unions at @sec-unions.\nBut you can write an error set by writing the keyword `error` before\na pair of curly braces, then you list the error values that can be\nreturned from the function inside this pair of curly braces.\n\nTake the `resolvePath()` function below as an example, which comes from the\n`introspect.zig` module of the Zig Standard Library. We can see in its return type annotation, that this\nfunction returns either: 1) a valid slice of `u8` values (`[]u8`); or, 2) one of the three different\ntypes of error values listed inside the error set (`OutOfMemory`, `Unexpected`, etc.).\nThis is an example of an error set in use.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub fn resolvePath(\n ally: mem.Allocator,\n p: []const u8,\n) error{\n OutOfMemory,\n CurrentWorkingDirectoryUnlinked,\n Unexpected,\n}![]u8 {\n // The body of the function ...\n}\n```\n:::\n\n\n\nThis is a valid way of annotating the return value of a Zig function. But, if you navigate through\nthe modules that compose the Zig Standard Library, you will notice that, for the majority of cases,\nthe programmers prefer to give a descriptive name to this error set, and then, use this name (or this \"label\")\nof the error set in the return type annotation, instead of using the error set directly.\n\nWe can see that in the `ReadError` error set that we showed earlier in the `fill()` function,\nwhich is defined in the `http.Client` module.\nSo yes, I presented the `ReadError` as if it was just a standard and single error value, but in fact,\nit is an error set defined in the `http.Client` module, and therefore, it actually represents\na set of different error values that might happen in the `fill()` and other functions.\n\n\nTake a look at the `ReadError` definition reproduced below. 
Notice that we are grouping all of these\ndifferent error values into a single object, and then, we use this object in the return type annotation of functions,\nsuch as the `fill()` function that we showed earlier, or the `readvDirect()` function from the same module,\nwhich is reproduced below.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub const ReadError = error{\n TlsFailure,\n TlsAlert,\n ConnectionTimedOut,\n ConnectionResetByPeer,\n UnexpectedReadFailure,\n EndOfStream,\n};\n// Some lines of code\npub fn readvDirect(\n conn: *Connection,\n buffers: []std.posix.iovec\n ) ReadError!usize {\n // The body of the function ...\n}\n```\n:::\n\n\nSo, an error set is just a convenient way of grouping a set of\npossible error values into a single object, or a single type of error value.\n\n\n### Casting error values\n\nLet's suppose you have two different error sets, named `A` and `B`.\nIf error set `A` is a superset of error set `B`, then, you can cast (or coerce)\nerror values from `B` into error values of `A`.\n\nError sets are just a set of error values. So, if the error set `A`\ncontains all error values from the error set `B`, then `A`\nbecomes a superset of `B`. You could also say\nthat the error set `B` is a subset of error set `A`.\n\nThe example below demonstrates this idea. 
Because `A` contains all\nvalues from `B`, `A` is a superset of `B`.\nIn math notation, we would say that $A \\supset B$.\nAs a consequence, we can give an error value from `B` as input to the `cast()`\nfunction, and, implicitly cast this input into the same error value, but from the `A` set.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst std = @import(\"std\");\nconst A = error{\n ConnectionTimeoutError,\n DatabaseNotFound,\n OutOfMemory,\n InvalidToken,\n};\nconst B = error {\n OutOfMemory,\n};\n\nfn cast(err: B) A {\n return err;\n}\n\ntest \"coerce error value\" {\n const error_value = cast(B.OutOfMemory);\n try std.testing.expect(\n error_value == A.OutOfMemory\n );\n}\n```\n:::\n\n\n\n## How to handle errors\n\nNow that we have learned more about what errors are in Zig,\nlet's discuss the available strategies to handle these errors,\nwhich are:\n\n- `try` keyword;\n- `catch` keyword;\n- an if statement;\n- `errdefer` keyword;\n\n\n\n### What `try` means?\n\nAs I described over the previous sections, when we say that an expression might\nreturn an error, we are basically referring to an expression that has\na return type in the format `!T`.\nThe `!` indicates that this expression returns either an error value, or a value of type `T`.\n\nAt @sec-main-file, I presented the `try` keyword and where to use it.\nBut I did not talk about what exactly this keyword does to your code,\nor, in other words, I have not explained yet what `try` means in your code.\n\nIn essence, when you use the `try` keyword in an expression, you are telling\nthe `zig` compiler the following: \"Hey! Execute this expression for me,\nand, if this expression returns an error, please, return this error for me\nand stop the execution of my program. 
But if this expression returns a valid\nvalue, then, return this value, and move on\".\n\nIn other words, the `try` keyword is essentially a strategy to enter panic mode, and stop\nthe execution of your program in case an error occurs.\nWith the `try` keyword, you are telling the `zig` compiler, that stopping the execution\nof your program is the most reasonable strategy to take if an error occurs\nin that particular expression.\n\n### The `catch` keyword\n\nOk, now that we understand properly what `try` means, let's discuss `catch` now.\nOne important detail here, is that you can use `try` or `catch` to handle your errors,\nbut you **cannot use `try` and `catch` together**. In other words, `try` and `catch`\nare different and completely separate strategies in the Zig language.\n\nThis is uncommon, and different from what happens in other languages. Most\nprogramming languages that adopt the *try catch* pattern (such as C++, R, Python, JavaScript, etc.), normally use\nthese two keywords in conjunction to form the complete logic to\nproperly handle the errors.\nAnyway, Zig tries a different approach in the *try catch* pattern.\n\nSo, we learned already about what `try` means, and we also know that both\n`try` and `catch` should be used alone, separate from each other. But\nwhat exactly does `catch` do in Zig? With `catch`, we can construct a block of\nlogic to handle the error value, in case it happens in the current expression.\n\nLook at the code example below. Once again, we go back to the previous\nexample where we were trying to open a file that doesn't exist on my computer,\nbut this time, I use `catch` to actually implement a logic to handle the error, instead of\njust stopping the execution right away.\n\nMore specifically, in this example, I'm using a logger object to record some logs into\nthe system, before I return the error, and stop the execution of the program. 
For example,\nthis could be some part of the codebase of a complex system that I do not have full control over,\nand I want to record these logs before the program crashes, so that I can debug it later\n(e.g. maybe I cannot compile the full program, and properly debug it with a debugger. So, these logs might\nbe a valid strategy to overcome this barrier).\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst dir = std.fs.cwd();\nconst file = dir.openFile(\n \"doesnt_exist.txt\", .{}\n) catch |err| {\n logger.record_context();\n logger.log_error(err);\n return err;\n};\n```\n:::\n\n\n\nTherefore, we use `catch` to create a block of expressions that will handle the error.\nI can return the error value from this block of expressions, like I did in the above example,\nwhich will make the program enter panic mode, and stop the execution.\nBut I could also return a valid value from this block of code, which would\nbe stored in the `file` object.\n\nNotice that, instead of writing the keyword before the expression that might return the error,\nlike we do with `try`,\nwe write `catch` after the expression. We can open the pair of pipes (`|`),\nwhich captures the error value returned by the expression, and makes\nthis error value available in the scope of the `catch` block as the object named `err`.\nIn other words, because I wrote `|err|` in the code, I can access the error value\nreturned by the expression, by using the `err` object.\n\nAlthough this is the most common use of `catch`, you can also use this keyword\nto handle the error in a \"default value\" style. That is, if the expression returns\nan error, we use the default value instead. Otherwise, we use the valid value returned\nby the expression.\n\nThe official Zig language reference provides a great example of this \"default value\"\nstrategy with `catch`. This example is reproduced below. Notice that we are trying to parse\nsome unsigned integer from a string object named `str`. 
In other words, this function\nis trying to transform an object of type `[]const u8` (i.e. an array of characters, a string, etc.)\ninto an object of type `u64`.\n\nBut this parsing process done by the function `parseU64()` may fail, resulting in a runtime error.\nThe `catch` keyword used in this example provides an alternative value (13) to be used in case\nthis `parseU64()` function raises an error. So, the expression below essentially means:\n\"Hey! Please, parse this string into a `u64` for me, and store the results into the\nobject `number`. But, if an error occurs, then, return the value `13` instead\".\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst number = parseU64(str, 10) catch 13;\n```\n:::\n\n\nSo, at the end of this process, the object `number` will contain either a `u64` integer\nthat was parsed successfully from the input string `str`, or, if an error in the\nparsing process occurs, it will contain the `u64` value `13` that was provided by the `catch`\nkeyword as the \"default\", or, the \"alternative\" value.\n\n\n\n### Using if statements\n\nNow, you can also use if statements to handle errors in your Zig code.\nIn the example below, I'm reproducing the previous example, where\nwe try to parse an integer value from an input string with a function\nnamed `parseU64()`.\n\nWe execute the expression inside the \"if\". If this expression returns an\nerror value, the \"if branch\" (or, the \"true branch\") of the if statement is not executed.\nBut if this expression returns a valid value instead, then, this value is unwrapped\ninto the `number` object.\n\nThis means that, if the `parseU64()` expression returns a valid value, this value becomes available\ninside the scope of this \"if branch\" (i.e. the \"true branch\") through the object that we listed inside the pair\nof pipe characters (`|`), which is the object `number`.\n\nIf an error occurs, we can use an \"else branch\" (or the \"false branch\") of the if statement\nto handle the error. 
In the example below, we are using the `else` in the if statement\nto unwrap the error value (that was returned by `parseU64()`) into the `err` object,\nand handle the error.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nif (parseU64(str, 10)) |number| {\n // do something with `number` here\n} else |err| {\n // handle the error value.\n}\n```\n:::\n\n\nNow, if the expression that you are executing returns different types of error values,\nand you want to take a different action for each of these types of error values, the\n`catch` keyword becomes limited.\n\nFor this type of situation, the official documentation\nof the language suggests the use of a switch statement with an if statement [@zigdocs].\nThe basic idea is to use the if statement to execute the expression, and\nuse the \"else branch\" to pass the error value to a switch statement, where\nyou define a different action for each type of error value that might be\nreturned by the expression executed in the if statement.\n\nThe example below demonstrates this idea. We first try to add (or register) a set of\ntasks to a queue. If this \"registration process\" succeeds, we then try\nto distribute these tasks across the workers of our system. But\nif this \"registration process\" returns an error value, we then use a switch\nstatement in the \"else branch\" to handle each possible error value.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nif (add_tasks_to_queue(&queue, tasks)) |_| {\n distribute_tasks(&queue);\n} else |err| switch (err) {\n error.InvalidTaskName => {\n // do something\n },\n error.TimeoutTooBig => {\n // do something\n },\n error.QueueNotFound => {\n // do something\n },\n // and all the other error options ...\n}\n```\n:::\n\n\n\n### The `errdefer` keyword\n\nA common pattern in C programs in general, is to clean up resources when an error occurs during\nthe execution of the program. In other words, one common way to handle errors, is to perform\n\"cleanup actions\" before we exit our program. 
This guarantees that a runtime error does not make\nour program leak resources of the system.\n\n\nThe `errdefer` keyword is a tool to perform such \"cleanup actions\" in hostile situations.\nThis keyword is commonly used to clean (or to free) allocated resources, before the execution of our program\ngets stopped because of an error value being generated.\n\nThe basic idea is to provide an expression to the `errdefer` keyword. Then,\n`errdefer` executes this expression if, and only if, an error occurs\nduring the execution of the current scope.\nIn the example below, we are using an allocator object (that we presented at @sec-allocators)\nto create a new `User` object. If we are successful in creating and registering this new user,\nthis `create_user()` function will return a pointer to this new `User` object as its return value.\n\nHowever, if for some reason, an error value is generated by some expression\nthat is after the `errdefer` line, for example, in the `db.register_user(user)` expression,\nthe expression registered by `errdefer` gets executed before the error value is returned\nfrom the function, and before the program enters panic mode and stops the\ncurrent execution.\n\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nfn create_user(db: Database, allocator: Allocator) !*User {\n const user = try allocator.create(User);\n errdefer allocator.destroy(user);\n\n // Register new user in the Database.\n _ = try db.register_user(user);\n return user;\n}\n```\n:::\n\n\nBy using `errdefer` to destroy the `user` object that we have just created,\nwe guarantee that the memory allocated for this `user` object\ngets freed, before the execution of the program stops.\nBecause if the expression `try db.register_user(user)` returns an error value,\nthe execution of our program stops, and we lose all references and control over the memory\nthat we have allocated for the `user` object.\nAs a result, if we do not free the memory associated with the `user` object before the program stops,\nwe cannot free 
this memory anymore. We simply lose our chance to do the right thing.\nThat is why `errdefer` is essential in this situation.\n\nJust to make very clear the differences between `defer` (which I described at @sec-defer)\nand `errdefer`, it might be worth discussing the subject a bit further.\nYou might still have the question \"why use `errdefer` if we can use `defer` instead?\"\nin your mind.\n\nAlthough they are similar, the key difference between the `errdefer` and `defer` keywords\nis when the provided expression gets executed.\nThe `defer` keyword always executes the provided expression at the end of the\ncurrent scope, no matter how your code exits this scope.\nIn contrast, `errdefer` executes the provided expression only when an error occurs in the\ncurrent scope.\n\nThis becomes important if a resource that you allocate in the\ncurrent scope gets freed later in your code, in a different scope.\nThe `create_user()` function is an example of this. If you think\nclosely about this function, you will notice that this function returns\nthe `user` object as the result.\n\nIn other words, the allocated memory for the `user` object does not get\nfreed inside the `create_user()`, if the function returns successfully.\nSo, if an error does not occur inside this function, the `user` object\nis returned from the function, and probably, the code that runs after\nthis `create_user()` function will be responsible for freeing\nthe memory of the `user` object.\n\nBut what if an error does occur inside the `create_user()`? What happens then?\nThis would mean that the execution of your code would stop in this `create_user()`\nfunction, and, as a consequence, the code that runs after this `create_user()`\nfunction would simply not run, and, as a result, the memory of the `user` object\nwould not be freed before your program stops.\n\nThis is the perfect scenario for `errdefer`. 
We use this keyword to guarantee\nthat our program will free the allocated memory for the `user` object,\neven if an error occurs inside the `create_user()` function.\n\nIf you allocate and free some memory for an object in the same scope, then,\njust use `defer` and be happy, `errdefer` has no use for you in such a situation.\nBut if you allocate some memory in a scope A, but you only free this memory\nlater, in a scope B for example, then, `errdefer` becomes useful to avoid leaking memory\nin sketchy situations.\n\n\n\n## Union type in Zig {#sec-unions}\n\nA union type defines a set of types that an object can be. It is like a list of\noptions. Each option is a type that an object can assume. Therefore, unions in Zig\nhave the same meaning, or, the same role as unions in C. They are used for the same purpose.\nYou could also say that unions in Zig produce a similar effect to\n[`typing.Union` in Python](https://docs.python.org/3/library/typing.html#typing.Union)[^pyunion].\n\n[^pyunion]: \n\nFor example, you might be creating an API that sends data to a data lake, hosted\nin some private cloud infrastructure. Suppose you created different structs in your codebase,\nto store the necessary information that you need, in order to connect to the services of\neach mainstream data lake service (Amazon S3, Azure Blob, etc.).\n\nNow, suppose you also have a function named `send_event()` that receives an event as input,\nand, a target data lake, and it sends the input event to the data lake specified in the\ntarget data lake argument. But this target data lake could be any of the three mainstream data lake\nservices (Amazon S3, Azure Blob, etc.). 
Here is where an union can help you.\n\nThe union `LakeTarget` defined below allows the `lake_target` argument of `send_event()`\nto be either an object of type `AzureBlob`, or type `AmazonS3`, or type `GoogleGCP`.\nThis union allows the `send_event()` function to receive an object of any of these three types\nas input in the `lake_target` argument.\n\nRemember that each of these three types\n(`AmazonS3`, `GoogleGCP` and `AzureBlob`) are separate structs that we defined in\nour source code. So, at first glance, they are separate data types in our source code.\nBut is the `union` keyword that unifies them into a single data type called `LakeTarget`.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nconst LakeTarget = union {\n azure: AzureBlob,\n amazon: AmazonS3,\n google: GoogleGCP,\n};\n\nfn send_event(\n event: Event,\n lake_target: LakeTarget\n) -> bool {\n // body of the function ...\n}\n```\n:::\n\n\nAn union definition is composed by a list of data members. Each data member is of a specific data type.\nIn the example above, the `LakeTarget` union have three data members (`azure`, `amazon`, `google`).\nWhen you instantiate an object that uses an union type, you can only use one of it's data members\nin this instantiation.\n\nYou could also interpret this as: only one data member of an union type can be activated at a time, the other data\nmembers remain deactivated and unaccessible. For example, if you create a `LakeTarget` object that uses\nthe `azure` data member, you can no longer use or access the data members `google` or `amazon`.\nIt is like if these other data members didn't exist at all in the `LakeTarget` type.\n\nYou can see this logic in the example below. Notice that, we first instantiate the union\nobject using the `azure` data member. As a result, this `target` object contains only\nthe `azure` data member inside of it. Only this data member is active in this object.\nThat is why the last line in this code example is invalid. 
Because we are trying to instantiate the data member\n`google`, which is currently inactive for this `target` object, and as a result, the program\nenters in panic mode warning us about this mistake through a loud error message.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar target = LakeTarget {\n .azure = AzureBlob.init()\n};\n// Only the `azure` data member exist inside\n// the `target` object, and, as a result, this\n// line below is invalid:\ntarget.google = GoogleGCP.init();\n```\n:::\n\n\n```\nthread 2177312 panic: access of union field 'google' while\n field 'azure' is active:\n target.google = GoogleGCP.init();\n ^\n```\n\nSo, when you instantiate an union object, you must choose one of the data types (or, one of the data members)\nlisted in the union type. In the example above, I choose to use the `azure` data member, and, as a result,\nall other data members were automatically deactivated,\nand you can no longer use them after you instantiate the object.\n\nYou can activate another data member by completely redefining the entire enum object.\nIn the example below, I initially use the `azure` data member. But then, I redefine the\n`target` object to use a new `LakeTarget` object, which uses this time the `google` data member.\n\n\n::: {.cell}\n\n```{.zig .cell-code}\nvar target = LakeTarget {\n .azure = AzureBlob.init()\n};\ntarget = LakeTarget {\n .google = GoogleGCP.init()\n};\n```\n:::\n\n\nAn curious fact about union types, is that, at first, you cannot use them in switch statements (that we preseted at @sec-switch).\nIn other words, if you have an object of type `LakeTarget` for example, you cannot give this object\nto a switch statement as input.\n\nBut what if you really need to do so? What if you actually need to\nprovide an \"union object\" to a switch statement? The answer to this question relies on another special type in Zig,\nwhich are the *tagged unions*. 
To create a tagged union, all you have to do is to add\nan enum type into your union declaration.\n\nAs an example of a tagged union in Zig, take the `Registry` type exposed\nbelow. This type comes from the\n[`grammar.zig` module](https://github.com/ziglang/zig/blob/30b4a87db711c368853b3eff8e214ab681810ef9/tools/spirv/grammar.zig)[^grammar]\nfrom the Zig repository. This union type lists different types of registries.\nBut notice this time, the use of `(enum)` after the `union` keyword. This is what makes\nthis union type a tagged union. Also, by being a tagged union, an object of this `Registry` type\ncan be used as input in a switch statement. This is all you have to do. Just add `(enum)`\nto your `union` declaration, and you can use it in switch statements.\n\n[^grammar]: .\n\n\n::: {.cell}\n\n```{.zig .cell-code}\npub const Registry = union(enum) {\n core: CoreRegistry,\n extension: ExtensionRegistry,\n};\n```\n:::\n", + "supporting": [ + "09-error-handling_files" + ], "filters": [ "rmarkdown/pagebreak.lua" ], diff --git a/docs/Chapters/01-memory.html b/docs/Chapters/01-memory.html index e9636f3..53160a9 100644 --- a/docs/Chapters/01-memory.html +++ b/docs/Chapters/01-memory.html @@ -586,7 +586,7 @@

std.debug.print("{s}\n", .{input}); } -

Also, notice that in this example, we use the keyword defer to run a small piece of code at the end of the current scope, which is the expression allocator.free(input). When you execute this expression, the allocator will free the memory that it allocated for the input object.

+

Also, notice that in this example, we use the defer keyword (which I described at Section 1.9.3) to run a small piece of code at the end of the current scope, which is the expression allocator.free(input). When you execute this expression, the allocator will free the memory that it allocated for the input object.

We have talked about this at Section 2.1.5. You should always explicitly free any memory that you allocate using an allocator! You do that by using the free() method of the same allocator object you used to allocate this memory. The defer keyword is used in this example only to help us execute this free operation at the end of the current scope.

diff --git a/docs/Chapters/01-zig-weird.html b/docs/Chapters/01-zig-weird.html index 18b329a..bc52959 100644 --- a/docs/Chapters/01-zig-weird.html +++ b/docs/Chapters/01-zig-weird.html @@ -272,9 +272,10 @@

Table of contents

  • 1.10 Structs and OOP
  • 1.11 Anonymous struct literals
  • @@ -834,23 +835,48 @@

    This is neat, and it works with character ranges too. That is, I could simply write 'a'...'z', to match any character value that is a lowercase letter, and it would work fine.

    -
    -

    1.9.3 For loops

    +
    +

    1.9.3 The defer keyword

    +

    With the defer keyword you can execute expressions at the end of the current scope. Take the foo() function below as an example. When we execute this function, the expression that prints the message “Exiting function …” gets executed only at the end of the function scope.

    +
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +fn foo() !void {
    +    defer std.debug.print(
    +        "Exiting function ...\n", .{}
    +    );
    +    try stdout.print("Adding some numbers ...\n", .{});
    +    const x = 2 + 2; _ = x;
    +    try stdout.print("Multiplying ...\n", .{});
    +    const y = 2 * 8; _ = y;
    +}
    +
    +pub fn main() !void {
    +    try foo();
    +}
    +
    +
    Adding some numbers ...
    +Multiplying ...
    +Exiting function ...
    +

    It doesn’t matter how the function exits (e.g. because of an error, or because of a return statement): this expression gets executed when the function exits.
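
    To make the "no matter how the function exits" point concrete, here is a minimal sketch. The `might_fail()` function and the `error.SomethingWentWrong` value are hypothetical names invented for this illustration. The deferred print should run on both the normal return path and the error return path.

```zig
const std = @import("std");

// Hypothetical function, for illustration only: it either returns
// normally or returns an error, depending on the `should_fail` flag.
fn might_fail(should_fail: bool) !void {
    // Registered first, executed last, on every exit path of this scope.
    defer std.debug.print("Exiting function ...\n", .{});
    if (should_fail) {
        return error.SomethingWentWrong;
    }
    std.debug.print("No error occurred.\n", .{});
}

pub fn main() void {
    // Normal exit: the deferred expression runs after the body.
    might_fail(false) catch {};
    // Error exit: the deferred expression still runs.
    might_fail(true) catch {};
}
```

    In this sketch, the message "Exiting function ..." is printed once for each call, so it appears twice in total, even though the second call returns an error.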

    +
    +
    +

    1.9.4 For loops

    A loop allows you to execute the same lines of code multiple times, thus, creating a “repetition space” in the execution flow of your program. Loops are particularly useful when we want to replicate the same function (or the same set of commands) over several different inputs.

    There are different types of loops available in Zig. But the most essential of them all is probably the for loop. A for loop is used to apply the same piece of code over the elements of a slice or an array.

    For loops in Zig have a slightly different syntax than you are probably used to seeing in other languages. You start with the for keyword, then, you list the items that you want to iterate over inside a pair of parentheses. Then, inside of a pair of pipes (|) you should declare an identifier that will serve as your iterator, or, the “repetition index of the loop”.

    -
    for (items) |value| {
    -    // code to execute
    -}
    +
    for (items) |value| {
    +    // code to execute
    +}

    Instead of using a (value in items) syntax, in Zig, for loops use the syntax (items) |value|. In the example below, you can see that we are looping through the items of the array stored at the object name, and printing to the console the decimal representation of each character in this array.

    If we wanted, we could also iterate through a slice (or a portion) of the array, instead of iterating through the entire array stored in the name object. Just use a range selector to select the section you want. For example, I could provide the expression name[0..3] to the for loop, to iterate just through the first 3 elements in the array.

    -
    const name = [_]u8{'P','e','d','r','o'};
    -for (name) |char| {
    -    try stdout.print("{d} | ", .{char});
    -}
    +
    const name = [_]u8{'P','e','d','r','o'};
    +for (name) |char| {
    +    try stdout.print("{d} | ", .{char});
    +}
    80 | 101 | 100 | 114 | 111 | 
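
    The range selector mentioned above can be sketched as follows. This is only an illustration of the `name[0..3]` expression from the text. It uses `std.debug.print` instead of the `stdout` writer so that the snippet stays self-contained.

```zig
const std = @import("std");

pub fn main() void {
    const name = [_]u8{ 'P', 'e', 'd', 'r', 'o' };
    // name[0..3] selects the elements at index 0, 1 and 2
    // (the upper bound of the range is exclusive).
    for (name[0..3]) |char| {
        std.debug.print("{d} | ", .{char});
    }
    std.debug.print("\n", .{});
}
```

    Only the first three characters ('P', 'e', 'd') are visited, so the loop prints `80 | 101 | 100 | `.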
    @@ -859,42 +885,42 @@

    You can do that by providing a second set of items to iterate over. More precisely, you provide the range selector 0.. to the for loop. So, yes, you can use two different iterators at the same time in a for loop in Zig.

    But remember from Section 1.4 that, every object you create in Zig must be used in some way. So if you declare two iterators in your for loop, you must use both iterators inside the for loop body. But if you want to use just the index iterator, and not use the “value iterator”, then, you can discard the value iterator by matching the value items to the underscore character, like in the example below:

    -
    for (name, 0..) |_, i| {
    -    try stdout.print("{d} | ", .{i});
    -}
    +
    for (name, 0..) |_, i| {
    +    try stdout.print("{d} | ", .{i});
    +}
    0 | 1 | 2 | 3 | 4 |

    -
    -

    1.9.4 While loops

    +
    +

    1.9.5 While loops

    A while loop is created from the while keyword. While a for loop iterates through the items of an array, a while loop will loop continuously, and infinitely, until a logical test (specified by you) becomes false.

    You start with the while keyword, then, you define a logical expression inside a pair of parentheses, and the body of the loop is provided inside a pair of curly braces, like in the example below:

    -
    var i: u8 = 1;
    -while (i < 5) {
    -    try stdout.print("{d} | ", .{i});
    -    i += 1;
    -}
    +
    var i: u8 = 1;
    +while (i < 5) {
    +    try stdout.print("{d} | ", .{i});
    +    i += 1;
    +}
    1 | 2 | 3 | 4 | 
    -
    -

    1.9.5 Using break and continue

    +
    +

    1.9.6 Using break and continue

    In Zig, you can explicitly stop the execution of a loop, or, jump to the next iteration of the loop, using the keywords break and continue, respectively. The while loop present in the example below is, at first sight, an infinite loop, because the logical value inside the parentheses will always be equal to true. What makes this while loop stop when the i object reaches the count 10? It is the break keyword!

    Inside the while loop, we have an if statement that is constantly checking if the i variable is equal to 10. Since we are increasing the value of this i variable at each iteration of the while loop, at some point this i variable will be equal to 10, and when it is, the if statement will execute the break expression, and, as a result, the execution of the while loop is stopped.

    Notice the expect() function from the Zig standard library after the while loop. This expect() function is an “assert” type of function. This function checks if the logical test provided is equal to true. If this logical test is false, the function returns an error. But if it is equal to true, then the function does nothing.

    -
    var i: usize = 0;
    -while (true) {
    -    if (i == 10) {
    -        break;
    -    }
    -    i += 1;
    -}
    -try std.testing.expect(i == 10);
    -try stdout.print("Everything worked!", .{});
    +
    var i: usize = 0;
    +while (true) {
    +    if (i == 10) {
    +        break;
    +    }
    +    i += 1;
    +}
    +try std.testing.expect(i == 10);
    +try stdout.print("Everything worked!", .{});
    Everything worked!
    @@ -902,13 +928,13 @@

    -
    const ns = [_]u8{1,2,3,4,5,6};
    -for (ns) |i| {
    -    if ((i % 2) == 0) {
    -        continue;
    -    }
    -    try stdout.print("{d} | ", .{i});
    -}
    +
    const ns = [_]u8{1,2,3,4,5,6};
    +for (ns) |i| {
    +    if ((i % 2) == 0) {
    +        continue;
    +    }
    +    try stdout.print("{d} | ", .{i});
    +}
    1 | 3 | 5 | 
    @@ -921,38 +947,37 @@

    With struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C. You give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. You can also register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object that you create with this new type, will always have these methods available and associated with them.

    In C++, when we create a new class, we normally have a constructor method (or, a constructor function) to construct or to instantiate every object of this particular class, and you also have a destructor method (or a destructor function) that is the function responsible for destroying every object of this class.

    In Zig, we normally declare the constructor and the destructor methods of our structs by declaring init() and deinit() methods inside the struct. This is just a naming convention that you will find across the entire Zig standard library. So, in Zig, the init() method of a struct is normally the constructor method of the class represented by this struct, while the deinit() method is the method used for destroying an existing instance of that struct.

    -

    Both the init() and deinit() methods are used extensively in Zig code, and you will see both of them at Section 2.2.7. In this section, I present the ArenaAllocator(), which is a special type of allocator object that receives a second (child) allocator object at instantiation. We use the init() method to create a new ArenaAllocator() object, then, on the next line, we also used the deinit() method in conjunction with the defer keyword, to destroy this arena allocator object at the end of the current scope.

    -

    But, as another example, let’s build a simple User struct to represent an user of some sort of system. If you look at the User struct below, you can see the struct keyword, and inside of a pair of curly braces, we write the struct’s body.

    +

    The init() and deinit() methods are both used extensively in Zig code, and you will see both of them being used when we talk about allocators at Section 2.2. But, as another example, let’s build a simple User struct to represent a user of some sort of system. If you look at the User struct below, you can see the struct keyword, and inside of a pair of curly braces, we write the struct’s body.

    Notice the data members of this struct, id, name and email. Every data member has its type explicitly annotated, with the colon character (:) syntax that we described earlier at Section 1.2.2. But also notice that every line in the struct body that describes a data member ends with a comma character (,). So every time you declare a data member in your Zig code, always end the line with a comma character, instead of ending it with the traditional semicolon character (;).

    Next, also notice in this example that we registered an init() function as a method of this User struct. This init() method is the constructor method that you use to instantiate every new User object. That is why this init() function returns a User object as result.

    -
    const std = @import("std");
    -const stdout = std.io.getStdOut().writer();
    -const User = struct {
    -    id: u64,
    -    name: []const u8,
    -    email: []const u8,
    -
    -    pub fn init(id: u64,
    -                name: []const u8,
    -                email: []const u8) User {
    -
    -        return User {
    -            .id = id,
    -            .name = name,
    -            .email = email
    -        };
    -    }
    -
    -    pub fn print_name(self: User) !void {
    -        try stdout.print("{s}\n", .{self.name});
    -    }
    -};
    -
    -pub fn main() !void {
    -    const u = User.init(1, "pedro", "email@gmail.com");
    -    try u.print_name();
    -}
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +const User = struct {
    +    id: u64,
    +    name: []const u8,
    +    email: []const u8,
    +
    +    pub fn init(id: u64,
    +                name: []const u8,
    +                email: []const u8) User {
    +
    +        return User {
    +            .id = id,
    +            .name = name,
    +            .email = email
    +        };
    +    }
    +
    +    pub fn print_name(self: User) !void {
    +        try stdout.print("{s}\n", .{self.name});
    +    }
    +};
    +
    +pub fn main() !void {
    +    const u = User.init(1, "pedro", "email@gmail.com");
    +    try u.print_name();
    +}
    pedro
    @@ -965,23 +990,23 @@

    1.11 Anonymous struct literals

    You can declare a struct object as a literal value. When we do that, we normally specify the data type of this struct literal by writing its data type just before the opening curly braces. For example, I could write a struct literal of type User that we defined in the previous section like this:

    -
    const eu = User {
    -    .id = 1,
    -    .name = "Pedro",
    -    .email = "someemail@gmail.com"
    -};
    -_ = eu;
    +
    const eu = User {
    +    .id = 1,
    +    .name = "Pedro",
    +    .email = "someemail@gmail.com"
    +};
    +_ = eu;

    However, in Zig, we can also write an anonymous struct literal. That is, you can write a struct literal, but not specify explicitly the type of this particular struct. An anonymous struct is written by using the syntax .{}. So, we essentially replaced the explicit type of the struct literal with a dot character (.).

    As we described at Section 1.8, when you put a dot before a struct literal, the type of this struct literal is automatically inferred by the zig compiler. In essence, the zig compiler will look for some hint of what is the type of that struct. It can be the type annotation of a function argument, or the return type annotation of the function that you are using, or the type annotation of a variable. If the compiler does find such a type annotation, then, it will use this type in your literal struct.
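
    As a small sketch of this inference, the example below uses a hypothetical `Point` struct and a hypothetical `sum_coords()` function (neither comes from the book). The first literal takes its type from the type annotation of a variable, and the second one from the type annotation of a function argument.

```zig
const std = @import("std");

// Hypothetical struct and function, used only for this illustration.
const Point = struct {
    x: i32,
    y: i32,
};

fn sum_coords(p: Point) i32 {
    return p.x + p.y;
}

pub fn main() void {
    // Type inferred from the type annotation of the variable.
    const a: Point = .{ .x = 1, .y = 2 };
    // Type inferred from the type annotation of the function argument.
    const total = sum_coords(.{ .x = 3, .y = 4 });
    std.debug.print("{d} {d}\n", .{ sum_coords(a), total });
}
```

    In both cases the literal is written as `.{ ... }`, without naming `Point`, and the compiler fills in the type from the surrounding annotation.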

    Anonymous structs are very commonly used in function arguments in Zig. One example that you have already seen many times is the print() function from the stdout object. This function takes two arguments. The first argument is a template string, which should contain string format specifiers that tell how the values provided in the second argument should be printed into the message.

    The second argument is a struct literal that lists the values to be printed into the template message specified in the first argument. You normally want to use an anonymous struct literal here, so that the zig compiler does the job of specifying the type of this particular anonymous struct for you.

    -
    const std = @import("std");
    -pub fn main() !void {
    -    const stdout = std.io.getStdOut().writer();
    -    try stdout.print("Hello, {s}!\n", .{"world"});
    -}
    +
    const std = @import("std");
    +pub fn main() !void {
    +    const stdout = std.io.getStdOut().writer();
    +    try stdout.print("Hello, {s}!\n", .{"world"});
    +}
    Hello, world!
    @@ -994,29 +1019,29 @@

    Zig always assumes that this sequence of bytes is UTF-8 encoded. This might not be true for every sequence of bytes you have, but it is not really Zig’s job to fix the encoding of your strings (you can use iconv18 for that). Today, most of the text in our modern world, especially on the web, should be UTF-8 encoded. So if your string literal is not UTF-8 encoded, then, you will likely have problems in Zig.

    Let’s take for example the word “Hello”. In UTF-8, this sequence of characters (H, e, l, l, o) is represented by the sequence of decimal numbers 72, 101, 108, 108, 111. In hexadecimal, this sequence is 0x48, 0x65, 0x6C, 0x6C, 0x6F. So if I take this sequence of hexadecimal values, and ask Zig to print this sequence of bytes as a sequence of characters (i.e. a string), then, the text “Hello” will be printed into the terminal:

    -
    const std = @import("std");
    -const stdout = std.io.getStdOut().writer();
    -
    -pub fn main() !void {
    -    const bytes = [_]u8{0x48, 0x65, 0x6C, 0x6C, 0x6F};
    -    try stdout.print("{s}\n", .{bytes});
    -}
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +
    +pub fn main() !void {
    +    const bytes = [_]u8{0x48, 0x65, 0x6C, 0x6C, 0x6F};
    +    try stdout.print("{s}\n", .{bytes});
    +}
    Hello

    If you want to see the actual bytes that represent a string in Zig, you can use a for loop to iterate through each byte in the string, and ask Zig to print each byte as a hexadecimal value to the terminal. You do that by using a print() statement with the X formatting specifier, like you would normally do with the printf() function19 in C.

    -
    const std = @import("std");
    -const stdout = std.io.getStdOut().writer();
    -pub fn main() !void {
    -    const string_literal = "This is an example of string literal in Zig";
    -    try stdout.print("Bytes that represents the string object: ", .{});
    -    for (string_literal) |byte| {
    -        try stdout.print("{X} ", .{byte});
    -    }
    -    try stdout.print("\n", .{});
    -}
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +pub fn main() !void {
    +    const string_literal = "This is an example of string literal in Zig";
    +    try stdout.print("Bytes that represents the string object: ", .{});
    +    for (string_literal) |byte| {
    +        try stdout.print("{X} ", .{byte});
    +    }
    +    try stdout.print("\n", .{});
    +}
    Bytes that represents the string object: 54 68 69 
        73 20 69 73 20 61 6E 20 65 78 61 6D 70 6C 65 20 6F
    @@ -1030,27 +1055,27 @@ 

    But one key difference between a Zig string and a C string, is that Zig also stores the length of the array inside the string object. This small detail makes your code safer, because it is much easier for the Zig compiler to check if you are trying to access an element that is “out of bounds”, i.e. if you are trying to access memory that does not belong to you.

    To achieve this same kind of safety in C, you have to do a lot of work that kind of seems pointless. So getting this kind of safety is not automatic and much harder to do in C. For example, if you want to track the length of your string throughout your program in C, then, you first need to loop through the array of bytes that represents this string, and find the null element ('\0') position to discover where exactly the array ends, or, in other words, to find how many elements the array of bytes contains.

    To do that, you would need something like this in C. In this example, the C string stored in the object array is 25 bytes long:

    -
    #include <stdio.h>
    -int main() {
    -    char* array = "An example of string in C";
    -    int index = 0;
    -    while (1) {
    -        if (array[index] == '\0') {
    -            break;
    -        }
    -        index++;
    -    }
    -    printf("Number of elements in the array: %d\n", index);
    -}
    +
    #include <stdio.h>
    +int main() {
    +    char* array = "An example of string in C";
    +    int index = 0;
    +    while (1) {
    +        if (array[index] == '\0') {
    +            break;
    +        }
    +        index++;
    +    }
    +    printf("Number of elements in the array: %d\n", index);
    +}
    Number of elements in the array: 25

    But in Zig, you do not have to do this, because the object already contains a len field which stores the length information of the array. As an example, the string_literal object below is 43 bytes long:

    -
    const std = @import("std");
    -const stdout = std.io.getStdOut().writer();
    -pub fn main() !void {
    -    const string_literal = "This is an example of string literal in Zig";
    -    try stdout.print("{d}\n", .{string_literal.len});
    -}
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +pub fn main() !void {
    +    const string_literal = "This is an example of string literal in Zig";
    +    try stdout.print("{d}\n", .{string_literal.len});
    +}
    43
    @@ -1064,21 +1089,21 @@

    -
    const std = @import("std");
    -const stdout = std.io.getStdOut().writer();
    -pub fn main() !void {
    -    const string_literal = "This is an example of string literal in Zig";
    -    const simple_array = [_]i32{1, 2, 3, 4};
    -    try stdout.print("Type of array object: {}", .{@TypeOf(simple_array)});
    -    try stdout.print(
    -        "Type of string object: {}",
    -        .{@TypeOf(string_literal)}
    -    );
    -    try stdout.print(
    -        "Type of a pointer that points to the array object: {}",
    -        .{@TypeOf(&simple_array)}
    -    );
    -}
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +pub fn main() !void {
    +    const string_literal = "This is an example of string literal in Zig";
    +    const simple_array = [_]i32{1, 2, 3, 4};
    +    try stdout.print("Type of array object: {}", .{@TypeOf(simple_array)});
    +    try stdout.print(
    +        "Type of string object: {}",
    +        .{@TypeOf(string_literal)}
    +    );
    +    try stdout.print(
    +        "Type of a pointer that points to the array object: {}",
    +        .{@TypeOf(&simple_array)}
    +    );
    +}

    Type of array object: [4]i32
     Type of string object: *const [43:0]u8
    @@ -1091,15 +1116,15 @@ 

    -
    const std = @import("std");
    -const stdout = std.io.getStdOut().writer();
    -pub fn main() !void {
    -    const string_literal = "Ⱥ";
    -    try stdout.print("Bytes that represents the string object: ", .{});
    -    for (string_literal) |char| {
    -        try stdout.print("{X} ", .{char});
    -    }
    -}
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +pub fn main() !void {
    +    const string_literal = "Ⱥ";
    +    try stdout.print("Bytes that represents the string object: ", .{});
    +    for (string_literal) |char| {
    +        try stdout.print("{X} ", .{char});
    +    }
    +}
    Bytes that represents the string object: C8 BA 
    @@ -1111,20 +1136,20 @@

    -
    const std = @import("std");
    -const stdout = std.io.getStdOut().writer();
    -pub fn main() !void {
    -    var utf8 = (
    -        (try std.unicode.Utf8View.init("アメリカ"))
    -            .iterator()
    -    );
    -    while (utf8.nextCodepointSlice()) |codepoint| {
    -        try stdout.print(
    -            "got codepoint {}\n",
    -            .{std.fmt.fmtSliceHexUpper(codepoint)}
    -        );
    -    }
    -}
    +
    const std = @import("std");
    +const stdout = std.io.getStdOut().writer();
    +pub fn main() !void {
    +    var utf8 = (
    +        (try std.unicode.Utf8View.init("アメリカ"))
    +            .iterator()
    +    );
    +    while (utf8.nextCodepointSlice()) |codepoint| {
    +        try stdout.print(
    +            "got codepoint {}\n",
    +            .{std.fmt.fmtSliceHexUpper(codepoint)}
    +        );
    +    }
    +}

    got codepoint E382A2
     got codepoint E383A1
    diff --git a/docs/Chapters/09-error-handling.html b/docs/Chapters/09-error-handling.html
    index e7f3911..d6615d8 100644
    --- a/docs/Chapters/09-error-handling.html
    +++ b/docs/Chapters/09-error-handling.html
    @@ -506,9 +506,14 @@ 

    < return user; }

    -

    By using errdefer to destroy the user object that we have just created, we garantee that the memory allocated for this user object get’s freed, before the execution of the program stops.

    -

    Because if the expression try db.add(user) returns an error value, the execution of our program stops, and we loose all references and control over the memory that we have allocated for the user object. As a result, if we do not free the memory associated with the user object before the program stops, we cannot free this memory anymore. We simply loose our chance to do the right thing. That is why errdefer is essential in this situation.

    -

    Having all this in mind, the errdefer keyword is different but also similar to the defer keyword. The only difference between the two is when the provided expression get’s executed. The defer keyword always execute the provided expression at the end of the current scope, while errdefer executes the provided expression when an error occurs in the current scope.

    +

    By using errdefer to destroy the user object that we have just created, we guarantee that the memory allocated for this user object gets freed before the execution of the program stops. If the expression try db.add(user) returns an error value, the execution of our program stops, and we lose all references and control over the memory that we have allocated for the user object. As a result, if we do not free the memory associated with the user object before the program stops, we cannot free this memory anymore. We simply lose our chance to do the right thing. That is why errdefer is essential in this situation.

    +

    Just to make the differences between defer (which I described at Section 1.9.3) and errdefer very clear, it is worth discussing the subject a bit further. You might still have the question “why use errdefer if we can use defer instead?” in your mind.

    +

    Although they are similar, the key difference between the errdefer and defer keywords is when the provided expression gets executed. The defer keyword always executes the provided expression at the end of the current scope, no matter how your code exits this scope. In contrast, errdefer executes the provided expression only when the current scope exits because of an error.

    +

    This becomes important if a resource that you allocate in the current scope gets freed later in your code, in a different scope. The create_user() function is an example of this. If you think closely about this function, you will notice that this function returns the user object as the result.

    +

    In other words, the allocated memory for the user object does not get freed inside create_user() if the function returns successfully. So, if an error does not occur inside this function, the user object is returned from the function, and probably, the code that runs after this create_user() function will be responsible for freeing the memory of the user object.

    +

    But what if an error does occur inside create_user()? What happens then? The execution of your code would stop inside create_user(), and, as a consequence, the code that was supposed to run after create_user() and free the user object would never receive it. As a result, the memory of the user object would not be freed before your program stops.

    +

    This is the perfect scenario for errdefer. We use this keyword to guarantee that our program will free the memory allocated for the user object, even if an error occurs inside the create_user() function.
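
As a rough sketch of this scenario, the code below rebuilds a simplified create_user() with stand-in types. The User and Database definitions, the fail field, the add() method, and the choice to return a pointer (*User) are all assumptions made for this illustration; they are not the definitions used elsewhere in the book. The file is meant to be run with zig test, because std.testing.allocator reports any memory that is leaked.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Hypothetical stand-ins for the real User and Database types.
const User = struct { id: u64 };

const Database = struct {
    fail: bool,

    // Hypothetical method: pretends to store the user, or fails on demand.
    fn add(self: Database, user: *User) !void {
        _ = user;
        if (self.fail) return error.DatabaseUnavailable;
    }
};

fn create_user(db: Database, allocator: Allocator) !*User {
    const user = try allocator.create(User);
    // Only runs if a later step in this scope returns an error.
    errdefer allocator.destroy(user);
    user.* = .{ .id = 1 };

    try db.add(user);
    // On success the caller becomes responsible for freeing `user`.
    return user;
}

test "create_user frees the user when db.add() fails" {
    const allocator = std.testing.allocator;

    // Error path: errdefer destroys the user, so nothing leaks.
    try std.testing.expectError(
        error.DatabaseUnavailable,
        create_user(.{ .fail = true }, allocator),
    );

    // Success path: the caller frees the user in its own scope.
    const user = try create_user(.{ .fail = false }, allocator);
    defer allocator.destroy(user);
    try std.testing.expectEqual(@as(u64, 1), user.id);
}
```

If the errdefer line is replaced with a defer, the success path returns a pointer to memory that has already been destroyed. If the line is removed entirely, the error path leaks the allocation and the testing allocator reports it.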

    +

    If you allocate and free the memory for an object in the same scope, then just use defer and be happy; errdefer has no use for you in such a situation. But if you allocate some memory in a scope A and only free this memory later, in a scope B for example, then errdefer becomes useful to avoid leaking memory when an error interrupts scope A.
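
For contrast, here is a minimal sketch of the first case, where the allocation and the free live in the same scope. The sum_squares() function is a hypothetical example written for this comparison; because the buffer never leaves the function, a plain defer covers both the success path and any error path.

```zig
const std = @import("std");

// Hypothetical function: the buffer is allocated and freed in the same scope.
fn sum_squares(allocator: std.mem.Allocator, n: usize) !usize {
    const buf = try allocator.alloc(usize, n);
    // The buffer is never returned, so defer is enough here.
    defer allocator.free(buf);

    for (buf, 0..) |*slot, i| slot.* = i * i;

    var total: usize = 0;
    for (buf) |value| total += value;
    return total;
}

test "sum_squares frees its temporary buffer" {
    // 0 + 1 + 4 + 9 = 14
    const result = try sum_squares(std.testing.allocator, 4);
    try std.testing.expectEqual(@as(usize, 14), result);
}
```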

    diff --git a/docs/search.json b/docs/search.json index ae5399c..38e8a3a 100644 --- a/docs/search.json +++ b/docs/search.json @@ -174,7 +174,7 @@ "href": "Chapters/01-zig-weird.html#sec-zig-control-flow", "title": "1  Introducing Zig", "section": "1.9 Control flow", - "text": "1.9 Control flow\nSometimes, you need to make decisions in your program. Maybe you need to decide wether to execute or not a specific piece of code. Or maybe, you need to apply the same operation over a sequence of values. These kinds of tasks, involve using structures that are capable of changing the “control flow” of our program.\nIn computer science, the term “control flow” usually refers to the order in which expressions (or commands) are evaluated in a given language or program. But this term is also used to refer to structures that are capable of changing this “evaluation order” of the commands executed by a given language/program.\nThese structures are better known by a set of terms, such as: loops, if/else statements, switch statements, among others. So, loops and if/else statements are examples of structures that can change the “control flow” of our program. The keywords continue and break are also examples of symbols that can change the order of evaluation, since they can move our program to the next iteration of a loop, or make the loop stop completely.\n\n1.9.1 If/else statements\nAn if/else statement performs an “conditional flow operation”. A conditional flow control (or choice control) allows you to execute or ignore a certain block of commands based on a logical condition. Many programmers and computer science professionals also use the term “branching” in this case. In essence, we use if/else statements to use the result of a logical test to decide whether or not to execute a given block of commands.\nIn Zig, we write if/else statements by using the keywords if and else. 
We start with the if keyword followed by a logical test inside a pair of parentheses, and then, a pair of curly braces with contains the lines of code to be executed in case the logical test returns the value true.\nAfter that, you can optionally add an else statement. Just add the else keyword followed by a pair of curly braces, with the lines of code to executed in case the logical test defined in the if returns false.\nIn the example below, we are testing if the object x contains a number that is greater than 10. Judging by the output printed to the console, we know that this logical test returned false. Because the output in the console is compatible with the line of code present in the else branch of the if/else statement.\n\nconst x = 5;\nif (x > 10) {\n try stdout.print(\n \"x > 10!\\n\", .{}\n );\n} else {\n try stdout.print(\n \"x <= 10!\\n\", .{}\n );\n}\n\nx <= 10!\n\n\n\n\n1.9.2 Swith statements\nSwitch statements are also available in Zig. A switch statement in Zig have a similar syntax to a switch statement in Rust. As you would expect, to write a switch statement in Zig we use the switch keyword. We provide the value that we want to “switch over” inside a pair of parentheses. Then, we list the possible combinations (or “branchs”) inside a pair of curly braces.\nLet’s take a look at the code example below. You can see in this example that, I’m creating an enum type called Role. We talk more about enums at Section 6.6. But in essence, this Role type is listing different types of roles in a fictituous company, like SE for Software Engineer, DE for Data Engineer, PM for Product Manager, etc.\nNotice that we are using the value from the role object in the switch statement, to discover which exact area we need to store in the area variable object. Also notice that we are using type inference inside the switch statement, with the dot character, as we described at Section 1.8. 
This makes the zig compiler infer the correct data type of the values (PM, SE, etc.) for us.\nAlso notice that, we are grouping multiple values in the same branch of switch statement. We just separate each possible value with a comma. So, for example, if role contains either DE or DA, the area variable would contain the value \"Data & Analytics\", instead of \"Platform\".\n\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst Role = enum {\n SE, DPE, DE, DA, PM, PO, KS\n};\n\npub fn main() !void {\n var area: []const u8 = undefined;\n const role = Role.SE;\n switch (role) {\n .PM, .SE, .DPE, .PO => {\n area = \"Platform\";\n },\n .DE, .DA => {\n area = \"Data & Analytics\";\n },\n .KS => {\n area = \"Sales\";\n },\n }\n try stdout.print(\"{s}\\n\", .{area});\n}\n\nPlatform\n\n\nNow, one very important aspect about this switch statement presented in the code example above, is that it exhaust all existing possibilities. In other words, all possible values that could be found inside the order object are explicitly handled in this switch statement.\nSince the role object have type Role, the only possible values to be found inside this object are PM, SE, DPE, PO, DE, DA and KS. There is no other possible value to be stored in this role object. This what “exhaust all existing possibilities” means. The switch statement covers every possible case.\nIn Zig, switch statements must exhaust all existing possibilities. You cannot write a switch statement, and leave an edge case with no expliciting action to be taken. This is a similar behaviour to switch statements in Rust, which also have to handle all possible cases.\nTake a look at the dump_hex_fallible() function below as an example. This function also comes from the Zig Standard Library, but this time, it comes from the debug.zig module17. There are multiple lines in this function, but I omitted them to focus solely on the switch statement found in this function. 
Notice that this switch statement have four possible cases, or four explicit branches. Also, notice that we used an else branch in this case. Whenever you have multiple possible cases in your switch statement which you want to apply the same exact action, you can use an else branch to do that.\n\npub fn dump_hex_fallible(bytes: []const u8) !void {\n // Many lines ...\n switch (byte) {\n '\\n' => try writer.writeAll(\"␊\"),\n '\\r' => try writer.writeAll(\"␍\"),\n '\\t' => try writer.writeAll(\"␉\"),\n else => try writer.writeByte('.'),\n }\n}\n\nMany users would also use an else branch to handle a “not supported” case. That is, a case that cannot be properly handled by your code, or, just a case that should not be “fixed”. So many programmers use an else branch to panic (or raise an error) to stop the current execution.\nTake the code example below as an example. We can see that, we are handling the cases for the level object being either 1, 2, or 3. All other possible cases are not supported by default, and, as consequence, we raise an runtime error in these cases, through the @panic() built-in function.\nAlso notice that, we are assigning the result of the switch statement to a new object called category. This is another thing that you can do with switch statements in Zig. If the branchs in this switch statement output some value as result, you can store the result value of the switch statement into a new variable.\n\nconst level: u8 = 4;\nconst category = switch (level) {\n 1, 2 => \"beginner\",\n 3 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n\nthread 13103 panic: Not supported level!\nt.zig:9:13: 0x1033c58 in main (switch2)\n @panic(\"Not supported level!\");\n ^\nFurthermore, you can also use ranges of values in switch statements. That is, you can create a branch in your switch statement that is used whenever the input value is contained in a range. 
These range expressions are created with the operator .... Is important to emphasize that the ranges created by this operator are inclusive on both ends.\nFor example, I could easily change the code example above to support all levels between 0 and 100. Like this:\n\nconst level: u8 = 4;\nconst category = switch (level) {\n 0...25 => \"beginner\",\n 26...75 => \"intermediary\",\n 76...100 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n\nbeginner\n\n\nThis is neat, and it works with character ranges too. That is, I could simply write 'a'...'z', to match any character value that is a lowercase letter, and it would work fine.\n\n\n1.9.3 For loops\nA loop allows you to execute the same lines of code multiple times, thus, creating a “repetition space” in the execution flow of your program. Loops are particularly useful when we want to replicate the same function (or the same set of commands) over several different inputs.\nThere are different types of loops available in Zig. But the most essential of them all is probably the for loop. A for loop is used to apply the same piece of code over the elements of a slice or an array.\nFor loops in Zig have a slightly different syntax that you are probably used to see in other languages. You start with the for keyword, then, you list the items that you want to iterate over inside a pair of parentheses. Then, inside of a pair of pipes (|) you should declare an identifier that will serve as your iterator, or, the “repetition index of the loop”.\n\nfor (items) |value| {\n // code to execute\n}\n\nInstead of using a (value in items) syntax, in Zig, for loops use the syntax (items) |value|. 
In the example below, you can see that we are looping through the items of the array stored at the object name, and printing to the console the decimal representation of each character in this array.\nIf we wanted, we could also iterate through a slice (or a portion) of the array, instead of iterating through the entire array stored in the name object. Just use a range selector to select the section you want. For example, I could provide the expression name[0..3] to the for loop, to iterate just through the first 3 elements in the array.\n\nconst name = [_]u8{'P','e','d','r','o'};\nfor (name) |char| {\n try stdout.print(\"{d} | \", .{char});\n}\n\n80 | 101 | 100 | 114 | 111 | \n\n\nIn the above example we are using the value itself of each element in the array as our iterator. But there are many situations where we need to use an index instead of the actual values of the items.\nYou can do that by providing a second set of items to iterate over. More precisely, you provide the range selector 0.. to the for loop. So, yes, you can use two different iterators at the same time in a for loop in Zig.\nBut remember from Section 1.4 that, every object you create in Zig must be used in some way. So if you declare two iterators in your for loop, you must use both iterators inside the for loop body. But if you want to use just the index iterator, and not use the “value iterator”, then, you can discard the value iterator by maching the value items to the underscore character, like in the example below:\n\nfor (name, 0..) |_, i| {\n try stdout.print(\"{d} | \", .{i});\n}\n\n0 | 1 | 2 | 3 | 4 |\n\n\n1.9.4 While loops\nA while loop is created from the while keyword. 
While a for loop iterates through the items of an array, a while loop will loop continuously, and infinitely, until a logical test (specified by you) becomes false.\nYou start with the while keyword, then, you define a logical expression inside a pair of parentheses, and the body of the loop is provided inside a pair of curly braces, like in the example below:\n\nvar i: u8 = 1;\nwhile (i < 5) {\n try stdout.print(\"{d} | \", .{i});\n i += 1;\n}\n\n1 | 2 | 3 | 4 | \n\n\n\n\n1.9.5 Using break and continue\nIn Zig, you can explicitly stop the execution of a loop, or, jump to the next iteration of the loop, using the keywords break and continue, respectively. The while loop present in the example below, is at first sight, an infinite loop. Because the logical value inside the parenthese will always be equal to true. What makes this while loop stop when the i object reaches the count 10? Is the break keyword!\nInside the while loop, we have an if statement that is constantly checking if the i variable is equal to 10. Since we are increasing the value of this i variable at each iteration of the while loop. At some point, this i variable will be equal to 10, and when it does, the if statement will execute the break expression, and, as a result, the execution of the while loop is stopped.\nNotice the expect() function from the Zig standard library after the while loop. This expect() function is an “assert” type of function. This function checks if the logical test provided is equal to true. If this logical test is false, the function raises an assertion error. 
But it is equal to true, then, the function will do nothing.\n\nvar i: usize = 0;\nwhile (true) {\n if (i == 10) {\n break;\n }\n i += 1;\n}\ntry std.testing.expect(i == 10);\ntry stdout.print(\"Everything worked!\", .{});\n\nEverything worked!\n\n\nSince this code example was executed succesfully by the zig compiler, without raising any errors, then, we known that, after the execution of while loop, the i variable is equal to 10. Because if it wasn’t equal to 10, then, an error would be raised by expect().\nNow, in the next example, we have an use case for the continue keyword. The if statement is constantly checking if the current index is a multiple of 2. If it is, then we jump to the next iteration of the loop directly. But it the current index is not a multiple of 2, then, the loop will simply print this index to the console.\n\nconst ns = [_]u8{1,2,3,4,5,6};\nfor (ns) |i| {\n if ((i % 2) == 0) {\n continue;\n }\n try stdout.print(\"{d} | \", .{i});\n}\n\n1 | 3 | 5 |", + "text": "1.9 Control flow\nSometimes, you need to make decisions in your program. Maybe you need to decide wether to execute or not a specific piece of code. Or maybe, you need to apply the same operation over a sequence of values. These kinds of tasks, involve using structures that are capable of changing the “control flow” of our program.\nIn computer science, the term “control flow” usually refers to the order in which expressions (or commands) are evaluated in a given language or program. But this term is also used to refer to structures that are capable of changing this “evaluation order” of the commands executed by a given language/program.\nThese structures are better known by a set of terms, such as: loops, if/else statements, switch statements, among others. So, loops and if/else statements are examples of structures that can change the “control flow” of our program. 
The keywords continue and break are also examples of symbols that can change the order of evaluation, since they can move our program to the next iteration of a loop, or make the loop stop completely.\n\n1.9.1 If/else statements\nAn if/else statement performs an “conditional flow operation”. A conditional flow control (or choice control) allows you to execute or ignore a certain block of commands based on a logical condition. Many programmers and computer science professionals also use the term “branching” in this case. In essence, we use if/else statements to use the result of a logical test to decide whether or not to execute a given block of commands.\nIn Zig, we write if/else statements by using the keywords if and else. We start with the if keyword followed by a logical test inside a pair of parentheses, and then, a pair of curly braces with contains the lines of code to be executed in case the logical test returns the value true.\nAfter that, you can optionally add an else statement. Just add the else keyword followed by a pair of curly braces, with the lines of code to executed in case the logical test defined in the if returns false.\nIn the example below, we are testing if the object x contains a number that is greater than 10. Judging by the output printed to the console, we know that this logical test returned false. Because the output in the console is compatible with the line of code present in the else branch of the if/else statement.\n\nconst x = 5;\nif (x > 10) {\n try stdout.print(\n \"x > 10!\\n\", .{}\n );\n} else {\n try stdout.print(\n \"x <= 10!\\n\", .{}\n );\n}\n\nx <= 10!\n\n\n\n\n1.9.2 Swith statements\nSwitch statements are also available in Zig. A switch statement in Zig have a similar syntax to a switch statement in Rust. As you would expect, to write a switch statement in Zig we use the switch keyword. We provide the value that we want to “switch over” inside a pair of parentheses. 
Then, we list the possible combinations (or “branchs”) inside a pair of curly braces.\nLet’s take a look at the code example below. You can see in this example that, I’m creating an enum type called Role. We talk more about enums at Section 6.6. But in essence, this Role type is listing different types of roles in a fictituous company, like SE for Software Engineer, DE for Data Engineer, PM for Product Manager, etc.\nNotice that we are using the value from the role object in the switch statement, to discover which exact area we need to store in the area variable object. Also notice that we are using type inference inside the switch statement, with the dot character, as we described at Section 1.8. This makes the zig compiler infer the correct data type of the values (PM, SE, etc.) for us.\nAlso notice that, we are grouping multiple values in the same branch of switch statement. We just separate each possible value with a comma. So, for example, if role contains either DE or DA, the area variable would contain the value \"Data & Analytics\", instead of \"Platform\".\n\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst Role = enum {\n SE, DPE, DE, DA, PM, PO, KS\n};\n\npub fn main() !void {\n var area: []const u8 = undefined;\n const role = Role.SE;\n switch (role) {\n .PM, .SE, .DPE, .PO => {\n area = \"Platform\";\n },\n .DE, .DA => {\n area = \"Data & Analytics\";\n },\n .KS => {\n area = \"Sales\";\n },\n }\n try stdout.print(\"{s}\\n\", .{area});\n}\n\nPlatform\n\n\nNow, one very important aspect about this switch statement presented in the code example above, is that it exhaust all existing possibilities. In other words, all possible values that could be found inside the order object are explicitly handled in this switch statement.\nSince the role object have type Role, the only possible values to be found inside this object are PM, SE, DPE, PO, DE, DA and KS. There is no other possible value to be stored in this role object. 
This what “exhaust all existing possibilities” means. The switch statement covers every possible case.\nIn Zig, switch statements must exhaust all existing possibilities. You cannot write a switch statement, and leave an edge case with no expliciting action to be taken. This is a similar behaviour to switch statements in Rust, which also have to handle all possible cases.\nTake a look at the dump_hex_fallible() function below as an example. This function also comes from the Zig Standard Library, but this time, it comes from the debug.zig module17. There are multiple lines in this function, but I omitted them to focus solely on the switch statement found in this function. Notice that this switch statement have four possible cases, or four explicit branches. Also, notice that we used an else branch in this case. Whenever you have multiple possible cases in your switch statement which you want to apply the same exact action, you can use an else branch to do that.\n\npub fn dump_hex_fallible(bytes: []const u8) !void {\n // Many lines ...\n switch (byte) {\n '\\n' => try writer.writeAll(\"␊\"),\n '\\r' => try writer.writeAll(\"␍\"),\n '\\t' => try writer.writeAll(\"␉\"),\n else => try writer.writeByte('.'),\n }\n}\n\nMany users would also use an else branch to handle a “not supported” case. That is, a case that cannot be properly handled by your code, or, just a case that should not be “fixed”. So many programmers use an else branch to panic (or raise an error) to stop the current execution.\nTake the code example below as an example. We can see that, we are handling the cases for the level object being either 1, 2, or 3. All other possible cases are not supported by default, and, as consequence, we raise an runtime error in these cases, through the @panic() built-in function.\nAlso notice that, we are assigning the result of the switch statement to a new object called category. This is another thing that you can do with switch statements in Zig. 
If the branchs in this switch statement output some value as result, you can store the result value of the switch statement into a new variable.\n\nconst level: u8 = 4;\nconst category = switch (level) {\n 1, 2 => \"beginner\",\n 3 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n\nthread 13103 panic: Not supported level!\nt.zig:9:13: 0x1033c58 in main (switch2)\n @panic(\"Not supported level!\");\n ^\nFurthermore, you can also use ranges of values in switch statements. That is, you can create a branch in your switch statement that is used whenever the input value is contained in a range. These range expressions are created with the operator .... Is important to emphasize that the ranges created by this operator are inclusive on both ends.\nFor example, I could easily change the code example above to support all levels between 0 and 100. Like this:\n\nconst level: u8 = 4;\nconst category = switch (level) {\n 0...25 => \"beginner\",\n 26...75 => \"intermediary\",\n 76...100 => \"professional\",\n else => {\n @panic(\"Not supported level!\");\n },\n};\ntry stdout.print(\"{s}\\n\", .{category});\n\nbeginner\n\n\nThis is neat, and it works with character ranges too. That is, I could simply write 'a'...'z', to match any character value that is a lowercase letter, and it would work fine.\n\n\n1.9.3 The defer keyword\nWith the defer keyword you can execute expressions at the end of the current scope. Take the foo() function below as an example. 
When we execute this function, the expression that prints the message “Exiting function …” get’s executed only at the end of the function scope.\n\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nfn foo() !void {\n defer std.debug.print(\n \"Exiting function ...\\n\", .{}\n );\n try stdout.print(\"Adding some numbers ...\\n\", .{});\n const x = 2 + 2; _ = x;\n try stdout.print(\"Multiplying ...\\n\", .{});\n const y = 2 * 8; _ = y;\n}\n\npub fn main() !void {\n try foo();\n}\n\nAdding some numbers ...\nMultiplying ...\nExiting function ...\nIt doesn’t matter how the function exits (i.e. because of an error, or, because of an return statement, or whatever), just remember, this expression get’s executed when the function exits.\n\n\n1.9.4 For loops\nA loop allows you to execute the same lines of code multiple times, thus, creating a “repetition space” in the execution flow of your program. Loops are particularly useful when we want to replicate the same function (or the same set of commands) over several different inputs.\nThere are different types of loops available in Zig. But the most essential of them all is probably the for loop. A for loop is used to apply the same piece of code over the elements of a slice or an array.\nFor loops in Zig have a slightly different syntax that you are probably used to see in other languages. You start with the for keyword, then, you list the items that you want to iterate over inside a pair of parentheses. Then, inside of a pair of pipes (|) you should declare an identifier that will serve as your iterator, or, the “repetition index of the loop”.\n\nfor (items) |value| {\n // code to execute\n}\n\nInstead of using a (value in items) syntax, in Zig, for loops use the syntax (items) |value|. 
In the example below, you can see that we are looping through the items of the array stored at the object name, and printing to the console the decimal representation of each character in this array.\nIf we wanted, we could also iterate through a slice (or a portion) of the array, instead of iterating through the entire array stored in the name object. Just use a range selector to select the section you want. For example, I could provide the expression name[0..3] to the for loop, to iterate just through the first 3 elements in the array.\n\nconst name = [_]u8{'P','e','d','r','o'};\nfor (name) |char| {\n try stdout.print(\"{d} | \", .{char});\n}\n\n80 | 101 | 100 | 114 | 111 | \n\n\nIn the above example we are using the value itself of each element in the array as our iterator. But there are many situations where we need to use an index instead of the actual values of the items.\nYou can do that by providing a second set of items to iterate over. More precisely, you provide the range selector 0.. to the for loop. So, yes, you can use two different iterators at the same time in a for loop in Zig.\nBut remember from Section 1.4 that, every object you create in Zig must be used in some way. So if you declare two iterators in your for loop, you must use both iterators inside the for loop body. But if you want to use just the index iterator, and not use the “value iterator”, then, you can discard the value iterator by maching the value items to the underscore character, like in the example below:\n\nfor (name, 0..) |_, i| {\n try stdout.print(\"{d} | \", .{i});\n}\n\n0 | 1 | 2 | 3 | 4 |\n\n\n1.9.5 While loops\nA while loop is created from the while keyword. 
While a for loop iterates through the items of an array, a while loop will loop continuously, and infinitely, until a logical test (specified by you) becomes false.\nYou start with the while keyword, then, you define a logical expression inside a pair of parentheses, and the body of the loop is provided inside a pair of curly braces, like in the example below:\n\nvar i: u8 = 1;\nwhile (i < 5) {\n try stdout.print(\"{d} | \", .{i});\n i += 1;\n}\n\n1 | 2 | 3 | 4 | \n\n\n\n\n1.9.6 Using break and continue\nIn Zig, you can explicitly stop the execution of a loop, or, jump to the next iteration of the loop, using the keywords break and continue, respectively. The while loop present in the example below, is at first sight, an infinite loop. Because the logical value inside the parenthese will always be equal to true. What makes this while loop stop when the i object reaches the count 10? Is the break keyword!\nInside the while loop, we have an if statement that is constantly checking if the i variable is equal to 10. Since we are increasing the value of this i variable at each iteration of the while loop. At some point, this i variable will be equal to 10, and when it does, the if statement will execute the break expression, and, as a result, the execution of the while loop is stopped.\nNotice the expect() function from the Zig standard library after the while loop. This expect() function is an “assert” type of function. This function checks if the logical test provided is equal to true. If this logical test is false, the function raises an assertion error. 
But it is equal to true, then, the function will do nothing.\n\nvar i: usize = 0;\nwhile (true) {\n if (i == 10) {\n break;\n }\n i += 1;\n}\ntry std.testing.expect(i == 10);\ntry stdout.print(\"Everything worked!\", .{});\n\nEverything worked!\n\n\nSince this code example was executed succesfully by the zig compiler, without raising any errors, then, we known that, after the execution of while loop, the i variable is equal to 10. Because if it wasn’t equal to 10, then, an error would be raised by expect().\nNow, in the next example, we have an use case for the continue keyword. The if statement is constantly checking if the current index is a multiple of 2. If it is, then we jump to the next iteration of the loop directly. But it the current index is not a multiple of 2, then, the loop will simply print this index to the console.\n\nconst ns = [_]u8{1,2,3,4,5,6};\nfor (ns) |i| {\n if ((i % 2) == 0) {\n continue;\n }\n try stdout.print(\"{d} | \", .{i});\n}\n\n1 | 3 | 5 |", "crumbs": [ "1  Introducing Zig" ] @@ -184,7 +184,7 @@ "href": "Chapters/01-zig-weird.html#sec-structs-and-oop", "title": "1  Introducing Zig", "section": "1.10 Structs and OOP", - "text": "1.10 Structs and OOP\nZig is a language more closely related to C (which is a procedural language), than it is to C++ or Java (which are object-oriented languages). Because of that, you do not have advanced OOP (Object-Oriented Programming) patterns available in Zig, such as classes, interfaces or class inheritance. Nonetheless, OOP in Zig is still possible by using struct definitions.\nWith struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C. You give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. 
You can also register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object that you create with this new type, will always have these methods available and associated with them.\nIn C++, when we create a new class, we normally have a constructor method (or, a constructor function) to construct or to instantiate every object of this particular class, and you also have a destructor method (or a destructor function) that is the function responsible for destroying every object of this class.\nIn Zig, we normally declare the constructor and the destructor methods of our structs, by declaring an init() and a deinit() methods inside the struct. This is just a naming convention that you will find across the entire Zig standard library. So, in Zig, the init() method of a struct is normally the constructor method of the class represented by this struct. While the deinit() method is the method used for destroying an existing instance of that struct.\nBoth the init() and deinit() methods are used extensively in Zig code, and you will see both of them at Section 2.2.7. In this section, I present the ArenaAllocator(), which is a special type of allocator object that receives a second (child) allocator object at instantiation. We use the init() method to create a new ArenaAllocator() object, then, on the next line, we also used the deinit() method in conjunction with the defer keyword, to destroy this arena allocator object at the end of the current scope.\nBut, as another example, let’s build a simple User struct to represent an user of some sort of system. If you look at the User struct below, you can see the struct keyword, and inside of a pair of curly braces, we write the struct’s body.\nNotice the data members of this struct, id, name and email. Every data member have it’s type explicitly annotated, with the colon character (:) syntax that we described earlier at Section 1.2.2. 
But also notice that every line in the struct body that describes a data member, ends with a comma character (,). So every time you declare a data member in your Zig code, always end the line with a comma character, instead of ending it with the traditional semicolon character (;).\nNext, also notice in this example, that we registrated an init() function as a method of this User struct. This init() method is the constructor method that you use to instantiate every new User object. That is why this init() function return an User object as result.\n\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst User = struct {\n id: u64,\n name: []const u8,\n email: []const u8,\n\n pub fn init(id: u64,\n name: []const u8,\n email: []const u8) User {\n\n return User {\n .id = id,\n .name = name,\n .email = email\n };\n }\n\n pub fn print_name(self: User) !void {\n try stdout.print(\"{s}\\n\", .{self.name});\n }\n};\n\npub fn main() !void {\n const u = User.init(1, \"pedro\", \"email@gmail.com\");\n try u.print_name();\n}\n\npedro\n\n\nThe pub keyword plays an important role in struct declarations, and OOP in Zig. Every method that you declare in your struct that is marked with the keyword pub, becomes a public method of this particular struct.\nSo every method that you create in your struct, is, at first, a private method of that struct. Meaning that, this method can only be called from within this struct. But, if you mark this method as public, with the keyword pub, then, you can call the method directly from the User object you have in your code.\nIn other words, the functions marked by the keyword pub are members of the public API of that struct. For example, if I did not marked the print_name() method as public, then, I could not execute the line u.print_name(). 
Because I would not be authorized to call this method directly in my code.", + "text": "1.10 Structs and OOP\nZig is a language more closely related to C (which is a procedural language), than it is to C++ or Java (which are object-oriented languages). Because of that, you do not have advanced OOP (Object-Oriented Programming) patterns available in Zig, such as classes, interfaces or class inheritance. Nonetheless, OOP in Zig is still possible by using struct definitions.\nWith struct definitions, you can create (or define) a new data type in Zig. These struct definitions work the same way as they work in C. You give a name to this new struct (or, to this new data type you are creating), then, you list the data members of this new struct. You can also register functions inside this struct, and they become the methods of this particular struct (or data type), so that, every object that you create with this new type, will always have these methods available and associated with them.\nIn C++, when we create a new class, we normally have a constructor method (or, a constructor function) to construct or to instantiate every object of this particular class, and you also have a destructor method (or a destructor function) that is the function responsible for destroying every object of this class.\nIn Zig, we normally declare the constructor and the destructor methods of our structs, by declaring an init() and a deinit() methods inside the struct. This is just a naming convention that you will find across the entire Zig standard library. So, in Zig, the init() method of a struct is normally the constructor method of the class represented by this struct. While the deinit() method is the method used for destroying an existing instance of that struct.\nThe init() and deinit() methods are both used extensively in Zig code, and you will see both of them being used when we talk about allocators at Section 2.2. 
But, as another example, let’s build a simple User struct to represent an user of some sort of system. If you look at the User struct below, you can see the struct keyword, and inside of a pair of curly braces, we write the struct’s body.\nNotice the data members of this struct, id, name and email. Every data member have it’s type explicitly annotated, with the colon character (:) syntax that we described earlier at Section 1.2.2. But also notice that every line in the struct body that describes a data member, ends with a comma character (,). So every time you declare a data member in your Zig code, always end the line with a comma character, instead of ending it with the traditional semicolon character (;).\nNext, also notice in this example, that we registrated an init() function as a method of this User struct. This init() method is the constructor method that you use to instantiate every new User object. That is why this init() function return an User object as result.\n\nconst std = @import(\"std\");\nconst stdout = std.io.getStdOut().writer();\nconst User = struct {\n id: u64,\n name: []const u8,\n email: []const u8,\n\n pub fn init(id: u64,\n name: []const u8,\n email: []const u8) User {\n\n return User {\n .id = id,\n .name = name,\n .email = email\n };\n }\n\n pub fn print_name(self: User) !void {\n try stdout.print(\"{s}\\n\", .{self.name});\n }\n};\n\npub fn main() !void {\n const u = User.init(1, \"pedro\", \"email@gmail.com\");\n try u.print_name();\n}\n\npedro\n\n\nThe pub keyword plays an important role in struct declarations, and OOP in Zig. Every method that you declare in your struct that is marked with the keyword pub, becomes a public method of this particular struct.\nSo every method that you create in your struct, is, at first, a private method of that struct. Meaning that, this method can only be called from within this struct. 
But, if you mark this method as public, with the keyword pub, then, you can call the method directly from the User object you have in your code.\nIn other words, the functions marked by the keyword pub are members of the public API of that struct. For example, if I did not marked the print_name() method as public, then, I could not execute the line u.print_name(). Because I would not be authorized to call this method directly in my code.", "crumbs": [ "1  Introducing Zig" ] @@ -254,7 +254,7 @@ "href": "Chapters/01-memory.html#sec-allocators", "title": "2  Memory and Allocators in Zig", "section": "2.2 Allocators", - "text": "2.2 Allocators\nOne key aspect about Zig, is that there are “no hidden-memory allocations” in Zig. What that really means, is that “no allocations happen behind your back in the standard library” (Sobeston 2024).\nThis is a known problem, specially in C++. Because in C++, there are some operators that do allocate memory behind the scene, and there is no way for you to known that, until you actually read the source code of these operators, and find the memory allocation calls. Many programmers find this behaviour annoying and hard to keep track of.\nBut, in Zig, if a function, an operator, or anything from the standard library needs to allocate some memory during it’s execution, then, this function/operator needs to receive (as input) an allocator provided by the user, to actually be able to allocate the memory it needs.\nThis creates a clear distinction between functions that “do not” from those that “actually do” allocate memory. Just look at the arguments of this function. If a function, or operator, have an allocator object as one of it’s inputs/arguments, then, you know for sure that this function/operator will allocate some memory during it’s execution.\nAn example is the allocPrint() function from the Zig standard library. With this function, you can write a new string using format specifiers. 
So, this function is, for example, very similar to the function sprintf() in C. In order to write such new string, the allocPrint() function needs to allocate some memory to store the output string.\nThat is why, the first argument of this function is an allocator object that you, the user/programmer, gives as input to the function. In the example below, I am using the GeneralPurposeAllocator() as my allocator object. But I could easily use any other type of allocator object from the Zig standard library.\n\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst name = \"Pedro\";\nconst output = try std.fmt.allocPrint(\n allocator,\n \"Hello {s}!!!\",\n .{name}\n);\ntry stdout.print(\"{s}\\n\", .{output});\n\nHello Pedro!!!\n\n\nYou get a lot of control over where and how much memory this function can allocate. Because it is you, the user/programmer, that provides the allocator for the function to use. This makes “total control” over memory management easier to achieve in Zig.\n\n2.2.1 What are allocators?\nAllocators in Zig are objects that you can use to allocate memory for your program. They are similar to the memory allocating functions in C, like malloc() and calloc(). So, if you need to use more memory than you initially have, during the execution of your program, you can simply ask for more memory using an allocator.\nZig offers different types of allocators, and they are usually available through the std.heap module of the standard library. So, just import the Zig standard library into your Zig module (with @import(\"std\")), and you can start using these allocators in your code.\nFurthermore, every allocator object is built on top of the Allocator interface in Zig. This means that, every allocator object you find in Zig must have the methods alloc(), create(), free() and destroy(). 
So, you can change the type of allocator you are using, but you don’t need to change the function calls to the methods that do the memory allocation (and the free memory operations) for your program.\n\n\n2.2.2 Why you need an allocator?\nAs we described at Section 2.1.4, everytime you make a function call in Zig, a space in the stack is reserved for this function call. But the stack have a key limitation which is: every object stored in the stack have a known fixed length.\nBut in reality, there are two very commom instances where this “fixed length limitation” of the stack is a deal braker:\n\nthe objects that you create inside your function might grow in size during the execution of the function.\nsometimes, it is impossible to know upfront how many inputs you will receive, or how big this input will be.\n\nAlso, there is another instance where you might want to use an allocator, which is when you want to write a function that returns a pointer to a local object. As I described at Section 2.1.4, you cannot do that if this local object is stored in the stack. However, if this object is stored in the heap, then, you can return a pointer to this object at the end of the function. Because you (the programmer) control the lyfetime of any heap memory that you allocate. You decide when this memory get’s destroyed/freed.\nThese are commom situations where the stack is not good for. That is why you need a different memory management strategy to store these objects inside your function. You need to use a memory type that can grow together with your objects, or that you can control the lyfetime of this memory. The heap fit this description.\nAllocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size during the execution of your program, you grow the amount of memory you have by allocating more memory in the heap to store these objects. 
And you that in Zig, by using an allocator object.\n\n\n2.2.3 The different types of allocators\nAt the moment of the writing of this book, in Zig, we have 6 different allocators available in the standard library:\n\nGeneralPurposeAllocator().\npage_allocator().\nFixedBufferAllocator() and ThreadSafeFixedBufferAllocator().\nArenaAllocator().\nc_allocator() (requires you to link to libc).\n\nEach allocator have it’s own perks and limitations. All allocators, except FixedBufferAllocator() and ArenaAllocator(), are allocators that use the heap memory. So any memory that you allocate with these allocators, will be placed in the heap.\n\n\n2.2.4 General-purpose allocators\nThe GeneralPurposeAllocator(), as the name suggests, is a “general purpose” allocator. You can use it for every type of task. In the example below, I’m allocating enough space to store a single integer in the object some_number.\n\nconst std = @import(\"std\");\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const some_number = try allocator.create(u32);\n defer allocator.destroy(some_number);\n\n some_number.* = @as(u32, 45);\n}\n\nWhile useful, you might want to use the c_allocator(), which is a alias to the C standard allocator malloc(). So, yes, you can use malloc() in Zig if you want to. Just use the c_allocator() from the Zig standard library. However, if you do use c_allocator(), you must link to Libc when compiling your source code with the zig compiler, by including the flag -lc in your compilation process. If you do not link your source code to Libc, Zig will not be able to find the malloc() implementation in your system.\n\n\n2.2.5 Page allocator\nThe page_allocator() is an allocator that allocates full pages of memory in the heap. 
In other words, every time you allocate memory with page_allocator(), a full page of memory in the heap is allocated, instead of just a small piece of it.\nThe size of this page depends on the system you are using. Most systems use a page size of 4KB in the heap, so, that is the amount of memory that is normally allocated in each call by page_allocator(). That is why, page_allocator() is considered a fast, but also “wasteful” allocator in Zig. Because it allocates a big amount of memory in each call, and you most likely will not need that much memory in your program.\n\n\n2.2.6 Buffer allocators\nThe FixedBufferAllocator() and ThreadSafeFixedBufferAllocator() are allocator objects that work with a fixed sized buffer that is stored in the stack. So these two allocators only allocates memory in the stack. This also means that, in order to use these allocators, you must first create a buffer object, and then, give this buffer as an input to these allocators.\nIn the example below, I am creating a buffer object that is 10 elements long. Notice that I give this buffer object to the FixedBufferAllocator() constructor. Now, because this buffer object is 10 elements long, this means that I am limited to this space. I cannot allocate more than 10 elements with this allocator object. If I try to allocate more than that, the alloc() method will return an OutOfMemory error value.\n\nvar buffer: [10]u8 = undefined;\nfor (0..buffer.len) |i| {\n buffer[i] = 0; // Initialize to zero\n}\n\nvar fba = std.heap.FixedBufferAllocator.init(&buffer);\nconst allocator = fba.allocator();\nconst input = try allocator.alloc(u8, 5);\ndefer allocator.free(input);\n\n\n\n2.2.7 Arena allocator\nThe ArenaAllocator() is an allocator object that takes a child allocator as input. The idea behind the ArenaAllocator() in Zig is similar to the concept of “arenas” in the programming language Go5. 
It is an allocator object that allows you to allocate memory as many times you want, but free all memory only once. In other words, if you have, for example, called 5 times the method alloc() of an ArenaAllocator() object, you can free all the memory you allocated over these 5 calls at once, by simply calling the deinit() method of the same ArenaAllocator() object.\nIf you give, for example, a GeneralPurposeAllocator() object as input to the ArenaAllocator() constructor, like in the example below, then, the allocations you perform with alloc() will actually be made with the underlying object GeneralPurposeAllocator() that was passed. So, with an arena allocator, any new memory you ask for is allocated by the child allocator. The only thing that an arena allocator really do is helping you to free all the memory you allocated multiple times with just a single command. In the example below, I called alloc() 3 times. So, if I did not used an arena allocator, then, I would need to call free() 3 times to free all the allocated memory.\n\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nvar aa = std.heap.ArenaAllocator.init(gpa.allocator());\ndefer aa.deinit();\nconst allocator = aa.allocator();\n\nconst in1 = allocator.alloc(u8, 5);\nconst in2 = allocator.alloc(u8, 10);\nconst in3 = allocator.alloc(u8, 15);\n_ = in1; _ = in2; _ = in3;\n\n\n\n2.2.8 The alloc() and free() methods\nIn the code example below, we are accessing the stdin, which is the standard input channel, to receive an input from the user. We read the input given by the user with the readUntilDelimiterOrEof() method.\nNow, after reading the input of the user, we need to store this input somewhere in our program. That is why I use an allocator in this example. I use it to allocate some amount of memory to store this input given by the user. 
More specifically, the method alloc() of the allocator object is used to allocate an array capable of storing 50 u8 values.\nNotice that this alloc() method receives two inputs. The first one, is a type. This defines what type of values the allocated array will store. In the example below, we are allocating an array of unsigned 8-bit integers (u8). But you can create an array to store any type of value you want. Next, on the second argument, we define the size of the allocated array, by specifying how much elements this array will contain. In the case below, we are allocating an array of 50 elements.\nAt Section 1.12 we described that strings in Zig are simply arrays of characters. Each character is represented by an u8 value. So, this means that the array that was allocated in the object input is capable of storing a string that is 50-characters long.\nSo, in essence, the expression var input: [50]u8 = undefined would create an array for 50 u8 values in the stack of the current scope. But, you can allocate the same array in the heap by using the expression var input = try allocator.alloc(u8, 50).\n\nconst std = @import(\"std\");\nconst stdin = std.io.getStdIn();\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var input = try allocator.alloc(u8, 50);\n defer allocator.free(input);\n for (0..input.len) |i| {\n input[i] = 0; // initialize all fields to zero.\n }\n // read user input\n const input_reader = stdin.reader();\n _ = try input_reader.readUntilDelimiterOrEof(\n input,\n '\\n'\n );\n std.debug.print(\"{s}\\n\", .{input});\n}\n\nAlso, notice that in this example, we use the keyword defer to run a small piece of code at the end of the current scope, which is the expression allocator.free(input). When you execute this expression, the allocator will free the memory that it allocated for the input object.\nWe have talked about this at Section 2.1.5. 
You should always explicitly free any memory that you allocate using an allocator! You do that by using the free() method of the same allocator object you used to allocate this memory. The defer keyword is used in this example only to help us execute this free operation at the end of the current scope.\n\n\n2.2.9 The create() and destroy() methods\nWith the alloc() and free() methods, you can allocate memory to store multiple elements at once. In other words, with these methods, we always allocate an array to store multiple elements at once. But what if you need enough space to store just a single item? Should you allocate an array of a single element through alloc()?\nThe answer is no! In this case, you should use the create() method of the allocator object. Every allocator object offers the create() and destroy() methods, which are used to allocate and free memory for a single item, respectively.\nSo, in essence, if you want to allocate memory to store an array of elements, you should use alloc() and free(). But if you need to store just a single item, then, the create() and destroy() methods are ideal for you.\nIn the example below, I’m defining a struct to represent an user of some sort. It could be an user for a game, or a software to manage resources, it doesn’t mater. Notice that I use the create() method this time, to store a single User object in the program. Also notice that I use the destroy() method to free the memory used by this object at the end of the scope.\n\nconst std = @import(\"std\");\nconst User = struct {\n id: usize,\n name: []const u8,\n\n pub fn init(id: usize, name: []const u8) User {\n return .{ .id = id, .name = name };\n }\n};\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const user = try allocator.create(User);\n defer allocator.destroy(user);\n\n user.* = User.init(0, \"Pedro\");\n}\n\n\n\n\n\nChen, Jenny, and Ruohao Guo. 2022. 
“Stack and Heap Memory.” Introduction to Data Structures and Algorithms with C++. https://courses.engr.illinois.edu/cs225/fa2022/resources/stack-heap/.\n\n\nSobeston. 2024. “Zig Guide.” https://zig.guide/.\n\n\nZig Software Foundation. 2024. “Language Reference.” Zig Software Foundation. https://ziglang.org/documentation/master/.", + "text": "2.2 Allocators\nOne key aspect about Zig, is that there are “no hidden-memory allocations” in Zig. What that really means, is that “no allocations happen behind your back in the standard library” (Sobeston 2024).\nThis is a known problem, specially in C++. Because in C++, there are some operators that do allocate memory behind the scene, and there is no way for you to known that, until you actually read the source code of these operators, and find the memory allocation calls. Many programmers find this behaviour annoying and hard to keep track of.\nBut, in Zig, if a function, an operator, or anything from the standard library needs to allocate some memory during it’s execution, then, this function/operator needs to receive (as input) an allocator provided by the user, to actually be able to allocate the memory it needs.\nThis creates a clear distinction between functions that “do not” from those that “actually do” allocate memory. Just look at the arguments of this function. If a function, or operator, have an allocator object as one of it’s inputs/arguments, then, you know for sure that this function/operator will allocate some memory during it’s execution.\nAn example is the allocPrint() function from the Zig standard library. With this function, you can write a new string using format specifiers. So, this function is, for example, very similar to the function sprintf() in C. 
In order to write such new string, the allocPrint() function needs to allocate some memory to store the output string.\nThat is why, the first argument of this function is an allocator object that you, the user/programmer, gives as input to the function. In the example below, I am using the GeneralPurposeAllocator() as my allocator object. But I could easily use any other type of allocator object from the Zig standard library.\n\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nconst allocator = gpa.allocator();\nconst name = \"Pedro\";\nconst output = try std.fmt.allocPrint(\n allocator,\n \"Hello {s}!!!\",\n .{name}\n);\ntry stdout.print(\"{s}\\n\", .{output});\n\nHello Pedro!!!\n\n\nYou get a lot of control over where and how much memory this function can allocate. Because it is you, the user/programmer, that provides the allocator for the function to use. This makes “total control” over memory management easier to achieve in Zig.\n\n2.2.1 What are allocators?\nAllocators in Zig are objects that you can use to allocate memory for your program. They are similar to the memory allocating functions in C, like malloc() and calloc(). So, if you need to use more memory than you initially have, during the execution of your program, you can simply ask for more memory using an allocator.\nZig offers different types of allocators, and they are usually available through the std.heap module of the standard library. So, just import the Zig standard library into your Zig module (with @import(\"std\")), and you can start using these allocators in your code.\nFurthermore, every allocator object is built on top of the Allocator interface in Zig. This means that, every allocator object you find in Zig must have the methods alloc(), create(), free() and destroy(). 
So, you can change the type of allocator you are using, but you don’t need to change the function calls to the methods that do the memory allocation (and the free memory operations) for your program.\n\n\n2.2.2 Why you need an allocator?\nAs we described at Section 2.1.4, everytime you make a function call in Zig, a space in the stack is reserved for this function call. But the stack have a key limitation which is: every object stored in the stack have a known fixed length.\nBut in reality, there are two very commom instances where this “fixed length limitation” of the stack is a deal braker:\n\nthe objects that you create inside your function might grow in size during the execution of the function.\nsometimes, it is impossible to know upfront how many inputs you will receive, or how big this input will be.\n\nAlso, there is another instance where you might want to use an allocator, which is when you want to write a function that returns a pointer to a local object. As I described at Section 2.1.4, you cannot do that if this local object is stored in the stack. However, if this object is stored in the heap, then, you can return a pointer to this object at the end of the function. Because you (the programmer) control the lyfetime of any heap memory that you allocate. You decide when this memory get’s destroyed/freed.\nThese are commom situations where the stack is not good for. That is why you need a different memory management strategy to store these objects inside your function. You need to use a memory type that can grow together with your objects, or that you can control the lyfetime of this memory. The heap fit this description.\nAllocating memory on the heap is commonly known as dynamic memory management. As the objects you create grow in size during the execution of your program, you grow the amount of memory you have by allocating more memory in the heap to store these objects. 
And you that in Zig, by using an allocator object.\n\n\n2.2.3 The different types of allocators\nAt the moment of the writing of this book, in Zig, we have 6 different allocators available in the standard library:\n\nGeneralPurposeAllocator().\npage_allocator().\nFixedBufferAllocator() and ThreadSafeFixedBufferAllocator().\nArenaAllocator().\nc_allocator() (requires you to link to libc).\n\nEach allocator have it’s own perks and limitations. All allocators, except FixedBufferAllocator() and ArenaAllocator(), are allocators that use the heap memory. So any memory that you allocate with these allocators, will be placed in the heap.\n\n\n2.2.4 General-purpose allocators\nThe GeneralPurposeAllocator(), as the name suggests, is a “general purpose” allocator. You can use it for every type of task. In the example below, I’m allocating enough space to store a single integer in the object some_number.\n\nconst std = @import(\"std\");\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const some_number = try allocator.create(u32);\n defer allocator.destroy(some_number);\n\n some_number.* = @as(u32, 45);\n}\n\nWhile useful, you might want to use the c_allocator(), which is a alias to the C standard allocator malloc(). So, yes, you can use malloc() in Zig if you want to. Just use the c_allocator() from the Zig standard library. However, if you do use c_allocator(), you must link to Libc when compiling your source code with the zig compiler, by including the flag -lc in your compilation process. If you do not link your source code to Libc, Zig will not be able to find the malloc() implementation in your system.\n\n\n2.2.5 Page allocator\nThe page_allocator() is an allocator that allocates full pages of memory in the heap. 
In other words, every time you allocate memory with page_allocator(), a full page of memory in the heap is allocated, instead of just a small piece of it.\nThe size of this page depends on the system you are using. Most systems use a page size of 4KB in the heap, so, that is the amount of memory that is normally allocated in each call by page_allocator(). That is why, page_allocator() is considered a fast, but also “wasteful” allocator in Zig. Because it allocates a big amount of memory in each call, and you most likely will not need that much memory in your program.\n\n\n2.2.6 Buffer allocators\nThe FixedBufferAllocator() and ThreadSafeFixedBufferAllocator() are allocator objects that work with a fixed sized buffer that is stored in the stack. So these two allocators only allocates memory in the stack. This also means that, in order to use these allocators, you must first create a buffer object, and then, give this buffer as an input to these allocators.\nIn the example below, I am creating a buffer object that is 10 elements long. Notice that I give this buffer object to the FixedBufferAllocator() constructor. Now, because this buffer object is 10 elements long, this means that I am limited to this space. I cannot allocate more than 10 elements with this allocator object. If I try to allocate more than that, the alloc() method will return an OutOfMemory error value.\n\nvar buffer: [10]u8 = undefined;\nfor (0..buffer.len) |i| {\n buffer[i] = 0; // Initialize to zero\n}\n\nvar fba = std.heap.FixedBufferAllocator.init(&buffer);\nconst allocator = fba.allocator();\nconst input = try allocator.alloc(u8, 5);\ndefer allocator.free(input);\n\n\n\n2.2.7 Arena allocator\nThe ArenaAllocator() is an allocator object that takes a child allocator as input. The idea behind the ArenaAllocator() in Zig is similar to the concept of “arenas” in the programming language Go5. 
It is an allocator object that allows you to allocate memory as many times you want, but free all memory only once. In other words, if you have, for example, called 5 times the method alloc() of an ArenaAllocator() object, you can free all the memory you allocated over these 5 calls at once, by simply calling the deinit() method of the same ArenaAllocator() object.\nIf you give, for example, a GeneralPurposeAllocator() object as input to the ArenaAllocator() constructor, like in the example below, then, the allocations you perform with alloc() will actually be made with the underlying object GeneralPurposeAllocator() that was passed. So, with an arena allocator, any new memory you ask for is allocated by the child allocator. The only thing that an arena allocator really do is helping you to free all the memory you allocated multiple times with just a single command. In the example below, I called alloc() 3 times. So, if I did not used an arena allocator, then, I would need to call free() 3 times to free all the allocated memory.\n\nvar gpa = std.heap.GeneralPurposeAllocator(.{}){};\nvar aa = std.heap.ArenaAllocator.init(gpa.allocator());\ndefer aa.deinit();\nconst allocator = aa.allocator();\n\nconst in1 = allocator.alloc(u8, 5);\nconst in2 = allocator.alloc(u8, 10);\nconst in3 = allocator.alloc(u8, 15);\n_ = in1; _ = in2; _ = in3;\n\n\n\n2.2.8 The alloc() and free() methods\nIn the code example below, we are accessing the stdin, which is the standard input channel, to receive an input from the user. We read the input given by the user with the readUntilDelimiterOrEof() method.\nNow, after reading the input of the user, we need to store this input somewhere in our program. That is why I use an allocator in this example. I use it to allocate some amount of memory to store this input given by the user. 
More specifically, the method alloc() of the allocator object is used to allocate an array capable of storing 50 u8 values.\nNotice that this alloc() method receives two inputs. The first one, is a type. This defines what type of values the allocated array will store. In the example below, we are allocating an array of unsigned 8-bit integers (u8). But you can create an array to store any type of value you want. Next, on the second argument, we define the size of the allocated array, by specifying how much elements this array will contain. In the case below, we are allocating an array of 50 elements.\nAt Section 1.12 we described that strings in Zig are simply arrays of characters. Each character is represented by an u8 value. So, this means that the array that was allocated in the object input is capable of storing a string that is 50-characters long.\nSo, in essence, the expression var input: [50]u8 = undefined would create an array for 50 u8 values in the stack of the current scope. But, you can allocate the same array in the heap by using the expression var input = try allocator.alloc(u8, 50).\n\nconst std = @import(\"std\");\nconst stdin = std.io.getStdIn();\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n var input = try allocator.alloc(u8, 50);\n defer allocator.free(input);\n for (0..input.len) |i| {\n input[i] = 0; // initialize all fields to zero.\n }\n // read user input\n const input_reader = stdin.reader();\n _ = try input_reader.readUntilDelimiterOrEof(\n input,\n '\\n'\n );\n std.debug.print(\"{s}\\n\", .{input});\n}\n\nAlso, notice that in this example, we use the defer keyword (which I described at Section 1.9.3) to run a small piece of code at the end of the current scope, which is the expression allocator.free(input). When you execute this expression, the allocator will free the memory that it allocated for the input object.\nWe have talked about this at Section 2.1.5. 
You should always explicitly free any memory that you allocate using an allocator! You do that by using the free() method of the same allocator object you used to allocate this memory. The defer keyword is used in this example only to help us execute this free operation at the end of the current scope.\n\n\n2.2.9 The create() and destroy() methods\nWith the alloc() and free() methods, you can allocate memory to store multiple elements at once. In other words, with these methods, we always allocate an array to store multiple elements at once. But what if you need enough space to store just a single item? Should you allocate an array of a single element through alloc()?\nThe answer is no! In this case, you should use the create() method of the allocator object. Every allocator object offers the create() and destroy() methods, which are used to allocate and free memory for a single item, respectively.\nSo, in essence, if you want to allocate memory to store an array of elements, you should use alloc() and free(). But if you need to store just a single item, then, the create() and destroy() methods are ideal for you.\nIn the example below, I’m defining a struct to represent an user of some sort. It could be an user for a game, or a software to manage resources, it doesn’t mater. Notice that I use the create() method this time, to store a single User object in the program. Also notice that I use the destroy() method to free the memory used by this object at the end of the scope.\n\nconst std = @import(\"std\");\nconst User = struct {\n id: usize,\n name: []const u8,\n\n pub fn init(id: usize, name: []const u8) User {\n return .{ .id = id, .name = name };\n }\n};\n\npub fn main() !void {\n var gpa = std.heap.GeneralPurposeAllocator(.{}){};\n const allocator = gpa.allocator();\n const user = try allocator.create(User);\n defer allocator.destroy(user);\n\n user.* = User.init(0, \"Pedro\");\n}\n\n\n\n\n\nChen, Jenny, and Ruohao Guo. 2022. 
“Stack and Heap Memory.” Introduction to Data Structures and Algorithms with C++. https://courses.engr.illinois.edu/cs225/fa2022/resources/stack-heap/.\n\n\nSobeston. 2024. “Zig Guide.” https://zig.guide/.\n\n\nZig Software Foundation. 2024. “Language Reference.” Zig Software Foundation. https://ziglang.org/documentation/master/.", "crumbs": [ "2  Memory and Allocators in Zig" ] @@ -764,7 +764,7 @@ "href": "Chapters/09-error-handling.html#how-to-handle-errors", "title": "9  Error handling and unions in Zig", "section": "9.2 How to handle errors", - "text": "9.2 How to handle errors\nNow that we have learned more about what errors are in Zig, let’s discuss the available strategies to handle these errors, which are:\n\ntry keyword;\ncatch keyword;\nan if statement;\nerrdefer keyword;\n\n\n9.2.1 What try means?\nAs I described over the previous sections, when we say that an expression might return an error, we are basically referring to an expression that has a return type in the format !T. The ! indicates that this expression returns either an error value, or a value of type T.\nAt Section 1.2.3, I presented the try keyword and where to use it. But I did not talk about what exactly this keyword does to your code, or, in other words, I have not explained yet what try means in your code.\nIn essence, when you use the try keyword in an expression, you are telling the Zig compiler the following: “Hey! Execute this expression for me, and, if this expression returns an error, please, return this error for me and stop the execution of my program. But if this expression returns a valid value, then, return this value, and move on”.\nIn other words, the try keyword is essentially a strategy to enter panic mode, and stop the execution of your program in case an error occurs.
With the try keyword, you are telling the Zig compiler that stopping the execution of your program is the most reasonable strategy to take if an error occurs in that particular expression.\n\n\n9.2.2 The catch keyword\nOk, now that we understand properly what try means, let’s discuss catch now. One important detail here is that you can use try or catch to handle your errors, but you cannot use try and catch together. In other words, try and catch are different and completely separate strategies in the Zig language.\nThis is uncommon, and different from what happens in other languages. Most programming languages that adopt the try catch pattern (such as C++, R, Python, JavaScript, etc.), normally use these two keywords in conjunction to form the complete logic to properly handle the errors. Anyway, Zig tries a different approach in the try catch pattern.\nSo, we learned already about what try means, and we also know that both try and catch should be used alone, separate from each other. But what exactly does catch do in Zig? With catch, we can construct a block of logic to handle the error value, in case it happens in the current expression.\nLook at the code example below. Once again, we go back to the previous example where we were trying to open a file that doesn’t exist on my computer, but this time, I use catch to actually implement some logic to handle the error, instead of just stopping the execution right away.\nMore specifically, in this example, I’m using a logger object to record some logs into the system, before I return the error, and stop the execution of the program. For example, this could be some part of the codebase of a complex system that I do not have full control over, and I want to record these logs before the program crashes, so that I can debug it later (e.g. maybe I cannot compile the full program, and properly debug it with a debugger.
So, these logs might be a valid strategy to overcome this barrier).\n\nconst dir = std.fs.cwd();\nconst file = dir.openFile(\n \"doesnt_exist.txt\", .{}\n) catch |err| {\n logger.record_context();\n logger.log_error(err);\n return err;\n};\n\nTherefore, we use catch to create a block of expressions that will handle the error. I can return the error value from this block of expressions, like I did in the above example, which will make the program enter panic mode and stop the execution. But I could also return a valid value from this block of code, which would be stored in the file object.\nNotice that, instead of writing the keyword before the expression that might return the error, like we do with try, we write catch after the expression. We can open the pair of pipes (|), which captures the error value returned by the expression, and makes this error value available in the scope of the catch block as the object named err. In other words, because I wrote |err| in the code, I can access the error value returned by the expression, by using the err object.\nAlthough this is the most common use of catch, you can also use this keyword to handle the error in a “default value” style. That is, if the expression returns an error, we use the default value instead. Otherwise, we use the valid value returned by the expression.\nThe official Zig language reference provides a great example of this “default value” strategy with catch. This example is reproduced below. Notice that we are trying to parse some unsigned integer from a string object named str. In other words, this function is trying to transform an object of type []const u8 (i.e. an array of characters, a string, etc.) into an object of type u64.\nBut this parsing process done by the function parseU64() may fail, resulting in a runtime error. The catch keyword used in this example provides an alternative value (13) to be used in case this parseU64() function raises an error.
So, the expression below essentially means: “Hey! Please, parse this string into a u64 for me, and store the results into the object number. But, if an error occurs, then, return the value 13 instead”.\n\nconst number = parseU64(str, 10) catch 13;\n\nSo, at the end of this process, the object number will contain either a u64 integer that was parsed successfully from the input string str, or, if an error in the parsing process occurs, it will contain the u64 value 13 that was provided by the catch keyword as the “default”, or, the “alternative” value.\n\n\n9.2.3 Using if statements\nNow, you can also use if statements to handle errors in your Zig code. In the example below, I’m reproducing the previous example, where we try to parse an integer value from an input string with a function named parseU64().\nWe execute the expression inside the “if”. If this expression returns an error value, the “if branch” (or, the “true branch”) of the if statement is not executed. But if this expression returns a valid value instead, then, this value is unwrapped into the number object.\nThis means that, if the parseU64() expression returns a valid value, this value becomes available inside the scope of this “if branch” (i.e. the “true branch”) through the object that we listed inside the pair of pipe characters (|), which is the object number.\nIf an error occurs, we can use an “else branch” (or the “false branch”) of the if statement to handle the error.
In the example below, we are using the else in the if statement to unwrap the error value (that was returned by parseU64()) into the err object, and handle the error.\n\nif (parseU64(str, 10)) |number| {\n // do something with `number` here\n} else |err| {\n // handle the error value.\n}\n\nNow, if the expression that you are executing returns different types of error values, and you want to take a different action for each of these types of error values, the catch keyword becomes limited.\nFor this type of situation, the official documentation of the language suggests the use of a switch statement with an if statement (Zig Software Foundation 2024b). The basic idea is to use the if statement to execute the expression, and use the “else branch” to pass the error value to a switch statement, where you define a different action for each type of error value that might be returned by the expression executed in the if statement.\nThe example below demonstrates this idea. We first try to add (or register) a set of tasks to a queue. If this “registration process” goes well, we then try to distribute these tasks across the workers of our system. But if this “registration process” returns an error value, we then use a switch statement in the “else branch” to handle each possible error value.\n\nif (add_tasks_to_queue(&queue, tasks)) |_| {\n distribute_tasks(&queue);\n} else |err| switch (err) {\n error.InvalidTaskName => {\n // do something\n },\n error.TimeoutTooBig => {\n // do something\n },\n error.QueueNotFound => {\n // do something\n },\n // and all the other error options ...\n}\n\n\n\n9.2.4 The errdefer keyword\nA common pattern in C programs in general is to clean resources when an error occurs during the execution of the program. In other words, one common way to handle errors is to perform “cleanup actions” before we exit our program.
This guarantees that a runtime error does not make our program leak resources of the system.\nThe errdefer keyword is a tool to perform such “cleanup actions” in hostile situations. This keyword is commonly used to clean (or to free) allocated resources, before the execution of our program gets stopped because of an error value being generated.\nThe basic idea is to provide an expression to the errdefer keyword. Then, errdefer executes this expression if, and only if, an error occurs during the execution of the current scope. In the example below, we are using an allocator object (that we presented at Section 2.2) to create a new User object. If we are successful in creating and registering this new user, this create_user() function will return this new User object as its return value.\nHowever, if for some reason, an error value is generated by some expression that is after the errdefer line, for example, in the db.register_user(user) expression, the expression registered by errdefer gets executed before the error value is returned from the function, and before the program enters panic mode and stops the current execution.\n\nfn create_user(db: Database, allocator: Allocator) !*User {\n const user = try allocator.create(User);\n errdefer allocator.destroy(user);\n\n // Register new user in the Database.\n _ = try db.register_user(user);\n return user;\n}\n\nBy using errdefer to destroy the user object that we have just created, we guarantee that the memory allocated for this user object gets freed, before the execution of the program stops.\nBecause if the expression try db.register_user(user) returns an error value, the execution of our program stops, and we lose all references and control over the memory that we have allocated for the user object. As a result, if we do not free the memory associated with the user object before the program stops, we cannot free this memory anymore. We simply lose our chance to do the right thing.
That is why errdefer is essential in this situation.\nHaving all this in mind, the errdefer keyword is different but also similar to the defer keyword. The only difference between the two is when the provided expression gets executed. The defer keyword always executes the provided expression at the end of the current scope, while errdefer executes the provided expression when an error occurs in the current scope.", + "text": "9.2 How to handle errors\nNow that we have learned more about what errors are in Zig, let’s discuss the available strategies to handle these errors, which are:\n\ntry keyword;\ncatch keyword;\nan if statement;\nerrdefer keyword;\n\n\n9.2.1 What try means?\nAs I described over the previous sections, when we say that an expression might return an error, we are basically referring to an expression that has a return type in the format !T. The ! indicates that this expression returns either an error value, or a value of type T.\nAt Section 1.2.3, I presented the try keyword and where to use it. But I did not talk about what exactly this keyword does to your code, or, in other words, I have not explained yet what try means in your code.\nIn essence, when you use the try keyword in an expression, you are telling the Zig compiler the following: “Hey! Execute this expression for me, and, if this expression returns an error, please, return this error for me and stop the execution of my program. But if this expression returns a valid value, then, return this value, and move on”.\nIn other words, the try keyword is essentially a strategy to enter panic mode, and stop the execution of your program in case an error occurs. With the try keyword, you are telling the Zig compiler that stopping the execution of your program is the most reasonable strategy to take if an error occurs in that particular expression.\n\n\n9.2.2 The catch keyword\nOk, now that we understand properly what try means, let’s discuss catch now.
One important detail here is that you can use try or catch to handle your errors, but you cannot use try and catch together. In other words, try and catch are different and completely separate strategies in the Zig language.\nThis is uncommon, and different from what happens in other languages. Most programming languages that adopt the try catch pattern (such as C++, R, Python, JavaScript, etc.), normally use these two keywords in conjunction to form the complete logic to properly handle the errors. Anyway, Zig tries a different approach in the try catch pattern.\nSo, we learned already about what try means, and we also know that both try and catch should be used alone, separate from each other. But what exactly does catch do in Zig? With catch, we can construct a block of logic to handle the error value, in case it happens in the current expression.\nLook at the code example below. Once again, we go back to the previous example where we were trying to open a file that doesn’t exist on my computer, but this time, I use catch to actually implement some logic to handle the error, instead of just stopping the execution right away.\nMore specifically, in this example, I’m using a logger object to record some logs into the system, before I return the error, and stop the execution of the program. For example, this could be some part of the codebase of a complex system that I do not have full control over, and I want to record these logs before the program crashes, so that I can debug it later (e.g. maybe I cannot compile the full program, and properly debug it with a debugger. So, these logs might be a valid strategy to overcome this barrier).\n\nconst dir = std.fs.cwd();\nconst file = dir.openFile(\n \"doesnt_exist.txt\", .{}\n) catch |err| {\n logger.record_context();\n logger.log_error(err);\n return err;\n};\n\nTherefore, we use catch to create a block of expressions that will handle the error.
I can return the error value from this block of expressions, like I did in the above example, which will make the program enter panic mode and stop the execution. But I could also return a valid value from this block of code, which would be stored in the file object.\nNotice that, instead of writing the keyword before the expression that might return the error, like we do with try, we write catch after the expression. We can open the pair of pipes (|), which captures the error value returned by the expression, and makes this error value available in the scope of the catch block as the object named err. In other words, because I wrote |err| in the code, I can access the error value returned by the expression, by using the err object.\nAlthough this is the most common use of catch, you can also use this keyword to handle the error in a “default value” style. That is, if the expression returns an error, we use the default value instead. Otherwise, we use the valid value returned by the expression.\nThe official Zig language reference provides a great example of this “default value” strategy with catch. This example is reproduced below. Notice that we are trying to parse some unsigned integer from a string object named str. In other words, this function is trying to transform an object of type []const u8 (i.e. an array of characters, a string, etc.) into an object of type u64.\nBut this parsing process done by the function parseU64() may fail, resulting in a runtime error. The catch keyword used in this example provides an alternative value (13) to be used in case this parseU64() function raises an error. So, the expression below essentially means: “Hey! Please, parse this string into a u64 for me, and store the results into the object number.
But, if an error occurs, then, return the value 13 instead”.\n\nconst number = parseU64(str, 10) catch 13;\n\nSo, at the end of this process, the object number will contain either a u64 integer that was parsed successfully from the input string str, or, if an error in the parsing process occurs, it will contain the u64 value 13 that was provided by the catch keyword as the “default”, or, the “alternative” value.\n\n\n9.2.3 Using if statements\nNow, you can also use if statements to handle errors in your Zig code. In the example below, I’m reproducing the previous example, where we try to parse an integer value from an input string with a function named parseU64().\nWe execute the expression inside the “if”. If this expression returns an error value, the “if branch” (or, the “true branch”) of the if statement is not executed. But if this expression returns a valid value instead, then, this value is unwrapped into the number object.\nThis means that, if the parseU64() expression returns a valid value, this value becomes available inside the scope of this “if branch” (i.e. the “true branch”) through the object that we listed inside the pair of pipe characters (|), which is the object number.\nIf an error occurs, we can use an “else branch” (or the “false branch”) of the if statement to handle the error. In the example below, we are using the else in the if statement to unwrap the error value (that was returned by parseU64()) into the err object, and handle the error.\n\nif (parseU64(str, 10)) |number| {\n // do something with `number` here\n} else |err| {\n // handle the error value.\n}\n\nNow, if the expression that you are executing returns different types of error values, and you want to take a different action for each of these types of error values, the catch keyword becomes limited.\nFor this type of situation, the official documentation of the language suggests the use of a switch statement with an if statement (Zig Software Foundation 2024b).
The basic idea is to use the if statement to execute the expression, and use the “else branch” to pass the error value to a switch statement, where you define a different action for each type of error value that might be returned by the expression executed in the if statement.\nThe example below demonstrates this idea. We first try to add (or register) a set of tasks to a queue. If this “registration process” goes well, we then try to distribute these tasks across the workers of our system. But if this “registration process” returns an error value, we then use a switch statement in the “else branch” to handle each possible error value.\n\nif (add_tasks_to_queue(&queue, tasks)) |_| {\n distribute_tasks(&queue);\n} else |err| switch (err) {\n error.InvalidTaskName => {\n // do something\n },\n error.TimeoutTooBig => {\n // do something\n },\n error.QueueNotFound => {\n // do something\n },\n // and all the other error options ...\n}\n\n\n\n9.2.4 The errdefer keyword\nA common pattern in C programs in general is to clean resources when an error occurs during the execution of the program. In other words, one common way to handle errors is to perform “cleanup actions” before we exit our program. This guarantees that a runtime error does not make our program leak resources of the system.\nThe errdefer keyword is a tool to perform such “cleanup actions” in hostile situations. This keyword is commonly used to clean (or to free) allocated resources, before the execution of our program gets stopped because of an error value being generated.\nThe basic idea is to provide an expression to the errdefer keyword. Then, errdefer executes this expression if, and only if, an error occurs during the execution of the current scope. In the example below, we are using an allocator object (that we presented at Section 2.2) to create a new User object.
If we are successful in creating and registering this new user, this create_user() function will return this new User object as its return value.\nHowever, if for some reason, an error value is generated by some expression that is after the errdefer line, for example, in the db.register_user(user) expression, the expression registered by errdefer gets executed before the error value is returned from the function, and before the program enters panic mode and stops the current execution.\n\nfn create_user(db: Database, allocator: Allocator) !*User {\n const user = try allocator.create(User);\n errdefer allocator.destroy(user);\n\n // Register new user in the Database.\n _ = try db.register_user(user);\n return user;\n}\n\nBy using errdefer to destroy the user object that we have just created, we guarantee that the memory allocated for this user object gets freed, before the execution of the program stops. Because if the expression try db.register_user(user) returns an error value, the execution of our program stops, and we lose all references and control over the memory that we have allocated for the user object. As a result, if we do not free the memory associated with the user object before the program stops, we cannot free this memory anymore. We simply lose our chance to do the right thing. That is why errdefer is essential in this situation.\nJust to make very clear the differences between defer (which I described at Section 1.9.3) and errdefer, it might be worth discussing the subject a bit further. You might still have the question “why use errdefer if we can use defer instead?” in your mind.\nAlthough they are similar, the key difference between the errdefer and defer keywords is when the provided expression gets executed. The defer keyword always executes the provided expression at the end of the current scope, no matter how your code exits this scope.
In contrast, errdefer executes the provided expression only when an error occurs in the current scope.\nThis becomes important if a resource that you allocate in the current scope gets freed later in your code, in a different scope. The create_user() function is an example of this. If you think closely about this function, you will notice that this function returns the user object as the result.\nIn other words, the allocated memory for the user object does not get freed inside create_user(), if the function returns successfully. So, if an error does not occur inside this function, the user object is returned from the function, and probably, the code that runs after this create_user() function will be responsible for freeing the memory of the user object.\nBut what if an error does occur inside create_user()? What happens then? This would mean that the execution of your code would stop in this create_user() function, and, as a consequence, the code that runs after this create_user() function would simply not run, and, as a result, the memory of the user object would not be freed before your program stops.\nThis is the perfect scenario for errdefer. We use this keyword to guarantee that our program will free the allocated memory for the user object, even if an error occurs inside the create_user() function.\nIf you allocate and free some memory for an object in the same scope, then, just use defer and be happy, errdefer has no use for you in such a situation. But if you allocate some memory in a scope A, but you only free this memory later, in a scope B for example, then, errdefer becomes useful to avoid leaking memory in sketchy situations.", "crumbs": [ "9  Error handling and unions in Zig" ]