Specification

The Evolang Language Specification

Source code representation

Evolang source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from a letter followed by a combining accent; the latter is treated as two code points. UTF-8 encoded Unicode byte order marks (BOM) are not allowed. Any invalid or non-normalized UTF-8 encoding will result in a parse error.

Lexical elements

Lexical elements define the structure of Evolang configurations, similar to how words form sentences. These elements include identifiers, keywords, operators, literals, and punctuation. The lexer (or lexical analyzer) scans the Evolang configuration file and breaks it down into tokens, which represent these elements. These tokens are then passed to the parser, which constructs an abstract syntax tree (AST) to represent the structure of the configuration file.

Whitespace

Whitespace characters separate tokens. Whitespace is defined as any sequence of space (U+0020), tab (U+0009), newline (U+000A), or carriage return (U+000D) characters.

Comments

Comments are meant for humans. The lexer ignores them, but you shouldn’t. Just avoid placing them inside string literals; the parser won’t be pleased.

Evolang configurations support two types of comments:

  • Line comments: Start with "//" and continue to the end of the line.
  • Block comments: Begin with "/*" and end with "*/".

// This is a line comment.
aws::ec2::instance "example" {
  ami:           "ami-0c55b159cbfafe1f0"
  instance_type: "t2.micro"
  /* This is a multiline comment.
     This is a multiline comment.
     This is a multiline comment. */
  tags: {
    Key: "Value"
  }
}

Block comments act like spaces if they contain no newlines; otherwise, they behave like newlines. You can’t nest them.
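
The replacement rule above can be sketched in Python. This is an illustrative sketch, not Evolang's lexer: the function name strip_comments is hypothetical, heredocs are not handled, and it assumes that comment markers inside a double-quoted string are ordinary string content, which this specification does not state explicitly.

```python
def strip_comments(source: str) -> str:
    """Replace comments as described above; string literals are copied as-is."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch == '"':
            # Copy a double-quoted string literal verbatim, honoring backslash escapes.
            j = i + 1
            while j < n and source[j] != '"':
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("//", i):
            # Line comment: skip up to (but not past) the end of the line.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif source.startswith("/*", i):
            # Block comments do not nest: the first "*/" closes the comment.
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            body = source[i + 2:end]
            # Acts like a newline if it spans lines, otherwise like a space.
            out.append("\n" if "\n" in body else " ")
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Because the first "*/" always closes the comment, an attempt to nest block comments leaves the trailing text and the second "*/" in the token stream.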

Tokens

Tokens are the smallest meaningful units of an Evolang configuration, identified during lexical analysis. Think of tokens as the words in a sentence. In Evolang, they include:

  • Identifiers (e.g., example, instance_type)
  • Keywords (for, in, if, etc.)
  • Operators & punctuation (+, :, {}, etc.)
  • Literals (42, "hello", 0xFF)

Whitespace separates tokens but is otherwise ignored.

Colons

Colons ":" are used to separate keys from values in key-value pairs. They also separate keys from values in map literals.

An example of a key-value pair in an Evolang configuration:

aws::ec2::instance "example" {
  ami:           "ami-0c55b159cbfafe1f0"
  instance_type: "t2.micro"
  tags: {
    Key: "Value"
  }
}

Colons are also used to define the value of an attribute:

variable "instance_type": "t2.micro"

And to assign a value to an output:

output "private_ip": aws::ec2::instance.example.private_ip

Identifiers

Identifiers are used to name attributes, output values, resources, and other language constructs. An identifier must start with a letter and can contain letters, digits, and underscores. Identifiers are case-sensitive.

identifier = letter { letter | digit | "_" } .
letter     = "a" … "z" | "A" … "Z" .
digit      = "0" … "9" .

Identifiers are intentionally kept simple to improve readability and reduce confusion. The following identifiers are valid:

  • example
  • example1
  • example_1

However, the following are not valid:

  • 1example
  • example-1
  • example 1

Throughout the Evolang language specification, similar restrictions are applied to ensure the language remains readable and less error-prone.
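
As a rough illustration, the identifier grammar above maps directly onto a regular expression. The Python sketch below is not part of Evolang; the names IDENTIFIER_RE and is_identifier are hypothetical.

```python
import re

# identifier = letter { letter | digit | "_" } .
# Only ASCII letters count as letters, and the first character must be a letter.
IDENTIFIER_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(text: str) -> bool:
    """Return True if text matches the identifier grammar in full."""
    return IDENTIFIER_RE.fullmatch(text) is not None
```

Under this reading, is_identifier accepts example, example1, and example_1, and rejects 1example, example-1, and "example 1".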

Uniqueness of identifiers

Identifiers must be unique within the scope in which they are declared. This means that you cannot declare two objects or attributes with the same identifier in the same package. If you attempt to do so, a runtime error will occur.

Keywords

Keywords are reserved words in Evolang that define the structure and behavior of configurations. They have predefined meanings and cannot be used as identifiers for attributes, objects, or other constructs.

These keywords serve as the foundation of Evolang’s syntax, guiding how declarations, expressions, and configurations are written. Attempting to use a reserved keyword as an identifier will result in a parsing error.

The following keywords are reserved and must not be used as identifiers:

for       in          if          else
switch    range

While the keywords listed above form the foundation of Evolang, they are not exhaustive. Since Evolang is a framework for building domain-specific languages (DSLs), each DSL can define its own set of domain-specific keywords to meet its unique requirements. These additional keywords are supplied to the parser during initialization, allowing DSLs to extend Evolang's syntax and semantics for specific use cases, such as infrastructure provisioning or other specialized domains.

Operators and punctuation

Operators and punctuation define the structure of expressions and statements in Evolang. They are used for assignments, comparisons, arithmetic operations, logical expressions, and accessing attributes.

Operators perform specific actions on values, such as arithmetic or logical operations, while punctuation is used to structure expressions, separate items, or access attributes.

The following character sequences represent operators (including assignment operators) and punctuation in Evolang:

+   -   *   /   %   
=   ==   !=   <   <=   
>   >=   &&   ||   !   
.   :   (   )   [   ]   
{   }   &   |=   ^=   
+=   -=

Literals

Integer literals

An integer literal consists of a sequence of digits representing a whole number. A prefix can indicate a non-decimal base: 0b or 0B for binary, 0, 0o, or 0O for octal, and 0x or 0X for hexadecimal. A standalone 0 is treated as a decimal zero. In hexadecimal literals, the letters a to f (or A to F) represent values 10 through 15.

For better readability, underscores "_" can be placed after a base prefix or between digits. These underscores are ignored and do not affect the literal’s value.

int_lit     = decimal_lit | binary_lit | octal_lit | hex_lit .
decimal_lit = "0" | ( "1" … "9" ) [ [ "_" ] decimal_digits ] .
binary_lit  = "0" ( "b" | "B" ) [ "_" ] binary_digits .
octal_lit   = "0" [ "o" | "O" ] [ "_" ] octal_digits .
hex_lit     = "0" ( "x" | "X" ) [ "_" ] hex_digits .

Underscores "_" can be used as visual separators, ignored by the parser:

max_connections: 1_000_000

Negative numbers are written with a "-" prefix:

temperature: -273

Evolang doesn’t allow leading zeros in decimal literals, so "042" is not a valid decimal literal. A base prefix must also be followed by at least one digit, so "0x" alone isn’t valid.
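
The following Python sketch shows one way to read the int_lit grammar. It is an approximation under stated assumptions: the name parse_int_literal is hypothetical, only unsigned literals are covered (the "-" sign is left to the caller), and digit sequences are assumed to allow a single underscore between adjacent digits. It follows the grammar as written, so a leading 0 followed by octal digits is read as an octal literal.

```python
import re

# One (pattern, base) pair per production of int_lit. The "digits" group holds
# the digits together with any single underscores between them.
_INT_FORMS = [
    (re.compile(r"0[bB]_?(?P<digits>[01](?:_?[01])*)"), 2),
    (re.compile(r"0[oO]?_?(?P<digits>[0-7](?:_?[0-7])*)"), 8),
    (re.compile(r"0[xX]_?(?P<digits>[0-9a-fA-F](?:_?[0-9a-fA-F])*)"), 16),
    (re.compile(r"(?P<digits>0|[1-9](?:_?[0-9])*)"), 10),
]


def parse_int_literal(text: str) -> int:
    """Return the value of an unsigned integer literal, or raise ValueError."""
    for pattern, base in _INT_FORMS:
        match = pattern.fullmatch(text)
        if match:
            # Underscores are visual separators and do not affect the value.
            return int(match.group("digits").replace("_", ""), base)
    raise ValueError(f"invalid integer literal: {text!r}")
```

With these patterns, 1_000_000, 0xFF, and 0b_101 are accepted, while 0x, 1__0, and 09 raise ValueError.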

Floating-point literals

A floating-point literal represents a number in decimal (and optionally, hexadecimal) form. It must contain a decimal point, an exponent, or both.

float_lit         = decimal_float_lit | hex_float_lit .

decimal_float_lit = decimal_digits "." [ decimal_digits ] [ decimal_exponent ] |
                    decimal_digits decimal_exponent |
                    "." decimal_digits [ decimal_exponent ] .
decimal_exponent  = ( "e" | "E" ) [ "+" | "-" ] decimal_digits .

hex_float_lit     = "0" ( "x" | "X" ) hex_mantissa hex_exponent .
hex_mantissa      = [ "_" ] hex_digits "." [ hex_digits ] |
                    [ "_" ] hex_digits |
                    "." hex_digits .
hex_exponent      = ( "p" | "P" ) [ "+" | "-" ] decimal_digits .

The exponent is written using "e" or "E", followed by an optional sign ("+" or "-") and a sequence of digits:

avogadro: 6.02214076e23

Underscores "_" can be used as visual separators, ignored by the parser:

pi: 3.14159_26535

Where hexadecimal floating-point literals are supported, the mantissa is written in hexadecimal and the exponent, introduced by "p" or "P", is a decimal power of two:

hex_float: 0x1.2p3
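
To make the values of the examples above concrete, here is a small Python sketch. It assumes the literal has already been accepted by the lexer, so it performs no grammar validation, and the name float_literal_value is hypothetical. It relies on Python's float and float.fromhex, which use the same "e" and "p" exponent conventions.

```python
def float_literal_value(text: str) -> float:
    """Compute the value of an already-validated floating-point literal."""
    digits = text.replace("_", "")  # underscores are visual separators only
    if digits[:2].lower() == "0x":
        # Hexadecimal mantissa scaled by a power of two: 0x1.2p3 = 1.125 * 2**3.
        return float.fromhex(digits)
    return float(digits)
```

For example, float_literal_value("0x1.2p3") is 9.0, because the mantissa 0x1.2 is 1.125 and the exponent p3 multiplies it by 8.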

String Literals

A string literal represents a fixed sequence of characters. There is one kind of string literal: interpreted.

Interpreted string literals are enclosed in double quotes (" "). Certain characters, like newlines and quotes, must be escaped. Escape sequences follow standard conventions: "\n" for a newline, "\x" for individual bytes, and "\u" and "\U" for Unicode code points. For example, "\xFF" represents the single byte 0xFF, while "\u00FF" represents the two-byte UTF-8 encoding of U+00FF (0xC3 0xBF).

string_lit             = interpreted_string_lit .
interpreted_string_lit = `"` { unicode_value | byte_value | "${" expr "}" } `"` .

An interpreted string literal that contains one or more ${} sequences is called an interpolated string literal. Each ${} sequence encloses an expression, typically a variable reference, whose value is substituted into the string.

aws::ec2::instance "example" {
  name: "${var.environment}-example"
}
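
The substitution can be pictured with a short Python sketch. It is deliberately narrower than the grammar: it only resolves ${var.name} references against a dictionary, whereas Evolang allows any expression inside ${}. The names interpolate and variables are hypothetical.

```python
import re

# Matches ${var.<identifier>} only; full expressions are out of scope here.
_INTERPOLATION_RE = re.compile(r"\$\{var\.([A-Za-z][A-Za-z0-9_]*)\}")


def interpolate(template: str, variables: dict) -> str:
    """Substitute each ${var.name} with the value of that variable."""

    def lookup(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined variable: {name}")
        return str(variables[name])

    return _INTERPOLATION_RE.sub(lookup, template)
```

With variables set to {"environment": "staging"}, the name attribute in the example above would evaluate to "staging-example".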

Multi-line String Literals

To declare a multi-line string literal, use the heredoc syntax. This allows strings to span multiple lines, making it easier to embed content such as scripts or configuration blocks without relying on escape sequences.

Multi-line strings are opened using the <<EOF or <<-EOF syntax, and closed using the EOF delimiter. The content between the opening and closing delimiters is treated as the string's value.

ascii::art "tux" {
  ascii: <<EOF
         _nnnn_                      
        dGGGGMMb     ,"""""""""""""".
       @p~qp~~qMb    | Linux Rules! |
       M|@||@) M|   _;..............'
       @,----.JM| -'
      JS^\__/  qKL
     dZP        qKRb
    dZP          qKKb
   fZP            SMMb
   HZM            MMMM
   FqM            MMMM
 __| ".        |\dS"qML
 |    `.       | `' \Zq
_)      \.___.,|     .'
\____   )MMMMMM|   .'
     `-'       `--' hjm
EOF
}

  1. Raw Heredoc (<<EOF)
    Preserves the content exactly as written, including all leading whitespace characters. The closing delimiter EOF may have leading whitespace, which is also preserved.

    script "example" {
      content: <<EOF
      echo "Hello, World!"
      EOF
    }
    
  2. Dedented Heredoc (<<-EOF)
    Strips leading whitespace characters on each line, allowing the heredoc to be indented in the source code without affecting the resulting string.

    script "example" {
      content: <<-EOF
        echo "Hello, World!"
        EOF
    }

Having both the raw and dedented heredoc options allows for flexibility in formatting multi-line strings. The raw heredoc is useful when you want to preserve the exact formatting, while the dedented heredoc is helpful for maintaining clean and readable code without unnecessary indentation.
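
The difference between the two forms can be sketched in Python. Several details here are assumptions rather than statements of this specification: the function name read_heredoc is hypothetical, the dedented form is read literally as removing all leading spaces and tabs from every line, and the resulting string is assumed not to end with a newline.

```python
def read_heredoc(lines):
    """Return the string value of a heredoc whose opening line is lines[0]."""
    opener = lines[0]
    marker_at = opener.index("<<")
    dedent = opener.startswith("<<-", marker_at)
    delimiter = opener[marker_at + (3 if dedent else 2):].strip()

    body = []
    for line in lines[1:]:
        # The closing delimiter may be indented, so compare the stripped line.
        if line.strip() == delimiter:
            break
        body.append(line.lstrip(" \t") if dedent else line)
    else:
        raise ValueError(f"heredoc not closed by {delimiter}")
    return "\n".join(body)
```

Applied to the two script examples above, the raw form keeps the six leading spaces before echo, while the dedented form yields the line starting directly with echo.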

List Literals

A list literal represents an ordered collection of values enclosed in square brackets ([ ]). Lists can contain any valid Evolang expressions, including nested lists, maps, and expressions that evaluate to values at runtime.

list_lit       = "[" [ list_elements ] "]" .
list_elements  = expr { "," expr } [ "," ] .

Lists support elements of mixed types and can span multiple lines for improved readability:

locals {
  core_domains: [
    "example.com",
    "www.example.com",
  ]
}

List elements must be separated by commas. A trailing comma after the last element is optional but recommended in multi-line lists to simplify future additions:

aws::ec2::instance "example" {
  availability_zones: [
    "us-east-1a",
    "us-east-1b",
    "us-east-1c",  // Trailing comma is allowed
  ]
}

For consistency and readability, Evolang enforces that empty lines are not permitted immediately after the opening bracket or before the closing bracket:

locals {
  // Invalid - contains empty line after opening bracket
  invalid_list: [

    "first_item",
    "second_item"
  ]
}

Map Literals

A map literal represents an unordered collection of key-value pairs enclosed in curly braces ({ }). Maps provide a way to associate values with unique keys, enabling structured data representation.

map_lit       = "{" [ map_elements ] "}" .
map_elements  = map_element { "," map_element } [ "," ] .
map_element   = (identifier | literal | traversal_expr) ":" expr .

Map keys can only be identifiers, literals, or traversal expressions, ensuring keys remain simple and predictable:

aws::ec2::instance "example" {
  tags: {
    Name:        "example-instance",
    Environment: "production",
    Owner:       var.owner,
  }
}

Like lists, maps can span multiple lines for improved readability, and trailing commas are allowed:

network "configuration" {
  settings: {
    vpc_cidr: "10.0.0.0/16",
    enable_nat_gateway: true,
    azs: ["us-east-1a", "us-east-1b"],
  }
}

For consistency and readability, Evolang enforces that empty lines are not permitted immediately after the opening brace or before the closing brace:

locals {
  // Invalid - contains empty line after opening brace
  invalid_map: {
    
    key: "value",
    another_key: "another_value"
  }
}

Types

Evolang does not require explicit type annotations. Instead, it uses structural inference to determine types based on the assigned values. For example, assigning a string literal results in a string type, while an integer literal results in an integer type. This type inference applies consistently across constructs such as attributes and fields within user-defined blocks.

For values declared in configuration—such as attributes—we recommend that tools built on top of Evolang infer the type from the provided value. For object definitions provided by plugins or drivers, types should be communicated explicitly through a schema. This enables the host tool to match inferred types against expected types, validate configurations early, and produce meaningful error messages when mismatches occur. While Evolang itself does not impose a type system, it provides packages exposing functionality to validate types from Evolang expressions.

The supported types in Evolang are:

  • String: A sequence of characters enclosed in double quotes.
  • Integer: A whole number without a decimal point.
  • Float: A number with a decimal point, an exponent, or both.
  • Boolean: A value that is either true or false.
  • List: An ordered collection of values enclosed in square brackets.
  • Map: An unordered collection of key-value pairs enclosed in curly braces.
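
As an illustration of inferring a type from a value and matching it against a schema, consider the Python sketch below. It is not the API of any Evolang package: infer_type, check_schema, and the schema layout (attribute name mapped to a type name) are hypothetical, and Python values stand in for evaluated Evolang values.

```python
def infer_type(value) -> str:
    """Map an evaluated value to one of the type names listed above."""
    # bool is tested before int because Python's bool is a subclass of int.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list):
        return "List"
    if isinstance(value, dict):
        return "Map"
    raise TypeError(f"unsupported value: {value!r}")


def check_schema(attributes: dict, schema: dict) -> list:
    """Compare inferred attribute types with the types a schema expects."""
    errors = []
    for name, expected in schema.items():
        if name not in attributes:
            errors.append(f"{name}: missing")
            continue
        actual = infer_type(attributes[name])
        if actual != expected:
            errors.append(f"{name}: expected {expected}, got {actual}")
    return errors
```

A host tool could run a check of this kind before applying a configuration, so that a mismatch is reported with the attribute name and both types.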

Attributes

Attributes are standalone name-value pairs that are not nested within any other structure. They follow the syntax identifier "name": value, where the identifier is a word such as variable, the name is a string literal, and the value is an expression. The value is optional; when it is omitted, the colon is omitted as well. Attributes define simple properties or settings and are typically referenced elsewhere in the configuration as inputs to other constructs.

attribute_decl = identifier string_lit [ ":" expr ] .
attribute_ref  = "var" "." identifier .

Declaring an attribute without a value:

variable "instance_type"

Declaring an attribute with a value:

variable "instance_type": "t2.micro"

Objects

Objects represent top-level resources and entities. They follow the [identifier] object_type "name" { ... } syntax pattern, where the object type is typically namespaced (like postgres::table) and the quoted name provides a unique identifier for each instance. An optional identifier can clarify an object’s purpose within the domain-specific language. Objects can contain attributes and (nested) blocks to describe their structure and behavior.

object_decl     = [identifier] object_type string_lit [for_clause] block .
object_type     = identifier | namespaced_identifier .
namespaced_identifier = identifier "::" identifier { "::" identifier } .
block           = "{" { (attribute | nested_block) } "}" .
attribute       = identifier ":" expression .
nested_block    = identifier block .
for_clause      = "for" identifier "in" expression .

To define an object in Evolang, specify its type, name, and any attributes or blocks. The object type is typically namespaced to ensure clarity and avoid naming conflicts, reflecting the domain or plugin that provides the resource. Namespaces are defined using the :: syntax, allowing for hierarchical organization of object types defined by the plugin.

Example of a postgres::table object:

postgres::schema "users" {
  name: "users"
}

postgres::table "users" {
  name:   "users"
  schema: postgres::schema.users
  
  column {
    name:  "id"
    type:  "serial"
    null:  false
  }

  primary_key {
    name:    "users_pkey"
    columns: ["id"]
  }
}

Objects can include a for loop to dynamically create multiple instances of the same object type. The for loop is defined using the for keyword, followed by an iterator variable, the in keyword, and a list. The iterator variable represents each element in the list, while the list specifies the values over which the loop iterates.

aws::ec2::instance "example" for i in range(2) {
  name:          "server-${i}"
  ami:           "ami-0c55b159cbfafe1f0"
  instance_type: "t2.micro"
}

In this example:

  • The for loop iterates over the range(2) list, which generates the values 0 and 1.
  • For each value in the list, a new instance of the aws::ec2::instance object is created.
  • The name attribute is dynamically set using string interpolation, where ${i} is replaced with the current value of the iterator variable.

To maintain simplicity and readability, object loops are restricted to a single for loop. Nesting for loops within objects is not permitted.
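
Conceptually, the for clause expands one declaration into one instance per list element. The Python sketch below mirrors the example above; the function expand_for_clause and the dictionary layout of an instance are hypothetical, and how Evolang names or addresses the resulting instances is not covered here.

```python
def expand_for_clause(object_type, name, items, make_attributes):
    """Create one instance description per element of items."""
    return [
        {"type": object_type, "name": name, "attributes": make_attributes(item)}
        for item in items
    ]


instances = expand_for_clause(
    "aws::ec2::instance",
    "example",
    range(2),  # stands in for Evolang's range(2): the values 0 and 1
    lambda i: {
        "name": f"server-{i}",  # mirrors the interpolation "server-${i}"
        "ami": "ami-0c55b159cbfafe1f0",
        "instance_type": "t2.micro",
    },
)
```

Here instances holds two entries whose name attributes are "server-0" and "server-1".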

Blocks

Blocks serve as structured containers that group related configurations. In Evolang, blocks appear in two primary contexts:

  1. Standalone blocks: Top-level constructs defined using the block_name [string_lit] { ... } syntax.
  2. Object bodies: Blocks that form the body of an object declaration, enclosing the object’s attributes and nested blocks.

All blocks share common characteristics: they enclose a set of attributes and can contain nested blocks, creating a hierarchical structure.

block           = "{" { attribute | nested_block } "}" .
attribute       = identifier ":" expression .
nested_block    = identifier block .

Standalone blocks

Standalone blocks are top-level constructs that typically represent domain-specific configurations. Languages built on top of Evolang commonly define their own block types to represent specific constructs in their domain.

plugin "postgres" {
  version:  "0.2.4"
  host:     var.host
  username: "admin"
  password: var.password
}

Object bodies

Object declarations always include a block that forms their body. This block follows the same syntax rules as standalone blocks:

aws::ec2::instance "example" {  // The curly braces enclose a block
  instance_type: "t2.micro"
  tags: {                       // This is a map value, not a nested block
    Name: "example"
  }
}

Nested blocks

Both standalone blocks and object bodies can contain nested blocks, which create a hierarchy of configurations. Nested blocks are defined directly within their parent block (without a colon separator).

aws::apprunner::vpc_ingress_connection "example" {
  name:        "example"
  service_arn: aws::apprunner::service.example.arn

  // "ingress_vpc_configuration" is a nested block within the object body
  ingress_vpc_configuration {
    vpc_id:          aws::ec2::vpc.example.id
    vpc_endpoint_id: aws::ec2::vpc_endpoint.example.id
  }
  
  // Another nested block
  security_groups {
    group_id: aws::ec2::security_group.example.id
  }
}

Nested blocks are particularly useful for representing structured API resources where certain configurations are logically grouped. They allow Evolang configurations to mirror the natural structure of the underlying systems they describe.

Unlike attributes (which use the name: value syntax), nested blocks do not have a value assigned with a colon. Instead, they contain their own set of attributes and potentially further nested blocks.

Declarations

Each declaration must be unique within an Evolang package and cannot be redefined. Declarations are required to be used—if a declaration is present but never referenced, a runtime error occurs.

See Evolang's documentation for more information on how declarations are used in practice.
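
The two rules above (no redefinition, no unused declarations) can be sketched as a simple check. This Python example is illustrative only: check_declarations is a hypothetical name, and declarations and references are reduced to plain strings.

```python
def check_declarations(declared, referenced):
    """Report duplicate and unused declarations within one package."""
    seen = set()
    errors = []
    for name in declared:
        if name in seen:
            errors.append(f"duplicate declaration: {name}")
        seen.add(name)
    # Anything declared but never referenced is also an error.
    for name in sorted(seen - set(referenced)):
        errors.append(f"unused declaration: {name}")
    return errors
```

For instance, declaring var.instance_type twice, or declaring it without referencing it anywhere in the package, would each produce one entry in the returned list.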

Scope

In Evolang, scope defines where an identifier can be referenced. Each directory with .evo files is a package, acting as a self-contained unit. Identifiers in a package are globally visible within it but must be imported to be used in other packages.

Evolang follows a single global scope per package, meaning that identifiers declared in one file are accessible in all other files in the same package. This allows you to define attributes and objects in separate files and reference them across the package.

Here’s an example demonstrating scope:

variable "instance_type": "t2.micro"

aws::ec2::instance "example" {
  instance_type: var.instance_type
}

Expressions

Expressions represent computations that produce values. They are used in assignments, comparisons, and operations, ranging from simple literals and variables to complex constructs involving operators. Each expression has a type, which is inferred based on its values and operators.

expr          = var_ref | int_lit | float_lit | string_lit | operation .
operation     = expr ( "+" | "-" | "*" | "/" | "%" ) expr .
var_ref       = "var" "." identifier .

Languages built on top of Evolang should enforce type safety through runtime type checking and inference.

An expression can be as simple as summing two numbers:

variable "example": 1 + 2

Or as complex as calling a function with arguments:

output "example": sum(var.a + 1, var.b + 2)

Arithmetic expressions

Arithmetic expressions in Evolang are used to perform mathematical operations on numeric values. These expressions can include addition, subtraction, multiplication, division, and modulus operations. Arithmetic expressions can be used to calculate values or assign values to attributes.

variable "example": 1 + 2 * 3 / 4 % 5

Expressions in Evolang are evaluated from left to right, adhering to the predefined operator precedence and associativity rules. Parentheses can be used to explicitly group expressions and modify the evaluation order. For instance, in the expression 1 + 2 * 3, the multiplication operation has higher precedence and is evaluated first, yielding 1 + 6. The result is then computed by performing the addition.
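
To show how precedence and left-to-right associativity group an expression, the Python sketch below adds explicit parentheses instead of computing a value. The precedence table is an assumption based on common convention (*, /, and % bind tighter than + and -), since this specification does not list the levels. The names PRECEDENCE, tokenize, and group are hypothetical, and only non-negative integers and parentheses are supported.

```python
import re

# Assumed precedence levels (higher binds tighter); all operators are left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "%": 2}
_TOKEN_RE = re.compile(r"\s*(\d+|[-+*/%()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = _TOKEN_RE.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def group(text):
    """Return the expression with explicit parentheses around every operation."""
    tokens = tokenize(text)
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            inner = binary(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("expected ')'")
            pos += 1
            return inner
        if token.isdigit():
            return token
        raise ValueError(f"unexpected token {token!r}")

    def binary(min_precedence):
        nonlocal pos
        left = operand()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_precedence:
            operator = tokens[pos]
            pos += 1
            # Parsing the right side one level higher makes the operator left-associative.
            right = binary(PRECEDENCE[operator] + 1)
            left = f"({left} {operator} {right})"
        return left

    result = binary(1)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Under these assumptions, group("1 + 2 * 3") returns "(1 + (2 * 3))", and the earlier example 1 + 2 * 3 / 4 % 5 groups as "(1 + (((2 * 3) / 4) % 5))".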

Logical expressions

Logical expressions in Evolang are used to evaluate conditions and perform logical operations. These expressions can include logical AND (&&), logical OR (||), and logical NOT (!) operations. Logical expressions are used in conditional statements, comparisons, and other contexts where boolean values are required.

variable "example": !true

Comparison expressions

Comparison expressions in Evolang are used to compare values and determine their relationship. These expressions can include equality (==), inequality (!=), less than (<), less than or equal to (<=), greater than (>), and greater than or equal to (>=) operations. Comparison expressions are used to evaluate conditions and make decisions based on the results of the comparison.

variable "example": 1 < 2

Conditional expressions

Conditional expressions in Evolang evaluate conditions and produce values based on those evaluations. They provide a way to incorporate basic decision-making logic directly into expressions, avoiding the need for complex imperative control structures. Conditional expressions are valuable for simple value selection based on runtime conditions.

If/Else expressions

Evolang supports if/else expressions for simple condition-based value selection. These expressions evaluate a condition and return one of two possible values based on whether the condition is true or false.

if_expression   = "if" "(" expression ")" expression ["else" expression] .

If/else expressions are compact and ideal for simple value selection scenarios:

aws::ec2::instance "example" {
  instance_type: if(var.environment == "production") "m5.large" else "t2.micro"
  monitoring:    if(var.enable_monitoring) true else false
}

Multiple if/else expressions can be chained together for more complex conditional logic:

network "main" {
  port:     if(var.enable_https) 443 else if(var.enable_http) 80 else 8080
  protocol: if(var.enable_https) "https" else "http"
}

For readability, parentheses can be used to group complex conditions:

firewall "default" {
  allow_traffic: if((var.environment == "dev") && (var.debug_mode)) true else false
}

If/else expressions can be used anywhere a value is expected, including in string interpolation, list elements, and as function arguments.

To maintain readability and predictability, Evolang limits if/else expression nesting to a single level. While you can chain if/else expressions (as shown earlier), deeply nested conditional structures within a single expression are not permitted. For scenarios requiring complex conditional logic with multiple branches, use a switch expression instead, which provides clearer structure for multi-condition evaluation.

Switch expression

For multi-branch conditional logic, Evolang provides switch expressions. Switch expressions evaluate an input expression and select a result based on matching patterns.

switch_expression = "switch" "(" expression ")" "{" { case_clause } [default_clause] "}" .
case_clause       = "case" expression ":" expression .
default_clause    = "default" ":" expression .

Switch expressions are useful when selecting from multiple options:

aws::ec2::instance "example" {
  instance_type: switch(var.environment) {
    case "production": "m5.large"
    case "staging":    "t2.medium"
    default:           "t2.micro"
  }
}

The switch expression evaluates the input expression (var.environment in this example) and compares it with each case expression. The first matching case determines the result. If no case matches, the default clause is used. If no default clause is provided and no case matches, an error is raised during evaluation.

Switch expressions must be exhaustive, either covering all possible values or providing a default case, to ensure a value is always produced.
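
The evaluation rules above (first match wins, fall back to default, error otherwise) can be sketched in Python. The function eval_switch and its argument layout are hypothetical; case expressions are assumed to be already evaluated and compared by equality.

```python
_NO_DEFAULT = object()  # sentinel: distinguishes "no default clause" from a None default


def eval_switch(value, cases, default=_NO_DEFAULT):
    """Return the result of the first case equal to value, else the default."""
    for candidate, result in cases:
        if candidate == value:
            return result
    if default is _NO_DEFAULT:
        raise LookupError(f"no case matches {value!r} and no default clause is present")
    return default


def instance_type_for(environment):
    # Mirrors the instance_type switch expression shown above.
    return eval_switch(
        environment,
        [("production", "m5.large"), ("staging", "t2.medium")],
        default="t2.micro",
    )
```

Calling instance_type_for("staging") selects "t2.medium", and any unlisted environment falls through to "t2.micro".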

Functions

In Evolang, functions provide a way to transform data and perform operations within expressions. Each function has a unique name and accepts zero or more arguments. The underlying application that implements Evolang defines which functions are available, their behavior, and their evaluation semantics.

function_call = identifier "(" [arguments] ")" .
arguments     = expression | expression "," arguments .

Functions can be used in any expression context, including within string interpolation. For example:

greet {
  message: "Hello, ${upper(var.name)}!"
  priority: min(var.user_level, 5)
}

Function calls are themselves expressions, so they can appear anywhere an expression is valid: in assignments, in comparisons, as arguments to other functions, and inside string literals through the ${} interpolation syntax, as shown in the example above.
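
Because the host application decides which functions exist, one plausible arrangement is a registry that maps names to implementations. The Python sketch below is an assumption, not Evolang's API: FUNCTIONS, call_function, and render_greeting are hypothetical, and upper and min are assumed to behave like their Python counterparts.

```python
# Hypothetical registry supplied by the host application.
FUNCTIONS = {
    "upper": str.upper,
    "min": min,
}


def call_function(name, *arguments):
    """Look up a function by name and apply it to the evaluated arguments."""
    try:
        function = FUNCTIONS[name]
    except KeyError:
        raise NameError(f"unknown function: {name}") from None
    return function(*arguments)


def render_greeting(name, user_level):
    # Mirrors the greet block above: one call inside a string, one as a plain value.
    return {
        "message": f"Hello, {call_function('upper', name)}!",
        "priority": call_function("min", user_level, 5),
    }
```

With name set to "ada" and user_level set to 7, the message becomes "Hello, ADA!" and the priority is capped at 5.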