Laurenz's Blog

Types and Context

This blog post discusses some ideas about how we could make Typst simpler and more expressive. In particular, it talks about custom elements, types, introspection, and “get rules”.

Elements

Typst has the concept of elements that encode semantic entities in a document: Things like a figure, an image, but also just text are elements. The structure of a document is defined by a tree of elements, and different concrete presentations can be derived from this tree through show rules. Moreover, Typst has the concept of an element function, which is basically a constructor for an element. (This concept is not really explained in the documentation and is sometimes a source of confusion.)

Why do we need elements and show rules though? Can’t we just have functions that directly produce visual output? The answer is decoupling: By separately defining the semantic model of an entity (as an element) and its visual presentation (through a show rule), we can display the same semantic element in different ways, through different show rules, and thereby separate content from presentation. We also have different options on how we want to achieve this: (1) We can write separate sets of show rules for different targets. This is similar to having different CSS files for the same document. (2) We can write a single unified set of show rules that internally uses if conditionals to check which output format we are targeting. This is more akin to media queries.

Apart from separation of concerns, elements give us interoperability: A heading created by package A can be styled through a show rule defined by package B. This directly motivates custom elements: Why should the set of semantic entities be restricted to built-in elements? A library-defined entity benefits just as much from separation of concerns and interoperability as a built-in one.

However, there are no custom elements in Typst at the moment. Instead, packages use functions for things that conceptually should be elements. Unfortunately, this severely hinders styling and customization of package “elements”: Most packages either use .with(..) overrides or expose some state()-based solution for global configuration. Both of these solutions have problems:
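As a rough illustration, the two patterns might look like this. The note function and the note-color state key are hypothetical, and the state variant assumes the callback-based display API that predates context expressions:

```typst
// Hypothetical package "element", implemented as a plain function.
#let note(color: blue, body) = block(stroke: color, inset: 6pt, body)

// Pattern 1: a `.with(..)` override. It rebinds the name `note` in the
// current scope only; code that holds the original function still uses
// the default color.
#let note = note.with(color: red)
#note[Styled through the override.]

// Pattern 2: state-based global configuration exposed by the package.
#let note-color = state("note-color", blue)
#let configurable-note(body) = note-color.display(
  // The callback receives the state's value at this point in the document.
  color => block(stroke: color, inset: 6pt, body),
)

#note-color.update(green)
#configurable-note[Styled through state.]
```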

Both of these problems would be solved by set rules on custom elements. The most direct way to integrate custom elements into Typst (as it is today) would be custom element functions, with a syntax like let element mything(..) = ... However, upon closer inspection, functions aren’t really the right concept to model elements: Elements are data and not computations. What we would write on the right-hand side of an element function’s definition is really the default show rule of the element. So, what is the right tool to model data? The answer to this is simple: Types.

By modelling elements as types instead of functions, I believe we can also clear up a common confusion in Typst: Why can you set arguments on certain functions, but not on others? The answer lies again in composability. Consider the following snippet of markup:

#let it = heading[Hello]
#set heading(numbering: "1.")
#it

The heading stored in it is affected by the numbering set rule even though the heading function was called before the set rule was in effect. This way, we get our desired composability: A heading created by some package is affected by our set and show rules even though it is independent of our local variable scope.

Set rules do not simply pre-set an argument for the remainder of the scope. Instead, they apply configuration to the subtree of content generated by the scope they are in. They manipulate a tree of data, not the flow of computation. Because they manipulate elements, they only work for arguments of element functions, not arguments of normal computational functions.
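The subtree behavior can be seen with today's built-in heading element. This is a minimal sketch; the variable name title is only for illustration:

```typst
// The heading value is created before any set rule exists.
#let title = heading[Scoped]

#[
  // The set rule configures the content produced by this block,
  // including the heading value that was created outside of it.
  #set heading(numbering: "1.")
  #title
]

// The same value placed outside the block is not affected
// and renders without a number.
#title
```

The same value is displayed twice with different results, because the set rule acts on the place in the content tree where the heading ends up, not on the call that created it.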

In a world where we model elements as types instead of functions, set rules thus only work on types and not on functions. This is a much clearer distinction than that of normal vs element functions. It’s still not super trivial to understand when a settable property can be observed on a value, but knowledge of this is also typically only required for more advanced use cases.

Types

As we discussed, elements are data, so it makes sense to model them as types rather than functions. This design has the added benefit that custom types can be generally useful for advanced scripting. Modelling an element as a type could look like this:

#type heading {
  field level = 1
  field numbering = none
  field supplement = auto
  field outlined = true
  field bookmarked = auto
  field body

  // This is the default show rule for headings.
  // It can be overridden by `show heading: ..`.
  show: it => block({
    if it.numbering != none {
      counter(heading)
      h(0.3em, weak: true)
    }
    it.body
  })
}

Within a type definition, there would be fields, scoped bindings, and optionally a default show rule.

  1. Fields associate data with an instance of the type. They can have default values, which makes them optional to specify in the type’s (automatically generated) constructor. Moreover, optional fields could be used with set rules of that type. Fields without default value would be required in the constructor and couldn’t be used with set rules (it wouldn’t make sense because they are always already inherently defined for each instance of the type).

  2. Scoped bindings would become available in the type’s scope. For example, let zero = .. would allow us to write point.zero. Similarly, let add(..) = .. would let us write point.add(..). Moreover, functions in a type’s scope could be called as methods on instances of that type. We could write a.add(b) because add would be defined in the scope of type(a). Here, add is a normal function. Methods wouldn’t be a separate concept from functions; they would just be an alternative way to call a function in a type’s scope. The argument names in add are also arbitrary: we could have used self instead of a, should we prefer that.

  3. The type can have a default show rule that defines its visual presentation. If we omit that default show rule for the type, it becomes just its repr (i.e. its name and its fields).

Note that this design can accommodate type hints down the road, but it doesn’t require them right away. A more classical type could look like this:

#type point {
  field x
  field y

  let zero = point(0, 0)
  let add(a, b) = point(a.x + b.x, a.y + b.y)
}

#let z = point.zero
#let a = point(2, 3)
#let b = point(1, 4)
#let r = point(3, 7)

#assert.eq(a.add(b), r)
#assert.eq(point.add(a, b), r)

If we model elements as normal types, certain things that were specific to content now affect all values.

First, it would mean that any type can have set and show rules, not just special elements. This makes possible some things that just don’t work right now (which is confusing to people): Right now, an integer or a float is eagerly converted to text when it is interpolated into content. As a result, this conversion cannot be affected by set rules (as discussed above, they can only affect the transformation of the tree of elements, not immediate computation). This means that something like setting the decimal separator with a set rule is not possible. If the integer were retained as a value, however, and only converted to text through its show rule, this problem would go away.

Second is the question of what happens to functions that currently take content. Since anything can be displayed, these could be changed to take any. However, this means that functions that accept content or some value (like auto or none) would have somewhat nonsensical signatures like auto | any. I’m not sure whether I consider this a big problem, but it’s still somewhat of an open question how we deal with that. We could also continue to have a notion of content and restrict which types “implement that interface”.

Third are locations and labels. Supporting them on arbitrary values does complicate things somewhat. Even an integer could have a label, which is slightly crazy. But I think overall it is not that problematic as long as we implement it efficiently (i.e. keep the overhead minimal for values that don’t have a label). It’s also kinda nice if we can query for anything we label and put into the document, even an integer. Locations fit in with some other thoughts I had, so I will postpone talking about them for a bit.

Context

As it stands, locate, style, and layout are three separate instances of the same pattern: A function that takes a closure and returns content, which evaluates the closure lazily and possibly multiple times. Precisely this is the important aspect of them: While a normal block of code runs exactly once, code wrapped in such a function can run never, just once, or more than one time. Moreover, the arguments these functions provide (a location, a styles object, and a size dictionary, respectively) may be different during each invocation as they are context-dependent.
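For reference, here is a minimal sketch of the three functions side by side, assuming the callback-based API discussed in this post (the displayed text is only for illustration):

```typst
// Each function takes a closure and returns content. The closure is
// evaluated lazily, once the respective context is known.

// `locate` provides a location.
#locate(loc => [This is page #loc.page().])

// `style` provides a styles object, e.g. for measuring.
#style(styles => [Width: #measure([Hello], styles).width])

// `layout` provides a dictionary with the available width and height.
#layout(size => [Available width: #size.width])
```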

The current setup is quite confusing for a number of reasons:

I believe that the fundamental problem is not how these things work internally, but how we expose them. My proposal is the following:

Consider this piece of code:

#locate(loc => {
  style(styles => {
    let last = query(heading, loc).last()
    let size = measure(last.body, styles)
    [Width is #size.width]
  })
})

Using a context expression, this would look like this:

#context {
  let last = query(heading).last()
  let size = measure(last.body)
  [Width is #size.width]
}

Implementation-wise not that much has changed. User-facing, it is quite different though:

The motivation for context expressions doesn’t quite end here, though. As we discussed before, settable properties affect an element where it is used, not where it is created. This means the value of a settable property is contextual. A “get rule” can thus only work within an established context. Currently, it could be implemented roughly like this:

#style(styles => styles.get(text).fill)

Compare this with a context expression:

#context text.fill

Particularly for this case, it is very useful to have the context implicit instead of explicit because it greatly reduces the syntactical overhead. And there is one more nice trick: Because contextual values would be a first-class language concept, we could potentially also allow contextual values in other places that can deal with them:

// make text darker
#set text(fill: context text.fill.darken(20%))

Just like context could be used in a few more places like that, context could also be provided by other places: Show rules, for instance, run once per occurrence of a value in the document. Thus, they could just as well provide context, saving us an extra context expression.

Similarly, numberings could be smartly resolved with the correct context. Right now, there is the following tricky bug: If you define a numbering that internally displays another counter, that counter will show different values depending on where the numbering is applied: In the table of contents, in a reference, etc. We could fix this problem by letting these places provide the correct context.

Locations

Let me come back to locations now. These are somewhat “magic” as an opaque type that gives access to page, x, and y. Perhaps they feel magic because of their naming. They are named after how they are used, not what they really are. If you dig down into the inner workings, a location is really just a unique ID for an element. If you write locate(loc => ..), then you get back a LocateElem which holds a unique ID. If you query semantic elements like headings, you get access to their unique IDs.

Now note that labels are also sort of IDs, just not necessarily unique. I think there is some potential for unification here. We could merge labels and unique IDs: Let Typst generate a unique label and then combine it with a normal named label. Such merged labels would be useful in abstractions that consist of fixed elements that need labels for some automation, but the whole thing might appear multiple times in a document. If we just use normal labels, there is a collision between these instances that we need to handle, and it also might collide with user-defined labels.

Interestingly enough, this concept is already used internally in the compiler. The Location type has a .variant(index: usize) method for attaching a well-known location to a generated element. This is used for two-way-linking between citations and bibliography references without tons of back-and-forth introspection.

Getting back to context expressions: A function that generates a unique label for us could be just yet another contextual function. The location method on content would be superseded by a function that takes a value and returns its unique ID (if present) as a label. And to get access to page, x, y we would have yet another contextual function that takes a label (or a value with a unique label) and gives its position. And the perfect name for that function is locate. :)

#context {
  let elem = query(heading).first()
  let (x, y) = locate(elem)
  [Heading is at (#x, #y)!]
}

Async/await?

I would like to emphasize once more the central observation: Context values are not special because they depend on context that is not available when normal code runs. Rather, they are special because the code defining the contextual value may need to run multiple times for different contexts. This makes it distinctly different from something like async/await, where computation is delayed, but still happens just once. While the return value of a context expression is sort of like a Promise or Future, it is different because it may run multiple times. (I’m not sure how this relates to algebraic effects. From my limited understanding, they do allow for multiple resumptions. I am not sure how they fit into this; perhaps this is sort of an algebraic effect.)
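To make the "runs multiple times" point concrete, here is a small sketch in the proposed context syntax. The variable name current-size is hypothetical, and the comments describe the behavior under the semantics proposed above:

```typst
// The expression after `context` is not evaluated here. It is stored
// as a contextual value and evaluated wherever that value is placed.
#let current-size = context text.size

#set text(size: 10pt)
#current-size // evaluated in a context where the text size is 10pt

#set text(size: 20pt)
#current-size // evaluated again, now with a text size of 20pt
```

A single definition thus yields two different results, which is what a Promise or Future, resolving exactly once, cannot express.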

While we’re at async/await: In theory, we could have syntactic sugar to unnest context and basically pull the remaining scope into a context block. This would be akin to implicit awaiting. However, I do not think this is a good idea for three reasons:

  1. It would mean that any function dealing with contextual things internally would also return a contextual value. It would not be possible to have e.g. a function that returns an array of contextual values.

  2. By trying to hide what code may run multiple times, it makes it harder to reason about the interaction of different pieces of code. When a computation diverges into a code segment that may run multiple times, can it ever converge again? Even if algebraic effects are some kind of silver bullet, I don’t see how they could fix that. (I would be happy to stand corrected though!)

  3. It hurts efficiency and incremental compilation: Think back to why query currently takes a location parameter: to reduce the number of things it can affect. Allowing top-level desugared context runs into the same issue where a whole module becomes context-dependent. And even if we only allow context access within a function (akin to no top-level await), which would be even more confusing in my opinion, the scope might still be a lot larger than necessary.

Recap

That’s it. Here’s what we talked about: