Types and Context
This blog post discusses some ideas about how we could make Typst simpler and more expressive. In particular, it talks about custom elements, types, introspection, and “get rules”.
Elements
Typst has the concept of elements that encode semantic entities in a document: Things like a figure, an image, but also just text are elements. The structure of a document is defined by a tree of elements and different concrete presentations can be derived from this tree through show rules. Moreover, Typst has the concept of an element function, which is basically a constructor for an element. (This concept is not really explained in the documentation and sometimes a source of confusion.)
Why do we need elements and show rules though? Can’t we just have functions that directly produce visual output? The answer is decoupling: By separately defining the semantic model of an entity (as an element) and its visual presentation (through a show rule), we can display the same semantic element in different ways, through different show rules, and thereby separate content from presentation. We also have different options on how we want to achieve this: (1) We can write separate sets of show rules for different targets. This is similar to having different CSS files for the same document. (2) We can write a single unified set of show rules that internally uses if conditionals to check which output format we are targeting. This is more akin to media queries.
Apart from separation of concerns, elements give us interoperability: A heading created by package A can be styled through a show rule defined by package B. This directly motivates custom elements: Why should the set of semantic entities be restricted to built-in elements? A library-defined entity benefits just as much from separation of concerns and interoperability as a built-in one.
However, there are no custom elements in Typst at the moment. Instead, packages use functions for things that conceptually should be elements. Unfortunately, this severely hinders styling and customization of package “elements”: Most packages either use .with(..) overrides or expose some state()-based solution for global configuration. Both of these solutions have problems:
- .with(..) only affects the current variable scope; it does not affect “elements” produced by some unrelated code. This reduces composability: Imagine that headings created by some package you use wouldn’t be affected by your heading show rules.
- state()-based solutions are typically global: Styling things locally thus becomes more complex or impossible.
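The scoping problem with .with(..) can be sketched in Typst as it is today. The names note and theorem are hypothetical stand-ins for functions a package might export; they are not part of any real package.

```typst
// Hypothetical package function standing in for a custom "element".
#let note(body, color: blue) = block(
  fill: color.lighten(80%),
  inset: 6pt,
  body,
)

// Hypothetical unrelated code that produces a note internally.
#let theorem(body) = note[*Theorem.* #body]

// Our override only rebinds the name `note` in the current scope ...
#let note = note.with(color: red)

// ... so this note uses the red default,
#note[Created through the local binding.]

// while `theorem` still refers to the original `note` and stays blue.
#theorem[Created by unrelated code.]
```

A set rule on a real element would reach both cases, because it acts on the content tree rather than on the variable binding.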
Both of these problems would be solved by set rules on custom elements. The most direct way to integrate custom elements into Typst (as it is today) would be custom element functions, with a syntax like let element mything(..) = .. for their definition. However, upon closer inspection, functions aren’t really the right concept to model elements: Elements are data and not computations. What we would write on the right-hand side of an element function’s definition is really the default show rule of the element. So, what is the right tool to model data? The answer to this is simple: Types.
By modelling elements as types instead of functions, I believe we can also clear up a common confusion in Typst: Why can you set arguments on certain functions, but not on others?
The answer lies again in composability. Consider the following snippet of markup:
#let it = heading[Hello]
#set heading(numbering: "1.")
#it
The heading stored in it is affected by the numbering set rule even though the heading function was called before the set rule was in effect. This way, we get our desired composability: A heading created by some package is affected by our set and show rules even though it is independent of our local variable scope.
Set rules do not simply pre-set an argument for the remainder of the scope. Instead, they apply configuration to the subtree of content generated by the scope they are in. They manipulate a tree of data, not the flow of computation. Because they manipulate elements, they only work for arguments of element functions, not arguments of normal computational functions.
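A minimal sketch of this subtree behavior, using only built-in elements: the same stored heading is placed twice, once inside and once outside the scope of a set rule.

```typst
#let it = heading[Hello]

#[
  // The set rule applies to all content placed in this block.
  #set heading(numbering: "1.")
  // Numbered: the heading is placed inside the set rule's scope.
  #it
]

// Not numbered: the same value is placed outside of that scope.
#it
```

What matters is where the heading ends up in the content tree, not where or when the heading function was called.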
In a world where we model elements as types instead of functions, set rules thus only work on types and not on functions. This is a much clearer distinction than that of normal vs element functions. It’s still not super trivial to understand when a settable property can be observed on a value, but knowledge of this is also typically only required for more advanced use cases.
Types
As we discussed, elements are data, so it makes sense to model them as types rather than functions. This design has the added benefit that custom types can be generally useful for advanced scripting. Modelling an element as a type could look like this:
#type heading {
  field level = 1
  field numbering = none
  field supplement = auto
  field outlined = true
  field bookmarked = auto
  field body
  // This is the default show rule for headings.
  // It can be overridden by `show heading: ..`.
  show: it => block({
    if it.numbering != none {
      counter(heading)
      h(0.3em, weak: true)
    }
    it.body
  })
}
Within a type definition, there would be fields, scoped bindings, and optionally a default show rule.
- Fields associate data with an instance of the type. They can have default values, which makes them optional to specify in the type’s (automatically generated) constructor. Moreover, optional fields could be used with set rules of that type. Fields without default value would be required in the constructor and couldn’t be used with set rules (it wouldn’t make sense because they are always already inherently defined for each instance of the type).
- Scoped bindings would become available in the type’s scope. For example, let zero = .. would allow us to write point.zero. Similarly, let add(..) = .. would let us write point.add(..). Moreover, functions in a type’s scope could be called as methods on instances of that type. We could write a.add(b) because add would be defined in the scope of type(a). Here, add is a normal function. Methods wouldn’t be a separate concept from functions, they would just be an alternative way to call a function in a type’s scope. The argument names in add are also arbitrary, we could have also used self instead of a should we prefer that.
- The type can have a default show rule that defines its visual presentation. If we omit that default show rule for the type, it becomes just its repr (i.e. its name and its fields).
Note that this design can accommodate type hints down the road, but it doesn’t require them right away. A more classical type could look like this:
#type point {
  field x
  field y
  let zero = point(0, 0)
  let add(a, b) = point(a.x + b.x, a.y + b.y)
}
#let z = point.zero
#let a = point(2, 3)
#let b = point(1, 4)
#let r = point(3, 7)
#assert.eq(a.add(b), r)
#assert.eq(point.add(a, b), r)
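For comparison, here is a rough approximation of the same point in Typst as it is today, using a dictionary and free functions. This is only a sketch: the name point-add is made up for illustration, and the approach offers neither a.add(b) method syntax nor set and show rules.

```typst
// A "constructor" that simply returns a dictionary.
#let point(x, y) = (x: x, y: y)

// A free function instead of a scoped binding on the type.
#let point-add(a, b) = point(a.x + b.x, a.y + b.y)

#let a = point(2, 3)
#let b = point(1, 4)

// Dictionaries compare by value, so this assertion holds.
#assert.eq(point-add(a, b), point(3, 7))
```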
If we model elements as normal types, certain things that were specific to content now affect all values.
First, it would mean that any type can have set & show rules, not just special elements. This makes possible some things that just don’t work right now (which is confusing to people): Right now, an integer or a float is eagerly converted to text when it is interpolated into content. As a result, this conversion cannot be affected by set rules (as discussed above, they can only affect the transformation of the tree of elements, not immediate computation). This means that something like setting the decimal separator with a set rule is not possible. If the integer was retained as a value, however, and only converted to text through its show rule, this problem would go away.
Second up is the question of what happens to functions that currently take content. Since anything can be displayed, these could be changed to take any. However, this means that functions that accept content or some value (like auto or none) would have somewhat nonsensical signatures like auto | any. I’m not sure whether I consider this a big problem, but it’s still somewhat of an open question how we deal with that. We could also continue to have a notion of content and restrict which types “implement that interface”.
Third are locations and labels. Supporting them on arbitrary values does complicate things somewhat. Even an integer could have a label, which is slightly crazy. But I think overall it is not that problematic as long as we implement it efficiently (i.e. keep the overhead minimal for values that don’t have a label). It’s also kinda nice if we can query for anything we label and put into the document, even an integer. Locations fit in with some other thoughts I had, so I will postpone talking about them for a bit.
Context
As it stands, locate, style, and layout are three separate instances of the same pattern: A function that takes a closure and returns content, which evaluates the closure lazily and possibly multiple times. Precisely this is the important aspect of them: While a normal block of code runs exactly once, code wrapped in such a function can run never, just once, or more than one time. Moreover, the arguments these functions provide (a location, a styles object, and a size dictionary, respectively) may be different during each invocation as they are context-dependent.
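The shared pattern becomes visible when the three functions are placed side by side. This is a minimal sketch that assumes the callback-based API as described in this post; the displayed texts are arbitrary.

```typst
// `locate` passes a location to the closure.
#locate(loc => [Page #loc.page()])

// `style` passes a styles object, which `measure` needs.
#style(styles => [Width: #measure([Hello], styles).width])

// `layout` passes a dictionary with the available width and height.
#layout(size => [Available width: #size.width])
```

In all three cases, the call itself only produces content. The closure runs later, once the respective piece of context is known.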
The current setup is quite confusing for a number of reasons:
- Lack of conceptual generality: An experienced Typst user might see the conceptual parallels between locate, style, and layout. A beginner, however, will not know about this. Because the callback architecture is currently a pattern rather than an actual language concept, there is no good place to document it and no good way to teach it.
- Bad diagnostics: People frequently try to write things like let final = locate(loc => counter.final(loc)); #(final + 1). The compiler will then unhelpfully say that it could not add content and an integer.
- Confusing API: Some functions like query and counter.final take in a location parameter and then they just discard it. This parameter has the sole purpose of ensuring that the function is called within a locate context. Note that this restriction is not strictly necessary from a language-design perspective. It is motivated by implementation reasons: By encapsulating all queries in locate contexts, we can reduce the scope of incremental recomputations from a whole module to a small block of code. If we allowed top-level queries, things like module exports could depend on queries, meaning a module wouldn’t be a well-defined self-contained entity without an introspection context.
I believe that the fundamental problem is not how these things work internally, but how we expose them. My proposal is the following:
- We introduce the explicit notion of contextual functions. Such functions depend on some piece of information that differs across the document and as such they cannot run without an established context. The documentation clearly marks them as contextual and states what piece of context they depend on.
- We introduce a context expression. This expression wraps another expression and provides context to it implicitly. When you call a contextual function outside of an established context, the compiler produces a helpful error message with the hint to use a context expression. A context expression’s return value is an opaque contextual value that can be placed into the document, yielding different output depending on the context. If you try to use a contextual value for normal computation, the compiler produces a helpful error message and links to an explanation.
Consider this piece of code:
#locate(loc => {
  style(styles => {
    let last = query(heading, loc).last()
    let size = measure(last.body, styles)
    [Width is #size.width]
  })
})
Using a context expression, this would look like this:
#context {
  let last = query(heading).last()
  let size = measure(last.body)
  [Width is #size.width]
}
Implementation-wise not that much has changed. User-facing, it is quite different though:
- We can omit the loc argument to query
- We can omit the styles argument to measure
- We have only a single nesting level instead of two
- The expression is syntactically less noisy than one or even two nested closures.
The motivation for context expressions doesn’t quite end here though. As we discussed before, settable properties affect an element where it is used, not where it is created. This means the value of a settable property is contextual. A “get rule” can thus only work within an established context. Currently, it could be implemented roughly like this:
#style(styles => styles.get(text).fill)
Compare this with a context expression:
#context text.fill
Particularly for this case, it is very useful to have the context implicit instead of explicit because it greatly reduces the syntactical overhead. And there is one more nice trick: Because contextual values would be a first-class language concept, we could potentially also allow contextual values in other places that can deal with them:
// make text darker
#set text(fill: context text.fill.darken(20%))
Just like context could be used in a few more places like that, context could also be provided by other places: Show rules, for instance, run once per occurrence of a value in the document. Thus, they could just as well provide context, saving us an extra context expression.
Similarly, numberings could be smartly resolved with the correct context. Right now, there is the following tricky bug: If you define a numbering that internally displays another counter, that counter will show different values depending on where the numbering is applied: In the table of contents, in a reference, etc. We could fix this problem by letting these places provide the correct context.
Locations
Let me come back to locations now. These are somewhat “magic” as an opaque type that gives access to page, x, and y. Perhaps they feel magic because of their naming. They are named after how they are used, not what they really are. If you dig down into the inner workings, a location is really just a unique ID for an element. If you write locate(loc => ..), then you get back a LocateElem which holds a unique ID. If you query semantic elements like headings, you get access to their unique IDs.
Now note that labels are also sort of IDs, just not necessarily unique. I think there is some potential for unification here. We could merge labels and unique IDs: Let Typst generate a unique label and then combine it with a normal named label. Such merged labels would be useful in abstractions that consist of fixed elements that need labels for some automation, but the whole thing might appear multiple times in a document. If we just use normal labels, there is a collision between these instances that we need to handle, and it also might collide with user-defined labels.
Interestingly enough, this concept is already used internally in the compiler. The Location type has a .variant(index: usize) method for attaching a well-known location to a generated element. This is used for two-way-linking between citations and bibliography references without tons of back-and-forth introspection.
Getting back to context expressions: A function that generates a unique label for us could be just yet another contextual function. The location method on content would be superseded by a function that takes a value and returns its unique ID (if present) as a label. And to get access to page, x, and y we would have yet another contextual function that takes a label (or a value with a unique label) and gives its position. And the perfect name for that function is locate. :)
#context {
  let elem = query(heading).first()
  let (x, y) = locate(elem)
  [Heading is at (#x, #y)!]
}
Async/await?
I would like to emphasize once more the central observation: Contextual values are not special because they depend on context that is not available when normal code runs. Rather, they are special because the code defining the contextual value may need to run multiple times for different contexts. This makes them distinctly different from something like async/await, where computation is delayed, but still happens just once. While the return value of a context expression is sort of like a Promise or Future, it is different because it may run multiple times. (I’m not sure how this relates to algebraic effects. From my limited understanding, they do allow for multiple resumptions. I am not sure how they fit into this, perhaps this is sort of an algebraic effect.)
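The “runs multiple times” aspect can already be observed today. The following sketch assumes the callback-based locate described earlier in this post; the variable name current-page is arbitrary.

```typst
// One value, defined once ...
#let current-page = locate(loc => [Page #loc.page()])

// ... but the closure is evaluated separately for each placement,
#current-page

#pagebreak()

// so the second placement sees a different location than the first.
#current-page
```

A Promise or Future would resolve to a single result. Here, the same value yields a different page number at each place where it is inserted.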
While we’re at async/await: In theory, we could have syntactic sugar to unnest context and basically pull the remaining scope into a context block. This would be akin to implicit awaiting. However, I do not think this is a good idea for three reasons:
- It would mean that any function dealing with contextual things internally would also return a contextual value. It would not be possible to have e.g. a function that returns an array of contextual values.
- By trying to hide what code may run multiple times, it makes it harder to reason about the interaction of different pieces of code. When a computation diverges into a code segment that may run multiple times, can it ever converge again? Even if algebraic effects are some kind of silver bullet, I don’t see how they could fix that. (I would be happy to stand corrected though!)
- It hurts efficiency and incremental compilation: Think back to why query currently takes a location parameter: to reduce the amount of things it can affect. Allowing top-level desugared context runs into the same issue where a whole module becomes context-dependent. And even if we only allow context access within a function (akin to no top-level await), which would be even more confusing in my opinion, the scope might still be a lot larger than necessary.
Recap
That’s it. Here’s what we talked about:
- What elements are and why we care
- How we can model elements as types
- How methods become associated functions
- The consequences of values and elements being one and the same
- How context expressions can simplify introspection
- A context-based design for “get rules” with minimal syntactical overhead
- Why contextual code is special and different from async/await