Design decisions

Motivation

While motivations for the creation of mathematical knowledge are very diverse, the formal documentation of the results with definitions, theorems and proofs is quite standardized. Getting used to the details of this procedure, however, traditionally represents a major difficulty for students of German undergraduate university mathematics.

In an attempt to alleviate this situation, the research project \(\small\mathbb M\)ATh has been initiated with the goal to formulate consistent rules which model the formal aspects of writing mathematical texts and which are amenable to first year students.

In order to visualize the detailed interaction of the specified rules, a software tool – the \(\small\mathbb M\)ATh interpreter – is developed. It will supervise the generation of mathematical knowledge and produce a report which presents all involved details in the form of a hierachically structured text.

\(\small\mathbb M\)ATh projects

Technically, the interpreter analyzes a \(\small\mathbb M\)ATh project that consists of a sequence of files which are the parts of the project. This mimics the sequential structure of lectures, books or articles which naturally split into single lessons or sections.

In the first part of a project, its initial state has to be specified. This state consists of prerequisites in the form of references to objects and structures and some proven facts about them. Since this mathematical knowledge may be huge, it is organized hierarchically in the form of a tree (the tree of knowledge). Each node of this tree is a context which manages a particular subset of the total knowledge.

The generation of additional knowledge is governed by certain requests which are specified in terms of a formal language within the project files. In order to clarify to which context the generation requests refer, the initial state also specifies a context which is in focus at the beginning of the project.

Moving the focus to other contexts is possible with appropriate requests and similarly, tree enlargements with additional nodes can be requested.

While it is always possible to request certain state changes, they are not always granted, i.e. requests may succeed or fail.

As an example, consider the request to make \(A \Rightarrow B\) a fact using a direct proof (where \(A\) and \(B\) are already defined statements). As a consequence of this request, the focus is moved from the current node to a newly created temporary child context which contains \(A\) as an additional fact. Now, the subsequent requests have to ensure that the conclusion \(B\) becomes a fact (i.e. \(B\) has to be proved under the assumption of \(A\)).

Only if this goal is accomplished, the original request is successful. Then, the temporary node is removed and \(A\Rightarrow B\) becomes a fact of the original context, to which the focus is returned.

Formulating unspoken rules

While the proof strategy for \(A \Rightarrow B\), to prove \(B\) under the assumption of \(A\), is taught in every beginners course, several important details are probably not mentioned. For example, it is difficult to state precisely which facts apart from \(A\) can be used in the proof of \(B\) without refering to some sort of context structure.

In contrast, the rules in \(\small\mathbb M\)ATh are clear: in every context, the facts of the parent context are still considered facts. However all the facts established in a temporary context are no longer accessible, once the context is removed.

Within \(\small\mathbb M\)ATh, many such unspoken conventions and traditions have been transformed into rules to ensure a simple and intuitive behavior of the interpreter. These design decisions always try to mimic typical (but not every possible) usage.

For instance, it is not strictly forbidden (but rather uncommon) to leap from an unfinished proof into another context to start an unrelated consideration. Also, lessons ending in the middle of an ongoing proof may occur while lecturers clearly try to avoid it and, similarly, book sections rarely end halfway in the proof of a statement. In \(\small\mathbb M\)ATh these conventions are incorporated as rules which restrict focus movement requests in temporary proof-nodes and prohibit project parts to end while temporary nodes are still present in the tree of knowledge.

Other examples involve the rules which regulate the use of names as object references in definitions which are difficult to formulate without appropriate name space concepts. Within a \(\small\mathbb M\)ATh context, a referencing name or symbol can only be defined once. Names which are not defined in the same context may still be used if they are defined or usable in the parent context. However, a local meaning overrides the parent meaning once the name is used in a local definition.

Settings

So far, we have discussed context as organizational units which structure knowledge. But there is an important additional feature with far reaching consequences: contexts can be initalized with implicitly defined objects (the parameters of the context). In contrast to explicit definitions which merely introduce abbreviations for object expressions, implicit definitions describe objects by postulating some of their properties. Contexts with such initial settings are called settings.

In some cases, the postulated properties entail that a parameter equals an explicitly definable object. Solving equations is a prominent example of this: let \(x,y\in\mathbb R\) satisfy \(2x + 3y = 5\) and \(x-y = 1\). In this context, we can prove the facts \(x = \frac{3}{5}\) and \(y = \frac{8}{5}\).

In other cases, the postulated properties may be contradictory: let \(x,y\in\mathbb R\) satisfy \(2x + 3y = 5\) and \(x-y = 1\) and \(3x +2y = 5\). Again we find \(x =\frac{3}{5}\) and \(y = \frac{8}{3}\) but now we can also prove \(6 = 5\) which contradicts the fact \(6\neq 5\) which is known in the parent context. Such contradictory settings are useless, because any statement can be proven by contradiction here.

In the most relevant use-case of settings, however, their parameters are neither contradictory nor uniquely defined, for instance: let \(x,y\in\mathbb R\) satisfy \(2x + 3y = 5\). Now, there are many concrete choices like \(x = 7, y = -3\) or \(x = 0, y = \frac{5}{3}\) which fit to the postulates and there are others like \(x = 0, y = 0\) that don’t.

Using the postulates together with facts from the parent context about computations in \(\mathbb R\) we can now prove, for example, \(y = -\frac{2}{3}x +\frac{5}{3}\). The same proof can literally be repeated in the parent context once we define, for example, \(x: = 7\) and \(y:= -3\). Since these two numbers share the properties of the corresponding parameters and all other facts are available in the parent context, the proof will work just as well.

Similarly, we could copy the proof for the pair \( (0, \frac{5}{3})\) and all other pairs of numbers which satisfy the postulates in place of \(x\) and \(y\).

In other words, settings act like proof recipes (theorems) with parameters taking the role of ingredients.

Moreover, each setting generates a predicate (or set) in the parent context because, as any other recipe, it attaches an applicability label to each object constellation, indicating whether it could serve as ingredients or not.

In classical notation, the set corresponding to the setting: “let \(x,y\in\mathbb R\) satisfy \(2x + 3y = 5\)” could be introduced as \(S:=\{(x,y)\in\mathbb R^2: 2x + 3y = 5\}\) which obviously carries identical information.

However, set and settings have different purposes which can be illustrated with the
geometrical interpretation in this case. In fact, \(S\) can be considered as a collection of points which constitutes a straight line in the plane. With the fact \((7, -3)\in S\) we can express that the particular point \((7, -3)\) is located on this straight line.

Putting our focus into the setting, we can reformulate the postulates as \((x, y)\in S\) which means that our ingredients constitute some point on the line. Since the conditions do not restrict \((x,y)\) any further, we could call \((x, y)\) a general point and any subsequent construction like \(P:=\{(u,v)\in \mathbb R^2: 3u -2v = 3x -2y\}\) could be called a general construction.

One can now show that \(P\) is also a straight line which runs through the point \((x,y)\) and which is perpendicular to \(S\). Since both the construction of \(P\) and the subsequent proof can literally be repeated for any element of \(S\), the setting has turned into an object-recipe which assigns to any point of \(S\) the perpendicular line through this point. In other words, settings act like functions.

Altogether, we see that the combination of contexts and implicit object definitions captures important mathematical concepts like predicates, sets, functions and theorems. Even axiomatic theories, structures and mathematical models can be described with this approach.

Having recognized settings (i.e. mathematical recipes) as central units of mathematics it is quite reasonable to exploit this observation even in early mathematics education since recipes are ubiquous and understood even by preschool children.

More unspoken rules

In \(\small\mathbb M\)ATh, structures like metric spaces can be described with settings. In this case, the parameters are a set \(X\) and a real valued function \(d\) on \(X^2\). Three additional properties ensure that \(d\) behaves like a distance measure (a metric) on the elements of \(X\).

How metric spaces may be used in \(\small\mathbb M\)ATh now depends on the usage rules of settings which can be formulated because the setting syntax provides a clear structural framework. For example, that \(\mathbb R\) with \(\delta: \mathbb R \to\mathbb R, (x,y)\mapsto |x-y|\)forms a metric space can be stated as \((\mathbb R,\delta)\in\mathcal M\) if \(\mathcal M\) is the set (or proper class) associated to the setting.

Concepts like open or closed sets defined in the metric space setting depend on the parameters \((X, d)\) and therefore appear as functions when viewed from the parent context. In order to state that the interval \([0,1]\) is closed, we thus have to convert the corresponding setting definition into a function \(c\) and write \([0, 1]\in c(\mathbb R,\delta)\).

Students who are well aware of settings are less likely to be tricked with the popular exam question: “Is the interval \((0, 1)\) closed?” In fact, the formal translation \((0, 1)\in c(?,?)\) immediately reveals that the statement is incomplete so that the answer depends on the set and the metric.

In traditional approaches, where settings are not formalized, precise usage rules for structures are typically omitted and students need to grasp the details intuitively.

Other unspoken rules occur due to the traditional understanding that ZFC set theory is the axiomatic base of (the majority of) contemporary mathematics. However, the goal of ZFC is to model mathematics with a minimalistic selection of rules instead of a selection which allows to quickly enter topics like analysis or linear algebra.

This poses a dilemma because jumping from a minimalistic beginning to a more
comfortable state of the theory inevitably skips a lot of rules and theorems. Also other compromises like resorting to Cantor’s “definition” of sets leads to problems as
precision is sacrificed right from the start.

In \(\small\mathbb M\)ATh, a solution to this problem is obtained by using a foundation that differs from ZFC set theory. For example, sets, functions and basic elements like natural numbers are treated as distinguishable objects and are not reduced to sets. This saves time and is often a better model of actual usage. But it also entails various typing rules which substitute corresponding results in the ZFC approach which may traditionally be skipped.