Logiweb(TM)     Logiweb system pages  
        Tutorial T05: Syntax  
 


Introduction

The present tutorial describes the pyk language.

Prerequisites

You must have done the introductory tutorials.

The shortest possible pyk source

This is the shortest possible valid pyk source:

PAGE p
BIBLIOGRAPHY
BODY
p

To translate it, do as follows:

  • Open the Submission form in a separate window (in most browsers: right click the link and select something like 'open in new window').
  • Enter some strings in the 'org=' and 'name=' fields.
  • Enter the text above in the upper text window.
  • Use the 'level' control to change 'level=body' to 'level=parse'.
  • Click 'Submit'.

The pyk source first defines a page named 'p'. As a side effect, 'p' becomes a valid construct which can be used in the BODY section. Then the pyk source defines an empty bibliography and then a body which contains nothing but the newly defined construct 'p'.

The response from Pyk (the pyk compiler) looks like this:

Frontend
Frontend: parsing associativity sections
Frontend: parsing body
Frontend: invoking priority rules
p

The 'p' in the last line is what Pyk found in the body. When Pyk is invoked with level=parse, it just echoes the body with understood parentheses inserted. There are no understood parentheses in the body you just gave it.

Strings

Now enter this in the upper text window of the submission form:

PAGE p
BIBLIOGRAPHY

Enter this in the lower text window:

BODY
"abc"

Let level stay at level=parse, and click submit. This time, Pyk responds:

Frontend
Frontend: parsing associativity sections
Frontend: parsing body
Frontend: invoking priority rules
"abc"

The upper and lower windows of the submission form are simply concatenated before the text is sent to Pyk, so it is unimportant how the text is split between the two windows. But it is convenient to have the header in the upper window and the body in the lower.

Be sure to end the upper window with a newline character. Otherwise the text will look like this to Pyk:

PAGE p
BIBLIOGRAPHYBODY
"abc"

From now on I give pyk sources as a single text and it is up to you to split it as you like.

String escapes

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
BODY
"abc"-def"!ghi"{comment}jkl"

The Pyk response should be

Frontend
Frontend: parsing associativity sections
Frontend: parsing body
Frontend: invoking priority rules
"abcdef"ghijkl"

Inside strings, the double quote serves as an 'escape' character. Here are some conventions:

  • An escape character followed by a hyphen is ignored.
  • An escape character followed by an exclamation mark denotes one occurrence of the escape character itself.
  • If an escape character is followed by a left brace then all characters until the first right brace are ignored.
  • An escape character followed by a space or a newline or one of the eight characters ,.[]()<> marks the end of the string. In that case the space or newline or other character is not part of the string but is considered to appear after the string.
  • An escape character followed by a small 'n' represents a newline character. A newline character in a string also represents a newline character, so a newline character can be obtained either by an escape-n sequence or directly by including a newline character in the string.
  • An escape character followed by a question mark changes the escape character to be the character after the question mark. Hence, e.g., "ab"?+c+ denotes a three-letter string consisting of a, b, and c. This can be useful when working with strings that contain lots of quote characters.
  • If an escape character is followed by a semicolon then all characters until the end of the line are ignored.
  • If an escape character is followed by a plus sign then all characters until the first non-blank character are ignored.

If the facilities above are not enough to satisfy you, you can read man 5 pyk for more information.
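To make the conventions above concrete, here is a small Python sketch of a decoder. It is not how Pyk is implemented. The name decode_pyk_string is made up for illustration, the sketch covers only the hyphen, exclamation mark, left brace, 'n', question mark, and end-of-string conventions, and it assumes that running out of input right after an escape character also ends the string.

```python
# Characters that end a string when they follow the escape character.
TERMINATORS = set(" \n,.[]()<>")


def decode_pyk_string(text, start=0):
    """Decode a pyk-style string whose opening quote is at text[start].

    Returns (decoded_string, index_just_after_the_closing_escape).
    Hypothetical helper; the semicolon and plus escapes are not handled.
    """
    if text[start] != '"':
        raise ValueError("expected an opening double quote")
    escape = '"'          # the escape character may change via escape-?
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch != escape:
            out.append(ch)            # ordinary character, newline included
            i += 1
            continue
        # Assumption: end of input after an escape behaves like a newline.
        nxt = text[i + 1] if i + 1 < len(text) else "\n"
        if nxt in TERMINATORS:
            return "".join(out), i + 1   # terminator is not part of the string
        if nxt == "-":
            i += 2                       # escape-hyphen is ignored
        elif nxt == "!":
            out.append(escape)           # one occurrence of the escape itself
            i += 2
        elif nxt == "n":
            out.append("\n")
            i += 2
        elif nxt == "{":
            close = text.find("}", i + 2)
            if close < 0:
                raise ValueError("unterminated comment")
            i = close + 1                # skip everything up to the right brace
        elif nxt == "?":
            if i + 2 >= len(text):
                raise ValueError("missing new escape character")
            escape = text[i + 2]         # character after the question mark
            i += 3
        else:
            raise ValueError("escape not covered by this sketch: " + repr(nxt))
    raise ValueError("unterminated string")


decoded, _ = decode_pyk_string('"abc"-def"!ghi"{comment}jkl"')
print(decoded)
```

Applied to the string from the example, the sketch yields the letters a to l with a single double quote after the f.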

Character encoding

Internally, Logiweb and Pyk work with the Logiweb UTF-8 character encoding. Logiweb UTF-8 is the same as normal UTF-8 except that in Logiweb UTF-8, codes 0-9 and codes 11-31 are illegal characters, and code 10 and code 10 only marks the end of a line. This convention is important to ensure interoperability across different platforms.

Regardless of the convention above, Logiweb is completely unbiased concerning the host operating system. Logiweb and Pyk translate source text expressed using the conventions of the host operating system to Logiweb UTF-8 when reading and translate the other way when writing.

When Pyk reads your source text, the source is 'filtered'. The default filter removes all occurrences of code 13 (carriage return) and translates all occurrences of code 9 (tabulation) to space characters.
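As a rough illustration, the default filter and the rule about illegal codes could be mimicked in Python as follows. The function names are invented for this sketch and it says nothing about how Pyk actually implements filtering.

```python
def default_filter(text):
    """Mimic the default filter: drop code 13, turn code 9 into a space."""
    return text.replace("\r", "").replace("\t", " ")


def illegal_codes(text):
    """Return the sorted codes in text that Logiweb UTF-8 does not allow.

    Codes 0-9 and 11-31 are illegal; code 10 (newline) is the only legal
    control code.
    """
    return sorted({ord(ch) for ch in text if ord(ch) < 32 and ord(ch) != 10})


raw = "PAGE\tp\r\nBODY\r\n"
print(illegal_codes(raw))                   # codes the raw text would violate
print(illegal_codes(default_filter(raw)))   # nothing left after filtering
```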

Pyk supports more than 100 external encodings like latin1 and jis_x0201 and many others. For more on this, read about the filter option in man pyk.

Infix plus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" " + "
BODY
p + p + p

The Pyk response should be

{{p} + {p}} + {p}

The line saying

"" " + "

is a 'construct line'. A construct line consists of a string followed by a construct. The string is delimited by the first two quotes of the construct line. The construct comprises the characters after the string and up to the end of the line. For that reason, the construct line above consists of the empty string

""

followed by the construct

" + "

Inside constructs, quote characters serve as place holders. The construct above contains two quote characters, so the plus is a binary plus.
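The way a construct line falls apart into a string and a construct can be sketched in Python. The helper name split_construct_line is hypothetical, escapes inside the leading string are ignored, and stripping the blanks around the construct is an assumption of the sketch.

```python
def split_construct_line(line):
    """Split a construct line into (string, construct, arity).

    The string is delimited by the first two quotes of the line. The
    construct is the rest of the line. The arity is the number of quote
    characters (place holders) in the construct.
    """
    if not line.startswith('"'):
        raise ValueError("a construct line must start with a string")
    end = line.index('"', 1)              # second quote closes the string
    string = line[1:end]
    construct = line[end + 1:].strip()    # assumption: surrounding blanks dropped
    return string, construct, construct.count('"')


print(split_construct_line('"" " + "'))   # empty string, binary plus
```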

Relation to BNF

Now look once more at the header of the page (i.e. everything before BODY):

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" " + "

The header above defines a syntax class T. In BNF, the definition of T would read:

T ::= p | T + T | S

where S is the syntax class of strings. The body of the page (i.e. everything after BODY) is supposed to be expressed in the given grammar. The body reads

p + p + p

which indeed belongs to the syntax class T.

Associativity

The infix plus is declared to be 'preassociative' which means that it is left-associative in text which runs left-to-right, right-associative in text which runs right-to-left, and counter-clockwise-associative in text written in clockwise spirals.

The output when submitting the page with level=parse should be

{{p} + {p}} + {p}

The braces in the output indicate the understood parentheses of p + p + p. The preassociativity of plus has the effect that the first plus binds tighter than the second.
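The grouping can be imitated with a few lines of Python. This is only a sketch for a chain of operands joined by one infix operator; the name brace_chain and the 'pre'/'post' argument are made up here, and Pyk itself works quite differently.

```python
from functools import reduce


def brace_chain(operands, op, assoc="pre"):
    """Insert understood parentheses (as braces) into a chain like p + p + p."""
    def join(left, right):
        return "{" + left + "} " + op + " {" + right + "}"

    if assoc == "pre":
        # Group from the left: the first operator binds tightest.
        return reduce(join, operands)
    if assoc == "post":
        # Group from the right: the last operator binds tightest.
        return reduce(lambda right, left: join(left, right), reversed(operands))
    raise ValueError("assoc must be 'pre' or 'post'")


print(brace_chain(["p", "p", "p"], "+", "pre"))
```

With 'pre' the sketch reproduces the output shown above. With 'post' the braces are pushed to the right instead.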

Postassociative plus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
POSTASSOCIATIVE
"" " + "
BODY
p + p + p

The Pyk response should be

{p} + {{p} + {p}}

Three nullary constructs

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" c
"" " + "
BODY
a + b + c

The Pyk response should be

{{a} + {b}} + {c}

The example is the same as the previous one except that now the syntax class T defined by the header is

T ::= p | a | b | c | T + T | S

The constructs a, b, and c are all declared to be preassociative. That has no effect. Associativity only has effect on constructs which start with and/or end with a placeholder.

Case sensitivity

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" " + "
BODY
a + B

The Pyk response should be

BODY
a + |
---
B
---
File Form input around line 8 char 5:
No interpretations
Goodbye

Prepare yourself to see the 'No interpretations' message every once in a while when using Logiweb.

Space sensitivity

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" " + "
BODY
a+b

The Pyk response should be

BODY
a|
---
+b
---
File Form input around line 8 char 2:
No interpretations
Goodbye

So the spaces around the plus are important. Until further notice, consider this irritating if you like. You will see applications later.

Multiple spaces

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" " + "
BODY
a

   +""{comment}b

The Pyk response should be

{a} + {b}

So multiple spaces and newlines count as a single space. Furthermore, comments count as spaces. By the way, notice what a comment looks like outside a string. Previously, you saw what comments look like inside strings. For more on comments, see man 5 pyk.

Infix times

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" c
"" d
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" " + "
BODY
a * b + c * d

The Pyk response should be

{{a} * {b}} + {{c} * {d}}

Now the syntax class T defined by the header is

T ::= p | a | b | c | d | T * T | T + T | S

The plus and times constructs are both preassociative, but times has higher priority than plus because the two constructs occur in different associativity sections and because times occurs before plus.
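How section order turns into priority can be illustrated with ordinary precedence climbing in Python. This is a textbook technique swapped in for illustration, not Pyk's parser. It handles only space-separated names and preassociative binary infix operators, and the name brace_infix is invented.

```python
def brace_infix(source, sections):
    """Brace a space-separated infix term.

    sections lists the binary operators of each associativity section,
    earliest section first. Earlier sections bind tighter.
    """
    bind = {}
    for depth, names in enumerate(sections):
        for name in names:
            bind[name] = len(sections) - depth   # larger number binds tighter
    tokens = source.split()
    pos = 0

    def parse(min_bind):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in bind:
            raise ValueError("No interpretations")
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and bind.get(tokens[pos], 0) >= min_bind:
            op = tokens[pos]
            pos += 1
            # Preassociative: the right operand may only hold tighter operators.
            right = parse(bind[op] + 1)
            left = "{" + left + "} " + op + " {" + right + "}"
        return left

    result = parse(1)
    if pos != len(tokens):
        raise ValueError("No interpretations")
    return result


print(brace_infix("a * b + c * d", [["*"], ["+"]]))
```

Swapping the two sections makes plus bind tighter than times, which is a quick way to see that the order of sections is what decides priority.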

Parentheses

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" c
"" ( " )
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" " + "
BODY
a * ( b + c )

The Pyk response should be

{a} * {( {{b} + {c}} )}

So the parentheses overrule the priority of multiplication and addition. This is so because a * ( b + c ) is unambiguous to begin with (it can only be parenthesized in one way) so priority and associativity rules do not even have a chance to get into play.

Infix minus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" c
"" d
"" e
"" f
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" " + "
"" " - "
BODY
a * b + c - d + e * f

The Pyk response should be

{{{{a} * {b}} + {c}} - {d}} + {{e} * {f}}

Now the syntax class T defined by the header is

T ::= p | a | b | c | d | e | f | T * T | T + T | T - T | S

The plus and minus constructs have the same priority because they appear in the same associativity section. They both have lower priority than times.

Unary minus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" - "
PREASSOCIATIVE
"" " + "
"" " - "
BODY
- a + b

The Pyk response should be

{- {a}} + {b}

Hence, - a + b is unary minus applied to a which is then added to b. To Logiweb, unary minus - a and binary minus a - b are two different constructs which have nothing particular in common.

Mixing unary and binary minus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" - "
PREASSOCIATIVE
"" " + "
"" " - "
BODY
- a - - b

The Pyk response should be

{- {a}} - {- {b}}

So Logiweb has no problems when mixing unary and binary minus.

Priority inversion

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" - "
PREASSOCIATIVE
"" " + "
"" " - "
BODY
a * - b

The Pyk response should be

{a} * {- {b}}

A term like a * b + c * d is interpreted as {{a} * {b}} + {{c} * {d}} so the principal operator (the one outside all braces) is a plus operator. Mostly, the principal operator is the one with lowest priority. But in the example above, a * - b is interpreted as {a} * {- {b}} where the principal operator (the multiplication) has higher priority than the unary minus. This is so because a * - b is unambiguous to begin with (it can only be parenthesized in one way) so priority and associativity rules do not even have a chance to get into play.

Priority inversion exercise

Try to figure out what the following would give with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" - "
PREASSOCIATIVE
"" " + "
"" " - "
BODY
- a * - b

Then try it.

Then figure out why you were wrong (if you were).

Complex priority inversion

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
"" c
"" d
"" e
PREASSOCIATIVE
"" " * "
PREASSOCIATIVE
"" " + "
PREASSOCIATIVE
"" if " then " else "
BODY
a * if b then c else d + e

The Pyk response should be

{a} * {if {b} then {c} else {{d} + {e}}}

Logiweb parenthesizes a * if b then c else d + e as follows: First, it locates the operator with lowest priority (the 'if-then-else' in our case). Then it puts braces around the 'if-then-else', including as much as possible. Thus, the left brace is put before the 'i' in 'if' and the right brace is put to the far right. In this way, d + e becomes the third argument of the if-then-else whereas the multiplication stays outside. As a result, the operator with the highest priority (the multiplication) ends up being the principal operator even though there is a binary operator (the plus) which has lower priority and which could have ended up as principal operator, had it not been captured by the even lower priority if-then-else.

Prefix, infix, and suffix minus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
PREASSOCIATIVE
"" - "
"" " - "
"" " -
BODY
- a - b -

The Pyk response should be

{{- {a}} - {b}} -

In this case, the three kinds of minus have the same priority, and associativity pushes parentheses to the left.

Postassociative minus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
POSTASSOCIATIVE
"" - "
"" " - "
"" " -
BODY
- a - b -

The Pyk response should be

- {{a} - {{b} -}}

Now associativity pushes parentheses to the right.

Ambiguous minus

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
POSTASSOCIATIVE
"" - "
"" " - "
"" " -
BODY
a - - b

The Pyk response should be

BODY
a - - |
---
b
---
File Form input around line 11 char 7:
Ambiguous parse tree
Goodbye

a - - b was more than Logiweb could take. Priority rules apply to - a - b - because the first and second minus share a parameter (the 'a') and because the second and third minus share another parameter (the 'b'). The term a - - b could be interpreted as {{a} -} - {b} or {a} - {- {b}}. In the case of a - - b, priority rules are of no help for disambiguating the term, so Logiweb reports an error.

Gluing constructs

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" 0
"" 1
"" 2
"" 3
"" 4
"" 5
"" 6
"" 7
"" 8
"" 9
PREASSOCIATIVE
"" 0"
"" 1"
"" 2"
"" 3"
"" 4"
"" 5"
"" 6"
"" 7"
"" 8"
"" 9"
PREASSOCIATIVE
"" " + "
BODY
123 + 456

The Pyk response should be

{1{2{3}}} + {4{5{6}}}

The associativity sections above define a grammar with ten nullary constructs (0 to 9), ten unary constructs (0" to 9"), and one binary construct (" + "). Note that the unary constructs are 'gluing' in the sense that there is no space between the digit and the double quote. For that reason, there should be no spaces either between digit and parameter when using the construct.

The example above is not just a gimmick. In Logiweb, numerals are actually treated using grammatical constructs like the ones above. Bodies of Logiweb pages must be built up from strings and grammatical constructs and nothing else. Not even numerals are built in.
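The nesting of gluing digits is easy to reproduce in Python. The helper name glue_digits is made up and the sketch only rebuilds the braces; it does not parse anything.

```python
def glue_digits(numeral):
    """Show how a numeral nests when read with the gluing digit constructs.

    The last digit is a nullary construct. Every earlier digit is a unary
    construct whose parameter is the rest of the numeral.
    """
    if not (numeral.isascii() and numeral.isdigit()):
        raise ValueError("expected a non-empty string of digits 0-9")
    result = numeral[-1]
    for digit in reversed(numeral[:-1]):
        result = digit + "{" + result + "}"
    return result


print("{" + glue_digits("123") + "} + {" + glue_digits("456") + "}")
```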

Predefined constructs

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" " + "
BODY
Pyk + Priority

The Pyk response should be

{Pyk} + {Priority}

The two keywords 'Pyk' and 'Priority' are grammatical constructs which tacitly belong to the grammar. There are no other predefined grammatical constructs than these two. 'Pyk' and 'Priority' will not be mentioned any further in the present tutorial.

Gluing brackets

Try to submit this with level=parse:

PAGE p
BIBLIOGRAPHY
PREASSOCIATIVE
"" a
"" b
PREASSOCIATIVE
"" " + "
POSTASSOCIATIVE
"" "[ " ]"
BODY
"We have that "[ a + b ]" equals "[ b + a ]" by commutativity."

The Pyk response should be

{"We have that "}[ {{a} + {b}} ]{{" equals "}[ {{b} + {a}} ]{" by commutativity."}}

The gluing bracket construct "[ " ]" is designed to give the impression that it embeds formulas inside strings. What it actually does is that it takes the formula as its second argument and the strings as its first and third arguments.

Recall that a space or newline or one of the eight characters ,.[]()<> can occur right after a string. A left brace cannot occur right after a string since such a brace would mark the start of an intra-string comment. The gluing brackets construct works because a left bracket is one of the characters which can occur right after a string. A gluing brace construct would not work since it would be taken as a comment and would disappear before the parser had a chance to see it.

Space maximization convention

As a convention, put as many spaces as possible in the constructs you define. In other words, put spaces everywhere unless you have a good reason for not doing so. As an example, a sine function could be defined thus:

"" sin ( " )

In the construct, there is no space between 's' and 'i' and between 'i' and 'n' in 'sin' since that would look silly. But there is a space between the 'n' in 'sin' and the left parenthesis. Putting spaces everywhere unless there is a reason for not doing so makes it easier to guess names of constructs defined by others.

Bibliographic references

Try to submit this with level=parse:

PAGE test
BIBLIOGRAPHY
"base" "http:../../../page/base/latest/vector/page.lgw".
PREASSOCIATIVE
"" a1
"" a2
PREASSOCIATIVE
"" " + "
BODY
a1 + a2

The Pyk response should be

Unimported "base" construct:
base
Frontend: parsing body
Frontend: invoking priority rules
{a1} + {a2}

The page above is named 'test' because it starts with 'PAGE test'.

The test page references the page at address http:../../../page/base/latest/vector/page.lgw and gives the referenced page the local name 'base'. In general, a bibliographic entry consists of a local name followed by a reference. A bibliography can contain zero, one, or more bibliographic entries. You can read more about references here.

Locally within the test page, the referenced page has local name 'base' whereas the test page itself has local name 'test'. Local names must be nullary constructs, meaning that they contain no double quotes. You can give pages local names like 'my page' or '#$*!' or ':-)'.

Even though the base page has local name 'base' inside the 'test' page, it can have other local names in other pages referencing it. As a convention, when you reference some page, you should let its local name be equal to the page's own name unless you have reasons for doing otherwise. But if, e.g., you reference two different pages which both happen to have name 'base' then you could choose to give them local names 'base 1' and 'base 2' since different pages should have different local names.

To use constructs from a referenced page, those constructs must be 'imported', i.e. they must be mentioned in some associativity section such that the priority and associativity of the imported constructs become known.

The warning message 'Unimported "base" construct: base' states that the 'base' page defines a construct named 'base', and that construct has not been imported. Actually, the 'base' page defines hundreds of constructs none of which are imported, but only one warning is printed for brevity.

An import

Try to submit this with level=parse:

PAGE test
BIBLIOGRAPHY
"base" "http:../../../page/base/latest/vector/page.lgw".
PREASSOCIATIVE
"base" base
"" a1
"" a2
PREASSOCIATIVE
"" " + "
BODY
a1 + a2

The Pyk response should be

Unimported "base" construct:
+"
Frontend: parsing body
Frontend: invoking priority rules
{a1} + {a2}

The line ["base" base] in the source above asks Pyk to import the 'base' construct from the base page. An import consists of a local page name in quotes followed by the name of the construct to be imported from that page. This is why different pages must have different local names: Pyk must know which page to import from.

But now Pyk complains that the +" construct remains to be imported. The +" construct is a gluing plus that can be used inside numerals. We return to the gluing plus and other unimported constructs later.

Multiple import

Try to submit this with level=parse:

PAGE test
BIBLIOGRAPHY
"base" "http:../../../page/base/latest/vector/page.lgw".
PREASSOCIATIVE
"base" base
"" a1
"" a2
PREASSOCIATIVE
"" " + "
BODY
a + b

The Pyk response should be

Unimported "base" construct:
+"
Frontend: parsing body
Frontend: invoking priority rules
{a} + {b}

Actually, the line ["base" base] in the pyk source not only asks Pyk to import the base construct from the base page. The ["base" base] line asks Pyk to import all constructs from the base page which have the same priority as the base construct, including the base construct itself.

On the base page, the associativity section containing the base construct also contains 26 constructs named a, b, c, and so on, 26 other constructs named A, B, C, and so on, and a zillion other constructs. This is why Pyk can understand [a + b ] even though the a and b constructs are not mentioned anywhere in the header. The a and b constructs are silently imported from the base page by the ["base" base] line.

When you import another page, how can you know which constructs it defines? Well, there are a number of possibilities. To try them, first open the main menu of the base page in a separate window.

The first way to get information about constructs defined on the base page is to click on 'Source' and then on 'Actual source as html'. When you click that, you will get the actual source of the base page expressed in a way that makes it look nice in your browser. If you want to cut-and-paste from the source, you may prefer 'Actual source as plain text'. Once you have the actual source, you can read the header section and learn everything about which constructs are exported.

But the actual source is not always available: The actual source could have been lost or the page could have been produced by a tool different from the Pyk compiler. At the time of writing, there are no such other tools, but in the past, a wysiwyg editor has existed, and new tools could emerge in the future.

The second option for learning about the exported constructs is to open the main menu of the base page and then click 'Dictionary' and 'Pyk'. That gives a list of all exported constructs, indicating the index and arity of the construct. The index is a natural number which identifies the construct locally within the page and the arity is the number of double quotes in the construct. But this gives no information about which priority section the construct belongs to.

The third option for learning about the exported constructs is to open the main menu of the base page and then click 'Codex' and 'Pdf'. On page 2 of that Pdf document (after the monstrous table of contents), you can find a section with headline 'base'. Inside that section, you can find a definition that starts base-> with the word 'prio' above the right arrow. After the right arrow you find a priority table of all constructs. The priority table is rendered in Pdf. If you want pyk, click 'Codex' and 'Pyk' instead and find the priority table expressed in pyk. You must be a patient reader to read the pyk version of the priority table.

Making warnings go away

The easiest way to make the Unimported "base" construct warning go away is to change 'header=warn' to 'header=nowarn'. Setting 'header=nowarn' is like saying 'I know what I am doing', which is rarely true.

A more lasting solution is to change 'header=warn' to 'header=suggest'. That asks Pyk for a suggestion on what the header could look like. Then Pyk generates a header which, if used, makes the warning go away. But of course Pyk cannot always guess what priority you want to assign to each construct. When Pyk makes a suggestion, it takes all priority information on all referenced pages into account. When the priority information of two pages is contradictory, Pyk makes a choice. And when priority information is missing, Pyk also makes a choice. Pyk's choice will not always be what you wanted, but it may be a good starting point for further editing.

Try to submit this with header=suggest:

PAGE test
BIBLIOGRAPHY
"base" "http:../../../page/base/latest/vector/page.lgw".
PREASSOCIATIVE
"base" base
"" a1
"" a2
PREASSOCIATIVE
"" " + "
BODY
a1 + a2

The Pyk response should be

PREASSOCIATIVE
"base" base
"" a1
"" a2
PREASSOCIATIVE
"" " + "
PREASSOCIATIVE
"base" +"
PREASSOCIATIVE
"base" " factorial
...
PREASSOCIATIVE
"base" " & "
PREASSOCIATIVE
"base" " \\ "

Pyk respects the priorities you gave, namely that a1 and a2 have the same priority as base and higher priority than [" + "]. But Pyk did not guess that [" + "] should probably have the same priority as the [" Plus "] construct from the base page.

Name collisions

Try to submit this with level=parse and header=warn:

PAGE test
BIBLIOGRAPHY
"base" "http:../../../page/base/latest/vector/page.lgw".
PREASSOCIATIVE
"base" base
"" a1
"" a2
PREASSOCIATIVE
"base" " Plus "
"" " + "
BODY
a + b

The Pyk response should be something like

Frontend: parsing body
Symbol 410 of page 1 has the same name as symbol 3 of page 0
---

"" " + "
BODY
a |
---
+ b

---
File Form input around line 12 char 3:
Use of ambiguous construct
Goodbye

The Unimported "base" construct warning just states the given construct has not been imported. You could make it go away using a more complete header e.g. using the header=suggest option. But I want to keep the header short here to make it readable.

But then comes an error message which brings Pyk to a halt. The associativity section containing [" Plus "] on the base page also contains a [" + "] construct. So now we have two [" + "] constructs in play, one that belongs to the base page and one which belongs to the test page. So the same 'construct' denotes two different 'symbols'.

Formally, a 'symbol' is a pair whose first component, the so-called 'reference', identifies the home page of the symbol, and whose second component, the so-called 'index', identifies the symbol inside the home page of the symbol. Both the reference and the index are natural numbers.

The error message says that symbol 410 of page 1 has the same name as symbol 3 of page 0.

The number 410 is the index of the [" + "] symbol from the base page. The number 1 is the local reference of the base page, i.e. the position of the base page in the bibliography: the base page is reference number one in the bibliography of the test page. If you want to know the real reference of the base page, look up the main menu of the base page, then click 'Reference' and 'Decimal number' to see the zillion-digit number which constitutes the world-wide unique reference of the base page. Formally, the [" + "] symbol from the base page is a pair whose first component is that zillion-digit number and whose second component is 410.

When the error message says that symbol 410 of page 1 has the same name as symbol 3 of page 0, the number 3 in the error message is the index of the [" + "] symbol from the test page. The number 0 is the local reference of the test page, and indicates that the symbol belongs to the zeroth entry of the bibliography of the test page (which is the test page itself).

It is ok to have several symbols with the same name as long as you do not use the ambiguous name. But the ambiguous [" + "] construct is used in character 3 of line 12 of the source, and that triggers the error message.

The error message contains several parts. First comes the message which says 'Symbol 410 ...'. Then come three dashes. Then comes a bit of context before the error. Then come three dashes. Then comes a bit of context after the error. Then come three dashes. Then come the line and character numbers, followed by one more description of the error. Finally, Pyk says 'Goodbye' to emphasize that you are out of luck.

Page qualified constructs

Try to submit this with level=parse and header=warn:

PAGE test
BIBLIOGRAPHY
"base" "http:../../../page/base/latest/vector/page.lgw".
PREASSOCIATIVE
"base" base
"" a1
"" a2
PREASSOCIATIVE
"base" " Plus "
"" " + "
BODY
a test + b

The Pyk response should be

Frontend: parsing body
Frontend: invoking priority rules
{a} test + {b}

Each symbol has two names: in addition to its normal name, each symbol has a 'page qualified' one. The page qualified name of a symbol is the normal name of the symbol extended with the local name of the home page of the symbol. Here are some examples:

Name                 Page  Page qualified name
a1                   test  test a1
a2                   test  test a2
" + "                test  " test + "
test                 test  test test
" + "                base  " base + "
if " then " else "   base  base if " then " else "
" factorial          base  " base factorial
" [[ " ]]            base  " base [[ " ]]
+"                   base  base +"
","                  base  " base,"

The rules are:

  • If the name starts with a double quote, then the page qualified name consists of
    • a double quote followed by
    • a space character followed by
    • the local name of the home page followed by
    • the name with the initial double quote removed.
  • Otherwise, the page qualified name consists of
    • the local name of the home page followed by
    • a space character followed by
    • the name itself (which has no initial double quote to remove).

A gluing plus and a gluing comma are included in the table above to give examples of how the rules work for gluing constructs.
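The two rules translate directly into a few lines of Python. The function name page_qualified is invented for this sketch.

```python
def page_qualified(name, page):
    """Build the page qualified name of a construct.

    name is the normal name of the construct and page is the local name
    of its home page.
    """
    if name.startswith('"'):
        # Quote, space, local page name, then the name minus its first quote.
        return '" ' + page + name[1:]
    # Otherwise: local page name, space, then the name unchanged.
    return page + " " + name


for name, page in [("a1", "test"), ('" + "', "test"), ('+"', "base")]:
    print(page_qualified(name, page))
```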

More features

You have now seen most of the pyk language. There are a few more facilities for renaming symbols during import, for importing a single symbol from an associativity section of another page, and for controlling what index is assigned to a symbol. Those features are mainly rudiments, however.

There is also a feature for including files (like #include in the C programming language). But in addition to ordinary text inclusion, the Pyk include feature can translate the included file from one format to another. That is typically used for translating files that are not in Logiweb UTF-8 into Logiweb UTF-8 format. As mentioned before, Logiweb UTF-8 is the same as normal UTF-8 except that in Logiweb UTF-8, codes 0-9 and codes 11-31 are illegal characters, and code 10 and code 10 only marks the end of a line.

As a convention, one should stick to Logiweb UTF-8 when dealing with text internally in Logiweb. But Logiweb itself allows strings to be arbitrary sequences of bytes. Using the import construct suitably, one can import binary files into a Pyk source in such a way that the imported file becomes a string containing the bytes of the imported file. As an example of use, if a Logiweb page is rendered using TeX and one needs a font which is not part of standard TeX, then one can do a binary include of the TeX font files into the page and then arrange that they will be used during rendering.

What syntax leads to

Try to submit this with level=all:

PAGE test
BIBLIOGRAPHY
"base" "http:../../../page/base/latest/vector/page.lgw".
PREASSOCIATIVE
"base" base
"" a1
"" a2
PREASSOCIATIVE
"base" " Plus "
BODY
a1 + a2

Then locate the line saying 'Rendering page here' and click on 'here'. Then click on 'Vector' and then click on 'Decimal'. The result should look something like this:

001 236 165 252  197 171 246 059  118 009 207 087  141 203 177 204
249 042 147 255  075 238 239 205  250 168 160 170  008 006 001 204
049 226 035 023  102 161 000 051  018 122 152 053  088 086 156 138
167 067 234 236  203 239 204 228  159 170 008 006  000 002 000 001
000 000 186 006  003 005

Those bytes are what the frontend of Pyk got out of the page above. The bytes constitute a 'Logiweb vector'. The bytes of the vector are as follows (the actual bytes may change a little).

Version  001
Ripemd   236 165 252 197 171  246 059 118 009 207
         087 141 203 177 204  249 042 147 255 075
Mantissa 238 239 205 250 168  160 170 008
Exponent 006

Version  001
Ripemd   204 049 226 035 023  102 161 000 051 018
         122 152 053 088 086  156 138 167 067 234
Mantissa 236 203 239 204 228  159 170 008
Exponent 006

BibEnd   000

a2       002 000
a1       001 000
DictEnd  000

" + "    186 006
a1       003
a2       005

The vector starts with the reference of the test page. That reference in turn consists of a version number, a Ripemd-160 hash code, a mantissa, and an exponent. The version number is always 001 and is reserved for future extensions. The Ripemd-160 code is a twenty-byte hash code computed on the basis of all bytes following the code, and it ensures that the reference is unique worldwide. The mantissa and exponent indicate when the test page was published in Logiweb time. The exponent has an understood minus sign, so a time stamp whose exponent is 006 measures time in microseconds.

The vector continues with the reference of the base page.

The next byte is zero and marks the end of the bibliography.

The next two bytes are the index and arity of the a2 construct.

The next two bytes are the index and arity of the a1 construct.

The next byte is zero and marks the end of the so-called 'dictionary'. The dictionary contains all constructs of the test page except the so-called 'page construct'. The page construct is the construct after PAGE in the beginning of the pyk source and is 'test' for the test page. The index and arity of the page construct are always zero.

Finally comes the body in Polish prefix notation. Let R be the number of pages in the bibliography plus one (so R=2 for the test page). The symbol with index i from local page r is represented by the number 1+r+i*R. The a1 construct is construct number 1 from page number 0, so it is represented by 1+0+1*2=3. The a2 construct is represented by 1+0+2*2=5. The plus construct is construct number 412 from page number 1, so it is represented by 1+1+412*2=826. Numbers below 128 are encoded in a single byte. Larger numbers are expressed in base 128 with the least significant digit first. As an example, we have 826=58+128*6, so 826 is expressed as 58 followed by 6. However, 128 is added to all bytes except the last, yielding 186 followed by 6.
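The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of Pyk: the function names are made up for this example, and the handling of numbers that need more than two bytes is an assumption extrapolated from the two-byte example in the text.

```python
def symbol_code(page, index, R):
    """Number representing construct `index` of local page `page`."""
    return 1 + page + index * R


def encode_number(n):
    """Encode n in base 128, least significant digit first.

    128 is added to every byte except the last one.
    (Assumed to generalise to more than two bytes.)
    """
    out = []
    while n >= 128:
        out.append(n % 128 + 128)
        n //= 128
    out.append(n)
    return out


def decode_numbers(data):
    """Decode a sequence of bytes into the numbers it contains."""
    numbers = []
    value, weight = 0, 1
    for byte in data:
        if byte >= 128:
            # Not the last byte of this number: strip the added 128.
            value += (byte - 128) * weight
            weight *= 128
        else:
            # Last byte of this number.
            numbers.append(value + byte * weight)
            value, weight = 0, 1
    return numbers


def symbol_origin(code, R):
    """Recover (page, index) from a symbol number."""
    return (code - 1) % R, (code - 1) // R


R = 2  # one page in the bibliography, plus one
body = [186, 6, 3, 5]  # the body bytes of the test page
for code in decode_numbers(body):
    print(code, symbol_origin(code, R))
```

For the test page this decodes the body to the numbers 826, 3, and 5, which correspond to construct 412 of page 1 followed by constructs 1 and 2 of page 0.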

Now click 'up' two times to get back to the main menu of your page. Then click 'External formats' and 'XML'. That gives an XML version of the vector. If you click 'Common Lisp S-expression' instead of 'XML', you get the same information expressed in Lisp.

Submission of Logiweb pages

A Logiweb page is submitted to Logiweb by writing the binary version of the Logiweb vector to a file within reach of both an http server and a Logiweb server. That file is typically named 'page.lgw' where 'lgw' stands for 'Logiweb'.

The purpose of Logiweb servers is to maintain an index of available Logiweb pages. All Logiweb servers in the world cooperate on maintaining an index of all Logiweb pages in the world. This is why page.lgw has to be within reach of a Logiweb server: page.lgw has to be indexed.

But Logiweb servers do not deliver Logiweb pages on demand. Logiweb servers only tell where Logiweb pages are located. To actually get a Logiweb page, one has to ask a Logiweb server where page.lgw is and then ask an http server to deliver page.lgw.

When a pyk source is submitted to Pyk with level=submit, then Pyk generates the page.lgw file. When submitting with level=all, then Pyk omits the page.lgw file, so level=submit actually generates one more file than level=all.

When a pyk source is submitted to Pyk with level=submit, then Pyk also sends a message to the local Logiweb server to notify it about the submission. Thereby, the page will be known to the local Logiweb server immediately. Knowledge about the new page will then seep through the mesh of Logiweb servers so that the page can be located starting from any Logiweb server. Each Logiweb server only stores a small amount of information but all Logiweb servers cooperate and collectively store a complete index of all pages.

As you can see, a submitted Logiweb page is no more than the parse tree of the body of the pyk source with a bibliography and a dictionary in front. What a Logiweb page means has nothing to do with syntax and is outside the scope of the present tutorial.

Logiweb defines the syntax and semantics of Logiweb vectors and defines the protocol used between Logiweb servers. The Pyk language lives outside of Logiweb and just constitutes one possible format for expressing Logiweb vectors.

Locating Logiweb pages

The previous section presented a pyk source text and its associated Logiweb vector expressed in decimal. That allows a human to inspect the Logiweb vector, but it does not constitute a submission.

The Logiweb vector in the previous section looked thus:

Version  001
Ripemd   236 165 252 197 171  246 059 118 009 207
         087 141 203 177 204  249 042 147 255 075
Mantissa 238 239 205 250 168  160 170 008
Exponent 006

Version  001
Ripemd   204 049 226 035 023  102 161 000 051 018
         122 152 053 088 086  156 138 167 067 234
Mantissa 236 203 239 204 228  159 170 008
Exponent 006

BibEnd   000

a2       002 000
a1       001 000
DictEnd  000

" + "    186 006
a1       003
a2       005

The Logiweb reference of the vector is the first entry of the bibliography, so the reference is

001 236 165 252 197  171 246 059 118 009
207 087 141 203 177  204 249 042 147 255
075 238 239 205 250  168 160 170 008 006

So the vector comprises 70 bytes and the reference comprises 30. The size of the reference is typical but vectors are usually much larger.
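As a rough check of these sizes, the decimal dump shown earlier can be read into Python and split by hand. The sketch below assumes the field widths seen in this particular example (1 byte of version, 20 bytes of Ripemd-160, 8 bytes of mantissa, 1 byte of exponent); other pages may use different widths, and the name split_reference is made up for illustration.

```python
# The decimal dump of the test page, copied from the rendering above.
DUMP = """
001 236 165 252  197 171 246 059  118 009 207 087  141 203 177 204
249 042 147 255  075 238 239 205  250 168 160 170  008 006 001 204
049 226 035 023  102 161 000 051  018 122 152 053  088 086 156 138
167 067 234 236  203 239 204 228  159 170 008 006  000 002 000 001
000 000 186 006  003 005
"""

vector = [int(token) for token in DUMP.split()]


def split_reference(ref):
    """Split a 30-byte reference into its four fields.

    The widths are those observed in this example only (assumption).
    """
    return {
        "version": ref[0],
        "ripemd": ref[1:21],
        "mantissa": ref[21:29],
        "exponent": ref[29],
    }


# The reference of the page is the first entry of its bibliography.
reference = vector[:30]
fields = split_reference(reference)

print(len(vector), len(reference))  # prints: 70 30
print(fields["version"], fields["exponent"])  # prints: 1 6
```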

Given the reference of a Logiweb page, a Logiweb server can return the URL of its vector. This is what was meant when it was stated that Logiweb servers can tell where Logiweb pages are located.

There can be many copies around the world of the same Logiweb page. In that case, one can ask a Logiweb server to return the URL of a random one of them.

A page is considered to remain in existence as long as at least one copy is accessible on Logiweb. For that reason, the one who originally submitted a Logiweb page to Logiweb may not be able to withdraw it again: the page may have been mirrored, in which case Logiweb servers just point to a copy of the page if the original is deleted. When submitting to Logiweb, it is understood that such mirroring may and will happen, so one should not submit pages for which copyright forbids verbatim copying.
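The behaviour described above (a reference maps to one or more URLs, a random copy is returned, and a page survives as long as one copy remains) can be pictured with a toy in-memory model. This is purely illustrative: it is not the Logiweb protocol, the class and method names are invented, and the URLs are hypothetical.

```python
import random


class ToyIndex:
    """Toy stand-in for the index kept by Logiweb servers (illustration only)."""

    def __init__(self):
        self._copies = {}  # reference (bytes) -> list of URLs

    def register(self, reference, url):
        """Record that a copy of the page is available at url."""
        self._copies.setdefault(bytes(reference), []).append(url)

    def remove_copy(self, reference, url):
        """Forget one copy; the page itself survives if others remain."""
        urls = self._copies.get(bytes(reference), [])
        if url in urls:
            urls.remove(url)

    def locate(self, reference):
        """Return the URL of a random copy, or None if no copy is left."""
        urls = self._copies.get(bytes(reference))
        if not urls:
            return None
        return random.choice(urls)


index = ToyIndex()
ref = bytes([1, 236, 165])  # shortened, hypothetical reference
index.register(ref, "http://original.example/page.lgw")
index.register(ref, "http://mirror.example/page.lgw")

# Deleting the original does not withdraw the page: the mirror remains.
index.remove_copy(ref, "http://original.example/page.lgw")
print(index.locate(ref))  # prints: http://mirror.example/page.lgw
```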

Amnesia

The 'frontend' of Pyk translates the pyk source into a Logiweb vector. After that, the Logiweb vector could be written to a file and Pyk could quit, leaving further processing to other tools. However, Pyk passes the Logiweb vector to its 'codifier' which 'understands' the vector and verifies its 'correctness'. That process has to do with semantics and is outside the scope of the present tutorial. When the codifier is done, Pyk passes the output from the codifier to the 'backend' of Pyk which renders the page.

When Pyk has translated a pyk source into a Logiweb vector, it suffers from voluntary amnesia: it forgets the pyk source and continues with the vector from that point. For that reason, the frontend on the one side is quite separate from the codifier and the backend on the other side. If one has access to a Logiweb vector but the rendering of the vector has been lost, then one can bypass the frontend of Pyk and ask Pyk to re-understand, re-verify, and re-render the page based on the vector alone.

The amnesia has the benefit that the codifier and backend only see what everybody can see once the page is published, so no dependency on the pyk source can sneak in. The amnesia has the drawback, however, that if the codifier or backend finds an error, then they cannot report which line of the source code the error relates to. They have to relate the position of the error to e.g. the rendering of the page.

Conclusion

The main features of the pyk language have been presented.

As should be clear by now, the pyk language does not really have a syntax. Rather, the user of pyk declares the syntax to be used in the header of the pyk source and then uses that syntax in the body of the source.

A published Logiweb page is no more than a dense, binary representation of the parse tree of the body with a bibliography and a dictionary in front.
