7.6.1 Strings


In Logiweb you have "bake" + 1 = "cake".

That is so because Logiweb treats strings as naturals (we refer to non-negative integers as 'naturals').

As an example, take the string "bake". When expressed in Logiweb Unicode UTF-8, "bake" is represented as the list 98, 97, 107, 101 of bytes (we refer to integers in the range from 0 to 255, inclusive, as 'bytes').

To get the value of "bake", we append a byte with value 1 to the four-byte sequence above, giving 98, 97, 107, 101, 1. We then interpret that sequence as a little-endian base-256 number, so the first byte is the least significant digit:

"bake" = 98+256(97+256(107+256(101+256))) = 5996503394

Likewise, "cake" = 5996503395
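The arithmetic above can be checked with a short Python sketch. It is not part of Logiweb; the function names string_to_natural and natural_to_string are hypothetical, and ordinary UTF-8 is assumed to stand in for Logiweb Unicode UTF-8, which is harmless for plain ASCII text such as "bake".

```python
def string_to_natural(text: str) -> int:
    """Encode text as a natural: UTF-8 bytes, then a byte of value 1,
    read as a little-endian base-256 number."""
    data = text.encode("utf-8") + b"\x01"
    return int.from_bytes(data, "little")


def natural_to_string(value: int) -> str:
    """Inverse of string_to_natural (illustrative only).

    Assumption: the natural was produced by string_to_natural, so its
    most significant byte is the appended 1. Other naturals are rejected.
    """
    length = (value.bit_length() + 7) // 8
    data = value.to_bytes(length, "little")
    if not data or data[-1] != 1:
        raise ValueError("natural does not end in the terminating 1 byte")
    return data[:-1].decode("utf-8")


print(string_to_natural("bake"))                      # 5996503394
print(natural_to_string(string_to_natural("bake") + 1))  # cake
```

Adding 1 changes only the least significant base-256 digit, which is the first byte of the string, so "b" (98) becomes "c" (99).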

By convention, Logiweb represents text in Logiweb Unicode UTF-8, which is essentially the same as Unicode UTF-8. One difference is that Logiweb text always uses code 10 (Line Feed) to represent newlines, regardless of the newline convention of the host operating system. This supports cross-platform interoperability of Logiweb pages. Furthermore, codes 0-9 and 11-31 (both ranges inclusive) are forbidden in Logiweb text, again for the sake of interoperability.
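As an illustration of the rule on forbidden codes, the following Python sketch reports where a byte sequence violates it. The function name forbidden_positions is hypothetical, and the check reflects only the two ranges stated above, not any other rule Logiweb may apply.

```python
def forbidden_positions(data: bytes) -> list[int]:
    """Return the positions of bytes with code 0-9 or 11-31.

    Code 10 (Line Feed) is the only code below 32 that is allowed.
    """
    return [i for i, code in enumerate(data) if code < 32 and code != 10]


# A tab (code 9) at position 1 and a carriage return (code 13) at position 5.
print(forbidden_positions(b"a\tb\nc\r"))  # [1, 5]
```

Working on bytes rather than characters is safe here, since in UTF-8 a byte below 32 can only encode the control character with that same code.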

It is up to the user of Logiweb to decide whether or not to respect these conventions. However, Logiweb supports the conventions in several ways. When lgc reads a source file, it converts newline sequences and form feeds to code 10 (LF) and tab characters to code 32 (space). When rendering, lgc offers to leave code 10 (LF) as it is or to translate it to CR, CRLF, LFCR, or the newline sequence used by the host operating system (which must be one of LF, CR, CRLF, or LFCR). The lgc compiler learns which newline sequence the host operating system uses from its newline option, cf. lgc(1).
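The two conversions described above might look roughly like the following Python sketch. This is not lgc's implementation: the function names are hypothetical, and the sketch assumes that all four sequences (LF, CR, CRLF, LFCR) are recognized on input, whereas lgc itself is guided by its newline option.

```python
import re

# Assumption: two-byte sequences are tried before single CR or LF,
# so that CRLF and LFCR each count as one newline.
_NEWLINE_OR_FORM_FEED = re.compile(r"\r\n|\n\r|\r|\n|\f")

ALLOWED_NEWLINES = ("\n", "\r", "\r\n", "\n\r")


def normalize_on_read(text: str) -> str:
    """Map newline sequences and form feeds to LF, and tabs to spaces."""
    text = _NEWLINE_OR_FORM_FEED.sub("\n", text)
    return text.replace("\t", " ")


def render_newlines(text: str, newline: str = "\n") -> str:
    """Translate LF to the chosen newline sequence when rendering."""
    if newline not in ALLOWED_NEWLINES:
        raise ValueError("newline must be one of LF, CR, CRLF, LFCR")
    return text.replace("\n", newline)


source = "a\r\nb\tc\fd\re"
print(repr(normalize_on_read(source)))          # 'a\nb c\nd\ne'
print(repr(render_newlines("a\nb", "\r\n")))   # 'a\r\nb'
```

Mixed input such as LF followed by CRLF is ambiguous under this scheme; the sketch simply takes the leftmost match and makes no claim about how lgc resolves such cases.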

Users who for some reason do not want to respect the conventions for Logiweb text can use escape sequences to inject arbitrary bytes into strings.


Copyright © 2010 Klaus Grue, GRD-2010-01-05