Data types
The type system of Perun2 is very primitive.
It consists of exactly nine built-in data types.
Perun2 automatically casts between higher and lower data types where possible, as shown in the image below.
For example, a bool cast to a string becomes the text value 0 or 1.
Casting works in one direction only.
Perun2 is statically typed.
New variables are declared implicitly, with no need to specify their type.
Several expressions can be applied to multiple data types.
structure | returns |
---|
[bool] ? [value] : [value] | value |
Ternary conditional operator. If the first argument is true, then the second argument is returned. Otherwise the result is the third argument.
This structure works on every data type.
structure | returns |
---|
[bool] ? [value] | value |
This is a simplified ternary operator.
If the first argument equals false, then the whole expression returns empty value of certain data type.
This value is empty text for string, integer zero for number, empty period for period and empty collection for collections.
This structure works on every data type.
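The fallback behaviour of the simplified ternary operator can be modelled in a few lines of Python. This is only an illustrative sketch, not Perun2 code: the names `simplified_ternary` and `EMPTY_FACTORIES` are hypothetical, a period is stood in for by `datetime.timedelta`, and only the four empty values listed above are covered.

```python
from datetime import timedelta

# Hypothetical table: how to build the "empty" value for each modelled type.
EMPTY_FACTORIES = {
    str: str,              # empty text for string
    int: int,              # integer zero for number
    float: int,            # integer zero for number
    timedelta: timedelta,  # empty period (modelled as a zero timedelta)
    list: list,            # empty collection
}

def simplified_ternary(condition, value):
    """Model of `[bool] ? [value]`: the value itself, or the empty value of its type."""
    if condition:
        return value
    return EMPTY_FACTORIES[type(value)]()

print(simplified_ternary(True, "abc"))   # abc
print(simplified_ternary(False, 12))     # 0
print(simplified_ternary(False, ["a"]))  # []
```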
Filters work only with collections. See filters.
structure | returns |
---|
[value] , [value] | value collection |
[value] , [value collection] | value collection |
[value collection] , [value] | value collection |
[value collection] , [value collection] | value collection |
Multiple elements can be joined together by commas in order to create a merged collection.
If one of the joined elements is a lazily evaluated collection (also known as a definition), then the whole chain of elements becomes lazily evaluated.
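The merging rule can be pictured with a small Python model. It is a sketch under stated assumptions, not the Perun2 implementation: the function `join` is hypothetical, a Python list plays the role of an ordinary collection, and a Python generator plays the role of a lazily evaluated collection.

```python
from types import GeneratorType

def join(*elements):
    """Model of `a, b, c`: merge single values and collections into one collection."""
    def flatten():
        for element in elements:
            if isinstance(element, (list, GeneratorType)):
                yield from element   # a collection contributes all its items
            else:
                yield element        # a single value contributes itself

    # If any element is lazy (a generator in this model), the result stays lazy.
    if any(isinstance(e, GeneratorType) for e in elements):
        return flatten()
    return list(flatten())

print(join("a", ["b", "c"], "d"))            # ['a', 'b', 'c', 'd']
lazy = join("a", (n for n in ["b", "c"]))    # nothing is evaluated yet
print(list(lazy))                            # ['a', 'b', 'c']
```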
structure | returns |
---|
[value collection variable] [ [number] ] | value |
To reach a certain element of a collection variable, use square brackets with an index written inside them.
Indexing is zero-based.
If the index is out of range, then an empty value is returned.
The boolean data type can store only two values: true and false.
When cast to a number or a string, true is treated as 1 and false is treated as 0.
structure | returns |
---|
true | bool |
false | bool |
Boolean constants are written with the keywords true and false.
structure | returns |
---|
[value] = [value] | bool |
[value] == [value] | bool |
[value] != [value] | bool |
[value] < [value] | bool |
[value] <= [value] | bool |
[value] > [value] | bool |
[value] >= [value] | bool |
Any two valid data type instances can be compared.
Comparisons are carried out according to several rules.
Boolean value true is considered greater than false.
Two times are equal if they share any common moment in time.
Strings are compared alphabetically, and the comparison is case-sensitive.
Two collections are equal if they contain the same elements in the same places.
A collection that contains more elements is considered greater.
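Two of these rules are easy to misread, so here is a small Python model of them. It is a sketch, not Perun2 code. All function names are hypothetical, and a time is assumed to be a tuple that is as long as its precision: `(year, month)`, `(year, month, day)`, and so on. Under that assumption, two times share a common moment exactly when the shorter tuple is a prefix of the longer one.

```python
def times_equal(a, b):
    """Times of different precision are equal if they overlap in time."""
    shared = min(len(a), len(b))
    return a[:shared] == b[:shared]

def collections_equal(a, b):
    """Same elements in the same places."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))

def collection_greater(a, b):
    """The collection with more elements is considered greater."""
    return len(a) > len(b)

print(times_equal((2020, 6), (2020, 6, 3)))           # True: 3 June 2020 lies within June 2020
print(times_equal((2020, 6, 3), (2020, 7)))           # False
print(collections_equal(["a", "b"], ["b", "a"]))      # False: same elements, different places
print(collection_greater(["a", "b", "c"], ["x", "y"]))  # True
```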
structure | returns |
---|
not [bool] | bool |
[bool] and [bool] | bool |
[bool] or [bool] | bool |
[bool] xor [bool] | bool |
Unlike many other programming languages, Perun2 uses keywords instead of symbols for boolean operators.
Elements can be grouped by brackets to determine the order of operations.
All binary operators (and, or, xor) have the same priority.
structure | returns |
---|
[value] in [value collection] | bool |
[value] not in [value collection] | bool |
Checks if a specified value can be found inside a collection.
Inclusion of the not keyword reverses the result.
structure | returns |
---|
[string] like [string] | bool |
[string] not like [string] | bool |
This expression can be used to compare a string with a pattern.
It uses several wildcard characters explained here.
If the pattern is not valid, for example because it contains unclosed brackets, then this expression always returns false.
structure | returns |
---|
[string] resembles [string] | bool |
[string] not resembles [string] | bool |
This operator works similarly to the Like operator.
Go here for more info.
structure | returns |
---|
[value] between [value] and [value] | bool |
[value] not between [value] and [value] | bool |
The Between operator works with any data type that is not a collection.
Bounding values are included, so this expression is equivalent to a combination of >= and <=.
The order of bounds does not matter.
A non-existent value (NaN or never) as any argument makes the Between operator always return false provided there are no casts.
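The three rules above (inclusive bounds, order-independent bounds, and a non-existent value forcing false) can be summarised in a short Python model. This is an illustrative sketch only: the function `between` is hypothetical, and of the two non-existent values only NaN is modelled.

```python
import math

def is_nan(x):
    return isinstance(x, float) and math.isnan(x)

def between(value, bound_a, bound_b):
    """Model of `[value] between [value] and [value]`."""
    # A non-existent value as any argument makes the result false.
    if is_nan(value) or is_nan(bound_a) or is_nan(bound_b):
        return False
    # The order of bounds does not matter.
    low, high = min(bound_a, bound_b), max(bound_a, bound_b)
    # Bounding values are included.
    return low <= value <= high

print(between(5, 10, 1))             # True: bounds given in reverse order
print(between(10, 1, 10))            # True: the bound itself is included
print(between(float("nan"), 1, 10))  # False
```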
structure | returns |
---|
[string] regexp [string] | bool |
[string] not regexp [string] | bool |
Perun2 uses the ECMAScript convention for regular expression matches. This operator is case sensitive.
Numbers in Perun2 can appear in three forms: as integers, in double-precision format or as NaN (not a number).
The type of a number is assigned dynamically after every performed operation.
For example, division of two integers can result in an integer (8/4), a fraction (8/5) or NaN (8/0).
To avoid integer overflows, integers are represented with the greatest number of bits that can be used efficiently.
This number depends on the operating system and is at least 64.
The dot is always the decimal separator.
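The way a division result picks its form can be sketched in Python. This model only mirrors the three examples given above (8/4, 8/5 and 8/0); the function `divide` is hypothetical and says nothing about how Perun2 stores numbers internally.

```python
import math

def divide(a, b):
    """Divide two integers: integer if exact, fraction otherwise, NaN for a zero divisor."""
    if b == 0:
        return math.nan
    if a % b == 0:
        return a // b   # stays an integer
    return a / b        # becomes a double-precision fraction

print(divide(8, 4))  # 2
print(divide(8, 5))  # 1.6
print(divide(8, 0))  # nan
```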
Numeric literals are consecutive digits with one optional dot inside as a decimal separator.
A preceding - sign makes the number negative.
suffix | multiplier |
---|
kb | 1024 |
mb | 1024² |
gb | 1024³ |
tb | 1024⁴ |
pb | 1024⁵ |
Numbers can be followed by a suffix.
This suffix multiplies the number, so it can be treated as a file size unit.
Make sure that there is no space between the number and the suffix.
For example, 100 megabytes would be expressed as 100mb.
suffix | multiplier |
---|
k | 1000 |
m | 1000² |
We can use these two decimal suffixes to express thousands and millions. For example, 3k means three thousand.
An integer literal can contain one K infix. It multiplies the preceding part by one thousand.
This feature can be used to express years in a shorter way. For example, instead of 2023, we can write 2k23.
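To make the suffix and infix rules concrete, here is a small Python model of how such literals could be evaluated. It is a sketch under assumptions, not the Perun2 parser: the function `parse_number` is hypothetical, only the lowercase spellings used in the examples above are accepted, and the digits after the infix are assumed to be simply added to the multiplied part.

```python
import re

MULTIPLIERS = {
    "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4, "pb": 1024**5,
    "k": 1000, "m": 1000**2,
}

def parse_number(literal):
    """Evaluate a numeric literal with an optional suffix or a single k infix."""
    # Infix form such as 2k23: integer digits, one k, integer digits.
    infix = re.fullmatch(r"(\d+)k(\d+)", literal)
    if infix:
        return int(infix.group(1)) * 1000 + int(infix.group(2))

    # Plain or suffixed form: no space between the number and the suffix.
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)([a-z]*)", literal)
    if not match:
        raise ValueError(f"not a numeric literal: {literal!r}")
    digits, suffix = match.groups()
    number = float(digits) if "." in digits else int(digits)
    if suffix == "":
        return number
    if suffix not in MULTIPLIERS:
        raise ValueError(f"unknown suffix: {suffix!r}")
    return number * MULTIPLIERS[suffix]

print(parse_number("100mb"))  # 104857600
print(parse_number("3k"))     # 3000
print(parse_number("2k23"))   # 2023
```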
structure | returns |
---|
- [number] | number |
[number] + [number] | number |
[number] - [number] | number |
[number] * [number] | number |
[number] / [number] | number |
[number] % [number] | number |
Operators for multiplication, division and modulo have higher priority than operators for addition and subtraction.
Expression elements can be grouped by brackets to enforce a desired order of operations.
structure | returns |
---|
[time variable].[time variable number] | number |
Numeric values can be obtained from time variables.
time variable number | plural form |
---|
year | years |
month | months |
weekday | - |
day | days |
hour | hours |
minute | minutes |
second | seconds |
Values are subject to two rules.
Months take values from 1 (January) to 12 (December).
Weekdays run from 1 (Monday) to 7 (Sunday).
In practice, you may never need these numeric values, as time constants are more convenient, more readable and unambiguous.
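The same numbering conventions exist in Python's standard library, which makes for a quick way to check what a given date would produce. This is only an analogy in Python, not Perun2 code; the helper `time_numbers` is hypothetical.

```python
from datetime import date

def time_numbers(d):
    """Return numeric parts of a date using the numbering described above."""
    return {
        "year": d.year,
        "month": d.month,            # 1 (January) to 12 (December)
        "weekday": d.isoweekday(),   # 1 (Monday) to 7 (Sunday)
        "day": d.day,
    }

# 1 January 2023 was a Sunday.
print(time_numbers(date(2023, 1, 1)))
# {'year': 2023, 'month': 1, 'weekday': 7, 'day': 1}
```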
Time expresses one moment in time. A time can point to a certain month, a certain day, a certain minute or a certain second.
When cast to a string, a time is written the same way as the equivalent time constant or clock constant.
Month name is written in lowercase except for the first letter that is in uppercase.
structure | returns |
---|
[month name] [nc] | time |
[nc] [month name] [nc] | time |
[nc] [month name] [nc] , [nc] : [nc] | time |
[nc] [month name] [nc] , [nc] : [nc] : [nc] | time |
Time constants can be built in four ways.
All of them require an English month name and several numeric constants (shown above as [nc]).
In the last form shown, the numeric constants are, in order: day, year, hours, minutes and seconds.
month name |
---|
january |
february |
march |
april |
may |
june |
july |
august |
september |
october |
november |
december |
You probably know English month names, but they are presented above anyway.
structure | returns |
---|
[nc] : [nc] | time |
[nc] : [nc] : [nc] | time |
Clock constants need hours, minutes and, optionally, seconds.
structure | returns |
---|
[time] + [period] | time |
[time] - [period] | time |
Time can be increased or decreased by a period.
structure | returns |
---|
[time variable].date | time |
This expression returns a value of a time variable excluding its clock part (hours, minutes and seconds).
Period describes a span of time expressed in years, months, days, hours, minutes and seconds. Each of these units is an integer and can be negative.
The ambiguity of years and months may introduce a bit of confusion, as in reality they consist of varying numbers of days.
Perun2 tries to represent them as realistically as possible.
When conversion into days is inevitable, each ambiguous month is treated as 30 days and each ambiguous year as 365 days.
A period remembers several details, such as the number of leap years it contains or the days in its months, and uses them behind the scenes for comparisons.
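The fallback conversion can be illustrated with a small Python model. It is a sketch only: the class `Period` and its method `total_days` are hypothetical, every month and year is treated as ambiguous, and the remembered details mentioned above (leap years, real month lengths) are deliberately left out.

```python
from dataclasses import dataclass

DAYS_PER_AMBIGUOUS_MONTH = 30
DAYS_PER_AMBIGUOUS_YEAR = 365

@dataclass
class Period:
    years: int = 0
    months: int = 0
    days: int = 0
    hours: int = 0
    minutes: int = 0
    seconds: int = 0

    def total_days(self):
        """Whole days only; the clock part is ignored in this sketch."""
        return (self.years * DAYS_PER_AMBIGUOUS_YEAR
                + self.months * DAYS_PER_AMBIGUOUS_MONTH
                + self.days)

print(Period(years=1, months=2, days=3).total_days())  # 428
print(Period(months=-1).total_days())                  # -30
```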
structure | returns |
---|
1 [period singular] | period |
[number] [period plural] | period |
A new period unit can be defined by a number and a period keyword.
period singular | period plural |
---|
year | years |
month | months |
week | weeks |
day | days |
hour | hours |
minute | minutes |
second | seconds |
Period keywords can take either singular or plural form.
structure | returns |
---|
- [period] | period |
[period] + [period] | period |
[period] - [period] | period |
Negation, addition and subtraction can be performed on periods.
structure | returns |
---|
[time] - [time] | period |
Subtraction performed on two times returns a period as a result.
String represents a sequence of Unicode characters.
Characters placed between two apostrophes form a new string literal.
There is an exception: it cannot contain asterisks or apostrophes.
An asterisk turns this structure into an Asterisk Pattern.
For the sake of simplicity, string literals in Perun2 do not involve any escape characters or escape sequences.
Strings mean exactly what they show, so backslashes can be used safely for filesystem paths.
Backtick string literals are the way to work around the limitations of apostrophe-defined string literals.
A string literal formed with backtick characters can additionally contain asterisks and apostrophes.
Just like before, there are no escape sequences.
structure | returns |
---|
[string variable] [ [number] ] | string |
Reaches a certain character of a string variable and returns it as a new string.
Indexing is zero-based. Negative indexes are allowed and enable reversed access from end to start.
For example, character at index -1 is the last one and at -2 is the penultimate.
If the index is out of range, then an empty string is returned.
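Python strings use the same zero-based and negative indexing, so the rule can be modelled directly. The only difference is the out-of-range case, where Python raises an error while the behaviour described above returns an empty string. The helper `char_at` is hypothetical and is not Perun2 code.

```python
def char_at(text, index):
    """Return the character at index, or an empty string if the index is out of range."""
    if -len(text) <= index < len(text):
        return text[index]
    return ""

print(char_at("perun", 0))        # p
print(char_at("perun", -1))       # n
print(char_at("perun", -2))       # u
print(repr(char_at("perun", 5)))  # ''
```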
structure | returns |
---|
[string] + [string] | string |
Multiple strings can be concatenated by pluses.
You should pay attention to adjacent elements.
If two adjacent elements are numbers, they are summed.
The same rule goes for periods and combinations of times and periods.
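The effect of adjacent numbers in a chain of pluses can be shown with a small Python model. It is a sketch based only on the rule stated above, not the Perun2 implementation: the function `plus_chain` is hypothetical, it handles strings and numbers only, and it assumes that a whole run of adjacent numbers is summed before concatenation.

```python
def is_number(x):
    return isinstance(x, (int, float)) and not isinstance(x, bool)

def plus_chain(*elements):
    """Model of `a + b + c` where at least one element is a string."""
    merged = []
    for element in elements:
        if merged and is_number(element) and is_number(merged[-1]):
            merged[-1] += element    # adjacent numbers are summed
        else:
            merged.append(element)
    # Everything that remains is cast to a string and concatenated.
    return "".join(str(e) for e in merged)

print(plus_chain("a", 1, 2))  # a3
print(plus_chain(1, "a", 2))  # 1a2
```

In the first call the two numbers are neighbours, so they are summed before being joined to the string. In the second call a string separates them, so each is concatenated on its own.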
Definition is a lazily evaluated collection of strings.
Unlike all other data types, definition generates values on demand instead of generating all of them at once.
This data type is crucial for Perun2, as it enables efficient iteration over filesystem elements.
It appears through important built-in variables such as files and directories.
Asterisk Patterns are apostrophe-defined string literals that contain at least one asterisk.
They are explained in depth here.
Lists are just vectors of values.
The easiest way to initialize them is by writing several values with commas between them.
A list of strings is the lowest data type in Perun2, so every other data type can be cast to it.