pybasic's Introduction

A BASIC Interpreter - Program like it's 1979!

Introduction

A simple interactive BASIC interpreter written in Python 3. It is based heavily on material in the excellent book Writing Interpreters and Compilers for the Raspberry Pi Using Python by Anthony J. Dos Reis. However, I have had to adapt the Python interpreter presented in the book, both to work with the BASIC programming language and to produce an interactive command line interface. The interpreter therefore adopts the key techniques for interpreter and compiler writing: a lexical analysis stage followed by a recursive descent parser which implements the context free grammar representing the target programming language.
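The two stages mentioned above can be illustrated with a small, self-contained Python sketch. It is not taken from the project's source: the names tokenize, Parser and evaluate are hypothetical, only numeric literals and the arithmetic operators are handled, and placing MOD at the same precedence level as * and / is an assumption.

```python
import re

# Token pattern: numbers, the MOD keyword, or any other single character.
TOKEN_RE = re.compile(r"\s*(?:(\d+\.?\d*)|(MOD)|(.))", re.IGNORECASE)


def tokenize(text):
    """Lexical analysis: turn source text into a flat list of (kind, value) tokens."""
    tokens = []
    for number, mod, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("NUM", float(number) if "." in number else int(number)))
        elif mod:
            tokens.append(("OP", "MOD"))
        elif op.strip():  # skip stray whitespace
            tokens.append(("OP", op))
    return tokens


class Parser:
    """Recursive descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):
        # expression := term (('+' | '-') term)*   (lowest precedence)
        value = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term := factor (('*' | '/' | 'MOD' | '%') factor)*
        value = self.factor()
        while self.peek()[0] == "OP" and self.peek()[1] in ("*", "/", "MOD", "%"):
            op = self.advance()[1]
            rhs = self.factor()
            if op == "*":
                value = value * rhs
            elif op == "/":
                value = value / rhs
            else:
                value = value % rhs
        return value

    def factor(self):
        # factor := NUMBER | '(' expression ')'
        kind, val = self.advance()
        if kind == "NUM":
            return val
        if (kind, val) == ("OP", "("):
            value = self.expression()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {val!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expression()
    if parser.peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing input")
    return value
```

Calling evaluate("2 + 3 * 4") multiplies before adding, while evaluate("(2 + 3) * 4") lets the parentheses override that order. Each grammar rule maps to one method, which is what makes the recursive descent style easy to extend with further statement types.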

The interpreter is a homage to the home computers of the early 1980s, and when executed, presents an interactive prompt ('>') typical of such a home computer. Commands to run, list, save and load BASIC programs can be entered at the prompt as well as program statements themselves.

The BASIC dialect that has been implemented is slightly simplified, and naturally avoids machine specific instructions, such as those concerned with sound and graphics for example.

There is reasonably comprehensive error checking. Syntax errors will be picked up and reported on by the lexical analyser as statements are entered. Runtime errors will highlight the cause and the line number of the offending statement.

The interpreter can be invoked as follows:

$ python interpreter.py

Although this started off as a personal project, it has been enhanced considerably by some other GitHub users. You can see them in the list of contributors! It's very much a group endeavour now.

Operators

A limited range of arithmetic expressions are provided. Addition and subtraction have the lowest precedence, but this can be changed with parentheses.

  • + - Addition
  • - - Subtraction
  • * - Multiplication
  • / - Division
  • MOD (or %) - Modulo
> 10 PRINT 2 * 3
> 20 PRINT 20 / 10
> 30 PRINT 10 + 10
> 40 PRINT 10 - 10
> 50 PRINT 15 MOD 10
> RUN
6
2.0
20
0
5
>

Additional numerical operations may be performed using numeric functions (see below).

Note also that '+' does extra duty as a string concatenation operator, while '*' can be used to repeat strings.

Commands

Programs may be listed using the LIST command:

> LIST
10 LET I = 10
20 PRINT I
>

The LIST command can take arguments to refine the selection of lines listed:

LIST 50 Lists only line 50.

LIST 50-100 Lists lines 50 through 100 inclusive.

LIST 50 100 Also lists lines 50 through 100 inclusive; almost any delimiter works here.

LIST -100 Lists from the start of the program through line 100 inclusive.

LIST 50- Lists from line 50 to the end of the program.
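The argument forms above can be sketched in Python as follows. This is only an illustration of the described behaviour, not the interpreter's own code: select_lines is a hypothetical helper, and the program is assumed to be held in a dictionary keyed by line number.

```python
import re


def select_lines(program, args=""):
    """Return the sorted line numbers of `program` selected by a LIST argument string."""
    numbers = sorted(program)
    args = args.strip()
    if not args:
        return numbers  # plain LIST: everything

    # Optional start number, optional single non-digit delimiter, optional end number.
    match = re.fullmatch(r"(\d*)\s*(\D?)\s*(\d*)", args)
    if match is None:
        raise ValueError(f"cannot parse LIST arguments: {args!r}")
    start_text, delimiter, end_text = match.groups()

    if start_text and not delimiter and not end_text:
        low = high = int(start_text)  # LIST 50: a single line
    else:
        low = int(start_text) if start_text else 0               # LIST -100
        high = int(end_text) if end_text else float("inf")       # LIST 50-
    return [n for n in numbers if low <= n <= high]
```

With a program containing lines 10, 50, 75, 100 and 120, select_lines(program, "50-100") and select_lines(program, "50 100") both pick lines 50, 75 and 100.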

A program is executed using the RUN command:

> RUN
10
>

A program may be saved to disk using the SAVE command. Note that the full path must be specified within double quotes:

> SAVE "C:\path\to\my\file"
Program written to file
>

The program may be re-loaded from disk using the LOAD command, again specifying the full path using double quotes:

> LOAD "C:\path\to\my\file"
Program read from file
>

When loading or saving, the .bas extension is assumed if not provided. If you are loading a file with a simple name (letters and digits only) from the working directory, the quotes can be omitted:

> LOAD regression

will load regression.bas from the current working directory.
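The default-extension rule can be sketched in Python like this. The helper name resolve_program_path is hypothetical, and treating any existing suffix as "an extension was provided" is an assumption about how the rule is applied.

```python
from pathlib import Path


def resolve_program_path(name):
    """Strip optional double quotes and append .bas when no extension is given."""
    path = Path(name.strip().strip('"'))
    # Assumption: a name that already carries a suffix is used unchanged.
    return path if path.suffix else path.with_suffix(".bas")
```

So resolve_program_path("regression") yields a path named regression.bas, while a quoted name that already ends in .bas is left alone.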

Individual program statements may be deleted by entering their line number only:

> 10 PRINT "Hello"
> 20 PRINT "Goodbye"
> LIST
10 PRINT "Hello"
20 PRINT "Goodbye"
> 10
> LIST
20 PRINT "Goodbye"
>

The program may be erased entirely from memory using the NEW command:

> 10 LET I = 10
> LIST
10 LET I = 10
> NEW
> LIST
>

Finally, it is possible to terminate the interpreter by issuing the EXIT command:

> EXIT
c:\

On occasion, it might be necessary to force termination of a program and return to the interpreter, for example, because it is caught in an infinite loop. This can be achieved by using Ctrl-C to force the program to stop:

> 10 PRINT "Hello"
> 20 GOTO 10
> RUN
Hello
Hello
Hello
...
...
<Ctrl-C>
Program terminated
> LIST
10 PRINT "Hello"
20 GOTO 10
>

Programming language constructs

Statement structure

As per usual in old school BASIC, all program statements must be prefixed with a line number which indicates the order in which the statements may be executed. There is no renumber command to allow all line numbers to be modified. A statement may be modified or replaced by re-entering a statement with the same line number:

> 10 LET I = 10
> LIST
10 LET I = 10
> 10 LET I = 200
> LIST
10 LET I = 200
>
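The line-editing rules (re-entering a line number replaces the statement, entering a bare line number deletes it) fit naturally onto a dictionary keyed by line number. The following Python sketch shows the idea under that assumption; enter_line and list_program are hypothetical names rather than functions from the project.

```python
def enter_line(program, text):
    """Store, replace, or delete one numbered line in the `program` dict."""
    number_text, _, statement = text.strip().partition(" ")
    line_number = int(number_text)
    statement = statement.strip()
    if statement:
        program[line_number] = statement  # new line, or replacement of an old one
    else:
        program.pop(line_number, None)    # a bare line number deletes the line
    return program


def list_program(program):
    """Return the program text in line-number order, as LIST would show it."""
    return [f"{number} {program[number]}" for number in sorted(program)]
```

Because listing always sorts the keys, lines can be entered in any order and still execute and list in ascending line-number order.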

Multiple statements may appear on one line separated by a colon:

> 10 LET X = 10: PRINT X

NOTE: Inline loops are currently NOT supported, so a line such as

10 FOR I = 1 to 10: PRINT I: NEXT

will need to be decomposed to individual lines.

Variables

Variable types follow the typical BASIC convention. Simple variables may contain either strings or numbers (the latter may be integers or floating point numbers). Likewise array variables may contain arrays of either strings or numbers, but they cannot be mixed in the same array.

Note that all keywords and variable names are case insensitive (and will be converted to upper case internally by the lexical analyser). String literals will retain their case, however. There is no inherent limit on the length of variable names or string literals; this is dictated by the limitations of Python. The range of numeric values is also dependent upon the underlying Python implementation.

Note that variable names may only consist of alphanumeric characters and underscores. However, they must all begin with an alphabetic character. For example:

  • MY_VAR
  • MY_VAR6$
  • VAR77(0, 0)

are all valid variable names, whereas:

  • 5_VAR
  • _VAR$
  • 66$

are all invalid.

Numeric variables have no suffix, whereas string variables are always suffixed by '$'. Note that 'I' and 'I$' are considered to be separate variables. Note that string literals must always be enclosed within double quotes (not single quotes). Using no quotes will result in a syntax error.

Array variables are defined using the DIM statement, which explicitly lists how many dimensions the array has, and the sizes of those dimensions:

> REM DEFINE A THREE DIMENSIONAL NUMERIC ARRAY
> 10 DIM A(3, 3, 3)

Note that the index of each dimension always starts at zero, but for compatibility with some BASIC dialects the bounds of each dimension are expanded by one, so that the declared size is itself a valid index. So in the above example, valid index values for array A will be 0, 1, 2 or 3 for each dimension. Arrays may have a maximum of three dimensions. Numeric arrays will be initialised with each element set to zero, while string arrays will be initialised with each element set to the empty string "".
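The expanded bounds can be pictured with nested Python lists. This is a minimal sketch of the sizing and initialisation rule just described, not the interpreter's actual storage scheme; the function dim is a hypothetical helper.

```python
def dim(sizes, is_string=False):
    """Build a nested-list array whose valid indexes run from 0 to size inclusive."""
    if not 1 <= len(sizes) <= 3:
        raise ValueError("arrays may have between one and three dimensions")
    default = "" if is_string else 0

    def build(remaining):
        if not remaining:
            return default
        # size + 1 elements, so that index `size` itself is in range
        return [build(remaining[1:]) for _ in range(remaining[0] + 1)]

    return build(list(sizes))
```

Here dim((3, 3, 3)) produces four elements per dimension, matching the valid indexes 0 to 3 of DIM A(3, 3, 3).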

As for simple variables, a string array has its name suffixed by a '$' character, while a numeric array does not carry a suffix. An attempt to assign a string value to a numeric array or vice versa will generate an error.

Array variables with the same name but different dimensionality are treated as the same. For example, using a DIM statement to define I(5) and then a second DIM statement to define I(5, 5) will result in the second definition (the two dimensional array) overwriting the first definition (the one dimensional array).

Array values may be used within any expression, such as in a PRINT statement for string values, or in any numerical expression for number values. However, you must be specific about which array element you are referencing, using the correct number of in-range indexes. If that particular array value has not yet been assigned, then an error message will be printed.

> 10 DIM MYARRAY(2, 2, 2)
> 20 LET MYARRAY(0, 1, 0) = 56
> 30 PRINT MYARRAY(0, 1, 0)
> RUN
56
> 30 PRINT MYARRAY(0, 0, 0)
> RUN
Empty array value returned in line 30
>

As in all implementations of BASIC, there is no garbage collection (not surprising since all variables have global scope)!

Program constants

Constants may be defined through the use of the DATA statement. They may consist of numeric or string values and are declared in a comma separated list:

> 10 DATA 56, "Hello", 78

These values can then later be assigned to variables using the READ statement. Note that the type of the value (string or numeric) must match the type of the variable, otherwise an error message will be triggered. Therefore, attention should be paid to the relative ordering of constants and variables. Once the constants on a DATA statement have been used by a READ statement, the next READ statement will move to the DATA statement with the next higher line number. If there are no more DATA statements before the end of the program, an error will be displayed. This is to ensure that the program is not left in a state where a variable has not been assigned a value, but nevertheless an attempt to use that variable is made later on in the program.

Normally each DATA statement is consumed sequentially by READ statements. However, the RESTORE statement can be used to override this order and set the line number of the DATA statement that will be used by the next READ statement. If the line number used in a RESTORE statement does not refer to a DATA statement, an error will be displayed.

The constants defined in a DATA statement may be consumed using several READ statements, or several DATA statements may be consumed by a single READ statement:

> 10 DATA 56, "Hello", 78
> 20 READ FIRSTNUM, S$
> 30 PRINT FIRSTNUM, " ", S$
> 40 READ SECONDNUM
> 50 PRINT SECONDNUM
> 60 DATA "Another "
> 70 DATA "Line "
> 80 DATA "of "
> 90 DATA "Data"
> 100 RESTORE 10
> 110 READ FIRSTNUM, S$, SECONDNUM, A$, B$, C$, D$
> 120 PRINT S$," ",A$,B$,C$,D$
> RUN
56 Hello
78
Hello Another Line of Data
>
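The READ pointer and RESTORE behaviour shown in this example can be modelled with a small Python class. It is a sketch under the assumption that DATA values are kept per line number; the DataStore class and its methods are hypothetical and not the project's implementation.

```python
class DataStore:
    """Tracks which DATA constant the next READ will consume."""

    def __init__(self, data_lines):
        # data_lines maps a line number to the list of constants on that DATA statement
        self.data_lines = dict(data_lines)
        self.order = sorted(self.data_lines)
        self.line_index = 0
        self.item_index = 0

    def read(self, expect_string):
        """Return the next constant, moving on to later DATA statements as needed."""
        while self.line_index < len(self.order):
            items = self.data_lines[self.order[self.line_index]]
            if self.item_index < len(items):
                value = items[self.item_index]
                self.item_index += 1
                if isinstance(value, str) != expect_string:
                    raise TypeError("DATA value does not match the variable type")
                return value
            # current DATA statement exhausted: go to the next higher line number
            self.line_index += 1
            self.item_index = 0
        raise RuntimeError("no more DATA to READ")

    def restore(self, line_number):
        """Point the next READ at the DATA statement on `line_number`."""
        if line_number not in self.data_lines:
            raise ValueError("RESTORE target is not a DATA statement")
        self.line_index = self.order.index(line_number)
        self.item_index = 0
```

Reading three values, calling restore(10) and then reading seven values walks through the same sequence as lines 20 to 110 of the example program.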

It is a limitation of this BASIC dialect that it is not possible to assign constants directly to array variables within a READ statement, only simple variables.

Comments

The REM statement is used to indicate a comment, and occupies an entire statement. It has no effect on execution:

> 10 REM THIS IS A COMMENT

Stopping a program

The STOP statement may be used to cease program execution. The command END has the same effect.

> 10 PRINT "one"
> 20 STOP
> 30 PRINT "two"
> RUN
one
>

A program will automatically cease execution when it reaches the final statement, so a STOP may not be necessary. However a STOP will be required if subroutines have been defined at the end of the program, otherwise execution will continue through to those subroutines without a corresponding subroutine call. This will cause an error when the RETURN statement is processed and the interpreter attempts to return control back to the caller.

Assignment

Assignment may be made to numeric simple variables (which can contain either integers or floating point numbers) and string simple variables (string variables are distinguished by their dollar suffix). The interpreter will enforce this division between the two types:

> 10 LET I = 10
> 20 LET I$ = "Hello"

The LET keyword is also optional:

> 10 I = 10

Array variables may also have values assigned to them. The indexes can be derived from numeric expressions:

> 10 DIM NUMS(3, 3)
> 20 DIM STRS$(3, 3)
> 30 LET INDEX = 0
> 40 LET NUMS(INDEX, INDEX) = 55
> 50 LET STRS$(INDEX, INDEX) = "hello"

Attempts to assign the wrong type (number or string) to a numeric or string array, attempts to assign a value to an array by specifying the wrong number of dimensions, and attempts to assign to an array using an out of range index, will all result in an error.

Printing to standard output

The PRINT statement is used to print to the screen (or to a file, see File I/O below):

> 10 PRINT 2 * 4
> RUN
8
> 10 PRINT "Hello"
> RUN
Hello
>

Multiple items may be printed by providing a semicolon separated list. The items in the list will be printed immediately after one another, so spaces must be inserted if these are required:

> 10 PRINT 345; " hello "; 678
> RUN
345 hello 678
>

A blank line may be printed by using the PRINT statement without arguments:

> 10 PRINT "Here is a blank line:"
> 20 PRINT
> 30 PRINT "There it was"
> RUN
Here is a blank line:

There it was
>

A print statement terminated by a semicolon will not include a CR/LF.

Unconditional branching

Like it or loathe it, the GOTO statement is an integral part of BASIC, and is used to transfer control to the statement with the specified line number:

> 10 PRINT "Hello"
> 20 GOTO 10
> RUN
Hello
Hello
Hello
...

Subroutine calls

The GOSUB statement is used to generate a subroutine call. Control is passed back to the program at the next statement after the call by a RETURN statement at the end of the subroutine:

> 10 GOSUB 100
> 20 PRINT "This happens after the subroutine"
> 30 STOP
> 100 REM HERE IS THE SUBROUTINE
> 110 PRINT "This happens in the subroutine"
> 120 RETURN
> RUN
This happens in the subroutine
This happens after the subroutine
>

Note that without use of the STOP statement, execution will run past the last statement of the main program (line 30) and will execute the subroutine a second time (at line 100).

Subroutines may be nested, that is, a subroutine call may be made within another subroutine.
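Nested calls work because return addresses behave like a stack. The Python sketch below illustrates this with a toy execution loop; it is an assumption-laden simplification (the run function and the (keyword, argument) tuples are hypothetical) rather than how the interpreter itself is structured.

```python
def run(program):
    """Execute a dict of line number -> (keyword, argument); return the printed lines."""
    lines = sorted(program)
    output, return_stack = [], []
    pc = 0  # index into `lines`, not a line number
    while pc < len(lines):
        keyword, arg = program[lines[pc]]
        if keyword == "PRINT":
            output.append(arg)
        elif keyword == "GOSUB":
            return_stack.append(pc + 1)  # resume at the statement after the call
            pc = lines.index(arg)
            continue
        elif keyword == "RETURN":
            if not return_stack:
                raise RuntimeError("RETURN without GOSUB")
            pc = return_stack.pop()
            continue
        elif keyword == "STOP":
            break
        pc += 1  # REM and PRINT simply fall through to the next line
    return output
```

Removing the STOP line from such a program lets execution fall into the subroutine, where RETURN then finds an empty stack and raises an error.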

A subroutine may also be called using the ON-GOSUB statement (see Conditional branching below).

Loops

Bounded loops are achieved through the use of FOR-NEXT statements. The loop is controlled by a numeric loop variable that is incremented or decremented from a start value to an end value. The loop terminates when the loop variable reaches the end value. The loop variable must also be specified in the NEXT statement at the end of the loop.

> 10 FOR I = 1 TO 3
> 20 PRINT "hello"
> 30 NEXT I
> RUN
hello
hello
hello
>

Loops may be nested within one another.

The STEP statement allows the loop variable to be incremented or decremented by a specified amount. For example, to count down from 5 in steps of -1:

> 10 FOR I = 5 TO 1 STEP -1
> 20 PRINT I
> 30 NEXT I
> RUN
5
4
3
2
1
>

Note that the start value, end value and step value need not be integers, but can be floating point numbers as well. If the loop variable was previously assigned in the program, its value will be replaced by the start value; the previous value will not be used.

After the completion of the loop, the loop variable value will be the end value + step value (unless the loop is exited using a GOTO statement).
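The final value of the loop variable can be checked with a short Python model of a FOR-NEXT loop. The helper for_loop is hypothetical, and the model assumes the conventional BASIC rule that the loop continues while the variable has not passed the end value; it is meant for the case described above, where the step leads exactly onto the end value.

```python
def for_loop(start, end, step=1):
    """Return (values seen inside the loop body, final value of the loop variable)."""
    if step == 0:
        raise ValueError("step must not be zero")
    seen = []
    i = start
    # Continue while the variable has not passed the end value in the step direction.
    while (i <= end) if step > 0 else (i >= end):
        seen.append(i)
        i += step
    return seen, i
```

For FOR I = 5 TO 1 STEP -1 the body sees 5, 4, 3, 2, 1 and the variable finishes at 0, which is the end value plus the step value.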

Conditionals

Conditionals are implemented using the IF-THEN-ELSE statement. The expression is evaluated and the appropriate statements executed depending upon the result of the evaluation. If a positive integer is supplied as the THEN or the ELSE statement, a branch will be performed to the indicated line number.

Note that the ELSE clause is optional and may be omitted. In this case, the THEN branch is taken if the expression evaluates to true, otherwise the next statement is executed.

Conditional branching example:

> 10 REM PRINT THE GREATEST NUMBER
> 20 LET I = 10
> 30 LET J = 20
> 40 IF I > J THEN 50 ELSE 70
> 50 PRINT I
> 60 GOTO 80
> 70 PRINT J
> 80 REM FINISHED
> RUN
20
>

The following code segment is equivalent to the segment above:

> 10 REM PRINT THE GREATEST NUMBER
> 20 LET I = 10
> 30 LET J = 20
> 40 IF I > J THEN PRINT I ELSE PRINT J
> 80 REM FINISHED
> RUN
20
>

A THEN or ELSE can be supplied with multiple statements if they are separated by a colon ":".

> 10 REM PRINT THE GREATEST NUMBER
> 20 LET I = 10
> 30 LET J = 20
> 40 IF I > J THEN LET L = I:PRINT I ELSE LET L = J:PRINT J
> 50 PRINT L
> 80 REM FINISHED
> RUN
20
20
>

Note that if an IF-THEN-ELSE statement is used in a THEN code block, or multiple IF-THEN-ELSE statements are used in either a single THEN or ELSE code block, the block grouping is ambiguous and logical processing may not function as expected. There is no ambiguity when single IF-THEN-ELSE statements are placed within ELSE blocks.

Ambiguous:

> 100 IF I > J THEN IF J >= 100 THEN PRINT "I > 100" else PRINT "Not clear which **IF** this belongs to"

Not Ambiguous:

> 100 IF I < J THEN PRINT "I is less than J" ELSE IF J > 100 THEN PRINT "I > 100" ELSE PRINT "J <= 100"

Allowable relational operators are:

  • '=' (equal, note that in BASIC the same operator is used for assignment)
  • '<' (less than)
  • '>' (greater than)
  • '<=' (less than or equal)
  • '>=' (greater than or equal)
  • '<>' / '!=' (not equal)

The logical operators AND and OR are also provided to allow you to join two or more expressions. The NOT operator can also be given before an expression.

= and <> can also be considered logical operators. However, unlike AND or OR they can't be used to join more than two expressions.

Inputs        AND    OR     =      <> / !=
FALSE FALSE   FALSE  FALSE  TRUE   FALSE
TRUE  FALSE   FALSE  TRUE   FALSE  TRUE
TRUE  TRUE    TRUE   TRUE   TRUE   FALSE

Input   NOT
TRUE    FALSE
FALSE   TRUE

Example:

> 10 a = 10
> 20 b = 20
> 30 IF NOT a > b AND b = 20 OR a >= 5 THEN 60
> 40 PRINT "Test failed!"
> 50 STOP
> 60 PRINT "Test passed!"
> RUN
Test passed!

Expressions can be inside brackets to change the order of evaluation. Compare the output when line 30 is changed:

> 30 IF NOT a > b AND (b = 20 OR a >= 5) THEN 60
> RUN
Test failed!

ON GOTO, ON GOSUB

The ON expr GOTO|GOSUB line1,line2,... statement will call a subroutine or branch to a line number in the list of line numbers corresponding to the ordinal value of the evaluated expr. The first line number corresponds with an expr value of 1. expr must evaluate to an integer value. If expr evaluates to less than 1 or greater than the number of provided line numbers, execution continues on the next statement without making a subroutine call or branch:

> 20 LET J = 2
> 30 ON J GOSUB 100,200,300
> 40 STOP
> 100 REM THE 1ST SUBROUTINE
> 110 PRINT "J is ONE"
> 120 RETURN
> 200 REM THE 2ND SUBROUTINE
> 210 PRINT "J is TWO"
> 220 RETURN
> 300 REM THE 3RD SUBROUTINE
> 310 PRINT "J is THREE"
> 320 RETURN
> RUN
J is TWO
>
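The selection rule amounts to a 1-based lookup with a fall-through for out-of-range values. A minimal Python sketch, using a hypothetical helper named on_target:

```python
def on_target(value, line_numbers):
    """Return the line number selected by ON, or None to fall through."""
    if not isinstance(value, int):
        raise TypeError("ON expression must evaluate to an integer")
    if 1 <= value <= len(line_numbers):
        return line_numbers[value - 1]  # the first entry corresponds to value 1
    return None  # out of range: continue with the next statement
```

With the list 100, 200, 300 a value of 2 selects line 200, while 0 or 4 selects nothing and execution simply carries on.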

It is also possible to call a subroutine depending upon the result of a conditional expression using the IFF function (see Ternary Functions below). In the example below, if the expression evaluates to true, IFF returns a 1 and the subroutine is called, otherwise IFF returns a 0 and execution continues to the next statement without making the call:

> 10 LET I = 10
> 20 LET J = 5
> 30 ON IFF (I > J, 1, 0) GOSUB 100
> 40 STOP
> 100 REM THE SUBROUTINE
> 110 PRINT "I is greater than J"
> 120 RETURN
> RUN
I is greater than J
>

Ternary Functions

As an alternative to branching, ternary functions are provided.

  • IFF(x, y, z) - Evaluates x and returns y if true, otherwise returns z. y and z are expected to be numeric.
  • IF$(x, y$, z$) - As above, but y$ and z$ are expected to be strings.
> 10 LET I = 10
> 20 LET J = 5
> 30 PRINT IF$(I > J, "I is greater than J", "I is not greater than J")
> 40 K = IFF(I > J, 20, 30)
> 50 PRINT K
> RUN
I is greater than J
20

User input

The INPUT statement is used to solicit input from the user (or read input from a file, see File I/O below):

> 10 INPUT A
> 20 PRINT A
> RUN
? 22
22
>

The default input prompt of '? ' may be changed by inserting a prompt string, which must be terminated by a semicolon, thus:

> 10 INPUT "Input a number - "; A
> 20 PRINT A
> RUN
Input a number - 22
22
>

Multiple items may be input by supplying a comma separated list. Input variables will be assigned as many input values as are supplied at run time. If more input values are supplied than there are input variables, excess commas will be left in place. Conversely, if not enough input values are supplied, an error message will be printed and the user will be asked to re-enter the values.

Further, numeric input values must be valid numbers (integers or floating point).

> 10 INPUT "Num, Str, Num: "; A, B$, C
> 20 PRINT A, B$, C
> RUN
Num, Str, Num: 22, hello!, 33
22 hello!33
>

A mismatch between the input value and input variable type will trigger an error, and the user will be asked to re-enter the values.

It is a limitation of this BASIC dialect that it is not possible to assign input values directly to array variables within an INPUT statement, only simple variables.

File Input/Output

Data can be read from or written to files using the OPEN, FSEEK, INPUT, PRINT and CLOSE statements.

When a file is opened using the syntax OPEN "filename" FOR INPUT|OUTPUT|APPEND AS #filenum [ELSE linenum] a file number (#filenum) is assigned to the file, which if specified as the first argument of an INPUT or PRINT statement, will direct the input or output to the file.

If there is an error opening a file and the optional ELSE option has been specified, program control will branch to the specified line number. If the ELSE has not been provided, an error message will be displayed.

If a file that does not exist is opened for OUTPUT, the file will be created. If the file does exist, its contents will be erased and any new PRINT output will replace it. If a file is opened for APPEND, an error will occur if the file doesn't exist (or the ELSE branch will occur if specified). If the file does exist, any PRINT statements will add to the end of the file.

If an input prompt is specified on an INPUT statement being used for file I/O (i.e. #filenum is specified) an error will be displayed.

The FSEEK #filenum,filepos statement will position the file pointer for the next INPUT statement.

The CLOSE #filenum statement will close the file.

> 10 OPEN "FILE.TXT" FOR OUTPUT AS #1
> 20 PRINT #1,"0123456789Hello World!"
> 30 CLOSE #1
> 40 OPEN "FILE.TXT" FOR INPUT AS #2
> 50 FSEEK #2,10
> 60 INPUT #2,A$
> 70 PRINT A$
> RUN
Hello World!
>
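For readers more familiar with Python, the same write, seek and read sequence looks roughly like this. It is an analogy only, not the interpreter's file handling code; the function round_trip is hypothetical and uses a temporary directory so that nothing is left on disk.

```python
import os
import tempfile


def round_trip(text, offset):
    """Write one line to a file, then read it back starting at byte `offset`."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "FILE.TXT")
        with open(path, "w") as handle:   # OPEN "FILE.TXT" FOR OUTPUT AS #1
            print(text, file=handle)      # PRINT #1, text
        # leaving the with-block closes the file, like CLOSE #1
        with open(path, "rb") as handle:  # OPEN "FILE.TXT" FOR INPUT AS #2
            handle.seek(offset)           # FSEEK #2, offset
            line = handle.readline()      # INPUT #2, A$
    return line.decode().rstrip("\r\n")
```

The file is read back in binary mode because seeking to an arbitrary offset is only well defined for byte positions in Python.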

Numeric functions

Selected numeric functions are provided, and may be used with any numeric expression. For example, the square root function, SQR, can be applied to expressions consisting of both literals and variables:

> 10 LET I = 6
> 20 PRINT SQR(I - 2)
> RUN
2.0
>

Allowable numeric functions are:

  • ABS(x) - Calculates the absolute value of x

  • ATN(x) - Calculates the arctangent of x

  • COS(x) - Calculates the cosine of x, where x is an angle in radians

  • EXP(x) - Calculates the exponential of x, e^x where e=2.718281828

  • INT(x) - Rounds x down to the largest whole number less than or equal to x

  • LOG(x) - Calculates the natural logarithm of x

  • MAX(x, y[, z]...) - Returns the highest value from a list of expressions

  • MIN(x, y[, z]...) - Returns the lowest value from a list of expressions

> 10 PRINT MAX(-2, 0, 1.5, 4)
> 20 PRINT MIN(-2, 0, 1.5, 4)
> RUN
4
-2
>

  • PI - Returns the value of pi.

  • POW(x, y) - Calculates x to the power y

  • RND(mode) - Pseudorandom number generator. The behavior depends on the value passed. If the value is positive, the result will be a new pseudorandom number N, where 0 <= N < 1. If the value is negative, it will be rounded down to the nearest integer and used to reseed the random number generator, so pseudorandom sequences can be repeated by reseeding with the same number. The generator can also be reset using the RANDOMIZE instruction with an optional seed value, e.g.

> 10 RANDOMIZE 100
> 20 PRINT RND(1)
> RUN
0.1456692551041303
>

Random integers can be generated by combining RND and INT: e.g.

> 10 PRINT INT(RND(1) * 6) + 1
> RUN
3
> RUN
6
>

Seeds may not produce the same result on another platform.

  • RNDINT(lo, hi) - Generates a pseudo random integer N, where lo <= N <= hi. Uses the same seed as above.

  • ROUND(x) - Rounds number to the nearest integer.

  • SIN(x) - Calculates the sine of x, where x is an angle in radians

  • SQR(x) - Calculates the square root of x

  • TAN(x) - Calculates the tangent of x, where x is an angle in radians
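The combination of RND and INT used above to produce random integers, and the repeatability of a seeded sequence, can be reproduced with Python's random module. This is only an analogy: roll_dice and rolls are hypothetical helpers, and the numbers they produce are not claimed to match those of the BASIC interpreter.

```python
import random


def roll_dice(rng):
    """Equivalent in spirit to INT(RND(1) * 6) + 1: an integer from 1 to 6."""
    return int(rng.random() * 6) + 1


def rolls(seed, count):
    """Seed a generator (compare RANDOMIZE seed) and return `count` dice rolls."""
    rng = random.Random(seed)
    return [roll_dice(rng) for _ in range(count)]
```

Because random() returns a value N with 0 <= N < 1, multiplying by 6 and truncating gives 0 to 5, and adding 1 shifts the result into the range 1 to 6. Calling rolls twice with the same seed returns the same list.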

String functions

Some functions are provided to help you manipulate strings. Functions that return a string have a '$' suffix like string variables.

NOTE: For compatibility with older BASIC dialects, all string indexes are 1 based.

The functions are:

  • ASC(x$) - Returns the character code for x$. x$ is expected to be a single character. Note that despite the name, this function can return codes outside the ASCII range.

  • CHR$(x) - Returns the character specified by character code x.

  • INSTR(x$, y$[, start[, end]]) - Returns position of y$ inside x$, optionally start searching at position start and end at end. Returns 0 if no match found.

  • LEN(x$) - Returns the length of x$.

  • LOWER$(x$) - Returns a lower-case version of x$.

  • MID$(x$, y[, z]) - Returns part of x$ starting at position y. If z is provided, that number of characters is returned; if omitted, the entire rest of the string is returned.

  • LEFT$(x$, y) - Returns the left most y characters from string x$. If y exceeds the length of x$, the entire string will be returned.

  • RIGHT$(x$, y) - Returns the right most y characters from string x$. If y exceeds the length of x$, the entire string will be returned.

  • STR$(x) - Returns a string representation of numeric value x.

  • UPPER$(x$) - Returns an upper-case version of x$

  • VAL(x$) - Attempts to convert x$ to a numeric value. If x$ is not numeric, returns 0.

  • TAB(x) - When included in a PRINT statement print-list, specifies the position x on the line where the next text will be printed. If the specified position x is less than the current print position a newline is printed and the print location is set to the specified column. If the TAB function is used anywhere other than on a PRINT statement, it will return a string containing x spaces with no CR/LF

Examples for ASC, CHR$ and STR$

> 10 I = 65
> 20 J$ = CHR$(I) + " - " + STR$(I)
> 30 PRINT J$
> 40 PRINT ASC("Z")
> RUN
A - 65
90

Strings may also be concatenated using the '+' operator:

> 10 PRINT "Hello" + " there"
> RUN
Hello there

Strings may be repeated using the '*' operator:

> 10 PRINT "Hello " * 5
> RUN
Hello Hello Hello Hello Hello
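Since Python strings are 0 based, the 1 based positions used by the string functions need a small adjustment when reasoning about them in Python. The sketch below follows the descriptions in the function list above (MID$ taking a start position and an optional character count, INSTR returning 0 when nothing is found); the Python function names are hypothetical and the optional end argument of INSTR is left out.

```python
def mid(text, start, count=None):
    """MID$: substring from 1-based position `start`, optionally `count` characters."""
    begin = start - 1
    return text[begin:] if count is None else text[begin:begin + count]


def left(text, count):
    """LEFT$: the left-most `count` characters (the whole string if count is larger)."""
    return text[:count]


def right(text, count):
    """RIGHT$: the right-most `count` characters (the whole string if count is larger)."""
    return text[-count:] if count > 0 else ""


def instr(haystack, needle, start=1):
    """INSTR: 1-based position of `needle` in `haystack`, or 0 if it is absent."""
    return haystack.find(needle, start - 1) + 1
```

The + 1 in instr turns Python's "not found" value of -1 into 0 and every real match into a 1 based position in a single step.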

Example programs

A number of example BASIC programs have been supplied in the repository, in the examples directory:

  • regression.bas - A program to exercise the key programming language constructs in such a way as to allow verification that the interpreter is functioning correctly.

  • factorial.bas - A simple BASIC program to take a number, N, as input from the user and calculate the corresponding factorial N!.

  • rock_scissors_paper.bas - A BASIC implementation of the rock-paper-scissors game.

  • PyBStartrek.bas - A port of the 1971 Star Trek text based strategy game.

  • adventure-fast.bas - A port of a 1979 text based Microsoft Adventure game.

  • bagels.bas - A guessing game, which made its first appearance in the book 'BASIC Computer Games' in 1978.

  • eliza.bas - A port of the early chatbot, posing as a therapist, originally created by Joseph Weizenbaum in 1964. This BASIC version can trace its lineage back to an implementation originally developed by Jeff Shrager in 1973.

  • oregon.bas - A port (of a port by the looks of it) of The Oregon Trail. This is a text based adventure game, originally developed by Don Rawitsch, Bill Heinemann, and Paul Dillenberger in 1971. This could still be a bit buggy, as the listing I found wasn't the greatest.

  • life.bas - An implementation of Conway's Game of Life. This version is a port of the BASIC program which appeared in 'BASIC Computer Games' in 1978.

Informal grammar definition

ABS(numerical-expression) - Calculates the absolute value of the result of numerical-expression

ASC(string-expression) - Returns the character code of the result of string-expression.

ATN(numerical-expression) - Calculates the arctangent value of the result of numerical-expression

CHR$(numerical-expression) - Returns the character specified by character code of the result of numerical-expression.

CLOSE #filenum - Closes an open file

COS(numerical-expression) - Calculates the cosine value of the result of numerical-expression

DATA expression-list - Defines a list of string or numerical values

DIM array-variable(dimensions) - Defines a new array variable

EXIT - Exits the interpreter

EXP(numerical-expression) - Calculates the exponential value of the result of numerical-expression

FOR loop-variable = start-value TO end-value [STEP increment] - Bounded loop

FSEEK #filenum,filepos - Positions the file input pointer at the specified location within the open file. The next INPUT #filenum will read starting at file position filepos.

GOSUB line-number - Subroutine call

GOTO line-number - Unconditional branch

IF expression THEN line-number|basic-statement(s) [ELSE line-number|basic-statement(s)] - Conditional

IFF(expression, numeric-expression, numeric-expression) - Evaluates expression and returns the value of the result of the first numeric-expression if true, or the second if false.

IF$(expression, string-expression, string-expression) - Evaluates expression and returns the value of the result of the first string-expression if true, or the second if false.

INPUT [#filenum,|input-prompt;] simple-variable-list - Processes user or file input presented as a comma separated list

INSTR(haystack-string-expression, needle-string-expression[, start-numeric-expression[, end-numeric-expression]]) - Returns the position of the first occurrence of needle-string-expression inside haystack-string-expression, optionally starting the search at the position given by start-numeric-expression and optionally ending at the position given by end-numeric-expression. Returns 0 if no match is found.

LEFT$(string-expression, char-count) - Takes the result of string-expression and returns the left-most char-count characters. If char-count exceeds string length the entire string is returned.

LEN(string-expression) - Returns the length of the result of string-expression

[LET] variable = numeric-expression | string-expression - Assigns a value to a simple variable or array variable

LIST - Lists the program

LOAD filename - Loads a program from disk

LOWER$(string-expression) - Returns a lower-case version of the result of string-expression.

LOG(numerical-expression) - Calculates the natural logarithm value of the result of numerical-expression

NEW - Clears the program from memory

NEXT loop-variable - See FOR statement

MAX(expression-list) - Returns the highest value in expression-list

MID$(string-expression, start-position[, end-position]) - Takes the result of string-expression and returns part of it, starting at position start-position, and ending at end-position. end-position can be omitted to get the rest of the string. If start-position or end-position are negative, the position is counted backwards from the end of the string.

MIN(expression-list) - Returns the lowest value in expression-list

ON expression GOSUB|GOTO line-number1,line-number2,... - Conditional subroutine call|branch - Program flow will be transferred either through a GOSUB subroutine call or a GOTO branch to the line number in the list of line numbers corresponding to the ordinal value of the evaluated expr. The first line number corresponds with an expr value of 1. expr must evaluate to an integer value.

OPEN "filename" FOR INPUT|OUTPUT|APPEND AS #filenum [ELSE linenum] - Opens the specified file. Program control is transferred to linenum if an error occurs, otherwise it continues on the next line.

PI - Returns the value of pi

POW(base, exponent) - Calculates the result of raising the base to the power of the exponent

PRINT [#filenum,]print-list - Prints a semicolon separated list of literals or variables to the screen or to a file. Includes a CR/LF by default, but this can be suppressed by ending the statement with a semicolon.

RANDOMIZE [numeric-expression] - Resets random number generator to an unpredictable sequence. With optional seed (numeric expression), the sequence is predictable.

READ simple-variable-list - Reads a set of constants into the list of variables.

REM comment - Internal program documentation

RETURN - Return from a subroutine

RESTORE line-number - Sets the line number that the next READ will start loading constants from. line-number must refer to a DATA statement.

RIGHT$(string-expression, char-count) - Takes the result of string-expression and returns the right-most char-count characters. If char-count exceeds string length, the entire string is returned.

RND(mode) - For mode values >= 0 generates a pseudo random number N, where 0 <= N < 1. For values < 0 reseeds the PRNG

RNDINT(lo-numerical-expression, hi-numerical-expression) - Generates a pseudo random integer N, where lo-numerical-expression <= N <= hi-numerical-expression
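
As a rough illustration of the ranges documented for RND and RNDINT, here is a minimal Python sketch built on the standard `random` module. The function names `rnd` and `rndint` are hypothetical. Two points are assumptions rather than documented behaviour: that the negative mode value itself is used as the seed, and that RND returns the first number of the new sequence after reseeding.

```python
import random


def rnd(mode, rng):
    """RND(mode): reseed when mode < 0, then return N with 0 <= N < 1."""
    if mode < 0:
        rng.seed(mode)  # assumption: the negative value itself is the seed
    return rng.random()


def rndint(lo, hi, rng):
    """RNDINT(lo, hi): integer N with lo <= N <= hi (both ends inclusive)."""
    return rng.randint(lo, hi)


rng = random.Random()
# Reseeding with the same negative value restarts the same sequence.
print(rnd(-1, rng) == rnd(-1, rng))
print(1 <= rndint(1, 6, rng) <= 6)
```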

ROUND(numerical-expression) - Rounds numerical-expression to the nearest integer

RUN - Runs the program

SAVE filename - Saves a program to disk

SIN(numerical-expression) - Calculates the sine value of the result of numerical-expression

SQR(numerical-expression) - Calculates the square root of the expression

STOP - Terminates a program

STR$(numerical-expression) - Returns a string representation of the result of numerical-expression

TAN(numerical-expression) - Calculates the tangent value of the result of numerical-expression

UPPER$(string-expression) - Returns an upper-case version of the result of string-expression

VAL(string-expression) - Attempts to convert the result of string-expression to a numeric value. If it is not numeric, returns 0.
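
The fallback described for VAL (return 0 when the string is not numeric) could look like this in Python. This is a sketch of the documented behaviour only; the helper name `basic_val` is hypothetical, and trying an integer conversion before a float conversion is an assumption.

```python
def basic_val(text):
    """Convert text to a number, or return 0 if it is not numeric."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return 0  # documented fallback for non-numeric strings


print(basic_val("42"), basic_val("3.5"), basic_val("hello"))
```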

Architecture

The interpreter is implemented using the following Python classes:

  • basictoken.py - This implements the tokens that are produced by the lexical analyser. The class mostly defines token categories and provides a simple token pretty printing method.

  • lexer.py - This class implements the lexical analyser. Lexical analysis is performed on one statement at a time, as each statement is entered into the interpreter.

  • basicparser.py - This class implements a parser for individual BASIC statements. This is somewhat inefficient in that statements, for example those in a loop, must be re-parsed every time they are executed. However, such a model allows us to develop an interactive interpreter where statements can be gradually added to the program between runs. Since the parser is oriented to the processing of individual statements, it uses a signalling mechanism (using FlowSignal objects) to indicate to its caller when program level actions are required, such as recording the return address following a subroutine jump. However, the parser does maintain a symbol table (implemented as a dictionary) in order to record the value of variables as they are assigned.

  • program.py - This class implements an actual BASIC program, which is represented as a dictionary. Dictionary keys are statement line numbers and the corresponding value is the list of tokens that make up the statement with that line number. Statements are executed by calling the parser to parse one statement at a time. This class maintains a program counter, an indication of which line number should be executed next. The program counter is incremented to the next line number in sequence, unless executing a statement has resulted in a branch. The parser indicates this by signalling to the program object that calls it using a FlowSignal object.

  • interpreter.py - This class provides the interface to the user. It allows the user to both input program statements and to execute the resulting program. It also allows the user to run commands, for example to save and load programs, or to list them.

  • flowsignal.py - Implements a FlowSignal object that allows the parser to signal a change in control flow. For example, as the result of a jump defined in the statement just parsed (GOTO, conditional branch evaluation), a loop decision, a subroutine call, or program termination. This paradigm of using the parser to simply parse individual statements, the Program object to make control flow decisions and to track execution, and a signalling mechanism to allow the parser to signal control flow changes to the Program object, is used consistently throughout the implementation.
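
The division of labour described above (a program stored as a dictionary keyed by line number, a program counter, and a signal object returned by the statement executor) can be illustrated with a toy model. Everything here is a simplified stand-in: the `Signal` class, the tuple-based statements and the function names are hypothetical and are not the real FlowSignal, BASICParser or Program classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    """Hypothetical stand-in for a FlowSignal: the kind of jump and its target."""
    kind: str                      # "GOTO" or "STOP" in this sketch
    target: Optional[int] = None


def execute_statement(statement, variables, output):
    """Toy 'parser': runs one statement, returns a Signal or None."""
    op = statement[0]
    if op == "LET":
        _, name, compute = statement
        variables[name] = compute(variables)
    elif op == "PRINT":
        output.append(variables[statement[1]])
    elif op == "IFGOTO":
        _, condition, target = statement
        if condition(variables):
            return Signal("GOTO", target)
    elif op == "STOP":
        return Signal("STOP")
    return None


def run(program):
    """Toy 'Program': owns the program counter and acts on signals."""
    variables, output = {}, []
    line_numbers = sorted(program)
    index = 0  # program counter, as an index into the sorted line numbers
    while index < len(line_numbers):
        signal = execute_statement(program[line_numbers[index]], variables, output)
        if signal is None:
            index += 1                                  # fall through to next line
        elif signal.kind == "GOTO":
            index = line_numbers.index(signal.target)   # branch
        elif signal.kind == "STOP":
            break
    return output


program = {
    10: ("LET", "I", lambda v: 1),
    20: ("PRINT", "I"),
    30: ("LET", "I", lambda v: v["I"] + 1),
    40: ("IFGOTO", lambda v: v["I"] <= 3, 20),
    50: ("STOP",),
}
print(run(program))  # prints [1, 2, 3]
```

The point of the split is that the statement executor never touches the program counter; it only reports what should happen, and the loop in `run` decides where to go next.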

Open issues

  • It is not possible to renumber a program. This would require considerable extra functionality.
  • Negative values are printed with a space (e.g. '- 5') in program listings because of tokenization. This does not affect functionality.
  • Decimal values less than one must be expressed with a leading zero (i.e. 0.34 rather than .34)
  • User input values cannot be directly assigned to array variables in an INPUT or READ statement
  • Strings representing numbers (e.g. "10") can actually be assigned to numeric variables in INPUT and READ statements without an error; Python will silently convert them to integers.

License

PyBasic is made available under the GNU General Public License, version 3.0 or later (GPL-3.0-or-later).

pybasic's Issues

NEXT clause ignores its parameter

This (intentionally byzantine) example should print 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, but it ignores the parameter of the NEXT clause and prints "1" three times:

10 FOR I = 1 TO 10
20 FOR J = 1 TO 3
30 PRINT I
40 NEXT I
50 END
60 NEXT J
> run
1
1
1

Add INT command

It would be useful to have an INT command, which is equivalent to math.floor in Python -- it rounds down floating point numbers.

This is particularly useful in combination with RND to create random integers:

> 10 PRINT INT(RND * 6) + 1
> RUN
3
> RUN
6
>

I'll submit a PR to add this command, and to describe it briefly in the README.
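
For reference, the Python equivalent of the proposed behaviour is shown below. The issue states that INT should match math.floor; the function names `basic_int` and `roll_die` are hypothetical and exist only for this illustration.

```python
import math
import random


def basic_int(x):
    """INT as proposed: round down, like math.floor."""
    return math.floor(x)


def roll_die(rng=random):
    """Python counterpart of INT(RND * 6) + 1: an integer from 1 to 6."""
    return basic_int(rng.random() * 6) + 1


print(basic_int(2.7), basic_int(-2.5))  # prints 2 -3
print(roll_die())
```

Note that rounding down differs from truncation for negative numbers: -2.5 becomes -3, not -2.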

Modularisation and Command Hooks (Question)

With all the tasty fixes incoming, I am seriously regretting having to fork the project for Artemis. Pulling from upstream is a bit of a pain currently. Would you be interested in a PR with the following changes?

a) Ability to import PyBasic as a module
b) Ability to add or override commands externally from PyBasic.

Together this would allow me to isolate my modifications from PyBasic and more easily pull in fixes.

I wouldn't have time to look at this right away, but if you are interested, I will add this to my to-do list.

IF-THEN-GOSUB

Placeholder to remind me to add ability to GOSUB from a conditional statement

Corrected pixel values exceed image bit depth

Hi,

I am using PyBasic to correct illumination variation. For some of my images, the output images from PyBasic exceed the bit depth. Is this allowed by PyBasic? It might be better if PyBasic were to clip outliers before output.

Compatibility with vintage basic / Terminal abstraction / Module Packaging

Hi Hi!

This project has been incredible. I've used it as a base for a RP2040 based portable computer that runs basic based on two hardware reference designs:
https://github.com/bobricius/PICOmputer
and
https://www.adafruit.com/product/4818

Along the way I've made some changes and I submitted two PR's that I guessed you may want (Load/Save ASCII and tweaks to make the code micropython/circuitpython compatible while still working in cPython).

Some of the other changes I've made are more substantial and I'm wondering if you'd be interested in them. They may not really fit with what you are doing with this project, so I thought before opening up PR's I'd be wise to post here to get your thoughts.

If you are interested in one or more of these, let me know and I'll open up PR's for them so you can review them. Thanks again for this awesome project; without it I never would have been able to have my fun with portable basic computers!

Terminal Abstraction

To work with the LCD display and keyboard of the hardware devices above, I had to change prints/inputs to the hardware specific features. A more robust solution here would be to make PyBasic use the python curses library instead, so that hardware specific versions of this library could be used without modifying the base PyBasic code.

Module Packaging

Restructure the code into a PyBasic module which can be cleanly imported into other code bases, enabling its inclusion as a git submodule. This is pretty minor restructuring: just adding a directory to the repo, moving all the .py files there, adding an __init__.py and changing a couple of imports in each file. This would contain everything in a single PyBasic module/namespace.
i.e.
from pybasic import interpreter
interpreter.main()
or
import pybasic
pybasic.interpreter.main()

Compatibility with vintage-y basics

Many computers from the late 70's and early 80's had a similar basic dialect based on MS Basic. There is quite a bit of software available in this dialect and it's very close to PyBasic. See:
http://www.vintage-basic.net/index.html
I've made changes to PyBasic to make it more compatible with vintage basic and can get a PR going if you'd like:

  • Print element separator changed to ;
  • Print will not print cr/lf if final token is ;
  • Input prompt / variable separator changed to ;
  • RND required argument. Negative values reseed with that value, Zero returns last value (not implemented), any positive number returns random float between 0 and 1
  • DIM allows multiple array definitions separated by commas
  • DIM over-dimensions by 1 to handle dialects that are 1 based and expect there to be an element at index = length

These allow the loading of many older basic programs without modification. I'm also working at multi-statement lines and a couple other bits with the goal of loading most of the software linked above without modification.

NOTE: This would be a breaking change with existing PyBasic programs...

Ctrl-D improvement suggestion

Hi again,

I know there is an "exit" command, but Ctrl-D is frequently used as an exit. You could add an extra try/except:

interpreter.py:

while True:
    stmt = input('> ')

Typical exit with Ctrl-D:
[screenshot: ctrld]

For a beautiful exit:

while True:
    try:
        stmt = input('> ')
    except EOFError:
        print("")
        quit()

Incorrect string array initialization

Hi Rich,

I hope you're enjoying the Holidays :-)

I've run into a memory leak which could be in any of CircuitPython, PyDOS or PyBasic. While poking around trying to figure out how to narrow down the possibilities I noticed an issue with PyBasic arrays.

The BASICArray class will initialize array values to zero, however it doesn't have variable type information so string arrays get initialized to zero (rather than presumably a NULL string).

This isn't really a show stopper since an easy work around is to explicitly initialize any string arrays after dimensioning them.

It's been a while since I've worked on PyBasic and I didn't want to throw a solution PR at this without giving it my full attention. My quick thoughts were that a type flag (ie String/Numeric) could be passed into the BASICArray init method so the proper initialization can be done.

This code snippet demonstrates the issue:

      10 DIM A$(10)
      20 T$ = A$(1)
      30 PRINT T$
      > run
      Syntax error: Attempt to assign non string to string variable in line 20

And here it is with the workaround:

      10 DIM A$(10)
      15 FOR I = 1 TO 10
      16 A$(I) = "":NEXT I
      20 T$ = A$(1)
      30 PRINT T$
      > run
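
The type-flag idea mentioned above could be sketched roughly like this. It is a one-dimensional illustration only: the class name `TypedArray`, the `is_string` parameter and the `dim` helper are hypothetical, and the real BASICArray class may be structured differently.

```python
class TypedArray:
    """Hypothetical array whose default element depends on the variable type."""

    def __init__(self, size, is_string=False):
        default = "" if is_string else 0  # empty string for A$, zero for A
        self.values = [default] * size


def dim(name, size):
    """Create an array, inferring the type from a trailing '$' in the name."""
    return TypedArray(size, is_string=name.endswith("$"))


print(dim("A$", 3).values)  # prints ['', '', '']
print(dim("A", 3).values)   # prints [0, 0, 0]
```

Inferring the type from the variable name keeps the call site unchanged, since the caller handling DIM already knows the name being dimensioned.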

Can't save or load files on Linux

On Linux (Ubuntu), I'm trying to load a program I wrote up in a text editor. However, when I try loading it in PyBasic, it just gives me "invalid syntax". The same issue occurs when saving a PyBasic BASIC program.

READ into arrays

Placeholder to remind me to look at READing directly into an array variable

Input is too fragile

At the moment, if you ask for a numeric input and the user mistypes their response, it will halt execution. The same applies if an incorrect number of inputs is provided. This is quite inconvenient as there's no way for your program to internally deal with bad user input and continue.

What do you think of the following alternatives?

Option A: Redo from start

Classic 8-bit BASIC variants dealt with this by printing "REDO FROM START" and offering the input prompt again, until the user enters a valid input. This solves the problem nicely. However the message is not customisable in your program and this isn't multi-lingual friendly.
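
A minimal Python sketch of what Option A might look like inside an interpreter is given below. The function names are hypothetical, and the `read` and `write` parameters are an assumption added so the prompt loop can be exercised without a terminal; this is not PyBasic's current INPUT code.

```python
def parse_number(text):
    """Return an int or float for text, raising ValueError if not numeric."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        return float(text)  # raises ValueError again for non-numeric text


def input_number(prompt="? ", read=input, write=print):
    """Keep prompting until the user supplies a valid number."""
    while True:
        try:
            return parse_number(read(prompt))
        except ValueError:
            write("REDO FROM START")
```

Making the message a parameter instead of a constant would be one way to address the customisation and multi-lingual concerns raised above.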

Option B: Default value

If an input is missing or invalid, set some default value instead. Probably 0 or -1 for numeric. Strings would be empty: ""

Option C: Don't assign values

If an input is missing or invalid, don't even attempt to assign the value. It looks like PyBasic used to work close to this... The advantage is you can specify any default you want by pre-setting the variable before passing it in to INPUT. The downside is if you don't pre-set it, you are opening yourself to errors later in your code. This isn't my favourite, but I do think it's better than how things currently are.

Option D: Change INPUT syntax

Only allow collection of one variable at a time. That would also leave space for optional parameters that tell INPUT how to act when given invalid numeric input. e.g. customise the default value, or goto a line number on failure etc.

Option E: Keep current behaviour and document pitfalls

My least favourite approach. At minimum, the manual ought to tell you that the only safe INPUT is a single string, as anything else may cause your program to fall over depending on what the user enters.

Execute Files Directly

Hello! I wanted to add a script to the repo to allow directly running .bas files without having to interact with the REPL. I'd like to be able to use PyBasic on embedded devices and this would be a requirement to be able to invoke the interpreter without user input.

Here is sample code to do this located in run.py

Let me know if I can open a PR to add this functionality.

Thanks for reading!

import argparse
from lexer import Lexer
from program import Program

def main():
    """
    Main function to run the BASIC interpreter.
    """
    parser = argparse.ArgumentParser(description="BASIC Interpreter")
    parser.add_argument("filename", help="Name of the BASIC program file to load and execute")
    args = parser.parse_args()

    lexer = Lexer()
    program = Program()

    try:
        program.load(args.filename)
        program.execute()
    except Exception as e:
        print("An error occurred:", e)

if __name__ == "__main__":
    main()

PORT access

How can I access a physical port and the serial port?

It would be nice to be able to read or write I/O and access the serial port.

Best regards
Georges

License for PyBasic?

Hi @richpl, thanks so much for creating PyBasic! I'm setting up an old laptop for my niece to teach her programming the way I first learned it about 30 years ago, and this looks perfect to use for an interpreter. :)

I was wondering, is there a particular open source license for PyBasic? I'm happy to submit a PR to add one (whichever you would prefer, GPLv2/v3 or MIT or anything else in between), just wanted to make sure there was a license also in case I come across issues to submit PRs for. Thank you again!

PyBasic Fork

Hello! Thank you for releasing the source of PyBasic under an open license.

I have forked PyBasic for my fantasy computer side-project. Unfortunately, a fork is necessary as I have to add commands for manipulating the 'screen' etc. I have cheekily dubbed the fork "JiBASIC" because I couldn't resist the pun.

But I am tempted to submit a patch to PyBasic to help disentangle print() from the BASICParser and Program classes. This will help me pull upstream fixes and changes from PyBasic, and allow me to push fixes and more generic additions back here more easily. Would this be something you'd be interested in?

On the subject of pushing upstream, I've already added a version of the CHR$() function, and am planning to add convenience functions for string manipulation and more, interested in those? I'd understand if the scope of your project is limited to the basics and you don't want to maintain my feature creep.

Here's a little animation of my WIP

[animation: arti_2]

Logical operators?

How do you feel about adding the logical operators AND, OR, NOT and XOR? Even though there's no bools proper, this would be useful for compacting IF statements.

I'd like to do this the way Locomotive BASIC and similar dialects dealt with it. Logical and relational operators are treated the same as arithmetic operators, with the order of operations: logical, relational, addition/sub, multiplication/div. Logical and relational operators output 1 for true, 0 for false. Logical operators treat non-0 inputs as true.
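
The truth-value convention proposed here (1 for true, 0 for false, any non-zero input counts as true) can be modelled in a few lines of Python. The helper names are hypothetical and the sketch covers only the value semantics, not parsing or operator precedence.

```python
def to_basic(flag):
    """Map a Python bool to the proposed BASIC truth values."""
    return 1 if flag else 0


def basic_and(a, b):
    return to_basic(a != 0 and b != 0)


def basic_or(a, b):
    return to_basic(a != 0 or b != 0)


def basic_xor(a, b):
    return to_basic((a != 0) != (b != 0))


def basic_not(a):
    return to_basic(a == 0)


def less_than(a, b):
    return to_basic(a < b)


def equal(a, b):
    return to_basic(a == b)


# Python model of: IF (A < B) AND NOT (C = 0) THEN ...
a, b, c = 1, 2, 5
print(basic_and(less_than(a, b), basic_not(equal(c, 0))))  # prints 1
```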

Two possible bugs

Hello,

Great project, thank you so much! I've ported your interpreter to micropython and included it in my PyDOS release https://github.com/RetiredWizard/PyDOS.

During the port I ran across a couple issues that I believe would affect your cPython version as well.

The first one was that numbers at the end of variable names seemed to cause a problem.

The second one was that there seemed to be a problem with loop variables caused when a for loop was exited using a goto statement and then the loop was reinitiated.

I tweaked my ported version to resolve the first issue and deal with the second issue, although I don't think my solution to the loop problem is complete.

Thanks again!

Bug in sample program in README

One of the sample programs in the README dealing with arrays has an error:

> 10 DIM MYARRAY(2, 2, 2)
> 20 LET MYARRAY(0, 1, 0) = 56
> 30 PRINT A(0, 1, 0)
> RUN
56
> 30 PRINT A(0, 0, 0)
> RUN
Empty array value returned in line 30

The program initializes the array as MYARRAY but then tries to address it with the name A in line 30. I'll submit a PR to fix this.

Star Trek: Get PyBasic port closer to original

The current version of Star Trek in this repo is a port of a port and has lost some of the elegance of the original code, particularly the clever use of on-gosub for command parsing.

Now that PyBasic is much more compatible with older dialects, it might be nice to get the PyBasic port of Star Trek closer to the original.

Either:
Pull the version in the PyDos repo at
https://github.com/RetiredWizard/PyDOS/blob/main/PyBasic/startrek.bas
and re-implement the REM'd out on-gosubs bringing the version in this repo closer to the original

OR

Start from the original code and do a fresh port. This will be an interesting test of PyBASIC compatibility!

Don't stop PyBasic after a BASIC error?

Hi,

Thanks for creating PyBasic! I'm not really reporting an issue here; I'd just like to talk about something I'd like to see in a future version of PyBasic:
At the moment, when there's an error in the BASIC program, the Python code raises an error, and the Python script stops, exiting to the console.
Now, I'm trying to combine PyBasic with Pygame, simulating the graphics of the C64 and such. The goal is, that in the end one (finally) can write BASIC programs in this simulation that are supposed to run about as fast as assembly programs on the 8-bit-computers.
So I'd like to have a cursor in the simulation, and the user should be able to input the BASIC code. And when there's an error, the Python/Pygame environment shouldn't stop. Instead the user interface should for example print

?SYNTAX  ERROR
READY.

and continue running, waiting for the next user input.
At the moment, as far as I can see, there are 87 raise-commands in PyBasic that stop the Python script in case of a BASIC error. Instead, I'd like it to print an error message to the console, and then to go on printing the next input prompt accepting more input. Would that be possible? It seems, I can't dig deep enough into your code yet to realize it myself.
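
One way to get this behaviour without touching each individual raise is to catch exceptions once, at the top-level loop. The sketch below is an assumption about how that could look, not PyBasic's actual loop: `repl` and its `execute`, `read` and `write` parameters are hypothetical names, and the "?" prefix and READY. line simply mirror the C64-style output quoted above.

```python
def repl(execute, read=input, write=print):
    """Read-execute loop that reports errors and keeps running."""
    while True:
        try:
            line = read("> ")
        except EOFError:          # e.g. Ctrl-D at the prompt
            write("")
            break
        if line.strip().upper() == "EXIT":
            break
        try:
            execute(line)         # may raise on a BASIC error
        except Exception as error:
            write("?" + str(error))
            write("READY.")
```

Here `execute` stands for whatever callable runs one line of input; a graphical front end such as a Pygame window could pass its own `read` and `write` functions instead of the console defaults.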

Also, I'd like to see BASIC commands without line numbers being executed directly. You know, so that PyBasic accepts a simple

PRINT "Hello"

which it doesn't at the moment. That would also help a lot.

Thank you for your attention.

Can't input float number

When I try this program:

10 INPUT A
20 PRINT A

I input 2.34, and the error is displayed:
String input provided to a numeric variable

I'm not sure if my program is correct.

I make a simple modification:
basicparser.py line: 451

                elif not left.endswith('$'):
                    try:
                        self.__symbol_table[left] = int(right)
                    except ValueError:
                        try:
                            self.__symbol_table[left] = float(right)
                        except ValueError:
                            raise ValueError('String input provided to a numeric variable ' +
                                         'in line ' + str(self.__line_number))

Is there any better solution?

string multiplication with *

In the example program PyBStartrek.bas is the line:

1413 Q$=" "*192

I did not find a definition for this kind of string multiplication. Neither in the manual of GW-Basic nor in the manuals of QuickBasic, QBasic or TI-99 extended Basic. But maybe I overlooked it.

There is a page about repeating a string at Rosetta Code. None of the BASIC examples uses * for string multiplication. In almost all BASIC examples a loop is used. In some BASIC examples there is a function for string multiplication:

BBC BASIC has STRING$(5, "ha")
PowerBasic has REPEAT$(5, "ha")
TI-99 Extended BASIC has RPT$("ha", 5)

So my question is: Which historic BASIC defines this kind of string multiplication?

If there is no historic BASIC with * as string multiplication, I suggest rewriting line 1413 of the PyBStartrek.bas example. I also suggest removing this functionality from PyBasic and reporting a syntax error such as:

***** [1413] UNEXPECTED SYMBOL "*".

instead.
If there is some historic BASIC with * as string multiplication please tell me such that I can add it to my Bas7 BASIC interpreter (which is also for historic BASIC programs).
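
For context, Python itself defines * between a string and an integer as repetition, which may be where this behaviour comes from in an interpreter hosted on Python; that link is an assumption and is not confirmed anywhere in this thread. An explicit repeat helper in the style of the functions listed above could be sketched as follows, with the hypothetical name `repeat_string`:

```python
def repeat_string(text, count):
    """Return text repeated count times, rejecting non-integer counts."""
    if not isinstance(count, int) or count < 0:
        raise ValueError("count must be a non-negative integer")
    return text * count  # Python's built-in string repetition


print(len(repeat_string(" ", 192)))  # prints 192
print(repeat_string("ha", 5))        # prints hahahahaha
```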

INPUT into arrays

Placeholder to add support for user input directly into an array variable
