Building an AppleSoft BASIC Interpreter in LiveCode

Developing an AppleSoft BASIC Interpreter in LiveCode, Part 1

This article documents the architectural path taken to build a historically faithful AppleSoft BASIC interpreter using LiveCode. It is intended for developers interested in interpreter design, language emulation, and the disciplined reproduction of legacy language behavior within a modern execution environment.
Rather than cataloging debugging challenges, this discussion focuses on structure, sequencing, and design decisions that enabled a stable, non-graphics AppleSoft implementation.

Project Scope and Intent
The project’s core objective was behavioral fidelity, not nostalgia or convenience. AppleSoft BASIC is deceptively simple on the surface, but its semantics contain many implicit rules that must be preserved to achieve correctness.
Key goals included:

Exact replication of AppleSoft language semantics
Preservation of execution order and error timing
Avoidance of host-language shortcuts
A clean architectural foundation for future graphics support

LiveCode was selected as the host language for its strong text processing capabilities, message-driven execution model, and ease of building an interactive development UI. The trade-off is that strict discipline is required to keep host behavior from leaking into the interpreted language.

Interpreter Architecture Overview
The interpreter is structured as a pipeline of clearly separated stages:

Tokenization
Statement segmentation
Expression compilation (RPN)
Expression evaluation
Command dispatch
Program execution loop

Each stage operates on well-defined data structures and avoids implicit coupling with the others. This separation proved essential for correctness, for maintainability, and for locking behavior down with regression tests.

Tokenization Strategy
Tokenization is performed early and aggressively. Each source line is converted into a normalized token stream before execution begins.
Token classes include:

Keywords
Symbols
Operators
Numeric and string variables
String literals

String literals are detected and preserved before any spacing or symbol normalization occurs, ensuring that quoted content is never reinterpreted or fragmented later in the pipeline.
Conceptually, a source line such as:

10 PRINT "HELLO";A+1

normalizes into a form similar to:

PRINT STR "HELLO" SYM ; VAR A SYM + 1

This normalization allows downstream logic to remain deterministic and free of ad-hoc parsing rules.
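
As a rough illustration of the "string literals first" rule, here is a minimal sketch in Python (used purely for illustration; the project itself is written in LiveCode). The keyword list, the class names `KW`, `NUM`, `VAR`, `STR`, and `SYM`, and the `tokenize` function are assumptions made for this example, not the interpreter's actual code. It also assumes the line number has already been stripped and that keywords are separated by spaces, which real AppleSoft does not require.

```python
import re

KEYWORDS = {"PRINT", "INPUT", "FOR", "NEXT", "GOTO", "IF", "THEN"}  # illustrative subset

# Order matters: the string-literal alternative comes first, so quoted
# content is captured whole before any other rule can split it.
TOKEN_RE = re.compile(r'''
    (?P<STR>"[^"]*"?)                      # string literal, closing quote optional
  | (?P<NUM>\d+\.?\d*|\.\d+)               # numeric literal
  | (?P<WORD>[A-Za-z][A-Za-z0-9]*[$%]?)    # keyword or variable name
  | (?P<SYM>[^\sA-Za-z0-9"])               # any other single non-space character
''', re.VERBOSE)

def tokenize(line):
    """Return a list of (class, text) pairs for one statement line."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        kind, text = match.lastgroup, match.group()
        if kind == "WORD":
            text = text.upper()  # normalize case outside of string literals only
            kind = "KW" if text in KEYWORDS else "VAR"
        tokens.append((kind, text))
    return tokens
```

With this sketch, `tokenize('PRINT "HELLO";A+1')` yields the pairs `KW PRINT`, `STR "HELLO"`, `SYM ;`, `VAR A`, `SYM +`, `NUM 1`, and a colon or operator inside quotes stays inside its `STR` token.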

Statement Segmentation
AppleSoft allows multiple statements per line using colons. After tokenization, each line is segmented into top-level statements while respecting string and parenthesis boundaries.
This enables precise handling of:

Inline IF/THEN execution
STOP / CONT resume points
Error recovery via RESUME

Each statement becomes an independently executable unit with its own execution context.
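
One way to picture the segmentation step is the Python sketch below (illustrative only; the project itself is written in LiveCode). It assumes tokens arrive as hypothetical `(class, text)` pairs in which string literals already carry their own `STR` class, so a colon inside quotes is never seen as a separator. The `split_statements` name is invented for this example.

```python
def split_statements(tokens):
    """Split one line's tokens into statements at top-level ':' symbols."""
    statements, current, depth = [], [], 0
    for kind, text in tokens:
        if kind == "SYM" and text == "(":
            depth += 1
        elif kind == "SYM" and text == ")":
            depth = max(0, depth - 1)
        if kind == "SYM" and text == ":" and depth == 0:
            statements.append(current)   # close the current statement
            current = []
        else:
            current.append((kind, text))
    statements.append(current)           # the last statement has no trailing colon
    return statements
```

Because each statement ends up with a stable index within its line, a resume point can be recorded as a (line, statement) pair rather than a character offset.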

Expression Compilation Using RPN
Expressions are never evaluated directly from infix form. All expressions are first compiled into Reverse Polish Notation (RPN) using a Shunting Yard implementation tailored to AppleSoft rules.
Key characteristics:

AppleSoft operator precedence is preserved
Exponentiation is right-associative
Unary operators are explicitly identified
Logical operators map to 16-bit truth values

Integer division, which is an extension rather than part of original AppleSoft, is translated into an internal opcode rather than relying on host parsing:

A \ B → IDIV
This avoids host-language ambiguity and guarantees correct truncation semantics.
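
The following Python sketch shows the general shape of a Shunting Yard pass with right-associative exponentiation and the `\` to `IDIV` translation described above. It is illustrative only: the project is written in LiveCode, the precedence numbers and the `to_rpn` name are assumptions for this example, and unary operators, functions, comparisons, and logical operators are left out to keep it short.

```python
# Precedence values are assumptions for illustration; higher binds tighter.
PREC = {"^": 4, "*": 3, "/": 3, "\\": 3, "+": 2, "-": 2}
RIGHT_ASSOC = {"^"}
OPCODE = {"\\": "IDIV"}  # the extension operator is emitted as an internal opcode

def to_rpn(tokens):
    """Convert a list of infix tokens (strings) into an RPN list."""
    output, stack = [], []
    for tok in tokens:
        if tok in PREC:
            # Pop operators that bind tighter, or equally tight if tok is left-associative.
            while stack and stack[-1] in PREC and (
                PREC[stack[-1]] > PREC[tok]
                or (PREC[stack[-1]] == PREC[tok] and tok not in RIGHT_ASSOC)
            ):
                op = stack.pop()
                output.append(OPCODE.get(op, op))
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                op = stack.pop()
                output.append(OPCODE.get(op, op))
            if not stack:
                raise SyntaxError("unbalanced parentheses")
            stack.pop()  # discard the matching "("
        else:
            output.append(tok)  # operand: number or variable name
    while stack:
        op = stack.pop()
        if op == "(":
            raise SyntaxError("unbalanced parentheses")
        output.append(OPCODE.get(op, op))
    return output
```

For example, `to_rpn(["A", "\\", "B"])` returns `["A", "B", "IDIV"]`, and `to_rpn(["2", "^", "3", "^", "2"])` returns `["2", "3", "2", "^", "^"]`, which groups as 2^(3^2).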

Expression Evaluation
The RPN evaluator operates on a simple, explicit stack model. All values, including numbers, strings, array elements, and intermediate results, flow through the same mechanism.
Design constraints include:

Boolean true represented as -1
Uniform resolution of scalars, arrays, and functions
Bounds checking aligned with AppleSoft error rules
Errors raised during evaluation, not compilation

The strict separation between compilation and evaluation makes the interpreter considerably easier to test and to reason about, because each half can be verified on its own.
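
A stack evaluator of this kind can be sketched in a few lines of Python (illustrative only; the project itself is written in LiveCode). The `eval_rpn` name, the RPN item format (Python numbers, variable-name strings, operator strings), and the assumption that `IDIV` truncates toward zero are choices made for this example. Only numeric binary operators are covered, and the division-by-zero check happens at evaluation time, in line with the constraint that errors are raised during evaluation rather than compilation.

```python
import math

BINARY_OPS = {"+", "-", "*", "/", "^", "IDIV"}

def eval_rpn(rpn, variables):
    """Evaluate an RPN list of numbers, variable names, and operator strings."""
    stack = []
    for tok in rpn:
        if isinstance(tok, (int, float)):
            stack.append(tok)
        elif tok in BINARY_OPS:
            right, left = stack.pop(), stack.pop()  # right operand is on top
            if tok in ("/", "IDIV") and right == 0:
                raise ZeroDivisionError("?DIVISION BY ZERO ERROR")
            if tok == "+":
                stack.append(left + right)
            elif tok == "-":
                stack.append(left - right)
            elif tok == "*":
                stack.append(left * right)
            elif tok == "/":
                stack.append(left / right)
            elif tok == "^":
                stack.append(left ** right)
            else:  # IDIV: assumed here to truncate toward zero
                stack.append(math.trunc(left / right))
        else:
            # Variable reference; unset numeric variables read as 0.
            stack.append(variables.get(tok, 0))
    return stack.pop()
```

For example, `eval_rpn([1, 2, 3, "*", "+"], {})` returns 7, and `eval_rpn(["A", 2, "IDIV"], {"A": 7})` returns 3.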

Command Dispatch Model
Each statement begins with a command token that maps directly to a handler:

PRINT → handlePrint
FOR → handleFor
INPUT → handleInput
Handlers are intentionally narrow in scope. They implement only command semantics and delegate expression handling, variable resolution, and control-flow transitions to shared utilities.
This isolation keeps command logic readable and minimizes unintended interactions.
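
The dispatch idea can be illustrated with a plain lookup table, shown here in Python (illustrative only; the project itself is written in LiveCode). The handler bodies, the `state` dictionary, and the `dispatch` function are hypothetical stand-ins, and `handle_print` and `handle_goto` merely echo the naming pattern of the handlers listed above.

```python
def handle_print(args, state):
    # A real handler would delegate to the shared expression evaluator.
    state["output"].append(" ".join(args))

def handle_goto(args, state):
    # Control-flow changes are recorded in state, not performed directly.
    state["next_line"] = int(args[0])

HANDLERS = {
    "PRINT": handle_print,
    "GOTO": handle_goto,
}

def dispatch(statement, state):
    """Route a statement (command token followed by arguments) to its handler."""
    command, *args = statement
    handler = HANDLERS.get(command)
    if handler is None:
        raise SyntaxError("?SYNTAX ERROR")
    handler(args, state)
```

Adding a command then means adding one table entry and one narrow handler, without touching the dispatch logic itself.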

Control Flow and Execution State
Execution is driven by a cooperative stepping loop rather than a blocking run. Interpreter state is tracked explicitly:

Current line index
Current statement index
FOR/NEXT stack
GOSUB/RETURN stack
STOP/CONT resume information

This explicit state model is essential for reproducing AppleSoft behavior and for maintaining UI responsiveness within LiveCode.
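
To make the stepping model concrete, here is a small Python sketch (illustrative only; the project itself is written in LiveCode). The `Stepper` class, its field names, and the tuple-based program format are assumptions for this example. Each call to `step` executes exactly one statement and returns, which is what would let a host event loop stay responsive between statements. Only `PRINT`, `GOSUB`, `RETURN`, and `END` are handled, and every line is assumed to hold at least one statement.

```python
class Stepper:
    def __init__(self, program):
        self.program = program    # list of (line_number, [statement, ...])
        self.line_index = 0       # current line index
        self.stmt_index = 0       # current statement index within the line
        self.gosub_stack = []     # saved (line_index, stmt_index) return points
        self.output = []
        self.running = True

    def _advance(self):
        self.stmt_index += 1
        if self.stmt_index >= len(self.program[self.line_index][1]):
            self.line_index, self.stmt_index = self.line_index + 1, 0

    def _jump(self, line_number):
        for index, (number, _) in enumerate(self.program):
            if number == line_number:
                self.line_index, self.stmt_index = index, 0
                return
        raise LookupError("?UNDEF'D STATEMENT ERROR")

    def step(self):
        """Execute one statement; return False once the program has stopped."""
        if not self.running or self.line_index >= len(self.program):
            self.running = False
            return False
        command, *args = self.program[self.line_index][1][self.stmt_index]
        self._advance()  # point at the following statement before executing
        if command == "PRINT":
            self.output.append(args[0])
        elif command == "GOSUB":
            self.gosub_stack.append((self.line_index, self.stmt_index))
            self._jump(args[0])
        elif command == "RETURN":
            self.line_index, self.stmt_index = self.gosub_stack.pop()
        elif command == "END":
            self.running = False
        return self.running
```

Because the return point is stored as a (line, statement) pair, a `GOSUB` in the middle of a multi-statement line resumes at the statement after it rather than at the start of the next line.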

Variables, Arrays, and Memory Model
Variables are stored in a unified dictionary keyed by canonicalized names. Arrays are managed through a dedicated bounds table recording rank and dimensions.
Supported behaviors include:

Automatic array dimensioning on first use
Correct 0-based indexing
Separate handling for 1D and 2D arrays
Bounds checking with AppleSoft-correct errors
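
The bounds-table idea can be sketched as follows in Python (illustrative only; the project itself is written in LiveCode). The `ArrayStore` class and its method names are hypothetical. The default upper bound of 10 reflects AppleSoft's behavior for arrays used without a prior DIM; the specific error raised in each case is this sketch's assumption about how the checks might be arranged.

```python
class ArrayStore:
    DEFAULT_MAX_INDEX = 10  # undeclared arrays get subscripts 0..10 per dimension

    def __init__(self):
        self.bounds = {}    # name -> tuple of maximum subscripts, one per dimension
        self.values = {}    # (name, subscripts) -> stored value

    def dim(self, name, *max_indexes):
        if name in self.bounds:
            raise RuntimeError("?REDIM'D ARRAY ERROR")
        self.bounds[name] = tuple(max_indexes)

    def _check(self, name, subscripts):
        if name not in self.bounds:
            # Automatic dimensioning on first use.
            self.bounds[name] = (self.DEFAULT_MAX_INDEX,) * len(subscripts)
        limits = self.bounds[name]
        if any(s < 0 for s in subscripts):
            raise ValueError("?ILLEGAL QUANTITY ERROR")
        if len(limits) != len(subscripts) or any(
            s > limit for s, limit in zip(subscripts, limits)
        ):
            raise IndexError("?BAD SUBSCRIPT ERROR")

    def get(self, name, *subscripts):
        self._check(name, subscripts)
        return self.values.get((name, subscripts), 0)

    def set(self, name, value, *subscripts):
        self._check(name, subscripts)
        self.values[(name, subscripts)] = value
```

Keeping bounds in a separate table means the check runs the same way for reads and writes, and for one-dimensional and two-dimensional arrays.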

POKE and PEEK operate through a virtual memory abstraction, providing compatibility without exposing host memory directly.

INPUT, DATA, and Program I/O
INPUT follows AppleSoft’s two-phase model:

Validate all input values
Assign values atomically

This prevents partial assignment and ensures correct reprompt behavior.
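
The validate-then-assign pattern might look like the Python sketch below (illustrative only; the project itself is written in LiveCode). The function names are hypothetical, and the sketch simplifies real INPUT behavior: a field-count mismatch is simply treated as invalid here, and numeric parsing uses Python's `float`, which is not identical to AppleSoft's number syntax.

```python
def parse_input(raw, targets):
    """Phase 1: validate every field. Return the parsed values, or None on failure."""
    fields = [field.strip() for field in raw.split(",")]
    if len(fields) != len(targets):
        return None
    values = []
    for name, field in zip(targets, fields):
        if name.endswith("$"):
            values.append(field)          # string variables accept any text
        else:
            try:
                values.append(float(field))
            except ValueError:
                return None               # one bad field rejects the whole response
    return values

def do_input(raw, targets, variables):
    """Phase 2: assign only after every field has validated."""
    values = parse_input(raw, targets)
    if values is None:
        return False                      # caller reprompts; variables are untouched
    variables.update(zip(targets, values))
    return True
```

With `variables = {"A": 1.0}`, calling `do_input("5,abc", ["A", "B"], variables)` returns `False` and leaves `A` at 1.0, since the invalid second field blocks the assignment of the first.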
DATA statements are pre-scanned into a data pool prior to execution, enabling accurate READ and RESTORE behavior regardless of control flow.
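
A pre-scanned data pool can be sketched like this in Python (illustrative only; the project itself is written in LiveCode). The `DataPool` class and the tuple-based program format are assumptions for this example. The pool is built from the program text in line-number order, so READ never depends on which DATA statements the execution path has actually passed through.

```python
class DataPool:
    def __init__(self, program):
        # program: list of (line_number, [statement, ...]); a DATA statement
        # is represented as ("DATA", value, value, ...)
        self.items = []      # every DATA value, in line-number order
        self.pointer = 0     # index of the next value READ will return
        for _, statements in sorted(program, key=lambda entry: entry[0]):
            for command, *args in statements:
                if command == "DATA":
                    self.items.extend(args)

    def read(self):
        if self.pointer >= len(self.items):
            raise RuntimeError("?OUT OF DATA ERROR")
        value = self.items[self.pointer]
        self.pointer += 1
        return value

    def restore(self):
        self.pointer = 0     # RESTORE rewinds to the first DATA value
```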

Error Handling Architecture
Errors are centralized and standardized:

Correct AppleSoft error codes and messages
Accurate line attribution and timing
Support for ONERR GOTO
Full RESUME and RESUME NEXT semantics

Error state is tracked explicitly to prevent recursive faults and to support controlled recovery.
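
One possible shape for that explicit error state is sketched below in Python (illustrative only; the project itself is written in LiveCode). The `ErrorState` class, its field names, and the rule that an error inside an active handler is not trapped again are assumptions for this example. Only plain RESUME is modeled, which returns to the statement that failed.

```python
class ErrorState:
    def __init__(self):
        self.onerr_target = None   # line number set by ONERR GOTO, or None
        self.in_handler = False    # True while the error routine is running
        self.fault_point = None    # (line_index, stmt_index) of the failing statement
        self.last_code = None      # error code of the most recent trapped error

    def raise_error(self, code, position):
        """Return the line number to jump to, or None if execution must stop."""
        if self.onerr_target is None or self.in_handler:
            return None            # untrapped, or a fault inside the handler itself
        self.in_handler = True
        self.fault_point = position
        self.last_code = code
        return self.onerr_target

    def resume(self):
        """RESUME: leave the handler and return the position to re-execute."""
        position = self.fault_point
        self.fault_point = None
        self.in_handler = False
        return position
```

Recording the fault position as a (line, statement) pair is what allows recovery to land on the exact statement that failed, even in the middle of a multi-statement line.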

Immediate Mode Support
Immediate mode commands and expressions are processed using the same tokenization, compilation, and evaluation pipeline as program execution. This guarantees consistent behavior between immediate and program modes.
UI interaction is carefully decoupled from interpreter state to preserve execution integrity.

Regression-Driven Development
Features were locked through regression tests rather than incremental patching. Once a language feature passed regression, its behavior was frozen.
This approach proved essential in managing complexity and preventing semantic drift over time.

Current Status and Next Phase
Version 2.9.3 represents a complete freeze of all non-graphics AppleSoft BASIC functionality. The interpreter now serves as a stable, validated execution core.
The next major phase (v3.x) will focus on AppleSoft graphics, including screen modes and drawing primitives, built directly on this foundation.

Conclusion
Building a faithful AppleSoft BASIC interpreter in LiveCode required deliberate restraint and architectural discipline. The result is not a BASIC-like system, but a historically accurate execution environment suitable for both preservation and expansion.