Programming Language Syntax

Specifying Syntax: Regular Expressions and Context-Free Grammars
Three rules define the syntax of tokens: concatenation, alternation, and Kleene closure (repetition an arbitrary number of times). Any set of strings that can be defined with these three rules is called a regular set, or regular language. Regular sets are generated by regular expressions and recognized by scanners. Adding a fourth rule, recursion, makes it possible to define the rest of the syntax; any set of strings that can be defined with all four rules is called a context-free language (CFL). Context-free languages are generated by context-free grammars (CFGs) and recognized by parsers.
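
To illustrate what the recursion rule adds, here is a minimal Python sketch. It assumes a hypothetical one-rule grammar, balanced -> ( balanced ) balanced | empty, which describes arbitrarily nested parentheses, a language that no regular expression built from the three rules alone can describe. The function names are illustrative only.

```python
from typing import Optional


def match_balanced(text: str, pos: int = 0) -> Optional[int]:
    """Match a prefix derivable from `balanced`, starting at pos.

    Hypothetical grammar:  balanced -> '(' balanced ')' balanced | empty
    Returns the position just past the match, or None on a missing ')'.
    """
    if pos < len(text) and text[pos] == "(":
        inner_end = match_balanced(text, pos + 1)  # the rule refers to itself
        if inner_end is None or inner_end >= len(text) or text[inner_end] != ")":
            return None
        return match_balanced(text, inner_end + 1)
    return pos  # the empty alternative consumes nothing


def is_balanced(text: str) -> bool:
    """True if the whole string can be derived from `balanced`."""
    return match_balanced(text) == len(text)
```

With these definitions, is_balanced("(()())") is true, while is_balanced("(()") and is_balanced(")(") are false.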

Tokens and regular expressions
Tokens are the basic building blocks of a program: the shortest strings of characters that have individual meaning. They include keywords, identifiers, symbols, and constants of various types. A regular expression is one of the following:

A character
The empty string
Two regular expressions next to each other
Two regular expressions separated by a vertical bar ( | )
A regular expression followed by a Kleene star
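
The five forms above can be tried out with Python's re module. This is a small sketch, not a definition of any particular language's tokens: the names DIGIT, LETTER, IDENTIFIER, INTEGER, SIGN, and SIGNED_INTEGER are assumptions chosen for illustration.

```python
import re

# Single characters, combined by alternation (a character class is shorthand
# for 0 | 1 | ... | 9).
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z_]"

# Concatenation, alternation, and the Kleene star together:
# a letter followed by zero or more letters or digits.
IDENTIFIER = rf"{LETTER}(?:{LETTER}|{DIGIT})*"

# One digit followed by zero or more digits.
INTEGER = rf"{DIGIT}{DIGIT}*"

# '+' or '-' or the empty string (the last alternative is empty).
SIGN = r"(?:\+|-|)"
SIGNED_INTEGER = rf"{SIGN}{INTEGER}"


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text belongs to the set described by pattern."""
    return re.fullmatch(pattern, text) is not None
```

For example, matches(IDENTIFIER, "x1") and matches(SIGNED_INTEGER, "-42") are true, while matches(IDENTIFIER, "1x") and matches(SIGNED_INTEGER, "+") are false.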

Context Free Grammar
The rules of a CFG are called productions. The symbols on the left-hand sides of productions are called variables, or non-terminals. The symbols that make up the strings derived from the grammar are called terminals; they cannot appear on the left-hand side of any production. The terminals of a CFG are the language's tokens. The non-terminal on the left-hand side of the first production is, by convention, the start symbol.
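
The terminology can be made concrete with a small sketch in Python. The grammar below is a hypothetical example, and the dictionary layout (each non-terminal mapped to a list of production bodies) is just one possible representation.

```python
# Hypothetical expression grammar. Each key is a non-terminal; each value
# lists the right-hand sides (bodies) of that non-terminal's productions.
GRAMMAR = {
    "expr": [["term", "expr_tail"]],
    "expr_tail": [["+", "term", "expr_tail"], []],  # [] is the empty body
    "term": [["id"], ["number"], ["(", "expr", ")"]],
}


def start_symbol(grammar):
    """The left-hand side of the first production (dicts keep insertion order)."""
    return next(iter(grammar))


def non_terminals(grammar):
    """Every symbol that appears on a left-hand side."""
    return set(grammar)


def terminals(grammar):
    """Every symbol in a body that never appears on a left-hand side."""
    return {
        symbol
        for bodies in grammar.values()
        for body in bodies
        for symbol in body
        if symbol not in grammar
    }
```

Here start_symbol(GRAMMAR) is "expr", and terminals(GRAMMAR) is the set of tokens "+", "id", "number", "(" and ")".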

Derivations and Parse Trees
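A derivation is a series of replacement operations that shows how to derive a string of terminals from the start symbol. In a rightmost derivation, the rightmost non-terminal is replaced at each step; in a leftmost derivation, the leftmost non-terminal is replaced. A parse tree represents a derivation graphically: the root is the start symbol, the leaves are the terminals of the derived string, and each interior node together with its children corresponds to one production.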
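
The difference between leftmost and rightmost derivations can be seen by replaying both for the same string. The sketch below assumes a hypothetical grammar (expr -> expr + term | term, term -> id) and a helper named derive; neither comes from the text above.

```python
# Hypothetical grammar:  expr -> expr + term | term      term -> id
GRAMMAR = {
    "expr": [["expr", "+", "term"], ["term"]],
    "term": [["id"]],
}


def derive(start, bodies, leftmost=True):
    """Apply each production body in turn to the leftmost (or rightmost)
    non-terminal and return the list of sentential forms produced."""
    form = [start]
    forms = [" ".join(form)]
    for body in bodies:
        positions = [i for i, symbol in enumerate(form) if symbol in GRAMMAR]
        i = positions[0] if leftmost else positions[-1]
        if body not in GRAMMAR[form[i]]:
            raise ValueError(f"{form[i]} -> {' '.join(body)} is not a production")
        form = form[:i] + body + form[i + 1:]
        forms.append(" ".join(form))
    return forms


LEFTMOST = derive("expr", [["expr", "+", "term"], ["term"], ["id"], ["id"]])
RIGHTMOST = derive("expr", [["expr", "+", "term"], ["id"], ["term"], ["id"]],
                   leftmost=False)
```

LEFTMOST passes through "term + term" and "id + term", whereas RIGHTMOST passes through "expr + id" and "term + id"; both end at "id + id" and correspond to the same parse tree.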

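Together, the scanner and the parser discover the syntactic structure of a program, which is the first step in translating the program into an equivalent program in the target language. Scanning is also called lexical analysis, and parsing is also called syntax analysis. The scanner reduces the number of items the parser must inspect by grouping characters into tokens and by removing items such as comments.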
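
As a rough illustration of how a scanner shrinks the parser's input, the following Python sketch groups characters into tokens and drops whitespace and comments. The token kinds and the '#' comment syntax are assumptions made for the example.

```python
import re

TOKEN_RE = re.compile(r"""
    (?P<comment>\#[^\n]*)            # assumed comment syntax: '#' to end of line
  | (?P<space>\s+)                   # whitespace
  | (?P<number>[0-9]+)
  | (?P<id>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<symbol>[-+*/()=])
""", re.VERBOSE)


def scan(source):
    """Return the (kind, text) tokens of source, without comments or whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if m.lastgroup not in ("comment", "space"):
            tokens.append((m.lastgroup, m.group()))  # the parser sees only these
        pos = m.end()
    return tokens
```

For the 16-character input "x = 42 # answer", scan returns just three tokens: an identifier, a symbol, and a number.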

Generating a Finite Automaton
A finite automaton can be written by hand or generated automatically from a set of regular expressions. A scanner generator builds a deterministic finite automaton (DFA) in three steps. First, it converts the regular expressions into a nondeterministic finite automaton (NFA). Second, it translates the NFA into an equivalent DFA. Third, it applies a space optimization that produces a final DFA with the minimum possible number of states.
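
The second step is commonly done with the subset construction, in which each DFA state stands for a set of NFA states. The sketch below covers only that step, under stated assumptions: the NFA for the example expression (a|b)*b is built by hand rather than generated, the state numbers are arbitrary, and minimization is left out.

```python
from collections import deque

EPSILON = None  # label used in this sketch for epsilon transitions

# Hand-built NFA for the example regular expression (a|b)*b.
NFA = {
    (0, EPSILON): {1, 3},
    (1, "a"): {2},
    (1, "b"): {2},
    (2, EPSILON): {1, 3},
    (3, "b"): {4},
}
NFA_START, NFA_ACCEPT = 0, {4}


def epsilon_closure(states):
    """All NFA states reachable from states using only epsilon transitions."""
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for target in NFA.get((state, EPSILON), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)


def nfa_to_dfa(alphabet):
    """Subset construction: every DFA state is a frozenset of NFA states."""
    start = epsilon_closure({NFA_START})
    transitions, accepting = {}, set()
    seen, todo = {start}, deque([start])
    while todo:
        current = todo.popleft()
        if current & NFA_ACCEPT:
            accepting.add(current)
        for symbol in alphabet:
            moved = set()
            for state in current:
                moved |= NFA.get((state, symbol), set())
            if not moved:
                continue  # no transition on this symbol
            target = epsilon_closure(moved)
            transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return start, transitions, accepting


def dfa_accepts(dfa, text):
    """Run the DFA over text; reject as soon as a transition is missing."""
    state, transitions, accepting = dfa
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return state in accepting


DFA = nfa_to_dfa("ab")
```

For this NFA the construction yields three DFA states. dfa_accepts(DFA, "abab") is true and dfa_accepts(DFA, "aba") is false, since the expression requires the string to end in b.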

Scanner Code
The states and transitions of a DFA (the circles and arrows of its diagram) can be captured in scanner code in two main ways. The first embeds the automaton in the control flow of the program, using goto statements or nested case (switch) statements. The second uses a table and a driver.
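
The table-and-driver style might look like the following Python sketch. The states, character classes, and token kinds are hypothetical. Because every state other than the start state is accepting in this tiny automaton, the driver never has to back up to an earlier accepting state, which a full scanner generally must be able to do.

```python
def char_class(ch):
    """Map a character to the column of the table it selects."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch in " \t\n":
        return "space"
    return "other"


# The table: TABLE[state][character class] gives the next state.
# A missing entry means there is no transition.
TABLE = {
    "start": {"letter": "in_id", "digit": "in_num", "space": "in_space"},
    "in_id": {"letter": "in_id", "digit": "in_id"},
    "in_num": {"digit": "in_num"},
    "in_space": {"space": "in_space"},
}

# Token kind recognized in each accepting state (None means discard).
ACCEPTING = {"in_id": "id", "in_num": "number", "in_space": None}


def table_scan(source):
    """The driver: follow the table until no transition applies, then emit."""
    tokens, pos = [], 0
    while pos < len(source):
        state, start = "start", pos
        while pos < len(source):
            next_state = TABLE[state].get(char_class(source[pos]))
            if next_state is None:
                break
            state, pos = next_state, pos + 1
        if state not in ACCEPTING:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind = ACCEPTING[state]
        if kind is not None:
            tokens.append((kind, source[start:pos]))
    return tokens
```

The driver itself knows nothing about the token language; changing what the scanner recognizes means changing only TABLE and ACCEPTING.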

Lexical Errors
The most common lexical errors are cases where the next character is not an acceptable continuation of the current token and cannot begin a new token either. One approach to dealing with such errors is to:

Throw away the current, invalid token
Skip forward until a character that can legitimately begin a new token is found
Restart the scanning algorithm
Count on the error-recovery mechanism of the parser to cope with any resulting sequence of tokens that is not syntactically valid
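
A simplified sketch of this recovery strategy in Python follows. It handles only the case of characters that cannot begin any token, and the token set and the name scan_with_recovery are assumptions made for the example.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<number>[0-9]+)"
    r"|(?P<id>[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<symbol>[-+*/()=])"
    r"|(?P<space>\s+)"
)


def scan_with_recovery(source):
    """Return (tokens, errors); each error is (offset, discarded text)."""
    tokens, errors, pos = [], [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            bad_start = pos
            # Throw away the invalid text: skip forward until some
            # character can begin a new token.
            while pos < len(source) and TOKEN_RE.match(source, pos) is None:
                pos += 1
            errors.append((bad_start, source[bad_start:pos]))
            continue  # restart the scanning algorithm at the new position
        if m.lastgroup != "space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    # Whether the surviving tokens form a valid program is left to the parser.
    return tokens, errors
```

For the input "x = 4 @$ y", the scanner reports one error covering "@$" at offset 6 and still delivers the four valid tokens.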

Pragmas
Pragmas are constructs that provide directives or hints to the compiler. Pragmas that affect only the compilation process, without changing program semantics, are sometimes called significant comments. Directive pragmas are used for purposes such as:

Turning run-time checks on and off
Turning certain code improvements on and off, for example on for inner loops
Enabling or disabling performance profiling

The parser is the heart of a typical compiler; it is a language recognizer. It calls the scanner to obtain the tokens of the input program, assembles the tokens into a syntax tree, and passes the tree to the later phases of the compiler, which perform semantic analysis, code generation, and code improvement. The best-known classes of parsers that run in linear time are LL and LR. LL stands for "Left-to-right, Leftmost derivation", while LR stands for "Left-to-right, Rightmost derivation".
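
The interplay between parser and scanner can be sketched with a small hand-written recursive-descent parser, a top-down technique in the LL family that decides what to do from one token of lookahead. Everything named below (the grammar, the scanner generator, the Parser class, and the tuple-based tree) is a hypothetical illustration, not a description of any particular compiler.

```python
import re

TOKEN_RE = re.compile(
    r"\s*(?:(?P<number>[0-9]+)|(?P<id>[A-Za-z_]\w*)|(?P<symbol>[+*()]))"
)


def scanner(source):
    """Yield (kind, text) tokens on demand, ending with an end-of-input marker."""
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"lexical error at offset {pos}")
        yield m.lastgroup, m.group(m.lastgroup)
        pos = m.end()
    yield "eof", ""


class Parser:
    """Assumed grammar:  expr   -> term { '+' term }
                         term   -> factor { '*' factor }
                         factor -> number | id | '(' expr ')'"""

    def __init__(self, source):
        self.tokens = scanner(source)
        self.lookahead = next(self.tokens)  # the parser asks the scanner for a token

    def match(self, text):
        if self.lookahead[1] != text:
            raise SyntaxError(f"expected {text!r}, found {self.lookahead[1]!r}")
        self.lookahead = next(self.tokens)

    def expr(self):
        tree = self.term()
        while self.lookahead == ("symbol", "+"):
            self.match("+")
            tree = ("+", tree, self.term())
        return tree

    def term(self):
        tree = self.factor()
        while self.lookahead == ("symbol", "*"):
            self.match("*")
            tree = ("*", tree, self.factor())
        return tree

    def factor(self):
        kind, text = self.lookahead
        if kind in ("number", "id"):
            self.lookahead = next(self.tokens)
            return (kind, text)
        if text == "(":
            self.match("(")
            tree = self.expr()
            self.match(")")
            return tree
        raise SyntaxError(f"unexpected token {text!r}")


def parse(source):
    """Return a syntax tree (nested tuples) for the whole of source."""
    parser = Parser(source)
    tree = parser.expr()
    if parser.lookahead[0] != "eof":
        raise SyntaxError(f"unexpected token {parser.lookahead[1]!r}")
    return tree
```

Calling parse("a + 2 * b") returns ("+", ("id", "a"), ("*", ("number", "2"), ("id", "b"))), a tree that a later compiler phase could walk for semantic analysis or code generation.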

Sherry Roberts is the author of this paper and a senior editor at MeldaResearch.Com.