Question: This is a basic homework question for a compiler project. Please write this in C++ or C!! The first assignment is to write a lexical analyzer

This is a basic homework question for a compiler project.

Please write this in C++ or C!!

The first assignment is to write a lexical analyzer (lexer)

You can build your entire lexer using an FSM, or use FSMs at least for identifier, integer and real (the rest can be written ad hoc),

but YOU HAVE TO CONSTRUCT AN FSM for this assignment; otherwise, there will be a deduction of 2 points!

Note: In your documentation (design section), YOU MUST write the REs for Identifier, Real and Integer, and also show the NFSM built using Thompson's construction.
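One possible way to meet the FSM requirement is a table-driven DFA that recognizes the three token classes. The sketch below is only an illustration: the names `Kind`, `CharClass`, `State`, `kNext`, `classOf` and `classify` are made up for this example, and it assumes the reading of the lexical conventions given later in the question (an identifier starts with a letter, may contain letters and digits, and ends in a letter or a single trailing `$`; a real is digits, a dot, then digits).

```cpp
#include <cctype>
#include <string>

enum class Kind { Identifier, Integer, Real, Invalid };

// Input classes: the columns of the transition table.
enum CharClass { LETTER, DIGIT, DOLLAR, DOT, OTHER, NUM_CLASSES };

// DFA states: the rows of the transition table. DEAD is a trap state.
enum State { START, ID_LETTER, ID_DIGIT, ID_DOLLAR, INT, INT_DOT, REAL, DEAD, NUM_STATES };

static const State kNext[NUM_STATES][NUM_CLASSES] = {
    //               LETTER     DIGIT     DOLLAR     DOT      OTHER
    /* START     */ {ID_LETTER, INT,      DEAD,      DEAD,    DEAD},
    /* ID_LETTER */ {ID_LETTER, ID_DIGIT, ID_DOLLAR, DEAD,    DEAD},
    /* ID_DIGIT  */ {ID_LETTER, ID_DIGIT, ID_DOLLAR, DEAD,    DEAD},
    /* ID_DOLLAR */ {DEAD,      DEAD,     DEAD,      DEAD,    DEAD},
    /* INT       */ {DEAD,      INT,      DEAD,      INT_DOT, DEAD},
    /* INT_DOT   */ {DEAD,      REAL,     DEAD,      DEAD,    DEAD},
    /* REAL      */ {DEAD,      REAL,     DEAD,      DEAD,    DEAD},
    /* DEAD      */ {DEAD,      DEAD,     DEAD,      DEAD,    DEAD},
};

// Map a character to its input class.
CharClass classOf(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    if (std::isalpha(u)) return LETTER;
    if (std::isdigit(u)) return DIGIT;
    if (c == '$') return DOLLAR;
    if (c == '.') return DOT;
    return OTHER;
}

// Run the DFA over a whole candidate lexeme and report which class accepts it.
Kind classify(const std::string& lexeme) {
    State s = START;
    for (char c : lexeme) s = kNext[s][classOf(c)];
    switch (s) {
        case ID_LETTER:
        case ID_DOLLAR: return Kind::Identifier;  // last char was a letter or $
        case INT:       return Kind::Integer;
        case REAL:      return Kind::Real;
        default:        return Kind::Invalid;     // includes ID_DIGIT and INT_DOT
    }
}
```

Under these assumptions `classify("fahr")` yields `Kind::Identifier`, `classify("23.00")` yields `Kind::Real`, and `classify("a1")` is rejected because the last character is a digit. A full lexer would still need to decide where each lexeme ends; this sketch only classifies a lexeme that has already been isolated.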

The Lexer

A major component of your assignment will be to write a procedure (function) lexer() that returns a token when it is needed. Your lexer() should return a record with two fields: one for the token and one for the actual "value" of the token (the lexeme), i.e. the instance of the token.
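The record described above could look roughly like the following. This is a sketch, not a required layout: the names `TokenType`, `Token`, `tokenName` and `describe` are hypothetical, and the `Unknown` and `EndOfFile` values are bookkeeping additions assumed for this example rather than token classes from the assignment.

```cpp
#include <string>

// Token classes named in the lexical conventions, plus two bookkeeping
// values (Unknown, EndOfFile) that are assumptions of this sketch.
enum class TokenType {
    Keyword, Identifier, Integer, Real, Operator, Separator, Unknown, EndOfFile
};

// The record returned by lexer(): the token class and the matched text.
struct Token {
    TokenType type;
    std::string lexeme;
};

// Printable name for a token class, in the style of the sample output.
const char* tokenName(TokenType t) {
    switch (t) {
        case TokenType::Keyword:    return "keyword";
        case TokenType::Identifier: return "identifier";
        case TokenType::Integer:    return "integer";
        case TokenType::Real:       return "real";
        case TokenType::Operator:   return "operator";
        case TokenType::Separator:  return "separator";
        case TokenType::EndOfFile:  return "eof";
        case TokenType::Unknown:    break;
    }
    return "unknown";
}

// One output line: token class followed by the lexeme.
std::string describe(const Token& tok) {
    return std::string(tokenName(tok.type)) + " " + tok.lexeme;
}
```

Using an enum for the token field keeps later comparisons in a parser cheap, while `tokenName` handles the printing side.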

Your main program should test the lexer: it should read a file containing Rat18S source code, generate tokens, and write the results to an output file.

Make sure that you print both the tokens and the lexemes.

Basically, your main program should work as follows:

while not finished (i.e. not end of the source file) do
    call the lexer for a token
    print the token and lexeme
endwhile
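The loop above might be sketched in C++ as follows. Everything here is illustrative: `Token`, `lexer`, `run` and `dumpTokens` are names chosen for this example, and the `lexer` shown is a deliberately simplified ad hoc stand-in (single-character operators and separators only, no comments, no `$` in identifiers, case-sensitive keywords) so that the driver has something to call. It is not the FSM-based lexer the assignment asks for.

```cpp
#include <cctype>
#include <cstdio>
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

struct Token {
    std::string type;    // e.g. "keyword", "identifier", "real"; "eof" at end of input
    std::string lexeme;  // exact text taken from the source
};

// Simplified stand-in for the real lexer(), just enough to drive the loop.
Token lexer(std::istream& in) {
    static const std::set<std::string> keywords = {
        "int", "if", "else", "endif", "while", "return", "get", "put"};
    while (std::isspace(in.peek())) in.get();        // skip white space
    if (in.peek() == EOF) return {"eof", ""};
    const int c = in.get();
    std::string lex(1, static_cast<char>(c));
    if (std::isalpha(c)) {
        while (std::isalnum(in.peek())) lex += static_cast<char>(in.get());
        return {keywords.count(lex) ? "keyword" : "identifier", lex};
    }
    if (std::isdigit(c)) {
        while (std::isdigit(in.peek())) lex += static_cast<char>(in.get());
        if (in.peek() != '.') return {"integer", lex};
        lex += static_cast<char>(in.get());           // consume '.'
        if (!std::isdigit(in.peek())) return {"unknown", lex};
        while (std::isdigit(in.peek())) lex += static_cast<char>(in.get());
        return {"real", lex};
    }
    if (std::string("(){}[],;:").find(lex) != std::string::npos) return {"separator", lex};
    if (std::string("=<>+-*/").find(lex) != std::string::npos) return {"operator", lex};
    return {"unknown", lex};
}

// The driver loop: call the lexer until end of input, printing each token.
void run(std::istream& in, std::ostream& out) {
    out << "token lexeme\n";
    for (Token t = lexer(in); t.type != "eof"; t = lexer(in))
        out << t.type << ' ' << t.lexeme << '\n';
}

// Convenience wrapper for test cases: tokenize a string, return the listing.
std::string dumpTokens(const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    run(in, out);
    return out.str();
}
```

In an actual main program, `run` would presumably be called with a `std::ifstream` opened on the source file and a `std::ofstream` for the output file; `dumpTokens` is only a helper for checking small inputs such as the simple test case.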

Do at least 3 test cases and make sure that you turn in proper documentation using the documentation template.

A simple test case

Source code:

while (fahr < upper) a = 23.00

Output:

token        lexeme
keyword      while
separator    (
identifier   fahr
operator     <
identifier   upper
separator    )
identifier   a
operator     =
real         23.00

The definition of the language is given below.

RAT18S

1) Lexical Conventions:

The lexical units of a program are identifiers, keywords, integers, reals, operators and other separators. Blanks, tabs and newlines (collectively, "white space") as described below are ignored except as they serve to separate tokens.

Some white space is required to separate otherwise adjacent identifiers, keywords, reals and integers.

<Identifier> is a sequence of letters or digits; however, the first character must be a letter and the last character must be either $ or a letter. Upper and lower case are treated the same.

<Integer> is an unsigned decimal integer, i.e., a sequence of decimal digits.

<Real> is an integer followed by . and an integer, e.g., 123.00
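The three definitions above can be written as regular expressions. The sketch below expresses one reading of them with `std::regex` (default ECMAScript syntax); the names `kIdentifier`, `kInteger`, `kReal` and `matches` are hypothetical, and it assumes `$` may appear only as the final character of an identifier. It is meant as a way to check the REs for the documentation, not as a substitute for the required FSM.

```cpp
#include <regex>
#include <string>

// letter ( (letter | digit)* (letter | $) )?
const std::regex kIdentifier(R"([A-Za-z](?:[A-Za-z0-9]*[A-Za-z$])?)");

// digit+
const std::regex kInteger(R"([0-9]+)");

// digit+ . digit+
const std::regex kReal(R"([0-9]+\.[0-9]+)");

// True when the whole string (not just a prefix) matches the pattern.
bool matches(const std::string& s, const std::regex& re) {
    return std::regex_match(s, re);
}
```

For example, `matches("fahr", kIdentifier)` and `matches("123.00", kReal)` hold, while `matches("a1", kIdentifier)` does not, since the last character is a digit.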

Some identifiers are reserved for use as keywords, and may not be used otherwise:

e.g., int, if, else, endif, while, return, get, put etc.
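Because upper and lower case are the same, a keyword check probably needs to normalize case before looking the lexeme up. A minimal sketch, assuming only the keywords listed above (the source says "etc.", so the real set is likely larger) and using the made-up names `toLower` and `isKeyword`:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>

// Lower-case copy of a lexeme, so that WHILE, While and while compare equal.
std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char ch) {
        return static_cast<char>(std::tolower(ch));
    });
    return s;
}

// True if an identifier-shaped lexeme is reserved. The set holds only the
// keywords named in the question; extend it with the rest of the language's list.
bool isKeyword(const std::string& lexeme) {
    static const std::unordered_set<std::string> kKeywords = {
        "int", "if", "else", "endif", "while", "return", "get", "put"};
    return kKeywords.count(toLower(lexeme)) > 0;
}
```

A lexer would typically run its identifier FSM first and then call something like `isKeyword` on the accepted lexeme to decide between the keyword and identifier token classes.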

Comments are enclosed in ! !
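Since white space and comments produce no tokens, a lexer can skip both before scanning each token. The helper below is a sketch under two assumptions not stated in the source: comments do not nest, and an unterminated comment swallows the rest of the input. The name `skipIgnored` is hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Return the index of the first character at or after pos that is neither
// white space nor inside a ! ... ! comment (or src.size() if none is left).
std::size_t skipIgnored(const std::string& src, std::size_t pos) {
    while (pos < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[pos]);
        if (std::isspace(c)) {
            ++pos;                                    // blank, tab or newline
        } else if (c == '!') {
            const std::size_t close = src.find('!', pos + 1);
            if (close == std::string::npos) return src.size();  // unterminated
            pos = close + 1;                          // resume after closing !
        } else {
            break;                                    // start of a real token
        }
    }
    return pos;
}
```

Note that treating every `!` as a comment delimiter is only safe because the relational operator for "not equal" in this language is `^=` rather than `!=`.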

2) Syntax rules: The following BNF describes Rat18S.

R1. ::= %%

R2. ::= |

R3. ::= |

R4. ::= function [ ]

R5. ::= |

R6. ::= | ,

R7. ::= :

R8. ::= int | boolean | real

R9. ::= { < Statement List> }

R10. ::= |

R11. ::= ; | ;

R12. ::=

R13. ::= | ,

R14. ::= |

R15. ::= | | | | | |

R16. ::= { }

R17. ::= = ;

R18. ::= if ( ) endif |

if ( ) else endif

R19. ::= return ; | return ;

R20. ::= put ( );

R21. ::= get ( );

R22. ::= while ( )

R23. ::=

R24. ::= == | ^= | > | < | => | =<

R25. ::= + | - |

R26. ::= * | / |

R27. ::= - |

R28. ::= | | ( ) | ( ) |

| true | false

R29. ::= e

Thank you.
