Question: C LANGUAGE: Need help writing common.h , tokenizer.c , & recognizer.c so they are executable. Need to create two programs: a tokenizer and a recognizer.
Tokenizer will read an input file line by line and convert the textual input into an ordered collection of tokens and lexemes. Recognizer will use recursive descent to parse the output of the tokenizer and determine whether the given tokens form a valid program.

common.h: The purpose of common.h is to define all includes/imports, constants, globals, enums, and structs/objects that are shared between Tokenizer and Recognizer. common.h will be stored in the same directory as both Tokenizer and Recognizer. Though Tokenizer and Recognizer are distinct programs that compile independently of each other, it is likely they'll share some commonality.

Tokenizer will read two command-line arguments when run. The first command-line argument is the filepath of the input file and the second is the filepath of the output file. Your program must take in these two command-line arguments; no other methods for acquiring the filepaths are allowed. When run, Tokenizer will read all characters from the input file, convert these characters into lexemes, then associate each lexeme with a token class. My recommendation is to construct lexemes character by character: iterate over every individual character in the input file and determine whether the current character is part of an alphanumeric lexeme, whitespace, or part of a symbol lexeme. The type of character you identify determines the next action your tokenizer takes. After generating a lexeme, it must be associated with a token class. You can do this association as each lexeme is generated, or you can generate all lexemes first and then associate each with its token class. While either approach is valid, I recommend the latter: breaking lexeme generation apart from token association simplifies the overall program structure.
Two token classes in the provided lexical structure are defined via regular expressions: any string that matches the specified regex is part of that token class. The exception is strings explicitly defined in the lexical structure as reserved words. For example, "return" would be an IDENTIFIER token, as it matches the regex provided for IDENTIFIER in the given lexical structure, but "return" is explicitly defined as a RETURNKEYWORD token earlier in the lexical structure. To avoid any confusion, you ought to compare lexemes against the token classes in the order defined in the lexical structure, i.e. check whether the generated lexeme is a reserved word before checking whether it's an IDENTIFIER token.

Recognizer will read two command-line arguments when run. The first command-line argument is the filepath of the input file and the second is the filepath of the output file. The program must take in these two command-line arguments; no other methods for acquiring the filepaths will be accepted. When run, Recognizer will read a list of tokens and their associated lexemes from the input file; the output file from Tokenizer will be used as the input file for Recognizer. It will determine whether the ordered set of tokens from the input file is legal in the language defined by the given EBNF grammar. The purpose of our recognizer is to apply the given grammar rules and report any syntax errors. To accomplish this, Recognizer will implement a recursive descent parser. The implemented parser must be a recursive descent predictive parser that utilizes single-symbol lookahead, consuming each token one at a time. Parsers that utilize multi-symbol lookahead will not be accepted. An input is syntactically invalid if a token or nonterminal was required by the current EBNF grammar rule but was not present. If a syntax error is found, parsing should halt and the program should report the error by printing an error message to the output file.
If a token was expected but not present, the error must specify which grammar rule had the error, which number token was being examined, the expected token, and the actual token. Example format as follows: Error: In grammar rule body, expected token # to be RIGHTBRACKET but was IDENTIFIER. Given a grammatically valid input, every given token must be parsed. If the top-level grammar rule function is invoked and concludes without error, this indicates that the ordered set of input tokens was syntactically valid; however, the given set of tokens is only valid if all tokens have been consumed. That is, if the first tokens form a syntactically valid input but the input file contained additional tokens, this is a syntax error. It must be identified and reported with the following: Error: Only consumed # of the # given tokens. If all input tokens are consumed and no syntax errors are reported, Recognizer will output "PARSED!!!". Recognizers that don't consume every given token for a grammatically valid input will not be accepted.
