Question: C++ Project 1: Random Sentence Generator 1 OVERVIEW A Random Sentence Generator (RSG) constructs sentences, paragraphs, and even papers that fit a prescribed format. The

C++

Project 1: Random Sentence Generator

1 OVERVIEW

A Random Sentence Generator (RSG) constructs sentences, paragraphs, and even papers that fit a prescribed format. The format is specified by a set of rules called a grammar. As an example of how useful the RSG program is, consider several different requests for an extension generated by the program and shown below.

Use lots of excuses:

I need an extension because I had to do laundry , and if you

can believe it, all my pencils broke , and then my Linux box

was confiscated by Dean Sue , and just then I just didn't

feel like working , oh, and then I forgot how to write , and

then get this, I lost my dreams , and if you can believe it,

my notes had a problem of a private nature .

Shorter, but still interesting:

I need an extension because I thought I already graduated ,

and then get this, the programming language was inadequately

abstract .

Or use a more honest approach:

I need an extension because I had to go to the ACC finals .

2 GRAMMARS

The input to the program, and the rules the program uses to construct its output, is a Context Free Grammar. A grammar is a set of rules, of a specific form, that can be used to recognize or to generate a sentence. In this program you will use the grammar to generate a sentence (or several sentences). Your C++ compiler uses a grammar to recognize a sentence (i.e., program).

As an example, we'll look at rules for generating a simple poem. All grammars have a rule to start with. Here is the poem's start rule (from poem.g):

{

The tonight. ;

}

Immediately following the start rule, we define the rules for and

{

waves ;

big yellow flowers ;

slugs ;

}

{

sigh ;

portend like ;

die ;

}

{

warily ;

grumpily ;

}

2.1 GRAMMAR ELEMENTS

Strings in grammars are either terminal strings or non-terminal strings. In the RSG examples, non-terminals are delimited by angle brackets like and . The word "terminal" indicates that the string does not expand into something else like the non-terminal strings do.

2.2 DEFINITIONS AND PRODUCTIONS

Each non-terminal in a grammar has a definition, the definition is a set of productions. In the RSG format, definitions are enclosed by curly braces { and } and each production is terminated by a semi-colon. In the poem example, the definition for the non-terminal has three productions. The non-terminal definition has only two productions (which are all terminals).

3 IMPLEMENTATION ISSUES

You will need to use a map class (also called a table, dictionary, or Symbol Table) to find the definitions associated with a non-terminal. A map stores key, value pairs, mapping a key to a value. For this assignment you'll use an unordered_map object. You should create a map in which keys are strings (the non-terminals read from a grammar file) and the values are pointers to Definition objects. The Definition class is one of several subclasses of the abstract base class GrammarElem. These classes are described in more detail in the architecture section.

This poem.g grammar generates many different poems (you can calculate that it generates 21 different poems). Among the poems are the following (each on its own line).

The big yellow flowers sigh warily tonight.

The waves sigh warily tonight.

The slugs portend like waves tonight.

The slugs portend like slugs tonight.

3.1 PARSING FILES/GENERATING SENTENCES

There can be lots of words, comments, etc., between the definitions of the non-terminals in an input file. The function Grammar::parse in rsg.cpp shows how to read a grammar and ignore whatever is outside the definitions. You will need to modify the functions in rsg.cpp to store the terminals, non-terminals, productions, and definitions in such a way that they can be used to generate random sentences according to the grammar.

4 CLASSES/ARCHITECTURE

One possible implementation of a data structure storing the grammar to generate random sentences is diagrammed below.

The abstract base class GrammarElem is provided for you in rsg.cpp, basically it looks like this.

class GrammarElem

{

public:

virtual ~GrammarElem() {}

virtual void expand() = 0;

protected:

// stuff here

};

GrammarElem is an abstract base class since the method expand is pure virtual.

4.1 REPRESENTING THE GRAMMAR

At a minimum, you will need to create a class to store the definition of each non-terminal and you should call this class Definition.

The Grammar class given in rsg.cpp has a method expand, shown below, that illustrates this.

// post: grammar is expanded

void Grammar::expand()

{

GrammarElem * g = gmap->get("");

if (g == 0) {

cout << "error, no start symbol" << endl;

}

else {

g->expand();

}

}

4.2 EXPAND IN SUBCLASSES

Each part of the grammar (subclass of GrammarElem) expands differently. If you use a class for each part of the grammar as in the diagram above, expansion for each class is described as follows.

terminal: print the word (perhaps using a Writer, see below for details).

non-terminal: look up its definition and expand the definition

production: expand each of elements in sequence; the elements are terminals or non-terminals.

definition: pick one of its productions at random and expand the production

Note that expansion is very recursive, though no individual class has a recursive expand function. The process is recursive since expanding a definition means expanding a production, expanding a production means expanding terminals and non-terminals, and expanding a non-terminal requires looking up its definition and expanding it---so we're back to definition expansion.

Just as the provided Grammar class is a subclass of GrammarElem and provides its own version of the expand function, we suggest you make a subclass for each kind of grammar elements described and diagramed above.

4.3 CLASS DETAILS

The class Terminal is the simplest subclass to implement. It stores a terminal string, and when it is expanded it simply writes the string out.

The class NonTerminal uses the static map declared in the GrammarElem base class (thus shared by all subclass objects) to look up its definition. The definition is then expanded.

The class Production is a GrammarElem via inheritance, but it also has several grammar elements that it stores in a private vector of pointers to GrammarElem objects (since it does not know which will be terminals and which will be non-terminals it must use pointers). A production expands itself by asking its parts to expand themselves in sequence.

Finally, a Definition is similar to a Production because it too holds GrammarElems. It stores a collection of pointers to Production objects. A Definition is expanded by choosing a production at random and expanding the production.

In this design both Definition and Production have vectors storing GrammarElem pointers.

5 BUILDING THE STRUCTURE

Each time a non-terminal is read as the grammar is being parsed you will need to create a pointer to a Definition object and add the non-terminal's name and Definition to the map. For example, here's the code from the program rsg.cpp that parses a file. // Pre: first is beginning of this production (first terminal/non-terminal)

// Post: Entire production is read/parsed and stored appropriately

void Grammar::parseProduction(istream & input, const string & first)

{

vector production;

string s = first; // start

do {

cout << " " << s;

production.push_back(s);

} while (input >> s && s != END_PRODUCTION);

cout << endl;

}

// Pre: START_DEFINITION has been read,

// non-terminal ready to read from input

// Post: Definition for non-terminal has been read/parsed, and

// stored in gmap

void Grammar::parseDefinition(istream & input)

{

string name, s;

input >> name; // read the non-terminal

cout << name << endl; // echo for debugging

Definition def;

int prodCount = 0;

while (input >> s && s != END_DEFINITION) {

cout << "\t" << "production " << prodCount << "\t";

parseProduction(input, s);

prodCount++;

}

cout << endl << "-----" << endl;

// Change "this" to some Definition pointer

std::pair grammarPair (name, this);

gmap->insert(grammarPair);

}

5.1 CREATING A DEFINITION

A Definition object must be created at the beginning of parse Definition and stored in the map as the function finishes. In the working program, the key/string name will be mapped to a real Definition pointer, not to this. The Definition has many productions in it.

5.2 CREATING PRODUCTION

For example, each time parseProduction is called on line 22, a Production object will be stored in the Definition being created in parseDefinition. This means the class Definition needs a method that allows a Production to be added to the Definition.

You will probably want to change the signature of the function parseProduction and the other parseXXX functions to return a pointer to a GrammarElement object created by the parse function).

5.3 CREATING TERMINALS AND NONTERMINALS

In the current program, each string that's part of a production is printed on line 7. In the final program rather than printing the string, it should be used to create either a Terminal or NonTerminal object, which is then added to the Production that is created (and presumably returned) by parseProduction.

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!