Question: grammarQuestion 3 : Efficiently Simulating Bounded Stacks with RNNs ( 2 0 pts ) As we discussed in the lecture, hierarchical structure, characterized by long

grammarQuestion 3: Efficiently Simulating Bounded Stacks with RNNs (20 pts)
As we discussed in the lecture, hierarchical structure, characterized by long distance and nested
dependencies, lies at the core of the human language. Indeed, this motivated our theoretical discussion
of context-free grammars as a useful paradigm for describing language. Studying modern language
models with respect to their ability to efficiently represent hierarchical structure, therefore, provides
evidence that they are useful models for human language. This question directly addresses this point.
More precisely, we will study the ability of recurrent neural network language models to recognize a
variant of the D(k) languages.
As introduced in the lecture notes, the D(k) languages are in a way archetypal context-free languages.
Recognizing D(k) languages is conceptually simple - a system has to remember the sequence of currently
non-closed opening brackets and make sure they are closed in the correct order, at which point the
closed bracket pairs can be "forgotten", i.e., popped off the stack. This means that the memory
necessary to recognize any string in D(k) is proportional to the number of non-closed brackets at any
time. We formalize it by counting how many more open brackets than closed brackets there are at
each timestep in the string:
d(y?t)=?defcount(y?t,(:)-count(y?t,:))
where count refers to the number of times any opening bracket occurs in y?t and count {:(y?t,:))
the number of times any closing bracket occurs in y?t.
While context-free languages like D(k) describe arbitrarily deep hierarchical structures, natural lan-
guages exhibit bounded nesting in practice, as discussed in the lecture. Furthermore, the infinite
nesting and therefore infinitely long stacks also make it impossible to represent context-free languages
with finite precision. In this question, we investigate how to represent D(k) languages which can only
nest up to some bounded depth m. We denote such languages as D(k,m).
Definition languages). Let k,minN. We define the bounded Dyck language D(k,m)
by combining D(k) with a bound on the nesting depth:
D(k,m)=?def{yinD(k)|d(y?t)m,t=1,dots,T},
where T corresponds to the length of the string.
Due to their bounded nesting depth, D(k,m) languages can be recognized by stacks of bounded depth
and therefore with bounded memory - this means that they are in fact finite-state. This makes them
especially well-suited as a benchmark for finite-precision language models.
This question is roughly divided into two parts: in the first part, you will show that an Elman RNN is
able to simulate a finite-state automaton that recognizes the D(k,m) for some k and m. In the second
part of the question, you are asked to show that the RNN indeed recognizes the language as well using
a specific definition of acceptance.
We begin with a warm-up question.
a)(1 pt) Suppose that the current bounded stack configuration in an automaton recognizing
L=D(2,3) is
What are the new stack configurations 1',2',3' after reading in each of the following symbols
(each one starting from a stack 1,2,3, not one after another)?
:
:
Use O? to denote an empty stack and simply state that the automaton would reject a string if the
processed string is not in D(2,3).
Note: You have to specify nine stack configurations altogether.
We next prove that D(k,m) languages are in fact finite-state by constructing an FSA recognizing
D(k,m).
grammarQuestion 3 : Efficiently Simulating

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Programming Questions!