Question:

Optical character recognition (also optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo) or subtitle text superimposed on an image. (Source: wikipedia.com) OCR is widely used as a form of information entry from printed paper data records, such as passport documents, invoices, bank statements, computerised receipts, business cards, mail, printouts of static data, or any suitable documentation. It is a common method of digitising printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision. (Source: wikipedia.com)

Data: The dataset contains 20,000 handwritten characters. The first column of each row in the dataset file specifies the class of the character (i.e. A, B, C, etc.). The remaining 16 columns hold the corresponding feature vector, which describes the specific character. The values in each feature vector are in the range 0-15. All values are comma-separated. Before you start working on this project, please read the paper below (pp. 161-165) to understand what each feature stands for: P.W. Frey and D.J. Slate, Letter Recognition Using Holland-Style Adaptive Classifiers, Machine Learning, Vol. 6, 161-182 (1991).

Example of data:
T,2,1,5,6,13,9,2,5,11,10,4,5,2,1,7,9
I,5,12,15,3,6,8,5,9,9,4,12,2,5,7,4,9

The data file is uploaded in your Moodle account.
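To make the row layout described above concrete, here is a minimal Python sketch that splits each row into a class label and 16 integer features. It is only an illustration outside KNIME: the function name `parse_rows` is hypothetical, and the two sample rows are the ones quoted in the task.

```python
import csv
import io

# Two sample rows copied from the task description.
SAMPLE = """T,2,1,5,6,13,9,2,5,11,10,4,5,2,1,7,9
I,5,12,15,3,6,8,5,9,9,4,12,2,5,7,4,9
"""

def parse_rows(text):
    """Split each comma-separated row into (class label, 16 integer features)."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        label = record[0]
        features = [int(value) for value in record[1:]]
        if len(features) != 16:
            raise ValueError(f"expected 16 features, got {len(features)}")
        if not all(0 <= value <= 15 for value in features):
            raise ValueError("feature value outside the documented 0-15 range")
        rows.append((label, features))
    return rows

rows = parse_rows(SAMPLE)
```

The two checks mirror the description of the data (16 features per row, values in 0-15), so a malformed row is reported early instead of silently reaching the network.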

Project Tasks:

Task 1: Give a brief description of how you will handle this problem with an MLP neural network. Explain:
1. How you will pre-process the data (input and output).
2. Why a Multilayer Perceptron (MLP) can be used to solve this problem.
3. Which MLP parameter configurations you need to choose for this problem, and why.
4. How you will evaluate your results.

Task 2: (For this project you will use the KNIME platform.) Download the ocrData.txt file and pre-process the data (manually or through KNIME). During this procedure, you must normalize the input data and encode the output data. Describe how you normalize the input data and why. Describe how you encode the output data and why.
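One common way to carry out the two pre-processing steps named in this task is min-max scaling of the inputs and one-hot encoding of the class. The Python sketch below shows the arithmetic only; it is not the KNIME configuration. It assumes the feature range 0-15 stated in the data description and assumes the classes are the 26 capital letters A-Z (the task only says "A, B, C etc."). The function names are hypothetical.

```python
import string

FEATURE_MIN, FEATURE_MAX = 0, 15   # range stated in the data description
CLASSES = string.ascii_uppercase   # assumption: the classes are the letters A-Z

def normalize(features):
    """Min-max scale each feature from [0, 15] to [0.0, 1.0]."""
    span = FEATURE_MAX - FEATURE_MIN
    return [(value - FEATURE_MIN) / span for value in features]

def one_hot(label):
    """Encode a class letter as a 26-element vector containing a single 1.0."""
    vector = [0.0] * len(CLASSES)
    vector[CLASSES.index(label)] = 1.0
    return vector

# First sample row from the task description.
x = normalize([2, 1, 5, 6, 13, 9, 2, 5, 11, 10, 4, 5, 2, 1, 7, 9])
y = one_hot("T")
```

With this encoding the network would have 16 inputs and 26 outputs, one output unit per letter, and the predicted class is the unit with the largest activation.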

Task 3: Use the KNIME platform to solve this problem. You must develop a KNIME pipeline for MLP neural networks. In your report, you must present all KNIME configurations and comparative results for 3 different MLP neural network setups. You must present results for training error, testing error and network accuracy. How do the parameters you have chosen affect the results?
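The task does not say how the error should be measured. As a hedged illustration of the quantities being compared, the Python sketch below assumes the error is the mean squared error between the one-hot targets and the network outputs, and that the predicted class is the output unit with the largest value. The function names are hypothetical, and the 3-class numbers are toy values invented only to exercise the functions; they are not results from the dataset.

```python
def mean_squared_error(targets, outputs):
    """Average squared difference over all output units and all samples."""
    total = sum(
        (t - o) ** 2
        for target, output in zip(targets, outputs)
        for t, o in zip(target, output)
    )
    return total / (len(targets) * len(targets[0]))

def accuracy(targets, outputs):
    """Fraction of samples whose largest output matches the one-hot target."""
    def argmax(vector):
        return max(range(len(vector)), key=vector.__getitem__)
    hits = sum(argmax(t) == argmax(o) for t, o in zip(targets, outputs))
    return hits / len(targets)

# Toy 3-class values, invented purely for illustration.
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
outputs = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.6, 0.3, 0.1], [0.9, 0.1, 0.0]]

test_accuracy = accuracy(targets, outputs)
test_error = mean_squared_error(targets, outputs)
```

Applying the same two functions to the training partition and to the testing partition of each setup gives the training error, testing error and accuracy that the report has to compare.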
