Case Study: Process Control in a Coffee Roasting Plant
Nestlé, one of the largest food and beverage companies in the world, uses a number of continuous-feed coffee roasters to produce a variety of coffee products. Each of these products has a recipe that specifies target values for a plethora of roaster variables, such as the temperature of the air at various exhaust points, the speed of various fans, the rate at which gas is burned, the amount of water introduced to quench the beans, and the positions of various flaps and valves. There are many ways for things to go wrong when roasting coffee, ranging from a roast coming out too light in color to a costly and damaging roaster fire. A bad batch of roasted coffee wastes the beans and incurs a cost; damage to equipment is even more expensive.
To help operators keep the roaster running properly, data is collected from about 60 sensors. Every 30 seconds, this data, along with control information, is written to a log and made available to operators in the form of graphs. The project described here took place at a Nestlé research laboratory in York, England. Nestlé built a coffee roaster simulation based on the sensor logs.
Goals for the Simulator
Nestlé saw several ways that a coffee roaster simulator could improve its processes. By using the simulator to try out new recipes, a large number of candidate recipes could be evaluated without interrupting production. Furthermore, recipes that might lead to roaster fires or other damage could be eliminated in advance.
The simulator could be used to train new operators and expose them to routine problems and their solutions. Using the simulator, operators could try out different approaches to resolving a problem.
The simulator could track the operation of the actual roaster and project it several minutes into the future. When the simulation ran into a problem, an alert could be generated while the operators still had time to avert trouble. Fortunately, Nestlé was already collecting data at half-minute intervals, which could be used to build the simulator.
Building a Roaster Simulation
A model set of 34,000 cases was created from the historical log data. Each case consisted of a set of measurements on the roaster along with the same measurements taken 30 seconds later. Notice that the same measurements might serve as the targets for one case and then as the inputs for the next case (whose targets come another 30 seconds later), as sketched below.
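A minimal sketch of how such a model set might be assembled from the sensor log. The file name, column layout, and pandas-based approach are illustrative assumptions; the case study does not describe Nestlé's actual tooling.

```python
# Sketch of assembling the model set from the sensor log.
# "roaster_log.csv" and the column layout are hypothetical.
import pandas as pd

log = pd.read_csv("roaster_log.csv")  # one row per 30-second snapshot
sensor_cols = [c for c in log.columns if c != "timestamp"]

# Pair each snapshot (inputs) with the snapshot 30 seconds later (targets).
inputs = log[sensor_cols].iloc[:-1].reset_index(drop=True)
targets = log[sensor_cols].iloc[1:].reset_index(drop=True)
targets.columns = [c + "_next" for c in sensor_cols]
cases = pd.concat([inputs, targets], axis=1)

# Row i's targets are exactly row i+1's inputs, as described above.
```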
This training set is more complicated than the training sets we have been working with, because multiple targets exist: all the measurements taken 30 seconds later. The solution is to build a separate model for each measurement. Each model takes its inputs from the earlier part of the case and its target from the later part, as shown in Figure 7.14.
Figure 7.14 A decision tree uses values from one snapshot to create the next snapshot in time.
Once the entire set of models was trained, the result was a collection that takes the input measurements for the roaster and produces estimates of what happens 30 seconds later.
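One way to realize "a separate model for each measurement" is to fit one regression tree per target column. This continues the sketch above and uses scikit-learn purely as an assumption; the case study does not say what software was used.

```python
# One regression tree per target measurement (scikit-learn assumed).
from sklearn.tree import DecisionTreeRegressor

X = cases[sensor_cols]                 # all measurements at time t
models = {}
for col in sensor_cols:
    tree = DecisionTreeRegressor(min_samples_leaf=50)
    tree.fit(X, cases[col + "_next"])  # same measurement at t + 30s
    models[col] = tree

def step(snapshot):
    """Advance one 30-second step: snapshot is a one-row DataFrame."""
    return {c: models[c].predict(snapshot[sensor_cols])[0]
            for c in sensor_cols}
```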
Evaluation of the Roaster Simulation
The simulation was then evaluated using a test set of around 40,000 additional cases that had not been part of the training set. For each case in the test set, the simulator generated projected snapshots 60 steps into the future (that is, 30 minutes into the future). At each step, the projected values of all variables were compared against the actual values. As expected, the size of the error increases with time. For example, the error for product temperature turned out to be about 2/3°C per minute of projection, but even 30 minutes into the future the simulator was considerably better than random guessing.
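The 60-step projection can be sketched as a loop that feeds each predicted snapshot back in as the input for the next step. This continues the hypothetical code above; the names "test_log" and "product_temp" are assumptions for illustration.

```python
# Project 60 steps (30 minutes) ahead by feeding predictions back in,
# then measure error against the held-out log at each horizon.
def project(start_row, n_steps=60):
    snap = start_row.to_frame().T      # one-row DataFrame of inputs
    trajectory = []
    for _ in range(n_steps):
        pred = step(snap)
        trajectory.append(pred)
        snap = pd.DataFrame([pred])    # predictions become next inputs
    return pd.DataFrame(trajectory)

# "test_log" and "product_temp" are hypothetical names.
proj = project(test_log[sensor_cols].iloc[0])
actual = test_log[sensor_cols].iloc[1:61].reset_index(drop=True)
horizon_error = (proj["product_temp"] - actual["product_temp"]).abs()
```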
The roaster simulator turned out to be more accurate than all but the most experienced operators at projecting trends, and even the most experienced operators were able to do a better job with the aid of the simulator. Operators enjoyed using the simulator and reported that it gave them new insight into corrective actions.
Lessons Learned
Decision-tree methods have wide applicability for data exploration, classification, and selecting important variables. They can also be used for estimating continuous values, although they are rarely the first choice because decision trees generate lumpy estimates: all records reaching the same leaf are assigned the same estimated value. They are a good choice when the data mining task is classification of records or prediction of discrete outcomes, and when the goal is to assign each record to one of a few broad categories.
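The lumpiness is easy to demonstrate: a regression tree's output is a step function, with one constant estimate per leaf. A small self-contained example:

```python
# A regression tree predicts one constant per leaf, so its output is a
# step function: with max_depth=3 there are at most 8 distinct estimates.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

x = np.linspace(0, 10, 200).reshape(-1, 1)
y = np.sin(x).ravel()
toy_tree = DecisionTreeRegressor(max_depth=3).fit(x, y)
print(len(np.unique(toy_tree.predict(x))))  # prints 8 (or fewer)
```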
Decision trees are also a natural choice when the goal is to generate understandable and explainable rules. The ability of decision trees to generate rules that can be translated into comprehensible natural language or SQL is one of the greatest strengths of the technique. Even in complex decision trees, following any one path through the tree to a particular leaf is generally fairly easy, so the explanation for any particular classification or prediction is relatively straightforward.
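As a concrete illustration, scikit-learn can render a fitted tree as nested rules, and each root-to-leaf path maps directly onto a SQL predicate; here it is applied to the toy tree fitted above.

```python
# Render the toy tree from the previous example as nested if/else rules.
from sklearn.tree import export_text
print(export_text(toy_tree, feature_names=["x"]))
# Any root-to-leaf path, e.g. "x <= a AND x > b", becomes a SQL predicate:
#   SELECT ... WHERE x <= a AND x > b
```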
Decision trees are grown using a recursive algorithm that evaluates all values of all inputs to find the split that causes the greatest increase in purity in the children. The same thing happens again inside each child. The process continues until no more splits can be found or some other limit is reached. The tree is then pruned to remove unstable branches. Several tests are used as splitting criteria, including the chi-square test for categorical targets and the F test for numeric targets.
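The heart of that algorithm is the split search. Below is a bare-bones sketch for a single numeric input and numeric target, scoring candidate thresholds by variance (SSE) reduction, the same idea the F-test criterion formalizes; real implementations repeat this over every input and apply a statistical test.

```python
# Exhaustive split search for one numeric input and a numeric target,
# scoring candidate thresholds by variance (SSE) reduction.
import numpy as np

def best_split(x, y):
    order = np.argsort(x)
    x, y = x[order], y[order]
    parent_sse = np.var(y) * len(y)
    best_gain, best_thresh = 0.0, None
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue                   # no split between equal values
        left, right = y[:i], y[i:]
        child_sse = np.var(left) * len(left) + np.var(right) * len(right)
        gain = parent_sse - child_sse  # increase in purity
        if gain > best_gain:
            best_gain = gain
            best_thresh = (x[i - 1] + x[i]) / 2
    return best_thresh, best_gain

x = np.array([1.0, 2.0, 3.0, 10.0])
y = np.array([5.0, 5.5, 6.0, 20.0])
print(best_split(x, y))                # (6.5, ...): isolates the outlier
```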
Decision trees require less data preparation than many other techniques because they are equally adept at handling continuous and categorical variables. Categorical variables, which pose problems for neural networks and statistical techniques, are split by forming groups of classes. Continuous variables are split by dividing their range of values. Because decision trees use only the rank order of numeric values, not their magnitudes, they are not sensitive to outliers and skewed distributions. Missing values, which cannot be handled by many data mining techniques, cause no problems for decision trees and may even appear in splitting rules.
This robustness comes at the cost of throwing away some of the information that is available in the training data, so a well-tuned neural network or regression model often makes better use of the same fields than a decision tree does. For that reason, decision trees are often used to pick a good set of variables to be used as inputs to another modeling technique. Time-oriented data, however, does require a lot of data preparation: time series data must be enhanced so that trends and sequential patterns are made visible, as sketched below.
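The kind of enhancement meant here is adding derived columns, such as lags and differences, so that a tree can see the direction of change rather than just the current snapshot. A sketch, reusing the hypothetical column names from earlier:

```python
# Add lag and difference columns so trends become visible to the tree.
for col in sensor_cols:
    log[col + "_lag1"] = log[col].shift(1)               # value 30s earlier
    log[col + "_delta"] = log[col] - log[col + "_lag1"]  # short-term trend
log = log.dropna()             # the first snapshot has no earlier value
```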
Decision trees reveal so much about the data to which they are applied that the authors often make use of them in the early phases of a data mining project even when the final models are to be created using some other technique.