Tutorials on Engineering Computations using Microsoft Excel
The following was found to be useful in teaching freshman engineering students to use Excel for computations, particularly for data analysis. There are many textbooks and online tutorials available (e.g., Google “Excel engineering data tutorial”). See also Help in Excel itself. The tutorials below were originally written for Excel 2003 and earlier versions. Most have been updated to Excel 2007 or 2010, with the main difference between these is that for Excel 2010 being that the File button is used to access options rather than the Office button in the upper left corner. Starting with Office 2007, the new default file extension for Excel is xlsx, which is explained by Microsoft. For unknown reasons, files created in this format on one computer have sometimes not been readable on other computers. Therefore, it is recommended that you save all Office files without the added x, i.e. as doc, ppt, or xls, especially if they need to be opened on another computer.
Improvements and additions to Excel functions in Excel 2010 (Performance on test data not yet reported in December 2010)
Gnumeric Free open-source software recommended by statisticians for those who want accuracy in statistical computations using a spreadsheet. Graphing is more suitable for scientific and engineering plots than is Excel. Additional features include normalcy (normality) test and warning when inclusion of a variable in regression analysis is not supported by the data.
Performance of Excel and Gnumeric on test data (as of June 2004)
· See format conversions for the relationships between formats in mathematics, MATLAB and Excel.
· To add arrows, text, etc. to a spreadsheet or graph, click on Insert, Shapes. (This also works in Word and PowerPoint.)
· To add an equation, click on Insert, Object, Microsoft Equation 3.0. (This also works in Word and PowerPoint.) The resulting equation can be moved, enlarged and copied elsewhere.
· Here’s how to copy material from one file or web page into another file in Windows. Use one of the methods below to place the desired material on the clipboard. Place the cursor where you wish the copied material to be displayed. Then Paste Special. Double-click on it to bring up the Format menu. Wrap Text, Square is recommended to allow you to move the figure to wherever you want and resize it. To place something on the clipboard, use one of the following procedures:
o For any portion of whatever’s on the screen, use the Snipping Tool.
o For the window your cursor is on, press Alt & Print Screen simultaneously.
o For everything on the screen, press the Print Screen button.
· Exercise 1. Rudiments of Excel spreadsheets; entering names, numbers and formulas into cells; editing, deleting and copying cell contents; relative and fixed (absolute) cell references; built-in Excel functions; display of equations in the spreadsheet; saving and formatting for printing.
· Exercise 2. This exercise utilizes the Antoine equation with constants for the vapor pressures of benzene and toluene to calculate the composition of a vapor at one atmosphere pressure from a given liquid composition, assuming ideal liquid and vapor mixtures in equilibrium. A vapor-liquid phase diagram is generated thereby. Topics covered include entry of formulas, copying these formulas, using Goal Seek to determine the temperature required for a total pressure of one atmosphere, sorting, graph creation, and formatting of spreadsheets and graphs. Goal Seek is located at Data / What-If Analysis. It is used frequently in engineering calculations.
The following exercises are all given in Excel files. Since Excel operations may not available in your browser, it is recommended that you save the file to your computer. This can be done by right-clicking on the link and then Save Target. Then open it from the saved file using Excel. Several of the exercises require tools that are not included in the default installation of Excel. To install, find the add-in directions by clicking on the help icon (? inside circle), search Add-ins, Load or unload add-in programs. Load the Analysis ToolPak and Solver Add-in as indicated below. The Data Analysis and Solver icons will then show up at the right end of the Data toolbar.
· Excel 2010 for Windows: Click on File, Options, Manage Excel Add-Ins, Go, Analysis ToolPak, OK.
· Excel 2007 for Windows: Click on the Office Button in the upper left corner, Excel Options, Add-Ins, Manage Excel Add-Ins, Go, Analysis ToolPak and Solver Add-in, OK.
· Excel 2008 for the Mac: does not support Visual Basic and so does not have these Add-ins will not run macros written in previous versions of Excel.
· Excel 2011 for the Mac: Reports indicate Visual Basic has again been included, but not the Analysis TookPak. Apparently, Microsoft recommends the StatPlus:mac plug-in
· Exercise 3. Use of the "if" function to display the proper solution to the quadratic equation depending on the values of the three constants.
· Exercise 4. In this exercise, a data set is given, simulating measurements made of a single parameter. The exercise is to characterize these data using the Descriptive Analysis and Histogram tools contained within the Data Analysis package (Analysis ToolPak Add-In). Calculated will be the mean, median, range, standard deviation, variance, kurtosis, skewness, 95% confidence limits, and the frequency and cumulative histogram.
· Exercise 5. In the first part of this exercise, a function is plotted and then approximated by a fifth order polynomial using the Trendline correlating function on the resulting graph. It is shown that this polynomial fits the function only within the range utilized by Trendline. This illustrates one of the major problems in using polynomials to correlate data. The other major problem is that such correlations rarely reveal anything about the physics behind the phenomenon. It is always better to fit data to a relationship with a theoretical basis, whenever possible. If this isn't possible, one should seek the simplest possible relationship, as this will be more likely to provide clues to the physics involved. Polynomials should be the last resort. In the second part of this exercise, another function is plotted in order to obtain rough estimates of its three roots (x values giving y = 0). More accurate values for the roots then are found by using the Goal Seek tool, which is at Data / What-If Analysis in Excel 2007 and 2010.
Correlation of Data: By "correlation," we mean here an equation with constants that best fit the data. (It is also possible to have a graphical correlation, but we do not consider that here.) When the error in the values of the dependent variable is normally distributed, "best" means the smallest sum of the squares of the differences between the experimental values and those predicted by the correlating equation. (These differences are called “residuals.”) This is equivalent to maximizing the absolute value of the correlation coefficient R, where R2 is the fraction of the variation in experimental values that is explained by the correlation. Thus, for example, R2 = 1 signifies that the correlating equation gives exactly the experimental values of the dependent variable (y) for all values of the independent variable(s) (x1, x2, etc.).
When data vary over a wide range, you may instead wish to minimize the relative error, i.e. the difference between experimental and predicted values divided by the experimental value. This is done by minimizing the sum of the squares of the differences in logarithms of the experimental values and predicted values. When possible, it is useful to repeat experiments so that you can determine whether the standard deviation is approximately constant over the entire range of experimental values, or whether the relative error is more constant. If the relative error is more constant, then use logarithms. Depending on the equation being fit, this may not be possible with the elementary tools in Excel, but rather require the method shown in Exercise 11.
The following exercises deal with examples of several correlating methods. For one dependent variable (y) and one independent variable (x) that obey one of five simple equations, the simplest method is to use the Trendline function on a graph. If this is not possible, you must first linearize the equation and calculate the new variables. The Regression tool can be used for multiple linear regression analysis if more than two such variables emerge from this linearization. If linearization is not possible, you can use the Solver Add-In to find the coefficients in the equation that minimize the sum of the squares of the residuals. This is known as non-linear regression analysis. Since more than one minimum sometimes exists, the solution obtained may depend on the initial assumptions made for these coefficients.
· Exercise 6. Data are given for the resistance/length of a wire (H) versus its cross-sectional area (A). These data are correlated first by plotting the data and using the Trendline function on the resulting graph. Then they are correlated by finding the average value of HA. Other correlation methods are possible; some are given in the solution to this exercise.
· Exercise 7. Three sets of x-y data and three correlating equations are given. You are asked to determine which equation best fits which set of data. This is done by plotting the data using linearized forms of the equations and then using Trendline for a linear equation. The best fit can readily be seen by comparing the resulting line with the data points, as well as by the highest value of R2.
· Exercise 8. The Solver Add-In tool is used to find the minimum in a function of two variables, and the diameter of a cylinder giving the minimum in its surface area for a specified volume.
· Exercise 9. Data are given for a parameter measured by two different techniques for several different conditions. You are asked to make as many conclusions as possible. Possible techniques include plotting versus condition number; using the Descriptive Statistics tool in the Data Analysis package on the measurements using individual techniques and on the differences in their values; and the t-Test tools.
· Exercise 10. Data are given for the vapor pressure of CO versus temperature. You are asked to use the Antoine equation to correlate these data, i.e. to find the constants in the Antoine equation giving the best fit. A linearized form of the Antoine equation is given that has three new variables, not two. These three variables are calculated in columns adjacent to one another. The Regression tool is used to find constants for the linearized form, from which the Antoine equation constants are calculated. (The Regression tool requires the independent variables, x1, x2, etc., to be in adjacent columns. This tool performs what is known as multiple linear regression analysis, which, basically, is least-squares for more than one independent variable. Matrix methods are utilized by Excel for the calculations.)
· Exercise 11. This is a continuation of Exercise 10. The same data are now correlated by using the Solver Add-In to maximize R2, which is equivalent to minimizing the sum of the squares of the deviations between the logarithms of the experimental vapor pressures and the predicted values. Since the Antoine equation is not linearized in this exercise, this process constitutes non-linear regression analysis. As noted above, the values of the constants found by Solver may depend on the initial assumptions. Such behavior is common for minimization/maximization algorithms. It is as if you are blindfolded, placed somewhere in the Adirondack mountains, and then asked to find the highest mountain. Unable to see, you can only sense whether you are going uphill or downhill. By continually progressing uphill, you will eventually find the nearest mountain -- but not the highest among all mountains. That is, if there are many maxima your algorithm will find one of them, but the one it finds will depend on where you start and may not be the very highest. In other words, it may find only the local extremum rather than the global extremum. Engineers dealing with optimization problems seek to find algorithms that overcome this difficulty. Sometimes the first use of Solver gives only an approximate result, so run it again to make certain. More advanced optimization tools are included in Excel 2010, but these will not be addressed here. Solver support and tutorial
· Exercise 12. This exercise considers the special case of fitting an equation with 0 intercept. While Excel finds the correct constants for the equation, its results for SS, MS, F and significance F seem strange and, it turns out, are controversial. Rather than the customary definitions and computation methods described at Interpretation, for the Total SS Excel now calculates the sum of the squares of the experimental values of the dependent variable, rather than the sum of the squares of the differences between these values and their average. The Residual SS is calculated in the usual way, but the Regression SS is the new Total SS minus the Residual SS. Apparently, this was done to avoid a negative Regression SS, which occurs when a correct fit would pass nowhere near the origin. This, however, gives a meaningless value for F and its significance. But why would one try to fit data with a 0 intercept when this would obviously be inappropriate? This is demonstrated in some detail in this exercise, along with what is believed to be the correct approach and a reasonable alternative. Draw you own conclusions.
Created by Professor William R. Wilcox at Clarkson University. Please email him your comments and suggestions. Additional examples and exercises are sought.
Last revised June 13, 2012.