JavaBean Components for Statistical Analysis
Contents
Overview
Types of STATBEANS
Installation
DataSource StatBeans
Calculation StatBeans
Tabular StatBeans
Graphical StatBeans
Developing an Application
Notes
Examples
Trademark and Copyright Notification
STATBEANS® is a collection of JavaBeans that implement many
commonly used statistical procedures. They are designed to
be embedded in user-written applications or placed on web
pages. Because they are structured as a component library,
they can be manipulated easily in various visual development
environments. Users may access STATBEANS either as
JavaBeans or as ActiveX components using the
JavaBeans-ActiveX bridge.
Types of STATBEANS
There are four basic types of StatBeans:
DataSource StatBeans - these beans
maintain a rectangular data table which other StatBeans
access to retrieve data for analysis. DataSource StatBeans
are provided for reading data from local text files, for
reading data over the Internet or local intranets, for
accessing databases via JDBC, and for maintaining data
generated by user programs.
Calculation StatBeans - these are
non-visible beans which perform statistical calculations.
They may be called by user programs to calculate statistics.
They are also accessed by the tabular and graphical
StatBeans.
Tabular StatBeans - these StatBeans
perform statistical calculations and display them in the
form of tables.
Graphical StatBeans - these
StatBeans perform statistical calculations and display them
in the form of graphs.
Users create applications by first
adding one or more DataSource StatBeans to their project,
and then linking the other StatBeans to the datasource.
Installation
STATBEANS consists of a collection of components which
are packaged in a file called statbeans.jar.
To install it:
1. Place the jar file in
a new directory called statbeans.
2. Update the CLASSPATH
environment variable to include
statbeans\statbeans.jar.
3. Import the jar file into your application development tool,
if desired.
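One way to confirm that the CLASSPATH change took effect is to ask the JVM whether it can locate a StatBeans class. The following is a minimal sketch, not part of the STATBEANS distribution: the helper class StatBeansClasspathCheck is hypothetical, and the probed class name STATBEANS.FileDataSource is assumed from the import statements shown later in this document.

```java
// Minimal classpath check. The probed class name is taken from the import
// statements used later in this document; adjust it if your copy differs.
final class StatBeansClasspathCheck {

    /** Returns true if the named class can be located on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, StatBeansClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String probe = "STATBEANS.FileDataSource";
        System.out.println(probe + (isOnClasspath(probe)
                ? " found: statbeans.jar is on the classpath."
                : " not found: check the CLASSPATH setting."));
    }
}
```

If the class is reported as not found, the usual cause is a CLASSPATH entry that names the statbeans directory rather than the jar file itself.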
There are a number of other useful files distributed with
the package:

 Documentation files - the
documentation for STATBEANS consists of a set of HTML
files providing an overview of the system and describing
each StatBean.

 Sample applications - sample Java
source files showing how each StatBean may be used.
These examples demonstrate both how to use the StatBeans
with data generated by an application and how to hook
them to external data files and databases. Several
sample data files are also included.
DataSource StatBeans
DataSource StatBeans maintain rectangular data tables
into which data is loaded for access by other types of
StatBeans. Each project must contain at least one datasource
StatBean. Five types of datasource StatBeans are provided:

FileDataSource - reads data from a file on a local
disk.

JdbcDataSource - retrieves data from a database via
an SQL query using JDBC.

ProgramDataSource - maintains a data table for
holding data generated by a user-written Java program.

UrlDataSource - reads data from a file located on
the Internet or an intranet via its URL.

Calculation StatBeans - these beans inherit the data
from their parent DataSource and add other variables
which they calculate (such as residuals from a
regression).
Calculation StatBeans
Calculation StatBeans compute statistics which may then
be accessed by user programs or other StatBeans. They
retrieve data from a datasource StatBean by specifying one
or more column names. Each calculation StatBean is set up to
listen for that datasource's dataChange
event, which causes it to request data from the datasource
and perform its calculations. Results may then be obtained
from the calculation StatBean by invoking one of its
methods.
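The event flow described above can be illustrated in plain Java. The sketch below does not use the STATBEANS classes: DemoDataSource, DemoMean, and the DataChangeListener interface are hypothetical stand-ins, written only to show how a calculation component that listens for a data-change event can pull a named column and recompute its result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DataChangeDemo {

    /** Stand-in for a dataChange listener; not the STATBEANS interface. */
    interface DataChangeListener {
        void dataChanged(DemoDataSource source);
    }

    /** Stand-in datasource: a rectangular table keyed by column name. */
    static final class DemoDataSource {
        private final Map<String, double[]> columns = new LinkedHashMap<>();
        private final List<DataChangeListener> listeners = new ArrayList<>();

        void addDataChangeListener(DataChangeListener listener) {
            listeners.add(listener);
        }

        double[] getColumn(String name) {
            return columns.get(name);
        }

        /** Stores a column, then notifies every registered listener. */
        void setColumn(String name, double[] values) {
            columns.put(name, values.clone());
            for (DataChangeListener listener : listeners) {
                listener.dataChanged(this);
            }
        }
    }

    /** Stand-in calculation bean: recomputes a mean whenever the data changes. */
    static final class DemoMean implements DataChangeListener {
        private final String variableName;
        private double mean = Double.NaN;

        DemoMean(String variableName) {
            this.variableName = variableName;
        }

        @Override
        public void dataChanged(DemoDataSource source) {
            // Request the column by name, as a calculation bean would.
            double[] x = source.getColumn(variableName);
            if (x == null || x.length == 0) {
                mean = Double.NaN;
                return;
            }
            double sum = 0.0;
            for (double v : x) {
                sum += v;
            }
            mean = sum / x.length;
        }

        double getMean() {
            return mean;
        }
    }

    public static void main(String[] args) {
        DemoDataSource source = new DemoDataSource();
        DemoMean meanOfMpg = new DemoMean("mpg");
        source.addDataChangeListener(meanOfMpg);

        // Loading data fires the change notification; the result is then read back.
        source.setColumn("mpg", new double[] {20.0, 30.0, 40.0});
        System.out.println("mean = " + meanOfMpg.getMean());
    }
}
```

The design point is that the calculation component never polls: it recomputes only when notified, and callers read the result afterwards through an ordinary getter.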
The following calculation StatBeans are currently
available:

Anova - multifactor analysis of variance.

Autocorrelations - calculates sample
autocorrelations and partial autocorrelations for a time
series.

CapabilityAnalysis - compares data to process
specification limits.

ContingencyTableStats - calculates measures of
association for rows and columns in contingency tables.

ControlCharts - calculates control charts for
variables and attributes.

Correlations - estimates correlation coefficients
between pairs of numeric variables.

Crosstabulation - creates a contingency table for 2
categorical or numeric data variables.

Distributions - computes probabilities and generates
random numbers for 24 probability distributions.

FitDistribution - fits distributions to a column of
data, computes probabilities, and generates random
numbers.

GageRandR - estimates gage repeatability and
reproducibility.

HypothesisTests - performs hypothesis tests for
means, medians, standard deviations, proportions, and
rates.

MultipleRegression - fits a regression model to
relate Y to one or more predictor variables.

NonlinearRegression - fits a nonlinear regression
model to relate Y to one or more predictor variables.

Percentiles - calculates percentiles for a column of
numeric data.

Periodogram - calculates periodogram ordinates for a
time series.

PolynomialRegression - fits a polynomial model to
relate Y and X.

SampleStatistics - calculates sample statistics for
two or more columns of data.

SimpleRegression - fits a linear or curvilinear
model to relate Y and X.

Tabulation - tabulates categorical or numeric data.

TimeSeriesAdjustments - applies various mathematical
and other adjustments to a time series.

TimeSeriesForecast - forecasts values of a time
series.

TimeSeriesSmoothing - applies different types of
smoothers to a time series.

ToleranceLimits - calculates normal tolerance limits
for a column of numeric data.
Tabular StatBeans
Tabular StatBeans compute statistics and display them in
the form of tables. They usually retrieve their data and
results from a Calculation StatBean, although simple Tabular
StatBeans (such as DataDisplayTable) retrieve their data
directly from a DataSource StatBean.
The following Tabular StatBeans are currently available:
Graphical StatBeans
Graphical StatBeans compute statistics and display them
in the form of graphs. They usually retrieve their data and
results from a Calculation StatBean, although simple
Graphical StatBeans (such as XYPlot) retrieve their data
directly from a DataSource StatBean.
The following Graphical StatBeans are currently
available:

AutocorrelationsPlot - plots the sample
autocorrelation and partial autocorrelation functions
for a time series.

Barchart - plots a barchart for a single
classification factor.

BoxAndWhiskerPlot - creates a box-and-whisker plot
for a single column of numeric data.

CapabilityAnalysisPlot - displays the results of a
capability analysis.

CasementPlot - displays a series of XY scatterplots
by levels of a third variable.

ComponentChart - creates a component line chart with
filled areas.

ControlChartsPlot - plots control charts for
variables and attributes.

ContourPlot - creates contour plots for a response
surface.

DensityTrace - estimates the probability density
function for a single column of numeric data.

DexPlot - creates mean, standard deviation, and
interaction plots for designed experiments.

DistributionsPlot - plots probability distributions
and related functions.

DotPlot - displays a dot frequency plot for a column
of numeric data.

DraftsmansPlot - displays top, front, and side views
of a 3D scatterplot.

FactorBoxPlot - creates box-and-whisker plots by
levels of an experimental factor.

FactorScatterPlot - creates scatterplots by levels
of an experimental factor.

FitDistributionPlot - plots the results of fitting
one or more distributions to a column of data.

FitDistributionQQPlot - plots a quantile-quantile
plot to show goodness-of-fit after fitting one or more
distributions to a column of data.

FrequencyPolygon - plots a frequency polygon or
cumulative distribution function.

GageRandRPlot - plots data from a gage repeatability
and reproducibility study.

Histogram - plots a frequency histogram to show the
distribution of numeric data.

HypothesisTestsPlot - displays results of hypothesis
tests.

MosaicPlot - creates a mosaic plot for a two-way
crosstabulation.

MultipleRegressionComponentPlot - creates a
component+residual plot for a selected variable in a
multiple regression model.

MultipleRegressionContourPlot - creates a contour
plot of the response in a multiple regression model.

MultipleRegressionSurfacePlot - creates a 3D surface
plot of the response in a multiple regression model.

MultipleRegression2DResponsePlot - creates a 2D plot
of the response in a multiple regression model.

MultipleXYPlot - creates a 2D plot with two or more
sets of lines or points.

ParetoChart - plots a Pareto chart to highlight the
"vital few".

PeriodogramPlot - plots a periodogram or integrated
periodogram for a time series.

Piechart - plots a piechart for a single
classification factor.

ProbabilityPlot - constructs a probability plot for
a single column of numeric data.

QuantilePlot - plots a quantile plot for a single
column of numeric data.

QuantileQuantilePlot - plots quantiles of two
samples versus each other.

ScatterplotMatrix - displays a matrix of 2-variable
scatterplots for several numeric columns.

SimpleRegressionPlot - displays results of fitting a
regression model relating Y and X.

Skychart - creates a 3D skychart for a two-way
crosstabulation.

SubseriesPlot - plots seasonal time series data.

SurfacePlot - creates 3D surface plots for a
response surface.

TimeSeriesForecastPlot - plots forecasts for a time
series.

TimeSeriesPlot - plots time series data.

TwowayBarchart - creates a barchart for a two-way
crosstabulation.

XYPlot - displays a scatterplot or lineplot for two
columns of data.

XYZPlot - displays a scatterplot or lineplot for
three columns of data.
Developing an Application
To develop an application which uses STATBEANS, you can
use a visual development tool such as Visual Café or VisualAge
for Java, or you can manipulate the components directly.
In most development tools, you begin by adding the
statbeans.jar file to a component library, after
which you can drop the components onto a design form.
To develop an application, several steps are then
necessary:

 STEP 1: add a DataSource StatBean
to the project and set its properties. For example, to
read a file, you would insert the FileDataSource
bean into your project and set the fileName
property to the name of the file you want to read. The
following lines are needed:
import STATBEANS.FileDataSource;
fileDataSource1 = new FileDataSource();
fileDataSource1.setFileName("c:\\statbeans\\samples\\cardata.txt");

 STEP 2: add a Calculation StatBean
to the project and set its properties. For example, to
fit a straight line relating two columns of data in the
datasource named "mpg" and "weight", you would
insert the SimpleRegression bean into
your project, and then set the XVariableName
property and YVariableName property to
the names of the columns to be analyzed. The following
lines are needed:
import STATBEANS.SimpleRegression;
simpleRegression1 = new STATBEANS.SimpleRegression();
simpleRegression1.setYVariableName("mpg");
simpleRegression1.setXVariableName("weight");

 STEP 3: add one or more Tabular and
Graphical StatBeans to the project and set their
properties. For example, to display the results of Step
2, you would insert the SimpleRegressionTable
and SimpleRegressionPlot beans into
your project. The following lines are needed:
import STATBEANS.SimpleRegressionTable;
import STATBEANS.SimpleRegressionPlot;
simpleRegressionTable1 = new STATBEANS.SimpleRegressionTable();
simpleRegressionPlot1 = new STATBEANS.SimpleRegressionPlot();
simpleRegressionPlot1.setConfidenceLevel(99.0);

 STEP 4: connect the
SimpleRegression bean to the FileDataSource bean. Also
connect the SimpleRegressionTable and
SimpleRegressionPlot beans to the SimpleRegression bean.
This is done by making each target StatBean a listener for
the dataChange event of the StatBean that supplies its
data. To do so, add the following
lines of code to the init() or
main() method:
fileDataSource1.addDataChangeListener(simpleRegression1.listenerForDataChange);
simpleRegression1.addDataChangeListener(simpleRegressionTable1.listenerForDataChange);
simpleRegression1.addDataChangeListener(simpleRegressionPlot1.listenerForDataChange);

 STEP 5: instruct the FileDataSource
bean to read its data. The following line is needed:
fileDataSource1.readData();
When the applet or application is run, it creates the four
StatBeans. When the readData() method is executed, the
FileDataSource bean reads the data file, stores the data in
an invisible rectangular table, and fires its dataChange
event, which causes the SimpleRegression bean to request
data from the datasource bean and calculate the desired
statistics. The SimpleRegression bean then fires its
dataChange event, which causes the SimpleRegressionTable and
SimpleRegressionPlot beans to update their displays.
In general, you must add at least one DataSource StatBean
and one Calculation StatBean to each project. Some simple
Tabular and Graphical StatBeans, however, can connect
directly to a DataSource StatBean.
Notes
Some special features and other items of note are:
(1) Each StatBean lists various Read/Write Properties. The
properties may be read or set by capitalizing the first
letter of the property name and adding one of the following
prefixes to the front:
"set" to set the value of any property, as in
simpleRegression1.setModelType("Exponential").
"get" to read the value of any property except a
boolean, as in simpleRegression1.getSlope().
"is" to read the value of a boolean, as in
simpleRegression1.isIncludeConstant().
The Other Public Methods are called exactly as listed.
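The naming rule can be checked mechanically. The sketch below builds accessor names from a property name and looks them up by reflection on a small bean. DemoBean and accessorName are hypothetical names introduced for illustration; DemoBean is not a STATBEANS class and only mirrors the style of the properties mentioned above.

```java
import java.lang.reflect.Method;

final class PropertyNamingDemo {

    /** Hypothetical bean for illustration only; not a STATBEANS class. */
    public static class DemoBean {
        private String modelType = "Linear";
        private boolean includeConstant = true;

        public void setModelType(String modelType) {
            this.modelType = modelType;
        }

        public String getModelType() {
            return modelType;
        }

        public void setIncludeConstant(boolean includeConstant) {
            this.includeConstant = includeConstant;
        }

        public boolean isIncludeConstant() {
            return includeConstant;
        }
    }

    /** Builds an accessor name: prefix + property name with its first letter capitalized. */
    static String accessorName(String prefix, String property) {
        return prefix + Character.toUpperCase(property.charAt(0)) + property.substring(1);
    }

    public static void main(String[] args) throws Exception {
        DemoBean bean = new DemoBean();

        // "set" + "ModelType" gives setModelType(String)
        Method setter = DemoBean.class.getMethod(accessorName("set", "modelType"), String.class);
        setter.invoke(bean, "Exponential");

        // "get" reads non-boolean properties; "is" reads boolean properties
        Method getter = DemoBean.class.getMethod(accessorName("get", "modelType"));
        Method booleanGetter = DemoBean.class.getMethod(accessorName("is", "includeConstant"));

        System.out.println(getter.getName() + "() -> " + getter.invoke(bean));
        System.out.println(booleanGetter.getName() + "() -> " + booleanGetter.invoke(bean));
    }
}
```

In ordinary application code the accessors are simply called directly; the reflection here is only a way to show that the derived names match real methods.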
(2) Calculation StatBeans save intermediate results in the
Output Variables listed. The calculation StatBean then
serves as a datasource to other beans, adding these output
variables to the variables of its input datasource. A good
example of this feature is contained in the file
XYPlotExample2.java, which uses the XYPlot StatBean
to plot residuals from a simple regression.
(3) Most Calculation StatBeans require the input of column
names to specify the data to be analyzed. In place of a
simple name such as "weight", you may
specify instead a transformation of a column by entering a
string such as "LOG(weight)". The
transformations currently supported are:
natural logarithm - LOG(weight)
square root - SQRT(weight)
cube root - CBRT(weight)
absolute value - ABS(weight)
exponential function - EXP(weight)
raise to a power - weight^1.5
Full algebraic parsing will be added at some point in the
future.
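To make the meaning of these transformation strings concrete, the following sketch interprets the listed forms and applies them to individual values. It is an illustration under stated assumptions, not the STATBEANS parser: the class ColumnTransformDemo and its methods are hypothetical, column names are assumed to consist of letters, digits, and underscores, and the function names are assumed to be upper case as written above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ColumnTransformDemo {

    // NAME is a plain column name; FUNCTION and POWER mirror the forms listed above.
    private static final Pattern NAME = Pattern.compile("\\w+");
    private static final Pattern FUNCTION =
            Pattern.compile("(LOG|SQRT|CBRT|ABS|EXP)\\((\\w+)\\)");
    private static final Pattern POWER =
            Pattern.compile("(\\w+)\\^(-?\\d+(?:\\.\\d+)?)");

    /** Extracts the column name from a specification such as "LOG(weight)". */
    static String columnName(String spec) {
        String s = spec.trim();
        Matcher f = FUNCTION.matcher(s);
        if (f.matches()) return f.group(2);
        Matcher p = POWER.matcher(s);
        if (p.matches()) return p.group(1);
        if (NAME.matcher(s).matches()) return s;
        throw new IllegalArgumentException("Unsupported specification: " + spec);
    }

    /** Applies the transformation named in spec to a single value. */
    static double apply(String spec, double x) {
        String s = spec.trim();
        Matcher f = FUNCTION.matcher(s);
        if (f.matches()) {
            switch (f.group(1)) {
                case "LOG":  return Math.log(x);   // natural logarithm
                case "SQRT": return Math.sqrt(x);
                case "CBRT": return Math.cbrt(x);
                case "ABS":  return Math.abs(x);
                default:     return Math.exp(x);   // only EXP remains
            }
        }
        Matcher p = POWER.matcher(s);
        if (p.matches()) return Math.pow(x, Double.parseDouble(p.group(2)));
        if (NAME.matcher(s).matches()) return x;   // plain column name: value unchanged
        throw new IllegalArgumentException("Unsupported specification: " + spec);
    }

    public static void main(String[] args) {
        double[] weight = {1.0, 4.0, 9.0};
        for (String spec : new String[] {"weight", "SQRT(weight)", "weight^1.5"}) {
            StringBuilder row = new StringBuilder(spec + " on column " + columnName(spec) + ":");
            for (double w : weight) {
                row.append(' ').append(apply(spec, w));
            }
            System.out.println(row);
        }
    }
}
```

Anything outside the listed forms is rejected with an exception here, which matches the statement that full algebraic parsing is not yet supported.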
Examples
Examples are included throughout this documentation. In
addition, sample Java files are distributed with the system
which implement each StatBean as a simple Java application.
The sample files have names such as
ControlChartsExample.java, which creates the
following output:
Trademark and Copyright Notification
STATBEANS is a trademark of StatPoint Technologies, Inc. All rights
reserved. All STATBEANS code and documentation is copyright
2009 by StatPoint Technologies, Inc., and is not to be redistributed
without express written permission. 