Fundamental PAW Objects
PAW is implicitly based on a family of
fundamental objects.
Each PAW command performs an action that either produces
another object or produces a ``side-effect'' such as
a printed message or graphics display that is not saved
anywhere as a data structure. Some commands do both,
and some may or may not produce a PAW data structure depending
on the settings of global PAW parameters. This page
describes the basic objects that the user needs to keep
in mind when dealing with PAW.
Objects:
A histogram is the basic statistical analysis tool of PAW.
Histograms are created (``booked'') by choosing
the basic characteristics of their bins, variables, and perhaps
customized display parameters;
numbers are entered into the histogram bins from an Ntuple
(the histogram is ``filled'') by selecting the desired events,
weights, and variable transformations to be used while counts are
accumulated in the bins. Functional forms are frequently fit
to the resulting histograms and stored with them. Thus a fit
as an object is normally associated directly with a histogram,
although it may be considered separately.
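As a rough illustration of booking and filling, the following Python sketch models a 1D histogram with fixed-width bins. It is not PAW or HBOOK code; the names book_1d and fill, the dictionary layout, and the sample values are assumptions made only for this example.

```python
# Illustrative sketch only: book_1d and fill are hypothetical helpers,
# not PAW or HBOOK routines.

def book_1d(nbins, xmin, xmax):
    """Book a histogram: fix the binning and start with empty bins."""
    return {"nbins": nbins, "xmin": xmin, "xmax": xmax,
            "contents": [0.0] * nbins, "underflow": 0.0, "overflow": 0.0}

def fill(hist, x, weight=1.0):
    """Add one entry with the given weight to the bin containing x."""
    if x < hist["xmin"]:
        hist["underflow"] += weight
    elif x >= hist["xmax"]:
        hist["overflow"] += weight
    else:
        width = (hist["xmax"] - hist["xmin"]) / hist["nbins"]
        index = min(int((x - hist["xmin"]) / width), hist["nbins"] - 1)
        hist["contents"][index] += weight

# Book 4 bins on [0, 4) and fill from a few made-up values.
h = book_1d(4, 0.0, 4.0)
for value in [0.5, 1.5, 1.7, 3.2, 9.0]:
    fill(h, value)
fill(h, 2.5, weight=2.0)  # a weighted entry
```

The binning is fixed when the histogram is booked; filling only accumulates weights, which is why the same booked histogram can be refilled with different event selections.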
2D (and higher-dimensional) histograms
are logical generalizations of 1D histograms. 2D histograms, for
example, are viewable as the result of counting the points in
the cells of a rectangular grid overlaid on a scatter plot
of two variables.
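The grid-counting picture can be sketched in a few lines of Python. This is only a conceptual model, assuming equal-width cells and made-up points; count_on_grid is a hypothetical name, not a PAW command.

```python
# Conceptual sketch: count scatter-plot points in the cells of a grid.
# count_on_grid is a hypothetical helper, not part of PAW.

def count_on_grid(points, nx, xmin, xmax, ny, ymin, ymax):
    """Return a ny-by-nx grid of counts; points outside the grid are ignored."""
    grid = [[0] * nx for _ in range(ny)]
    dx = (xmax - xmin) / nx
    dy = (ymax - ymin) / ny
    for x, y in points:
        if xmin <= x < xmax and ymin <= y < ymax:
            ix = min(int((x - xmin) / dx), nx - 1)
            iy = min(int((y - ymin) / dy), ny - 1)
            grid[iy][ix] += 1
    return grid

# Made-up (x, y) pairs; the last one falls outside the grid.
points = [(0.2, 0.3), (0.8, 0.1), (1.5, 0.4), (1.1, 1.9), (0.4, 1.2), (5.0, 5.0)]
grid = count_on_grid(points, 2, 0.0, 2.0, 2, 0.0, 2.0)
```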
An Ntuple is a set of
events, where for each event the values of a number of variables are recorded.
Ntuples are a very convenient tool for analyzing statistical datasets.
An Ntuple can be viewed as a table with each row corresponding to one event
and each column corresponding to a given variable. Typically the events are
scattering or collision events recorded by a HEP experiment and variables
can be measurements like a scattering angle, the number of tracks or the
energy deposited in a calorimeter (but Ntuples are useful for the analysis
of any statistical data sample). The interesting properties of the data in
an Ntuple can normally be expressed as distributions of Ntuple variables or
as correlations between two or more of these variables. Very often it is
important to create these distributions from a subset of the data by
imposing constraints (also known as `cuts') on some of the variables.
An Ntuple is the basic type of data used in PAW.
Typically, an Ntuple is
made available to PAW by opening a direct access file
that has previously been created by a program using HBOOK.
A storage area for an Ntuple may also be created directly
using NTUPLE/CREATE; data may then be stored in
the allocated space
using the NTUPLE/LOOP or NTUPLE/READ commands.
Other commands merge Ntuples into larger Ntuples,
project vector functions of the Ntuple variables into
histograms, and plot selected subsets of events.
A cut is a Boolean function of Ntuple variables.
Cuts are used to select subsets of events in an Ntuple
when creating histograms and plotting variables.
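To make the table view of an Ntuple and the role of a cut concrete, here is a small Python sketch. The variable names (angle, ntracks, energy), the values, and the cut itself are invented for illustration and do not come from any real dataset or from PAW syntax.

```python
# Hypothetical Ntuple: each row is one event, each key is one variable.
ntuple = [
    {"angle": 0.10, "ntracks": 2, "energy": 12.5},
    {"angle": 0.45, "ntracks": 5, "energy": 48.0},
    {"angle": 0.80, "ntracks": 3, "energy": 7.2},
    {"angle": 0.30, "ntracks": 6, "energy": 33.1},
]

def cut(event):
    """A cut is a Boolean function of the Ntuple variables."""
    return event["ntracks"] >= 3 and event["energy"] > 10.0

# Apply the cut to select a subset of events, then extract one
# variable from the selected events, ready to be histogrammed.
selected = [event for event in ntuple if cut(event)]
energies = [event["energy"] for event in selected]
```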
Masks are separate files that are
logically identical to a set of Boolean
variables appended to an Ntuple's data structure.
A mask is constructed using the Boolean result of applying
a cut to an event set. A mask
is useful only for efficiency; the effect of a mask is identical
to that of the cut that produced it.
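The equivalence between a mask and the cut that produced it can be sketched as follows. This is a conceptual Python model with invented events and an invented cut; it does not reflect how PAW stores mask files.

```python
# Made-up events; each dict is one row of a hypothetical Ntuple.
events = [
    {"ntracks": 2, "energy": 12.5},
    {"ntracks": 5, "energy": 48.0},
    {"ntracks": 3, "energy": 7.2},
    {"ntracks": 6, "energy": 33.1},
]

def cut(event):
    """An example cut: a Boolean function of the event's variables."""
    return event["ntracks"] >= 3 and event["energy"] > 10.0

# Build the mask once: one stored Boolean per event.
mask = [cut(event) for event in events]

# Selecting with the cut re-evaluates the expression for every event,
# while selecting with the mask only reads the stored Booleans.
selected_by_cut = [event for event in events if cut(event)]
selected_by_mask = [event for event, keep in zip(events, mask) if keep]
```

Both selections contain the same events; the mask simply avoids recomputing the cut each time the selection is needed.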
PAW provides the facilities to store vectors of integer or real
data. These vectors, or rather arrays with up to 3 index dimensions,
can be manipulated with a set of dedicated commands. Furthermore
they are interfaced to the array manipulation package SIGMA and to the
Fortran interpreter COMIS. They provide a convenient and easy way
to analyse small data sets stored in ASCII files.
A ``style'' is a set of variables that control
the appearance of PAW plots.
Commands of the
form OPTION attribute choose particular plotting options
such as logarithmic/linear, bar-chart/scatter-plot, and statistics
display.
Commands of the form SET parameter value control
a vast set of numerical format parameters used to control plotting.
Until the ``style'' object becomes a formal part
of PAW, a ``style'' can be constructed by the user in the form
of a macro file that resets all parameters to their defaults
and then sets the desired customizations.
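The reset-then-customize pattern for a style can be modelled in Python as below. The parameter names and default values are hypothetical placeholders, not actual PAW OPTION or SET parameters.

```python
# Hypothetical plotting parameters and defaults, for illustration only.
DEFAULTS = {"y_scale": "linear", "show_statistics": False, "line_width": 1}

def apply_style(current, customizations):
    """Reset every parameter to its default, then apply the style's changes."""
    current.clear()
    current.update(DEFAULTS)
    current.update(customizations)
    return current

my_style = {"y_scale": "log", "show_statistics": True}

# Two sessions that start from different leftover settings...
session_a = {"y_scale": "linear", "show_statistics": False, "line_width": 3}
session_b = {"y_scale": "log", "show_statistics": False, "line_width": 1}

# ...end up in the same state after the style is applied.
apply_style(session_a, my_style)
apply_style(session_b, my_style)
```

Resetting first is what makes the result independent of whatever settings were in effect before the style was applied.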
In normal interactive usage, images created
on the screen correspond to no persistent data structure. If one
wishes to create a savable graphics object, the user establishes
a metafile; as a graphics image is being drawn, each command is
then saved in a text file in coded form that allows the image
to be duplicated by other systems. PostScript format metafiles
are especially useful because they can be directly printed on
most printers; furthermore, the printed quality of graphics objects
such as fonts can be much higher than that of the original screen image.
Metafiles describing very complex graphics
objects can be extremely lengthy, and therefore inefficient in terms
of storage and the time needed to redraw the image.
A picture is an exact copy of the screen image, and so its
storage and redisplay time are independent of complexity. Pictures
are also intensively used for object picking in the Motif version
of PAW.
In a single PAW
session, the user may work simultaneously with many Ntuples,
histograms, and hierarchies of Ntuples and histograms.
However, this is not accomplished using the native operating system's
file handler. Instead, the user works with a set of objects that
resemble a file system but are managed by
the ZEBRA RZ package. This can be somewhat confusing because
a single operating system file created by RZ can contain an
entire hierarchy of ZEBRA logical directories; furthermore,
sections of internal memory can also be organized as
ZEBRA logical directories to receive newly-created PAW objects
that are not written to files. The commands CDIR,
LDIR, and MDIR are the basic utilities for walking
through a set of ZEBRA logical directories of PAW objects.
Each set of directories contained in an actual file corresponds
to a logical unit number, and the root of the tree is usually
of the form //LUNx; the PAW objects and logical directories
stored in internal memory have the root //PAWC.
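As a mental model of logical directory trees with several roots, the following Python toy mimics changing, listing, and making directories. It is only an analogy: the class, its methods, and the directory names RUN1 and RUN2 are invented here, it handles absolute paths only, and it says nothing about how ZEBRA RZ is actually implemented.

```python
class LogicalDirectories:
    """Toy model of logical directory trees; not the ZEBRA RZ implementation."""

    def __init__(self):
        # Each root maps to a nested dict of subdirectories.
        self.roots = {"//PAWC": {}}      # in-memory tree
        self.current = ["//PAWC"]

    def attach(self, root):
        """Add a new root, as if a file had been opened on a logical unit."""
        self.roots[root] = {}

    @staticmethod
    def _split(path):
        # "//LUN1/RUN1" -> ["//LUN1", "RUN1"]; absolute paths only.
        names = path[2:].split("/")
        return ["//" + names[0]] + names[1:]

    def _node(self, parts):
        node = self.roots[parts[0]]
        for name in parts[1:]:
            node = node[name]
        return node

    def cdir(self, path):
        """Change the current directory (raises KeyError if it is missing)."""
        parts = self._split(path)
        self._node(parts)
        self.current = parts

    def mdir(self, name):
        """Make a subdirectory of the current directory."""
        self._node(self.current)[name] = {}

    def ldir(self):
        """List the subdirectories of the current directory."""
        return sorted(self._node(self.current))

dirs = LogicalDirectories()
dirs.attach("//LUN1")          # hypothetical file-backed root
dirs.cdir("//LUN1")
dirs.mdir("RUN1")
dirs.mdir("RUN2")
listing = dirs.ldir()
dirs.cdir("//LUN1/RUN1")
```

The point of the model is that //PAWC and each //LUNx are separate roots with their own trees, independent of the operating system's directory layout.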
A macro is a set of command lines stored in a file, which can be
created or modified with any text editor. In addition to all the
PAW commands, special macro flow control statements are also
available.
Many different
ZEBRA files, some with logically equivalent Ntuples and
histograms, can be arranged in the user's operating system
file directories. Thus one must also keep clearly in mind
the operating system file directories and their correspondence
to the ZEBRA logical directories containing data that one
wishes to work with. In many ways, the operating system file
system is also a type of ``object'' that forms an essential
part of the user's mental picture of the system.
Olivier.Couet@Cern.Ch