Computed theoretical power for N=100 and N=200 scenarios
165
Modules/ado/plus/v/violin.hlp
Normal file
@@ -0,0 +1,165 @@
.-
help for ^violin^                                                (STB-46: gr33)
.-

Violin plots
------------

    ^violin^ varlist [weight] [^if^ exp] [^in^ range]
           [^,^ {^bi^weight|^cos^ine|^ep^an|^gau^ss|^par^zen|^rec^tangle|^tri^angle}
           ^n(^#^) w^idth^(^#^) by(^byvar^) tru^ncat^(^#,#|*^) ro^und^(^#^)^
           graph_options ]

^fweights^ and ^aweights^ are allowed; see ^help^ @weights@.


Description
-----------

^violin^ produces violin plots, a graphical box plot--kernel density synergism.
The violin plot combines the basic summary statistics of a box plot with the
visual information provided by a local density estimator. The goal is to
reveal the distributional structure in a variable. Much like a traditional
box plot, the violin plot displays the median as a short horizontal line, the
first-to-third interquartile range as a narrow shaded box, and the lower-to-
upper adjacent value range as a vertical line, but it does not plot outside
values. Instead, it "boxes" the data with mirrored density curves and labels
the y-axis at the minimum, median and maximum observed data values.
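
The box-plot quantities named above can be sketched outside Stata. The Python
fragment below is an illustration only, not ^violin^'s code: the function name
box_summary is hypothetical, the quartiles are interpolated at position
(n+1)p, and the adjacent values use the conventional 1.5 x IQR fences, which
this help file does not spell out. It needs at least two observations.

```python
import statistics


def box_summary(values):
    """Return the summary values a violin plot draws, under textbook definitions."""
    data = sorted(values)
    # Quartiles interpolated at position (n+1)p; an assumption, not read from violin.ado.
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    # Adjacent values: the most extreme observations still inside the 1.5*IQR fences.
    lower_adjacent = min(x for x in data if x >= q1 - 1.5 * iqr)
    upper_adjacent = max(x for x in data if x <= q3 + 1.5 * iqr)
    return {
        "min": data[0],
        "lower_adjacent": lower_adjacent,
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_adjacent": upper_adjacent,
        "max": data[-1],
        "n": len(data),
    }
```

In this sketch an observation beyond the upper adjacent value still sets
"max", which mirrors the y-axis labelling described above: outside values are
not plotted, but the maximum is labelled.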

^violin^ also lists basic descriptive statistics about the data (i.e., the
lower and upper adjacent values, the 25th and 75th centiles, the minimum,
median and maximum of the data, and the sample size) and it provides
information about the density estimation (i.e., the kernel method used, the
number of points of estimation, and the resulting scale and width factors).
When ^by()^ is specified, descriptive statistics are displayed for the combined
group only. When multiple variables are included in varlist, statistics are
displayed for the last variable only.

^violin^ discards observations on a casewise basis as a function of 1) missing
data and 2) the ^if^ (or ^in^) specification (i.e., it ignores the entire
observation). This behavior may lead to unexpected results when multiple
variables are in the varlist.
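
A small sketch of why casewise deletion can surprise. This is Python with
made-up values, and casewise_keep is a hypothetical helper, not part of
^violin^: a variable plotted alone keeps more observations than the same
variable plotted alongside another one that has missing values.

```python
def casewise_keep(rows, varlist):
    """Keep only the rows in which every variable in varlist is non-missing."""
    return [row for row in rows if all(row.get(v) is not None for v in varlist)]


# Made-up data; None stands in for a missing value.
rows = [
    {"length": 186, "weight": 2930},
    {"length": 173, "weight": None},
    {"length": None, "weight": 4080},
    {"length": 222, "weight": 3350},
]

alone = casewise_keep(rows, ["length"])               # length plotted by itself
together = casewise_keep(rows, ["length", "weight"])  # length plotted with weight
```

Here "alone" retains three observations of length while "together" retains
only two, so the plot of length would differ between the two calls.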

Note: ^violin^ calls ^centile^ to compute the needed centiles but ^centile^ does
      not respond to a ^[weight]^ specification. This conflicts with the
      ^kdensity^ code which responds to that specification. The implications of
      this conflict have not been explored, but ^violin^ currently allows the
      ^[weight]^ specification to be passed through to ^kdensity^.

Note: ^violin^ uses a low-level ^gph^ command which is not supported in Stata's
      release 2 ^.gph^ format. As a result, neither ^Stage^ nor the ^gphdot^ or
      ^gphpen^ DOS-based graphics output programs can process a saved violin-plot
      graphics file. This limitation does not affect screen display or output
      using the ^Print Graph^ option of Stata's ^File^ menu.


Options
-------

^biweight^, ^cosine^, ..., ^triangle^ specify the kernel. By default, ^epan^, the
    Epanechnikov kernel, is used.

^n(^#^)^ specifies the number of points at which density estimates will be
    evaluated. The default is 50.

^width(^#^)^ specifies the halfwidth of the kernel, the width of the density
    window around each point. If ^width()^ is not specified, then the "optimal"
    width is used; see ^[R] kdensity^. For multimodal and highly skewed
    densities, the "optimal" width is usually too wide and oversmooths the
    density.
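
How the kernel, ^n()^ and ^width()^ interact can be sketched as follows. This
is an illustrative Python fragment, not the ^kdensity^ implementation: the
function names are hypothetical, the kernel is the textbook Epanechnikov form
on [-1, 1] (Stata's version may be scaled differently), the grid range of one
halfwidth beyond the data is an assumption, and the default width is a
Silverman-style rule of thumb, 0.9 * min(sd, IQR/1.349) * N^(-1/5), assumed
here to stand in for the "optimal" width.

```python
import statistics


def epanechnikov(u):
    """Textbook Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def rule_of_thumb_width(data):
    """Silverman-style halfwidth; assumed stand-in for the "optimal" width."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(statistics.stdev(data), (q3 - q1) / 1.349)
    return 0.9 * spread * len(data) ** (-0.2)


def density_trace(data, n=50, width=None):
    """Evaluate the kernel density estimate at n evenly spaced points."""
    h = rule_of_thumb_width(data) if width is None else width
    lo, hi = min(data) - h, max(data) + h  # assumed grid range
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    dens = [
        sum(epanechnikov((x - xi) / h) for xi in data) / (len(data) * h)
        for x in grid
    ]
    return grid, dens
```

A larger width spreads each observation's contribution over more grid points,
which is why a width that is too wide merges separate modes into one. The
sketch does not guard against a zero width, which the rule of thumb produces
when the data have no spread.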

^by(^byvar^)^ produces separate plots for the groups of observations defined by
    byvar and displays them in a single graph having common vertical scale.
    ^by()^ cannot be specified when there is more than one variable in the
    varlist.

^truncat(^#^,^#|^*)^ limits the range of the density trace, either to a range
    specified as ^(^#^,^#^)^, or to the observed data limits, specified as ^(*)^.
    Regardless of the actual ^(^#^,^#^)^ specification, the maximum range truncation
    honored is the observed data limits. The precise truncation points will
    be the most extreme points within the specified range where the density is
    calculated (the points of density calculation depend on ^n()^, ^width()^
    and the observed data).
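
One possible reading of the truncation rule, sketched in Python. This is an
interpretation of the paragraph above, not ^violin^'s code: truncate_trace is
a hypothetical name, passing no limits stands in for ^truncat(*)^, and the
sketch assumes that requested limits lying inside the data range are widened
to the observed minimum and maximum.

```python
def truncate_trace(grid, dens, data, lower=None, upper=None):
    """Drop density points outside the requested range.

    lower=None and upper=None mimic truncat(*): cut at the observed data limits.
    Requested limits are never allowed to cut inside the data range (assumed).
    """
    data_min, data_max = min(data), max(data)
    lo = data_min if lower is None else min(lower, data_min)
    hi = data_max if upper is None else max(upper, data_max)
    # Only grid points survive, so the trace ends at the most extreme
    # evaluation points inside [lo, hi], not exactly at lo and hi.
    return [(x, d) for x, d in zip(grid, dens) if lo <= x <= hi]
```

Because only evaluation points are kept, the trace generally stops slightly
short of the requested limits unless a grid point happens to fall on them.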

^round(^#^)^ rounds the y-axis numeric labels to the value specified. As a result,
    the labels and their corresponding tick marks may not be placed at the true
    minimum, median, or maximum values; rather, they will be at the rounded
    values. ^round()^ has no effect if ^ylabel^ is specified without arguments,
    but is operative if ^ylabel^ is not specified or is specified with arguments.
    The ^round()^ option follows the rules of Stata's ^round(^x^,^y^)^ function, with
    # being the y argument and each label value being the x argument;
    see ^[U] 20.3.5 Special functions^.
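
The effect of rounding in units of # can be sketched as below. This is Python,
for illustration only: round_units is a hypothetical helper that rounds x to
the nearest multiple of y, and the treatment of exact ties (rounded upward
here) is an assumption rather than a statement about Stata's ^round()^.

```python
import math


def round_units(x, y):
    """Round x to the nearest multiple of y (ties go upward in this sketch)."""
    return y * math.floor(x / y + 0.5)


# Hypothetical label values for the minimum, median and maximum of a variable.
labels = [1760, 3190, 4840]
rounded = [round_units(v, 100) for v in labels]
```

With these values the three labels move to multiples of 100, so none of the
tick marks sits exactly at the observed minimum, median or maximum.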

graph_options are any of the options allowed by ^graph, twoway^ except ^b2title()^
    (which is ignored); see ^help^ @graph@. Some options are preset and, although
    changeable, usually should not be modified. These include ^symbol(i)^ and
    ^connect(l)^ for specifying the plotting symbol and point connection method
    for the density curve. In addition, ^ylabel()^ is preset to label only the
    minimum, median and maximum points. ^t1title(Violin Plot)^ is preset but can
    be changed--except when ^by()^ is specified; in this instance ^t1title^ is used
    for the variable name or label. When changeable, use of ^t1title(.)^ will
    result in a blank title. Other preset options, such as ^pen(2)^ for the
    plot pen color, are intended to be freely changed to suit user preference.
    A few options, such as the left and right titles, are set (or default to)
    blank. If specified, they appear beside each plot in a multi-variable
    graph. Lastly, the ^saving()^ option differs slightly from ^graph^'s in
    that the filename extension is always ^.gph^ and must not be specified.


Saved values
------------

    S_1   name of kernel used for density trace
    S_2   number of points of density estimation
    S_3   band width for density estimation
    S_4   scale factor of density plot
    S_5   minimum
    S_6   lower adjacent value
    S_7   first quartile
    S_8   median
    S_9   third quartile
    S_10  upper adjacent value
    S_11  maximum
    S_12  n

When ^by()^ is specified: S_3 and S_4 contain the averages of the band width and
scale factors used in the subgroup density estimations; S_5, S_7, S_8, S_9,
S_11 and S_12 are statistics for the combined group; and S_6 and S_10 are set
missing.

When multiple variables are specified, the saved values contain results for
the last variable in the varlist.


Examples
--------

    . ^violin length, t1(Auto data) l1(length of car)^

    . ^violin length weight, n(100) w(20)^

    . ^violin weight, by(foreign) parzen^


Author
------

    Thomas J. Steichen
    RJRT
    steicht@@rjrt.com


Reference
---------

Hintze, J. L. and R. D. Nelson (1998). "Violin plots: a box plot-density trace
    synergism." The American Statistician, 52(2):181-4.


Also see
--------

    STB:  gr33 (STB-46)
 Manual:  ^[R] kdensity^, ^[R] graph box^, ^[R] centile^
          ^[U] 20.3.5 Special functions^
On-line:  help for @kdensity@, @graph@, @centile@, @functions@