Elementary Statistical Concepts

Click on the [?] to find the answers

- 5
- 6
- 10
- 16
- 18
- 19
- 21
- 25
- 36
- 180
- Sample size.
------
**[?]** - Mean.
------
**[?]** - Median.
------
**[?]** - (Sample) variance.
------
**[?]** - (Sample) standard deviation.
------
**[?]** - Lower quartile.
------
**[?]** - 75th percentile.
------
**[?]** - 90th percentile.
------
**[?]** - Range.
------
**[?]** - Inter-quartile range.
------
**[?]**
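The sample these questions refer to is not reproduced here, so the following is only a minimal sketch of how each quantity could be computed with Python's standard `statistics` module. The data in `hypothetical_sample` are invented for illustration, and the helper name `summarize` and the quartile convention (`method="inclusive"`) are assumptions. Textbooks differ on how quartiles and percentiles are interpolated, so hand-computed answers may differ slightly.

```python
import statistics

def summarize(sample):
    """Return the summary statistics asked for above, as a dict."""
    quartiles = statistics.quantiles(sample, n=4, method="inclusive")   # Q1, median, Q3
    deciles = statistics.quantiles(sample, n=10, method="inclusive")    # 10th ... 90th percentiles
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "variance": statistics.variance(sample),   # sample variance, divides by n - 1
        "stdev": statistics.stdev(sample),         # square root of the sample variance
        "lower_quartile": quartiles[0],
        "p75": quartiles[2],
        "p90": deciles[8],
        "range": max(sample) - min(sample),
        "iqr": quartiles[2] - quartiles[0],
    }

# Hypothetical data, NOT the sample from the quiz.
hypothetical_sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for name, value in summarize(hypothetical_sample).items():
    print(f"{name}: {value}")
```

Note that the 75th percentile and the upper quartile are the same quantity, and that the inter-quartile range is their distance from the lower quartile.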

Questions 11-27 refer to the following five graphical displays.

- Which display shows the most negative skewness?
- I
- II
- III
------
**[?]** - IV
- V

- The data display III is called
- a histogram
- a boxplot
- a stemplot
------
**[?]** - a cumulative distribution
- a scatterplot

- The median of this data is
- 60
- 61
- 62
------
**[?]** - 62.4
- 65

- The interquartile range is
- 10
- 16
- 25
------
**[?]** - 30
- 57

- The data display I is called
- a histogram
- a boxplot
- a stemplot
------
**[?]** - a cumulative distribution
- a scatterplot

- The mean value of this data is
- around 25
- less than 60
- exactly 65
------
**[?]** - between 60 and 70
- between 70 and 90

- The median value of this data is
- around 20
- between 20 and 60
- between 60 and 70
------
**[?]** - between 70 and 80
- between 80 and 90

- The data display II is called
- a histogram
- a boxplot
- a stemplot
------
**[?]** - a cumulative distribution
- a scatterplot

- The median value is
- 40
- 50
- 53
------
**[?]** - 60
- 70

- The lower quartile is
- 25
- 40
- 53.5
------
**[?]** - 60
- 66

- The data display IV is called
- a histogram
- a boxplot
- a stemplot
------
**[?]** - a cumulative distribution
- a scatterplot

- The median of these data is
- 40
- 50
- 60
------
**[?]** - 65
- 70

- The lower quartile of these data is
- 25
- 40
- 53
------
**[?]** - 60
- 75

- The data display V is called
- a histogram
- a boxplot
- a stemplot
------
**[?]** - a cumulative distribution
- a scatterplot

- The correlation coefficient would be around
- -0.9
- -0.5
- 0
------
**[?]** - 0.3
- 0.9

- The slope of the regression line would be
around
- -1
- -0.5
- -0.25
------
**[?]** - 0.5
- 1

- Which pair of displays might be showing the same
data?
- I and II
- I and III
- I and IV
- I and V
- II and III
------
**[?]** - II and IV
- II and V
- III and IV
- III and V
- IV and V

Questions 28-37 present some statistical terms. Choose from the following list the correct definition of these terms.

- A random subset (of given size) of a population chosen such that every subset of that size has the same probability of being chosen.
- A sample obtained by taking a simple random sample from selected subpopulations.
- The result of subtracting the mean value and dividing by the square root of the variance.
- A range of values of a population parameter, calculated from the sample, such that the true value of the parameter is contained in that range with a specified probability.
- The probability of not making a Type II error.
- A statistic whose expected value is equal to a population parameter.
- A random variable used to determine whether to reject the null hypothesis.
- Rejecting the null hypothesis when it is true.
- The standard deviation of the sampling distribution of an estimator.
- A systematic error in an estimator.

- Simple random sample.
------
**[?]** - Stratified sample.
------
**[?]** - Standardized value.
------
**[?]** - Test statistic.
------
**[?]** - Confidence interval.
------
**[?]** - Power.
------
**[?]** - Type I error.
------
**[?]** - Standard error.
------
**[?]** - Bias.
------
**[?]** - Unbiased estimator.
------
**[?]**

A city has three hospitals in which babies are delivered. In addition, there are a certain number of home births. An epidemiologist has collected the following data about the number of live births and stillbirths in 1996 at the different locations.

- A data display of this kind is called
- a box plot,
- a contingency table,
- a histogram,
------
**[?]** - a scatter diagram,
- a table.

- What percentage of babies were stillborn?
- 5.22
- 5.55
- 5.87
------
**[?]** - 41.75
- 53.93

- What percentage of stillbirths occurred at
General Hospital?
- 2.82
- 6.90
- 7.41
------
**[?]** - 50.90
- 85.00

- What was the rate of stillbirths at General Hospital?
- 2.82%
- 6.90%
- 7.41%
------
**[?]** - 50.90%
- 85.00%

- Suppose that the epidemiologist wants to demonstrate
that the probability of stillbirths does depend on the location of
birth. A formal statistical procedure for this demonstration might
begin with a null hypothesis that
- home births are more risky,
- home births are less risky,
- the number of births is the same at all hospitals,
------
**[?]** - the rate of stillbirths is the same at all locations,
- the rate of stillbirths depends on the location.

- Under this null hypothesis, what would be the expected *number* of stillbirths among home births?
- 1.49
- 1.80
- 3
------
**[?]** - 11.15
- 167

- To test the null hypothesis, the
epidemiologist could use
- a regression analysis
- a test for association,
- a one-sample *t* test,
------
**[?]** - a two-sample *t* test,
- a sign test.

- The test statistic would be compared with the
tabulated values for which distribution?
- normal,
- Student's *t*,
- chi-squared,
------
**[?]** - binomial,
- binary.

- The degrees of freedom would be
- 1
- 3
- 7
------
**[?]** - 8
- 3009.
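The table of births is not reproduced here, so the sketch below uses invented counts. It only illustrates, under that assumption, how the expected counts, the chi-squared statistic and the degrees of freedom for a test of association could be computed; the function name `chi_squared_association` and the hospital labels are hypothetical.

```python
def chi_squared_association(table):
    """Pearson chi-squared statistic for an r x c table of counts.

    Returns (statistic, degrees_of_freedom, expected_counts).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count when the rate is the same everywhere:
    # row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected

# Hypothetical counts (live births, stillbirths) for four locations.
# These are NOT the epidemiologist's data.
hypothetical_table = [
    [950, 50],   # hospital A
    [780, 20],   # hospital B
    [560, 40],   # hospital C
    [190, 10],   # home births
]
stat, dof, expected = chi_squared_association(hypothetical_table)
print(f"chi-squared = {stat:.3f} on {dof} degrees of freedom")
```

With four locations and two outcomes the degrees of freedom are (4 - 1)(2 - 1) = 3 whatever the counts are, and the statistic is compared with the chi-squared distribution on that many degrees of freedom.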

- The test statistic calculated in question 44 is equal to 68.103, which exceeds the 95th percentile of the appropriate distribution. A correct interpretation of this event would be
- [?] there is no relationship between rates of stillbirth and location of birth,
- [?] there is insufficient evidence to demonstrate a relationship between stillbirth and location of birth,
- [?] home births are safer than hospital births,
- [?] going to College Hospital for delivery increases the chances of stillbirth,
- [?] rates of stillbirths are not the same for all the hospitals.

- All *but one* of the following are reasonable comments about this study. Select the one that is *not* justified.
- [?] The study would be more informative if information could be obtained about the reasons why the various hospitals were chosen for delivery.
- [?] Since the data were collected from a properly randomized experiment, one can conclude that procedures at College Hospital are causing an increased number of stillbirths.
- [?] It is risky to make cause-and-effect inferences from observational studies since lurking variables could be responsible for apparent association.
- [?] Because the data represent an aggregation of deliveries at various levels of risk, the observed effects could be an instance of Simpson's paradox.
- [?] Since the births analyzed are not a random sample from a larger population, inferences cannot be made beyond the particular city on statistical grounds alone.

A difficult mathematics class contains six students, of whom two are math majors, two are education students, and two are engineering students. Both education students are female, as is one of the engineers. The other three students are male. The students are having problems with the professor and decide to pick a committee of two to speak to him about their concerns. The committee is chosen by putting the names of the six students into a hat and, after thorough mixing, randomly drawing two names.

- How many possible committees are there?
- 3
- 6
- 15
------
**[?]** - 30
- 36

- The committee could be considered to be
- a simple random sample,
- a sample with replacement,
- a random variable,
------
**[?]** - a sequence of Bernoulli trials,
- an exclusive event.

- Let *X* be the number of men on the committee. *X* is
- an event,
- an outcome,
- a random variable,
------
**[?]** - a standardized value,
- a parameter.

- The expected value of *X* is
- 0
- ½
- 1
------
**[?]** - 3
- a meaningless concept.

- The variance of *X* is
- 1/3
- 2/5
- 1/2
------
**[?]** - 2/3
- 1
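The counts behind these questions can be checked by brute force. The sketch below lists every possible committee with `itertools.combinations` and computes the distribution of *X* exactly with `fractions.Fraction`. The student labels are invented for illustration; the majors and sexes follow the description of the class.

```python
from fractions import Fraction
from itertools import combinations

# (label, major, sex) for the six students; the labels are made up.
students = [
    ("math_1", "math", "M"),
    ("math_2", "math", "M"),
    ("educ_1", "education", "F"),
    ("educ_2", "education", "F"),
    ("engr_1", "engineering", "F"),
    ("engr_2", "engineering", "M"),
]

committees = list(combinations(students, 2))   # unordered pairs, drawn without replacement
print(len(committees))                         # 15

# X = number of men on each of the equally likely committees.
x_values = [sum(1 for s in pair if s[2] == "M") for pair in committees]
n = len(committees)
mean_x = Fraction(sum(x_values), n)
var_x = Fraction(sum(x * x for x in x_values), n) - mean_x ** 2
print(mean_x, var_x)                           # 1 2/5
```

Because the two names are drawn without replacement, the variance is smaller than the value 1/2 that two independent draws with probability 1/2 would give.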

Choose from the following list the probability of each event described in questions 54-60.

- 2/15
- 1/5
- 4/15
- 3/10
- 2/5
- 1/2
- 3/5
- 2/3
- 4/5
- cannot be determined from the information given.
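Since every committee is equally likely, each of these probabilities is a count of favourable committees divided by the total number of committees. The following is a minimal sketch of that counting; the helper names `prob`, `men` and `majors` are hypothetical. The event "exactly one woman and one engineering student" is left out because its answer depends on whether "one engineering student" is read as "exactly one".

```python
from fractions import Fraction
from itertools import combinations

# (major, sex) for each of the six students, as described in the text.
students = [
    ("math", "M"), ("math", "M"),
    ("education", "F"), ("education", "F"),
    ("engineering", "F"), ("engineering", "M"),
]
committees = list(combinations(students, 2))   # combinations are positional, so duplicates are fine

def prob(event):
    """Probability that a randomly drawn committee satisfies `event`."""
    favourable = sum(1 for pair in committees if event(pair))
    return Fraction(favourable, len(committees))

def men(pair):
    """Number of men on the committee."""
    return sum(1 for major, sex in pair if sex == "M")

def majors(pair):
    """Sorted list of the two members' majors."""
    return sorted(major for major, sex in pair)

print(prob(lambda p: men(p) == 2))                                # all men: 1/5
print(prob(lambda p: men(p) != 1))                                # same sex: 2/5
print(prob(lambda p: men(p) == 1))                                # one of each sex: 3/5
print(prob(lambda p: men(p) < 2))                                 # at least one woman: 4/5
print(prob(lambda p: "math" not in majors(p)))                    # no math major: 2/5
print(prob(lambda p: majors(p) == ["education", "engineering"]))  # one education, one engineering: 4/15
```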

- The committee consists entirely of men.
------
**[?]** - Both members of the committee are of the same sex.
------
**[?]** - The committee contains one member of each sex.
------
**[?]** - The committee includes at least one woman.
------
**[?]** - The committee does not include a math major.
------
**[?]** - The committee is made up of one education student and one engineering student.
------
**[?]** - The committee consists of exactly one woman and one engineering student.
------
**[?]**

You are particularly fond of a type of candy that is sold in bulk at the local grocery store. The candy comes in four different colours: green, yellow, orange, and red, each having a different flavour. You like them all except the red ones, which you only eat if there are no others left. You have bought a scoopful of the candy from the bin, but when you get home, you get the uneasy feeling that the bag has too many red ones. You fear that the store manager (who was visibly annoyed when you complained about the stale biscotti) is maliciously dumping red candies into the bin. Before eating any, you sort the candy by colour, and carefully count the number of each kind. You find that you bought a total of 490 candies, of which 125 were green, 113 were yellow, 112 were orange, and 140 were red. Assume that the candies you bought are a simple random sample of candies in the bin.

- Suppose that all four colours are equally common in the bin. What would be the expected number of red candies in your purchase?
- 0
- 1/4
- 490/4
------
**[?]** - 140
- 245

- Under the same assumption, what would be the
(approximate) variance of the number of red candies?
- 1
- 13.0767
- 1470/16
------
**[?]** - 490/4
- 171
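If the bin is large and all four colours are equally common, the number of red candies in a simple random sample of 490 is approximately binomial with n = 490 and p = 1/4. Under that assumption, a short sketch of the mean and variance calculation is:

```python
from fractions import Fraction

n = 490                # candies purchased
p = Fraction(1, 4)     # assumed proportion of red candies in the bin

expected_red = n * p             # binomial mean n*p
variance_red = n * p * (1 - p)   # binomial variance n*p*(1-p)

print(expected_red, float(expected_red))   # 245/2 122.5
print(variance_red, float(variance_red))   # 735/8 91.875
```

The fractions 245/2 and 735/8 are 490/4 and 1470/16 in lowest terms.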

- These computations are based on
- which is reasonable provided that
- [?] there aren't too many degrees of freedom,
- [?] the candies were well mixed, and the ones you bought were only a small fraction of the number in the bin,
- [?] you purchased less than two standard deviations of candy,
- [?] the probability of getting any colour was exactly ½,
- [?] there were only a few candies left in the bin after your purchase.

- In fact, 2/7 of the candies you purchased were red. The figure 2/7 = 0.2857 is an example of

- The (estimated) standard error of the estimator of the true proportion of red candies in the bin is
- 1/49
- 3/16
- 1/4
------
**[?]** - 10

- A 95% confidence interval on the percentage of
red candies in the bin is approximately
- (18.6,38.6)
- (21.2,28.8)
- (23.1,26.9)
------
**[?]** - (24.6,32.6)
- (26.7,30.5)
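The standard error and the interval can be reproduced from the counts given in the text (140 red candies out of 490). This is a minimal sketch using the usual normal approximation for a sample proportion; the multiplier 1.96 is the conventional 97.5th percentile of the standard normal distribution.

```python
import math

n = 490
red = 140
p_hat = red / n                                # 2/7

# Estimated standard error of a sample proportion.
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)

# Normal-approximation 95% interval, expressed in percent.
z = 1.96
lower = 100 * (p_hat - z * standard_error)
upper = 100 * (p_hat + z * standard_error)

print(f"SE = {standard_error:.5f}")            # 1/49, about 0.02041
print(f"95% CI: ({lower:.1f}, {upper:.1f})")   # (24.6, 32.6)
```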

- This approximation is based on
- the central limit theorem,
- Student's *t*-distribution,
- 3 degrees of freedom,
------
**[?]** - the regression phenomenon,
- a sign test.

- A correct interpretation of this confidence interval is:
- 95% of the candies are between 24.6% and 32.6% red,
- 95% of the samples will have between 23.1% and 26.9% red candies,
- The estimator of the percentage of red candies will be unbiased 19 times out of 20,
------
**[?]** - The proportion of red candies in the bin is estimated to be in the indicated range, and such estimated ranges will contain the true value 95% of the time,
- The hypothesis that the sampling distribution of the candies includes the 25th percentile is rejected 95% of the time, where α = 0.05.

- Instead of concentrating on the red candies, you could use a goodness-of-fit test to determine whether there was sufficient evidence to conclude that the four colours were not equally common. Such a procedure would consider the distribution of a test statistic assuming that the four colours were equally likely to occur. This assumption is called
- Which of the following would be the most appropriate test statistic in this situation?
- The value of the test statistic is 4.188. This corresponds to a *P*-value of
- around -0.21,
- less than 0.0005,
- between 0.01 and 0.02,
------
**[?]** - slightly less than 0.25,
- around 0.5.

- The degrees of freedom for this test statistic are
- 3
- 4
- 10
------
**[?]** - 489
- irrelevant.
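The goodness-of-fit calculation can be reproduced from the colour counts given in the text. The sketch below computes Pearson's chi-squared statistic against equal expected counts. The *P*-value line uses a closed-form upper-tail formula that holds only for 3 degrees of freedom; it is included so the example needs nothing beyond the standard library and is not a general-purpose routine.

```python
import math

observed = {"green": 125, "yellow": 113, "orange": 112, "red": 140}
n = sum(observed.values())                  # 490
expected = n / len(observed)                # 122.5 per colour if all are equally common

chi_squared = sum((count - expected) ** 2 / expected for count in observed.values())
dof = len(observed) - 1                     # 4 categories - 1 = 3

# Upper-tail probability of a chi-squared variable with exactly 3 degrees of freedom:
# P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
half = chi_squared / 2
p_value = math.erfc(math.sqrt(half)) + math.sqrt(2 * chi_squared / math.pi) * math.exp(-half)

print(f"chi-squared = {chi_squared:.3f}, dof = {dof}, P = {p_value:.3f}")
```

The statistic agrees with the value 4.188 quoted in the question, and the resulting *P*-value falls a little below 0.25.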

- On the basis of this data, one could reasonably state
that
- [?] there is overwhelming evidence that there are too many red candies,
- [?] there is strong evidence that the different colours are not evenly distributed,
- [?] although the number of red candies is somewhat more than expected by chance, there is not much evidence against the hypothesis that the four colours are equally likely,
- [?] it has been conclusively shown that exactly 25% of the candies in the bin were red,
- [?] although there are statistically significant differences between the proportions of colours among the candies, it has not been established whether there really are more red ones than green ones.

- Suppose that the manager *has* indeed spiked the candy bin so that 31% of the candies are red. Then the answer to question would illustrate
- the power of statistical calculation,
- rejection of the null hypothesis,
- a significant degree of freedom,
------
**[?]** - a Type I error,
- a Type II error.

Consider the following sample:

Choose from the following list
the value of the quantities stated in
questions