Consider the following sample:
Choose from the following list
the value of the quantities stated in
questions 1-10.
- 5
- 6
- 10
- 16
- 18
- 19
- 21
- 25
- 36
- 180
- Sample size.
------[?]
- Mean.
------[?]
- Median.
------[?]
- (Sample) variance.
------[?]
- (Sample) standard deviation.
------[?]
- Lower quartile.
------[?]
- 75th percentile.
------[?]
- 90th percentile.
------[?]
- Range.
------[?]
- Interquartile range.
------[?]
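The ten quantities in questions 1-10 can be computed mechanically. The following is a minimal sketch using Python's standard `statistics` module. The sample in it is hypothetical (the sample for these questions is not shown here), and the "inclusive" quartile convention is an assumption; other textbook conventions for quartiles and percentiles can give slightly different values.

```python
import statistics

# Hypothetical sample, NOT the sample these questions refer to.
sample = [2, 4, 4, 5, 7, 9, 11]


def describe(data):
    """Return the ten summary quantities asked for in questions 1-10."""
    ordered = sorted(data)
    # Assumed convention: 'inclusive' interpolation between order statistics.
    q1, _median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    deciles = statistics.quantiles(ordered, n=10, method="inclusive")
    return {
        "size": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "variance": statistics.variance(ordered),  # sample variance, divides by n - 1
        "stdev": statistics.stdev(ordered),        # square root of the sample variance
        "lower_quartile": q1,
        "p75": q3,
        "p90": deciles[8],                         # ninth decile = 90th percentile
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
    }


summary = describe(sample)
for name, value in summary.items():
    print(f"{name:>15}: {value}")
```

Note that the 75th percentile is the upper quartile, so the interquartile range is the 75th percentile minus the lower quartile.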
Questions 11-27 refer to the following five
graphical displays.
[Displays I, II, III, IV and V: figures not reproduced.]
- Which display shows the most negative
skewness?
- I
- II
- III
------[?]
- IV
- V
- The data display III is called
- a histogram
- a boxplot
- a stemplot
------[?]
- a cumulative distribution
- a scatterplot
- The median of this data is
- 60
- 61
- 62
------[?]
- 62.4
- 65
- The interquartile range is
- 10
- 16
- 25
------[?]
- 30
- 57
- The data display I is called
- a histogram
- a boxplot
- a stemplot
------[?]
- a cumulative distribution
- a scatterplot
- The mean value of this data is
- around 25
- less than 60
- exactly 65
------[?]
- between 60 and 70
- between 70 and 90
- The median value of this data is
- around 20
- between 20 and 60
- between 60 and 70
------[?]
- between 70 and 80
- between 80 and 90
- The data display II is called
- a histogram
- a boxplot
- a stemplot
------[?]
- a cumulative distribution
- a scatterplot
- The median value is
- 40
- 50
- 53
------[?]
- 60
- 70
- The lower quartile is
- 25
- 40
- 53.5
------[?]
- 60
- 66
- The data display IV is called
- a histogram
- a boxplot
- a stemplot
------[?]
- a cumulative distribution
- a scatterplot
- The median of these data is
- 40
- 50
- 60
------[?]
- 65
- 70
- The lower quartile of these data is
- 25
- 40
- 53
------[?]
- 60
- 75
- The data display V is called
- a histogram
- a boxplot
- a stemplot
------[?]
- a cumulative distribution
- a scatterplot
- The correlation coefficient would be around
- -0.9
- -0.5
- 0
------[?]
- 0.3
- 0.9
- The slope of the regression line would be
around
- -1
- -0.5
- -0.25
------[?]
- 0.5
- 1
- Which pair of displays might be showing the same
data?
- I and II
- I and III
- I and IV
- I and V
- II and III
------[?]
- II and IV
- II and V
- III and IV
- III and V
- IV and V
Questions 28-37 present some statistical terms.
Choose from the following list the correct definition of these terms.
- A random subset (of given size) of a population chosen such that
every subset of that size has the same probability of being chosen.
- A sample obtained by taking a simple random sample from selected subpopulations.
- The result of subtracting the mean value and dividing by the square root
of the variance.
- A range of values of a population parameter, calculated from
the sample, such that the true value of the parameter is contained in that
range with a specified probability.
- The probability of not making a Type II error.
- A statistic whose expected value is equal to a population parameter.
- A random variable used to determine whether to reject the null
hypothesis.
- Rejecting the null hypothesis when it is true.
- The standard deviation of the sampling distribution of an
estimator.
- A systematic error in an estimator.
- Simple random sample.
------[?]
- Stratified sample.
------[?]
- Standardized value.
------[?]
- Test statistic.
------[?]
- Confidence interval.
------[?]
- Power.
------[?]
- Type I error.
------[?]
- Standard error.
------[?]
- Bias.
------[?]
- Unbiased estimator.
------[?]
A city has three hospitals in which babies are delivered. In addition,
there are a certain number of home births. An epidemiologist has collected the
following data about the number of live births and stillbirths in 1996
at the different locations.
- A data display of this kind is called
- a box plot,
- a contingency table,
- a histogram,
------[?]
- a scatter diagram,
- a
table.
- What percentage of babies were stillborn?
- 5.22
- 5.55
- 5.87
------[?]
- 41.75
- 53.93
- What percentage of stillbirths occurred at
General Hospital?
- 2.82
- 6.90
- 7.41
------[?]
- 50.90
- 85.00
- What was the rate of stillbirths at General Hospital?
- 2.82%
- 6.90%
- 7.41%
------[?]
- 50.90%
- 85.00%
- Suppose that the epidemiologist wants to demonstrate
that the probability of stillbirths does depend on the location of
birth. A formal statistical procedure for this demonstration might
begin with a null hypothesis that
- home births are more risky,
- home births are less risky,
- the number of births is the same at all hospitals,
------[?]
- the rate of stillbirths is the same at all locations,
- the rate of stillbirths depends on the location.
- Under this null hypothesis, what would be the
expected number of stillbirths among home births?
- 1.49
- 1.80
- 3
------[?]
- 11.15
- 167
- To test the null hypothesis, the
epidemiologist could use
- a regression analysis
- a chi-squared test for association,
- a one-sample t test,
------[?]
- a two-sample t test,
- a sign test.
- The test statistic would be compared with the
tabulated values for which distribution?
- normal,
- Student's t,
- chi-squared,
------[?]
- binomial,
- binary.
- The degrees of freedom would be
- 1
- 3
- 7
------[?]
- 8
- 3009.
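The mechanics behind the last few questions (expected counts under the null hypothesis, the test statistic, and its degrees of freedom) can be sketched as follows. The counts and the location labels "Hospital A", "Hospital B" and "Hospital C" are hypothetical placeholders, since the epidemiologist's table is not shown here; only the layout (four locations by two outcomes) follows the description above.

```python
# Hypothetical counts, NOT the epidemiologist's data.
# Rows: birth locations; columns: (live births, stillbirths).
observed = {
    "Hospital A": (900, 40),
    "Hospital B": (800, 60),
    "Hospital C": (700, 50),
    "Home": (190, 10),
}


def chi_squared_association(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    # Under the null hypothesis of no association:
    # expected count = row total * column total / grand total.
    expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(rows, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


statistic, dof, expected = chi_squared_association(observed)
print(f"chi-squared = {statistic:.3f} on {dof} degrees of freedom")
print("expected stillbirths among home births:", round(expected[-1][1], 2))
```

The statistic is then compared with the upper percentiles of the chi-squared distribution with `dof` degrees of freedom.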
- The test statistic calculated in
question 44 is equal to
68.103 which exceeds the 95th percentile of the appropriate
distribution. A correct interpretation of this event would be
- [?] there is no relationship between rates of stillbirth and
location of birth,
- [?] there is insufficient evidence to demonstrate a relationship
between stillbirth and location of birth,
- [?] home births are safer than hospital births,
- [?] going to College Hospital for delivery increases the chances of
stillbirth,
- [?] rates of stillbirths are not the same for all the hospitals.
- All but one of the following are a reasonable
comment about this study. Select the one that is not justified.
- [?] The study would be more informative if information could be
obtained about the reasons why the various hospitals were chosen for
delivery.
- [?] Since the data were collected from a properly randomized
experiment, one can conclude that procedures at College Hospital are
causing an increased number of stillbirths.
- [?] It is risky to make cause-and-effect inferences from
observational studies since lurking variables could be responsible for
apparent association.
- [?] Because the data represent an aggregation of deliveries at various
levels of risk, the observed effects could be an instance of Simpson's
paradox.
- [?] Since the births analyzed are not a random sample from a
larger population, inferences cannot be made beyond the particular city
on statistical grounds alone.
A difficult mathematics class contains six students, of whom two are
math majors, two are education students, and two are engineering
students. Both education students are female, as is one of the engineers.
The other three students are male.
The students are having
problems with the professor and decide to pick a committee of two to
speak to him about their concerns. The committee is chosen by putting
the names of the six students into a hat, and after thorough mixing,
randomly drawing two names.
- How many possible committees are there?
- 3
- 6
- 15
------[?]
- 30
- 36
- The committee could be considered to be
- a simple random sample,
- a sample with replacement,
- a random variable,
------[?]
- a sequence of Bernoulli trials,
- an exclusive event.
- Let X be the number of men on the committee. X is an example of
- an event,
- an outcome,
- a random variable,
------[?]
- a standardized value,
- a parameter.
- The expected value of X is
- 0
- ½
- 1
------[?]
- 3
- a meaningless concept.
- The variance of X is
- 1/3
- 2/5
- 1/2
------[?]
- 2/3
- 1
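Because the class is so small, the distribution of X can be checked by brute force: list every possible committee and count the men on each. The sketch below does this with exact fractions. The roster is reconstructed from the description above (the three men must be the two math majors and one engineer); the tuple layout is just an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations

# (major, sex) for each of the six students, as implied by the description.
students = [
    ("math", "M"), ("math", "M"),
    ("education", "F"), ("education", "F"),
    ("engineering", "F"), ("engineering", "M"),
]

# Committees are unordered pairs of distinct students, identified by position.
committees = list(combinations(range(len(students)), 2))


def men_on(committee):
    """Number of men on a committee, i.e. the value of X."""
    return sum(students[i][1] == "M" for i in committee)


n = len(committees)
expected_x = Fraction(sum(men_on(c) for c in committees), n)
# Var(X) = E[X^2] - (E[X])^2, every committee being equally likely.
variance_x = Fraction(sum(men_on(c) ** 2 for c in committees), n) - expected_x ** 2

print("number of committees:", n)
print("E[X] =", expected_x)
print("Var(X) =", variance_x)
```

The variance is smaller than the binomial value 2 × ½ × ½ because the names are drawn without replacement.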
Select from the following list the correct probability
for each of the events in questions 54-60.
- 2/15
- 1/5
- 4/15
- 3/10
- 2/5
- 1/2
- 3/5
- 2/3
- 4/5
- cannot be determined from the information given.
- The committee consists entirely of men.
------[?]
- Both members of the committee are of the same sex.
------[?]
- The committee contains one member of each sex.
------[?]
- The committee includes at least one woman.
------[?]
- The committee does not include a math major.
------[?]
- The committee is made up of one education student and one engineering student.
------[?]
- The committee consists of exactly one woman and
one engineering student.
------[?]
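Probabilities of this kind can be checked by counting favourable committees among all equally likely ones. The following is a minimal sketch of that counting argument; the student labels S1 to S6 are hypothetical, and the last event in the list above is left out because its wording admits more than one reading.

```python
from fractions import Fraction
from itertools import combinations

# (label, major, sex); labels are hypothetical, majors and sexes follow the description.
students = [
    ("S1", "math", "M"), ("S2", "math", "M"),
    ("S3", "education", "F"), ("S4", "education", "F"),
    ("S5", "engineering", "F"), ("S6", "engineering", "M"),
]
committees = list(combinations(students, 2))


def probability(event):
    """Fraction of the equally likely committees for which event(committee) is true."""
    favourable = sum(1 for c in committees if event(c))
    return Fraction(favourable, len(committees))


def sexes(committee):
    return sorted(s[2] for s in committee)


def majors(committee):
    return sorted(s[1] for s in committee)


probs = {
    "all men": probability(lambda c: sexes(c) == ["M", "M"]),
    "same sex": probability(lambda c: sexes(c)[0] == sexes(c)[1]),
    "one of each sex": probability(lambda c: sexes(c) == ["F", "M"]),
    "at least one woman": probability(lambda c: "F" in sexes(c)),
    "no math major": probability(lambda c: "math" not in majors(c)),
    "one education, one engineering": probability(
        lambda c: majors(c) == ["education", "engineering"]
    ),
}

for description, value in probs.items():
    print(f"{description}: {value}")
```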
You are particularly fond of a type of candy that is sold in bulk at the
local grocery store. The candy comes in four different colours: green,
yellow, orange, and red, each having a different flavour. You like them
all except the red ones, which you only eat if there are no others left.
You have bought a scoopful of the candy from the bin, but when you get
home, you get the uneasy feeling that the bag has too many red ones.
You fear that the store manager (who was visibly annoyed when you
complained about the stale biscotti) is maliciously dumping red candies
into the bin. Before eating any, you sort the candy by colour, and
carefully count the number of each kind. You find that you bought a
total of 490 candies, of which 125 were green, 113 were yellow, 112 were
orange, and 140 were red. Assume that the candies
you bought are a simple random sample of candies in the bin.
- Suppose that all four colours are equally
common in the bin. What would be the expected number of red candies in
your purchase?
- 0
- 1/4
- 490/4
------[?]
- 140
- 245
- Under the same assumption, what would be the
(approximate) variance of the number of red candies?
- 1
- 13.0767
- 1470/16
------[?]
- 490/4
- 171
- These computations are based on
- [?] the
-table,
- [?] Simpson's paradox,
- [?] the approximation that the number of red candies follows a binomial
distribution,
- [?] the assumption that the number of red candies follows a normal
distribution,
- [?] the fact that the candies constituted a cluster sample,
- which is reasonable provided that
- [?] there aren't too many degrees of freedom,
- [?] the candies were well mixed, and the ones you bought were only a
small fraction of the number in the bin,
- [?] you purchased less than two standard deviations of candy,
- [?] the probability of getting any colour was exactly ½,
- [?] there were only a few candies left in the bin after your purchase.
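If the number of red candies is modelled as a binomial count (an assumption, as the preceding questions discuss), the expected value and variance follow from the standard formulas np and np(1-p). A small sketch with exact fractions, taking n = 490 and p = 1/4 from the text:

```python
from fractions import Fraction
import math

n = 490             # candies purchased
p = Fraction(1, 4)  # probability of red if all four colours are equally common

# Binomial model (an approximation: sampling is really without replacement).
mean_red = n * p
var_red = n * p * (1 - p)
sd_red = math.sqrt(var_red)

print("expected number of red candies:", mean_red, "=", float(mean_red))
print("variance:", var_red, "=", float(var_red))
print("standard deviation:", round(sd_red, 3))
```

The observed count of 140 can then be compared with the expected count in units of the standard deviation.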
- In fact, 2/7 of the candies you purchased were red.
The figure 2/7=0.2857 is an example of
- [?] a parameter,
- [?] an estimate,
- [?] an event,
- [?] a biased statistic,
- [?] a Type II error.
- The (estimated) standard error of the estimator of the true proportion
of red candies in the bin is
- 1/49
- 3/16
- 1/4
------[?]
- 10
- A 95% confidence interval on the percentage of
red candies in the bin is approximately
- (18.6,38.6)
- (21.2,28.8)
- (23.1,26.9)
------[?]
- (24.6,32.6)
- (26.7,30.5)
- This approximation is based on
- the central limit theorem,
- Student's t-distribution,
- 3 degrees of freedom,
------[?]
- the regression phenomenon,
- a sign test.
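The standard error and the interval in the two preceding questions can be reproduced with a few lines of arithmetic. This sketch assumes the usual large-sample (normal approximation) interval for a proportion, with the multiplier 1.96 for 95% coverage; the counts 140 and 490 come from the text.

```python
import math

n = 490    # candies purchased
red = 140  # red candies observed

p_hat = red / n                          # sample proportion, 2/7
se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error of p_hat
z = 1.96                                 # approximate 97.5th percentile of the standard normal

lower = p_hat - z * se
upper = p_hat + z * se

print(f"p_hat = {p_hat:.4f}, standard error = {se:.5f}")
print(f"approximate 95% interval: ({100 * lower:.1f}%, {100 * upper:.1f}%)")
```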
- A correct interpretation of this confidence interval is:
- 95% of the candies are between 24.6% and 32.6% red,
- 95% of the samples will have between 23.1% and 26.9% red candies,
- The estimator of the percentage of red candies will be unbiased
19 times out of 20,
------[?]
- The proportion of red candies in the bin is estimated to be in the
indicated range, and such estimated ranges will contain the true value
95% of the time,
- The hypothesis that the sampling distribution of the candies
includes the 25th percentile is rejected 95% of the time, where
α=0.05.
- Instead of concentrating on the red candies, you could
use a goodness-of-fit test to determine whether there was sufficient
evidence to conclude that the four colours were not equally common.
Such a procedure would consider the distribution of a test statistic
assuming that the four colours were equally likely to occur. This
assumption is called
- [?] an unbiased estimate,
- [?] the null hypothesis,
- [?] the alternative hypothesis,
- [?] a Type I error,
- [?] a Type II error.
- Which of the following would be the
most appropriate test statistic in this situation?
- [?]
- [?]
- [?]
- [?]
- [?]
- The value of the test statistic is 4.188. This
corresponds to a P-value of
- around -0.21,
- less than 0.0005,
- between 0.01 and 0.02,
------[?]
- slightly less than 0.25,
- around 0.5.
- The degrees of freedom for this test statistic are
- 3
- 4
- 10
------[?]
- 489
- irrelevant.
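The goodness-of-fit calculation can be sketched as below, assuming Pearson's statistic (the sum of (observed - expected)²/expected over the four colours) and equal expected counts under the null hypothesis. To keep the example free of third-party libraries, the P-value uses the closed-form upper tail of the chi-squared distribution, which in this form is valid only for 3 degrees of freedom.

```python
import math

observed = {"green": 125, "yellow": 113, "orange": 112, "red": 140}

n = sum(observed.values())
expected = n / len(observed)  # equal counts under the null hypothesis

# Pearson's goodness-of-fit statistic.
statistic = sum((count - expected) ** 2 / expected for count in observed.values())
dof = len(observed) - 1       # one constraint: the counts must sum to n


def chi2_sf_3dof(x):
    """Upper-tail probability P(chi-squared > x) for exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_3dof(statistic)

print(f"statistic = {statistic:.3f} on {dof} degrees of freedom")
print(f"P-value = {p_value:.3f}")
```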
- On the basis of this data, one could reasonably state
that
- [?] there is overwhelming evidence that there are too many red
candies,
- [?] there is strong evidence that the different colours are not evenly
distributed,
- [?] although the number of red candies is somewhat more than expected
by chance, there is not much evidence against the hypothesis that the
four colours are equally likely,
- [?] it has been conclusively shown that exactly 25% of the candies in
the bin were red,
- [?] although there are statistically significant differences between
the proportions of colours among the candies, it has not been
established whether there really are more red ones than green ones.
- Suppose that the manager has indeed spiked the
candy bin so that 31% of the candies are red. Then the answer to
question would illustrate
- the power of statistical calculation,
- rejection of the null hypothesis,
- a significant degree of freedom,
------[?]
- a Type I error,
- a Type II error.