The Statistics Seminar Series is a forum for researchers with an interest in statistics to share their ideas or problems and forge collaborative relationships.
Our Term 2 meetings are scheduled for Thursdays, 2-3pm. If you wish to invite a speaker or give a talk yourself, please contact the current organizer: Longhai Li (longhai@math.usask.ca). Please check our tentative schedule of future talks for available time slots. The current organizer particularly invites researchers on the U of S campus who are seeking interdisciplinary work with statisticians to introduce their research problems.
| April 5, 2-3, Educ 2014 | Mohammed Obeidat, Ph.D. Candidate, Department of Mathematics and Statistics, University of Saskatchewan Analysis of Time Series Models for Count Data My research aims to analyze time series models for count data via both frequentist and Bayesian approaches. A parameter-driven Poisson model will be fitted to the count data. In a parameter-driven model, the distribution of the observed data depends on a latent (unobserved) process in such a way that the observed data are assumed to be independent given this latent process, while they are correlated marginally. The estimation of such models is not an easy problem, as the likelihood function of the observed data involves high-dimensional integrals over the distribution of the latent process. I will discuss the estimation of such models using the full likelihood function and a pseudo-likelihood function. The motivation for using a pseudo-likelihood is to reduce the computational burden by avoiding the evaluation of high-dimensional integrals. Moreover, it has been shown that the maximum pseudo-likelihood estimator is asymptotically unbiased and normally distributed. |
| Mar 22, 2-3, Murray 299 |
Matthew Schmirler, M.Sc. Candidate, Department of Mathematics and Statistics,
University of Saskatchewan Multiple Markov Chain Simulations of an Interacting Self-Avoiding Polygon Model My research is motivated by the type II topoisomerase enzyme. This enzyme helps to unknot DNA molecules during cellular replication via a 'strand passage' process, in which one segment of DNA is passed through another. This process is essential to replication, as a DNA molecule that is knotted cannot successfully replicate. It is not fully understood how this enzyme chooses a location to act on the DNA; a better understanding of this mechanism could lead to more effective disease treatments (by inhibiting the topoisomerase process, which would lead to cell death). The talk will focus on modelling DNA molecules and the topoisomerase strand passage process using self-avoiding polygons in the cubic lattice, on using multiple Markov chain (MMC) algorithms to generate random self-avoiding polygons, and on the introduction of an interactive energy term which represents the effect of 'adding salt' into the model. Brief introductions to knot theory and DNA topology will also be included. |
| Feb 16, 2-3, Educ 2014 |
Dr. Bassirou Chitou, US Centers for Disease Control and Prevention, Rwanda
Methods for a Behavioral Surveillance Survey of Female Sex Workers in Rwanda Abstract: see the PDF file. |
| Feb 02, 2-3, Murray 299 | Prof.
Mikelis Bickis, Department of Mathematics and Statistics,
University of Saskatchewan The geometry of imprecise
inference A statistical model can be constructed from a null probability distribution on the observation space by defining a set of functions representing the log-likelihood ratios of alternative distributions to the null distribution. Conversely, any model all of whose members have the same null sets can be expressed in this way. The set of functions parametrizes the model, which can be extended to a convex subset in their linear span. An exhaustive model using only a finite number of basis functions constitutes an exponential family model. Given an arbitrary "prior" distribution on this parameter space, the space of functions generates a family of possible posterior distributions parametrized by elements of the observation space. Bayesian updating of a prior distribution can then be visualized as a translation by an update function. Inference by Bayesian updating can be justified by axioms of rationality such as those proposed by de Finetti or Savage. Weakening these axioms leads to the imprecise inference of Walley, in which there is not a single distinguished prior distribution, but a set of priors. Updating is now achieved by translation of the entire set, leading to upper and lower limits on posterior expectations. A crucial but seemingly arbitrary choice of this inferential paradigm is the definition of a suitable set of priors giving maximum imprecision a priori yet leading to informative inferences after observing data. The shape of the set of priors can affect various additional desiderata for rules of inference. |
| Jan 19, 2-3, Educ 1024 | Prof. Jill
Johnstone, Department of Biology, University of Saskatchewan Process uncertainty, measurement error, complex systems, and messy data: Perspectives from the desktop of a field ecologist Ecological research often
generates datasets with a suite of characteristics that make
statistical analysis a formidable and daunting task for ecologists.
Common attributes of ecological datasets include hierarchical (nested)
data structures, spatial or temporal autocorrelation, multicollinearity
of predictor data, unknown measurement error, and complex interactions
between dependent and independent variables. This seminar will provide
some examples of datasets from field investigations of plant ecology
that have great potential for ecological insight, but have often left
me struggling with a Pandora's box of statistical challenges. The
intent is to stimulate dialogue to improve understanding between
ecologists and statisticians about more effective ways to address the
challenges of collecting and analyzing messy ecological data.
|
| Jan 05, 2-3, Arts 241 | Prof.
Lisa Lix, School of Public Health, University of Saskatchewan Comparing Variable Importance Measures for Two Independent Groups Descriptive discriminative
analysis (DDA), logistic regression analysis (LRA), and stepwise
multivariate analysis of variance (MANOVA) procedures can be used to
produce measures of the relative importance of a set of correlated
variables for distinguishing between two independent groups. This
research compares six measures of relative importance based on DDA,
LRA, and MANOVA models for rank ordering a set of correlated variables
using Monte Carlo techniques.
|
| Jan 03, 2-3, Arts 105 | Dr. Yunqi Ji,
Postdoc Fellow, Faculty of Medicine, McGill University Analysis of Imperfect Longitudinal Data Subject to Misclassification and Informative "Unsure" Responses In epidemiological studies,
respondents are often required to answer some questions from pretested
questionnaires using a "Yes", "No" or "Unsure" as the response. An
"Unsure" answer leads to loss of information about the respondent's
inherent status. In addition, even a "Yes" or "No" response may
misclassify the respondent's true status. An unbalanced
misclassification model is presented to describe the misclassification
and "Unsure" responses. We examined the impact of misclassification and
"Unsure" responses on model estimation. An estimating approach is
proposed to correct the attenuation and improve the efficiency of statistical
inference, taking into account both misclassification and "Unsure"
responses.
|
| Nov 24, 1:15-2:15, Arts 105 | Yaqing Liu, M.Sc.
Candidate, University of Saskatchewan Bias Analysis for Logistic Regression with a Misclassified Multi-categorical Exposure In epidemiological studies,
one common issue is that, for various reasons, possible errors may
contaminate the exposure variable. The term "measurement error" refers
to a continuous exposure variable, and the term "misclassification"
refers to a categorical or discrete exposure. The mismeasurement has an
effect on detecting the actual relationship between the exposure and
the health outcome. In fact, biased estimates with falsely small
standard errors may be obtained if investigators naively ignore the
mismeasurement. The aim of my talk is to assess the asymptotic bias
when the misclassification in a multi-categorical exposure is ignored.
The theoretical result of my work extends the work by Davidov et
al. (2003) from a binary exposure to a multi-categorical exposure in
the context of logistic regression models. The results of this study are
useful for guiding large-scale prospective cohort and case-control
studies.
|
| Nov 10, 1:15-2:15, Arts 213 | Tolulope Sajobi,
PhD Candidate, University of Saskatchewan Robust Descriptive Discriminant Analysis for Repeated Measures Data Discriminant analysis
procedures based on parsimonious mean and/or covariance structures have
recently been proposed for repeated measures data. However, these
procedures rest on the assumption of a multivariate normal
distribution. This study examines repeated measures DA (RMDA)
procedures based on maximum likelihood (ML) and coordinatewise trimming
(CT) estimation methods and investigates bias and root mean square
error (RMSE) in discriminant function coefficients (DFCs) of these
procedures under non-normal distributions. Our study results suggest
that the average bias of CT estimates of DFCs for RMDA procedures that
assume unstructured group means was at least 40% smaller than the
values for corresponding procedures based on ML estimators. However,
the average RMSE for the former was about 10% smaller than the values
for the latter procedures when the data were sampled from extremely
skewed or heavy-tailed distributions. The proposed robust procedures
can be used to identify the measurement occasions that make the largest
contribution to group separation when the data are sampled from
multivariate skewed or heavy-tailed distributions.
|
| Oct 27, 1:15-2:15, Arts 105 | Lai Jiang, PhD
Candidate, University of Saskatchewan Classification and Feature Selection via Bayesian t-Probit Model The purpose of this talk is to
introduce our recent work on high-dimensional classification problems
with a heavy-tailed t-probit model. In genomics studies, the sparsity of
high-dimensional data always intensifies the outlier problem, where the
traditional Gaussian assumption fails and leads to nonrobust classifiers
that are vulnerable to type 2 errors. In this talk we propose a
hierarchical Bayesian auxiliary model that incorporates heavy-tailed t
distributions for both noise and regression parameters. We compare our
model with other methods (e.g. logistic regression) and show that one
can obtain a robust classifier with a heavy-tailed and symmetric t prior.
|
| Oct 13, 1:15-2:15, Arts 105 | Prof. Hyun Lim, University of Saskatchewan Semi-Parametric Additive Hazards Model to Competing Risks Analysis When subjects possess
different demographic and disease characteristics and are exposed to
more than one type of failure, a practical problem is to assess
covariate effect on each type of failure as well as on all-cause
failure. The most widely used method adopts Cox models for
cause-specific or all-cause hazards. It has been pointed out
that this method causes the problem of internal inconsistency. To
resolve this problem, the additive hazards model has been advocated as
an alternative method. In this talk, both constant and time-varying
covariate effects in cause-specific hazard models are specified. We
illustrate that the covariate effect on all-cause failure can be
estimated by the sum of the effects on all competing risks. Using an
illustrative example, we show that the proposed method gives a simple
interpretation of the final results when the primary covariate effect
is constant in the additive manner on all cause-specific hazards. Based
on the cause-specific hazard models, estimation of the adjusted
overall survival and cumulative incidence curves is presented.
|
| Sep 30, 3:30-4:30, Arts 108 | Prof.
Peng Zhang, University of Alberta Efficient estimation for subject-specific effects in longitudinal data using nonnormal linear mixed models We propose a new class of
nonnormal linear mixed models that provide an efficient estimation of
subject-specific disease progression in the analysis of longitudinal
data from the Modification of Diet in Renal Disease (MDRD) trial. We
assume a log-gamma distribution for the random effects and provide the
maximum likelihood inference for the proposed nonnormal linear mixed
model. This method is extended to model associations among
subject-specific effects in a multiple characteristics longitudinal
study. More reliable estimates of correlations between random effects
are obtained using the log-gamma mixed model. To validate the adequacy
of the log-gamma assumption versus the usual normality assumption for
the random effects, we propose a lack-of-fit test that clearly
indicates a better fit for the log-gamma modeling in the analysis of
the MDRD data and the glaucoma study.
Note: This is a joint talk with the Colloquium of the Department of Mathematics and Statistics. |