Recent Posts

Slides for my presentation at CMStatistics 2021 are available here. The talk was about generalized additive latent and mixed models, which are described further in this post.

Slides for my presentation at the Nordic-Baltic Biometrics Conference are available here.

The organizers of the European R User Meeting 2020 have put together a really impressive event, with lots of opportunities for interaction and stimulating discussions while being fully online. I have particularly enjoyed the good mix of academic presentations focusing on methodology and business- and industry-related presentations focusing on the use of R in production. Today I presented the BayesMallows package in a five-minute lightning talk, and the slides (with links) are available here.

The European R User Meeting 2020 has so far been a really great event, with interesting talks and online presentations working smoothly. I presented the metagam package this morning, and the slides are available here.

Tonight I gave a presentation of Rcpp at the Oslo UseR! Group. The slides are here.

It was a nice opportunity to meet some of the many R users in town. Thanks to Deemah for organizing!

Publications

We present generalized additive latent and mixed models (GALAMMs) for analysis of clustered data with responses and latent variables depending smoothly on observed variables. A scalable maximum likelihood estimation algorithm is proposed, utilizing the Laplace approximation, sparse matrix computation, and automatic differentiation. Mixed response types, heteroscedasticity, and crossed random effects are naturally incorporated into the framework. The models developed were motivated by applications in cognitive neuroscience, and two case studies are presented.

We address the problem of estimating how different parts of the brain develop and change throughout the lifespan, and how these trajectories are affected by genetic and environmental factors. Estimation of these lifespan trajectories is statistically challenging because their shapes are typically highly nonlinear. Moreover, although true change can only be quantified by longitudinal examinations, follow-up intervals in neuroimaging studies typically cover less than 10% of the lifespan, so use of cross-sectional information is necessary.

Analyzing data from multiple neuroimaging studies has great potential for increasing statistical power, enabling detection of effects of smaller magnitude than would be possible when analyzing each study separately, and also allowing systematic investigation of between-study differences. Restrictions due to privacy or proprietary data, as well as more practical concerns, can make it hard to share neuroimaging datasets, such that analyzing all data in a common location might be impractical or impossible.

BayesMallows is an R package for analyzing preference data in the form of rankings with the Mallows rank model, and its finite mixture extension, in a Bayesian framework. The model is grounded in the idea that the probability density of an observed ranking decreases exponentially with the distance to the location parameter. It is the first Bayesian implementation that allows a wide choice of distances, and it works well with a large number of items to be ranked.
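The exponential decay described above can be illustrated with a small brute-force sketch in Python. This is not the BayesMallows implementation: the function names, the scale parameter alpha, and the use of the Kendall distance with an unscaled exponent are assumptions made purely for illustration, and enumerating all permutations is only feasible for a handful of items.

```python
import itertools
import math


def kendall_distance(r1, r2):
    """Count the item pairs that the two rankings order differently."""
    n = len(r1)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def mallows_pmf(rho, alpha):
    """Probability of every ranking of len(rho) items.

    rho is the location parameter (consensus ranking) and alpha is a
    hypothetical scale parameter. Each ranking r gets the weight
    exp(-alpha * d(r, rho)), normalized by brute-force enumeration.
    """
    n = len(rho)
    rankings = list(itertools.permutations(range(1, n + 1)))
    weights = [math.exp(-alpha * kendall_distance(r, rho)) for r in rankings]
    z = sum(weights)  # partition function (normalizing constant)
    return {r: w / z for r, w in zip(rankings, weights)}


rho = (1, 2, 3, 4)
pmf = mallows_pmf(rho, alpha=0.5)
for ranking in [(1, 2, 3, 4), (2, 1, 3, 4), (4, 3, 2, 1)]:
    print(ranking, kendall_distance(ranking, rho), round(pmf[ranking], 4))
```

The consensus ranking itself receives the highest probability, and rankings further away in Kendall distance receive exponentially less.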

Researchers interested in hemispheric dominance frequently aim to infer latent functional differences between the hemispheres from observed lateral behavioural or brain-activation differences. To be valid, these inferences must not rely only on the observed laterality measures but also need to account for the antecedent probabilities of the studied latent classes. This fact is frequently ignored in the literature, leading to misclassifications, especially when considering low-probability classes such as “atypical” right hemispheric language dominance.
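The role of the antecedent (prior) probabilities can be seen in a minimal Bayes' rule sketch in Python. All numbers here are hypothetical and chosen only for illustration: the two class labels, the 90/10 prior split, and the normal distributions assumed for the laterality index within each class are not taken from the paper.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_class_probs(x, priors, params):
    """Posterior probability of each latent class given an observed index x.

    priors maps class -> prior probability; params maps class -> (mean, sd)
    of the laterality index within that class (hypothetical values).
    """
    joint = {k: priors[k] * normal_pdf(x, *params[k]) for k in priors}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}


# Hypothetical setup: right-hemispheric dominance is the rare class.
priors = {"left": 0.9, "right": 0.1}
params = {"left": (10.0, 20.0), "right": (-10.0, 20.0)}

x = -5.0  # observed laterality index, closer to the "right" class mean
post = posterior_class_probs(x, priors, params)
flat = posterior_class_probs(x, {"left": 0.5, "right": 0.5}, params)
print("with priors:   ", {k: round(v, 3) for k, v in post.items()})
print("ignoring priors:", {k: round(v, 3) for k, v in flat.items()})
```

In this made-up example the observation is more likely under the "right" class, yet once the low prior probability of that class is taken into account the posterior still favours "left". Ignoring the prior would flip the classification.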

This is the companion paper to the hdme R package. Link to paper.

Ranking and comparing items is crucial for collecting information about preferences in many areas, from marketing to politics. The Mallows rank model is among the most successful approaches to analyzing rank data, but its computational complexity has limited its use to a particular form based on Kendall distance. We develop new computationally tractable methods for Bayesian inference in Mallows models that work with any right-invariant distance. Our method performs inference on the consensus ranking of the items, even when based on partial rankings, such as top-k items or pairwise comparisons.

In many problems involving generalized linear models, the covariates are subject to measurement error. When the number of covariates p exceeds the sample size n, regularized methods like the lasso or Dantzig selector are required. Several recent papers have studied methods which correct for measurement error in the lasso or Dantzig selector for linear models in the p > n setting. We study a correction for generalized linear models, based on Rosenbaum and Tsybakov’s matrix uncertainty selector.

Regression with the lasso penalty is a popular tool for performing dimension reduction when the number of covariates is large. In many applications of the lasso, such as in genomics, covariates are subject to measurement error. We study the impact of measurement error on linear regression with the lasso penalty, both analytically and in simulation experiments. A simple method of correction for measurement error in the lasso is then considered. In the large sample limit, the corrected lasso yields sign consistent covariate selection under conditions very similar to the lasso with perfect measurements, whereas the uncorrected lasso requires much more stringent conditions on the covariance structure of the data.

Software

R package available from CRAN. See also the accompanying Shiny App. Functional differences between the cerebral hemispheres are a fundamental characteristic of the human brain. Researchers interested in studying these differences often infer underlying hemispheric dominance for a certain function (e.g., language) from laterality indices calculated from observed performance or brain activation measures. However, any inference from observed measures to latent (unobserved) classes has to consider the prior probability of class membership in the population.

R package available from CRAN. Meta-analysis of generalized additive models and generalized additive mixed models. A typical use case is when data cannot be shared across locations, and an overall meta-analytic fit is sought. ‘metagam’ provides functionality for removing individual participant data from models computed using the ‘mgcv’ and ‘gamm4’ packages such that the model objects can be shared without exposing individual data. Furthermore, methods for meta-analysing these fits are provided.

R package available from CRAN. An implementation of the Bayesian version of the Mallows rank model. The Cayley, footrule, Hamming, Kendall, Spearman, and Ulam distances are all supported in the models. The rank data to be analyzed can be in the form of complete rankings, top-k rankings, partially missing rankings, as well as consistent and inconsistent pairwise preferences. Several functions for plotting and studying the posterior distributions of parameters are provided. The package also provides functions for estimating the partition function (normalizing constant) of the Mallows rank model, both with the importance sampling algorithm of Vitelli et al.

R package available from CRAN.

Penalized regression for generalized linear models for measurement error problems (a.k.a. errors-in-variables). The package contains a version of the lasso (L1-penalization) which corrects for measurement error. It also contains an implementation of the Generalized Matrix Uncertainty Selector, which is a version of the (Generalized) Dantzig Selector for the case of measurement error.

Experience

Professor of Biostatistics

University of Oslo

Jan 2024 – Present Oslo
Department of Psychology.

Associate Editor

Journal of Open Source Software

Jun 2021 – Present Oslo

Associate Professor of Statistics

University of Oslo

Sep 2018 – Dec 2023 Oslo
Working at the Center for Lifespan Changes in Brain and Cognition, Department of Psychology.

Postdoctoral Research

University of Oslo

Jun 2018 – Aug 2018 Oslo
Working at the Oslo Center for Biostatistics and Epidemiology (OCBE).

Data Scientist

NextBridge Analytics

Aug 2016 – May 2018 Oslo
Data science consultant.

Analyst

Storebrand Life Insurance

Nov 2014 – Jul 2016 Oslo
Market analyst in the B2B market.

PhD Student

University of Oslo

May 2011 – Oct 2014 Oslo
PhD student at the Department of Biostatistics, with 25% teaching.