Hello and welcome! My full name is Dung Ngoc Nguyen, but most people call me Dung, with the "D" pronounced like a "Z" in English.
I am currently a postdoctoral research fellow in Statistics under the supervision of Professor Alberto Roverato at the Department of Statistical Sciences, University of Padova, Italy. Our research focuses on developing new statistical tools for learning the structure of high-dimensional complex networks in non-standard experimental setups.
My research interests lie in Data Science; in particular, I apply Machine Learning and Statistical Methodologies to analyze and interpret large, complex datasets, especially data from biology and genetics.
I am a motivated early-career researcher who hopes to leave a lasting contribution to science, and I welcome any opportunity for collaboration with pleasure and enthusiasm.
Ph.D. in Statistical Sciences, 2018-2021
Università degli studi di Padova, Italy
M.S. in Applied Mathematics, 2017-2018
Université de Tours, France
B.S. in Mathematics and Computer Science, 2013-2017
University of Science, Vietnam National University Ho Chi Minh City, Vietnam
We consider the problem of learning a Gaussian graphical model in the case where the observations come from two dependent groups sharing the same variables. We focus on a family of coloured Gaussian graphical models specifically suited for the paired data problem. Commonly, graphical models are ordered by the submodel relationship so that the search space is a lattice, called the model inclusion lattice. We introduce a novel order between models, named the twin order. We show that, equipped with this order, the model space is a lattice that, unlike the model inclusion lattice, is distributive. Furthermore, we provide the relevant rules for the computation of the neighbours of a model. The latter are computationally cheaper than the corresponding operations in the model inclusion lattice and are exploited to explore the search space more efficiently. These results can be applied to improve the efficiency of both greedy and Bayesian model search procedures. Here we implement a stepwise backward elimination procedure and evaluate its performance by means of simulations. Finally, the procedure is applied to learn a brain network from fMRI data, where the two groups correspond to the left and right hemispheres, respectively.
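To fix ideas, here is a minimal sketch of the paired-data setting in generic notation (the symbols below are illustrative and not taken from the paper). If the same p variables are observed in a left group and a right group, the joint vector is modelled as Gaussian with concentration matrix

\[
K = \begin{pmatrix} K_{LL} & K_{LR} \\ K_{LR}^{\top} & K_{RR} \end{pmatrix},
\]

where zero entries encode missing edges as in an ordinary Gaussian graphical model, and a coloured model for paired data additionally imposes equality (symmetry) constraints between selected entries of \(K_{LL}\) and \(K_{RR}\), and between symmetric entries of \(K_{LR}\), reflecting the correspondence between the two groups.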
Mixture of experts (MoE) models are among the most popular and interesting combination techniques, with great potential for improving the performance of machine learning and statistical learning systems. We are the first to consider a polynomial softmax-gated block-diagonal mixture of experts (PSGaBloME) model for the identification of potentially nonlinear regression relationships for complex and high-dimensional heterogeneous data, where the number of explanatory and response variables can be much larger than the sample size and possibly hidden graph-structured interactions exist. These PSGaBloME models are characterized by several hyperparameters, including the number of mixture components, the complexity of softmax gating networks and Gaussian mean experts, and the hidden block-diagonal structures of covariance matrices. We contribute a non-asymptotic theory for model selection of such complex hyperparameters with the help of the slope heuristic approach in a penalized maximum likelihood estimation (PMLE) framework. In particular, we establish a non-asymptotic risk bound on the PMLE, which takes the form of an oracle inequality, given lower bound assumptions on the penalty function. Furthermore, we propose two Lasso–MLE–rank procedures, based on a new generalized expectation–maximization algorithm, to tackle the estimation problem of the collection of PSGaBloME models.
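As a rough illustration of the family this model builds on, a generic softmax-gated Gaussian mixture of experts writes the conditional density of the response as (notation illustrative; the PSGaBloME model further uses polynomial gates and mean functions and block-diagonal covariance structures)

\[
f(y \mid x) \;=\; \sum_{k=1}^{K} \frac{\exp\!\left(w_k^{\top} x\right)}{\sum_{l=1}^{K} \exp\!\left(w_l^{\top} x\right)} \, \mathcal{N}\!\left(y;\, \mu_k(x),\, \Sigma_k\right),
\]

where the softmax weights form the gating network and each Gaussian expert has mean function \(\mu_k(x)\) and covariance matrix \(\Sigma_k\).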
We consider the problem of learning a graphical model when the observations come from two groups sharing the same variables but, unlike the usual approach to the joint learning of graphical models, the two groups do not correspond to different populations and therefore produce dependent samples. A Gaussian graphical model for paired data may be implemented by applying the methodology developed for the family of graphical models with edge and vertex symmetries, also known as coloured graphical models. We identify a family of coloured graphical models suited for the paired data problem and investigate the structure of the corresponding model space. More specifically, we provide a comprehensive description of the lattice structure formed by this family of models under the model inclusion order. Furthermore, we give rules for the computation of the join and meet operations between models, which are useful in the exploration of the model space. These are then applied to implement a stepwise model search procedure, and an application to the identification of a brain network from fMRI data is given.
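As a toy illustration of how join, meet, and stepwise search fit together, the Python sketch below treats the simpler uncoloured case, where a model can be identified with its edge set; the rules for coloured models derived in the paper are richer and are not reproduced here. The score function is a placeholder for a model selection criterion such as BIC.

from itertools import combinations

def join(edges_a, edges_b):
    # Smallest model containing both: union of the edge sets.
    return set(edges_a) | set(edges_b)

def meet(edges_a, edges_b):
    # Largest model contained in both: intersection of the edge sets.
    return set(edges_a) & set(edges_b)

def backward_step(edges, score):
    # One step of backward elimination: among all models obtained by
    # removing a single edge, return the one with the best (lowest) score.
    candidates = [(score(set(edges) - {e}), set(edges) - {e}) for e in edges]
    return min(candidates, key=lambda c: c[0])[1] if candidates else set()

# Example: start from the saturated model on four vertices and use the
# number of edges as a stand-in score, so one (arbitrary) edge is dropped.
full_model = set(combinations(range(4), 2))
print(backward_step(full_model, lambda E: len(E)))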
Lecturer: Andrew Ng
Objectives: This course provides a broad introduction to modern machine learning, including supervised learning (multiple linear regression, logistic regression, neural networks, and decision trees), unsupervised learning (clustering, dimensionality reduction, recommender systems), and some of the best practices used in Silicon Valley for artificial intelligence and machine learning innovation (evaluating and tuning models, taking a data-centric approach to improving performance, and more).
Lecturer: Omiros Papaspiliopoulos
Objectives: