
Short Courses

Announcement of the Second JCSDS 2024 Series of Short Courses and the Yunnan Provincial Key Laboratory of Statistical Modeling and Data Analysis 2024 Summer School


To provide a broader academic exchange platform for professionals engaged in data science and statistics, the "Second JCSDS 2024 Short Course and Yunnan Provincial Key Laboratory of Statistical Modeling and Data Analysis 2024 Summer School" will be held at the East Campus of Yunnan University from July 6 to July 11, 2024 (registration on July 5). Senior undergraduates, graduate students, and young teachers engaged in data science and statistics from both domestic and international institutions are warmly invited to apply.

Target Audience and Scale

Senior undergraduates, graduate students, and young teachers engaged in data science and statistics. The summer school plans to enroll around 100 participants.



Course Information

Class 1: Statistical Hypothesis Testing: From Fundamental Principles to Frontier Research

Lecturer: Zheyang Wu, Professor, Mathematical Sciences, Worcester Polytechnic Institute (WPI)

Class 2: Penalized Least Squares: Theory and Algorithms

Lecturer: Weichen Wang, Assistant Professor, Business School, The University of Hong Kong

Class 3: Statistical and Algorithmic Foundations of Reinforcement Learning

Lecturer: Yuting Wei, Assistant Professor, Statistics and Data Science Department at the Wharton School, University of Pennsylvania

Class 4: Statistical Foundations of Deep Neural Network Models

Lecturer: Lizhen Lin, Professor, Department of Mathematics, University of Maryland

Class 5: An Introduction to Selected Classification Algorithms

Lecturer: Yang Feng, Professor, New York University

Class 6: An Introduction to Transfer Learning

Lecturer: Yang Feng, Professor, New York University

Class 7: Statistical Perspectives on Clustering and PCA

Lecturer: Wen Zhou, Associate Professor, Colorado State University

Other courses are being updated...

Abstracts and Lecturers

Class 1: Statistical Hypothesis Testing: From Fundamental Principles to Frontier Research

Abstract: This short course is designed to introduce foundational concepts and techniques, as well as recent cutting-edge research, in statistical hypothesis testing with a focus on high-dimensional data analysis. It covers topics in multiple-hypothesis testing and global testing related to vectors, matrices, and networks. Relevant computational tools and application scenarios, including meta-analysis, data integration, and signal detection, will be presented to facilitate a better understanding of statistical hypothesis testing.
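As a concrete taste of the multiple-testing topics above, here is a minimal pure-Python sketch (illustrative only, not part of the course materials, with made-up p-values) of the Benjamini-Hochberg procedure, a standard method for controlling the false discovery rate across many simultaneous tests:

```python
# Benjamini-Hochberg (BH) procedure: controls the false discovery rate
# at level alpha across m simultaneous hypothesis tests.
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the (sorted) indices of rejected hypotheses."""
    m = len(p_values)
    # Sort p-values, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest k with p_(k) <= (k/m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(pvals, alpha=0.05))  # -> [0, 1]
```

Note that BH rejects only the two smallest p-values here, even though several others fall below 0.05 individually; that gap is exactly the multiplicity issue the course addresses.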

Brief Bio:


Prof. Zheyang Wu is a Professor of Mathematical Sciences at Worcester Polytechnic Institute (WPI). His expertise lies in biostatistics, particularly in statistical learning in genetics and genomics, such as employing statistical testing-based signal detection methods for identifying novel disease genes using whole genome sequencing data. He obtained his PhD in Biostatistics from Yale University before joining WPI in 2009.


Class 2: Penalized Least Squares: Theory and Algorithms

Abstract: In this short course, we will introduce the high-dimensional linear regression model and the problem of variable selection. To estimate sparse coefficients accurately, we use a powerful method called penalized least squares (PLS). We will study its theoretical framework, leading to a discussion of the pros and cons of different penalty choices. We will further discuss computational algorithms for PLS, especially when the objective function is non-convex.
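To illustrate (this sketch is not from the course itself), consider the simplest PLS case: with an L1 penalty and an orthonormal design, the penalized solution is obtained by soft-thresholding the ordinary least-squares coefficients, which is how the lasso produces exact zeros and hence performs variable selection:

```python
# For an orthonormal design (X^T X = I), penalized least squares with an
# L1 penalty of weight lam (the lasso) has the closed-form solution
#   beta_j = sign(b_j) * max(|b_j| - lam, 0),
# where b_j is the OLS coefficient: the soft-thresholding operator.
def soft_threshold(b, lam):
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

def lasso_orthonormal(ols_coefs, lam):
    return [soft_threshold(b, lam) for b in ols_coefs]

ols = [3.0, -0.4, 1.2, 0.1]
print(lasso_orthonormal(ols, lam=0.5))  # small coefficients shrink to exactly 0
```

Large coefficients are shrunk by lam while coefficients below lam in magnitude are set exactly to zero; the trade-offs among penalties (L1, SCAD, MCP, etc.) discussed in the course revolve around the bias this shrinkage introduces.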

Brief Bio:


Prof. Weichen Wang joined The University of Hong Kong in 2021 as an Assistant Professor. He obtained his PhD in Operations Research and Financial Engineering from Princeton University in 2016. After graduation, he joined Two Sigma Investments as a quantitative researcher. Before his PhD, he received his bachelor’s degree in Mathematics and Physics from Tsinghua University in 2011. Prof. Wang’s research areas include big data analysis, econometrics, robust statistics, and machine learning, and he is particularly interested in the factor structure of the financial market and real-world applications of machine learning. His work has been published in top journals including the Annals of Statistics, the Journal of Machine Learning Research, and the Journal of Econometrics.


Class 3: Statistical and Algorithmic Foundations of Reinforcement Learning

Abstract: As a paradigm for sequential decision-making in unknown environments, reinforcement learning (RL) has received a flurry of attention in recent years. However, the explosion of model complexity in emerging applications and the presence of nonconvexity exacerbate the challenge of achieving efficient RL in sample-starved situations, where data collection is expensive, time-consuming, or even high-stakes (e.g., in clinical trials, autonomous systems, and online advertising). How to understand and enhance the sample and computational efficiencies of RL algorithms is thus of great interest and imminent need. In this short course, we aim to present a coherent framework that covers important algorithmic and theoretical developments in RL, highlighting the connections between new ideas and classical topics. Employing Markov Decision Processes as the central mathematical model, we start by introducing classical dynamic programming algorithms when precise descriptions of the environments are available. Equipped with this preliminary background, we introduce four distinctive RL scenarios (i.e., RL with a generative model, offline RL, online RL, and multi-agent RL), and present three mainstream RL paradigms (i.e., model-based approach, model-free approach, and policy optimization). Our discussions gravitate around the issues of sample complexity and computational efficiency, as well as algorithm-dependent and information-theoretic lower bounds in the non-asymptotic regime. We will systematically introduce several effective algorithmic ideas (e.g., stochastic approximation, variance reduction, optimism in the face of uncertainty for online RL, pessimism in the face of uncertainty for offline RL) that permeate the design of efficient RL.
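As a small illustration of the classical dynamic-programming starting point mentioned above, the following sketch (a toy example, not course material; the transition probabilities and rewards are made up) runs value iteration on a two-state, two-action MDP:

```python
# Value iteration on a tiny 2-state, 2-action MDP with known dynamics.
# P[s][a] = list of (next_state, probability); R[s][a] = expected reward.
P = {0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
     1: {0: [(0, 1.0)],           1: [(1, 0.5), (0, 0.5)]}}
R = {0: {0: 0.0, 1: 1.0},
     1: {0: 0.0, 1: 2.0}}
gamma = 0.9  # discount factor

V = {0: 0.0, 1: 0.0}
for _ in range(500):  # repeat the Bellman optimality update until convergence
    V = {s: max(R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])
                for a in (0, 1))
         for s in (0, 1)}

# Greedy policy with respect to the (approximate) optimal value function.
policy = {s: max((0, 1),
                 key=lambda a: R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a]))
          for s in (0, 1)}
print(policy)  # -> {0: 1, 1: 1}: take the rewarding action in both states
```

With known dynamics the fixed point is reached at a geometric rate; the course's four RL scenarios concern precisely what to do when P and R must instead be learned from samples.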

Brief Bio:


Yuting Wei is currently an assistant professor in the Statistics and Data Science Department at the Wharton School, University of Pennsylvania. Prior to that, Yuting spent two years at Carnegie Mellon University as an assistant professor and one year at Stanford University as a Stein Fellow. She received her Ph.D. in statistics at the University of California, Berkeley. She was the recipient of the 2023 Google Research Scholar Award, NSF Career Award, and the Erich L. Lehmann Citation from the Berkeley statistics department. Her research interests include high-dimensional and non-parametric statistics, statistical machine learning, and reinforcement learning.


Class 4: Statistical Foundations of Deep Neural Network Models

Abstract: As deep learning has achieved breakthrough performance in a variety of application domains, such as image recognition, speech recognition, natural language processing, and healthcare, a significant effort has also been made to understand the theoretical foundations of such models. This short course focuses in particular on the statistical foundations of deep neural network models. From a statistical viewpoint, a deep learning model can largely be regarded as nonparametric function or distribution estimation in which the underlying function or distribution is parametrized by a deep neural network (DNN). In supervised settings such as regression and classification, the underlying regression function or classification map can be modeled by a DNN such as a feedforward network. For distribution estimation based on a DNN, a popular model is the so-called deep generative model. In studying the theoretical foundations of DNN models, statisticians have devoted considerable effort, for example, to understanding why deep neural network models outperform classical nonparametric models or estimators. Characterizing the statistical foundations of DNNs allows us to explain, through the lens of statistical theory, why deep neural networks perform well in practice.

More specifically, the short course will focus on the following sub-themes:

(1) Approximation theory of DNNs 

(2) Statistical properties of DNNs (e.g., feedforward deep neural network models) for regression and classification learning

(3) Statistical properties of deep generative models 

(4) Bayes and Variational Bayes learning of DNNs. 

Brief Bio:

Lizhen Lin is a professor of statistics in the Department of Mathematics at the University of Maryland, where she currently also serves as the director of the statistics program. Her areas of expertise are in Bayesian modeling and theory for high-dimensional and infinite-dimensional models, statistics on manifolds, statistical network analysis and statistical properties of deep generative models. 


Class 5: An Introduction to Selected Classification Algorithms

Abstract: Dive into the world of data classification with this in-depth course, designed to provide a thorough understanding of key methods in the field. Explore a wide range of techniques such as Linear and Quadratic Discriminant Analysis, K-Nearest Neighbors, Decision Trees, Support Vector Machines, Random Forests, Boosting, and Neural Networks. The course is crafted to equip you with the knowledge and skills necessary to apply these methods effectively in various data analysis scenarios.
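For a flavor of the simplest method on that list, here is a from-scratch sketch (illustrative only, with made-up toy data, not part of the course materials) of a k-nearest-neighbor classifier:

```python
# k-nearest-neighbor classification: predict the majority label among
# the k training points closest to the query point.
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    # Rank training points by squared Euclidean distance to x.
    nearest = sorted(range(len(train_X)),
                     key=lambda i: sum((a - b) ** 2
                                       for a, b in zip(train_X[i], x)))
    # Majority vote among the k nearest neighbors.
    votes = Counter(train_y[i] for i in nearest[:k])
    return votes.most_common(1)[0][0]

# Two well-separated clusters in 2-D.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(X, y, (0.5, 0.5)))  # -> "A" (near the first cluster)
print(knn_predict(X, y, (5.5, 5.5)))  # -> "B" (near the second cluster)
```

The choice of k trades off variance (small k) against bias (large k), a theme that recurs across all the classifiers the course covers.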

Class 6: An Introduction to Transfer Learning

Abstract: This course offers a comprehensive introduction to the statistical foundations underpinning a prevalent machine learning technique: transfer learning. We delve into how transfer learning effectively transfers knowledge from one task to another in an adaptive and robust fashion, thereby enhancing model performance across both supervised and unsupervised learning frameworks. Various transfer learning frameworks will be discussed, including covariate shift and posterior drift, under different assumptions such as model sparsity and low-rank structure. Participants will gain the essential knowledge required to skillfully implement these techniques in diverse supervised and unsupervised learning scenarios.
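As a toy illustration of the covariate-shift setting mentioned above (not part of the course; the uniform densities and the regression function are assumptions made for the example), the sketch below uses importance weights, i.e., the density ratio between target and source, to estimate a target-population mean from source samples alone:

```python
# Covariate shift in one picture: source and target share the same
# regression function, but X is drawn from different distributions.
# With known densities, importance weights p_target(x)/p_source(x)
# reweight source samples to estimate a target-population quantity.
import random

random.seed(0)

def f(x):  # shared regression function (identical under source and target)
    return 2.0 * x

# Source: X ~ Uniform(0, 2); Target: X ~ Uniform(0.5, 1.5).
n = 100_000
src_x = [random.uniform(0.0, 2.0) for _ in range(n)]
src_y = [f(x) for x in src_x]

def weight(x):
    # Density ratio p_target(x)/p_source(x): Uniform(0.5, 1.5) has density 1
    # on its support, Uniform(0, 2) has density 0.5 everywhere.
    return (1.0 / 0.5) if 0.5 <= x <= 1.5 else 0.0

w = [weight(x) for x in src_x]
# Self-normalized importance-weighted estimate of E_target[Y].
est = sum(wi * yi for wi, yi in zip(w, src_y)) / sum(w)
print(est)  # approximately 2.0, the target-population mean of Y
```

Note that the example works because the target support lies inside the source support; when the densities are unknown, estimating this ratio is itself a core statistical problem in the covariate-shift literature.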

Brief Bio:


Yang Feng is a Professor of Biostatistics at New York University. He obtained his Ph.D. in Operations Research at Princeton University in 2010. Feng’s research interests encompass the theoretical and methodological aspects of machine learning, high-dimensional statistics, network models, and nonparametric statistics, leading to a wealth of practical applications. He has published more than 70 papers in statistical and machine-learning journals. His research has been funded by multiple grants from the National Institutes of Health (NIH) and the National Science Foundation (NSF), notably the NSF CAREER Award. He is currently an Associate Editor for the Journal of the American Statistical Association (JASA), the Journal of Business & Economic Statistics (JBES), and the Annals of Applied Statistics (AoAS). His professional recognition includes being named a fellow of the American Statistical Association (ASA) and the Institute of Mathematical Statistics (IMS), as well as an elected member of the International Statistical Institute (ISI).


Class 7: Statistical Perspectives on Clustering and PCA

Abstract: Recent years have seen an explosion of data that is ultra high-dimensional, complexly structured or unstructured, dynamic, and even derived from heterogeneous sources. These data are being produced, collected, stored, and made increasingly accessible to a wide range of stakeholders, including industrial institutions, academic researchers, investors, and individuals. However, learning from this vast trove of information and making accurate predictions poses significant challenges for both algorithm-driven machine learning methods and traditional statistical approaches. Key among these challenges are learning from data heterogeneity and implementing effective dimension reduction techniques without compromising the integrity of the data. This short course aims to lay the groundwork for understanding cluster analysis, an essential unsupervised statistical learning technique for uncovering data heterogeneity. It also covers Principal Component Analysis (PCA) and its variants, which are among the most prevalent tools for dimension reduction.
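To make the PCA part concrete, here is a minimal pure-Python sketch (illustrative, with made-up 2-D data, not course material) that recovers the leading principal component, i.e., the direction of maximum variance, by power iteration on the sample covariance matrix:

```python
# PCA via power iteration: the leading eigenvector of the sample
# covariance matrix is the direction of maximum variance.
import math

data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]

# Center the data.
mx = sum(x for x, _ in data) / len(data)
my = sum(y for _, y in data) / len(data)
centered = [(x - mx, y - my) for x, y in data]

# Sample covariance matrix (2x2, symmetric).
n = len(data)
cxx = sum(x * x for x, _ in centered) / (n - 1)
cyy = sum(y * y for _, y in centered) / (n - 1)
cxy = sum(x * y for x, y in centered) / (n - 1)

# Power iteration: repeatedly apply the matrix and renormalize.
v = (1.0, 0.0)
for _ in range(100):
    w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
    norm = math.hypot(w[0], w[1])
    v = (w[0] / norm, w[1] / norm)

print(v)  # close to (0.707, 0.707): the data vary along the diagonal
```

The same eigenvector viewpoint underlies the PCA variants covered in the course; the statistical subtlety is how well this sample-level direction estimates the population principal component in high dimensions.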
Brief Bio:

Dr. Zhou is an Associate Professor in the Department of Statistics at Colorado State University and the Department of Biostatistics and Informatics at the Colorado School of Public Health. Before joining CSU, he received his Ph.D. in Statistics at Iowa State University, where he also earned a Ph.D. in Applied Mathematics. His research focuses on developing theory and methods for high-dimensional inference, network modeling, machine learning, statistical genomics and genetics, and causal inference. He is currently serving as Co-Editor-in-Chief of the Journal of Biopharmaceutical Statistics, as an associate editor for Biometrics, Statistica Sinica, and the Journal of Multivariate Analysis, and on the editorial advisory board of New Phytologist. Starting in 2024, he serves as the elected WNAR program coordinator.



Other courses are being updated...


Schedule


Registration Date: July 5, 2024 

Training Period: July 6 to July 11, 2024


Application and Admission


Please scan the QR code below to apply. The application deadline is April 30, 2024.



Admissions will be adjusted according to the applications received from each institution, and admitted participants will be notified by email.



Training Fees


Participants who receive an admission notice should pay a training fee of 1,000 RMB by May 30, 2024, collected by the Yunnan Applied Statistical Society.


Contact Information


Contact Person: Teacher Zhou

Email: jcsds2024@163.com

Mailing Address: College of Mathematics and Statistics, Chenggong Campus, Yunnan University; Postal Code: 650504

