CPSC 5380: Big Data Systems: Trends and Challenges
Anurag Khandelwal, PhD
Today’s Internet-scale applications and cloud services generate massive amounts of data. At the same time, the availability of inexpensive storage has made it possible for these services and applications to collect and store every piece of data they generate, in the hopes of improving their services by analyzing the collected data. This introduces interesting new opportunities and challenges in designing systems for collecting, analyzing, and serving the so-called big data. This course looks at technology trends that have paved the way for big data applications, surveys state-of-the-art systems for storage and processing of big data, and considers future research directions driven by open research problems. Our discussions span topics such as cluster architecture, big data analytics stacks, scheduling and resource management, batch and stream analytics, graph processing, ML/AI frameworks, and serverless platforms and disaggregated architectures.
CPSC 5710: Trustworthy Deep Learning
Rex (Zhitao) Ying
In recent years, deep learning has seen applications in many fields, from science and technology to finance, the humanities, and business. However, real-world, high-impact machine learning applications demand more than just model performance. In particular, deep learning models are often required to be “trustworthy,” so that domain experts can trust that the models consistently behave in a way that corresponds to their domain knowledge. For example, medical experts would expect a deep learning diagnosis model to be able to explicitly utilize medical domain knowledge in its prediction; an insurance company would expect a decision on insurance price to be explainable in terms of risk factors; a financial company would expect its fraud detection model to be robust to adversarial attacks; a physicist would expect models to provide consistency with the underlying laws. This course introduces various fields of trustworthy deep learning, including model robustness, defenses against adversarial attacks, interpretability, explainability, fairness, privacy, domain adaptation, rules, and constraints. The course covers some of these aspects in the context of graph neural networks but also covers many other ML models in general deep learning, natural language processing, and computer vision.
S&DS 5720: YData: Data Science for Political Campaigns
Joshua Kalla, PhD
Political campaigns have become increasingly data driven. Data science is used to inform where campaigns compete, which messages they use, how they deliver them, and among which voters. In this course, we explore how data science is being used to design winning campaigns. Students gain an understanding of what data is available to campaigns, how campaigns use this data to identify supporters, and the use of experiments in campaigns. The course provides students with an introduction to political campaigns, an introduction to data science tools necessary for studying politics, and opportunities to practice the data science skills presented in S&DS 523.
CPSC 5700: Artificial Intelligence
Tesca Fitzgerald, PhD
Introduction to artificial intelligence research, focusing on reasoning and perception. Topics include knowledge representation, predicate calculus, temporal reasoning, vision, robotics, planning, and learning.
S&DS 6890: Scientific Machine Learning
Lu Lu, PhD
CPSC 5460: Data and Information Visualization
Holly Rushmeier, MS, PhD
Visualization is a powerful tool for understanding data and concepts. This course provides an introduction to the concepts needed to build new visualization systems, rather than to use existing visualization software. Major topics are abstracting visualization tasks, using visual channels, spatial arrangements of data, navigation in visualization systems, using multiple views, and filtering and aggregating data. Case studies to be considered include a wide range of visualization types and applications in humanities, engineering, science, and social science.
CPSC 5150: Law and Large Language Models
Ruzica Piskac, PhD, Scott J. Shapiro, JD, PhD
This course is intended for computer science and law students interested in how artificial intelligence can be applied to legal reasoning. It combines basic AI theory with practical project work, focusing on using tools like large language models (LLMs) and other AI technologies for tasks common in legal practice. Students learn how to automate case summarization, draft legal memos and briefs, simulate oral arguments for better argumentation skills, and assist in the preparation of pro-se motions for self-represented litigants. The course emphasizes hands-on experience, helping students build real-world skills in applying AI in legal settings. Our goal is to bring together students from computer science and from law and pair them in teams. Each team works on a project that automates a specific aspect of the legal process or legal reasoning, focusing on practical, real-world applications. In addition to all standard course requirements, graduate students need to present a recent, relevant research paper in class.
CPSC 5860: Probabilistic Machine Learning
Andre Wibisono, MA, MEng, PhD
This course provides an overview of the probabilistic frameworks for machine learning applications. The course covers probabilistic generative models, learning and inference, algorithms for sampling, and a survey of generative diffusion models. It studies the theoretical analysis of these problems and the design of algorithms to solve them. It also familiarizes students with techniques and results in the literature and prepares them for research in machine learning.
S&DS 5650: Introductory Machine Learning
John Lafferty, PhD
This course covers the key ideas and techniques in machine learning without the use of advanced mathematics. Basic methodology and relevant concepts are presented in lectures, including the intuition behind the methods. Assignments give students hands-on experience with the methods on different types of data. Topics include linear regression and classification, tree-based methods, clustering, topic models, word embeddings, recurrent neural networks, dictionary learning, and deep learning. Examples come from a variety of sources including political speeches, archives of scientific articles, real estate listings, natural images, and others. Programming is central to the course and is based on the Python programming language.
CPSC 5520: Deep Learning Theory and Applications
Smita Krishnaswamy, PhD
Deep neural networks have gained immense popularity within the past decade due to their success in many important machine-learning tasks such as image recognition, speech recognition, and natural language processing. This course provides a principled and hands-on approach to deep learning with neural networks. Students master the principles and practices underlying neural networks, including modern methods of deep learning, and apply deep learning methods to real-world problems including image recognition, natural language processing, and biomedical applications. Course work includes homework, a final exam, and a final project—either group or individual, depending on enrollment—with both a written and oral (i.e., presentation) component. The course assumes basic prior knowledge in linear algebra and probability.
CPSC 7520: Biomedical Data Science: Mining and Modeling
Mark Gerstein, PhD, Matt Simon, PhD
Biomedical data science encompasses the analysis of gene sequences, macromolecular structures, and functional genomics data on a large scale. It represents a major practical application for modern techniques in data mining and simulation. Specific topics to be covered include sequence alignment, large-scale processing, next-generation sequencing data, comparative genomics, phylogenetics, biological database design, geometric analysis of protein structure, molecular-dynamics simulation, biological networks, normalization of microarray data, mining of functional genomics data sets, and machine-learning approaches to data integration.
CPSC 5810: Introduction to Machine Learning
Alex Wong, MS, PhD
This course focuses on fundamental topics in machine learning. We begin with an overview of different components of machine learning and types of learning paradigms. We introduce a linear function, discuss how one can train a linear function on a given dataset, and utilize it to tackle classification and regression problems. We then consider kernel methods to enable us to solve nonlinear problems. Additionally, we introduce the concept of generalization error and overfitting. We discuss the role of regularization and extend linear regression to ridge regression. We also cover topics in optimization, beginning from gradient descent and extending it to stochastic gradient descent and its momentum variant. We also cover the concept of alternating optimization and topics within it. We introduce the curse of dimensionality and discuss topics on dimensionality reduction. Finally, we conclude the course with neural networks: how to build them using the topics discussed, how to optimize them, and how to apply them to solve a range of machine learning tasks.
CPSC 7760: Topics in Industrial AI Applications
Xiuye (Sue) Chen, PhD
This seminar aims to familiarize students with cutting-edge topics in industrial AI research and their practical applications. We will explore a broad range of topics such as large language models, image generation, ML/AI systems considerations, autonomous vehicles, robotics, recommender systems, ambient intelligence, and AI applications in the life sciences and healthcare. Most sessions will be devoted to in-depth discussions of one to two key papers on modern AI applications. We will also feature a series of industry guest speakers, providing students with the opportunity to learn directly from practicing experts. In this seminar, students are expected to present papers, actively participate in class discussions, and work either individually or in groups on a final project that emphasizes the practical implementation of AI techniques. Students should be familiar enough with ML/AI concepts to read academic papers, and comfortable with programming to run open source code in the ML/AI space.
CPSC 5830: Deep Learning on Graph-Structured Data
Rex (Zhitao) Ying, PhD
Graph structure emerges in many important domain applications, including but not limited to computer vision, natural sciences, social networks, languages, and knowledge graphs. This course offers an introduction to deep learning algorithms applied to such graph-structured data. The first part of the course is an introduction to representation learning for graphs and covers common techniques in the field, including distributed node embeddings, graph neural networks, deep graph generative models, and non-Euclidean embeddings. The first part also touches upon topics of real-world significance, including auto-ML and explainability for graph learning. The second part of the course covers important applications of graph machine learning. We learn ways to model data as graphs and apply graph learning techniques to problems in domains including online recommender systems, knowledge graphs, biological networks, physical simulations, and graph mining. The course covers many deep learning techniques (graph neural networks, deep graph generative models) tailored to graph structures. Basic deep learning tutorials are also included in this course.
S&DS 5170: Applied Machine Learning and Causal Inference
P. Aronow, PhD
Approaches to causal inference using machine learning. Covers randomized experiments with and without noncompliance, observational studies with and without ignorable treatment assignment, instrumental variables, and regression discontinuity. Machine-learning methods include bagging, boosting, tree-based methods such as random forests, and neural networks. Assignments provide students with hands-on experience with the methods. Applications are drawn from a variety of fields including political science, economics, public health, and medicine. Programming is central to the course and is based on the R programming language.
S&DS 5230: YData: An Introduction to Data Science
Ethan Meyers, PhD
Computational, programming, and statistical skills are no longer optional in our increasingly data-driven world; they are essential for opening doors to manifold research and career opportunities. This course aims to dramatically enhance students’ knowledge and capabilities in fundamental ideas and skills in data science, especially computational and programming skills and inferential thinking. It emphasizes the development of these skills while providing opportunities for hands-on experience and practice. The course is designed to be accessible to students with little or no background in computing, programming, or statistics, but also engaging for more technically oriented students through extensive use of examples and hands-on data analysis. Python 3 is the computing language used. Enrollment is limited.
S&DS 6650: Intermediate Machine Learning
John Lafferty, PhD
S&DS 365 is a second course in machine learning at the advanced undergraduate or beginning graduate level. The course assumes familiarity with the basic ideas and techniques in machine learning, for example as covered in S&DS 265. The course treats methods together with mathematical frameworks that provide intuition and justifications for how and when the methods work. Assignments give students hands-on experience with machine learning techniques, to build the skills needed to adapt approaches to new problems. Topics include nonparametric regression and classification, kernel methods, risk bounds, nonparametric Bayesian approaches, graphical models, attention and language models, generative models, sparsity and manifolds, and reinforcement learning. Programming is central to the course, and is based on the Python programming language and Jupyter notebooks.
SOCY 5670: AI in Social Science Methods
Daniel Karell, PhD
Social scientists have begun integrating AI technology into the designs and methods of their research projects. How are they doing so? What are the current standards and best practices? This course uses a seminar format to review, discuss, and critique how AI technologies are currently being incorporated into social science research activities. Students read recently published articles and widely discussed unpublished papers, and, through class discussion, identify the promises and pitfalls of using AI to conduct social science research. Students also learn how to justify and explain the use of AI in their own research projects. During the course, students conduct an original research project that investigates a social science topic while making use of AI in the project’s design and/or methods.
Peter Salovey and Marta Moret Data Science Fellows Program
Welcome to the Peter Salovey and Marta Moret Data Science Fellows Program, a unique fellowship for Yale University PhD students. Established in 2025 with a generous endowment, we aim to inspire, support, and promote high-quality, impactful research at the intersection of data science and other academic fields. Through tailored opportunities for skill development, training, and networking, PhDs from any discipline are equipped to address today’s most pressing challenges through the use of data science. Beginning March 2026, Yale graduate students are invited to apply to join our community of innovators and scholars.
MGT 695: Intro to AI Applications
Xiuye (Sue) Chen, PhD
Introduction to AI Applications demystifies the core principles and practical applications of artificial intelligence for students with introductory programming experience. Covering essential topics like machine learning models, data handling, ethical AI use, and real-world problem-solving with AI, this course incorporates hands-on projects to foster an intuitive understanding of AI's capabilities and limitations. Students will emerge from the class with foundational AI knowledge, ready to apply AI solutions across various domains and explore more specialized AI disciplines. This course is cross-listed with Computer Science and will follow the Yale College academic calendar.
MGT 554: AI for Business Decisions
Tong Wang, PhD
Artificial Intelligence (AI) is revolutionizing today’s companies and industries. This course provides an in-depth exploration of the fundamental principles, algorithms, and applications of AI, with a special focus on recognizing and circumventing the numerous pitfalls that can lead to misguided managerial decisions and insights. Through a blend of lectures, case studies, and team discussions, students will learn to apply AI algorithms and methodologies to solve real-world business problems. Students will gain skills in discovering patterns, making predictions, and generating insights to support business decision-making. With a strong emphasis on practical learning through real-world examples and case studies, students will develop a nuanced understanding of the context and complexities of AI models, enabling them to navigate AI applications effectively while avoiding common pitfalls.
MGT 853: AI Strategy & Marketing
Vineet Kumar, PhD
Artificial Intelligence is a general-purpose technology which has the potential to transform many aspects of business and society. In business, the impact ranges from commonplace predictive improvements at one end of the spectrum to opportunities for creating entirely new markets at the other. As background, the course will briefly introduce students to Artificial Intelligence / Machine Learning methods comprising Unsupervised, Supervised, and Reinforcement Learning. Through a combination of lectures and case studies, we will evaluate how to integrate AI into decision making, and examine the strategic choices facing companies developing and using AI / ML technologies. We will assess how both consumers and decision-makers evaluate decisions made by AI systems, and the feasibility of explainable AI. The course will also examine issues at the intersection of AI and society, including fairness and bias, that are proving to be especially challenging. Note: This is a new course currently under development, so there is no syllabus currently available. The syllabus will be posted to Canvas and the professor’s website when it becomes available during the spring-1 term.
MGT 860: Generative AI for Managers
K. Sudhir
This course equips future managers to understand, evaluate, and lead the integration of Generative AI solutions within organizations. It offers a practical overview of foundational models such as GPT-4, Retrieval-Augmented Generation, and Agentic AI, and helps students use prompt engineering effectively to enhance workplace productivity. Students will explore use cases across customer-facing and organizational activities and learn frameworks for selecting and implementing AI solutions that maximize organizational value while addressing ethical considerations. The course features lectures, case studies, guest speakers, and a final project where students design and present a tailored Generative AI use case in a domain of interest.
MGT 575: Generative AI and Social Media
Tauhid Zaman, PhD, MEng
This course equips students with the tools and techniques of generative AI, focusing on its transformative applications for social media analysis and content creation. Emphasizing practical, hands-on learning, the curriculum trains students to leverage AI for analyzing, designing, and optimizing social media strategies. Key topics include:
1. Building social media apps for sentiment analysis, influencer identification, and audience segmentation
2. Harnessing generative AI to craft compelling text and visual content tailored to specific audiences
3. Automating and optimizing social media campaigns to boost engagement and impact
Students will work extensively with advanced AI tools such as ChatGPT, gaining experience in analysis, content generation, and app development. Course assignments and projects are grounded in real-world social media datasets, culminating in a group project where students will create a social media application powered by generative AI. As a fully project-based course, there are no exams. The assignments and projects are designed to be accessible to all students, regardless of prior coding experience, making it an ideal opportunity to develop expertise in applying generative AI to the dynamic field of social media.
MGT 819: Data Science
Vahideh Hosseinikhah Manshadi, MS, PhD
Cheap storage and computing power have enabled the gathering and analysis of an unprecedented amount of data on everything from genetic health risk profiles to real-time Wall Street diaper consumption. To take advantage of these massive datasets, new statistical tools and ideas have been developed and this body of knowledge is sometimes referred to as Data Science. The aim of this course is to provide a gentle tour of the business and industry applications of data science. Through the examples we will study, you will gain an intuitive understanding of the underlying data analytic techniques, which are often applicable to a wider class of problems. After completing this course you will have developed an appreciation for what opportunities exist for use of data within your organization.
MGT 899: Generative AI & Entrepreneurship
Anand Ranganathan, PhD
The advent of Generative AI has revolutionized industries by enabling the creation of new content, solutions, and business models. In this course, we will explore how entrepreneurs can harness the power of Generative AI to innovate, build scalable ventures, and drive competitive advantage. Through a blend of theory, hands-on work, and market analysis, students will learn how to leverage AI to develop innovative products and build AI-driven businesses. Our course delves into the foundational principles of constructing, deploying, and managing Generative AI systems in real-world scenarios. We'll explore widely used concepts, techniques, and frameworks, such as prompt engineering, working with external structured and unstructured datasets, knowledge extraction, agentic workflows, multimedia search and generation, code generation, and chatbots. Additionally, we'll delve into various aspects of LLMOps, with a particular emphasis on metrics for evaluating Generative AI systems. We’ll also explore certain strategies for enhancing performance and accuracy, such as fine-tuning and Graph-RAG. Finally, we’ll analyze regulatory and ethical considerations in using AI for business. In each lecture, we will go through the concepts, techniques, and frameworks, followed by an analysis of entrepreneurial opportunities in the space. This will include a review of some sample companies in the space. We will explore the opportunity for startups to disrupt different industries and technology spaces, while also examining the risk that startups are themselves disrupted by bigger players. As a result, by the end of the course, students will get a wide view of the landscape of Gen AI companies and the opportunities and challenges that exist.
AI in Medicine Student Interest Group Monthly Seminar
We aim to provide students with early exposure to AI in medicine, create awareness and address concerns about the "new" advanced technology that has already reached or will certainly reach all areas of medicine, and prepare strategies or create initiatives for students to get actively involved with AI in medicine through research projects or other means.
Examples of Activities: We host a monthly speaker series in conjunction with the Biomedical Informatics & Data Science Group. We also help students get involved with research, attend conferences, and connect with different groups at YSM and YNHH working on AI applications to medicine.
Yale University and Boehringer Ingelheim Biomedical Data Science Fellowship Program
The program has successfully offered ten competitive fellowships during its initial phase and will expand by recruiting three new fellows in 2025 and two in 2026. Post-doctoral researchers awarded a three-year fellowship will have access to Yale’s world-class faculty, cutting-edge research programs, various biomedical data repositories, and robust computational resources. Applicants with a strong biomedical data science background and research experience are invited to submit research proposals for consideration. If approved for a fellowship, they will be jointly mentored throughout the research process by Yale’s faculty and scientists from Boehringer Ingelheim. In addition to receiving research funding and mentorship, program fellows will be invited to participate in campus and corporate visits, networking events, and annual symposia.
The program leverages the expertise of faculty and research members representing diverse disciplines from computational biology, bioinformatics, biomedical informatics, statistics, biostatistics, computer science, mathematics, biomedical engineering, systems biology, precision medicine, and public health informatics.
NIH/NLM Biomedical Informatics and Data Science Training Program
The Biomedical Informatics and Data Science Training Program supports predoctoral and postdoctoral fellows. There are four general areas of emphasis in the training program:
1. Clinical informatics - focused in areas of clinical medicine and patient care
2. Translational bioinformatics - focused in areas of genomics and proteomics, broadly defined
3. Clinical research informatics - focused in areas of clinical trials and data sciences
4. Generative AI and large language models - focused on developing and applying AI technologies for healthcare applications, clinical documentation, and biomedical knowledge discovery
BIS 555: Machine Learning with Biomedical Data
Leying Guan, PhD
This course covers many popular topics in machine learning and statistics that are widely used for the exploration of biomedical data. Techniques covered include different linear prediction methods, random forest, boosting, neural networks, and some recent progress on model inference in high dimensions, as well as dimension reduction and clustering. Various examples using biomedical data—e.g., microarray gene expression data, single-cell RNA-Seq data—are provided. The emphasis is on the statistical aspects of different machine-learning methods and their applications to problems in computational biology.
EMD 538: Quantitative Methods for Infectious Disease Epidemiology
Virginia Pitzer, ScD
This course provides an overview of statistical and analytical methods that apply specifically to infectious diseases. The assumption of independent outcomes among individuals that underlies most traditional statistical methods often does not apply to infections that can be transmitted from person to person. Therefore, novel methods are often needed to address the unique challenges posed by infectious disease data. Topics include analysis of outbreak data, estimation of vaccine efficacy, time series methods, and Markov models. The course consists of lectures and computer labs in which students gain experience analyzing example problems using a flexible computer programming language (MATLAB).
BIS 568: Applied Artificial Intelligence in Healthcare
Wade Schulz, MD, PhD
Recent advances in machine learning (ML) offer tremendous promise to improve the care of patients. However, few ML applications are currently deployed within healthcare institutions and even fewer provide real value. This course is designed to empower students to overcome common pitfalls in bringing ML to the bedside and aims to provide a holistic approach to the complexities and nuances of ML in the healthcare space. The class focuses on key steps of model development and implementation centered on real-world applications. Students apply what they learn from the lectures, assignments, and readings to identify salient healthcare problems and tackle their solutions through end-to-end data engineering pipelines.
Students are expected to be proficient in programming (R, Python, or Julia preferred) and have some prior experience in machine learning, including data preprocessing (e.g., Python-Pandas, R-Tidyverse) and the development and validation of ML models (e.g., logistic regression, random forest, XGBoost). Otherwise, permission of the instructor is required.
BIS 565: The Role of Ethics and Equity in Data Science and AI
Bhramar Mukherjee, PhD
With the explosion of conversational generative artificial intelligence (AI) tools, such as ChatGPT, Gemini, Llama, Claude Sonnet, DeepSeek, and many others, innovations in data science are greatly influencing the day-to-day decision making of the public, including decisions regarding health, well-being, prevention, treatment, and care. This new course thematically belongs to the intersectional field of critical data studies, data science, and public health. Critical data studies is an interdisciplinary field focused on the social, cultural, ethical, and epistemological aspects of data. We first define some of the fundamental technical terms and tools in modern data science, machine learning (ML), and AI such as random forest, neural networks, transformer, auto-encoder, embeddings, stable diffusion process, large language models/foundation models, reinforcement learning, and prediction-powered inference. We then introduce the notion of data equity and data ethics in broad philosophical terms, held by a theoretical framework that appeals to a set of key underlying principles, drawing primarily from the extant computer science and statistical/epidemiological literature. We introduce these core concepts and associated evaluation metrics. The discussed concepts include: fairness, accountability, transparency, ethics, privacy, governance, reflexivity, reproducibility, generalizability, representativeness, causality, confounding bias, selection bias, and information bias. The course consists of lectures, homework, paper presentations, discussion sessions, and a final project that involves critical appraisal of an open-source AI/data science tool or prediction model in terms of the principles taught in the course.
Yale Summer Course in Public Health Modeling
Virginia Pitzer, ScD
The Yale School of Public Health’s Summer Course in Public Health Modeling is an exciting opportunity to learn to understand and implement the latest techniques from distinguished Yale faculty and network with an international group of public health researchers.
The course is designed to provide researchers, clinicians, industry professionals, and policymakers with the systems-based perspective and analytic tools they need to better understand and manage the complex forces that drive the health of populations. Course topics include prediction and control of infectious disease outbreaks such as COVID-19, optimal decision-making in healthcare delivery, and designing interventions to mitigate the effects of drug overdoses.
Course instructors are Yale faculty experts in epidemiology, biostatistics, health policy, and health care operations who have been at the forefront of informing model-guided responses to COVID-19 and other disease threats locally, nationally, and around the world.
Quantitative Methods for Infectious Disease - Time Series Analysis
Daniel Weinberger, PhD
These are guest lecture materials from Dr. Weinberger for EMD 538, a course focused on specialized statistical methods for infectious disease data that account for person-to-person transmission patterns. The course covers outbreak analysis, vaccine efficacy estimation, time series methods, and Markov models through lectures and hands-on MATLAB labs. Published on November 13th, 2023.
Ordinary Differential Equations (ODE) in R
Melanie H Chitwood, PhD, Jiye Kwon, MPH
This tutorial teaches how to create and implement Susceptible-Infectious (SI) epidemiological models using Ordinary Differential Equations (ODEs) in R, including writing a custom model function for numerical integration. Created for the Yale School of Public Health's Summer Course in Public Health Modeling. Published June 11th, 2024.
Course Materials for Bootstrap Learning Python with R
Michael Kane, PhD
This repository houses all of the materials for "Bootstrap Learning Python with R," a day-long course presented at the 2019 Joint Statistical Meetings, put on by the American Statistical Association.
BIS 557: How to Create an R Package
Michael Kane, PhD
The goal of these lecture materials is to demonstrate the process of creating an R package for the Yale Biostatistics BIS 557 class. Published on September 9th, 2020.
PUBH 580 Seminar for Modeling in Public Health
A. David Paltiel, MBA, PhD, Virginia Pitzer, ScD
This yearlong, monthly seminar is targeted most specifically to students in the Public Health Modeling Concentration but open to all interested members of the Yale community. The seminar features talks by faculty from across Yale University doing modeling-related research, as well as invited speakers from other universities and public health agencies. The objectives are to offer students the opportunity to witness the scope and range of questions in public health policy and practice that may be addressed, understood, and informed using model-based approaches; appreciate the breadth of public health modeling research being conducted around the University and beyond; explore possible collaborations/relationships with other scholars and professionals; review, critique, and evaluate model-based public health research in a structured environment; and form their own opinions regarding the applicability, relevance, and responsible use of modeling methods. Two terms of this no-credit seminar are required of students in the Public Health Modeling Concentration. For each class, one or two readings are circulated/posted on the course website prior to the talk. Students are encouraged to read the articles and articulate questions for the speaker.
Getting started with Git in RStudio
Iris Artin, MMSc, MPH, MEng
An introductory tutorial on using Git within the RStudio Integrated Development Environment. Published on October 26th, 2019.
Introduction to R Lab
Daniel Weinberger, PhD
This tutorial teaches how to manipulate and plot data in R, using qPCR data saved as an Excel file. Published on October 6th, 2021.
Teaching Workshops and Journal Clubs for Faculty
Mike Honsberger, PhD
Find a variety of lecture materials, workshops, and panel discussions for Yale School of Public Health (YSPH) instructors, covering topics such as AI use in the classroom, creating inclusive environments, gamification, AI-proofing courses, and digital accessibility training.
AI at Yale Symposium
Please save the date for From Innovation to Impact: AI at Yale, the university’s second campus-wide interdisciplinary AI symposium, on April 28, 2026, in Kline Tower.
Organized by the Office of the Provost, this event invites faculty, students, and staff to share innovations, applications, impacts, and critical perspectives related to machine learning and generative AI. Students, faculty, and staff will showcase their discoveries and insights through lightning talks, poster sessions, panels, demonstrations, performances, and more.
Learn About AI
Here you will find videos to help get you started using AI chatbots, along with explanations of key terms you might encounter when interacting with AI tools. These resources include definitions and practical examples to enhance your understanding. Additionally, you’ll find learning courses designed to deepen your knowledge and skills in using AI effectively.
Virtual AI Brown Bag Series
Nicholas Warren
DISSC offers a monthly virtual brown bag during the academic year for staff in research support roles to explore and discuss use cases of machine learning and LLMs. Other members of the Yale community are welcome to attend. Contact molly.aunger@yale.edu to join the call.