Time | Session | Instructor |
---|---|---|
9:30 AM - 10:30 AM | Overview of Big Data in Cancer Studies | Md. Jubayer Hossain |
10:30 AM - 10:35 AM | Break | - |
10:35 AM - 11:05 AM | Fundamentals of R | Muhibullah Shahjahan |
11:05 AM - 11:10 AM | Break | - |
11:10 AM - 12:00 PM | Data Wrangling with R | Sajjad Hossen |
Big Data in Cancer Research with R (Cohort 01)
🧑 Lead Instructor: Md. Jubayer Hossain
🗓 View the Schedule and Curriculum on Google Sheets
🏨 Medium: Zoom
💥 Enroll with Google Forms
📝 To join the private Telegram group for the course, follow the instructions in the email you received after registration.
Overview
This intensive, four-day workshop is designed to provide participants with a comprehensive understanding of big data analytics in cancer research using R. Attendees will gain hands-on experience in processing, visualizing, and analyzing large-scale cancer genomics datasets, specifically focusing on data from The Cancer Genome Atlas (TCGA). The workshop will cover essential skills such as data wrangling, visualization, mutation analysis, and machine learning techniques for biomarker discovery.
What You’ll Learn
By participating in this workshop, you will gain hands-on experience and comprehensive knowledge in the following areas:
- Overview of Big Data in Cancer Studies: Understand the significance of big data in cancer research, including sources of data and current challenges in the field.
- Fundamentals of R Programming: Learn the basics of R programming and its application in data analysis for cancer research.
- Data Wrangling with R: Master techniques to clean, preprocess, and manipulate cancer data to prepare it for analysis.
- Data Visualization with R: Gain skills in visualizing complex cancer datasets using various types of graphs and plots.
- Strategies for Working with Big Data in R: Learn strategies to efficiently handle large cancer datasets, optimizing performance in R.
- Downloading Data from GDC Portal using TCGAbiolinks: Learn how to access and download cancer genomics data from the GDC portal using the TCGAbiolinks package.
- Somatic Mutation Analysis with maftools: Understand how to analyze somatic mutations in cancer using the maftools package in R.
- Differential Expression Analysis using DESeq2: Explore RNA-Seq data analysis and differential expression in cancer research using the DESeq2 package.
- Dimensionality Reduction and Clustering Analysis: Learn how to reduce the dimensionality of large datasets and apply clustering techniques to find patterns in cancer data.
- Survival Analysis with Kaplan-Meier Curve: Learn how to perform survival analysis and visualize results using Kaplan-Meier curves for cancer patient data.
- TCGA Biomarker Identification using Machine Learning: Apply machine learning techniques to identify potential biomarkers for cancer using TCGA datasets.
- Project Integration: Apply all learned skills to a final project, integrating big data tools for cancer genomics, from data wrangling to biomarker discovery.
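To give a flavor of the survival analysis topic listed above, here is a minimal Kaplan-Meier sketch. It is not taken from the course materials: it assumes the `survival` package (included with standard R installations) and uses that package's bundled `lung` example dataset as a stand-in for TCGA clinical data.

```r
# Kaplan-Meier sketch with the `survival` package.
# Assumption: `lung` (bundled with `survival`) stands in for TCGA clinical data.
library(survival)

# Columns used from `lung`:
#   time   = survival time in days
#   status = 1 (censored) or 2 (dead)
#   sex    = 1 (male) or 2 (female)
km_fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Print group sizes, number of events, and median survival per group.
print(km_fit)

# Draw one Kaplan-Meier curve per group.
plot(km_fit,
     col  = c("steelblue", "firebrick"),
     xlab = "Days",
     ylab = "Survival probability")
legend("topright",
       legend = c("Male", "Female"),
       col    = c("steelblue", "firebrick"),
       lty    = 1)

# Log-rank test for a difference between the two curves.
lr_test <- survdiff(Surv(time, status) ~ sex, data = lung)
print(lr_test)
```

With real TCGA data, the same calls would apply once the clinical table has a follow-up time column, an event indicator, and a grouping variable.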
Why R?
- R is a programming and statistical language.
- R is used for data analysis and visualization.
- R is simple and easy to learn, read and write.
- R is an example of FLOSS (Free/Libre and Open Source Software): anyone can freely distribute copies of it, read its source code, and modify it.
Recording of classes
Class lectures will be recorded automatically to the cloud. The links will be posted to CHIRAL Classes when they are available.
Is this course for me?
If your answer to any of the following questions is “yes”, then this is the right workshop for you.
- Do you make summary tables in R (data, survey data, regression models, time-to-event data, adverse event reports)?
- Do you want your workflow to be reproducible?
- Are you often frustrated with the immense amount of code required to create great-looking tables in R?
The workshop is designed for those with some experience in R. You are expected to be able to perform basic data manipulation. Experience with the tidyverse and the %>% operator is a plus, but not required.
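As a rough guide to the level of data manipulation assumed, the sketch below chains a few `dplyr` verbs with `%>%`. It assumes the `dplyr` package is installed; the `samples` data frame and its column names are hypothetical and made up for illustration.

```r
# Basic data manipulation with dplyr and the %>% pipe.
# Assumption: `samples` is a made-up toy table, not course data.
library(dplyr)

samples <- data.frame(
  sample_id  = c("S1", "S2", "S3", "S4", "S5", "S6"),
  tumor_type = c("BRCA", "BRCA", "LUAD", "LUAD", "LUAD", "COAD"),
  age        = c(54, 61, 67, 72, 58, 49)
)

# Keep patients aged 50 or older, then count samples and
# average the age within each tumor type, largest group first.
summary_tbl <- samples %>%
  filter(age >= 50) %>%
  group_by(tumor_type) %>%
  summarise(n = n(), mean_age = mean(age)) %>%
  arrange(desc(n))

print(summary_tbl)
```

If you can read this pipeline and predict roughly what it returns, you have the background the workshop expects.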
Zoom + Working Virtually
- The Zoom link will be emailed to students.
- Class sessions will be recorded and posted later.
- We will have lectures as well as breakout room sessions to work on labs.
- Closed captioning is available as an option.
Instructors
Md. Jubayer Hossain
Founder & Executive Director, CHIRAL
Md. Jubayer Hossain is the Founder and Executive Director of CHIRAL Bangladesh, a non-profit organization dedicated to health research to improve lives in Bangladesh. He aspires to maximize the quality of life of people around him by working at the intersection of education, technology, and biomedical research. Details of his research and teaching activities can be found on his website.
Muhibullah Shahjahan
Research Assistant, Big Bioinformatics Lab, CHIRAL
Muhibullah Shahjahan is a researcher specializing in Cancer Bioinformatics and Machine Learning. As a Research Assistant at the Big Bioinformatics Lab of CHIRAL Bangladesh, he focuses on applying computational tools to address biological research questions. With experience in statistical analysis and R programming, he contributes to advancing biomedical data science education and supports researchers in analyzing and interpreting complex datasets.
Muhammad Mohtasim Billah, Instructor
Research Assistant, Big Bioinformatics Lab, CHIRAL
Muhammad Mohtasim Billah is a researcher with a focus on Cancer Bioinformatics. Affiliated with the Big Bioinformatics Lab at CHIRAL Bangladesh, he applies computational approaches to study cancer biology and support data-driven research. His work involves statistical analysis and bioinformatics techniques, contributing to the understanding and management of complex biological data.
Sajjad Hossen, Instructor
Research Assistant, Big Bioinformatics Lab, CHIRAL
Sajjad Hossen is a researcher with expertise in Cancer Bioinformatics and serves as an instructor at the Big Bioinformatics Lab, CHIRAL Bangladesh. In addition to his research endeavors, he is responsible for managing and coordinating the lab’s training programs, fostering the dissemination of advanced bioinformatics knowledge. His work focuses on computational analysis and the interpretation of complex biological datasets, contributing to capacity-building efforts and the advancement of biomedical research.
Muntasim Fuad, Instructor
Research Assistant, Big Bioinformatics Lab, CHIRAL
Muntasim Fuad is a researcher specializing in Cancer Bioinformatics. As a Research Assistant and Instructor at the Big Bioinformatics Lab, CHIRAL Bangladesh, he applies computational techniques to explore complex biological data and advance cancer research. His role involves supporting research initiatives and contributing to the academic development of trainees in bioinformatics and data analysis.
Workshop Timeline
Time | Session | Instructor |
---|---|---|
9:30 AM - 10:30 AM | Data Visualization with R | Muntasim Fuad |
10:30 AM - 10:35 AM | Break | - |
10:35 AM - 11:05 AM | Strategies for Working with Big Data in R | Md. Jubayer Hossain |
11:05 AM - 11:10 AM | Break | - |
11:10 AM - 12:00 PM | Download Data from GDC Portal using TCGAbiolinks | Md. Jubayer Hossain |
Time | Session | Instructor |
---|---|---|
9:30 AM - 10:30 AM | Somatic Mutation Analysis using maftools | Md. Jubayer Hossain |
10:30 AM - 10:35 AM | Break | - |
10:35 AM - 11:05 AM | Differential Expression Analysis using DESeq2 | Md. Jubayer Hossain |
11:05 AM - 11:10 AM | Break | - |
11:10 AM - 12:00 PM | Dimensionality Reduction and Clustering Analysis | Md. Jubayer Hossain |
Time | Session | Instructor |
---|---|---|
9:30 AM - 10:30 AM | Survival Analysis by Kaplan-Meier Curve | Md. Jubayer Hossain |
10:30 AM - 10:35 AM | Break | - |
10:35 AM - 11:05 AM | TCGA Biomarkers Identification using Machine Learning | Md. Jubayer Hossain |
11:05 AM - 11:10 AM | Break | - |
11:10 AM - 12:00 PM | Project: Integrating Big Data Skills for Cancer Genomics | Md. Jubayer Hossain |