Bioinformatics, Biostatistics and Machine Learning Techniques for Biotech Applications

FIRST CALL - Industry Short Training Course

Contact person: Dr. Alexander Lukyanov, Scientific Director, Artificial Intelligence and Machine Learning, Uncertainty Analysis and Quantification, BISC Global

The course Bioinformatics, Biostatistics and Machine Learning Techniques for Biotech Applications is offered to bioengineers, bioinformaticians, computational biologists and researchers who want to refresh their knowledge of, or enter, fields such as bioinformatics, biostatistics and machine learning. The course consists of lectures introducing topics such as RNA-seq analysis, single-cell omics, analysis of variance, crossover designs, regression modeling, and classification modeling. Each topic has a hands-on element, delivered to attendees through a compartmentalized virtual cloud environment (AWS) containing all the tools and data needed for the demonstrations.

Target Groups: Bioengineers, bioinformaticians, computational biologists, researchers, and engineers interested in the basics of processing biological data using bioinformatic, statistical and machine learning methods. Participants must bring their own laptops for the three-day training course.

Training course materials: A course website containing course materials and sample code repositories will be set up prior to the start of the course.

Topics covered:

NGS data analysis (Day 1)

  • RNA-seq analysis (from raw sequencing data to QC, to normalized gene expression table, group comparison, pathway analysis, interactive reporting and visualization of the results).
  • Small RNA-seq analysis (e.g. miRNAs, circRNAs, piRNAs, eRNAs).
  • Single-cell omics analysis (10x Genomics pipeline, clustering, trajectory analysis, spatial transcriptomics, etc.).
  • ChIP-seq / ATAC-seq analysis (e.g. QC, peak calling, differential binding, UCSC Genome Browser visualization).
  • Variant calling from genotyping/WGS/WES data.
  • Other next-generation sequencing (NGS) data analysis (Methyl-seq, CAGE, ChIA-PET, Hi-C, etc.).
  • Data management (storage/backup, meta-table management, GEO/dbGaP submission).
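As a small illustration of the "normalized gene expression table" step named in the RNA-seq topic, the sketch below scales raw read counts to counts per million (CPM). It is only an example under stated assumptions: the gene names and counts are made up, the function name counts_per_million is hypothetical, and the course itself may use a different normalization method.

```python
def counts_per_million(raw_counts):
    """Scale raw read counts so that each sample (column) sums to one million.

    raw_counts maps a gene name to a list of counts, one entry per sample.
    Assumes every sample has at least one read (non-zero library size).
    """
    n_samples = len(next(iter(raw_counts.values())))
    # Library size = total number of reads counted in each sample.
    library_sizes = [
        sum(row[j] for row in raw_counts.values()) for j in range(n_samples)
    ]
    return {
        gene: [1e6 * count / library_sizes[j] for j, count in enumerate(row)]
        for gene, row in raw_counts.items()
    }


# Made-up counts for three genes in two samples (illustration only).
raw_counts = {
    "geneA": [10, 40],
    "geneB": [30, 40],
    "geneC": [60, 120],
}
cpm = counts_per_million(raw_counts)
```

CPM corrects only for sequencing depth, so values become comparable across samples with different total read counts; it does not account for gene length or composition effects.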

Introduction to Biostatistics of Clinical Trials (Day 2)

  • Choosing a statistical test when comparing distributions (t-test for one sample, t-test for unrelated populations, t-test for related populations, Wilcoxon test, chi-square test, Mann-Whitney test, F-test, McNemar's test, Kruskal-Wallis test, Cochran's Q test, Friedman test).
  • Analysis of Variance (one-way ANOVA, repeated measures ANOVA).
  • Power Analysis (Effect Size, Sample Size, Significance, Statistical Power).
  • Crossover Designs (planning a clinical trial, A/B hypothesis testing, first-order carryover, sequence, period, washout, aliased effects; distinguishing situations where a crossover design would or would not be advantageous; distinguishing population, average and individual bioequivalence; relating the different types of bioequivalence to prescribability and switchability).
  • Understand the pros and cons of crossover designs.
  • Be able to evaluate a crossover design.
  • Apply principles of data science to the analysis of clinical trials.
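To show how the four quantities in the Power Analysis topic (effect size, sample size, significance, statistical power) relate, here is a minimal sketch that estimates the sample size per group for a two-sided comparison of two means. It uses the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) / d)^2, where d is the standardized effect size; the function name and the example values are assumptions for illustration, not course material.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    Uses the normal approximation; an exact t-test based calculation
    typically gives a slightly larger number.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # significance level
    z_power = standard_normal.inv_cdf(power)          # desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Example: medium effect (d = 0.5), 5% significance, 80% power.
n_medium = sample_size_per_group(0.5)
# A smaller effect needs many more participants to reach the same power.
n_small = sample_size_per_group(0.2)
```

Fixing any three of the four quantities determines the fourth, which is why the sample size grows quickly as the expected effect shrinks.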

Machine Learning (Day 3)

  • Regression modeling with linear / non-linear regression.
  • Classification modeling with deep neural network (DNN) classifiers, naive Bayes, k-nearest neighbors, and support vector machines.
  • Decision tree models with random forests and gradient boosting algorithms such as XGBoost, LightGBM and CatBoost.
  • Anomaly detection modeling with isolation forests, PCA, and k-means clustering.
  • Recommendation systems and time series prediction models.
  • Model selection, evaluation, and interpretation concepts such as regularization, dimensionality reduction, and cross-validation.
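The cross-validation concept listed above can be sketched in a few lines. The example below splits sample indices into k folds so that each sample is used for testing exactly once; the function name k_fold_indices and the choice of 10 samples and 5 folds are assumptions for illustration, and in practice a library routine would normally be used instead.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Example: 10 samples, 5 folds. A model would be fitted on train_idx and
# scored on test_idx in each round, and the 5 scores averaged.
splits = list(k_fold_indices(10, 5))
```

Averaging the score over all folds gives a less optimistic estimate of model performance than evaluating on the data used for fitting.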


Takeaway: You will learn about the most in-demand bioinformatics, biostatistics and machine learning techniques you need to succeed as a bioinformatician or data scientist in the biotech industry. For each pipeline and model, you will learn how it works conceptually first, then apply it to a particular industrial application, and finally learn to analyze and visualize the results.

Registration: Secure your spot now. Seats are limited, and we accept applicants on a first-come, first-served basis.

Would you like to attend this training?

Fill out this form with your details so we can keep you updated on the dates and locations of this training.