Relevant Coursework

The following is a select list of courses I’ve taken or am currently taking as a data science master’s student at Columbia University:

  • The course covers basic statistical principles of supervised machine learning, as well as some common algorithmic paradigms. Topics include:

    • maximum likelihood estimation

    • linear regression, least squares, geometric view

    • ridge regression, probabilistic views of linear regression

    • bias-variance, Bayes rule, maximum a posteriori

    • Bayesian linear regression

    • sparsity, subset selection for linear regression

    • nearest neighbor classification, Bayes classifiers

    • linear classifiers, perceptron

    • logistic regression, Laplace approximation

    • kernel methods, Gaussian processes

    • maximum margin, support vector machines

    • trees, random forests

    • boosting

    • clustering, k-means

    • EM algorithm, missing data

    • mixtures of Gaussians

    • matrix factorization

    • non-negative matrix factorization

    • latent factor models, PCA and variations

    • Markov models, hidden Markov models

    • continuous state-space models

    • association analysis

    • model selection

  • This class offers a hands-on approach to machine learning and data science. The class discusses the application of machine learning methods like SVMs, random forests, gradient boosting, and neural networks to real-world datasets, including data preparation, model selection, and evaluation. It relies entirely on available open-source implementations in scikit-learn and TensorFlow. Apart from applying models, we discuss software development tools and practices relevant to productionizing machine learning models.

  • In this course, we systematically cover fundamentals of statistical inference and modeling, with special attention to models and methods that address practical data issues. The course is focused on inference and modeling approaches such as the EM algorithm, MCMC methods and Bayesian modeling, linear regression models, generalized linear regression models, nonparametric regressions, and statistical computing.

    In addition, the course provides an introduction to statistical methods and modeling that address various practical issues such as design of experiments, analysis of time-dependent data, and missing values.

    Throughout the course, real-data examples are used in lecture discussion and homework problems. This course lays the statistical foundation for inference and modeling using data, preparing MS in Data Science students for other courses in machine learning, data mining, and visualization.

  • The goal of this class is to give data scientists and engineers who work with big data a better understanding of how the systems they will be using are built. It will also give them a better understanding of the real-world performance, availability, and scalability challenges of using and deploying these systems at scale. The course covers foundational ideas in designing these systems, while focusing on specific popular systems that students are likely to encounter at work or when doing research.

  • Topics include:

    • Understanding Data: Perception, Continuous variables, Discrete variables, Dependency relationships, Multivariate categorical variables, Temporal data, Spatial data

    • Context: Data Science Pipeline: Collect, Import, Clean, Transform, Visualize, Model, Communicate

    • Tools: R (base graphics / ggplot2), Plotly, htmlwidgets, Shiny, D3, Git / GitHub, R Markdown

  • This course is designed as an introduction to elements that constitute the skill set of a data scientist. The course provides a foundation of methodology with applied examples to analyze large engineering, business, and social data for data science problems.

    Topics include:

    • Python Data Science Tools

    • Data Cleaning, Exploration and Visualization

    • Hypothesis Testing and Statistical Modeling

    • Classification, Regression and Clustering

    • Dimensionality Reduction and Topic Modeling

    • Model Evaluation and Model Selection

    • Feature Engineering and Feature Selection

    • Natural Language Processing

    • Data processing and delivery using ETL and APIs

    • Dealing with Time Series Data

  • This course focuses on the design and analysis of algorithms relevant to data science and machine learning. We will discuss general design paradigms as well as specific problems. Topics include:

    • Asymptotics, sorting, searching, fast integer and matrix multiplication

    • Graph algorithms (BFS, DFS, shortest paths)

    • Data compression

    • Dynamic programming

    • Network flows

    • Linear programming

    • Reductions and NP-completeness

    • Approximation algorithms

    • Hashing, Bloom filters, count-min sketch

    • The web graph, hubs & authorities, PageRank

    • SVD and PCA

    • Basic streaming algorithms

  • This course is a self-contained introduction to probability and statistics with a focus on data science. Topics include:

    • probability theory

    • probability distributions

    • simulations

    • parameter estimation

    • hypothesis testing

    • simple regression

 

The following is a select list of courses I took as an electrical engineering undergraduate student at Boston University:

  • This course aims to introduce students to software design, programming techniques, data structures, and software engineering principles. The course is structured bottom up, beginning with basic hardware followed by an understanding of the machine language that controls the hardware and the assembly language that organizes that control. It then proceeds through fundamental elements of functional programming languages, using C as the case example, and continues with the principles of object-oriented programming, as principally embodied in C++ but also in its daughter languages Java, C#, and Objective-C. The course concludes with an introduction to elementary data structures and algorithmic analysis.

  • This course discusses advanced structures and techniques for digital signal processing and their properties in relation to application requirements. Topics include:

    • real-time, low-bandwidth, and low-power operation

    • optimal FIR filter design

    • time-dependent Fourier transform and filterbanks

    • Hilbert transform relations

    • cepstral analysis and deconvolution

    • parametric signal modeling

    • multidimensional signal processing

    • multirate signal processing

  • An introduction to programming concepts and modern computational environments used to solve engineering problems. Topics include:

    • procedural programming concepts (input/output, selection, looping, functions, data structures, pointers, memory management)

    • object-oriented programming concepts and terminology, and event-driven programming

    • programming style

    • debugging, top-down design, and modular code

    • command line interface and high-level language

  • An introductory course to the principles of engineering design, intended to give students a basic understanding of the process of converting a product from concept through design and deployment. Students work in multi-disciplinary teams with time and budget constraints on externally sponsored design projects. Web-based lectures will cover topics concurrent with specific phases of the projects.

  • Topics include:

    • detailed analysis of differential amplifiers

    • design and principles of operational amplifiers, including multistage circuit structure, BJT, MOSFET, CMOS, and BiCMOS

    • active filters and oscillators

    • negative and positive feedback

    • power devices

  • Introduction to hardware building blocks used in digital computers. Topics include:

    • Boolean algebra, combinatorial and sequential circuits: analysis and design

    • adders, multipliers, decoders, encoders, multiplexors

    • programmable logic devices: read-only memory, programmable arrays, Verilog

    • counters and registers

  • Topics include:

    • time-varying electric and magnetic fields

    • Maxwell's equations

    • electromagnetic waves

    • propagation, reflection, and transmission

    • remote sensing applications

    • radio frequency coaxial cables, microwave waveguides, and optical fibers

    • microwave sources and resonators

    • antennas and radiation

    • radio links, radar, and wireless communication systems

    • electromagnetic effects in high-speed digital systems

  • This course presents a detailed perspective of electric power systems, from generation, transmission, and storage to distribution to end users. Significant emphasis is placed on methodologies for reliable and efficient transmission and distribution of power over the grid, including the challenges of adapting to renewable resources such as photovoltaics and wind.