DSCI-644: Software Engineering for Data Science

Spring 2025

Course Overview

This course focuses on the software engineering challenges of building scalable, highly available big data software systems. It covers the software design and development methodologies and the available technologies that address the major software aspects of a big data system: software architectures, application design patterns, different types of data models and data management, and deployment architectures.

Course Projects

Project 1: Software Development Life Cycle

Project 2: Advanced Batch Processing ETL Pipeline

Project 3: Database Normalization and Optimization

Project 4: Stream Processing ETL Pipeline