Introduction to Python Workshops

Author

Python Group

Welcome🙌

This workshop series is geared toward R and SAS users who are interested in exploring Python, a powerful, versatile programming language that excels in areas such as machine learning, large-scale data processing, and broader data science applications.

Our goal is to help you build a solid foundation in Python and gain skills that you can integrate into your own work. This series will guide you through building a reproducible workflow with Python virtual environments, understanding basic Python data structures, manipulating data with pandas, and, finally, applying machine learning and deep learning methods using libraries such as scikit-learn and PyTorch.

All workshop materials, including pre-session handouts, coding demos, assignments, and session slides, will be uploaded here. Be sure to check out the FAQ tab for common questions. If you have any questions, suggestions, or ideas for future sessions, please feel free to share them on our GitHub Discussions page. We value your feedback and aim to continually improve the workshops. We look forward to learning with you!

Workshop Information

Time and Location: Tuesdays (3/3, 3/10, 3/31) and Thursday (3/26) | 3:00 – 4:30 pm | 708 Broadway, Room 801

Contact and Office Hours:

Session | Date | Outline | Materials

Session 1:

Python Installation and Reproducible Workflow (Yufei)

Tuesday (3/3)

3-4:30 pm

Python, Positron IDE, GitHub, and virtual environment overview

  • Create a GitHub repository for the workshop series
  • Set up a venv virtual environment from the command line
  • Navigate Positron to work on Python Projects
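
To give a sense of what the virtual-environment step does, here is a minimal sketch using Python's standard-library `venv` module, the same module that `python -m venv .venv` runs from the command line. The `.venv` folder name and the temporary project folder are illustrative assumptions, not the layout used in the session.

```python
import tempfile
import venv
from pathlib import Path

# Hypothetical project folder for this demo; in practice this would be
# your own workshop repository.
project_dir = Path(tempfile.mkdtemp(prefix="workshop_demo_"))
env_dir = project_dir / ".venv"

# Similar in spirit to running `python -m venv .venv` inside the project folder.
# with_pip=False keeps the demo fast; the command-line default also installs pip.
venv.create(env_dir, with_pip=False)

# Every virtual environment contains a pyvenv.cfg file that records
# which base interpreter it was created from.
config_file = env_dir / "pyvenv.cfg"
print(config_file.exists())
print(config_file.read_text())
```

In everyday use you would create the environment from the command line, activate it (`source .venv/bin/activate` on macOS/Linux, `.venv\Scripts\activate` on Windows), and then install packages with `pip install -r requirements.txt`.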

Installation guide html

Session Slides html

Session 2:

Intro to Python Data Structures (Patrick)

Tuesday (3/10)

3-4:30 pm

  • Data container types: Strings, lists, tuples, dicts, etc.
  • Handling missingness
  • Methods vs. functions
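
As a small preview of these three topics, the sketch below uses only built-in Python. The variable names and values are made up for illustration and are not taken from the session materials.

```python
import math

# Core container types
name = "Ada"                        # str: immutable text
scores = [88, 92, 79]               # list: ordered and mutable
point = (3.0, 4.0)                  # tuple: ordered and immutable
patient = {"id": 1, "age": 42}      # dict: key-value mapping

# Missingness: None is Python's general "no value" marker, while
# float("nan") is a missing value for floating-point numbers.
values = [1.5, None, float("nan"), 4.0]
observed = [v for v in values if v is not None and not math.isnan(v)]

# Function vs. method
n = len(scores)         # function: the object is passed as an argument
scores.append(95)       # method: called on the object, changes the list in place
upper = name.upper()    # method: returns a new string, name itself is unchanged

print(observed, n, scores, upper)
```

Note that `None` and `nan` have to be checked in different ways (`is not None` versus `math.isnan`), which is one reason missing data deserves its own topic.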

Pre-reading docx

Session Slides html

Session 3:

Intro to Pandas (Patrick)

Thursday (3/26)

3-4:30 pm

  • pandas for data manipulation and cleaning
  • Data management
  • great_tables for gtsummary-style tables
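
The following is a minimal sketch of the kind of pandas cleaning and manipulation listed above, assuming pandas and NumPy are installed. The toy data frame and its column names are invented for this example and are not the workshop's dataset; great_tables is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd

# Made-up toy data with some missing values
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "age": [34, np.nan, 51, 47],
    "weight_kg": [70.2, 81.5, np.nan, 64.0],
})

# Cleaning: count missing values per column, then drop rows with a missing age
missing_counts = df.isna().sum()
clean = df.dropna(subset=["age"])

# Manipulation: derive a new column, then summarize age by group
clean = clean.assign(age_decade=clean["age"] // 10 * 10)
summary = clean.groupby("group")["age"].mean()

print(missing_counts)
print(summary)
```

R users may recognize `assign` and `groupby(...).mean()` as rough counterparts to dplyr's `mutate` and `group_by() |> summarise()`.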

Session Slides html

Session 4:

Object-Oriented Programming & Intro to ML Libraries (Dani)

Tuesday (3/31)

3-4:30 pm

Part A: Introduction to object-oriented programming (OOP)

  • Recap: classes, methods, and attributes

  • Understanding classes (base, derived, mixins)

  • Creating classes and instances

  • Understanding inheritance
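
The Part A ideas can be sketched in a few lines of plain Python. The class names below (`Model`, `SummaryMixin`, `MeanModel`) are hypothetical and chosen only to show a base class, a mixin, and a derived class side by side.

```python
class Model:
    """Base class: holds attributes and methods shared by all models."""

    def __init__(self, name):
        self.name = name            # instance attribute
        self.is_fitted = False

    def fit(self, data):
        self.is_fitted = True
        return self                 # returning self allows method chaining

    def describe(self):
        return f"{self.name} (fitted={self.is_fitted})"


class SummaryMixin:
    """Mixin: adds one small behavior and is not meant to be used on its own."""

    def summary(self):
        return f"Summary of {self.describe()}"


class MeanModel(SummaryMixin, Model):
    """Derived class: inherits from Model and overrides fit()."""

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        return super().fit(data)    # reuse the base-class behavior


# Creating an instance and calling its methods
model = MeanModel("baseline").fit([2, 4, 6])
print(model.summary())
```

Having `fit` return the object itself mirrors a convention used by scikit-learn estimators, which is part of why OOP is a helpful lead-in to the ML libraries.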

Part B: PyTorch for machine learning

  • Building a neural network with PyTorch

  • Plotting with Python (plotnine, seaborn)
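
For the neural-network topic, the sketch below shows what a very small PyTorch model and training loop can look like, assuming PyTorch is installed. The data, layer sizes, learning rate, and number of steps are arbitrary choices made for illustration, not the settings used in the session.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random data and initial weights repeatable

# Made-up regression data: 64 samples, 3 features, target = sum of the features
X = torch.randn(64, 3)
y = X.sum(dim=1, keepdim=True)

# A small feed-forward network assembled from standard layers
model = nn.Sequential(
    nn.Linear(3, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

initial_loss = loss_fn(model(X), y).item()
for _ in range(200):
    optimizer.zero_grad()           # clear gradients from the previous step
    loss = loss_fn(model(X), y)     # forward pass and loss
    loss.backward()                 # backpropagation
    optimizer.step()                # parameter update
final_loss = loss_fn(model(X), y).item()

print(initial_loss, final_loss)
```

`nn.Sequential`, `nn.Linear`, and the optimizer are all classes, so the objects created here are instances in the sense covered in Part A.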

Pre-read: PyTorch and Plotting Primer

Session Slides html

Other Resources

🔗Quick Installation & Downloads

Python Download

Install Python (❗3.10+ recommended)

Positron Download

Install Positron

File

Download requirements.txt file

File

Download Follow Along File

File

Download Toy Dataset

Useful External Tutorials