Links
Downloads & Installations for the Workshop
Install Python (❗3.10+ recommended)
Install Positron
Download environment txt file
Download Follow Along File
Download Toy Dataset
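After installing, it can help to confirm that the interpreter you will use in the workshop meets the recommended version. The sketch below uses only the standard library; the helper name `meets_recommendation` and the `RECOMMENDED` constant are made up for this illustration.

```python
import sys

# Minimum (major, minor) version recommended for the workshop.
RECOMMENDED = (3, 10)


def meets_recommendation(version_info=sys.version_info, minimum=RECOMMENDED):
    """Return True if the given version is at least the recommended one."""
    # Compare only the (major, minor) part, e.g. (3, 12) >= (3, 10).
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_recommendation():
        print(f"Python {current} meets the 3.10+ recommendation.")
    else:
        print(f"Python {current} is older than the recommended 3.10.")
```

Run it with the same Python that Positron is configured to use, since a machine can have several interpreters installed.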
Additional Resources for Learning Python
General Data Analysis
Python for Data Analysis Open Edition is a classic resource for data analysis (including data cleaning and data visualization). It’s a great place to get started and also has a brief introduction to modeling libraries like statsmodels and scikit-learn.
Python Rgonomics. An in-depth article on Python tools that are genuinely “Pythonic” while staying consistent with the workflow and style of the best that R has to offer. We will cover some of them, like great_tables and plotnine, in later sessions. Stay tuned!
Data Science Tools
Official Positron Guide. A comprehensive reference for the Positron IDE, with the tutorials you need to get started and to learn more about the IDE. Some relevant guides:
A Quick Tour of Positron. A quick way to learn how to navigate the Positron IDE.
Git and GitHub learning resources by GitHub Docs with links to free online courses and tutorials.
Machine Learning
Scikit-learn has some good tutorials and examples for machine learning and modeling on its official website.
PyTorch also has an introductory tutorial and some other more in-depth tutorials including creating NLP models and visualizing models with TensorBoard.
For hyperparameter tuning, optuna and optuna-dashboard are pretty easy to use and work well with PyTorch.
Using R and Python together
Reticulate lets you run Python code from within R. rpy2 lets you run R code from Python. If you install rpy2, you can load its IPython extension in a Jupyter notebook with %load_ext rpy2.ipython and then use the %%R cell magic to run R code in a cell.
For working with large data
Polars is an alternative to pandas for working with large data. Its lazy API can process some datasets that are larger than available RAM. Not everything that works with pandas will automatically work with Polars, but recent versions of scikit-learn (1.4 and later) can return Polars DataFrames from transformers via set_output(transform="polars").