Alex Lin


Data Scientist

I architect data solutions that bridge technical innovation with business value.

About

I'm a data scientist passionate about data pipelining and creative solutions. I recently earned my M.S. in Analytics from NC State’s Institute for Advanced Analytics, and now work at Sprout Pharmaceuticals as a Data Analytics and Engineering Lead.

My experience spans sports analytics, process automation, and optimization, with a focus on building end-to-end solutions (interactive dashboards, automated reporting pipelines, and custom optimization tools) that translate complex analyses into clear, actionable deliverables for business leaders, product teams, and non-technical partners.

As the communication lead for a major data-driven transformation project at Axios, I worked with over 600 GB of user data to enhance content personalization and engagement. By combining advanced analytics with robust engineering practices, I create scalable, ethical solutions that integrate cutting-edge technologies with real-world business needs.

Outside of work, I enjoy playing games from the Resident Evil franchise and practicing piano.

Skills

Programming Languages

Python
SQL
R
Git

Software

pandas
Tableau
PowerBI
Polars
Streamlit
GIS
TensorFlow
PyTorch
FastAPI

Cloud Technologies

AWS
Snowflake
MongoDB

Databases

PostgreSQL
SQLite
MySQL
MongoDB

Projects

RAG Movie Recommender App

I built a personalized movie and TV show recommendation app that uses the TMDB API and DeepSeek R1 to generate AI-powered summaries tailored to user preferences. The Streamlit-based interface supports genre and year filtering and trending content discovery, and it provides a clear explanation for each recommendation, showcasing seamless integration of external APIs and local AI models.

Python
Streamlit
RESTful APIs
Ollama

Lyric Quest

I developed Lyric Quest, an interactive web app that turns song lyrics into a word puzzle. By integrating a lyrics API, MongoDB caching, and a React-based UI, the app renders lyrics dynamically while letting users uncover hidden words in real time.

React
Node.js
MongoDB
JavaScript

ASCII Video Converter

I programmed a video processing tool that transforms any video into an animated ASCII GIF. Using edge detection and contrast enhancement, the program preserves intricate details while converting frames into artistic text-based representations, offering multiple character styles for unique visual effects.

Python
OpenCV
NumPy
Pillow
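
The core idea of frame-to-text conversion can be illustrated with a minimal sketch. It assumes a frame has already been reduced to a grid of grayscale values from 0 to 255, and it shows only the brightness-to-character mapping; the edge detection and contrast enhancement steps are left out. The character ramp and the function names are hypothetical, not taken from the project.

```python
# Hypothetical character ramp, ordered from darkest to brightest.
ASCII_RAMP = " .:-=+*#%@"


def pixel_to_char(value: int, ramp: str = ASCII_RAMP) -> str:
    """Map a grayscale value in [0, 255] to one character of the ramp."""
    value = max(0, min(255, value))          # clamp out-of-range input
    index = value * (len(ramp) - 1) // 255   # scale brightness to a ramp index
    return ramp[index]


def frame_to_ascii(frame: list[list[int]], ramp: str = ASCII_RAMP) -> str:
    """Convert a 2D grid of grayscale values into newline-separated text."""
    return "\n".join(
        "".join(pixel_to_char(value, ramp) for value in row) for row in frame
    )


if __name__ == "__main__":
    # A tiny 2x2 "frame": black, white, mid-gray, black.
    print(frame_to_ascii([[0, 255], [128, 0]]))
```

Swapping in a different ramp string is one plausible way to offer multiple character styles.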

Pokémon CV Quiz Automation

I created a bot that automates the Pokémon Quiz game using computer vision and image processing. After gathering sprites from the PokeAPI, the system generates silhouettes, then leverages template matching to automatically identify and input Pokémon names in the quiz. Demonstrating a full run in 10:32, this project showcases a blend of API integration, custom image processing, and real-time automation.

Python
OpenCV
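
The silhouette-and-match idea can be sketched in plain Python under simplifying assumptions. Here a silhouette is a binary mask built from a sprite's alpha channel, and a same-size mismatch count stands in for OpenCV's sliding-window template matching, which the project itself uses. The function names, the threshold, and the sprite labels are all hypothetical.

```python
def to_silhouette(alpha_rows: list[list[int]], threshold: int = 0) -> list[list[int]]:
    """Build a binary mask: 1 where the pixel's alpha exceeds the threshold."""
    return [[1 if alpha > threshold else 0 for alpha in row] for row in alpha_rows]


def mismatch(mask_a: list[list[int]], mask_b: list[list[int]]) -> int:
    """Count the cells that differ between two equal-size masks."""
    return sum(
        a != b
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
    )


def best_match(target: list[list[int]], templates: dict[str, list[list[int]]]) -> str:
    """Return the name of the template silhouette closest to the target."""
    return min(templates, key=lambda name: mismatch(target, templates[name]))


if __name__ == "__main__":
    # Hypothetical 2x2 silhouettes keyed by placeholder names.
    templates = {
        "sprite_a": [[1, 0], [0, 1]],
        "sprite_b": [[1, 1], [1, 1]],
    }
    target = to_silhouette([[255, 0], [0, 10]])
    print(best_match(target, templates))
```

A real quiz screenshot would also need cropping and scaling before comparison, which this sketch does not attempt.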

Computer Vision Automation with Mario Party DS

I developed a series of Python-based bots that use computer vision and input simulation to automate gameplay in Mario Party DS minigames. These projects showcase expertise in real-time image processing, algorithm design, and problem-solving, with a focus on optimizing performance and adapting to complex in-game scenarios. I built them with tools such as OpenCV and PyAutoGUI, along with advanced debugging techniques.

Python
OpenCV

Experience

Data Analytics and Engineering Lead · Sprout Pharmaceuticals

September 2025 — Present

Working with cross-functional teams to design and automate scalable data pipelines, enabling efficient processing of large-scale pharmaceutical datasets. Leading advanced analytics initiatives to uncover insights that drive strategic business decisions and enhance operational efficiency. Overseeing data architecture optimization and integrating robust engineering practices to support reliable data solutions in the healthcare sector.

Python
Tableau
pandas
Snowflake

Communication Lead · Axios

September 2024 — April 2025

Worked with a team of 5 to analyze 600+ GB of user data to enhance content targeting and reduce newsletter churn. Implemented analytics using logistic regression and clustering algorithms while maintaining strong data ethics practices. Optimized data infrastructure through a PostgreSQL migration and established robust data processing pipelines.

Python
pandas
PostgreSQL

Data Scientist Intern · City of Raleigh

June 2023 — August 2023

Analyzed 2 million data points of historical rainfall data for the Walnut Creek floodplain, creating comprehensive visualizations using R and PowerBI. Integrated GIS data to enhance context of flood events and presented findings to the Stormwater Management Advisory Commission.

R
PowerBI
GIS