Purdue Data Mine — Corporate Partner Project

Seed Demand Forecasting Analytics Dashboard

An end-to-end predictive analytics platform that transforms raw production-line data into actionable demand forecasts, helping seed producers plan inventory with confidence.

About the Project

Built for a corporate seed partner in collaboration with the Purdue Data Mine

Descriptive Analytics

Interactive treemaps, donut charts, and consistency analyses reveal how seed product lines (PLFs) perform across years and companies.

Predictive Modeling

Multiple regression and ML models are compared on R² and RMSE to find the best predictor for future seed demand by PLF cluster.
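As a minimal sketch of how candidate models might be scored on the two metrics named above, the snippet below computes R², RMSE, and RMSE divided by the mean of the actual values. The model names and numbers are hypothetical placeholders, not results from this project, and the functions use only the standard library rather than whatever tooling the project itself relies on.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_over_mean(actual, predicted):
    """RMSE scaled by the mean actual value, so it is unitless."""
    return rmse(actual, predicted) / (sum(actual) / len(actual))


# Hypothetical held-out quantities and predictions from two made-up models.
actual = [100.0, 200.0, 300.0, 400.0]
predictions = {
    "model_a": [110.0, 190.0, 310.0, 390.0],
    "model_b": [150.0, 150.0, 350.0, 350.0],
}

scores = {
    name: {
        "r2": r_squared(actual, predicted),
        "rmse": rmse(actual, predicted),
        "rmse_over_mean": rmse_over_mean(actual, predicted),
    }
    for name, predicted in predictions.items()
}

# Pick the model with the highest R² on the held-out data.
best_model = max(scores, key=lambda name: scores[name]["r2"])
print(best_model, scores[best_model])
```

Dividing RMSE by the mean makes errors comparable across groups whose typical quantities differ in scale, which is one plausible reason to report both.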

Data Pipeline

A robust ETL pipeline ingests messy, wide-format CSVs from multiple production years and normalizes them into analysis-ready frames.
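The wide-to-long normalization step could look roughly like the sketch below. The column layout (`plf`, `company`, and one `qty_<year>` column per production year) and the kinds of mess handled (padded headers, thousands separators, blank or `n/a` cells) are assumptions made for illustration; the real source files are not shown on this page. The sketch uses only the standard library; in a pandas-based pipeline, `DataFrame.melt` is the usual tool for the same reshaping.

```python
import csv
import io

# Hypothetical wide-format extract: one row per PLF, one quantity column per year.
RAW_CSV = """PLF , Company,qty_2022,qty_2023,qty_2024
A1,Alpha,"1,200",1300,
B2,Beta,800,n/a,950
"""


def parse_quantity(text):
    """Return a float, or None for blanks and non-numeric placeholders."""
    cleaned = text.strip().replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


def wide_to_long(raw_csv, id_columns=("plf", "company"), prefix="qty_"):
    """Turn one-column-per-year rows into one record per (PLF, company, year)."""
    reader = csv.reader(io.StringIO(raw_csv))
    # Normalize header names: trim padding and lowercase.
    header = [name.strip().lower() for name in next(reader)]
    records = []
    for row in reader:
        if not row:
            continue  # skip empty lines
        values = dict(zip(header, (cell.strip() for cell in row)))
        for column, cell in values.items():
            if not column.startswith(prefix):
                continue
            quantity = parse_quantity(cell)
            if quantity is None:
                continue  # drop missing or unparseable quantities
            record = {key: values[key] for key in id_columns}
            record["year"] = int(column[len(prefix):])
            record["quantity"] = quantity
            records.append(record)
    return records


long_records = wide_to_long(RAW_CSV)
for record in long_records:
    print(record)
```

Dropping unparseable cells is only one possible policy; a real pipeline might log or impute them instead.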

Synthetic Data Disclaimer: All data displayed on this website is entirely synthetic and was programmatically generated to mimic the structure and statistical properties of the original dataset. No real or proprietary data from any corporate partner is used, shown, or stored on this site. The synthetic data exists solely to demonstrate the dashboard’s interactive capabilities and analytical features.

Interactive Dashboard

Filter by year and explore seed demand patterns (synthetic data)

PLF Distribution by Year

Quantity Share by Company

Quantity Distribution

Stability by PLF (lower CV = steadier)

Trend by PLF (slope of yearly totals)

Yield by Relative Maturity

PLFs Meeting Stability Rule

PLF × Year Summary Table
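Two of the charts above rest on simple per-PLF statistics: the coefficient of variation (CV) of yearly totals and the slope of yearly totals over time. The sketch below shows one way to compute both. The PLF names, the totals, the use of the population standard deviation, and the 0.25 cutoff for the stability rule are all assumptions for illustration; the page does not state the actual rule.

```python
import statistics

# Hypothetical yearly totals per PLF (synthetic, for illustration only).
yearly_totals = {
    "PLF-A": {2022: 100.0, 2023: 110.0, 2024: 120.0},
    "PLF-B": {2022: 50.0, 2023: 150.0, 2024: 100.0},
}

# Assumed cutoff for calling a PLF "stable"; not taken from the project.
CV_THRESHOLD = 0.25


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (lower = steadier)."""
    return statistics.pstdev(values) / statistics.fmean(values)


def trend_slope(totals_by_year):
    """Ordinary least-squares slope of yearly total against year."""
    years = sorted(totals_by_year)
    totals = [totals_by_year[year] for year in years]
    mean_year = statistics.fmean(years)
    mean_total = statistics.fmean(totals)
    numerator = sum((y - mean_year) * (t - mean_total) for y, t in zip(years, totals))
    denominator = sum((y - mean_year) ** 2 for y in years)
    return numerator / denominator


summary = {}
for plf, totals in yearly_totals.items():
    cv = coefficient_of_variation(list(totals.values()))
    summary[plf] = {
        "cv": cv,
        "slope": trend_slope(totals),
        "stable": cv <= CV_THRESHOLD,
    }

for plf, stats in summary.items():
    print(plf, stats)
```

In this made-up data, PLF-A grows by a steady 10 units per year and passes the assumed rule, while PLF-B has a steeper average slope but swings too much from year to year to pass.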

Predictive Analytics

Machine learning model comparisons and cluster analysis

Model Comparison — Test R² (2024)

Model Comparison — RMSE / Mean (2024)

Elbow Method for Optimal k (PLF Binning)

Actual vs Predicted Quantity by PLF Bin (2024)

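The elbow method used for PLF binning can be illustrated with a small, dependency-free sketch: run k-means for several values of k, record the within-cluster sum of squares (inertia), and look for the k after which the curve flattens. The one-dimensional feature (a mean quantity per PLF), the values, and the deterministic initialization are assumptions for illustration; the page does not say which features or which clustering implementation the project uses.

```python
def kmeans_1d(values, k, max_iterations=100):
    """Lloyd's algorithm on 1-D data; returns (centers, inertia)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in range(k)]
        for value in ordered:
            nearest = min(range(k), key=lambda j: (value - centers[j]) ** 2)
            clusters[nearest].append(value)
        # Update step: move each center to the mean of its members.
        updated = [
            sum(members) / len(members) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centers:
            break  # converged
        centers = updated
    inertia = sum(min((value - c) ** 2 for c in centers) for value in ordered)
    return centers, inertia


# Hypothetical mean quantity per PLF, forming three obvious groups.
plf_mean_quantity = [10, 12, 11, 50, 52, 51, 100, 102, 101]

inertia_by_k = {k: kmeans_1d(plf_mean_quantity, k)[1] for k in range(1, 6)}
for k, inertia in inertia_by_k.items():
    print(k, round(inertia, 1))
```

With these made-up values the inertia drops sharply up to k = 3 and barely changes afterwards, which is the bend the elbow heuristic looks for. On real data the bend is usually less clear-cut and is typically judged by eye from the plot.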

Technology Stack

Tools and frameworks powering this project

Python
Dash
Plotly
Pandas
NumPy
Bootstrap
Git
GitHub Pages
JavaScript
HTML / CSS