U.S. Agricultural Analysis Dashboard
Interactive dashboard exploring long-term trends in U.S. agriculture using USDA QuickStats data, backed by an automated AWS data pipeline.
01. Project Overview
Overview
This project is an interactive data analysis and visualization platform for exploring long-term trends in U.S. agriculture, built on large-scale public datasets from the USDA QuickStats program. It combines a web dashboard with an automated, cloud-based data ingestion pipeline.
Architecture
The project is structured as two independent systems that share a common data layer on S3:
- Web Application — Built with Next.js 16, React 19, and TypeScript. Visualization is handled through Recharts for standard charts and Deck.gl + MapLibre GL for geospatial views.
- Data Pipeline — A Python-based automated ingestion system running on an EC2 instance via cron. It pulls data from the USDA QuickStats API, transforms and validates it, converts to partitioned Parquet format, and uploads to S3.
Data flows from the USDA API through the pipeline into S3. The web app then fetches the partitioned Parquet files directly from S3, using a local API as a fallback.
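The S3-first read path with a fallback can be sketched as follows. The web app itself is written in TypeScript, so this Python sketch only illustrates the control flow. The key layout, the function names, and the stub fetchers are all hypothetical and not taken from the project.

```python
from __future__ import annotations

from typing import Callable


def partition_key(dataset: str, state: str) -> str:
    """Build a hypothetical object key for one state's Parquet partition."""
    return f"{dataset}/state_alpha={state.upper()}/data.parquet"


def load_with_fallback(
    key: str,
    fetch_primary: Callable[[str], bytes],
    fetch_fallback: Callable[[str], bytes],
) -> tuple[bytes, str]:
    """Try the primary source first; on an I/O error, use the fallback.

    Returns the payload together with the name of the source that served it.
    """
    try:
        return fetch_primary(key), "s3"
    except OSError:  # ConnectionError and TimeoutError are subclasses of OSError
        return fetch_fallback(key), "local-api"


# Stub fetchers standing in for a real S3 client and a real local API.
def unreachable_s3(key: str) -> bytes:
    raise ConnectionError("S3 unreachable")


def local_api(key: str) -> bytes:
    return b"bytes-for:" + key.encode()


payload, source = load_with_fallback(
    partition_key("crops", "ia"), unreachable_s3, local_api
)
```

Passing the two fetchers in as arguments keeps the fallback logic independent of any particular S3 client and makes it easy to exercise with stubs.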
Dashboard Modules
The application includes five analytical dashboards:
- Crops — Production, yield, and acreage trends across states and commodities
- Land & Area — Shifts in cultivated land and crop allocation patterns
- Labor — Employment levels, wage trends, and farm operation density
- Animals — Livestock inventory and production metrics
- Economics — Price, revenue, and market data with comparative analysis
Each dashboard supports shared filtering (state, year, commodity) and features contextual annotations, tooltips, and design-driven visual encodings.
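One way to model a filter state shared across dashboards is a single immutable object that every view applies to its rows. The sketch below is an assumption about how this could work, not the app's actual implementation: the `Filters` class and helper names are hypothetical, and the column names `state_alpha`, `year`, and `commodity_desc` are borrowed from QuickStats field naming.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Filters:
    """Shared filter state; None means no restriction on that field."""

    state: str | None = None
    year_range: tuple[int, int] | None = None  # inclusive (first, last)
    commodity: str | None = None


def matches(row: dict, f: Filters) -> bool:
    """Return True if the row satisfies every active filter."""
    if f.state is not None and row["state_alpha"] != f.state:
        return False
    if f.year_range is not None:
        first, last = f.year_range
        if not first <= row["year"] <= last:
            return False
    if f.commodity is not None and row["commodity_desc"] != f.commodity:
        return False
    return True


def apply_filters(rows: list[dict], f: Filters) -> list[dict]:
    """Keep only the rows that pass the shared filters."""
    return [row for row in rows if matches(row, f)]


# Illustrative rows with keys only; no measurement values are implied.
rows = [
    {"state_alpha": "IA", "year": 2020, "commodity_desc": "CORN"},
    {"state_alpha": "IA", "year": 2015, "commodity_desc": "SOYBEANS"},
    {"state_alpha": "NE", "year": 2020, "commodity_desc": "CORN"},
]
selected = apply_filters(rows, Filters(state="IA", year_range=(2018, 2022)))
```

Because the filter object is frozen, each dashboard can hold a reference to the same instance without risk of one view mutating it for the others.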
Data Pipeline
The ingestion pipeline automates the full ETL flow:
- Ingest — Queries USDA QuickStats API with configurable parameters
- Transform & Validate — Harmonizes measures, normalizes units, and runs quality checks
- Partition — Converts to Parquet partitioned by state for fast, selective reads
- Upload — Pushes to the S3 bucket usda-analysis-datasets (us-east-2)
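The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: the function names are hypothetical, the sample rows use made-up placeholder values rather than real USDA figures, and the Parquet write and S3 upload are left out so the sketch has no third-party dependencies. The endpoint and the `commodity_desc` / `year__GE` parameters follow the public QuickStats API.

```python
from __future__ import annotations

from collections import defaultdict
from urllib.parse import urlencode

API_URL = "https://quickstats.nass.usda.gov/api/api_GET/"


def build_query_url(api_key: str, **params: str) -> str:
    """Ingest: compose a QuickStats request URL from configurable parameters."""
    query = {"key": api_key, "format": "JSON", **params}
    return f"{API_URL}?{urlencode(query)}"


def parse_value(raw: str) -> float | None:
    """Transform: convert a 'Value' string such as "1,234" to a float.

    Thousands separators are removed; non-numeric codes such as "(D)"
    (used for withheld cells) yield None.
    """
    try:
        return float(raw.strip().replace(",", ""))
    except ValueError:
        return None


def is_valid(row: dict) -> bool:
    """Validate: keep rows that have a state code and a numeric value."""
    return bool(row["state_alpha"]) and row["value"] is not None


def partition_by_state(rows: list[dict]) -> dict[str, list[dict]]:
    """Partition: group rows under Hive-style keys like 'state_alpha=IA'."""
    parts: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        parts[f"state_alpha={row['state_alpha']}"].append(row)
    return dict(parts)


url = build_query_url("DEMO_KEY", commodity_desc="CORN", year__GE="2010")

# Placeholder records shaped like API output; the numbers are invented.
raw_rows = [
    {"state_alpha": "IA", "year": 2020, "Value": "1,000"},
    {"state_alpha": "IA", "year": 2021, "Value": "(D)"},
    {"state_alpha": "NE", "year": 2020, "Value": "2,500"},
]
clean_rows = []
for raw in raw_rows:
    row = {
        "state_alpha": raw["state_alpha"],
        "year": raw["year"],
        "value": parse_value(raw["Value"]),
    }
    if is_valid(row):
        clean_rows.append(row)

partitions = partition_by_state(clean_rows)
```

In a pandas-based pipeline, the grouping step would typically be replaced by `DataFrame.to_parquet(path, partition_cols=["state_alpha"])`, which writes one directory per state when the pyarrow engine is installed.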
The pipeline runs on an EC2 instance with a cron scheduler and supports incremental processing.
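The source does not say how incremental processing is implemented. One common approach is a watermark: each scheduled run records the last year it finished, and the next run only requests later years. The sketch below assumes that approach; the file name, the JSON layout, and the function names are hypothetical.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def load_watermark(path: Path) -> int | None:
    """Return the last fully processed year, or None on the first run."""
    if not path.exists():
        return None
    return json.loads(path.read_text())["last_year"]


def save_watermark(path: Path, year: int) -> None:
    """Record the most recent year that was processed successfully."""
    path.write_text(json.dumps({"last_year": year}))


def years_to_process(
    last_done: int | None, first_year: int, latest_year: int
) -> list[int]:
    """List the years a run still has to ingest, oldest first."""
    start = first_year if last_done is None else last_done + 1
    return list(range(start, latest_year + 1))


# Simulate two consecutive scheduled runs against a temporary state file.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "watermark.json"

    first_run = years_to_process(load_watermark(state_file), 2018, 2020)
    save_watermark(state_file, first_run[-1])

    second_run = years_to_process(load_watermark(state_file), 2018, 2021)
```

A year-level watermark only works if past years are not revised after ingestion. If the upstream data can change retroactively, the pipeline would also need to re-pull a trailing window of recent years.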
Tech Stack
- Frontend: Next.js 16, React 19, TypeScript, Recharts, Deck.gl, MapLibre GL
- Data Pipeline: Python, Pandas, NumPy, Parquet
- Infrastructure: AWS S3, AWS EC2, Cron
- Data Source: USDA QuickStats (millions of records, multi-year, multi-state)
Role: Data Engineer & Full Stack Developer
Timeline: Nov 2025 - Present
Category: Data Engineering / Visualization