Interactive exploration of shot metadata, model training datasets, and production lineage. Treating data as strategic infrastructure, not scattered artifacts.
Production data is fragmented across ShotGrid, editorial systems, render logs, and QC databases. There is no unified view of how shots, assets, and model training data relate, which makes it difficult to trace model decisions or audit dataset quality.
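As a rough illustration of what that unified view could look like, the sketch below models shots, training examples, and model versions as nodes in a single graph with typed edges between them. Every class, field, and relationship name here is a hypothetical placeholder, not an existing schema.

```python
# Hypothetical unified schema spanning ShotGrid, editorial, render logs, and QC.
# Names are illustrative only; nothing here mirrors a real production database.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    shot_id: str            # e.g. a ShotGrid code like "SEQ010_SH0040"
    sequence: str
    source_system: str      # "shotgrid", "editorial", "render_log", "qc"


@dataclass
class TrainingExample:
    example_id: str
    shot_id: str                   # which shot the frame or crop came from
    annotation_quality: float      # QC score in [0.0, 1.0]
    tags: List[str] = field(default_factory=list)  # e.g. ["reflective", "hair"]


@dataclass
class ModelVersion:
    model_id: str
    trained_on: List[str] = field(default_factory=list)  # TrainingExample ids


@dataclass
class Edge:
    src: str
    rel: str                # "DERIVED_FROM", "TRAINED_ON", "CORRECTED_BY", ...
    dst: str
```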
When an AI model makes a mistake, you need to understand: What training data influenced this decision? Which shots contributed similar examples? How has artist feedback modified the model over time? Without data lineage, these questions are unanswerable.
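These questions only become answerable if every decision is logged with links back to its model version and dataset snapshot at inference time. A minimal sketch of such a provenance record, with entirely hypothetical fields:

```python
# Minimal provenance record for a single model decision (hypothetical fields).
# Capturing this at inference time is what makes the questions above traceable.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class InferenceRecord:
    inference_id: str
    model_id: str                     # which model version produced the output
    dataset_snapshot: str             # immutable id of the training set behind that version
    shot_id: str                      # the production shot the model ran on
    artist_correction: Optional[str]  # id of a correction event, if the output was fixed
    created_at: datetime
```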
Can we trace model decisions back to training examples? If a segmentation model fails on a reflective surface, which training images taught it about reflections? This explainability builds trust and identifies dataset gaps.
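Assuming a graph along the lines sketched above lives in Neo4j, that trace could be a single path query. The labels, relationship types, and properties below are hypothetical; the call itself uses the official Neo4j Python driver.

```python
# Sketch: trace a failed inference back to tagged training examples.
# Assumed hypothetical schema: (:Inference)-[:PRODUCED_BY]->(:ModelVersion)
# -[:TRAINED_ON]->(:TrainingExample)-[:DERIVED_FROM]->(:Shot).
from neo4j import GraphDatabase

QUERY = """
MATCH (inf:Inference {id: $inference_id})-[:PRODUCED_BY]->(m:ModelVersion)
      -[:TRAINED_ON]->(ex:TrainingExample)-[:DERIVED_FROM]->(s:Shot)
WHERE $tag IN ex.tags
RETURN ex.id AS example, ex.annotation_quality AS quality, s.code AS shot
ORDER BY quality DESC
"""

def trace_failure(uri, auth, inference_id, tag="reflective"):
    """Return the tagged training examples behind a given inference."""
    driver = GraphDatabase.driver(uri, auth=auth)
    try:
        with driver.session() as session:
            result = session.run(QUERY, inference_id=inference_id, tag=tag)
            return [record.data() for record in result]
    finally:
        driver.close()
```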
How does data quality correlate with model performance? By visualizing the relationships between annotation quality, shot complexity, and inference accuracy, we can prioritize the dataset improvements with the highest impact.
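A rough first pass at that analysis, assuming per-shot rows with annotation quality, complexity, and accuracy have already been exported from the graph. All column names and values here are made up for illustration.

```python
# Sketch: rank-correlate dataset attributes against inference accuracy.
# Column names are hypothetical; assumes one row per shot exported from the graph.
import pandas as pd
from scipy.stats import spearmanr

rows = pd.DataFrame({
    "annotation_quality": [0.92, 0.61, 0.78, 0.55, 0.88],
    "shot_complexity":    [3, 7, 5, 8, 2],        # e.g. a 1-10 QC rating
    "inference_accuracy": [0.95, 0.70, 0.84, 0.62, 0.91],
})

for feature in ("annotation_quality", "shot_complexity"):
    rho, p = spearmanr(rows[feature], rows["inference_accuracy"])
    print(f"{feature}: rho={rho:.2f} (p={p:.2f})")
```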
What's the ROI of human feedback? When artists correct AI outputs, those corrections become training data. Tracking this loop shows which corrections improve the model most, guiding where to invest annotation effort.
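One possible way to measure that loop: log each batch of corrections with the error category it targets, then compare a per-category validation metric before and after retraining. Everything in the sketch below, from field names to the metric, is an assumption rather than an existing pipeline.

```python
# Sketch: estimate which correction categories move the model most.
# All field names are hypothetical; metrics come from per-category validation runs.
from dataclasses import dataclass


@dataclass
class CorrectionBatch:
    category: str         # e.g. "reflections", "motion_blur"
    n_corrections: int    # artist fixes folded back into the training set
    metric_before: float  # per-category IoU (or similar) before retraining
    metric_after: float   # same metric after retraining on the corrections


def roi(batches):
    """Metric gain per correction, highest-leverage categories first."""
    scored = [
        (b.category, (b.metric_after - b.metric_before) / max(b.n_corrections, 1))
        for b in batches
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


print(roi([
    CorrectionBatch("reflections", 40, 0.58, 0.71),
    CorrectionBatch("motion_blur", 120, 0.66, 0.69),
]))
```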
Designing the data schema and graph relationships. Prototyping with a small dataset from personal projects before scaling to production volumes. Exploring Neo4j as the graph database and D3.js for visualization.
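The prototype ingest could be as small as the sketch below, which writes a shot and a derived training example into Neo4j with the official Python driver; the URI, credentials, labels, and relationship names are placeholders. D3.js would then visualize an export of these same nodes and edges.

```python
# Sketch of the prototype ingest: write a shot and a derived training example
# into Neo4j. URI, credentials, and labels are placeholders for the prototype.
from neo4j import GraphDatabase

INGEST = """
MERGE (s:Shot {code: $shot_code})
MERGE (ex:TrainingExample {id: $example_id})
SET ex.annotation_quality = $quality, ex.tags = $tags
MERGE (ex)-[:DERIVED_FROM]->(s)
"""

def ingest_example(shot_code, example_id, quality, tags):
    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))
    try:
        with driver.session() as session:
            session.run(INGEST, shot_code=shot_code, example_id=example_id,
                        quality=quality, tags=tags)
    finally:
        driver.close()


ingest_example("SEQ010_SH0040", "ex_0001", 0.9, ["reflective"])
```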
The goal is a system that makes production data explorable and valuable, not just archived. Data becomes infrastructure that improves over time rather than a static historical record.