I'm eager to deepen my data engineering skills by using modern data platforms to build ETL pipelines, tackle data transformation challenges, and create BI dashboards. I prioritize data governance, documentation, and writing simple, readable code.
In my free time, I enjoy reading, tinkering, working out, coding, and gaming.
For my complete resume, please reach out to me at pymkdb@gmail.com.
Technical Skills
- Platforms: AWS, GitHub
- Languages: R, Python, SQL, Go (learning)
- Tools: Git, dbt, DuckDB, GitHub Actions (CI/CD), Docker, Terraform (IaC), Claude Code (agentic coding tool)
- Experience: Working in fast-paced startups, handling sensitive data (PHI and HR data)
Professional Experience
| Title | Company | Dates |
|---|---|---|
| Data Engineer | Delfi Diagnostics | Apr 2022 - Present |
| Program Manager | N-Power Medicine | Oct 2021 - Apr 2022 |
| Data Analyst | GRAIL | Jan 2018 - Oct 2021 |
| Clinical Data Manager | Gilead Sciences | Nov 2015 - Jan 2018 |
| Biosample Coordinator | Genentech | Dec 2012 - Aug 2013 |
Data Engineer
Delfi Diagnostics :: Apr 2022 - Present
Specializing in pipeline automation, data integration, and making data more accessible across the organization.
- Built and maintained CI/CD pipelines for data workflows and infrastructure deployments.
- Automated ETL processes using GitHub Actions, Docker, and Dagster orchestration.
- Integrated data from diverse external sources including APIs, SFTP servers, and email.
- Designed and implemented a data quality system with validation checks, monitoring reports, and alerts.
- Developed internal Python and R packages.
- Created BI dashboards for data visualization and reporting.
- Derived semantic layers and clinical variables following CDISC/ADaM standards.
- Executed programmatic biosample selection aligned with cross-functional requirements.
- Contributed to team documentation, including technical guides, best practices, and onboarding materials.
- Built an LLM chatbot with DuckDB MCP server integration, enabling natural language queries of internal data.
Program Manager
N-Power Medicine :: Oct 2021 - Apr 2022
Drove cross-functional program management and software development initiatives.
- Led coordination of complex workstreams across internal teams and external partners.
- Established program management frameworks to align deliverables with organizational strategy.
- Managed software development prioritization and implementation roadmaps.
- Implemented structured communication protocols to ensure project visibility and accountability.
Data Analyst
GRAIL :: Jan 2018 - Oct 2021
Led workflow development for biosample management operations.
- Designed and deployed a data quality pipeline with end-to-end biospecimen tracking and data reconciliation.
- Developed custom R packages and interactive dashboards for business intelligence reporting.
- Worked on an AWS-based data warehouse using S3, Glue, Athena, and QuickSight.
- Served as project manager alongside software engineers for the technical development of internal biosample management systems.
Clinical Data Manager
Gilead Sciences :: Nov 2015 - Jan 2018
Managed data operations for Phase I clinical trials with a focus on biomarker data.
Biosample Coordinator
Genentech :: Dec 2012 - Aug 2013
Coordinated biosample management operations for clinical studies while maintaining data accuracy.
Education
- M.S. Molecular Biology :: San Jose State University :: Aug 2015
- B.A. General Biology :: San Francisco State University :: May 2012