Mid-level Data Scientist specializing in ML, MLOps, and customer analytics
Tempe, AZ | Data Scientist | 4 years experience | Mid-Level | SaaS | Technology | Consulting
Screened | Identity Verified
About
ML/NLP practitioner focused on insurance claims analytics for a large financial firm, working with millions of fragmented structured and unstructured records. Built production-grade pipelines for entity extraction, entity resolution, and semantic search using Sentence-BERT embeddings and a vector database. Fine-tuned the embedding model with contrastive learning (reported ~15% recall lift) and delivered scalable ETL with containerized deployment on Kubernetes.
Experience
Data Scientist, Qlik
Data Scientist, Cognizant
Education
Arizona State University, Master's, Data Science (2025)
Key Strengths
Built end-to-end NLP pipeline to extract and standardize entities from messy insurance claim text
Improved NER precision by creating a custom entity dictionary and fine-tuning on a manually labeled subset
Designed and validated an entity resolution pipeline that combines deterministic ID matching with fuzzy matching
Quantified linkage quality with precision/recall and manual sampling; cross-checked against historical claim records
Implemented pragmatic precision/recall tradeoffs via staged matching and empirically tuned similarity thresholds (~0.88) plus business-rule filters
Applied Sentence-BERT embeddings with a vector database for semantic claim search; fine-tuned with contrastive learning to improve recall (~15%)
Production-minded data engineering: modular functions, batch parallel ETL, partitioned Parquet, join optimization, containerized scaling on Kubernetes, and automated tests
Structured experimentation with hypothesis-driven model comparisons (GLM, boosting, ensembles) and traceable experiment tracking