Mid-level Data Scientist specializing in ML, MLOps, and Generative AI
TX, USA | Data Scientist | 5 years experience | Mid-Level | Manufacturing | Industrial Equipment | Consulting
Screened | Identity Verified
Connect with Anurag
About
ML/NLP engineer who built a RAG-based technical assistant for Caterpillar field engineers, replacing PDF keyword search with intent-based semantic retrieval across manuals, logs, sensor reports, and technician notes. Strong in productionizing data/ML systems (Airflow, PySpark) with rigorous preprocessing, entity resolution, and evaluation, delivering measurable gains in accuracy and relevance and a measurable reduction in duplicates.
Experience
Data Scientist, Caterpillar
Data Scientist, Cognizant
Education
University of Illinois Chicago, Master's in Computer Science
Amity University, Bachelor's in Computer Science and Engineering
Key Strengths
Built RAG-based technical assistant for field engineers using semantic retrieval over unstructured corpora
Designed 4-step document preprocessing pipeline (OCR normalization, dedup/versioning, segmentation, quality scoring) improving embedding quality and boosting accuracy by >20%
Hybrid entity-resolution pipeline with supervised validation; improved pairwise F1 by ~30% and reduced duplicate entities by ~40%
Fine-tuned embedding model on domain data; improved top-3 search relevance by ~25% and reduced irrelevant matches