Diana Minine Gudinho

Mid-level Data Scientist specializing in GenAI, RAG, and forecasting

New Jersey, USA | Research Assistant | 4 years experience | Mid-Level | Healthcare | Healthcare IT | Education
Screened | Identity Verified

Connect with Diana

Diana already has a relationship with Reval, so a warm intro from us gets a much better response than cold outreach.

About

ML/NLP engineer focused on large-scale data linking for e-commerce-style catalogs and customer records, combining transformer embeddings (BERT/Sentence-BERT), NER, and FAISS-based vector search. Has delivered measurable lifts (e.g., +30% matching accuracy, Precision@10 62%→84%) and built production-grade, scalable pipelines in Airflow/PySpark with strong data quality and schema-drift handling.

Experience

Research Assistant, University at Buffalo
Graduate Student Assistant, University at Buffalo
Data Analyst, Tata Consultancy Services Ltd.
Full-Stack Software Engineering Intern, QSpiders

Education

University at Buffalo, Master's, Data Science (2025)
Visvesvaraya Technological University, Bachelor's, Information Science (2020)

Key Strengths

  • Built NLP pipeline to unify multi-vendor product catalogs using BERT/DistilBERT + NER + FAISS
  • Improved product matching accuracy by ~30%
  • Designed scalable entity resolution with hybrid blocking + BERT similarity; scaled to tens of millions of records
  • Improved entity match accuracy by ~25% vs prior system
  • Improved semantic search relevance via Sentence-BERT fine-tuning; Precision@10 from 62% to 84% and ~35% relevance lift
  • Production-grade data workflow engineering (Airflow, PySpark, Docker, CI/CD, monitoring, data quality checks)
  • Handled vendor schema drift with automated schema validation and dynamic mapping layer
  • Built and deployed a RAG-based compliance/legal document review system
  • Designed for high-precision retrieval to reduce compliance risk
  • Optimized retrieval and model performance for large-scale data (latency/compute constraints)
  • Hands-on Airflow orchestration for ETL + ML pipelines at scale (5M+ daily records)
  • End-to-end pipeline design from S3/Glue/PySpark to Redshift with DAG-based scheduling
  • Structured agent/workflow testing approach (unit + integration) with measurable metrics
  • Production monitoring with real-time dashboards and stakeholder feedback loops
  • Effective collaboration with non-technical stakeholders (marketing/compliance) translating requirements into usable outputs

Contact

candidate@example.com | (555) 123-4567 | LinkedIn Profile

Languages

English

Skills

Python, Pandas, NumPy, Scikit-learn, PyTorch, TensorFlow, Keras, SQL, R, Scala, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Multi-agent systems, Prompt engineering, Text generation