“Asset Management Risk professional at Fidelity Investments who built and productionized an agentic RAG platform enabling compliance and analysts to query 10,000+ fund documents with cited answers in seconds. Implemented structure-aware semantic chunking (AWS Textract), hierarchical retrieval, and hybrid search to raise accuracy from 68% to 94%, and built an evaluation framework tracking accuracy/latency/cost/hallucinations—delivering 40+ hours/month saved and zero critical production failures.”

AI Agents, Apache Airflow, AWS, AWS Lambda, CI/CD, Claude (+85 more)

Sahithi K

Screened

Mid-level Data Engineer specializing in cloud data platforms and streaming pipelines

Boston, MA · 4y exp

Moderna · University of Massachusetts Dartmouth

“Data engineer with experience at Moderna and Block owning high-volume (≈10TB/day) production pipelines on AWS, using Kafka/S3/Glue/dbt/Snowflake with strong data quality and observability practices (schema validation, anomaly detection, CloudWatch monitoring). Also built external financial API ingestion with Airflow retries, throttling/token rotation, and schema versioning, and helped stand up an early-stage biomedical data platform with CI/CD and incident debugging.”

Python, SQL, PySpark, Apache Spark, Apache Kafka, Amazon Kinesis (+94 more)

Lalithya Manasa Patri

Screened

Senior Data Engineer specializing in cloud ETL and real-time streaming pipelines

Austin, TX · 5y exp

eBay · Texas Tech University

“Data engineer with eBay experience owning end-to-end pipelines for real-time order and user behavior analytics at 10M+ records/day. Strong in PySpark/SQL transformations, Airflow reliability patterns, and production observability (CloudWatch), with measurable outcomes including improved data quality and 30–40% query performance gains. Also built Python data APIs for analytics/ML consumers with versioning and backward compatibility.”

Python, SQL, Java, Scala, R, Apache Spark (+97 more)

Travoy Spelling

Screened

Senior Data Scientist / ML Engineer specializing in GenAI, LLMs, and NLP

Texarkana, TX · 10y exp

Tredence · University of Texas at Austin

“ML/NLP engineer focused on production GenAI and data linking systems: built a large-scale RAG pipeline over millions of support docs using LangChain/Pinecone and added a LangGraph-based validation layer to cut hallucinations ~40%. Also built scalable PySpark entity resolution (95%+ accuracy) and fine-tuned Sentence-BERT embeddings with contrastive learning for ~30% relevance lift, with strong CI/CD and observability practices (OpenTelemetry, Prometheus/Grafana).”

A/B Testing, API Development, AWS, AWS Lambda, AWS Step Functions, Azure Data Factory (+247 more)

Saiteja Gaddam

Screened

Mid-level Data Engineer specializing in cloud data platforms and streaming analytics

3y exp

Intuit · University at Buffalo

“Data engineer (Intuit) who owned an end-to-end telemetry and subscription analytics platform processing ~22M events/day, built on Kinesis/S3/Glue/Spark/Airflow/Redshift. Strong focus on reliability and data quality (schema drift controls, quarantine layers, idempotent reruns) and performance tuning, achieving a reporting latency reduction from ~15 minutes to under 4 minutes while enabling revenue and churn analytics for business teams.”

Scala, Hibernate, JDBC, JSON, HTML, CSS (+120 more)

Sai Anuhya Bandi

Screened

Mid-level Full-Stack Engineer specializing in AI-driven data platforms

Santa Barbara, CA · 5y exp

Uber · University of Alabama at Birmingham

“Full-stack engineer with 5+ years of experience who built real-time data visualization and analytics systems at Uber, spanning React/TypeScript frontends, Node/GraphQL services, Kafka pipelines, and PostgreSQL. Particularly compelling for teams needing a hands-on builder who can turn ambiguous customer needs into scalable products, and who has also applied RAG with LangChain/OpenAI over 1.8M support files to surface actionable insights.”

TypeScript, JavaScript, Python, Java, SQL, React (+232 more)

Venkata Sai Pavan Dema

Screened

Mid-level Data Scientist/ML Engineer specializing in GenAI agents and MLOps

5y exp

Capital One · University of the Cumberlands

“AI/LLM engineer at Capital One who deployed a production RAG-powered fraud analysis and document intelligence platform using LangChain, OpenAI, Pinecone, Kafka, and AWS. Focused on reliability in real-time investigations via hybrid retrieval, schema-validated outputs, and LLM verification loops, reporting review-time reduction from hours to minutes and ~99% fraud detection precision.”

A/B Testing, Amazon EC2, Amazon Redshift, Amazon S3, Amazon SageMaker, Azure App Service (+163 more)

Nikita Vivek Kolhe

Screened

Junior Data & Machine Learning Engineer specializing in MLOps and NLP

Los Angeles, United States · 1y exp

WorkUp · USC

“ML/LLM practitioner with production experience building a healthcare review sentiment pipeline (RateMDs) using Hugging Face Transformers plus a LangChain+FAISS RAG layer for interactive querying. Also led orchestration-driven optimization of Nike’s Fusion ETL pipeline, improving runtime efficiency by 20%, and has experience translating ML outputs into Tableau dashboards for non-technical healthcare stakeholders (e.g., readmission risk).”

Python, SQL, C, C++, R, MATLAB (+90 more)

Sai Venkata

Screened

Senior Data Engineer specializing in cloud lakehouse and real-time streaming pipelines

Texas, USA · 6y exp

CVS Health · University of Central Missouri

“Senior data engineer with experience in both healthcare (CVS Health) and financial services (Bank of America), building large-scale Azure lakehouse pipelines (30+ EHR sources, ~5TB) and real-time streaming services (Event Hubs/Kafka) for patient vitals. Strong focus on reliability and data quality (Great Expectations, monitoring/alerting, schema drift automation), with measurable outcomes like 50% runtime reduction and 99%+ uptime for regulatory reporting pipelines.”

Python, SQL, Scala, Java, Shell Scripting, Apache Spark (+117 more)

Jahnavi Vasala

Screened

Mid-level Data Engineer specializing in cloud data platforms and streaming pipelines

San Diego, CA · 6y exp

Intuit · Cleveland State University

“Data engineer with Intuit experience owning end-to-end, high-volume financial data pipelines (API/S3 ingestion, Airflow orchestration, Spark/PySpark + SQL transforms, Snowflake marts). Strong focus on reliability and data quality—achieved 99.8% SLA and cut discrepancies by 35% using Great Expectations, reconciliation, schema versioning, and automated backfills; also built near real-time Kafka/API data services with CI/CD and observability.”

Python, SQL, PySpark, Scala, Shell scripting, Apache Spark (+87 more)

Biplob Bidari

Screened

Senior Data Engineer specializing in FinTech analytics and ML data platforms

USA · 5y exp

Goldman Sachs · University of the Cumberlands

“ML/AI engineer with Goldman Sachs experience building production fraud detection and RAG-based trading insights systems end-to-end. Stands out for combining real-time ML infrastructure, GenAI retrieval systems, and compliance-aware design, with measurable impact including nearly 25% false-positive reduction and improved analyst productivity.”

Python, Pandas, NumPy, PySpark, SQL, Bash (+139 more)

Kevin Cruz

Screened

Senior Gen AI Engineer specializing in agentic LLM systems

Tempe, AZ · 15y exp

Opendoor · USC

“Built and owned end-to-end production systems for a healthcare platform, including a predictive task recommendation feature (React + FastAPI + ML on AWS ECS) that cut backlog 20% and saved coordinators ~10 hours/week. Also productionized an AI-native RAG system (vector DB + LLM) delivering 40% faster query resolution, and led phased modernization of a monolithic FastAPI service into async microservices using feature flags and canary releases.”

Generative AI, Multi-Agent Systems, Prompt Engineering, Vector Databases, LangChain, LangGraph (+396 more)

Abhay Naik

Screened

Mid-level Data Engineer specializing in cloud-native analytics and enterprise integrations

Remote · 3y exp

The Groove · UC Berkeley

“Built and productionized an LLM-powered clinical assistant at a healthcare startup, re-architecting a prototype into a robust RAG system on AWS with guardrails, citations, monitoring, and automated tests for clinical reliability. Works closely with clinicians to convert workflow feedback into evaluation criteria and iterative system improvements, and has hands-on experience debugging agentic systems in real time (including during live client demos).”

AWS, Amazon S3, Amazon EKS, Amazon EC2, Amazon ECS, AWS IAM (+91 more)

Rahul Reddy

Screened

Senior Data Engineer specializing in cloud data platforms and big data pipelines

New York, NY · 6y exp

CVS Health · Southern Arkansas University

“Data engineer with healthcare (CVS Health) experience who migrated production PySpark workloads to native BigQuery SQL and built a Great Expectations-based validation microservice on GKE (Flask + REST) integrated into Cloud Composer. Has operated high-volume pipelines (~300–400GB/day) and designed external vendor ingestion on AWS (Lambda/Step Functions/Glue) with schema-drift detection, alerting, and backfill-safe controls to protect downstream Snowflake/BigQuery tables.”

Python, Java, SQL, MySQL, PostgreSQL, Apache Hive (+118 more)

Shriya Bannikop

Screened

Mid-level Software Engineer specializing in cloud platforms, data engineering, and distributed systems

Seattle, WA · 5y exp

Amazon Web Services · KLE Technological University

“Full-stack engineer who built and owned an AI-assisted job-matching dashboard in Next.js App Router/TypeScript, keeping LLM logic server-side and improving performance via deduplication, caching/revalidation, and streaming (35% fewer duplicate LLM calls; 40% faster first render). Also has strong data/backend chops: designed Postgres models and optimized queries at million-record scale (1.8s to 120ms) and built durable AWS multi-region telemetry workflows with idempotency, retries, and monitoring.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon ECS, Amazon EKS (+170 more)

Shalini Jeela

Screened

Senior Data Engineer specializing in data pipelines, APIs, and machine learning

Austin, TX · 6y exp

Expedia · Trine University

“Data engineer with experience at Expedia building SQL Server and Azure Data Factory pipelines for business reporting and analytics. Stands out for pragmatic end-to-end pipeline ownership in ambiguous environments, with a strong emphasis on data quality, rerunnability, query performance, and making downstream datasets reliable for other teams.”

Python, SQL, Java, C#, JavaScript, R (+100 more)

Apoorva Nanabolu

Screened

Senior Data Scientist / Generative AI Engineer specializing in fraud, risk, and MLOps

5y exp

PayPal · University of New Haven

“Built and deployed a production LLM/RAG fraud investigation system to replace manual investigator workflows, combining transaction data, historical cases, and policy documents with agent-style steps and LoRA fine-tuning. Demonstrates strong reliability engineering (grounding, citations, abstention paths), performance optimization (retrieval/indexing/caching), and end-to-end MLOps orchestration using Azure ML Pipelines/MLflow plus Kubernetes/Argo with canary and rollback deployments.”

Python, R, SQL, NoSQL, Snowflake, BigQuery (+178 more)

Vivek Reddy

Screened

Mid-level Data Scientist/Data Engineer specializing in ML pipelines, insurance and healthcare analytics

Los Angeles, CA · 7y exp

Venture Connect · UC Berkeley

“Built a production assistive-vision iPhone app to help visually impaired users find grocery items, training a custom YOLO detector on 2,000+ self-collected/annotated images and deploying via CoreML with a cloud multimodal LLM for navigation instructions. Brings hands-on AWS serverless + ECS container deployment (CDK/GitHub Actions) and a disciplined approach to AI workflow reliability (state-machine design, offline evals, stress tests, logging/metrics), plus experience communicating model insights to non-technical stakeholders (MOTER Technologies).”

A/B Testing, Amazon Bedrock, Amazon ECS, Amazon RDS, AWS Lambda, CI/CD (+109 more)
