Vetted Data Engineers

Pre-screened and vetted.

Python, SQL, ETL, CI/CD, Amazon S3, AWS

Bay Area, DFW Metroplex, NYC Metro, Remote, Chicago Metro, Greater Boston, Greater Seattle, Los Angeles Metro, Austin Metro, DMV

Venkata Siva Sai Prathyush Kolli

Screened · References · Strong rec.

Intern Robotics Software Engineer specializing in ROS2 multi-robot autonomy

Newark, DE · 1y exp

University of Delaware · University of Delaware

“Robotics intern at the University of Delaware who built and debugged ROS2-based multi-robot coordination systems, focusing on real-time reliability (timestamp alignment, latency/jitter instrumentation, QoS/executor tuning). Also improved SLAM stability by fixing LiDAR/encoder synchronization and tuning state-estimation parameters, with a simulation-first workflow using Gazebo and Docker/CI for reproducible deployments.”

C++, Multithreading, Python, API development, Data pipelines, ROS 2, +139 more

sriram Yalamati

Screened

Mid-level Data Engineer specializing in healthcare data platforms and MLOps

Chicago, IL · 3y exp

Health Care Service Corporation · Wichita State University

“ML/NLP practitioner with healthcare payer experience at HCSC, focused on connecting messy unstructured clinical notes to structured claims/provider data to improve fraud-analytics workflows. Has hands-on experience fine-tuning transformers in AWS SageMaker, building large-scale embedding search with FAISS, and implementing robust entity resolution using golden datasets, precision/recall calibration, and production monitoring for drift.”

Python, SQL, Scala, Java, AWS, Amazon Redshift, +133 more

Teja Babu Mandaloju

Screened

Mid-level Data Scientist/MLOps Engineer specializing in NLP, GenAI, and cloud ML platforms

Chicago, USA · 5y exp

Vosyn · University of North Texas

“AI/ML engineer who led production deployment of a multimodal (text/video/image) RAG system on GCP using Gemini 2.5 + Vertex AI Vector Search, scaling to 10M+ documents with sub-second latency and +40% retrieval accuracy. Strong MLOps/orchestration background (Kubernetes, CI/CD, Airflow, MLflow) with proven impact on reliability (75% fewer incidents) and deployment speed (92% faster), plus experience delivering explainable ML (XGBoost + SHAP + Tableau) to non-technical retail stakeholders.”

Python, R, SQL, MATLAB, C#, Scikit-learn, +166 more

sai Pavan

Screened

Mid-level AI/ML Engineer specializing in MLOps, NLP, and real-time ML pipelines

5y exp

American Family Insurance · George Mason University

“Built a production, real-time insurance claims document-understanding and fraud-detection pipeline using TensorFlow + fine-tuned BERT, deployed on AWS (SageMaker/Lambda/API Gateway) with automated retraining via MLflow and Jenkins. Addressed noisy documents and latency using augmentation and model distillation (3x faster), cutting claims ops manual review by ~50% and reducing fraudulent payouts.”

A/B Testing, AI Agents, Amazon API Gateway, Amazon EC2, Amazon Kinesis, Amazon Redshift, +157 more

Phanideep P

Screened

Senior Data Engineer specializing in cloud lakehouse and streaming data platforms

5y exp

Cadence Bank · Wright State University

“Data platform/data engineer with cross-industry experience in banking and healthcare, building cloud-native lakehouse architectures across AWS/Azure/GCP. Has owned high-volume (millions of records; TB/day) pipelines with strong data quality automation (dbt/Great Expectations), observability (Grafana/Prometheus), and real-time streaming (Kafka/Spark) for fraud monitoring; also delivered an early-stage migration from SQL Server to BigQuery with 40% batch latency reduction.”

Python, SQL, Apache Spark, PySpark, Snowflake, Databricks, +126 more

Sai Bandaru

Screened

Mid-level Machine Learning Engineer specializing in fraud detection and LLM systems

Boston, MA · 6y exp

FiVerity · Northeastern University

“At FiVerity, built and deployed a production LLM/RAG-based Information Gathering Tool for credit union fraud analysts that generates auditable investigation summaries from verified evidence. Focused on high-stakes constraints—hallucination prevention, cross-entity leakage controls, compliance/PII-safe monitoring, and latency—while also shipping customer-facing agentic workflows using CrewAI and LangGraph in close partnership with fraud and compliance stakeholders.”

Python, PyTorch, Hugging Face Transformers, LoRA, Scikit-learn, XGBoost, +105 more

Harshita Loganathan

Screened

Junior Analytics Engineer specializing in modern data platforms

Boston, MA · 2y exp

Quipli · University of Massachusetts Amherst

“Analytics engineer/data professional with strong healthcare and membership analytics experience, combining SQL, dbt, BigQuery, Python, and Tableau to turn messy source data into trusted executive reporting. Stands out for metric governance and stakeholder alignment work, including unifying conflicting business definitions and delivering a CMS market-risk model that identified $792M in excess payer costs.”

SQL, BigQuery, Snowflake, PostgreSQL, Clustering, Apache Airflow, +64 more

Meet Doshi

Screened

Mid-level Data Engineer specializing in cloud data platforms and AI/ML analytics

Chicago, IL · 4y exp

EDNA · Northeastern University

“Backend/data engineer in healthcare who built an AWS-based clinical analytics platform from scratch (DynamoDB/S3/Airflow/dbt) with sub-second clinician query goals, 99.9% uptime, and HIPAA-grade controls (KMS encryption, IAM RBAC, audit trails). Also modernized ML delivery by replacing a manual 4-hour deployment with a 30-minute Docker/GitHub Actions CI/CD pipeline using parallel runs, parity testing, and rollback, and caught critical EHR data edge cases (date formats/timezones) that could have impacted patient care.”

Python, PySpark, SQL, R, Java, Scala, +120 more

Anas Baig

Screened

Junior Software Engineer specializing in full-stack web and cloud systems

Boston, MA · 2y exp

EnFi, Inc · Northeastern University

“Co-op engineer at EnFi who built and maintained a multi-tenant prompt library and LLM workflow tooling used by internal teams and external enterprise clients. Led TypeScript/React package design and standardized a typed workflow abstraction across disparate implementations (React, Go, JSON), improving reliability and developer adoption. Delivered measurable performance gains (~25% latency reduction) and owned end-to-end execution including docs, demos, debugging, and deployment.”

Go, Python, TypeScript, JavaScript, SQL, Java, +125 more

Darshan Rahul Rajopadhye

Screened

Junior AI/ML Engineer specializing in LLM agents and RAG systems

Boston, MA · 2y exp

Humanitarians.AI · Northeastern University

“Backend/data engineer who built a production-ready multi-agent financial intelligence system (Mycroft) that orchestrates specialized AI agents to analyze real-time market data using FastAPI and Pinecone vector search. Brings strong security/reliability instincts (rate limiting, JWT/OAuth2, retries/backoff, health checks) and has caught high-impact data integrity issues in financial migrations (timezone normalization across global legacy systems).”

Python, PyTorch, TensorFlow, Hugging Face Transformers, Machine Learning, Deep Learning, +86 more

Hariom Vyas

Screened

Senior Laboratory Technician specializing in clinical diagnostics and quality compliance

Los Angeles, CA · 8y exp

Innovative Health Diagnostics · California State University Channel Islands

“Forward-deployed, full-stack/platform engineer who owns production features end-to-end across frontend, backend, data, and infrastructure (AWS serverless, Terraform, React). Has modernized critical fintech/payment systems (zero-downtime monolith-to-microservices with Kafka event sourcing) and productionized AI-native support workflows (LLM + RAG on Pinecone) with measurable gains in latency, incidents, CSAT, and support efficiency.”

Data integrity, Documentation, Inventory management, Time management, Apache Spark, Automated Testing, +233 more

KHUSHBU KAKDIYA

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and cloud MLOps

California, USA · 6y exp

CVS Health · Cleveland State University

“Built and deployed a production LLM/RAG system at CVS to automate clinical documents, addressing PHI compliance, retrieval accuracy, and latency; achieved a 35–40% reduction in review effort through chunking and FP16/INT8 optimization. Also has experience translating AI outputs into actionable insights for non-technical stakeholders (sports analysts).”

Python, SQL, PySpark, R, Bash, Scikit-learn, +114 more

Akash Shanmuganathan

Screened

Mid-level GenAI & Data Engineer specializing in agentic AI systems and AWS Bedrock

Fort Mill, SC · 4y exp

OneData Software Solutions · Northeastern University

“At OneData, built and deployed an LLM-powered, multi-agent analytics platform on AWS Bedrock that lets users create Amazon QuickSight dashboards through natural-language conversation, cutting dashboard build time from ~30 minutes to ~5 minutes. Strong in production concerns (observability, token/cost tracking, model tradeoffs) and in bridging business + technical work, owning pre-sales pitching through delivery with an engineering management background focused on AI product management.”

Agentic AI, Amazon Athena, Amazon Bedrock, Amazon Redshift, Amazon RDS, Amazon S3, +95 more

Shawn Faber

Screened

Executive Engineering Leader specializing in enterprise SaaS, AI/ML, and cloud modernization

Charlotte, NC · 28y exp

Rhythm Systems · University of Wisconsin–Stevens Point

“Commercially minded candidate with exposure to private equity firms as sales targets and experience building partnerships around venture-backed company ecosystems. They appear more inclined to join an existing company than found one, bringing a metrics-driven, process-oriented approach and creative execution style with a strong focus on clear growth paths and capital stability.”

Machine Learning, OCR, Data Visualization, DevOps, CI/CD, Change Management, +144 more

Thilak P

Screened

Mid-level Data Engineer specializing in cloud ETL/ELT and big data pipelines

5y exp

W. R. Berkley · Sacred Heart University

“Backend/data engineer who builds Python (FastAPI) data-processing API services for internal analytics/reporting, emphasizing modular architecture, async performance tuning, and reliability patterns (health checks, retries, observability). Also migrated legacy on-prem ETL pipelines to Azure using ADF/Data Lake/Functions and implemented a near-real-time ingestion flow with Event Hubs plus watermarking to handle late events and deduplication.”

Python, SQL, R, C, HTML, CSS, +153 more

Yash Pankhania

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and data engineering

Boston, MA · 5y exp

Humanitarians.AI · Northeastern University

“AI Engineer Co-Op at Northeastern University who built a production Patient Persona Chat Bot to help nursing students practice clinical interactions, fine-tuning Llama 3 and integrating a LangChain + Pinecone RAG pipeline deployed on Amazon Bedrock. Emphasizes clinical accuracy and reliability with guardrails, retrieval filtering, and continuous evaluation, and also brings strong data engineering/orchestration experience (Airflow, EMR/PySpark, ADF, dbt, Databricks, Snowflake).”

Agile, Amazon Bedrock, Amazon DynamoDB, Amazon EMR, Amazon RDS, Amazon Redshift, +127 more

Mohith Venkata

Screened

Mid-level Full-Stack Developer specializing in cloud-native APIs and data workflows

Tukwila, WA · 4y exp

Reshmi’s Group Inc. · Seattle University

“Built and owned end-to-end ordering and inventory/order management systems for a wholesale distributor, delivering an MVP quickly and iterating based on direct observation of daily users. Experienced with TypeScript/React + Node.js layered architectures and microservices using RabbitMQ, including real-world scaling issues (duplicates, backpressure) and observability practices (correlation IDs, structured logging).”

Python, Java, JavaScript, TypeScript, C++, C#, +147 more

Shabari Vignesh

Screened

Mid-level Data Engineer specializing in cloud data platforms and AI agents

Santa Clara, CA · 6y exp

Swirepay · San José State University

“Data/Backend engineer who has owned end-to-end merchant analytics systems on AWS: orchestrated multi-source ingestion (FISERV/Shopify/Clover) with Step Functions/Lambda, enforced strong data quality gates, and served curated datasets via Redshift and a FastAPI layer. Also built an early-stage Merchant Insights AI agent that converts natural language questions into SQL using OpenAI models, with full CI/CD and observability.”

Python, Pandas, PySpark, NumPy, SQL, Shell Scripting, +106 more

Snehitha Penumaka

Screened

Mid-level AI/ML Engineer specializing in predictive modeling and cloud ML pipelines

Dallas, TX · 3y exp

Cambard LLC · University of Texas at Dallas

“LLM engineer/data engineer who has deployed production RAG systems for internal-document Q&A, building end-to-end ingestion, embedding, vector search, and FastAPI serving while actively reducing hallucinations and latency through rigorous retrieval tuning and caching. Also experienced in orchestrating cloud data pipelines (Airflow, AWS Glue, Azure Data Factory) and partnering with non-technical business teams to deliver AI solutions like automated document review.”

A/B Testing, Agile, Anomaly Detection, Apache Spark, AWS Lambda, Classification, +93 more

RIYA CHADDHA

Screened

Mid-level Data Engineer and Business Analyst specializing in cloud ETL and analytics

Remote, US · 5y exp

Mellicell · Northeastern University

“Data analyst with cross-industry experience spanning insurance analytics at L&T Infotech and experimental imaging analytics at Mylyser. Stands out for building scalable SQL/PySpark data pipelines, standardizing business-critical metrics like claims lifecycle and policy retention, and delivering measurable impact such as 50%+ faster query performance and a 15% reduction in claims settlement time.”

Python, NumPy, Pandas, Scikit-learn, PyTorch, SQL, +118 more

Neha Shastri

Screened

Junior AI & Data Engineer specializing in LLM systems and analytics platforms

2y exp

AI-Assisted Grading Platform - Startup Funded by BU · Boston University

“Backend/ML engineer who built a job-search automation SaaS using a modular Selenium ETL pipeline, rigorous testing/observability, and a cost-optimized two-pass LLM ranking approach. Has led high-integrity data extraction from messy multi-city PDF records (95% integrity) and managed modular production rollouts for a 20+ engineer team, with a strong security focus (deny-by-default, row-level access control) in an AI-assisted grading platform.”

Python, Pandas, NumPy, SQL, Git, GitHub, +64 more

Lakshmi Priya Ramisetty

Screened

Mid-level ML & Data Engineer specializing in GenAI, graph modeling, and fraud/risk analytics

Redwood City, CA · 5y exp

BlueArc · Yeshiva University

“Built a production AI fraud/risk scoring platform at BlueArc that ingests web business/product/site data, generates text+image embeddings, and connects entities in a graph to detect reuse patterns and links to known bad actors. Optimized for scale with incremental graph re-scoring and delivered investigator-friendly explainability by surfacing the exact signals/relationships behind each score; orchestrated workflows with Airflow and GCP event-driven components (Pub/Sub, Dataflow, Cloud Run) and has recent LLM workflow orchestration experience (retrieval, prompting, scoring).”

Python, SQL, PySpark, Apache Airflow, ETL, PostgreSQL, +92 more

somasekhar G

Screened

Mid-level Data Engineer specializing in cloud big data and streaming pipelines

California, USA · 4y exp

Smarc Solutions Inc · University of Colorado Boulder

“Data engineer focused on large-scale financial data platforms, with hands-on ownership of an AWS + Databricks + Snowflake pipeline processing ~2TB/day. Strong in data quality (Great Expectations), schema drift automation, and production reliability (99.9%), plus measurable performance/cost wins (4h→1.2h, ~25% cost reduction). Also built an async Python crawling/ingestion framework with anti-bot mitigation, retries, and Airflow-driven backfills.”

AWS, Amazon EMR, AWS Lambda, Amazon Kinesis, AWS Step Functions, Amazon EKS, +93 more

Sravya Chunduri

Screened

Mid-level AI/ML Engineer specializing in LLM, NLP, and MLOps

Virginia, USA · 4y exp

Blackhawk Network · University of Maryland, Baltimore

“AI/ML Engineer with 3+ years of experience spanning RAG pipelines, MLOps, large-scale data workflow automation, and resilient Playwright-based UI automation. At Blackhawk Network and Wipro, they describe shipping production systems with strong observability and compliance controls, including reducing flaky automation failures from 30% to under 2% and automating 3+ TB/day reconciliation workflows.”

Python, Java, C++, JavaScript, Bash, Classification, +152 more
