Pre-screened and vetted.
Senior Data Engineer specializing in cloud data platforms and automated data quality
“Data engineer at CenterPoint Energy who built and operated multiple production-grade GCP data systems: a daily Snowflake→BigQuery replication framework (150+ tables) with Monte Carlo/Atlan-driven observability and schema-drift protection, plus a FastAPI metrics service for pipeline health. Demonstrated measurable impact (40% faster dashboard queries, 70% less manual refresh work, zero data loss) and strong operational rigor (scaling Cloud Run jobs, SAP SLT reconciliation, quarantine patterns, CI/CD via GitHub Actions + Terraform).”
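For context on the schema-drift protection and quarantine patterns named above, a minimal Python sketch of the general idea follows; the expected schema, column names, and `split_batch` helper are illustrative assumptions, not details of the candidate's framework.

```python
EXPECTED_SCHEMA = {"id": int, "amount": float, "updated_at": str}  # assumed

def validate_row(row: dict) -> list[str]:
    """Return drift problems for one row (empty list = clean)."""
    problems = []
    for col, typ in EXPECTED_SCHEMA.items():
        if col not in row:
            problems.append(f"missing column: {col}")
        elif not isinstance(row[col], typ):
            problems.append(f"type drift on {col}: {type(row[col]).__name__}")
    for col in row.keys() - EXPECTED_SCHEMA.keys():
        problems.append(f"unexpected column: {col}")  # new upstream column
    return problems

def split_batch(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Route clean rows onward; send drifted rows to a quarantine table."""
    clean, quarantined = [], []
    for row in rows:
        issues = validate_row(row)
        if issues:
            quarantined.append({"row": row, "issues": issues})
        else:
            clean.append(row)
    return clean, quarantined

if __name__ == "__main__":
    batch = [
        {"id": 1, "amount": 9.5, "updated_at": "2024-01-01"},
        {"id": 2, "amount": "oops", "updated_at": "2024-01-01", "extra": 1},
    ]
    clean, bad = split_batch(batch)
    print(f"{len(clean)} clean, {len(bad)} quarantined:", bad[0]["issues"])
```

Quarantined rows stay queryable for repair and replay instead of silently breaking the downstream load.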
Senior Product Manager specializing in 0–1 platforms, AI workflows, and product operations
“Customer success/product-facing operator from AllRoomie (AI-powered rental marketplace) who owned an enterprise real estate brokerage account end-to-end. Improved conversion and adoption by automating lead intake/routing and fixing integration issues, citing ~40% less manual coordination, faster response times, and expansion into additional regions through ROI-driven land-and-expand motions.”
Mid-level Data Engineer specializing in cloud ETL and financial data platforms
“Data engineer with experience at Capital One and HSBC building and operating GCP-based data platforms. Led an end-to-end Oracle-to-BigQuery migration processing ~200–300GB/day using Dataflow/Beam, Airflow, Dataproc/PySpark, and Looker, achieving ~99.5% pipeline success and ~30% fewer data quality issues. Strong in production reliability, schema drift handling for external APIs, and BigQuery performance/serving patterns (materialized views, authorized views, versioned datasets).”
IT and cybersecurity intern with data and Python skills
“Internship experience at Arkema and Proscia focused on improving onboarding and internal automation workflows. Built SQL-based processes for computer onboarding and security compliance checks, redesigned cybersecurity onboarding for different departments, and created templated setup instructions with GitHub-based review safeguards.”
Staff Software Engineer / Technical Architect specializing in cloud data platforms and GenAI agents
“Core engineer on the small team building Promethium’s ‘Mantra’ next-gen agentic text-to-SQL engine, using vector DB + LangGraph tooling and SQL validation/evaluation to improve query accuracy. Experienced in diagnosing production LLM workflow failures via LangSmith traces and in running hands-on developer workshops and pre-sales POCs with live debugging and real customer data.”
Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps
“Built production LLM systems including a real-time customer feedback analysis and workflow automation platform using RAG and multi-agent orchestration with confidence-based human escalation, addressing privacy and legacy integration challenges. Also automated ML operations with Airflow/Kubernetes (e.g., daily churn model retraining) cutting retraining time to under 30 minutes, and demonstrates a rigorous testing/monitoring approach plus strong non-technical stakeholder collaboration.”
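Confidence-based human escalation, as described in this profile, typically means gating automated actions on a model confidence score; a hedged sketch follows, where the `0.85` threshold and the `ReviewQueue` stand-in are assumptions.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.85  # hypothetical cutoff, tuned per workflow

@dataclass
class ReviewQueue:
    """Stand-in for a real ticketing/queue system."""
    items: list = field(default_factory=list)

    def enqueue(self, item: dict) -> None:
        self.items.append(item)

def handle_prediction(text: str, label: str, confidence: float,
                      queue: ReviewQueue) -> str:
    """Auto-handle confident results; escalate the rest to a human."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return f"auto-routed as '{label}'"
    queue.enqueue({"text": text, "suggested": label, "confidence": confidence})
    return "escalated to human review"

if __name__ == "__main__":
    q = ReviewQueue()
    print(handle_prediction("refund request", "billing", 0.93, q))
    print(handle_prediction("ambiguous note", "other", 0.41, q))
    print(len(q.items), "item(s) awaiting review")
```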
Mid-level Data Engineer specializing in cloud ETL/ELT and healthcare analytics
“Healthcare-focused data engineer/ML practitioner with experience at Lightbeam Health Solutions and Humana building production entity-resolution and semantic similarity pipelines across EMR, lab, and claims data. Uses NLP/ML (spaCy, scikit-learn, BioBERT/LightGBM) plus Snowflake/Airflow and vector search (Pinecone) to improve linkage accuracy (reported 90%) and semantic match quality (reported +12–15%), while reducing manual cleanup by 40%+.”
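As a rough illustration of the similarity-based record linkage this profile describes, here is a toy sketch; the Jaccard token heuristic stands in for the embedding/ML scorers (e.g., BioBERT) actually used, and all record fields are invented.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two name strings (0.0-1.0)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def link_records(emr: list[dict], claims: list[dict],
                 threshold: float = 0.6) -> list[tuple[dict, dict, float]]:
    """Pair each EMR record with its best-scoring claims record."""
    links = []
    for e in emr:
        best = max(claims, key=lambda c: jaccard(e["name"], c["name"]))
        score = jaccard(e["name"], best["name"])
        if score >= threshold:
            links.append((e, best, score))
    return links

if __name__ == "__main__":
    emr = [{"name": "John Q Smith"}]
    claims = [{"name": "Smith John"}, {"name": "Jane Doe"}]
    for e, c, s in link_records(emr, claims):
        print(e["name"], "<->", c["name"], f"({s:.2f})")
```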
Intern Software Engineer specializing in backend, cloud data platforms, and microservices
“Full-stack engineer who shipped a group scheduling SaaS feature with live availability updates using Next.js App Router + TypeScript, owning production reliability after launch (auth debugging, monitoring, polling/backoff tuning). Has hands-on experience with Postgres schema/index design and query optimization (EXPLAIN ANALYZE) and building durable orchestrated backend workflows with retries and idempotency.”
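The retry and idempotency patterns this profile names are standard; a minimal sketch under those assumptions follows, with the in-memory `_processed` set standing in for a durable store.

```python
import random
import time
import uuid

def with_retries(fn, max_attempts: int = 5, base_delay: float = 0.5):
    """Call fn, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))

_processed: set[str] = set()  # stand-in for a durable store (e.g. Postgres)

def process_once(idempotency_key: str, payload: dict) -> str:
    """Make the operation safe to retry: repeat keys are no-ops."""
    if idempotency_key in _processed:
        return "duplicate: skipped"
    _processed.add(idempotency_key)
    return f"processed {payload}"

if __name__ == "__main__":
    key = str(uuid.uuid4())
    print(with_retries(lambda: process_once(key, {"slot": "10:00"})))
    print(with_retries(lambda: process_once(key, {"slot": "10:00"})))
```

Together these make a workflow durable: retries absorb transient failures, and idempotency keys keep those retries from double-applying effects.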
Senior AI/ML Engineer specializing in Generative AI, RAG, and agentic systems
“GenAI/LLM ML engineer (currently at Webprobo) building an enterprise GenAI platform for document intelligence and automation on AWS, with blockchain components. Has hands-on experience with RAG, LLM evaluation tooling, and orchestrating production LLM workflows with Apache Airflow, plus deep exposure to reliability challenges in globally distributed/edge deployments. Also partnered with business/marketing stakeholders at a banking client to deliver an AI-driven customer retention insights solution.”
Engineering leader specializing in FinTech ML/AI platforms
“Engineering Manager/player-coach leading Data Infrastructure, ML/DS, and AI Engineering pods who recently shipped multiple production agentic GenAI features. Built privacy-preserving LLM workflows (PII redaction via Microsoft Presidio) and drove an AI expense-approval agent from ambiguous ask to GA, cutting approval time from ~2.5 days to <4 hours with >85% accuracy. Also owned a major LLM cost overrun incident and implemented cost observability plus circuit breakers to prevent runaway agent loops.”
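Cost circuit breakers against runaway agent loops, as mentioned above, usually combine a spend budget with a hard step cap; a hedged sketch follows, with made-up token prices and limits.

```python
class BudgetExceeded(RuntimeError):
    pass

class CostCircuitBreaker:
    """Track LLM spend per run and trip before it exceeds a budget."""
    def __init__(self, max_usd: float, usd_per_1k_tokens: float = 0.01):
        self.max_usd = max_usd
        self.rate = usd_per_1k_tokens
        self.spent = 0.0

    def record(self, tokens_used: int) -> None:
        self.spent += (tokens_used / 1000) * self.rate
        if self.spent > self.max_usd:
            raise BudgetExceeded(f"spent ${self.spent:.2f} > ${self.max_usd:.2f}")

def run_agent(breaker: CostCircuitBreaker, max_steps: int = 50):
    for step in range(max_steps):  # hard step cap: second guardrail
        tokens = 4000              # pretend each step costs 4k tokens
        breaker.record(tokens)     # raises once the budget is blown
        # ... call the LLM / tools here ...
    return "done"

if __name__ == "__main__":
    try:
        run_agent(CostCircuitBreaker(max_usd=0.50))
    except BudgetExceeded as e:
        print("circuit breaker tripped:", e)
```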
Senior Data Analyst specializing in data pipelines, web scraping, and legal data enrichment
“Data engineer focused on reliable, scalable analytics pipelines and external data collection. Has owned end-to-end pipelines processing 5–10M records/day, serving Snowflake data marts to Power BI/Tableau, and reports ~99% reliability through strong validation/monitoring. Also shipped versioned REST APIs for curated data with query optimization and caching.”
Senior Data & Backend Engineer specializing in cloud data pipelines and LLM/RAG systems
“Data engineer with end-to-end ownership of large-scale retail and clinical data ingestion/processing on AWS, including real-time streaming and batch pipelines. Delivered measurable outcomes: 20M daily transactions processed, latency cut from 4 hours to 5 minutes, ~70% fewer failures, and 120+ pipelines running at 99.8% reliability with full audit compliance.”
Mid-level Data Scientist specializing in Generative AI, MLOps, and cloud data platforms
“GenAI/ML engineer (CitiusTech) who has deployed production RAG systems for compliance/operations document Q&A, using Pinecone + FastAPI microservices on Kubernetes with strong monitoring and guardrails. Also built a GenAI-powered incident triage/routing solution in collaboration with non-technical stakeholders, achieving 35% faster response times and 40% fewer misclassified tickets, and has hands-on orchestration experience with Airflow and AutoSys.”
Mid-level Data Engineer specializing in big data pipelines and real-time streaming
“Data engineer who has owned end-to-end production pipelines processing a few million records/day, using Python/Airflow/SQL/PySpark with Snowflake serving to BI (Power BI). Built resilient external web data collection systems (anti-bot, schema-change detection, backfills) and shipped versioned REST APIs for internal consumers, improving pipeline success rates to 99% through monitoring, retries, and idempotent design.”
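One common form of the schema-change detection this profile mentions is fingerprinting the field names of scraped payloads; a minimal sketch, with invented field names, follows.

```python
import hashlib
import json

def schema_fingerprint(record: dict) -> str:
    """Stable hash of the record's field names (values are ignored)."""
    keys = json.dumps(sorted(record.keys()))
    return hashlib.sha256(keys.encode()).hexdigest()[:12]

def changed(record: dict, known: set[str]) -> bool:
    """True when a payload arrives with a never-before-seen shape."""
    fp = schema_fingerprint(record)
    if fp in known:
        return False
    known.add(fp)  # real pipeline: alert on-call / pause the load here
    return True

if __name__ == "__main__":
    known = {schema_fingerprint({"title": "", "price": 0})}  # baseline shape
    print(changed({"title": "a", "price": 10}, known))  # False: known shape
    print(changed({"title": "a", "cost": 10}, known))   # True: renamed field
```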
Mid-level Data Engineer specializing in cloud data platforms and governed analytics
“Data engineer with Optum experience building end-to-end healthcare data pipelines for HL7/FHIR, processing millions of records daily across Kafka streaming and Databricks/Spark batch. Strong focus on data quality (schema enforcement/validations), reliability (Airflow monitoring/alerts), and analytics-ready serving in Snowflake powering Power BI/Tableau, with CI/CD via Git and Jenkins.”
Mid-level Cloud Data Engineer specializing in Azure/AWS pipelines and medallion architecture
“Data engineer focused on reliability and data quality, owning end-to-end pipelines processing ~100k–300k records/day. Implemented robust validation and monitoring that cut reporting issues by ~30%, and built stable external data collection with anti-bot measures, backfills, and schema-change detection while maintaining backward-compatible internal data services.”
Mid-level Data Analyst specializing in business analytics and BI
“Analytics professional with higher education experience at the University of Dayton, focused on turning inconsistent operational data into standardized metrics and recurring dashboards. Combines SQL, Python, and Power BI to automate reporting and improve data integrity, cutting manual reporting work by 30%, with outputs adopted in semester planning and cross-department performance tracking.”
Junior Data Scientist / Big Data Engineer specializing in ML, LLMs, and analytics platforms
“Backend/data platform engineer who led a major redesign of a hybrid streaming+batch analytics platform processing 10+ TB/day (Airflow/Hive/BigQuery) with strong data-quality automation. Also built a production RAG PDF assistant with concrete mitigations for hallucinations and prompt injection (re-ranking, grounding, verifier step) and has deep experience executing low-risk migrations (dual-write, blue-green, rapid rollback) and implementing JWT-based row-level security.”
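The verifier step named above typically re-checks a generated answer against the retrieved passages before returning it; here is a toy sketch in which a token-overlap heuristic stands in for an LLM- or NLI-based verifier.

```python
def supported(answer_sentence: str, passages: list[str],
              min_overlap: float = 0.5) -> bool:
    """Crude grounding check: enough of the sentence's content words
    must appear in at least one retrieved passage."""
    words = {w.lower() for w in answer_sentence.split() if len(w) > 3}
    if not words:
        return True
    for passage in passages:
        p_words = {w.lower() for w in passage.split()}
        if len(words & p_words) / len(words) >= min_overlap:
            return True
    return False

def verify_answer(answer: str, passages: list[str]) -> str:
    """Return the answer only if every sentence is grounded."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if any(not supported(s, passages) for s in sentences):
        return "I couldn't ground part of that answer in the documents."
    return answer

if __name__ == "__main__":
    docs = ["The warranty period lasts twelve months from purchase."]
    print(verify_answer("The warranty period lasts twelve months.", docs))
    print(verify_answer("Refunds ship within three business days.", docs))
```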
Junior Data Engineer / Analyst specializing in AI/ML data infrastructure
“Built and deployed a compliance-sensitive LLM pipeline that extracts rebate logic from hospital–supplier medical contracts, using multi-layer redaction (regex/NER/dictionary), schema-validated structured outputs, and secure placeholder reinsertion. Hosted models on Amazon Bedrock to avoid retraining on sensitive data and improved both accuracy and cost by splitting the workflow into a lightweight section classifier plus a fine-tuned extraction model, orchestrated with LangChain and evaluated via layered, test-driven agent assessments.”
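To illustrate one of the three redaction layers described (the regex layer; the NER and dictionary layers are omitted), here is a minimal sketch of placeholder redaction and reinsertion; the patterns and placeholder format are assumptions.

```python
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace sensitive spans with placeholders; keep a reinsertion map."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            placeholder = f"[{label}_{i}]"
            mapping[placeholder] = match
            text = text.replace(match, placeholder)
    return text, mapping

def reinsert(text: str, mapping: dict[str, str]) -> str:
    """Restore originals after the model has processed the redacted text."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

if __name__ == "__main__":
    safe, m = redact("Contact jo@hospital.org or 555-123-4567 for rebates.")
    print(safe)               # text the LLM actually sees
    print(reinsert(safe, m))  # original restored downstream
```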
Senior Software Engineer specializing in AI-driven marketing and data platforms
“Backend/data engineer who builds production FastAPI microservices and AWS serverless/Glue pipelines for SMS analytics and marketing segmentation. Led a legacy batch modernization into modular services (FastAPI + Glue/Athena + ClickHouse) using shadow-mode parity checks, feature flags, and incremental rollout. Demonstrated measurable performance wins (12s to sub-second SQL; ~40% CPU reduction) and strong incident ownership with proactive schema-drift prevention.”
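Shadow-mode parity checking, as named above, runs the new path alongside the legacy one and logs mismatches without affecting served results; a hedged sketch follows, with hypothetical `legacy_segment`/`new_segment` functions (the deliberate `>=` difference exists only to show a mismatch being caught).

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("parity")

def legacy_segment(user: dict) -> str:
    return "high_value" if user["spend"] > 100 else "standard"

def new_segment(user: dict) -> str:
    # intentional boundary difference, so the demo logs a mismatch
    return "high_value" if user["spend"] >= 100 else "standard"

def handle_request(user: dict) -> str:
    result = legacy_segment(user)  # legacy result is still what gets served
    shadow = new_segment(user)     # new path runs in shadow only
    if shadow != result:
        log.warning("parity mismatch for user=%s: %s vs %s",
                    user["id"], result, shadow)
    return result

if __name__ == "__main__":
    for u in [{"id": 1, "spend": 100}, {"id": 2, "spend": 250}]:
        handle_request(u)
```

Once mismatch rates drop to zero over a representative traffic window, a feature flag can cut real traffic over incrementally.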
Mid-level Machine Learning Engineer specializing in LLM systems and healthcare data automation
“React performance-focused engineer who contributed performance patches back to an open-source context+reducer state helper after profiling and fixing excessive re-renders in an enterprise project management platform at Easley Dunn Productions. Also built an end-to-end LLM-driven pipeline at Prime Healthcare to normalize millions of supply-chain records, reducing defects by 80% and saving 160+ hours/month.”
Mid-level Data & AI Engineer specializing in healthcare data pipelines and MLOps
“Built and deployed a production LLM-powered clinical note summarization system used by care managers to speed review of 5–20 page unstructured medical records. Implemented safety-focused validation (prompt constraints, rule-based and section-level checks, human-in-the-loop) to reduce hallucinations while maintaining low latency and meeting privacy/regulatory constraints, integrating via APIs into existing clinical tools.”
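The rule-based and section-level checks this profile describes might look like the following toy sketch, which requires mandatory sections and flags numbers absent from the source; the section names and heuristics are assumptions, not the deployed system.

```python
import re

REQUIRED_SECTIONS = ["medications", "diagnoses"]  # assumed, not the real list

def missing_sections(summary: str) -> list[str]:
    """Section-level check: every required section must be represented."""
    return [s for s in REQUIRED_SECTIONS if s not in summary.lower()]

def phantom_numbers(summary: str, source: str) -> list[str]:
    """Rule-based check: numbers in the summary must appear in the source."""
    source_nums = set(re.findall(r"\d+(?:\.\d+)?", source))
    return [n for n in re.findall(r"\d+(?:\.\d+)?", summary)
            if n not in source_nums]

def validate(summary: str, source: str) -> bool:
    missing = missing_sections(summary)
    phantom = phantom_numbers(summary, source)
    if missing or phantom:
        # a production system would route this to human-in-the-loop review
        print("needs review:", {"missing": missing, "phantom": phantom})
        return False
    return True

if __name__ == "__main__":
    src = "Diagnoses: hypertension. Medications: lisinopril 10 mg daily."
    print(validate("Diagnoses: hypertension. Medications: lisinopril 10 mg.", src))
    print(validate("Patient takes 40 mg of lisinopril.", src))
```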
Mid-level AI/ML Engineer specializing in Generative AI and NLP
“AI/LLM engineer with production experience building secure, scalable compliance-focused generative AI systems (GPT-3/4, BERT) including RAG over internal regulatory document bases. Has delivered end-to-end pipelines on AWS with PySpark/Airflow/Kubernetes/FastAPI, emphasizing privacy controls, monitoring, and iterative evaluation (A/B testing). Also partnered closely with bank compliance officers using prototypes to refine NLP summarization/classification and reduce document review time.”
Mid-level Data Engineer specializing in scalable ETL, streaming analytics, and cloud data platforms
“At Dreamline AI, built and productionized an AWS-based incentive intelligence platform that uses Llama-2/GPT-4 to extract eligibility rules from unstructured state policy documents into structured JSON, then processes them with Glue/PySpark and serves results via Lambda/SageMaker/API Gateway. Designed state-specific ingestion connectors plus schema validation and automated checks/alerts to handle frequent policy/format changes without breaking the pipeline, and partnered with business/analytics stakeholders to deliver interpretable eligibility decisions via explanations and dashboards.”
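Schema validation of LLM-extracted JSON, as described above, commonly means rejecting any output that fails a declared contract before it enters the pipeline; a minimal sketch with invented field names follows.

```python
import json

SCHEMA = {  # field -> required Python type(s); illustrative only
    "program": str,
    "state": str,
    "max_income_usd": (int, float),
    "eligible": bool,
}

def validate_extraction(raw: str) -> dict:
    """Parse model output and fail loudly on drift instead of loading it."""
    data = json.loads(raw)  # raises on malformed JSON
    for field, typ in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], typ):
            raise ValueError(f"bad type for {field}: {data[field]!r}")
    return data

if __name__ == "__main__":
    ok = ('{"program": "EV rebate", "state": "NY", '
          '"max_income_usd": 80000, "eligible": true}')
    print(validate_extraction(ok)["program"])
    try:
        validate_extraction('{"program": "EV rebate", "state": "NY"}')
    except ValueError as e:
        print("rejected:", e)
```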