Vetted Data Engineers in the NYC Metro

Pre-screened and vetted in the NYC Metro.

Rom Manzano

Screened

Executive technical founder and full-stack engineer specializing in AI, SaaS, and FinTech

New York, NY · 15y exp
1848V · UC Berkeley

Engineer coming out of a venture studio as it winds down, now seeking another zero-to-one environment with strong studio support and go-to-market playbooks. They show a thoughtful founder mindset centered on rapid shipping, design-partner validation, lean execution, and testing whether users will actually pay for a workflow-specific solution.


Thuc Duong

Screened

Senior Data Engineer specializing in AI-driven GTM analytics and LLM evaluation

Long Island City, NY · 5y exp
Meta · Temple University

Data/analytics engineer who stood up foundational pipelines and services at Meta for the Ray-Ban Meta launch—building a retailer sales ingestion system (S3/Hive) with rigorous DQ checks, 1-day SLAs, and dimensional rollups used by GTM to track sales trends. Also built a modular multi-retailer web-scraping system for out-of-stock alerts and shipped internal GraphQL APIs and an n8n-like workflow builder using serverless (AWS Lambda) with strong testing and observability practices.
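The DQ-check-plus-dimensional-rollup pattern described above can be sketched in plain Python. This is an illustrative sketch only; the field names, checks, and retailers are assumptions, not details of the candidate's actual Meta system:

```python
from collections import defaultdict
from datetime import date

def dq_check(row):
    """Basic data-quality gate: reject rows that would corrupt rollups."""
    return (
        row.get("retailer") is not None
        and row.get("sale_date") is not None
        and isinstance(row.get("units"), int)
        and row["units"] >= 0
    )

def daily_rollup(rows):
    """Aggregate valid rows into (retailer, sale_date) -> total units,
    the kind of dimensional summary a GTM team would track."""
    valid = [r for r in rows if dq_check(r)]
    rejected = [r for r in rows if not dq_check(r)]
    totals = defaultdict(int)
    for r in valid:
        totals[(r["retailer"], r["sale_date"])] += r["units"]
    return dict(totals), rejected

rows = [
    {"retailer": "BestBuy", "sale_date": date(2024, 5, 1), "units": 10},
    {"retailer": "BestBuy", "sale_date": date(2024, 5, 1), "units": 5},
    {"retailer": None, "sale_date": date(2024, 5, 1), "units": 3},  # fails DQ
]
totals, rejected = daily_rollup(rows)
# totals[("BestBuy", date(2024, 5, 1))] == 15, with one rejected row
```

Running the DQ gate before aggregation means a bad upstream row surfaces as a rejection count rather than a silently wrong sales trend.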

TZ

Mid-level Data Engineer specializing in big data platforms and analytics infrastructure

New York, NY · 7y exp
Meta · University of Illinois Chicago
JV

Staff-level Software Engineer specializing in AI, data platforms, and cloud infrastructure

New York, NY · 8y exp
GrowthLoop · Carnegie Mellon University

Sanketh Reddy

Screened

Senior Data Engineer specializing in cloud data platforms and large-scale ETL

Jersey City, NJ · 6y exp
JPMorgan Chase · University of Texas at Dallas

Data engineer focused on large-scale ETL/ELT pipelines across cloud stacks (GCP and AWS), including Spark-based transformations and orchestration with Airflow. Has experience loading up to ~2TB per BigQuery target table and designing atomic loads to multiple downstream systems (Elasticsearch + Kafka), with Kubernetes deployment and Jenkins CI/CD.
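The "atomic loads to multiple downstream systems" pattern mentioned above can be sketched with in-memory stand-ins. The `Sink` class is a hypothetical placeholder for a real client (e.g. Elasticsearch or a Kafka producer), and real deployments would need transactional writes or index aliases rather than this simplified stage-then-publish:

```python
class Sink:
    """In-memory stand-in for a downstream system (search index, topic, etc.)."""
    def __init__(self, name, fail=False):
        self.name, self.fail = name, fail
        self.staged, self.published = None, []

    def stage(self, batch):
        if self.fail:
            raise IOError(f"staging to {self.name} failed")
        self.staged = batch

    def publish(self):
        self.published.extend(self.staged)
        self.staged = None

def atomic_load(batch, sinks):
    """Stage the batch to every sink first; publish only if all stages succeed.
    If any stage fails, nothing publishes, so downstream systems stay in sync."""
    try:
        for s in sinks:
            s.stage(batch)
    except IOError:
        for s in sinks:
            s.staged = None  # roll back anything already staged
        return False
    for s in sinks:
        s.publish()
    return True
```

With one failing sink, no sink publishes; with all healthy sinks, every sink sees the same batch.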

SM

Mid-level AI/ML Engineer specializing in Generative AI and enterprise machine learning

New York, NY · 4y exp
Broadcom · University of Central Missouri

Rahul Reddy

Screened

Senior Data Engineer specializing in cloud data platforms and big data pipelines

New York, NY · 6y exp
CVS Health · Southern Arkansas University

Data engineer with healthcare (CVS Health) experience who migrated production PySpark workloads to native BigQuery SQL and built a Great Expectations-based validation microservice on GKE (Flask + REST) integrated into Cloud Composer. Has operated high-volume pipelines (~300–400GB/day) and designed external vendor ingestion on AWS (Lambda/Step Functions/Glue) with schema-drift detection, alerting, and backfill-safe controls to protect downstream Snowflake/BigQuery tables.
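The schema-drift detection mentioned above can be sketched as a comparison of incoming records against an expected contract, quarantining drifted records instead of letting them reach the warehouse. The field names and types here are illustrative assumptions, not the candidate's actual vendor schema:

```python
EXPECTED_SCHEMA = {"member_id": str, "claim_amount": float, "service_date": str}

def detect_drift(record):
    """Return a list of drift issues: missing fields, type mismatches,
    or unexpected new fields relative to the expected contract."""
    issues = []
    for field, ftype in EXPECTED_SCHEMA.items():
        if field not in record:
            issues.append(f"missing:{field}")
        elif not isinstance(record[field], ftype):
            issues.append(f"type:{field}")
    for field in record:
        if field not in EXPECTED_SCHEMA:
            issues.append(f"unexpected:{field}")
    return issues

def split_batch(batch):
    """Route clean records onward; quarantine drifted ones for alerting."""
    clean, quarantined = [], []
    for rec in batch:
        (quarantined if detect_drift(rec) else clean).append(rec)
    return clean, quarantined
```

Quarantining rather than dropping keeps the pipeline backfill-safe: once the contract is updated, the quarantined batch can be replayed.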

RH

Senior Data & AI/ML Engineer specializing in LLM/NLP platforms and cloud data engineering

Bronx, NY · 11y exp
CBRE · NYU
RP

Mid-level Data Engineer specializing in LLM agents, RAG pipelines, and LLMOps

New York, US · 6y exp
mcSquared AI · University at Buffalo
PP

Senior Data Engineer specializing in Cloud Data Platforms and Generative AI

Brooklyn, NY · 11y exp
JPMorgan Chase · Osmania University
SP

Junior AI/ML Software Engineer specializing in LLMs and data-intensive systems

New York, NY · 3y exp
NYU Langone Health · NYU
KP

Mid-level Data Engineer specializing in GCP, Spark, and healthcare analytics

New York, NY · 3y exp
CVS Health · Columbia University
SG

Mid-level Data Engineer specializing in streaming and cloud data platforms for financial services

Edison, NJ · 3y exp
Morgan Stanley · Pace University

Data engineering-focused candidate (internship/project experience) who built end-to-end pipelines processing a few million transactional records/day for fraud detection and reporting, using Airflow, Python/SQL, and PySpark with strong emphasis on data quality gates, idempotency, and monitoring. Also implemented an external web/API data collection system with anti-bot tactics and schema-change quarantine, and shipped a versioned Flask API to serve curated warehouse data.
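The idempotency-plus-quality-gate emphasis above can be sketched as a loader keyed by batch id, so a retried Airflow run never double-counts. The warehouse is simulated with a dict, and the checks are illustrative assumptions rather than the candidate's actual rules:

```python
def quality_gate(records, min_rows=1):
    """Fail the batch before load if it is empty or has null transaction ids."""
    if len(records) < min_rows:
        raise ValueError("batch too small")
    if any(r.get("txn_id") is None for r in records):
        raise ValueError("null txn_id in batch")

def idempotent_load(warehouse, batch_id, records):
    """Overwrite-by-batch_id semantics: rerunning the same batch (an Airflow
    retry or backfill) replaces its rows instead of duplicating them."""
    quality_gate(records)
    warehouse[batch_id] = list(records)  # overwrite, never append
    return sum(len(v) for v in warehouse.values())

warehouse = {}
idempotent_load(warehouse, "2024-05-01", [{"txn_id": 1}, {"txn_id": 2}])
idempotent_load(warehouse, "2024-05-01", [{"txn_id": 1}, {"txn_id": 2}])  # retry
# total row count stays 2, not 4
```

In a real warehouse the same effect usually comes from delete-then-insert on the partition key or `MERGE` statements.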


Zhiwen Zhao

Screened

Junior Data Engineer specializing in cloud ETL and big data platforms

New York, NY · 3y exp
Bank of China · NYU

Data engineer focused on transit/transportation datasets, building Spark-based pipelines that ingest from Oracle/APIs, apply PySpark data-quality fixes, and publish star-schema fact tables to Azure Data Lake. Experienced troubleshooting complex Spark failures (using checkpointing to manage long lineage) and operating Airflow-driven backfills and GitLab CI deployments for production DAGs.

SK

Mid-level Data Engineer specializing in cloud ETL and analytics

New York, NY · 7y exp
Verizon · New England College
NN

Senior Data Engineer specializing in cloud ELT/ETL and data warehousing

New York, NY · 4y exp
American Express · Weber State University
TS

Mid-level Data Engineer specializing in lakehouse and cloud data platforms

New York, NY · 3y exp
Invisible Technologies · Rutgers University–New Brunswick
RR

Junior Data Scientist specializing in analytics automation and BI dashboards

Newark, NJ · 2y exp
Public Service Enterprise Group · Boston University
SR

Mid-level Data Engineer specializing in Azure data platforms and near real-time pipelines

New York, USA · 4y exp
ServiceNow · University of Missouri-Kansas City
CK

Mid-level Data Engineer specializing in financial data engineering and scalable pipelines

Jersey City, NJ · 4y exp
JPMorgan Chase
PV

Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps

New York City, NY · 6y exp
Avanade · University of North Texas

Built a production AI-driven contract/document extraction system combining OCR, normalization, and LLM schema-guided extraction, orchestrated with PySpark and Azure Data Factory and loaded into PostgreSQL for analytics. Emphasizes reliability at scale—using strict JSON schemas, confidence scoring, targeted retries, and multi-layer validation to control hallucinations while processing thousands of PDFs per hour—and partners closely with non-technical business teams to refine fields and deliver usable dashboards.
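The validate-and-retry approach described above can be sketched as follows. `call_llm` is a hypothetical stand-in for the model call, and the schema fields and confidence threshold are illustrative assumptions, not the system's real contract:

```python
import json

SCHEMA_FIELDS = {"party": str, "effective_date": str, "amount": float}

def validate(payload):
    """Reject anything that is not valid JSON matching the strict schema."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if set(data) != set(SCHEMA_FIELDS) | {"confidence"}:
        return None
    if not all(isinstance(data[f], t) for f, t in SCHEMA_FIELDS.items()):
        return None
    return data

def extract(document, call_llm, min_confidence=0.8, max_retries=2):
    """Call the (stand-in) LLM, validate against the schema, and retry a
    bounded number of times; low-confidence results are flagged for review."""
    for _ in range(max_retries + 1):
        data = validate(call_llm(document))
        if data is not None:
            data["needs_review"] = data["confidence"] < min_confidence
            return data
    return {"needs_review": True, "error": "extraction failed"}
```

Rejecting malformed output and bounding retries is what keeps hallucinated fields out of the PostgreSQL tables: anything that never validates lands in a review queue instead of the analytics layer.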

