Pre-screened and vetted candidates in the NYC metro area.
Mid-level Data Engineer specializing in multi-cloud real-time and batch data pipelines
“Data engineer with healthcare domain experience who owned 100M+ record pipelines end-to-end (Kafka/Kinesis/ADF → PySpark/dbt validation → Spark SQL transforms → Snowflake/Power BI serving). Built production-grade reliability practices (Airflow orchestration, CloudWatch/Grafana monitoring, pytest + contract/regression tests, idempotent ingestion/backfills) and delivered measurable improvements: 35% lower latency and 40% better query performance.”
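The “idempotent ingestion/backfills” practice named above is typically an upsert keyed on a record’s natural key, so replaying a batch or backfill never duplicates rows. A minimal Python sketch under that assumption (the in-memory `store` and `upsert_batch` names are illustrative, not from the profile):

```python
from typing import Dict, Iterable


def upsert_batch(store: Dict[str, dict], records: Iterable[dict],
                 key: str = "record_id") -> Dict[str, dict]:
    """Idempotent load: each record overwrites any prior version with the
    same natural key, so replaying a batch (e.g. a backfill) is a no-op."""
    for rec in records:
        store[rec[key]] = rec  # last-write-wins on the natural key
    return store


# Replaying the same batch leaves the store unchanged.
batch = [{"record_id": "a1", "value": 10}, {"record_id": "a2", "value": 20}]
store: Dict[str, dict] = {}
upsert_batch(store, batch)
upsert_batch(store, batch)  # second run is a harmless replay
```

In a warehouse the same idea is usually a `MERGE` on the natural key rather than an in-memory dict.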
Mid-level Data Engineer specializing in capital markets post-trade data platforms
“Data/streaming engineer in capital markets who led an end-to-end trade settlement data product (Kafka→MongoDB→data lake) with rigorous data-quality logic and ~$175K first-year operational impact. Also built a low-latency Go-based CME market data engine feeding SOFR curve generation, using MSK on EKS with performance tuning (idempotency, compression, partitioning) to achieve sub-100ms delivery.”
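The “partitioning” tuning mentioned above generally means keying messages by instrument so each instrument’s updates stay ordered within one Kafka partition. A hedged sketch of that key-to-partition mapping (a stable hash standing in for Kafka’s default partitioner; the symbol `SOFR-3M` is illustrative):

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map a message key (e.g. an instrument symbol)
    to a partition, so all updates for one instrument stay ordered."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# The same key always lands on the same partition; different keys spread out.
p1 = partition_for("SOFR-3M", 12)
p2 = partition_for("SOFR-3M", 12)
```

Per-key ordering plus idempotent producers is what makes sub-100ms delivery safe to retry without reordering or duplicating ticks.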
Mid-level Azure Data Engineer specializing in Databricks lakehouse and Spark pipelines
Principal Cloud Data Engineering Leader specializing in lakehouse and streaming platforms
Mid-level Data Engineer specializing in cloud ETL/ELT and analytics platforms
Senior AI/ML Engineer specializing in Python, LLMs, and agentic AI on cloud platforms
Senior Data Engineer specializing in Azure Lakehouse and LLM/ML data platforms
Mid-level Data Engineer specializing in cloud ETL, big data, and analytics
Mid-level Data Engineer specializing in cloud ETL and real-time streaming
“Data engineer focused on AWS + Spark/Databricks pipelines, including an end-to-end nightly loan-data ingestion flow (~2.2M records) from Postgres/S3 through Glue and Databricks into a DWH with layered validation and alerting. Also built real-time streaming pipelines with Kafka + Spark Structured Streaming, including a master’s project that streamed Reddit data for sentiment analysis under ambiguous requirements and tight budget constraints.”
Mid-level Data Engineer specializing in cloud ETL/ELT and lakehouse architecture
“Data engineer focused on sales/marketing analytics pipelines, owning ingestion from CRMs/ad platforms through warehouse serving and dashboards at ~hundreds of thousands of records/day. Built reliability-focused systems including dbt/SQL/Python data quality gates with alerting, a resilient web-scraping pipeline (retries/backoff, anti-bot tactics, schema-change detection, backfills), and a versioned internal REST API with caching and strong developer usability.”
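The “retries/backoff” resilience pattern in the scraping pipeline above is usually exponential backoff with jitter. A minimal sketch, assuming a generic zero-argument `fetch` callable (names are illustrative, not from the profile):

```python
import random
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry a flaky fetch with exponential backoff plus full jitter,
    re-raising only after the final attempt fails."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Full jitter spreads retries out to avoid thundering-herd bursts.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


# Example: a fetch that fails twice with a transient error, then succeeds.
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = fetch_with_backoff(flaky, sleep=lambda s: None)  # no real sleeping in the demo
```

Injecting `sleep` as a parameter keeps the helper unit-testable without slowing the test suite, which is the same testability concern the dbt/pytest quality gates address.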
Mid-level Data Engineer specializing in real-time streaming and cloud data platforms
“Data engineer with Wells Fargo experience owning an end-to-end lakehouse ETL pipeline on Databricks/Azure Data Factory, processing ~480GB daily and implementing robust data quality/reconciliation across 40+ tables to reach ~99.3% reliability. Strong in performance optimization (cut runtime 5.5h→3.8h), CI/CD and monitoring, and resilient external/API ingestion with retries, schema validation, and backfills.”
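Multi-table reconciliation like the ~99.3% figure above is commonly built on per-table row-count comparisons against a drift tolerance. An illustrative sketch (the `reconcile` helper and table names are hypothetical, not from the profile):

```python
def reconcile(source_counts: dict, target_counts: dict,
              tolerance: float = 0.0) -> dict:
    """Compare per-table row counts between a source system and the
    lakehouse target; return tables whose relative drift exceeds tolerance."""
    failures = {}
    for table, src in source_counts.items():
        tgt = target_counts.get(table, 0)
        drift = abs(src - tgt) / src if src else (1.0 if tgt else 0.0)
        if drift > tolerance:
            failures[table] = {"source": src, "target": tgt, "drift": drift}
    return failures


# "members" is off by 2% against a 1% tolerance, so it is flagged.
src = {"claims": 1_000_000, "members": 50_000}
tgt = {"claims": 1_000_000, "members": 49_000}
bad = reconcile(src, tgt, tolerance=0.01)
```

Production versions typically add per-column checksums on top of row counts, but the count-plus-tolerance gate is the usual first line of defense.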
Senior Backend/Cloud Developer specializing in Python and AWS-native data workflows
Mid-level Data Engineer specializing in cloud data pipelines and warehousing
Mid-level Data Engineer specializing in cloud ETL/ELT, Spark, and streaming pipelines
Mid-level Data Engineer specializing in cloud data platforms (AWS & GCP)
Mid-level Data Engineer specializing in cloud lakehouse and streaming analytics for financial services
Senior Data Engineer specializing in cloud data platforms and lakehouse architecture
Mid-level AI/Data Engineer specializing in LLM agents, RAG, and cloud data pipelines
Senior Lead Data Engineer specializing in cloud data platforms and real-time ML pipelines
Mid-level Data Analyst/Data Engineer specializing in machine learning and NLP
Mid-level Sales and Data Professional specializing in FinTech, telecom, and insurance
Mid-level Data Engineer specializing in cloud data pipelines and big data platforms
“Data engineer with ~4 years of experience building Python-based data ingestion/processing services and real-time streaming pipelines (Kafka/PubSub + Spark Structured Streaming). Deployed containerized data applications on Kubernetes with GitLab CI/Jenkins pipelines and applied GitOps to cut deployment time by ~40% while reducing config drift. Also supported a legacy on-prem data warehouse/backend migration to GCP, using phased migration and parallel validation to meet strict reliability/SLA requirements.”
Junior Data Engineer specializing in cloud ETL/ELT and lakehouse platforms