Data Engineer — New York City

About Knotch

Knotch is the Content Intelligence Platform that enables communications and marketing teams to measure, understand, and optimize content ROI across their digital efforts. We work closely with the following verticals: financial (including JP Morgan Chase & Co., Capital One, Citi, TD, Discover, Ally, Bank of America, and Synchrony), automotive (including Ford), telecommunications (including AT&T and Sprint), retail (including Walmart), technology (including Salesforce, HP Inc, HP Enterprise, Citrix, and AWS), and consulting (including Deloitte and PwC). We are currently expanding quickly into the insurance, fashion, and health verticals.

At Knotch, we:

  1. Enable brands to become better content creators through real-time, independent measurement and optimization across on- and off-property content investments.
  2. Activate content to move customers to high-value actions.
  3. Turn content into a first-party audience data source that flows into downstream marketing platforms (CRM, DMP, CDP, etc.).

We’re proudly based in SoHo, NYC, and we have a stellar team.

Knotch Engineering Team

Engineering is the cornerstone of our organization, and we work hard every day to build the most impactful products possible. We love to experiment, find deep joy in product iteration, and achieve stability through thoughtful architecture and testing, all while monitoring our performance and progress at every step.

Knotch’s founding mission has always been to improve the advertising and marketing industries in a lasting and meaningful way. Transparency through data is our ethos and something every member of our company takes seriously. We are looking for highly motivated engineers who are passionate about data and eager to transform an industry to join us on our journey.

At Knotch, data is at the heart of what we do. We receive millions of web requests a day, process insights in real time, and store gigabytes’ worth of information. Well-designed architectures, simple yet effective software implementations, and smart execution have been the keys to our success thus far. We’re looking for engineers who embrace these aspects and who are eager to contribute at all levels of our products.

Data Engineer

Working under our Lead Data Engineer, you will write the algorithms for the most prominent features of our analytics products. Your quantitative mindset will steer you toward solving complex problems in collecting, storing, and accessing all of Knotch’s data.

We Will Trust You to

  • Independently create and maintain ETL processes using Spark
  • Create and maintain extraction libraries for text, images, and video
  • Maintain data integrity in every facet of our pipeline
  • Relentlessly clean, fix, and aggregate our data
  • Handle all data jobs in Wisdom Stream, a prominent feature

You Should Bring to the Table

  • Proficiency in Python (complemented by PySpark)
  • At minimum, internship or co-op experience building with distributed computing and orchestration frameworks such as Apache Spark and Airflow
  • Experience writing and productionizing complex data transformations in SQL
  • Experience designing and building ETL pipelines from various input sources
  • Experience working in cloud infrastructure (e.g. AWS services such as S3, EC2, EMR, Lambda, and Redshift)
  • Experience with data extraction (e.g. text, images, video) from HTML documents
  • Exposure to data science areas and algorithms such as the following would be a big plus:
    • Anomaly Detection
    • Machine Learning 
    • NLP
    • Data Mining
    • Deep Learning
    • AI


What We Offer

  • Unlimited PTO and flexible work-from-home
  • Comprehensive health insurance
  • Free lunch while in-office
  • Unlimited monthly MetroCard
  • Mental health and learning & development programs

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.