Work with the most ambitious teams.

Your single hub to explore opportunities at the best technology companies, backed by Blackbird.

Senior Research Engineer - Datasets

Canva

San Francisco, CA, USA
USD 220k-280k / year + Equity
Posted on Aug 20, 2025

Company Description

Join the team redefining how the world experiences design.

Hey, hello, g'day, mabuhay, kia ora, 你好, hallo, vítejte!

Thanks for stopping by. We know job hunting can be a little time consuming and you're probably keen to find out what's on offer, so we'll get straight to the point.

Where and how you can work

Our flagship office is in Sydney, Australia, but we've made our way from down under to a hub in San Francisco, which is now home to our US operations. We offer flexibility in how and where you work. We trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals.

Job Description

At Canva, our mission is to empower the world to design. To ensure our generative AI models are truly helpful, we are seeking a talented Research Engineer to build the foundational datasets and systems that fuel our next-generation generative AI models and evaluation capabilities.

About the role:

In this foundational role, you will be the expert on what truly fuels our models: data. You'll own the end-to-end lifecycle of our critical datasets, from curating nuanced human feedback on design quality to pioneering machine learning for high-quality synthetic data generation. This unique position calls for a data-first thinker with a strong design sensibility, passionate about building the high-quality ground truth at scale that will define the future of creativity by ensuring our models align with human taste and intent.

At the moment, this role is focused on:

  • Statistical Analysis & Insights: Applying robust statistical methodologies to analyze data, identify significant trends, and derive actionable insights. This includes hypothesis testing, regression analysis, and determining statistical significance to validate model performance and user preferences.

  • Human Feedback Data Curation: Owning the design, processing, cleaning, and strategic curation of large-scale, subjective human feedback on design quality, which is the lifeblood of our models.

  • Synthetic Data Generation: Using generative AI and machine learning techniques to create novel, high-quality synthetic data that augments our training sets and improves model capabilities.

  • Alignment Analysis & Evaluation Design: Designing methods to analyze outputs from both human and automated systems to deeply understand and measure our models' alignment with user preferences.

Primary Responsibilities:

  • Design and build scalable pipelines for processing and curating large datasets of human design feedback.

  • Research and develop ML models to generate high-quality synthetic data for training and fine-tuning.

  • Own the design and implementation of human evaluation workflows, including creating guidelines and quality rubrics.

  • Prepare datasets for automated evaluation systems and conduct deep analysis of their outputs to provide robust signals on model performance and human alignment.

  • Design and analyze experiments to measure the real-world impact of our models on design quality.

  • Conduct deep-dive analyses into model performance to identify failure modes and guide future development.

You’re probably a match if you have:

  • A strong aesthetic sense, with a background or demonstrated passion for visual design or human-computer interaction.

  • Strong proficiency in Python and ML frameworks (e.g., PyTorch, TensorFlow).

  • Extensive experience with designing and implementing large-scale data processing workflows using libraries like Pandas and data warehousing solutions such as Snowflake.

  • Solid understanding of statistical methods, including experimental design, A/B testing, and quality evaluation systems.

  • Experience with generative AI and synthetic data generation is highly desirable.

Nice to have:

  • Experience with cloud platforms (e.g., AWS, GCP, Azure) for data storage, processing, and MLOps related to dataset management.

  • Experience with MLOps practices and tools specifically for data versioning, lineage, and pipeline automation.

  • Ability to develop data visualization or data collection interfaces (e.g., TypeScript, Python).

Additional Information

What's in it for you?

Achieving our crazy big goals motivates us to work hard - and we do - but you'll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a stack of benefits to set you up for every success in and outside of work.

Here's a taste of what's on offer:

  • Equity packages - we want our success to be yours too
  • Health benefits plans to support you and your wellbeing
  • 401(k) retirement plan with company contribution
  • Inclusive parental leave policy that supports all parents & carers
  • An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more
  • Flexible leave options that empower you to be a force for good, take time to recharge, and support you personally

Check out lifeatcanva.com for more information.

Other stuff to know

We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.

At Canva, we value fairness, and we strive to provide competitive, market-informed compensation whilst ensuring internal equity within the team in each region. We make hiring decisions based on your skills, experience and our overall assessment of what we observed and learnt in the hiring process. The target base salary range for this position is $220,000 - $280,000. When calculating offers, we make salary decisions based on market data, your experience levels, and internal benchmarks of your peers in the same domain and job level.

Please note that interviews are predominantly conducted virtually.