Senior Platform Engineer I, AI Evaluation (24 months fixed-term)

Khan Academy

Location

Mountain View, CA / Remote (Continental US + Hawaii + Canada Only)

Salary

$137,871 - $172,339 USD

Posted

0d ago

Full-time, Other

About the Role

ABOUT KHAN ACADEMY

Khan Academy is a nonprofit with the mission to deliver a free, world-class education to anyone, anywhere. Our proven learning platform offers free, high-quality supplemental learning content and practice that cover Pre-K - 12th grade and early college core academic subjects, focusing on math and science. We have over 181 million registered learners globally and are committed to improving learning outcomes for students worldwide, focusing on learners in historically under-resourced communities.

OUR COMMUNITY 

Our students, teachers, and parents come from all walks of life, and so do we. Our team includes people from academia, traditional/non-traditional education, big tech companies, and tiny startups. We hire great people from diverse backgrounds and experiences because it makes our company stronger. We value diversity, equity, inclusion, and belonging as necessary to achieve our mission and impact the communities we serve. We know that transforming education starts in-house with learning about ourselves and our colleagues. We strive to be world-class in investing in our people and commit to developing you as a professional.

 

THE ROLE

We’re looking for an AI Platform Engineer to evolve and extend our internal evaluation framework for assessing the quality of our AI-driven experiences at Khan Academy. This engineer will have worked with enough eval systems to quickly make sense of Khan's internal eval framework and recognize opportunities for improvement. This is largely a software development role, but domain experience with AI eval is essential for appreciating the hill-climbing and data science workflows we need to support. Soft skills will be important for gathering internal requirements, getting buy-in for changes, and then developing documentation and training materials. You’ll work closely with ML data engineers and platform developers to help internal teams adopt an eval-driven development process incorporating offline benchmark tests and online experiments.

 

As a Platform Engineer focused on evaluation, you’ll be expected to:

  • Be fluent in the range of offline and online evaluation strategies, and know when to apply each technique over the development lifecycle
  • Have intuitions about how to specify eval pipelines succinctly using declarative syntax
  • Understand the role of stratified datasets and ground truth labeling
  • Appreciate the range of eval scoring schemes, from human raters to automated LLM-as-judge approaches

We are a remote-first organization, and we strive to build with the technology best suited to solving problems for our learners. Currently, we build with Go, GraphQL, JavaScript, React & React Native, and Redux, and we adopt new technologies like LLMs when they’ll help us better achieve our goals. At Khan, one of our values is “Cultivate Learning Mindsets”, so it’s important to us that we work with all of our engineers to match the right opportunity to the right individual, ensuring every engineer is operating at their “learning edge”.

 

Currently, we are focused on providing equitable solutions to historically under-resourced communities of learners and teachers, and we are guided by our Engineering Principles. You can read about our latest work on our Engineering Blog.

WHAT YOU BRING

Required

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, related field, or equivalent professional experience.
  • 5 years of software engineering experience, including significant time working on the evaluation of generative AI systems or other evaluations of ML model quality
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Familiarity with the architecture of large language models and their industry-standard APIs

 

Preferred

  • Experience with labeling platforms (e.g., Label Studio, Scale AI, Toloka) and human-in-the-loop concerns such as rubric development and inter-rater agreement
  • Exposure to MLOps practices such as model registry, feature store, or continuous evaluation
  • Background in education technology or other human-centered AI applications

 

PERKS AND BENEFITS

We may be a nonprofit, but we reward our talented team extremely well! We offer:

  • Competitive salaries
  • Ample paid time off as needed – Your well-being is a priority
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture that caters to your time zone, with open flexibility as needed
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • And we offer all those other typical benefits as well: 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life

 

At Khan Academy, we are committed to fair and equitable compensation practices, the well-being of our employees, and our Khan community. This commitment is why we have built a robust Total Rewards package that includes competitive base salaries and extensive benefits and perks to support physical, mental, and financial well-being.

The compensation band for this role is $137,871 - $172,339 USD annually for candidates based in the United States and $186,306 - $232,883 CAD annually for candidates based in Canada.

The pay range for this position is a general guideline only. The salary offered will depend on internal pay equity and the candidate’s relevant skills, experience, qualifications, and job market data. Additional incentives are provided as part of the complete total rewards package, in addition to comprehensive medical and other benefits.

 

As part of our hiring process, we use a secure identity verification service through CLEAR® (in partnership with Greenhouse) to confirm that each applicant is who they claim to be. CLEAR® provides a safe, consistent way to confirm identity, helping protect both applicants and the company from impersonation or fraud.

MORE ABOUT US

OUR COMPANY VALUES

Live & breathe learners

We deeply understand and empathize with our users. We leverage user insights, research, and experience to build content, products, services, and experiences that our users trust and love. Our success is defined by the success of our learners and educators.

Take a stand

As a company, we have conviction in our aspirational point of view of how education will evolve. The work we do is in service to moving towards that point of view. However, we also listen, learn and flex in the face of new data, and commit to e


Job sourced from Greenhouse (Khan Academy)
