We are seeking an experienced Application Management Specialist to join our team and to develop and maintain large-scale software solutions. The role draws on extensive software engineering expertise, particularly in Python, to manage large-scale applications and integrate machine learning components. This is a hybrid position requiring three days onsite per week in either Dallas, TX or New York City, NY. The successful candidate will implement production-grade ML systems and Large Language Models (LLMs) that drive business value.
The ideal candidate will be a technical leader capable of architecting complex data processing pipelines and model fine-tuning workflows. You will collaborate with global teams to design and launch LLM-based applications using retrieval-augmented generation (RAG), tool-using agents, and API integrations. While the primary focus is on technical execution, strong ownership and the ability to explain complex concepts simply to stakeholders are essential. This role offers the opportunity to work with modern cloud infrastructure, primarily within the AWS ecosystem, to deliver reliable and efficient software services.
Key Requirements
10+ years of professional software development experience in Python, C/C++, Go, or Java.
3+ years of experience designing, architecting, and launching production-level ML systems.
Deep expertise in building and maintaining large-scale Python applications.
Practical experience with Large Language Models (LLMs) including API integration and fine-tuning.
Proficiency in prompt engineering and building applications using RAG (Retrieval-Augmented Generation).
Understanding of commercial and open-source LLMs such as OpenAI's GPT models, Gemini, Llama, and Claude.
Solid grasp of applied statistics, core machine learning concepts, and data structures.
Ability to manage model deployment, serving, evaluation, and monitoring processes.
Strong analytical problem-solving skills with a high sense of ownership and urgency.
Effective communication skills for collaborating across diverse global teams.
Experience with cloud infrastructure, specifically AWS services like ECS, EKS, and Lambda.
Familiarity with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.