<div>About Ellucian</div>
Ellucian is a global market leader in education technology. We power innovation for higher education, partnering with more than 2,800 customers across 50 countries and serving over 20 million students. Ellucian's AI-powered platform, trained on the richest dataset available in higher education, drives efficiency, personalized experiences, and strengthened engagement for all students, faculty, and staff. Fueled by decades of experience with a singular focus on the unique needs of learning institutions, the Ellucian platform features best-in-class SaaS capabilities and delivers insights needed now and into the future. These solutions and services span the entire student lifecycle, from student recruitment, enrollment, and retention to workforce analytics, fundraising, and alumni engagement. Ellucian's innovative solutions, vast ecosystem of partners, and user community of more than 45,000 provide best practices that lead to greater institutional success and better student outcomes.
<div>Values Rooted in Purpose</div>
We embrace the power to lead, the courage to innovate, and the determination to grow. At our core, we believe in humanizing our approach, recognizing that our people are our greatest strength. With a shared vision of transformation, we endeavor to shape a brighter future for higher education.
<div>About the Opportunity</div>
We’re seeking a seasoned Senior Cloud Engineer to help modernize our Ellucian application ecosystem and drive AI-powered automation in service management. You’ll be part of a transformational journey, from manual infrastructure to a fully pipeline-driven, scalable SaaS platform combining agentic AI, SMART workflows, and self-healing APIs coded in Python. With a focus on delivering intelligent workflows and creating a highly available CI/CD platform, this high-impact role combines deep technical expertise with a passion for innovation, shaping Ellucian into a leader in SaaS for higher education. 
If you get excited about automation and creation, and about the impact you can make on shaping the future of higher education, then we should talk! <div>Where you will make an impact</div><ul><li>Lead the creation of agentic AI–powered service management applications, using LLMs, embeddings, and vector databases to automate incident triage, change requests, and user support workflows.</li><li>Architect and optimize scalable data lakes and vectorization pipelines that transform logs, tickets, and knowledge articles into high-quality embeddings for semantic search and proactive problem detection.</li><li>Design orchestration frameworks that integrate LLM agents with ITSM platforms, internal tools, and third-party services via REST APIs and event-driven architectures to deliver zero-touch operations.</li><li>Champion MLOps best practices (CI/CD for models, monitoring and alerting on SLAs, and automated retraining) to ensure AI agents maintain peak performance in live service environments.</li><li>Collaborate with service management, product, and engineering teams to translate operational challenges into AI-driven solutions, providing technical leadership and mentorship throughout delivery.</li><li>Stay at the cutting edge of generative AI and vector search research to continuously enhance our service management applications and drive innovation in automated support.</li></ul><div>What you will bring</div><ul><li>7+ years of AI/ML engineering experience with deep expertise in Python, TensorFlow or PyTorch, and cloud platforms (AWS preferred).</li><li>Proven track record building LLM-powered applications and autonomous agent frameworks (e.g., LangChain), with a focus on IT and service management use cases.</li><li>Strong proficiency in embedding strategies and vector databases such as Pinecone, FAISS, or Milvus, including designing indexing and retrieval pipelines for ticketing and knowledge data.</li><li>Solid background in data engineering and data lake 
architectures, ensuring seamless support for advanced service management workloads.</li><li>Demonstrated ability to integrate AI services via REST APIs, message queues, and event routers, delivering robust, scalable service management solutions.</li><li>Extensive MLOps experience, using tools like MLflow, SageMaker Pipelines, Kubeflow, MCPS, and AGNO to automate the model lifecycle from training to monitoring and retraining in production support environments.</li><li>Expertise in observability tools and practices to instrument, monitor, and troubleshoot AI/ML pipelines, ensuring system reliability and performance.</li><li>Exceptional problem-solving skills, clear communication, and a passion for mentoring peers and driving cross-functional collaboration between ITSM and engineering teams.</li></ul> <div>What makes #Ellucianlife</div><ul><li>22 days annual leave plus 11 public holidays</li><li>Competitive gratuity policy</li><li>Group insurance and annual health checkup plan with a variety of family and wellness benefits</li><li>Thrive Flex Lifestyle Account (LSA) that allows you to contribute towards your health, financial, or learning interests</li><li>5 charitable days to support the community that supports us</li><li>Wellness<ul><li>Headspace (mental health)</li><li>Wellbeats (virtual fitness classes)</li></ul></li><li>RethinkCare – caregiver support</li><li>Diversity and inclusion programs that promote employee resource groups such as Buzzinga and Lean In Team, to name a few</li><li>Parental leave</li><li>Employee referral bonuses to encourage the addition of great new people to the team</li><li>We foster a learning culture with:<ul><li>Education Assistance Program</li><li>Professional development opportunities</li></ul></li></ul>#LI-HS1#LI-remote