Develop software, typically in Python, to independently acquire data from disparate sources (databases, files, APIs, etc.) and combine them into appropriate training, validation, and testing datasets
Analyze raw datasets using descriptive statistics, working directly with domain experts to understand the meaning of data fields
Have a deep understanding of Generative AI and LLMs, their evaluation metrics, and the theory behind LLMs and conversational AI
Be able to explain the full Retrieval-Augmented Generation (RAG) pattern and recommend specific technologies and techniques for building solutions based on it
Develop and integrate Retrieval-Augmented Generation (RAG) systems on Azure cloud
Leverage open-source AI software like TensorFlow, PyTorch, Hugging Face, LangChain, AutoGen, LangGraph, and LlamaIndex for solution development and evaluation
Evaluate various AI frameworks and services for efficacy and make recommendations on their inclusion as standardized tooling for AI development
Integrate these tools and software with Azure
Responsibilities
Develop, deploy, and maintain state-of-the-art Generative AI solutions within the Azure cloud environment
Build and refine Large Language Models (LLMs) and RAG systems on Azure infrastructure, ensuring they meet business requirements and enterprise-level standards
Perform data science research and develop custom prototypes to solve problems not adequately addressed by commercial solutions, enhance customer/employee experience, and provide competitive advantage
Roll out Generative AI solutions and frameworks, from proofs of concept to full-stack development and vendor evaluations
Inspire and empower applied data science practitioners through high-quality data science education and collaboration within Northern Trust and with top universities/research institutions