Data Engineer / Senior Data Engineer
Arcadia
Full Time
Senior (5 to 8 years)
Common questions about this position
At least 5 years of relevant work experience is required.
Candidates need a solid background in algorithms, data structures, and system design; experience building scalable, fault-tolerant distributed systems; and experience with data processing systems or database internals (for example, Spark or Dask).
The Ray Data team develops and maintains the Ray Datasets library, powers production use cases like data compaction at Amazon and ML pipelines at Alibaba, and works closely with Ray Core and ML libraries including Train, RLlib, and Serve.
Platform for scaling AI workloads
Anyscale provides a platform designed to scale and productionize artificial intelligence (AI) and machine learning (ML) workloads. Its main product, Ray, is an open-source framework that helps developers manage and scale AI applications across various fields, including Generative AI, Large Language Models (LLMs), and computer vision. Ray allows companies to enhance the performance, fault tolerance, and scalability of their AI systems, with some users reporting over 90% improvements in efficiency, latency, and cost-effectiveness. Anyscale primarily serves clients in the AI and ML sectors, including major companies like OpenAI and Ant Group, who rely on Ray for training large models. The company operates on a software-as-a-service (SaaS) model, charging clients a subscription fee for access to the Ray platform. Anyscale's goal is to empower organizations to effectively scale their AI workloads and optimize their operations.