Senior Data Scientist
Dataiku - Full Time
- Senior (5 to 8 years)
Candidates must have at least 3 years of experience with Pandas, NumPy, and SciPy for data manipulation and analysis. Experience with large-scale data pipelines and feature engineering using tools such as PySpark, BigQuery, AWS EMR, or Hive is essential. The role requires strong storytelling skills to communicate complex findings to both technical and non-technical audiences, a self-starter attitude, and motivation to make a positive societal impact. Candidates should be able to wear multiple hats and demonstrate leadership as the team grows, and must be able to work in Canada. Bonus qualifications include a strong open-source portfolio, experience with PyTorch for machine learning, and familiarity with Databricks and Amplitude for product analytics.
The Senior Data Scientist will build an understanding of user behavior and needs through product analytics, using tools such as Pandas, NumPy, SciPy, and PySpark. They will write clean, efficient Python and SQL code to extract insights, identify patterns, and evaluate the performance of machine learning models. They will also build robust, scalable data pipelines to ingest, preprocess, and analyze large-scale datasets, and collaborate with machine learning engineers and product teams to validate datasets and recommend improvements.
AI detection API for content authenticity
GPTZero.me specializes in detecting content generated by artificial intelligence (AI) models. The company offers an API that organizations can use to identify whether written content comes from AI sources like ChatGPT, GPT-3, GPT-4, and Bard. This service is particularly useful for educational institutions and businesses that need to verify the originality of their content and prevent AI-generated plagiarism. Unlike many competitors, GPTZero.me focuses on providing a customizable API that can be integrated into clients' systems, allowing for tailored solutions to meet specific needs. The company operates on a business-to-business model, generating revenue through subscription fees for API access and additional charges for customization services. The goal of GPTZero.me is to lead the market in AI detection technology, helping organizations ensure the authenticity of their written content.