Azure Capacity Supply Chain and Provisioning is looking for a highly motivated senior-level data scientist to help the Capacity Supply Chain & Provisioning (CSCP) organization at Microsoft predict, optimize and build the future of cloud computing. You will operate as part of a well-established data science team responsible for researching, developing, training and testing state-of-the-art data science models (with an emphasis on machine learning) to forecast demand for Microsoft’s Azure cloud resources. Your contributions will drive significant investment and planning decisions for Microsoft’s vast and rapidly growing cloud business. You will interact closely with fellow data scientists, PMs, engineers, senior leadership and other stakeholders, acting as a subject matter expert in machine learning in a distributed computing environment.


  • Train and test a variety of machine learning models for demand forecasting of Microsoft’s cloud resources.
  • Work with the data scientists and engineers to bring a wide variety of ML libraries to production.
  • Design, test and deploy ML training and test algorithms for optimal run time using parallel computing technologies in the Azure cloud (such as Databricks).
  • Contribute to the development of our Data Science Machine Learning (DSML) Platform designed to speed up ML model training and testing.
  • Research, develop and deploy production-grade utility functions for forecasting, anomaly detection, optimization, clustering, etc.
  • Keep abreast of new statistical / machine learning techniques and distributed computing techniques to enhance the performance of the DSML Platform.


Basic Qualifications

Bachelor’s degree in Computer Science or Engineering, Statistics, Applied Mathematics, Operations Research, or a similar applied quantitative field.

5+ years of industry experience developing production-grade statistical and machine learning code in a team environment (experience with agile processes is a plus).

At least 5 years of experience coding in Python (scikit-learn / numpy / pandas / statsmodels) in a distributed computing environment.

At least 3 years of experience with deep learning frameworks (e.g., TensorFlow, PyTorch, CNTK) and solid knowledge of deep learning theory and practice.

At least 3 years of experience with typical data management systems and tools such as SQL.

Knowledge of, and ability to work within, a large-scale computing context, and 3 years of hands-on experience with Hadoop, Spark, Databricks or similar.

Preferred Qualifications

MS or PhD in Computer Science or Engineering, Statistics, Applied Mathematics, Operations Research, or a similar applied quantitative field.

Experience in a distributed computing environment (experience in R is a plus).

Experience maintaining a large code base using version control systems such as Git.

Experience with agile processes is a plus.

Experience in understanding business needs and translating them into technical solutions.

Excellent interpersonal and communication (verbal and written) skills.

Excellent creative thinking skills, with an emphasis on developing innovative methods to solve hard, ambiguous problems that have no obvious solution.

