AWS Announces Six New Amazon SageMaker Capabilities, Including the First Fully Integrated Development Environment (IDE) for Machine Learning (Amazon SageMaker Studio)
Amazon SageMaker Notebooks allows developers to spin up elastic machine learning notebooks in seconds and automates sharing notebooks with a single click
Amazon SageMaker Experiments helps developers visualize and compare machine learning model iterations, training parameters, and outcomes
Amazon SageMaker Autopilot allows developers to submit simple data in CSV files and have machine learning models automatically generated, with full visibility into how the models are created so developers can evolve them over time
Amazon SageMaker Debugger provides real-time monitoring for machine learning models to improve predictive accuracy, reduce training times, and facilitate greater explainability
Amazon SageMaker Model Monitor detects concept drift to discover when the performance of a model running in production begins to deviate from the original trained model
Amazon SageMaker is a fully managed service that removes the heavy lifting from each step of the machine learning process. Tens of thousands of customers utilize Amazon SageMaker to help accelerate their machine learning deployments, including ADP.
Amazon SageMaker makes many of the building-block steps of developing great machine learning models much easier. But building truly great models that evolve successfully as a business grows often takes extensive optimization between these building blocks, and requires visibility into what is working or not, and why. These challenges are not unique to machine learning; the same is true of software development generally. Over the past few decades, however, tools like IDEs have been built to help software developers with testing, debugging, deployment, monitoring, and profiling. Because machine learning is a relatively immature discipline, equivalent tools simply haven’t existed for it – until now.
Today’s announcements include significant capabilities that make it much easier for customers to build, train, explain, inspect, monitor, debug, and run custom machine learning models:
- Machine learning IDE:
Amazon SageMaker Studio pulls together all of the components used for machine learning in a single place. Just like in an IDE, developers can view and organize their source code, dependencies, documentation, and other application assets (e.g. images used for mobile apps) in Amazon SageMaker Studio. Machine learning workflows today involve many components, each of which often comes with its own separate set of tools. The Amazon SageMaker Studio IDE provides a single interface for all of the Amazon SageMaker capabilities announced today and for the entire machine learning workflow. Amazon SageMaker Studio gives developers the ability to create project folders, organize notebooks and datasets, and discuss notebooks and results collaboratively. Amazon SageMaker Studio makes it simpler and faster to build, train, explain, inspect, monitor, debug, and run machine learning models from a single interface.
- Elastic notebooks: Amazon SageMaker Notebooks provides one-click Jupyter notebooks with elastic compute that can be spun up in seconds. Notebooks contain everything needed to run or recreate a machine learning workflow. Before today, to view or run a notebook, developers needed to spin up a compute instance in Amazon SageMaker to power the notebook. If they found they needed more compute power, they had to spin up a new instance, transfer the notebook, and shut down the old instance. And, because the notebook was coupled to the compute instance and typically existed on a developer’s workstation, there was no easy way to share notebooks and iterate collaboratively. Amazon SageMaker Notebooks delivers elastic Jupyter notebooks, allowing developers to easily dial up or down the amount of compute powering the notebook (including GPU acceleration), with the changes taking place automatically in the background without interrupting the developer’s work. Developers no longer lose time shutting down the old instance and recreating all their work in a new instance, which makes it much faster to get started building a model. Amazon SageMaker Notebooks will also enable one-click sharing of notebooks by automatically reproducing the specific environment and library dependencies, making it easier to build models collaboratively, since an engineer can easily make their work available for other engineers to build on.
- Experiment management: Amazon SageMaker Experiments helps developers organize and track iterations to machine learning models. Machine learning typically entails several iterations aimed at isolating and measuring the incremental impact of changing specific inputs. Developers produce hundreds of artifacts such as models, training data, and parameter settings during these iterations. Today, they have to rely on cumbersome mechanisms like spreadsheets to track these experiments and manually sort through these artifacts to understand how they impact the experiments. Amazon SageMaker Experiments helps developers manage these iterations by automatically capturing the input parameters, configuration, and results, and storing them as ‘experiments’. Developers can browse active experiments, search for previous experiments by their characteristics, review previous experiments with their results, and compare experiment results visually. And, Amazon SageMaker Experiments also preserves the full lineage of the experiments, so if a model begins to deviate from its intended outcome, developers can go back in time and inspect its artifacts. Amazon SageMaker Experiments makes it much easier for developers to iterate and develop high-quality models more quickly.
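The bookkeeping described above can be illustrated with a minimal tracker sketch. This is plain Python, not the SageMaker SDK; the `Experiment` and `Trial` names are hypothetical and stand in for the service's automatic capture of parameters and results:

```python
from dataclasses import dataclass, field

@dataclass
class Trial:
    """One model iteration: its input parameters and resulting metrics."""
    params: dict
    metrics: dict

@dataclass
class Experiment:
    """Groups related trials so they can be searched and compared."""
    name: str
    trials: list = field(default_factory=list)

    def log_trial(self, params, metrics):
        # In the managed service this capture happens automatically.
        self.trials.append(Trial(params, metrics))

    def best_trial(self, metric, higher_is_better=True):
        # Rank all recorded trials by a single metric.
        sign = 1 if higher_is_better else -1
        return max(self.trials, key=lambda t: sign * t.metrics[metric])

exp = Experiment("churn-model")
exp.log_trial({"learning_rate": 0.1, "depth": 4}, {"accuracy": 0.89})
exp.log_trial({"learning_rate": 0.05, "depth": 6}, {"accuracy": 0.92})
best = exp.best_trial("accuracy")
print(best.params)  # the parameter set behind the best accuracy
```

Keeping every trial (rather than only the winner) is what preserves lineage: any past iteration can be re-inspected later.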
- Debugging and profiling: Amazon SageMaker Debugger allows developers to debug and profile model training to improve accuracy, reduce training times, and facilitate a greater understanding of machine learning models. Today, the training process is largely opaque, training times can be long and hard to optimize, and the ‘black box’ effect makes it hard to interpret and explain models. With Amazon SageMaker Debugger, models trained in Amazon SageMaker automatically emit key metrics that are collected and can be reviewed in Amazon SageMaker Studio or via Amazon SageMaker Debugger’s API. These metrics provide real-time feedback on training accuracy and performance. When training problems are detected, Amazon SageMaker Debugger provides warnings and remediation advice. Amazon SageMaker Debugger also helps developers interpret how a model is working, representing an early step towards the explainability of neural networks.
- Automatic model building: Amazon SageMaker Autopilot provides the industry’s first automated machine learning capability that does not require developers to give up control and visibility into their models. Today’s approaches to automated machine learning do an adequate job of creating an initial model, but they give developers no data on how the model was created or what’s in it. So, if the model is mediocre and developers want to evolve it, they’re out of luck. Also, today’s automated machine learning services give customers only one simple model. Sometimes customers are willing to make trade-offs, such as sacrificing a little accuracy in exchange for a model variant that makes lower-latency predictions, but with only one model to choose from, there are no such options. Amazon SageMaker Autopilot automatically inspects raw data, applies feature processors, picks the best set of algorithms, trains multiple models, tunes them, tracks their performance, and then ranks the models based on performance – all with just a few clicks. The result is a recommendation for the best-performing model that customers can deploy, at a fraction of the time and effort normally required to train it, and with full visibility into how the model was created and what’s in it. Amazon SageMaker Autopilot can be used by people who lack experience with machine learning to easily produce a model based on data alone, or it can be used by experienced developers to quickly develop a baseline model on which teams can further iterate.
Amazon SageMaker Autopilot also gives developers a range of up to 50 different models that can be inspected in Amazon SageMaker Studio, so developers can choose the best model for their use case and have options to consider depending on which factors they choose to optimize for.
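The train-multiple-candidates-then-rank workflow can be sketched in miniature. This is an illustrative toy in plain Python, not the Autopilot API: two hypothetical candidate regressors are fit on training data, scored on held-out data, and returned best-first so the caller keeps the full ranked list rather than a single black-box winner:

```python
def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Second candidate: ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    denom = sum((x - mx) ** 2 for x in xs) or 1.0
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / denom
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, xs, ys):
    """Mean squared error of a model on held-out data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def rank_candidates(train, valid):
    xs, ys = train
    vx, vy = valid
    candidates = {"mean": fit_mean(xs, ys), "linear": fit_linear(xs, ys)}
    # Return every candidate ordered best-first, not just the winner,
    # so the caller can trade accuracy against other concerns.
    return sorted(candidates.items(), key=lambda kv: mse(kv[1], vx, vy))

train = ([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
valid = ([5, 6], [10.1, 11.8])
ranked = rank_candidates(train, valid)
print([name for name, _ in ranked])  # best-performing candidate first
```

Surfacing the whole ranking is the design point: a slightly less accurate candidate may still be preferable if, say, it predicts with lower latency.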
- Concept drift detection: Amazon SageMaker Model Monitor allows developers to detect and remediate concept drift. Today, one of the big factors that can affect the accuracy of models deployed in production is if the data being used to generate predictions starts to differ from that used to train the model (e.g. changing economic conditions driving new interest rates affecting home purchasing predictions, changing seasons with different temperature, humidity, and air pressure impacting confidence in predicted equipment maintenance schedules, etc.). If the data starts to differ, it can lead to something called concept drift, whereby the patterns the model uses to make predictions no longer apply. Amazon SageMaker Model Monitor automatically detects concept drift in deployed models. Amazon SageMaker Model Monitor creates a set of baseline statistics about a model during training and compares the data used to make predictions against the training baseline. Amazon SageMaker Model Monitor alerts developers when drift is detected and helps them visually identify the root cause. Developers can use Amazon SageMaker Model Monitor’s out of the box features to detect drift right away, or they can write their own rules for Amazon SageMaker Model Monitor to monitor. Amazon SageMaker Model Monitor makes it easier for developers to adjust the training data or algorithm to accommodate concept drift.
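The baseline-versus-live comparison described above can be sketched as follows. This is plain Python, not the Model Monitor API; the z-score rule is a deliberately simple stand-in for the richer distribution-distance checks a production monitor would use:

```python
import statistics

def baseline_stats(training_values):
    """Summary statistics captured for a feature at training time."""
    return {"mean": statistics.fmean(training_values),
            "stdev": statistics.stdev(training_values)}

def drift_detected(baseline, live_values, threshold=3.0):
    # Flag drift when the mean of live prediction inputs sits more than
    # `threshold` baseline standard deviations from the training mean.
    live_mean = statistics.fmean(live_values)
    z = abs(live_mean - baseline["mean"]) / baseline["stdev"]
    return z > threshold

# e.g. interest rates seen during training vs. at prediction time
interest_rates = [3.1, 3.3, 3.0, 3.2, 3.1, 3.4]
baseline = baseline_stats(interest_rates)
print(drift_detected(baseline, [3.2, 3.1, 3.3]))  # → False (similar data)
print(drift_detected(baseline, [5.8, 6.1, 5.9]))  # → True (shifted data)
```

When the check fires, the remedy is the one the passage describes: retrain on fresher data or adjust the algorithm so the model's patterns match current conditions again.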
“As tens of thousands of customers have used Amazon SageMaker to remove barriers to building, training, and deploying custom machine learning models, they’ve also encountered new challenges from operating at scale, and they’ve continued to provide feedback to AWS on their next set of challenges,” said Swami Sivasubramanian, Vice President, Amazon Machine Learning, AWS. “Today, we are announcing a set of tools that make it much easier for developers to build, train, explain, inspect, monitor, debug, and run custom machine learning models. Many of these concepts have been known and used by software developers to build, test, and maintain software for many years; however, they were not available for developers to build machine learning models. Today, with these launches, we are bringing these concepts to machine learning developers for the very first time.”
SyntheticGestalt is an applied machine learning company that develops models, software, and intelligent agents for research automation in the pharmaceutical and other life-sciences industries. “We train our drug discovery models and synthetic biology simulation models with Amazon SageMaker, and the new features help us systematically manage and evaluate our experiment results. In order to gain insight into the performance of experiments, our researchers must maintain consistent experiment settings and model results,”
About Amazon Web Services
For 13 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud platform. AWS offers over 165 fully featured services for compute, storage, databases, networking, analytics, robotics, machine learning and artificial intelligence (AI).