Reading the AWS Re:Invent Tea Leaves
As TNS founder Alex Williams points out, you can always learn a lot about where the industry is heading by keeping an eye on what Amazon Web Services launches at its annual re:Invent user conference. And this year’s show, even though it was held virtually, was no exception.
One interesting note was AWS’ increased emphasis on data and machine learning. The company has spent the last few years working to integrate machine learning models into modern development life cycles, in the process developing Amazon SageMaker, a platform that streamlines that work.
Now, the company is taking the next step, integrating ML workflows directly into the sources of data themselves. The company has incorporated its tool for automating the creation of ML models, called SageMaker Autopilot, into many of its chief data management services, said Swami Sivasubramanian in this year’s ML keynote.
The idea is to provide users of AWS data storage, databases and data warehouse tools the ability to create models with an interface almost everyone knows: SQL.
This work actually started last year, AWS Technical Evangelist Ian Massingham pointed out, when the company integrated ML into Amazon Aurora for relational database developers. The new feature allowed them to add ML capabilities to an enterprise application through a simple query. AWS did something similar with its interactive query service, Athena, allowing developers to access built-in or custom ML models directly from ad hoc Athena queries.
This year the integrations continue. The company’s Redshift data warehouse has been outfitted with machine learning capabilities. As the company explained in a blog post:
“Amazon Redshift now enables you to run ML algorithms on Amazon Redshift data without manually selecting, building, or training an ML model. When you run an ML query in Amazon Redshift, the selected data is securely exported from Amazon Redshift to Amazon Simple Storage Service (Amazon S3).”
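To make the SQL-driven workflow more concrete, here is a minimal sketch in Python that only assembles the text of a Redshift ML `CREATE MODEL` statement and a follow-up prediction query. It does not connect to any database. The statement shape (`FROM`, `TARGET`, `FUNCTION`, `IAM_ROLE`, `SETTINGS (S3_BUCKET ...)`) reflects the Redshift ML syntax as we understand it and should be checked against the current AWS documentation. The table, column, function, role and bucket names are all hypothetical, as is the helper `build_create_model_sql`.

```python
def build_create_model_sql(model_name, training_query, target_column,
                           function_name, iam_role_arn, s3_bucket):
    """Assemble a Redshift ML CREATE MODEL statement as plain text.

    Illustration only: values are interpolated without any escaping,
    so this helper must not be fed untrusted input.
    """
    return (
        f"CREATE MODEL {model_name}\n"
        f"FROM ({training_query})\n"
        f"TARGET {target_column}\n"
        f"FUNCTION {function_name}\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )


# Hypothetical names throughout: none of these come from the article.
create_model_sql = build_create_model_sql(
    model_name="customer_churn",
    training_query="SELECT age, plan, monthly_minutes, churned FROM customer_activity",
    target_column="churned",
    function_name="predict_churn",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftMLRole",
    s3_bucket="example-redshift-ml-bucket",
)

# Once training finishes, the model is exposed as an ordinary SQL function.
predict_sql = (
    "SELECT customer_id, predict_churn(age, plan, monthly_minutes) "
    "FROM customer_activity;"
)

print(create_model_sql)
print(predict_sql)
```

The `S3_BUCKET` setting corresponds to the export step described in the quote above: it names the bucket where Redshift stages the selected training data. Model selection and training happen behind the scenes, so the only thing the developer writes is SQL.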
Even Amazon’s graph database, Neptune, gets some ML smarts. A graph database can be used to examine the links between different entities, revealing patterns that can’t be identified strictly through examination of the entities themselves. A new update to Neptune brings graph neural networks (GNNs), a technique that can improve the accuracy of predictions by over 50% compared to traditional approaches, according to the company.
In general, AWS is paying close attention to users' needs for data management. Last week, the cloud giant unleashed a whole range of data services. Version 2 of the Amazon Aurora Serverless database service can now instantly scale “to hundreds-of-thousands of transactions per second.” The newly released Babelfish for Aurora PostgreSQL is an open source translation layer that makes it easy, and much less costly, to move from Microsoft SQL Server to an AWS database product.
“Between lift and shift, rearchitecting, and/or rebuild,” explained Robert Koch, lead architect at S&P Global Platts, “migrating to the cloud is a challenge in itself and Babelfish might be a big help in that regard. If there’s a sense of urgency in making our data more resilient, we can copy our SQL Server instances over to Aurora PostgreSQL and potentially not miss a beat.”