You can learn a lot about where the industry is heading by keeping an eye on what AWS launches at its annual re:Invent user conference.

ISSUE 246: Reading the AWS re:Invent Tea Leaves

“Authorization must be enforced at every layer of the application stack: microservices, gateways, the frontend, databases and more.”

___
Tim Hinrichs, “Why We Need to Rethink Authorization for Cloud Native”

Code, Not Financial Contributions, Are Needed Most

Throwing money at a problem is not the best way to get things done, whether in business, charity, or the open source economy. Two recent studies prove the point. Non-monetary reasons for contributing topped the list when over a thousand contributors to free and open source software (FOSS) projects were asked to rank 10 motivating factors for the “2020 FOSS Contributor Survey,” a collaboration between The Linux Foundation and the Laboratory for Innovation Science at Harvard. Regardless of why a company initially contributed to open source, Tidelift’s “2020 Managed Open Source Survey” found that “employee contributions to projects a company manages” and “financing open source foundations” are the most effective approaches to helping open source projects.

In 2020 Tidelift found that more than four-fifths of organizations that use open source to build applications also contribute to open source in at least one of six different ways. Allowing employees to contribute to a project is by far the most common method, but company policies differ depending on whether or not the company actually manages or sponsors the project. Only 22% of organizations currently provide financial support to the projects themselves, whether via a foundation, consortium, or independent entity (15%), or via support to individual project maintainers (12%).

An organization’s control over contributions only sometimes influences opinions about effectiveness. By a two-to-one margin, employee time spent contributing to a company-managed or company-sponsored project is viewed as extremely or very effective rather than ineffective. In contrast, almost as many respondents believe contributions to projects their company does not sponsor are ineffective as believe they are effective. However, providing financial support via a foundation, consortium, or independently governed entity is most likely to be viewed as an effective way to contribute, by a three-to-one margin.

What is DataOps? Why is a real-time data platform essential to the use cases driving it? How can you build data pipelines with open source complexity?

In this episode of The New Stack Makers podcast — recorded live for KubeCon+CloudNativeCon North America — we talk to Andrew Stevenson, chief technical officer and co-founder of Lenses.io, about how Apache Kafka and Kubernetes can together dramatically increase the agility, efficiency and security of building real-time data applications.

Why Kubernetes and Kafka are the Combo for DataOps Success

Reading the AWS re:Invent Tea Leaves

As TNS founder Alex Williams points out, you can always learn a lot about where the industry is heading by keeping an eye on what Amazon Web Services launches at its annual re:Invent user conference. And this year’s show, even though it was held virtually, was no exception.

One interesting note was AWS’ increased emphasis on data and machine learning. Over the last few years, the company has put a lot of effort into integrating machine learning models into modern development life cycles, and along the way developed Amazon SageMaker, a platform that streamlines that work.

Now, the company is taking the next step, integrating the ML workflows directly into the sources of data themselves. The company has incorporated its tool for automating the creation of ML models, called SageMaker Autopilot, into many of its chief data management services, Swami Sivasubramanian said in this year’s ML keynote.

The idea is to provide users of AWS data storage, databases and data warehouse tools the ability to create models with an interface almost everyone knows: SQL.

This work actually started last year, AWS Technical Evangelist Ian Massingham pointed out, by integrating ML inside Amazon Aurora for relational database developers. This new feature allowed them to add ML capabilities to an enterprise application through a simple query. AWS did something similar with its interactive query service, called Athena, allowing developers to access built-in or custom ML models directly from Athena ad-hoc queries.

This year the integrations continue. The company’s Redshift data warehouse has been outfitted with machine learning capabilities. As the company explained in a blog post:

“Amazon Redshift now enables you to run ML algorithms on Amazon Redshift data without manually selecting, building, or training an ML model. When you run an ML query in Amazon Redshift, the selected data is securely exported from Amazon Redshift to Amazon Simple Storage Service (Amazon S3).”
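
For readers who have not seen one, here is a rough Python sketch of what such an ML query might look like. It only renders the statement text and runs nothing against AWS. The statement follows the general shape of Redshift's CREATE MODEL command as we understand it; the model, table, column, function, role and bucket names are all hypothetical placeholders, not values from the AWS blog post.

```python
def render_create_model(model, source_query, target, function, iam_role, s3_bucket):
    """Render a Redshift ML style CREATE MODEL statement as plain text.

    Nothing is executed here; the string would be handed to whatever
    SQL client is already connected to the data warehouse.
    """
    lines = [
        f"CREATE MODEL {model}",
        f"FROM ({source_query})",      # the rows used for training
        f"TARGET {target}",            # the column the model should predict
        f"FUNCTION {function}",        # SQL function name exposed for predictions
        f"IAM_ROLE '{iam_role}'",
        f"SETTINGS (S3_BUCKET '{s3_bucket}');",  # where the selected data is exported
    ]
    return "\n".join(lines)


# All values below are hypothetical and exist only for illustration.
statement = render_create_model(
    model="customer_churn",
    source_query="SELECT age, plan, monthly_spend, churned FROM customer_activity",
    target="churned",
    function="predict_churn",
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftMLRole",
    s3_bucket="example-redshift-ml-bucket",
)
print(statement)
```

The S3 bucket setting corresponds to the export step AWS describes above. Under the same assumptions, the named function could then be called from an ordinary SELECT once training finishes.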

Even Amazon’s graph database, Neptune, gets some ML smarts. A graph database can be used to examine the links between different entities, revealing patterns that can’t be identified strictly through examination of the entities themselves. A new update to Neptune brings graph neural networks (GNNs), a technique that improves the accuracy of predictions by over 50% compared to traditional approaches, according to the company.

In general, AWS is paying close attention to users’ needs for data management. Last week, the cloud giant unleashed a whole range of data services. Version 2 of the Amazon Aurora Serverless database service can now instantly scale “to hundreds-of-thousands of transactions per second.” The newly released Babelfish for Aurora PostgreSQL is an open source translation layer that makes it easy, and much less costly, to move from Microsoft SQL Server to an AWS database product.

“Between lift and shift, rearchitecting, and/or rebuild,” explained Robert Koch, lead architect at S&P Global Platts, “migrating to the cloud is a challenge in itself and Babelfish might be a big help in that regard. If there’s a sense of urgency in making our data more resilient, we can copy our SQL Server instances over to Aurora PostgreSQL and potentially not miss a beat.”

This Week in Programming: Kubernetes Says ‘Don’t Panic’ About Docker Deprecation

At first, the news of Kubernetes deprecating Docker after its v1.20 release might seem shocking, given that it was the success of the Docker container that paved the way for Kubernetes. But fear not, say experts: what is being excised is only the shim that Kubernetes required to work with the extra Docker software, not the containers themselves. Any container images built to the Open Container Initiative (OCI) standard, including those created by the Docker software, will continue to run just fine on K8s.

Unfixable Kubernetes Security Hole Means Potential Man-in-the-Middle Attacks

A new security issue discovered within Kubernetes puts multitenant clusters at risk. If a potential attacker can already create or edit services and pods, then they may be able to intercept traffic from other pods (or nodes) in the cluster. You’re not going to like this, but there’s no patch for this problem. Instead, “it can currently only be mitigated by restricting access to the vulnerable features.”

Red Hat Deprecates Linux CentOS in Favor of a Streaming Edition

In a big blow to the open source community, Red Hat has announced that, moving forward, the company will be shifting its “investment fully from CentOS Linux to CentOS Stream.” The move ends CentOS Linux 8 distribution on Dec. 31, 2021, and cancels the release of CentOS 9, instead launching CentOS Stream 9 in the second quarter of 2021.

Congrats to Francesco Gualazzi, our winner of the Star Wars spatula set sweepstakes at our KubeCon + CloudNativeCon North America pancake and podcast breakfast! Francesco is a self-described engineer, passionate about self-healing and reliable systems.

Bratin Saha, senior vice president, AWS ML and engines, spoke with The New Stack on the virtual sidelines of re:Invent about scaling operations with machine learning.

At AWS re:Invent, Dr. Nashlie Sephus discussed Amazon’s new tool for detecting bias in machine learning models, called Clarify.

AWS re:Invent (from left to right and top to bottom): Nina Lindsey, senior PR manager, AWS artificial intelligence services; Mike Miller, director, AI devices, AWS; Bratin Saha, VP, AWS ML and Engines; Dr. Vasi Philomin, AWS GM of Machine Learning and AI; and Nashlie Sephus, AWS Manager, Applied Science, all gave a press briefing on AWS' machine learning cloud push.

The New Stack Makers podcast is available on:

SoundCloud — Fireside.fm — Pocket Casts — Stitcher — Apple Podcasts — Overcast — Spotify — TuneIn

Technologists building and managing new stack architectures join us for short conversations on the tech conference circuit. These are the people defining how applications are developed and managed at scale.

Pre-register to get the new second edition of the Kubernetes ebook!

A lot has changed since we published the original Kubernetes Ecosystem ebook in 2017. Kubernetes has become the de facto standard platform for container orchestration and market adoption is strong. We now see Kubernetes as the operating system for the cloud — evolving into a universal control plane for compute, networking and storage that spans public, private and hybrid clouds. In this ebook you’ll learn:

Kubernetes architecture.
Options for running Kubernetes across a host of environments.
Key open source projects in the Kubernetes ecosystem.
Adoption patterns of cloud native infrastructure and tools.

