Releases: hdinsight/release-notes
Azure HDInsight release notes
This article provides information about the most recent Azure HDInsight release updates.
Summary
Azure HDInsight is one of the most popular services among enterprise customers for open-source analytics on Azure.
Subscribe to our release notes and watch releases on this GitHub repository.
Release date: February 28, 2023
This release applies to HDInsight 4.0, 5.0, and 5.1. The HDInsight release will be available to all regions over several days. This release is applicable for image number 2302250400. See How to check the image number?
HDInsight uses safe deployment practices, which involve gradual region deployment. It may take up to 10 business days for a new release or a new version to be available in all regions.
OS versions
- HDInsight 4.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4
- HDInsight 5.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4
For workload-specific versions, see:
- HDInsight 4.x component versions
- HDInsight 5.x component versions
Note:
Microsoft has issued CVE-2023-23408, which is fixed in the current release. Customers are advised to upgrade their clusters to the latest image.
What's new?
HDInsight 5.1
We have started rolling out a new version of HDInsight 5.1. All new open-source releases are added as incremental releases on HDInsight 5.1.
For more information, see HDInsight 5.x version.
Kafka 3.2.0 Upgrade (Preview)
- Kafka 3.2.0 includes several significant new features/improvements.
- Upgraded Zookeeper to 3.6.3
- Kafka Streams support
- Stronger delivery guarantees for the Kafka producer enabled by default.
- log4j 1.x replaced with reload4j.
- Send a hint to the partition leader to recover the partition.
- JoinGroupRequest and LeaveGroupRequest have a reason attached.
- Added broker count metrics.
- MirrorMaker 2 improvements.
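As a rough illustration of the "stronger delivery guarantees" item: in recent Apache Kafka client versions the producer defaults to `acks=all` and `enable.idempotence=true`. The Python sketch below checks whether an explicit producer configuration weakens either setting. The helper name `weakened_settings` and the sample dictionary are hypothetical. The property names are the standard Kafka producer keys, but verify the defaults against the client version you actually deploy.

```python
# Settings associated with the stronger producer delivery guarantees.
# Keys are standard Apache Kafka producer properties; the helper itself
# is a hypothetical illustration, not part of any Kafka client library.
STRONG_SETTINGS = {"acks": "all", "enable.idempotence": "true"}


def _normalize(key: str, value) -> str:
    """Normalize a config value to a lowercase string ("-1" is an alias of "all" for acks)."""
    text = str(value).strip().lower()
    if key == "acks" and text == "-1":
        return "all"
    return text


def weakened_settings(overrides: dict) -> list:
    """Return the keys in `overrides` that differ from the strong settings."""
    weakened = []
    for key, strong_value in STRONG_SETTINGS.items():
        if key in overrides and _normalize(key, overrides[key]) != strong_value:
            weakened.append(key)
    return weakened


if __name__ == "__main__":
    # Hypothetical application config that lowers the acknowledgement level.
    print(weakened_settings({"acks": "1", "linger.ms": 5}))
```

A check like this can run in a deployment pipeline to flag client configurations that opt out of the defaults before an upgrade.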
HBase 2.4.11 Upgrade (Preview)
- This version has new features such as the addition of new caching mechanism types for block cache, and the ability to alter the hbase:meta table and view the hbase:meta table from the HBase Web UI.
Phoenix 5.1.2 Upgrade (Preview)
- Phoenix version upgraded to 5.1.2 in this release. This upgrade includes the Phoenix Query Server. The Phoenix Query Server proxies the standard Phoenix JDBC driver and provides a backwards-compatible wire protocol to invoke that JDBC driver.
Ambari CVEs
- Multiple Ambari CVEs are fixed.
Note:
ESP isn't supported for Kafka and HBase in this release.
End of support
Support for Azure HDInsight clusters on Spark 2.4 ends on February 10, 2024. For more information, see Spark versions supported in Azure HDInsight.
Coming soon
- Autoscale
  - Autoscale with improved latency and several improvements.
- Cluster name change limitation
  - The maximum length of a cluster name will change from 59 to 45 characters in Public, Mooncake, and Fairfax.
- Cluster permissions for secure storage
  - Customers can specify (during cluster creation) whether a secure channel should be used for HDInsight cluster nodes to contact the storage account.
- Non-ESP ABFS clusters [Cluster Permissions for World Readable]
  - We plan to introduce a change in non-ESP ABFS clusters that restricts non-Hadoop group users from executing Hadoop commands for storage operations. This change is intended to improve cluster security posture. Customers need to plan for the updates.
- Open-source upgrades
  - Apache Spark 3.3.0 and Hadoop 3.3.4 are under development on HDInsight 5.1 and will include several significant new features, performance improvements, and other improvements.
NOTE:
We advise customers to use the latest versions of HDInsight images, as they bring in the best of open-source updates, Azure updates, and security fixes.
Release 2022-12-08
Release date: December 12, 2022
This release applies to HDInsight 4.0 and 5.0. The HDInsight release is made available to all regions over several days.
HDInsight uses safe deployment practices, which involve gradual region deployment. It may take up to 10 business days for a new release or a new version to be available in all regions.
OS versions
- HDInsight 4.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4
- HDInsight 5.0: Ubuntu 18.04.5 LTS Linux Kernel 5.4
For workload specific versions, see here.
What's New?
- Log Analytics - Customers can enable classic monitoring to get the latest OMS version 14.19. To remove old versions, disable and enable classic monitoring.
- Ambari - Users are automatically logged out of the Ambari UI after a period of inactivity. For more information, see here.
- Spark - A new and optimized version of Spark 3.1 is included in this release.
New Regions
- Qatar Central
- Germany North
What's changed?
- HDInsight has moved away from Azul Zulu Java JDK 8 to Adoptium Temurin JDK 8, which supports high-quality TCK certified runtimes, and associated technology for use across the Java ecosystem.
- HDInsight has migrated to reload4j. The log4j changes are applicable to:
  - Apache Hadoop
  - Apache Zookeeper
  - Apache Oozie
  - Apache Ranger
  - Apache Sqoop
  - Apache Pig
  - Apache Ambari
  - Apache Kafka
  - Apache Spark
  - Apache Zeppelin
  - Apache Livy
  - Apache Rubix
  - Apache Hive
  - Apache Tez
  - Apache HBase
  - Apache OMI
  - Apache Phoenix
Updated
HDInsight will implement TLS 1.2 going forward, and earlier versions will be updated on the platform. If you're running any applications on top of HDInsight and they use TLS 1.0 or 1.1, upgrade them to TLS 1.2 to avoid any disruption in services.
For more information, see How to enable Transport Layer Security (TLS).
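For client applications written in Python, one way to make sure connections never fall back to TLS 1.0 or 1.1 is to set a minimum protocol version on the SSL context. This is a minimal sketch using only the standard library `ssl` module. The function name `make_tls12_client_context` is hypothetical, and how you pass the context to your HTTP or database client depends on that client.

```python
import ssl


def make_tls12_client_context() -> ssl.SSLContext:
    """Create a client-side SSL context that refuses TLS 1.0 and TLS 1.1."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Reject any handshake that would negotiate a protocol older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = make_tls12_client_context()
    print(ctx.minimum_version)
```

Standard-library clients accept such a context directly, for example through the `context` argument of `urllib.request.urlopen`.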
End of Support
Azure HDInsight clusters on Ubuntu 16.04 LTS reach end of support on 30 November 2022. HDInsight began releasing cluster images using Ubuntu 18.04 on June 27, 2021. We recommend that customers who are running clusters on Ubuntu 16.04 rebuild their clusters with the latest HDInsight images by 30 November 2022.
For more information on how to check the Ubuntu version of a cluster, see here.
- Execute the command “lsb_release -a” in the terminal.
- If the value of the “Description” property in the output is “Ubuntu 16.04 LTS”, then this update is applicable to the cluster.
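The two steps above can be scripted. The following Python sketch parses the text printed by `lsb_release -a` and reports whether the Description field indicates Ubuntu 16.04. The function names and the sample output string are assumptions made for illustration. The check matches on the prefix "Ubuntu 16.04" so that point releases such as 16.04.7 are also caught.

```python
import subprocess


def read_lsb_release() -> str:
    """Run `lsb_release -a` on the current node and return its standard output."""
    result = subprocess.run(
        ["lsb_release", "-a"], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_description(lsb_output: str) -> str:
    """Extract the value of the Description field from `lsb_release -a` output."""
    for line in lsb_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Description":
            return value.strip()
    return ""


def is_ubuntu_1604(lsb_output: str) -> bool:
    """Return True when the Description field reports an Ubuntu 16.04 release."""
    return parse_description(lsb_output).startswith("Ubuntu 16.04")


if __name__ == "__main__":
    # Sample text in the shape lsb_release prints; on a cluster node,
    # replace it with read_lsb_release().
    sample = "Distributor ID:\tUbuntu\nDescription:\tUbuntu 16.04.7 LTS\nRelease:\t16.04\n"
    print("update applies" if is_ubuntu_1604(sample) else "update does not apply")
```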
Bug fixes
- Support for Availability Zones selection for Kafka and HBase (write access) clusters.
Open-source bug fixes
Hive bug fixes
Apache JIRA | Bug fix |
---|---|
HIVE-26127 | INSERT OVERWRITE error - File Not Found |
HIVE-24957 | Wrong results when subquery has COALESCE in correlation predicate |
HIVE-24999 | HiveSubQueryRemoveRule generates invalid plan for IN subquery with multiple correlations |
HIVE-24322 | If there's direct insert, the attempt ID has to be checked when reading the manifest files |
HIVE-23363 | Upgrade DataNucleus dependency to 5.2 |
HIVE-26412 | Create interface to fetch available slots and add the default |
HIVE-26173 | Upgrade derby to 10.14.2.0 |
HIVE-25920 | Bump Xerce2 to 2.12.2. |
HIVE-26300 | Upgrade Jackson data bind version to 2.12.6.1+ to avoid CVE-2020-36518 |
Release 2021-07-27
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
The OS versions for this release are:
- HDInsight 3.6: Ubuntu 16.04.7 LTS
- HDInsight 4.0: Ubuntu 18.04.5 LTS
New features
New Azure Monitor integration experience (Preview)
The new Azure Monitor integration experience will be in Preview in East US and West Europe with this release. Learn more about the new Azure Monitor experience here.
Deprecation
Basic support for HDInsight 3.6 starting July 1, 2021
Starting July 1, 2021, Microsoft offers Basic support for certain HDInsight 3.6 cluster types. The Basic support plan will be available until 3 April 2022. You are automatically enrolled in Basic support starting July 1, 2021. No action is required by you to opt in. See our documentation for which cluster types are included under Basic support.
We don't recommend building any new solutions on HDInsight 3.6; freeze changes on existing 3.6 environments. We recommend that you migrate your clusters to HDInsight 4.0. Learn more about what's new in HDInsight 4.0.
Behavior changes
HDInsight Interactive Query only supports schedule-based Autoscale
As customer scenarios grow more mature and diverse, we have identified some limitations with Interactive Query (LLAP) load-based Autoscale. These limitations are caused by the nature of LLAP query dynamics, future load prediction accuracy issues, and issues in the LLAP scheduler's task redistribution. Due to these limitations, users may see their queries run slower on LLAP clusters when Autoscale is enabled. The effect on performance can outweigh the cost benefits of Autoscale.
Starting from July 2021, the Interactive Query workload in HDInsight only supports schedule-based Autoscale. You can no longer enable load-based autoscale on new Interactive Query clusters. Existing running clusters can continue to run with the known limitations described above.
Microsoft recommends that you move to a schedule-based Autoscale for LLAP. You can analyze your cluster's current usage pattern through the Grafana Hive dashboard. For more information, see Automatically scale Azure HDInsight clusters.
Upcoming changes
The following changes will happen in upcoming releases.
Built-in LLAP component in ESP Spark cluster will be removed
The HDInsight 4.0 ESP Spark cluster has built-in LLAP components running on both head nodes. The LLAP components in the ESP Spark cluster were originally added for HDInsight 3.6 ESP Spark, but have no real use case for HDInsight 4.0 ESP Spark. In the next release, scheduled for September 2021, HDInsight will remove the built-in LLAP component from the HDInsight 4.0 ESP Spark cluster. This change will help offload head node workload and avoid confusion between the ESP Spark and ESP Interactive Hive cluster types.
New region
- West US 3
- Jio India West
- Australia Central
Component version change
The following component version has been changed with this release:
- ORC version from 1.5.1 to 1.5.9
You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.
Backported JIRAs
Here are the backported Apache JIRAs for this release:
Impacted Feature | Apache JIRA |
---|---|
Date / Timestamp | HIVE-25104, HIVE-24074, HIVE-22840, HIVE-22589, HIVE-22405, HIVE-21729, HIVE-21291, HIVE-21290 |
UDF | HIVE-25268, HIVE-25093, HIVE-22099, HIVE-24113, HIVE-22170, HIVE-22331 |
ORC | HIVE-21991, HIVE-21815, HIVE-21862 |
Table Schema | HIVE-20437, HIVE-22941, HIVE-21784, HIVE-21714, HIVE-18702, HIVE-21799, HIVE-21296 |
Workload Management | HIVE-24201 |
Compaction | HIVE-24882, HIVE-23058, HIVE-23046 |
Materialized view | HIVE-22566 |
Release 2021-06-02
Release date: 06/02/2021
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
The OS versions for this release are:
- HDInsight 3.6: Ubuntu 16.04.7 LTS
- HDInsight 4.0: Ubuntu 18.04.5 LTS
New features
OS version upgrade
As referenced in Ubuntu’s release cycle, the Ubuntu 16.04 kernel will reach End of Life (EOL) in April 2021. We started rolling out the new HDInsight 4.0 cluster image running on Ubuntu 18.04 with this release. Newly created HDInsight 4.0 clusters will run on Ubuntu 18.04 by default once available. Existing clusters on Ubuntu 16.04 will run as is with full support.
HDInsight 3.6 will continue to run on Ubuntu 16.04. It will change to Basic support (from Standard support) beginning 1 July 2021. For more information about dates and support options, see Azure HDInsight versions. Ubuntu 18.04 will not be supported for HDInsight 3.6. If you’d like to use Ubuntu 18.04, you’ll need to migrate your clusters to HDInsight 4.0.
You need to drop and recreate your clusters if you’d like to move existing HDInsight 4.0 clusters to Ubuntu 18.04. Plan to create or recreate your clusters after Ubuntu 18.04 support becomes available.
After creating the new cluster, you can SSH to your cluster and run sudo lsb_release -a to verify that it runs on Ubuntu 18.04. We recommend that you test your applications in your test subscriptions first before moving to production. Learn more about the HDInsight Ubuntu 18.04 update.
Scaling optimizations on HBase accelerated writes clusters
HDInsight made some improvements and optimizations on scaling for HBase accelerated write enabled clusters. Learn more about HBase accelerated write.
Moving to Azure virtual machine scale sets
HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.
Deprecation
No deprecation in this release.
Behavior changes
Disable Standard_A5 VM size as Head Node for HDInsight 4.0
HDInsight cluster Head Node is responsible for initializing and managing the cluster. Standard_A5 VM size has reliability issues as Head Node for HDInsight 4.0. Starting from this release, customers will not be able to create new clusters with Standard_A5 VM size as Head Node. You can use other two-core VMs like E2_v3 or E2s_v3. Existing clusters will run as is. A four-core VM is highly recommended for Head Node to ensure the high availability and reliability of your production HDInsight clusters.
Network interface resource not visible for clusters running on Azure virtual machine scale sets
HDInsight is gradually migrating to Azure virtual machine scale sets. Network interfaces for virtual machines are no longer visible to customers for clusters that use Azure virtual machine scale sets.
Upcoming changes
The following changes will happen in upcoming releases.
HDInsight Interactive Query only supports schedule-based Autoscale
As customer scenarios grow more mature and diverse, we have identified some limitations with Interactive Query (LLAP) load-based Autoscale. These limitations are caused by the nature of LLAP query dynamics, future load prediction accuracy issues, and issues in the LLAP scheduler's task redistribution. Due to these limitations, users may see their queries run slower on LLAP clusters when Autoscale is enabled. The effect on performance can outweigh the cost benefits of Autoscale.
Starting from July 2021, the Interactive Query workload in HDInsight only supports schedule-based Autoscale. You can no longer enable load-based Autoscale on new Interactive Query clusters. Existing running clusters can continue to run with the known limitations described above.
Microsoft recommends that you move to a schedule-based Autoscale for LLAP. You can analyze your cluster's current usage pattern through the Grafana Hive dashboard. For more information, see Automatically scale Azure HDInsight clusters.
Basic support for HDInsight 3.6 starting July 1, 2021
Starting July 1, 2021, Microsoft will offer Basic support for certain HDInsight 3.6 cluster types. The Basic support plan will be available until 3 April 2022. You'll automatically be enrolled in Basic support starting July 1, 2021. No action is required by you to opt in. See our documentation for which cluster types are included under Basic support.
We don't recommend building any new solutions on HDInsight 3.6; freeze changes on existing 3.6 environments. We recommend that you migrate your clusters to HDInsight 4.0. Learn more about what's new in HDInsight 4.0.
VM host naming will be changed on July 1, 2021
HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. This migration will change the FQDN format of cluster host names, and the numbers in the host names are not guaranteed to be in sequence. If you want to get the FQDN names for each node, refer to Find the Host names of Cluster Nodes.
Bug fixes
HDInsight continues to make cluster reliability and performance improvements.
Component version change
You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.
Release 2021-03-24
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
New features
Spark 3.0 preview
HDInsight added Spark 3.0.0 support to HDInsight 4.0 as a Preview feature.
Kafka 2.4 preview
HDInsight added Kafka 2.4.1 support to HDInsight 4.0 as a Preview feature.
Moving to Azure virtual machine scale sets
HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.
Eav4-series support
HDInsight added Eav4-series support in this release. Learn more about Eav4-series here. The series is available in the following regions:
- AUSTRALIA EAST
- BRAZIL SOUTH
- CENTRAL US
- EAST ASIA
- EAST US
- JAPAN EAST
- SOUTHEAST ASIA
- UK SOUTH
- WEST EUROPE
- WEST US 2
Deprecation
No deprecation in this release.
Behavior changes
Default cluster version is changed to 4.0
The default version of HDInsight cluster is changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.
Default cluster VM sizes are changed to Ev3-series
Default cluster VM sizes are changed from D-series to Ev3-series. This change applies to head nodes and worker nodes. To avoid this change impacting your tested workflows, specify the VM sizes that you want to use in the ARM template.
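To confirm that a template pins VM sizes rather than relying on the default, you can scan it before deployment. The Python sketch below assumes the cluster is declared as a `Microsoft.HDInsight/clusters` resource whose roles live under `properties.computeProfile.roles`, each with a `hardwareProfile.vmSize` value. Check that layout against the ARM template reference for your API version. The function name and the sample template, including the `Standard_E8_v3` size, are hypothetical.

```python
def roles_without_vm_size(template: dict) -> list:
    """Return the names of HDInsight cluster roles that do not set an explicit vmSize."""
    missing = []
    for resource in template.get("resources", []):
        if resource.get("type") != "Microsoft.HDInsight/clusters":
            continue
        compute_profile = resource.get("properties", {}).get("computeProfile", {})
        for role in compute_profile.get("roles", []):
            # A role with no hardwareProfile.vmSize would fall back to the default size.
            if not role.get("hardwareProfile", {}).get("vmSize"):
                missing.append(role.get("name", "<unnamed>"))
    return missing


if __name__ == "__main__":
    # Hypothetical template fragment: the worker role omits its VM size.
    sample_template = {
        "resources": [
            {
                "type": "Microsoft.HDInsight/clusters",
                "properties": {
                    "computeProfile": {
                        "roles": [
                            {"name": "headnode", "hardwareProfile": {"vmSize": "Standard_E8_v3"}},
                            {"name": "workernode", "hardwareProfile": {}},
                        ]
                    }
                },
            }
        ]
    }
    print(roles_without_vm_size(sample_template))
```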
Network interface resource not visible for clusters running on Azure virtual machine scale sets
HDInsight is gradually migrating to Azure virtual machine scale sets. Network interfaces for virtual machines are no longer visible to customers for clusters that use Azure virtual machine scale sets.
Upcoming changes
The following changes will happen in upcoming releases.
OS version upgrade
HDInsight will be upgrading OS version from Ubuntu 16.04 to 18.04. The upgrade will complete before April 2021.
Basic support for HDInsight 3.6 starting July 2021
Starting July 2021, Microsoft will offer Basic support for certain HDInsight 3.6 cluster types. The Basic support plan will be available until 3 April 2022. You'll automatically be enrolled in Basic support starting July 2021. No action is required by you to opt in. See our documentation for which cluster types are included under Basic support.
We don't recommend building any new solutions on HDInsight 3.6; freeze changes on existing 3.6 environments.
We recommend that you migrate your clusters to HDInsight 4.0. Learn more about what's new in HDInsight 4.0.
Bug fixes
HDInsight continues to make cluster reliability and performance improvements.
Component version change
Added support for Spark 3.0.0 and Kafka 2.4.1 as Preview.
You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.
Release 2021-02-05
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
New features
Dav4-series support
HDInsight added Dav4-series support in this release. Learn more about Dav4-series here.
Kafka REST Proxy GA
Kafka REST Proxy enables you to interact with your Kafka cluster via a REST API over HTTPS. Kafka REST Proxy is generally available starting from this release. Learn more about Kafka REST Proxy here.
Moving to Azure virtual machine scale sets
HDInsight now uses Azure virtual machines to provision the cluster. The service is gradually migrating to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.
Deprecation
Disabled VM sizes
Starting from January 9, 2021, HDInsight will block all customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing clusters will run as is. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.
Behavior changes
Default cluster VM size changes to Ev3-series
Default cluster VM sizes will be changed from D-series to Ev3-series. This change applies to head nodes and worker nodes. To avoid this change impacting your tested workflows, specify the VM sizes that you want to use in the ARM template.
Network interface resource not visible for clusters running on Azure virtual machine scale sets
HDInsight is gradually migrating to Azure virtual machine scale sets. Network interfaces for virtual machines are no longer visible to customers for clusters that use Azure virtual machine scale sets.
Breaking change for .NET for Apache Spark 1.0.0
HDInsight introduces the first major official release of .NET for Apache Spark in the next release. It provides DataFrame API completeness for Spark 2.4.x and Spark 3.0.x along with other features. There will be breaking changes for this major version. Refer to this migration guide to understand the steps needed to update your code and pipelines. Learn more here.
Upcoming changes
The following changes will happen in upcoming releases.
Default cluster version will be changed to 4.0
Starting February 2021, the default version of HDInsight cluster will be changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.
OS version upgrade
HDInsight is upgrading OS version from Ubuntu 16.04 to 18.04. The upgrade will complete before April 2021.
HDInsight 3.6 end of support on June 30, 2021
HDInsight 3.6 will reach end of support. Starting from June 30, 2021, customers can't create new HDInsight 3.6 clusters. Existing clusters will run as is without support from Microsoft. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.
Bug fixes
HDInsight continues to make cluster reliability and performance improvements.
Component version change
No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.
Release 2020-11-18
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
New features
Auto key rotation for customer managed key encryption at rest
Starting from this release, customers can use Azure Key Vault version-less encryption key URLs for customer-managed key encryption at rest. HDInsight will automatically rotate the keys as they expire or are replaced with new versions. Learn more details here.
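A version-less key URL is the key identifier with the trailing version segment removed. The Python sketch below assumes the usual Azure Key Vault key identifier shape, `https://<vault>.vault.azure.net/keys/<key-name>/<version>`, and strips the version if one is present. The function name and the vault, key, and version values in the example are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def to_versionless_key_url(key_url: str) -> str:
    """Drop the version segment from a Key Vault key URL, if one is present."""
    parts = urlsplit(key_url)
    segments = [segment for segment in parts.path.split("/") if segment]
    # Expected path shape: keys/<key-name>[/<version>]
    if len(segments) < 2 or segments[0] != "keys":
        raise ValueError(f"not a Key Vault key URL: {key_url}")
    versionless_path = "/" + "/".join(segments[:2])
    return urlunsplit((parts.scheme, parts.netloc, versionless_path, "", ""))


if __name__ == "__main__":
    # Hypothetical vault name, key name, and version.
    versioned = "https://contoso-vault.vault.azure.net/keys/cluster-key/0123456789abcdef0123456789abcdef"
    print(to_versionless_key_url(versioned))
```

Passing an already version-less URL returns it unchanged, so the helper is safe to apply to either form.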
Ability to select different Zookeeper virtual machine sizes for Spark, Hadoop, and ML Services
HDInsight previously didn't support customizing Zookeeper node size for Spark, Hadoop, and ML Services cluster types. It defaults to A2_v2/A2 virtual machine sizes, which are provided free of charge. From this release, you can select a Zookeeper virtual machine size that is most appropriate for your scenario. Zookeeper nodes with virtual machine size other than A2_v2/A2 will be charged. A2_v2 and A2 virtual machines are still provided free of charge.
Moving to Azure virtual machine scale sets
HDInsight now uses Azure virtual machines to provision the cluster. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.
Deprecation
Deprecation of HDInsight 3.6 ML Services cluster
The HDInsight 3.6 ML Services cluster type will reach end of support on December 31, 2020. Customers won't be able to create new 3.6 ML Services clusters after December 31, 2020. Existing clusters will run as is without support from Microsoft. Check the support expiration for HDInsight versions and cluster types here.
Disabled VM sizes
Starting from November 16, 2020, HDInsight will block new customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing customers who have used these VM sizes in the past three months won't be affected. Starting from January 9, 2021, HDInsight will block all customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing clusters will run as is. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.
Behavior changes
Add NSG rule checking before scaling operation
HDInsight added network security group (NSG) and user-defined route (UDR) checking to scaling operations. The same validation is now done for cluster scaling as for cluster creation. This validation helps prevent unpredictable errors. If validation doesn't pass, scaling fails. To learn how to configure NSGs and UDRs correctly, refer to HDInsight management IP addresses.
Upcoming changes
The following changes will happen in upcoming releases.
Default cluster version will be changed to 4.0
Starting February 2021, the default version of HDInsight cluster will be changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.
HDInsight 3.6 end of support on June 30, 2021
HDInsight 3.6 will reach end of support. Starting from June 30, 2021, customers can't create new HDInsight 3.6 clusters. Existing clusters will run as is without support from Microsoft. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.
Bug fixes
HDInsight continues to make cluster reliability and performance improvements.
Component version change
No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.
Release 2020-11-09
Release date: 11/09/2020
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
New features
HDInsight Identity Broker (HIB) is now GA
HDInsight Identity Broker (HIB) that enables OAuth authentication for ESP clusters is now generally available with this release. HIB Clusters created after this release will have the latest HIB features:
- High Availability (HA)
- Support for Multi-Factor Authentication (MFA)
- Federated users sign in with no password hash synchronization to AAD-DS
For more information, see HIB documentation.
Moving to Azure virtual machine scale sets
HDInsight now uses Azure virtual machines to provision the cluster. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.
Deprecation
Deprecation of HDInsight 3.6 ML Services cluster
The HDInsight 3.6 ML Services cluster type will reach end of support on December 31, 2020. Customers won't be able to create new 3.6 ML Services clusters after December 31, 2020. Existing clusters will run as is without support from Microsoft. Check the support expiration for HDInsight versions and cluster types here.
Disabled VM sizes
Starting from November 16, 2020, HDInsight will block new customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing customers who have used these VM sizes in the past three months won't be affected. Starting from January 9, 2021, HDInsight will block all customers from creating clusters using Standard_A8, Standard_A9, Standard_A10, and Standard_A11 VM sizes. Existing clusters will run as is. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.
Upcoming changes
The following changes will happen in upcoming releases.
Ability to select different Zookeeper virtual machine sizes for Spark, Hadoop, and ML Services
HDInsight today doesn't support customizing Zookeeper node size for Spark, Hadoop, and ML Services cluster types. It defaults to A2_v2/A2 virtual machine sizes, which are provided free of charge. In the upcoming release, you can select a Zookeeper virtual machine size that is most appropriate for your scenario. Zookeeper nodes with virtual machine size other than A2_v2/A2 will be charged. A2_v2 and A2 virtual machines are still provided free of charge.
Default cluster version will be changed to 4.0
Starting February 2021, the default version of HDInsight cluster will be changed from 3.6 to 4.0. For more information about available versions, see available versions. Learn more about what is new in HDInsight 4.0.
HDInsight 3.6 end of support on June 30, 2021
HDInsight 3.6 will reach end of support. Starting from June 30, 2021, customers can't create new HDInsight 3.6 clusters. Existing clusters will run as is without support from Microsoft. Consider moving to HDInsight 4.0 to avoid potential system/support interruption.
Bug fixes
Fix issue for restarting VMs in cluster
The issue with restarting VMs in the cluster has been fixed. You can again use PowerShell or the REST API to reboot nodes in the cluster.
Release 2020-10-08
Release date: 10/08/2020
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
New features
HDInsight private clusters with no public IP and Private link (Preview)
HDInsight now supports creating clusters with no public IP and private link access to the clusters in preview. Customers can use the new advanced networking settings to create a fully isolated cluster with no public IP and use their own private endpoints to access the cluster.
Moving to Azure virtual machine scale sets
HDInsight now uses Azure virtual machines to provision the cluster. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.
Deprecation
Deprecation of HDInsight 3.6 ML Services cluster
The HDInsight 3.6 ML Services cluster type will reach end of support on December 31, 2020. Customers won't be able to create new 3.6 ML Services clusters after that. Existing clusters will run as is without support from Microsoft. Check the support expiration for HDInsight versions and cluster types here.
Upcoming changes
The following changes will happen in upcoming releases.
Ability to select different Zookeeper virtual machine sizes for Spark, Hadoop, and ML Services
HDInsight today doesn't support customizing Zookeeper node size for Spark, Hadoop, and ML Services cluster types. It defaults to A2_v2/A2 virtual machine sizes, which are provided free of charge. In the upcoming release, you can select a Zookeeper virtual machine size that is most appropriate for your scenario. Zookeeper nodes with virtual machine size other than A2_v2/A2 will be charged. A2_v2 and A2 virtual machines are still provided free of charge.
Component version change
No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.
Release 2020-09-28
Release date: 09/28/2020
This release applies to both HDInsight 3.6 and HDInsight 4.0. The HDInsight release is made available to all regions over several days. The release date here indicates the first region release date. If you don't see the changes below, wait for the release to go live in your region over the next several days.
New features
Autoscale for Interactive Query with HDInsight 4.0 is now generally available
Autoscale for the Interactive Query cluster type is now generally available (GA) for HDInsight 4.0. All Interactive Query 4.0 clusters created after 27 August 2020 will have GA support for Autoscale.
HBase cluster supports Premium ADLS Gen2
HDInsight now supports Premium ADLS Gen2 as primary storage account for HDInsight HBase 3.6 and 4.0 clusters. Together with Accelerated Writes, you can get better performance for your HBase clusters.
Kafka partition distribution on Azure fault domains
A fault domain is a logical grouping of underlying hardware in an Azure data center. Each fault domain shares a common power source and network switch. Previously, HDInsight Kafka might store all partition replicas in the same fault domain. Starting from this release, HDInsight supports automatic distribution of Kafka partitions based on Azure fault domains.
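To make the idea concrete, the Python sketch below places each partition's replicas on brokers in distinct fault domains, so that losing one domain never removes every copy of a partition. It is a simplified round-robin illustration and not the algorithm HDInsight or Kafka actually uses. The function name, broker IDs, and fault domain labels are all hypothetical.

```python
from collections import defaultdict


def assign_replicas(broker_domains: dict, partitions: int, replication_factor: int) -> dict:
    """Map each partition number to a list of broker IDs in distinct fault domains."""
    # Group brokers by the fault domain they run in.
    by_domain = defaultdict(list)
    for broker, domain in sorted(broker_domains.items()):
        by_domain[domain].append(broker)
    domains = sorted(by_domain)
    if replication_factor > len(domains):
        raise ValueError("replication factor exceeds the number of fault domains")

    assignment = {}
    for partition in range(partitions):
        replicas = []
        for offset in range(replication_factor):
            # Consecutive replicas of one partition go to consecutive domains.
            domain = domains[(partition + offset) % len(domains)]
            brokers = by_domain[domain]
            # Rotate through the brokers inside a domain as partitions wrap around.
            replicas.append(brokers[(partition // len(domains)) % len(brokers)])
        assignment[partition] = replicas
    return assignment


if __name__ == "__main__":
    # Hypothetical cluster: six brokers spread over three fault domains.
    brokers = {0: "FD0", 1: "FD1", 2: "FD2", 3: "FD0", 4: "FD1", 5: "FD2"}
    for partition, replicas in assign_replicas(brokers, 6, 3).items():
        print(partition, replicas, [brokers[b] for b in replicas])
```

With a replication factor no larger than the number of fault domains, every partition in this sketch keeps its replicas in different domains.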
Encryption in transit
Customers can enable encryption in transit between cluster nodes using IPSec encryption with platform-managed keys. This option can be enabled at the cluster creation time. See more details about how to enable encryption in transit.
Encryption at host
When you enable encryption at host, data stored on the VM host is encrypted at rest and flows encrypted to the storage service. From this release, you can enable encryption at host on the temp data disk when creating the cluster. Encryption at host is only supported on certain VM SKUs in limited regions. HDInsight supports the following node configuration and SKUs. See more details about how to enable encryption at host.
Moving to Azure virtual machine scale sets
HDInsight now uses Azure virtual machines to provision the cluster. Starting from this release, the service will gradually migrate to Azure virtual machine scale sets. The entire process may take months. After your regions and subscriptions are migrated, newly created HDInsight clusters will run on virtual machine scale sets without customer actions. No breaking change is expected.
Upcoming changes
The following changes will happen in upcoming releases.
Ability to select different Zookeeper SKU for Spark, Hadoop, and ML Services
HDInsight today doesn't support changing the Zookeeper SKU for Spark, Hadoop, and ML Services cluster types. It uses the A2_v2/A2 SKU for Zookeeper nodes, and customers aren't charged for them. In the upcoming release, customers can change the Zookeeper SKU for Spark, Hadoop, and ML Services as needed. Zookeeper nodes with a SKU other than A2_v2/A2 will be charged. The default SKU will still be A2_v2/A2 and free of charge.
Component version change
No component version change for this release. You can find the current component versions for HDInsight 4.0 and HDInsight 3.6 in this doc.