AWS Glue Enhances Spark Engines and Supports the Ray Framework

AWS Glue, the serverless data integration service from Amazon Web Services, upgrades its Python and Apache Spark capabilities in version 4.0, released this week.

The update includes engines for Apache Spark 3.3.0 and Python 3.10. Both engines bring performance improvements and bug fixes, and Spark adds features such as improved error messages and row-level runtime filtering.

New engine plugins in Glue 4.0 add support for the Cloud Shuffle Service for Spark, the Ray compute framework, and Adaptive Query Execution. Support for Pandas, the Python-based data analysis and manipulation library, is also included. New data format support covers Apache Hudi, Apache Iceberg, and Delta Lake. Glue 4.0 also ships the Parquet vectorized reader, with support for additional encodings and data types.
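For readers who want to try the new release, the sketch below shows one way a Glue 4.0 Spark job definition might be assembled in Python. It only builds a dictionary shaped like the keyword arguments of boto3's `create_job` call and never contacts AWS. The helper name `glue4_job_definition`, the role ARN, and the S3 path are hypothetical placeholders; the `--datalake-formats` job parameter is assumed to accept `hudi`, `delta`, and `iceberg` as a comma-separated list, so check the current Glue documentation before relying on it.

```python
# Illustrative sketch, not an official AWS example.
# Builds a dict shaped like the kwargs for boto3's glue.create_job();
# nothing here makes a network call.

ALLOWED_FORMATS = {"hudi", "delta", "iceberg"}  # assumed supported values


def glue4_job_definition(name, role_arn, script_location, datalake_formats=("delta",)):
    """Return a job definition dict targeting Glue 4.0 (hypothetical helper)."""
    unknown = set(datalake_formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported data lake format(s): {sorted(unknown)}")
    return {
        "Name": name,
        "Role": role_arn,
        "GlueVersion": "4.0",  # selects the Spark 3.3.0 / Python 3.10 engines
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Asks Glue to load the libraries for the listed table formats.
            "--datalake-formats": ",".join(datalake_formats),
        },
    }


job = glue4_job_definition(
    "demo-job",                                   # placeholder name
    "arn:aws:iam::123456789012:role/demo-role",   # placeholder role ARN
    "s3://example-bucket/scripts/demo.py",        # placeholder script path
    datalake_formats=("delta", "iceberg"),
)
```

With credentials configured, the resulting dictionary could then be passed along as `boto3.client("glue").create_job(**job)`. Adaptive Query Execution is enabled by default in Spark 3.3, so the sketch does not set it explicitly.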

Read More: AWS Glue upgrades Spark engines, backs Ray framework
