Snowflake brings analytics workloads into its cloud with Snowpark Connect for Apache Spark
Tuesday, July 29, 2025, 05:53 PM, from InfoWorld
Snowflake is preparing to run Apache Spark analytics workloads directly on its infrastructure, saving enterprises the trouble of hosting an Apache Spark instance elsewhere and eliminating data-transfer delays between that instance and the Snowflake Data Cloud.
The new offering relies on Spark Connect, a feature introduced in Apache Spark 3.4 that lets users' code run separately from the Spark cluster doing the hard work. With Spark Connect, "Your application, whether it's a Python script or a data notebook, simply sends the unresolved logical plan to a remote Spark cluster. The cluster does all the heavy lifting and sends back the results," Snowflake explained in a blog post. Snowpark Connect for Apache Spark is Snowflake's take on that technology, enabling enterprises to run their Spark code on Snowflake's vectorized engine in the Data Cloud.

For Snowflake customers, the new capability will make it easy for developers to move their code to Snowpark, essentially offering a combination of Spark's familiarity and Snowflake's simplicity, according to Sanjeev Mohan, chief analyst at SanjMo. Snowpark Connect for Spark will also help enterprises lower their total cost of ownership, Mohan said, since developers can take advantage of Snowflake's serverless engine and no longer have to tune Spark. Other benefits include faster processing thanks to Snowflake's vectorized engine and relief from challenges such as finding staff with Spark expertise, he said. Everest Group senior analyst Shubham Yadav sees the launch as timely, as enterprises seek to simplify infrastructure and lower costs amid growing AI and ML adoption.

Risk of confusion

Enterprises should take care not to confuse Snowpark Connect for Apache Spark with Snowflake's existing Snowflake Connector for Spark. If the Connector is like a bridge connecting two cities, allowing data to travel back and forth, Snowpark Connect is more like relocating the entire Spark city into Snowflake. The bridge approach has limitations: because Spark jobs run outside Snowflake, moving data across systems adds latency and cost.
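The client/cluster split that Spark Connect introduces can be sketched with a toy, stdlib-only Python model. This is a conceptual illustration of the plan-shipping idea only: real Spark Connect serializes unresolved logical plans as protobuf messages over gRPC, and every class and function name below is invented for the sketch.

```python
import json

class RemoteFrame:
    """Client-side handle: operations only build an unresolved logical plan."""
    def __init__(self, plan):
        self.plan = plan

    def filter(self, predicate):
        # No data is touched here; we just extend the plan tree.
        return RemoteFrame({"op": "filter", "predicate": predicate,
                            "child": self.plan})

    def count(self):
        # An action: serialize the plan, ship it to the "cluster",
        # and receive only the result back.
        return execute_on_cluster(json.dumps(self.plan))

def table_range(n):
    return RemoteFrame({"op": "range", "n": n})

def execute_on_cluster(plan_json):
    """'Server side': resolve the plan, do the heavy lifting, return results."""
    return len(_evaluate(json.loads(plan_json)))

def _evaluate(plan):
    if plan["op"] == "range":
        return list(range(plan["n"]))
    if plan["op"] == "filter":
        return [row for row in _evaluate(plan["child"])
                if eval(plan["predicate"], {}, {"id": row})]
    raise ValueError(f"unknown op: {plan['op']}")

print(table_range(10).filter("id % 2 == 0").count())  # -> 5
```

In Snowflake's offering, the "server side" of this exchange is its own vectorized engine rather than a Spark cluster, which is why existing client code can stay the same while execution moves into the Data Cloud.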
Migrating from the existing Connector to the newer Snowpark Connect for Apache Spark can be done without converting any code, Snowflake said. The new capability is in public preview and works with Spark 3.5. Rival Databricks offers similar capabilities via its Databricks Connect offering.
https://www.infoworld.com/article/4030595/snowflake-brings-analytics-workloads-into-its-cloud-with-s...