MapR Releases New Ecosystem Pack with Optimized Security and Performance for Apache Spark

SAN JOSE, CA - (Marketwired) - 04/10/17 - MapR Technologies, Inc., the provider of the Converged Data Platform that converges the essential data management and application processing technologies on a single, horizontally scalable platform, today announced the next major release of the MapR Ecosystem Pack (MEP) program. MEP is a broad set of open source ecosystem projects that enable big data applications running on the MapR Converged Data Platform with inter-project compatibility. Version 3.0 of MEP provides enhanced security for Spark, new Spark connectors for MapR-DB and HBase, significant updates and integrations with Drill, and a faster version of Hive.

“The adoption of Spark and Drill continues to advance at a fast pace with enterprises worldwide,” said Will Ochandarena, senior director, product management, MapR Technologies. “With a regular cadence of ecosystem updates that make it easier to adopt for production use, our customers immediately benefit from rapid open source innovation with the reliability, scale and performance of the Converged Data Platform.”

The MapR Ecosystem Pack removes the complexity of coordinating many different community projects and versions. MapR develops, tests, and integrates open source ecosystem projects such as Apache Drill, Spark, Hive, and Myriad, among others. The new MapR Ecosystem Pack version 3.0 includes:

The Spark 2.1 release focuses on improvements in enterprise-ready stability and security including:

Scalable partition handling

Data Type APIs graduate to “stable”

More than 1200 fixes on the Spark 2.X line

Provides for secure connections using MapR-SASL in addition to Kerberos for inbound client connections to the Spark Thrift server and Spark connections to Hive Metastore

Support for impersonation on SELECT statements

The Native Spark Connector for MapR-DB JSON makes it easier to build real-time or batch pipelines between data and MapR-DB while leveraging Spark or Spark Streaming within the pipeline. Designed to be highly efficient and simplify code development, the Native Spark Connector includes:

Two new APIs that allow you to load data from a MapR-DB JSON table to a Spark RDD or save a Spark RDD to a MapR-DB JSON table

A custom data partitioner for better performance

Use of MapR-DB data locality to launch Spark executors close to the data they read

The new Spark-HBase connector provides the ability to write applications that consume HBase binary tables and use them in Spark. Added features include:

Bulk insertion into HBase

Spark SQL for HBase

Significant updates have been added in this release around BI tool integration, end to end security, performance, and usability. Highlights of Drill 1.10 include:

Tableau native connectivity

Support for the CREATE TEMPORARY TABLE AS (CTTAS) command

Support for Kerberos & MapR-SASL authentication between the client and drillbit

Ability to query data with Hue 3.12 (experimental only)

Improved compatibility with Hive/Spark generated Parquet files

Improved query diagnostics

~110 bug fixes & other improvements

The MEP 3.0 release includes a faster version of Hive that greatly improves the speed of data processing tasks, lowers latency for interactive queries, and raises throughput for batch queries. Other key improvements include:

2X Faster ETL through a smarter Cost-Based Optimizer (CBO), faster type conversions and dynamic partition pruning

New HiveServer UI with new diagnostics and monitoring tools

Dynamically partitioned hash joins that accept unsorted inputs, eliminating the sort step
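
To illustrate why a partitioned hash join needs no sorted input, here is a minimal conceptual sketch in plain Python. It is not Hive's implementation: the function name `partitioned_hash_join`, the `num_partitions` parameter, and the sample data are all hypothetical, chosen only to show the idea. Rows from both inputs are routed to partitions by hashing the join key, and each partition is then joined with an in-memory hash table.

```python
from collections import defaultdict


def partitioned_hash_join(left, right, num_partitions=4):
    """Join two lists of (key, value) pairs without sorting either input.

    Hypothetical helper for illustration only; not part of Hive.
    """
    # Route each row to a partition by hashing its join key, so rows with
    # equal keys from both inputs always land in the same partition.
    left_parts = defaultdict(list)
    right_parts = defaultdict(list)
    for key, value in left:
        left_parts[hash(key) % num_partitions].append((key, value))
    for key, value in right:
        right_parts[hash(key) % num_partitions].append((key, value))

    joined = []
    for p in range(num_partitions):
        # Build an in-memory hash table from the left side of this partition.
        table = defaultdict(list)
        for key, value in left_parts[p]:
            table[key].append(value)
        # Probe the table with the right side; input order never matters.
        for key, right_value in right_parts[p]:
            for left_value in table.get(key, []):
                joined.append((key, left_value, right_value))
    return joined


# Sample inputs (made up), deliberately out of key order.
orders = [(3, "keyboard"), (1, "monitor"), (3, "mouse"), (2, "cable")]
customers = [(2, "Bo"), (3, "Cy"), (1, "Al")]
result = partitioned_hash_join(orders, customers)
```

Each partition can be handled independently, which is what allows the work to be spread across tasks. A sort-merge join, by contrast, would first have to order both inputs by key.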

With MapR core Release 5.2.1, you can develop C applications for MapR Streams. The MapR Streams C Client is a distribution of librdkafka that integrates with MapR Streams.

With MapR core Release 5.2.1, you can create Python applications for MapR Streams using the MapR Streams Python client. The Streams Python client is a binding for librdkafka and contains support for high-level consumers.

The MapR Ecosystem Pack version 3.0 is . For more information and on new product information, visit .

To learn more about the new features in the MapR Ecosystem Pack, please join MapR on April 27 at 10:00 am PT (1:00 pm ET) for a free, live webinar. To register, please visit .

Headquartered in San Jose, Calif., MapR provides the industry's only Converged Data Platform that enables customers to harness the power of big data by combining analytics in real-time to operational applications to improve business outcomes. With MapR, enterprises have a data management platform for undertaking digital transformation initiatives to achieve competitive edge. Amazon, Cisco, Google, Microsoft, SAP, and other leading businesses are part of the global MapR partner ecosystem. For more information, visit .

Beth Winkowski
MapR Technologies, Inc.
(978) 649-7189

Kim Pegnato
MapR Technologies, Inc.
(781) 620-0016
