Map Reduce Service
Launch your Hadoop and Kafka clusters in just a few minutes
MapReduce Service (MRS) is a big data solution that provides storage resources and analysis capabilities for building a massive data processing platform that is reliable, secure, and easy to use.
Users can benefit from solutions such as Hadoop, Spark, HBase, or Hive to quickly create clusters that provide computing and storage resources for massive data analysis or real-time processing.
Computing and storage resources can be created or deleted as processing needs change, which helps optimize costs.
Components of the solution
Massive data storage
The Hadoop distributed file system (HDFS) is a distributed file system that provides high-performance access to data distributed in Hadoop clusters. Like other Hadoop-related technologies, HDFS has become a key tool for managing Big Data pools and supporting analytical applications. After processing and analysis, the data is encrypted via SSL and stored in object storage (OBS) or HDFS.
Built on Hadoop to deploy a distributed infrastructure, MRS uses MapReduce to process large volumes of data in parallel (terabytes and beyond).
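To make the MapReduce model concrete, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It uses only the Python standard library and does not call Hadoop or MRS; the function names and sample input are illustrative assumptions. On a real cluster, the framework would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data on Hadoop", "Hadoop runs MapReduce", "big clusters"]
    print(word_count(sample))
```

Because each map call depends on a single line and each reduce call on a single key, both stages can be distributed without coordination between workers, which is what allows the model to scale.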
Spark is a framework for batch processing. It supports development in several programming languages, such as Scala, Java, and Python. In addition, it provides Spark SQL to query and analyze data using standard SQL.
Hadoop Database (HBase) is a distributed, non-relational database management system, written in Java, with structured storage for large tables. It provides a reliable, high-performance, and scalable solution that complements relational databases in massive data processing.
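As a rough illustration of what structured storage for large tables can look like in a wide-column store, the sketch below models a table as a map from row key to column to timestamped versions. It is a conceptual toy in plain Python, not the HBase client API; the class name `TinyTable`, its methods, and the sample keys are all hypothetical.

```python
from collections import defaultdict


class TinyTable:
    """Toy in-memory table: row key -> 'family:qualifier' -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value, timestamp):
        """Store one version of a cell under the given timestamp."""
        self._rows[row_key][column][timestamp] = value

    def get(self, row_key, column):
        """Return the most recent version of a cell, or None if absent."""
        versions = self._rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start_row, stop_row):
        """Yield row keys in sorted order within [start_row, stop_row)."""
        for key in sorted(self._rows):
            if start_row <= key < stop_row:
                yield key


if __name__ == "__main__":
    table = TinyTable()
    # Hypothetical row keys and column names, for illustration only.
    table.put("vehicle-001", "telemetry:speed", 80, timestamp=1)
    table.put("vehicle-001", "telemetry:speed", 95, timestamp=2)
    print(table.get("vehicle-001", "telemetry:speed"))
```

Keeping rows ordered by key is what makes range scans over adjacent keys cheap in this kind of design, so the choice of row key largely determines which queries are fast.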
Apache Hive is a data warehouse infrastructure built on Hadoop that supports data analysis, querying through a language syntactically close to SQL, and data summarization.
MRS uses KrbServer to provide Kerberos authentication for all components, thereby implementing a reliable authentication mechanism. Users can choose whether to enable Kerberos when creating a cluster. When Kerberos authentication is enabled, all MRS components must be authenticated. For more details about Kerberos, visit https://web.mit.edu/kerberos//.
Hue provides a graphical web user interface (WebUI) for MRS applications. Hue supports components including Hadoop distributed file system (HDFS), Hive, YARN/MapReduce and Spark. On the WebUI provided by Hue, you can perform the following operations on the components:
CarbonData attaches to Spark as a data source and provides data storage and retrieval for fast query and analysis. It leverages the distributed processing power of Spark to speed up queries by an order of magnitude over petabytes of data.
Kafka is a distributed, partitioned, replicated message publishing and subscription system. It provides features similar to the Java Message Service (JMS), but with a different design. Kafka offers message persistence, high throughput, multi-client support, and real-time processing, and supports both online and offline message consumption. It is well suited to Internet service data collection scenarios, such as conventional data collection, website activity tracking, data monitoring, and log collection.
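The ideas of partitioning and message persistence can be sketched with a toy in-memory topic. This is not the Kafka client API: the class `TinyTopic` and its methods are hypothetical, replication is left out, and `zlib.crc32` stands in for the partitioner purely as an assumption for illustration.

```python
import zlib


class TinyTopic:
    """Toy topic made of a fixed number of append-only partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, message):
        """Append to the partition chosen by a stable hash of the key.

        Returns the (partition index, offset) where the message was stored.
        """
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append(message)
        return index, len(self.partitions[index]) - 1

    def read(self, partition, offset):
        """Return messages from `offset` onward; reading does not remove them."""
        return self.partitions[partition][offset:]


if __name__ == "__main__":
    topic = TinyTopic(num_partitions=3)
    # Hypothetical keys and payloads, for illustration only.
    partition, _ = topic.publish("sensor-a", "speed=80")
    topic.publish("sensor-a", "speed=95")
    print(topic.read(partition, 0))
```

Since messages stay in the log after being read, a real-time consumer and a later batch consumer can each read the same partition from their own offset, which matches the online and offline consumption described above.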
Storm is a distributed, reliable, and fault-tolerant real-time computing system. It is used to process data streams of a massive scale on a real-time basis. Storm applies to real-time analysis, continuous computation, and distributed Extract, Transform, and Load (ETL).
A highly available commercial Hadoop big data platform can be built in minutes with just a few steps.
MRS provides a user-friendly web-based console, enabling you to perform management operations with ease.
Stability and reliability
99.9% service availability: Critical services of MRS, such as NameNode and HMaster, run in active-standby mode. If the active server fails, services are automatically switched over to the standby server within minutes.
With Kerberos authentication and Orange's security expertise, MRS provides role-based access control and sound audit functions to ensure 360-degree protection.
Application scenario: vehicle internet
An automobile company stores data on HBase, which supports PB-level storage and CDR queries in milliseconds.