ABSTRACT
Big Data analysis refers to advanced and efficient data mining and machine learning techniques applied to large amount of data. Research work and results in the area of Big Data analysis are continuously rising, and more and more new and efficient architectures, programming models, systems, and data mining algorithms are proposed. Taking into account the most popular programming models for Big Data analysis (MapReduce, Directed Acyclic Graph, Message Passing, Bulk Synchronous Parallel, Workflow and SQL-like), we analysed the features of the main systems implementing them. Such systems are compared using four classification criteria (i.e. level of abstraction, type of parallelism, infrastructure scale and classes of applications) for helping developers and users to identify and select the best solution according to their skills, hardware availability, productivity and application needs.
GRAPHICAL ABSTRACT
This figure is a word cloud highlighting the most popular words related to Big Data analysis.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
13. https://netty.io