Recently, we were pentesting a data mining and analytics company. The volume of data they described was phenomenal, and they are planning to move to Big Data. They invited me to write a blog post on the state of the art in Big Data security concerns and challenges, and I happily accepted.
Big Data is fundamentally different from traditional relational databases in terms of requirements and architecture. It is often characterized by the 3Vs: Volume, Velocity and Variety of data. Some of the fundamental differences in Big Data architecture are: Distributed Architecture; Real-Time, Stream and Continuous Computation; Ad-hoc Queries; Parallel and Powerful Programming Languages; Moving the Code (to the data, rather than the data to the code); Non-Relational Data; Auto-tiering; and a Variety of Input Data Sources.
The top five vulnerability classes are Insecure Computation, End-point Input Validation/Filtering, Granular Access Control, Insecure Data Storage and Communication, and Privacy-Preserving Data Mining and Analytics.
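
To make one of these classes concrete, here is a minimal sketch of end-point input validation/filtering: checking records from an input source before they reach the cluster. The schema, field names, and length limit are hypothetical and chosen only for illustration. A real pipeline would take them from its own data contracts.

```python
from typing import Any

# Hypothetical schema for illustration: field name -> expected Python type.
SCHEMA = {"device_id": str, "timestamp": int, "reading": float}
MAX_DEVICE_ID_LEN = 64  # assumed limit, not taken from any standard


def validate_record(record: dict) -> list:
    """Return a list of validation errors; an empty list means the record is accepted."""
    errors = []

    # Reject fields the schema does not know about (allow-list approach).
    unexpected = set(record) - set(SCHEMA)
    if unexpected:
        errors.append(f"unexpected fields: {sorted(unexpected)}")

    # Every schema field must be present and have exactly the expected type.
    for name, expected in SCHEMA.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif type(record[name]) is not expected:
            errors.append(f"wrong type for {name}: {type(record[name]).__name__}")

    # Bound the size of free-form strings.
    device_id: Any = record.get("device_id")
    if isinstance(device_id, str) and len(device_id) > MAX_DEVICE_ID_LEN:
        errors.append("device_id too long")

    return errors


def filter_batch(records: list) -> tuple:
    """Split a batch into accepted records and (record, errors) rejections."""
    accepted, rejected = [], []
    for record in records:
        errors = validate_record(record)
        if errors:
            rejected.append((record, errors))
        else:
            accepted.append(record)
    return accepted, rejected
```

The sketch uses an allow-list: anything not explicitly described by the schema is rejected rather than passed through. That is usually the safer default when the input sources are numerous and not all trusted.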