Picture this: thousands of data sources with varying locations, connection protocols, and file formats. Hundreds of terabytes to petabytes of data stored in Hadoop. Multiple data integration and analytics tools. Thousands of data consumers. Numerous compliance requirements to satisfy because of sensitive data. How does one manage such a complex environment, let alone secure the entire stack? No single tool addresses all of the security needs across this type of infrastructure. Instead, the Apache community has delivered multiple open source tools that protect data against unauthorized access and meet corporate requirements: Ranger, Kerberos, HDFS encryption, Knox, Atlas, and LDAP. In this talk, we will walk through the process of securing a big data environment from end to end, showing where each component fits, along with do's and don'ts and best practices.