Abstract
Big data is the collective term for large volumes of structured and unstructured data. Recent technological developments have enabled many companies to handle massive data sets and to anticipate future trends and requirements. The Hadoop distributed file system (HDFS) was introduced for efficient big data processing. However, HDFS does not have built-in data encryption, which exposes stored data to serious security threats. Encryption algorithms can enhance data security, but conventional algorithms perform poorly on larger files. This research aims to secure big data using a novel hybrid encryption algorithm that combines ciphertext-policy attribute-based encryption (CP-ABE) with the advanced encryption standard (AES). The proposed model is compared with traditional encryption algorithms such as DES, 3DES, and Blowfish in terms of throughput, encryption time, decryption time, and efficiency. The proposed model achieves a maximum efficiency of 96.5%, with an encryption time of 7.12 min and a decryption time of 6.51 min, and outperforms the conventional encryption algorithms.
Keywords:
big data security, Hadoop, data encryption and decryption, Hadoop distributed file system (HDFS)
