Optimization Of Big Data Processing Using Distributed Computing In Cloud Environments
DOI:
https://doi.org/10.62951/ijcts.v1i2.58
Keywords:
Big data, Distributed computing, Cloud computing, Apache Hadoop, Apache Spark, Data processing optimization
Abstract
The growth of big data has driven the need for efficient data processing methods, especially in cloud computing environments. This study evaluates distributed computing frameworks, such as Apache Hadoop and Apache Spark, for optimizing big data processing. By analyzing different configurations, we show that distributed systems can significantly reduce processing time and improve resource utilization, making them well suited to handling complex datasets in cloud environments.
License
Copyright (c) 2024 International Journal of Computer Technology and Science

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.