BIBLIOGRAPHY

Big Data Management & Processing

Data Storage: Distributed File Systems and NoSQL/NewSQL

  • R. Cattell, “Scalable SQL and NoSQL data stores,” SIGMOD Rec., vol. 39, no. 4, May 2011.
  • P. J. Sadalage and M. Fowler, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. 2012.
  • Davoudian, A., Chen, L. and Liu, M., 2018. A survey on NoSQL stores. ACM Computing Surveys (CSUR), 51(2), pp.1-43.
  • Stonebraker, M., 2010. SQL databases v. NoSQL databases. Communications of the ACM, 53(4), pp.10-11.
  • S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google file system,” in Proc. of the 19th ACM Symposium on Operating Systems Principles (SOSP’03), 2003, vol. 37, no. 5.
  • Cafarella, A. Halevy, W. Hsieh, S. Muthukrishnan, R. Bayardo, O. Benjelloun, V. Ganapathy, Y. Matias, R. Pike, and R. Srikant, “Data Management Projects at Google,” SIGMOD Rec., vol. 37, no. 1, 2008.
  • J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Commun. ACM, vol. 51, no. 1, Jan. 2008.
  • D. Borthakur, “HDFS Architecture Guide,” Apache Rep., pp. 1–13, 2010.
  • Adiba, M., J. C. Castrejón, J. A. Espinosa-Oviedo, G. Vargas-Solar, and J. L. Zechinelli-Martini. “Big data management: challenges, approaches, tools and their limitations.” In Networking for big data, pp. 43-55. CRC Press, 2015.

Parallel Data Processing Models & Environments

  • J. Dittrich and J.-A. Quiané-Ruiz, “Efficient big data processing in Hadoop MapReduce,” Proc. VLDB Endow., vol. 5, no. 12, Aug. 2012.
  • F. Li, B. C. Ooi, M. T. Özsu, and S. Wu, “Distributed data management using MapReduce,” ACM Comput. Surv., vol. 46, no. 3, Feb. 2014.
  • A. Okcan and M. Riedewald, “Processing theta-joins using MapReduce,” in Proc. of the 2011 ACM SIGMOD Int. Conference on Management of Data (SIGMOD ’11), 2011.
  • J. D. Ullman, “Designing good MapReduce algorithms,” XRDS Crossroads, ACM Mag. Students, vol. 19, no. 1, Sep. 2012.
  • Chandar, “Join Algorithms using Map / Reduce,” Slides, 2010.
  • A. Thusoo, J. Sen Sarma, N. Jain, Z. Shao, P. Chakka, N. Zhang, S. Antony, H. Liu, and R. Murthy, “Hive – a petabyte scale data warehouse using Hadoop,” in Proc. of the 26th Int. Conference on Data Engineering (ICDE’10), 2010.
  • D. Abadi and D. J. DeWitt, “MapReduce and parallel DBMSs: friends or foes?”
  • K. Ren, Y. Kwon, M. Balazinska, and B. Howe, “Hadoop’s adolescence: an analysis of Hadoop usage in scientific workloads,” Proc. VLDB Endow., vol. 6,

DevOps and Virtualisation

  • Leite L, Rocha C, Kon F, Milojicic D, Meirelles P. A survey of DevOps concepts and challenges. ACM Computing Surveys (CSUR). 2019 Nov 14;52(6):1-35.
  • Kreuzberger D, Kühl N, Hirschl S. Machine learning operations (MLOps): Overview, definition, and architecture. IEEE Access. 2023 Mar 27.