Big Data Management & Processing
- S. Chaudhuri, “What next?: a half-dozen data management research goals for big data and the cloud,” in Proc. of the 31st PODS Symposium on Principles of Database Systems (PODS ’12), 2012.
- K. Michael and K. Miller, “Big Data: New opportunities and new challenges,” Computer, vol. 46, no. 6, 2013.
- V. R. Borkar, M. J. Carey, and C. Li, “Big data platforms: What’s next?,” XRDS Crossroads, ACM Mag. Students, vol. 19, no. 1, Sep. 2012.
- M. Stonebraker, S. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland, “The end of an architectural era: (it’s time for a complete rewrite),” in Proc. of the 33rd VLDB Int. Conference on Very Large Data Bases (VLDB ’07), 2007.
- Vargas-Solar, G., Zechinelli-Martini, J.L. & Espinosa-Oviedo, J.A. Big Data Management: What to Keep from the Past to Face Future Challenges?. Data Sci. Eng. 2, 328–345 (2017). https://doi.org/10.1007/s41019-017-0043-3
- J. Langford, “Parallel machine learning on big data,” XRDS Crossroads, ACM Mag. Students, vol. 19, no. 1, Sep. 2012.
- L. Hoffmann, “Looking back at big data,” Commun. ACM, vol. 56, no. 4, Apr. 2013.
- J. S. Vitter, “External memory algorithms and data structures: dealing with massive data,” ACM Comput. Surv., vol. 33, no. 2, 2001.
- Apps and R. Scale, Big Data Sourcebook. 2014.
Data Storage: Distributed File Systems and NoSQL/NewSQL
- R. Cattell, “Scalable SQL and NoSQL data stores,” SIGMOD Rec., vol. 39, no. 4, May 2011.
- P. J. Sadalage and M. Fowler, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. 2012.
- Davoudian, A., Chen, L. and Liu, M., 2018. A survey on NoSQL stores. ACM Computing Surveys (CSUR), 51(2), pp.1-43.
- Stonebraker, M., 2010. SQL databases v. NoSQL databases. Communications of the ACM, 53(4), pp.10-11.
- S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google file system,” in Proc. of the 19th ACM SOSP Symposium on Operating Systems Principles (SOSP ’03), 2003, vol. 37, no. 5.
- M. Cafarella, A. Halevy, W. Hsieh, S. Muthukrishnan, R. Bayardo, O. Benjelloun, V. Ganapathy, Y. Matias, R. Pike, and R. Srikant, “Data Management Projects at Google,” SIGMOD Rec., vol. 37, no. 1, 2008.
- J. Dean and S. Ghemawat, “MapReduce: simplified data processing on large clusters,” Commun. ACM, vol. 51, no. 1, Jan. 2008.
- D. Borthakur, “HDFS Architecture Guide,” Apache Rep., pp. 1–13, 2010.
- Adiba, M., J. C. Castrejón, J. A. Espinosa-Oviedo, G. Vargas-Solar, and J. L. Zechinelli-Martini. “Big data management: challenges, approaches, tools and their limitations.” In Networking for Big Data, pp. 43-55. CRC Press, 2015.
Parallel Data Processing Models & Environments
- J. Dittrich and J.-A. Quiané-Ruiz, “Efficient big data processing in Hadoop MapReduce,” Proc. VLDB Endow., vol. 5, no. 12, Aug. 2012.
- F. Li, B. C. Ooi, M. T. Özsu, and S. Wu, “Distributed data management using MapReduce,” ACM Comput. Surv., vol. 46, no. 3, Feb. 2014.
- A. Okcan and M. Riedewald, “Processing theta-joins using MapReduce,” in Proc. of the 2011 ACM SIGMOD Int. Conference on Management of Data (SIGMOD ’11), 2011.
- J. D. Ullman, “Designing good MapReduce algorithms,” XRDS Crossroads, ACM Mag. Students, vol. 19, no. 1, Sep. 2012.
- J. Chandar, “Join Algorithms using Map/Reduce,” Slides, 2010.
- A. Thusoo, J. Sen Sarma, N. Jain, Z. Shao, P. Chakka, N. Zhang, S. Antony, H. Liu, and R. Murthy, “Hive – a petabyte scale data warehouse using Hadoop,” in Proc. of the 26th ICDE Int. Conference on Data Engineering (ICDE ’10), 2010.
- Abadi and D. J. DeWitt, “MapReduce and Parallel DBMSs: friends or foes?”
- K. Ren, Y. Kwon, M. Balazinska, and B. Howe, “Hadoop’s adolescence: an analysis of Hadoop usage in scientific workloads,” Proc. VLDB Endow., vol. 6,