2nd Session- Big Data
In this session we address the following questions: I. What is "big data"? II. What are the opportunities? III. What are the limitations?
This session's tasks consist of two required activities and one voluntary one:
1- Lecture: Watch the following lectures and answer the interactive questions (which are graded). We recommend using the following slide print-outs to take notes.
1slidePerPage_UCCSS_Blumenstock.pdf
3slidesPerPage_UCCSS_Blumenstock.pdf
UCCSS_Blumenstock_1: Fighting Poverty with Data (7min)
UCCSS_Blumenstock_2: Extracting features (9min)
1slidePerPage_UCCSS_2ndBigData_Hilbert.pdf
3slidesPerPage_UCCSS_2ndBigData_Hilbert.pdf
I. What is "big data"?
UCCSS 2-01: Big Data lecture overview (2min)
UCCSS 2-02: What is "big data"? (14min)
II. What are the opportunities?
UCCSS 2-03: Digital Footprint (5min)
UCCSS 2-04: Political Data-fusion & No-sampling (19min)
UCCSS 2-06: Machine Learning (5min)
UCCSS 2-07: ML Recommender Systems (11min)
III. What are the limitations?
UCCSS 2-08: Footprint ≠ Representativeness (10min)
UCCSS 2-09: Data ≠ Reality (6min)
UCCSS 2-10: Meaning ≠ Meaningful (5min)
UCCSS 2-11: Discrimination ≠ Personalization (9min)
UCCSS 2-12: Correlation ≠ Causation (7min)
UCCSS 2-13: Past ≠ Future (11min)
(total 2h 25min)
2- Lab:
You will scrape two different YouTube channels using http://webscraper.io
First, familiarize yourself with the task using this tutorial video: UCCSS_LAB_webscraping (29min), and this PDF tutorial: UCCSS_Lab_Webscraping.pdf
You will find your individually assigned task here: 2nd Session- Web scraping task
If you run into problems, feel free to ask questions on Piazza and/or coordinate with others through Study Groups Coordination.
Optional / Voluntary / Complementary:
- If you want to build a more sophisticated scraper that collects both YouTube video titles and view counts, see this tutorial: UCCSS_Lab_Webscraping_EXTRA_titlesANDviews.pdf
- Blumenstock, J., Cadamuro, G., & On, R. (2015). Predicting poverty and wealth from mobile phone metadata. Science, 350(6264), 1073–1076. https://doi.org/10.1126/science.aac4420
- Blumenstock, J., & Eagle, N. (2010). Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. In Proceedings of the 4th ACM/IEEE International Conference on Information and Communication Technologies and Development (pp. 6:1–6:10). New York, NY, USA: ACM. https://doi.org/10.1145/2369220.2369225
- Bloomberg Radio (2017). "Big Data Goes Where Economies Fear to Tread", podcast with Prof. Joshua Blumenstock.
- DATA X (2017). Be prepared to be creeped out: Data Selfie of your own Facebook data. https://vimeo.com/201178499
- Vigen (2015). Make your own "spurious correlations": http://www.tylervigen.com/spurious-correlations