Instructor: Periklis A. Papakonstantinou [ppapakons.tsinghua.old@gmail.com] 
Teaching assistant: Yuanxi Dai 
Lectures: Monday 1.304.05pm (3x45min) at TX112 
Office hours: Monday 11am12pm (or by appointment), FIT 46087 
Dec 26 
The third problemset is out Start summarizing your termproject research in a single deliverable document. The deadline is in two weeks! 
Topics: 
testing properties of very big objects, computation over datastreams, finite metric embeddings and highdimensional geometry, inference and statistics. 
Emphasis in the interface between mathematical analysis and engineering realization.  
For more details see course outline. 
COURSE PROJECTS
In your online course registration system you have received a login and a password to read through the class project descriptions.
The topics of the projects are formulated by Alexandr Andoni, Yuval Filmus, Guang Yang, David Woodruff, and the instructor of this course. Some of these questions were well known before in the literature whereas many others are new research directions, some are small & new research projects (appropriate for a class project), and others are based on new ideas building upon existing ones. All of them are adjusted and made appropriate for a term project for an advanced undergraduate class. The questions that have not appeared before themselves constitute topics that can only be discussed confidentially with the students taking this class (and only them).
* click on the lecture date for reading material
Date  Topic 

Finish up the proof of Bourgain's theorem Streaming, sketching, and finite metric embeddings 

Basic technqiues: Dimensionality reduction (JohnsonLindenstrauss), Bourgain's theorem. Full proof details and algorithmic implications.  
Introduction to finite metric embeddings and applications  
Sketching a stream, the AMS sketch for frequency moment estimation  
Hashing and its likes for efficient randomized computation over datastreams  
More details about testing graph properties Introduction to streaming models and algorithms 

Finish up the proof details of KKL Back to Property Testing: Szemeredi's regularity lemma and applications Discussion & CaseStudies: Examples from realworld applications how does theory bind to practice & shortcomings of the current theory 

Introduction to hypercontractivity. The KahnKalaiLinial inequality: influences, noise operators, and their relations, start with the proof of KKL 

Full details on MasrourKushilevitz (GoldreichLevin) 

Finish up the BLR test analysis and discussion Estimating fourier coefficients; learning boolean functions; concentrated specta; a discussion on the MasrourKushilevitz algorithm: its application domain and limitations 

Elements of Fourier Analysis on the Boolean Cube The BLR test and its analysis 

Measure concentration: working examples on ChernoffHoeffding,
concentration for pairwise independence and beyond Introduction to Property Testing 


What do we mean by "Big Data"? What is this class about: highlevel overview of property testing, streaming, fourier analysis, highdimensional geometry. First topic: Basic inequalities in discrete probability 
Readings for the 13th lecture:
next lecture: wrapup, what was this course about and what is Big Data Science
 Read chapters 15 from Jiri Matousek's textbook
next lecture: wrapup, what was this course about and what is Big Data Science
Readings for the 12th lecture:
next lecture: finite metric embeddings and streaming algorihms
 Read chapters 15 from Jiri Matousek's textbook
next lecture: finite metric embeddings and streaming algorihms
Readings for the 11th lecture:
next lecture: embedding every metric space into ell_p and dimensionality reduction
 Read the excellent survey by Piotr Indyk [link] and the even more excellent overview by Nati Linial [link]
next lecture: embedding every metric space into ell_p and dimensionality reduction
Readings for the 10th lecture:
next lecture: introduction to finite metric embeddings
 Read chapter 56 from Amit Chakrabarti's lecture notes on streaming algorithms
next lecture: introduction to finite metric embeddings
Readings for the 9th lecture:
next lecture: sketches and algorithms over datastreams
 Read chapters 35 from Amit Chakrabarti's lecture notes on streaming algorithms
next lecture: sketches and algorithms over datastreams
Readings for the 8th lecture:
next lecture: continue with basic Streaming Algorithms
 Reading material is based on inclass lecture (ask for lecture notes)

An elementrary proof of Szemeredi's regularity lemma by Adam Marcus can be found here
A series of very nice blogposts (both on Szemeredi's lemma and Szemeredi's theorem) can be found in the Terrence Tao's blogposts, and a more highlevel exposition for property testing in Luca Trevisan's blogpost  Read the first 2 chapters from Amit Chakrabarti's lecture notes on streaming algorithms
next lecture: continue with basic Streaming Algorithms
Readings for the 7th lecture:
next lecture: begin with Streaming Algorithms
 Same as for the previous lecture. I'll give references for Szemeredi's Regularity Lemma
next lecture: begin with Streaming Algorithms
Readings for the 6th lecture:
next lecture: finish up the proof of KKL  Szemeredi's regularity lemma and applications to property testing
 From O'Donnel's text: the entire Chapter 9 (check the previous chapters for prerequisites as necessary)
next lecture: finish up the proof of KKL  Szemeredi's regularity lemma and applications to property testing
Readings for the 5th lecture:
 By now you should know in detail (most important parts covered in class  but you should study all): Chapter 13 from Ryan O'Donnel's text, and Chapter 1 from the measure concentration text
Readings for the 4th lecture:
next lecture: the GoldreichLevin algorithm
 The 3rd chapter from Ryan O'Donnel's textbook (see handouts)
next lecture: the GoldreichLevin algorithm
Readings for the 3rd lecture:
next lecture: learning boolean functions
 The 1st chapter from Ryan O'Donnel's textbook (see handouts)
next lecture: learning boolean functions
Readings for the 2nd lecture:
next lecture: intro to Fourier Analysis on the Boolean Cube
 The 1st chapter from Concentration of Measure for the Analysis of Randomised Algorithms (the best introductory text on the topic I'm aware of)

Measure concentration and pairwise independence: p.1011 from Foundations of Cryptography, vol.1, by Odded Goldreich
 Paragraph 1.6 from
Analysis of Boolean Functions
(go backwards and read the relevant definitions)
next lecture: intro to Fourier Analysis on the Boolean Cube