## Coursera Week 1

### How do biologists visualize gene expression matrices?

### Why do we use the logarithms of expression values rather than the expression values themselves?

### Why are we interested in analyzing genes whose expression significantly decreases during the course of an experiment?

### How do we solve the k-center clustering problem for k = 1?

### Can I modify FarthestFirstTraversal to solve the k-Means Clustering Problem?

### Are there measures in addition to the squared error distortion for evaluating clustering quality?

### If outliers present so many challenges for clustering, why don’t we simply remove outliers before running clustering algorithms?

### How do biologists select the value of k in k-means clustering?

### What is the running time of the Lloyd algorithm?

### Can the Lloyd algorithm for k-means clustering start from k centers and end up with fewer than k centers?

### Is it possible that two different clusters during the course of the Lloyd algorithm will have the same center of gravity?

### Isn’t k-means++Initializer rather slow? And why is it better at initializing data points than FarthestFirstTraversal?

### How many partitions of a set of points into k clusters are there?

Exercise Break: Find a formula for {n, 2} in terms of n.

## Coursera Week 2

### If HiddenVector consists of all zeroes, the formula for computing θA in the section “From Coin Flipping to k-Means Clustering” does not work because we have to divide by 0. What should we do?

### We saw that the Lloyd algorithm does not necessarily converge to an optimal solution to the k-Means Clustering Problem. Does the soft k-means clustering algorithm converge to an optimal solution?

### Why is the soft clustering algorithm called "Expectation Maximization"?

### Can we use k-means++Initializer for soft k-means clustering?

### How do we determine an appropriate stiffness parameter?

### What is the stopping rule for the EM algorithm?

### How do we decide which horizontal line passing through the hierarchical clustering tree results in the best clustering?

### In contrast to hierarchical clustering, the Lloyd algorithm is run for a fixed number of clusters k, and it is not clear how to select k in advance. Why would we ever select the Lloyd algorithm over hierarchical clustering?

### How does scaling the dataset affect the result of clustering?

###
A simple generalizable scaling method is based on the formula