A clustering is a partition of a set of objects into groups (clusters), such that objects in the same cluster are similar to one another, while objects in different clusters differ substantially. As a first step, we have to define a notion of similarity.
As an example, consider a map. If we interpret the locations marked on the map as a set of objects, then we could say that two objects are similar if the distance between them is small. A clustering of the locations would then correspond to a division of the map into conurbations.
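The map example can be made concrete with a minimal sketch. It assumes that locations are points in the plane, that distance is Euclidean, and that "small" means below a fixed cutoff. The coordinates, the names `distance` and `are_similar`, and the threshold value are illustrative assumptions, not part of the definition above.

```python
import math

# Hypothetical map locations as (x, y) coordinates, e.g. in kilometres.
locations = {
    "A": (0.0, 0.0),
    "B": (3.0, 4.0),
    "C": (30.0, 40.0),
}

def distance(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def are_similar(p, q, threshold=10.0):
    """Treat two locations as similar if they are at most `threshold` apart.

    The cutoff is an assumed parameter chosen for illustration only.
    """
    return distance(p, q) <= threshold

print(are_similar(locations["A"], locations["B"]))  # nearby locations
print(are_similar(locations["A"], locations["C"]))  # distant locations
```

Other notions of similarity (for example, travel time instead of straight-line distance) would fit the same pattern by swapping out `distance`.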
Algorithms that solve such a clustering problem try to compute a good clustering, that is, a clustering with small cost. So, in addition to defining similarity, we also need to define an objective function that measures the quality of the computed clustering.
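To illustrate what an objective function might look like, the following sketch uses one common choice: the sum of squared Euclidean distances from each point to the centroid of its cluster (the k-means cost). This particular objective is an assumption made for illustration; the text above does not fix a specific cost. The function names and the sample points are hypothetical.

```python
def centroid(cluster):
    """Mean point of a non-empty cluster of (x, y) tuples."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)

def clustering_cost(clusters):
    """Sum of squared distances from each point to its cluster's centroid.

    A smaller value means the points lie closer to their cluster centres.
    """
    total = 0.0
    for cluster in clusters:
        cx, cy = centroid(cluster)
        total += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in cluster)
    return total

# Two partitions of the same four points on a line.
grouped_by_proximity = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 0.0), (12.0, 0.0)]]
grouped_arbitrarily = [[(0.0, 0.0), (10.0, 0.0)], [(2.0, 0.0), (12.0, 0.0)]]

# The partition that groups nearby points has the smaller cost.
print(clustering_cost(grouped_by_proximity) < clustering_cost(grouped_arbitrarily))
```

Under this objective, comparing two clusterings of the same point set reduces to comparing their costs, which is what an algorithm would aim to minimize.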