Cosine similarity is a common metric used to measure how similar two vectors are, based on the angle between them rather than their absolute magnitude. It is widely used in information retrieval, recommendation systems, document comparison, and machine learning. In Java, cosine similarity can be implemented in several ways: using basic loops, using modern Stream APIs, or by relying on numerical computing libraries such as ND4J. This article demonstrates each of the three approaches with practical examples.
1. What Is Cosine Similarity?
Cosine similarity computes the cosine of the angle between two vectors. Given two vectors A and B, the formula is:Where:
- is the dot product of the vectors
- and are the magnitudes (Euclidean norms) of the vectors
The result ranges from −1 to 1, where a value of 1 indicates the vectors point in the same direction, 0 means they are orthogonal with no similarity, and −1 indicates they point in opposite directions.
2. Core Math Implementation Using Traditional Loops
In this section, we implement cosine similarity using basic Java constructs and for loops. This approach is explicit and does not rely on any external libraries.
public class CosineSimilarityLoop {
public static double cosineSimilarity(double[] a, double[] b) {
if (a.length != b.length) {
throw new IllegalArgumentException("Vectors must have the same length");
}
double dotProduct = 0.0;
double normA = 0.0;
double normB = 0.0;
for (int i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
if (normA == 0 || normB == 0) {
throw new IllegalArgumentException("Vector magnitude cannot be zero");
}
return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}
public static void main(String[] args) {
double[] v1 = {1, 2, 3};
double[] v2 = {2, 3, 4};
double similarity = cosineSimilarity(v1, v2);
System.out.println("Cosine Similarity : " + similarity);
}
}
This implementation computes the dot product and both vector magnitudes in a single pass through the arrays. Accumulating the squared values allows us to compute the Euclidean norm using Math.sqrt at the end. We also validate that the vectors have equal lengths and that neither vector has zero magnitude, which would make the cosine similarity undefined due to division by zero.
3. Implementation Using Java Streams
In this example, we use Java Streams, which provide a declarative, functional programming style. Although streams may introduce slight overhead compared to traditional loops, they improve readability and make the computation’s intent clearer.
public class CosineSimilarityStream {
public static double cosineSimilarity(double[] a, double[] b) {
if (a.length != b.length) {
throw new IllegalArgumentException("Vectors must have the same length");
}
double dotProduct = IntStream.range(0, a.length)
.mapToDouble(i -> a[i] * b[i])
.sum();
double normA = Math.sqrt(
IntStream.range(0, a.length)
.mapToDouble(i -> a[i] * a[i])
.sum()
);
double normB = Math.sqrt(
IntStream.range(0, b.length)
.mapToDouble(i -> b[i] * b[i])
.sum()
);
if (normA == 0 || normB == 0) {
throw new IllegalArgumentException("Vector magnitude cannot be zero");
}
return dotProduct / (normA * normB);
}
public static void main(String[] args) {
double[] v1 = {1, 2, 3};
double[] v2 = {2, 3, 4};
double similarity = cosineSimilarity(v1, v2);
System.out.println("Cosine Similarity : " + similarity);
}
}
Here, IntStream.range is used to iterate over indices, allowing access to both arrays simultaneously. Separate stream pipelines compute the dot product and each vector’s squared sum. While this version is more expressive, it iterates over the arrays multiple times, which may be less efficient for very large vectors compared to the single-loop approach.
4. Using the ND4J Library
ND4J (N-Dimensional Arrays for Java) is a high-performance numerical computing library designed for machine learning and scientific workloads. Using ND4J simplifies mathematical operations and improves performance for large-scale data processing.
Before using ND4J, you need to add the dependency to your project (pom.xml).
<dependencies> <dependency> <groupId>org.nd4j</groupId> <artifactId>nd4j-native-platform</artifactId> <version>1.0.0-M2.1</version> </dependency> </dependencies>
ND4J Implementation
public class CosineSimilarityND4J {
public static double cosineSimilarity(double[] a, double[] b) {
INDArray vecA = Nd4j.create(a);
INDArray vecB = Nd4j.create(b);
// Compute dot product
double dotProduct = vecA.mul(vecB).sumNumber().doubleValue();
// Compute norms using Nd4j executioner
double normA = Nd4j.getExecutioner().exec(new Norm2(vecA)).getDouble(0);
double normB = Nd4j.getExecutioner().exec(new Norm2(vecB)).getDouble(0);
if (normA == 0 || normB == 0) {
throw new IllegalArgumentException("Vector magnitude cannot be zero");
}
return dotProduct / (normA * normB);
}
public static void main(String[] args) {
double[] v1 = {1, 2, 3};
double[] v2 = {2, 3, 4};
double similarity = cosineSimilarity(v1, v2);
System.out.println("Cosine Similarity : " + similarity);
}
}
First, we create INDArray objects to represent the input vectors. The dot product is then computed using vecA.mul(vecB).sumNumber(), which performs element-wise multiplication followed by summation. Finally, the Euclidean norm of each vector is calculated by executing the Norm2 operation through Nd4j.getExecutioner().exec(new Norm2(vec)).
Output from the Examples
Cosine Similarity : 0.9925833339709303
All three implementations produced the same cosine similarity value of approximately 0.993 for the vectors [1, 2, 3] and [2, 3, 4]. This value is very close to 1, indicating that the two vectors point in nearly the same direction in the vector space, meaning they are highly similar. Minor differences you might see in other datasets could arise from floating-point precision, but in general, all three approaches, traditional loops, streams, and ND4J, correctly compute the similarity according to the standard formula.
5. Conclusion
In this article, we demonstrated how to calculate cosine similarity between two vectors in Java using three different approaches: a traditional loop-based implementation, a Stream-based functional approach, and a library-based solution using ND4J. Each method applies the same mathematical formula but offers different trade-offs in terms of readability, performance, and dependency management.
6. Download the Source Code
This article covered multiple approaches for computing cosine similarity between two vectors in Java.
You can download the full source code of this example here: code for computing cosine similarity between two vectors in Java
Thank you!
We will contact you soon.
Omozegie AziegbeJanuary 28th, 2026Last Updated: January 28th, 2026

This site uses Akismet to reduce spam. Learn how your comment data is processed.