I am running the two code snippets below to calculate the cosine distance between two vectors (SciPy's distance.cosine returns 1 minus the cosine similarity). The vectors are the same in both runs, and the second snippet is essentially the code SciPy itself runs (see scipy cosine implementation).
The SciPy call is slightly faster (~0.55 ms vs ~0.69 ms per call) and I don't understand why. My implementation is the same as SciPy's with some checks removed, which, if anything, I would expect to make it faster.
Why is SciPy's function faster?
<code>
import time
from scipy.spatial import distance

# A and B are the two input vectors (the same for both snippets)
EXECUTIONS = 10000
accum = 0
for _ in range(EXECUTIONS):
    start_time = time.time()
    cos_sim = distance.cosine(A, B)
    accum += (time.time() - start_time) * 1000  # elapsed time in ms
print(" %s ms" % (accum / EXECUTIONS))
</code>
<code>import math
import time
import numpy as np

def cosine(u, v):
    uv = np.dot(u, v)
    uu = np.dot(u, u)
    vv = np.dot(v, v)
    dist = 1.0 - uv / math.sqrt(uu * vv)
    # Clip the result to avoid rounding error
    return np.clip(dist, 0.0, 2.0)

EXECUTIONS = 10000
accum = 0
for _ in range(EXECUTIONS):
    start_time = time.time()
    cos_sim = cosine(A, B)
    accum += (time.time() - start_time) * 1000  # elapsed time in ms
print(" %s ms" % (accum / EXECUTIONS))
</code>
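Since the gap between the two timings is small, part of it could be measurement noise from calling time.time() around every single call. Below is a minimal sketch of a steadier measurement, not a definitive benchmark. The helper name mean_ms and its executions and warmup parameters are hypothetical names chosen for illustration; it times the whole loop once with time.perf_counter() after a few warm-up calls.

```python
import time


def mean_ms(func, *args, executions=10000, warmup=100):
    """Return the average wall-clock time of func(*args) in milliseconds.

    Hypothetical helper for illustration; the name and the default
    parameter values are assumptions, not part of SciPy or NumPy.
    """
    # Warm-up calls so one-off costs (lazy imports, caches) are not timed.
    for _ in range(warmup):
        func(*args)

    # Read the clock only twice, around the whole loop, instead of twice
    # per call; perf_counter() is a high-resolution monotonic clock.
    start = time.perf_counter()
    for _ in range(executions):
        func(*args)
    elapsed = time.perf_counter() - start

    return elapsed * 1000 / executions
```

With A and B defined as in the snippets above, the two versions could then be compared with mean_ms(distance.cosine, A, B) and mean_ms(cosine, A, B). The loop overhead is included in both numbers, so only the difference between them is meaningful.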