I already have fuzzy search on my music website, onlymusik.com, which lets my users do "close enough" searching. A fuzzy match (or fuzzy search) finds songs whose text is close enough to the search term, rather than requiring an exact match.
There are real-world benefits to this type of searching:
This sort of approach can often lead to a smoother and quicker search experience.
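To make "close enough" concrete, here is a minimal sketch of one common way to measure it: Levenshtein edit distance. This is not necessarily how the site's fuzzy search works (that implementation isn't shown here); `FuzzyMatch`, `IsCloseEnough`, and the `maxEdits` threshold are hypothetical names chosen for illustration, and the helper compares whole strings rather than searching inside a longer title.

```csharp
using System;

public static class FuzzyMatch
{
    // Levenshtein distance: the number of single-character insertions,
    // deletions, or substitutions needed to turn `a` into `b`.
    public static int Levenshtein(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Turning an empty prefix of `a` into the first j chars of `b` takes j insertions.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i; // i deletions to reach an empty string
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            (previous, current) = (current, previous);
        }
        return previous[b.Length];
    }

    // Hypothetical helper: a term is "close enough" if it is within
    // maxEdits edits of the candidate, ignoring case.
    public static bool IsCloseEnough(string term, string candidate, int maxEdits = 2) =>
        Levenshtein(term.ToLowerInvariant(), candidate.ToLowerInvariant()) <= maxEdits;

    public static void Main()
    {
        // A misspelled title still matches because it is one edit away.
        Console.WriteLine(IsCloseEnough("Bohemian Rapsody", "Bohemian Rhapsody"));
        // An unrelated title is far more than two edits away.
        Console.WriteLine(IsCloseEnough("Yesterday", "Bohemian Rhapsody"));
    }
}
```

A small edit budget tolerates typos without matching unrelated titles. A real search engine would typically do this inside the index rather than comparing strings one by one.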
However, since the emergence of AI, vector search has become all the rage. It is a very useful feature and boasts the following real-world features:
/// <summary>
/// Performs a vector search on the specified field of the documents in the collection.
/// </summary>
/// <remarks>The method constructs an aggregation pipeline that includes a vector search stage and
/// an optional filter stage. The vector search is performed using the specified query vector and field, and the
/// results are limited to the specified number of documents. The document type <typeparamref name="T"/> is the
/// collection's type parameter and must inherit from <see cref="MongoItemBase"/>.</remarks>
/// <param name="index">The name of the vector search index to query.</param>
/// <param name="queryVector">The vector used as the query for the search. Each element represents a dimension of the vector space.</param>
/// <param name="vectorField">The name of the field in the documents that contains the vector data to be searched against.</param>
/// <param name="limit">The maximum number of documents to return. Defaults to 10.</param>
/// <param name="filter">An optional filter applied to the vector search results in a <c>$match</c> stage. Can be <see
/// langword="null"/>.</param>
/// <param name="minimumScore">The minimum vector search score a document must reach to be returned. Defaults to 0.75.</param>
/// <returns>A task that represents the asynchronous operation. The task result contains an enumerable of documents of
/// type <typeparamref name="T"/> that match the vector search criteria.</returns>
public async Task<IEnumerable<T>> VectorSearchAsync(string index, float[] queryVector, string vectorField, int limit = 10, FilterDefinition<T> filter = null, double minimumScore = 0.75)
{
    // Verify the vector has the correct dimensions
    if (queryVector.Length != 1536)
    {
        throw new ArgumentException($"Query vector must have 1536 dimensions, but has {queryVector.Length} dimensions. This must match the 'numDimensions' value in your vector search index.");
    }
    // Build the vector search aggregation pipeline
    var pipeline = new List<BsonDocument>();
    // $vectorSearch takes index, path, queryVector, numCandidates and limit here.
    // It has no minimum-score option, so the threshold is applied in a later $match stage.
    var vectorSearchStage = new BsonDocument
    {
        { "$vectorSearch", new BsonDocument
            {
                { "index", index },
                { "path", vectorField },
                { "queryVector", new BsonArray(queryVector.Select(v => new BsonDouble(v))) },
                { "numCandidates", Math.Min(limit * 4, 10000) }, // Scales with limit; Atlas caps numCandidates at 10000
                { "limit", limit * 2 } // Get more candidates than needed for post-filtering
            }
        }
    };
    // Add the vector search stage first
    pipeline.Add(vectorSearchStage);
    // Add filtering if provided
    if (filter != null)
    {
        // Render the filter to BSON. ToBsonDocument() would serialize the FilterDefinition object itself,
        // not the query it represents. This is the driver 2.x signature; 3.x takes a RenderArgs<T> instead.
        var filterBson = filter.Render(_Collection.DocumentSerializer, _Collection.Settings.SerializerRegistry);
        pipeline.Add(new BsonDocument("$match", filterBson));
    }
    // Add score field to results. T needs a matching VectorScore member
    // (or must ignore extra elements) for deserialization to succeed.
    pipeline.Add(new BsonDocument
    {
        { "$addFields", new BsonDocument
            {
                { "VectorScore", new BsonDocument { { "$meta", "vectorSearchScore" } } }
            }
        }
    });
    // Only keep results at or above the score threshold
    pipeline.Add(new BsonDocument
    {
        { "$match", new BsonDocument { { "VectorScore", new BsonDocument { { "$gte", minimumScore } } } } }
    });
    // Sort by the score field added above (highest first)
    pipeline.Add(new BsonDocument
    {
        { "$sort", new BsonDocument { { "VectorScore", -1 } } }
    });
    // Limit to final requested number
    pipeline.Add(new BsonDocument
    {
        { "$limit", limit }
    });
    try
    {
        // Execute the aggregation pipeline
        var results = await _Collection.AggregateAsync<T>(pipeline);
        return await results.ToListAsync();
    }
    catch (MongoException ex)
    {
        // Provide more helpful error information
        if (ex.Message.Contains("array size mismatch"))
        {
            throw new InvalidOperationException(
                "Vector dimension mismatch. The query vector dimensions don't match the dimensions in the MongoDB index. " +
                "Make sure the index was created with the same number of dimensions as your query vector. " +
                $"Current query vector has {queryVector.Length} dimensions.", ex);
        }
        throw;
    }
}
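The default `minimumScore` of 0.75 is easier to choose once you know what the score means. The sketch below assumes the index uses cosine similarity and that the reported score is normalized as (1 + cosine) / 2, which is how the Atlas documentation describes it as far as I can tell. Verify this against your own index settings. `VectorScoreSketch` and its methods are hypothetical names for illustration, not part of the MongoDB driver.

```csharp
using System;

public static class VectorScoreSketch
{
    // Cosine similarity: the dot product divided by the product of the vector lengths.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same number of dimensions.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Assumed normalization for a cosine index: maps [-1, 1] onto [0, 1].
    public static double NormalizedScore(double cosine) => (1 + cosine) / 2;

    public static void Main()
    {
        var query = new float[] { 1f, 0f };
        var sameDirection = new float[] { 2f, 0f };
        var orthogonal = new float[] { 0f, 3f };

        // Same direction: cosine 1, so the normalized score is 1.
        Console.WriteLine(NormalizedScore(CosineSimilarity(query, sameDirection)));
        // Unrelated (orthogonal) vectors: cosine 0, so the normalized score is 0.5.
        Console.WriteLine(NormalizedScore(CosineSimilarity(query, orthogonal)));
    }
}
```

Under that assumption, a threshold of 0.75 corresponds to a raw cosine similarity of 0.5, and a score of 0.5 means the vectors are unrelated rather than "half similar".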