We use Amazon DynamoDB for a number of microservices in my current project at work. One of our DynamoDB tables caches a certain type of entity, modelled similar to this:
[DynamoDBTable("book-cache")]
public class Book
{
    [DynamoDBHashKey]
    public Guid Id { get; set; }

    [DynamoDBGlobalSecondaryIndexHashKey]
    public string Reference { get; set; }

    public bool IsActive { get; set; }

    // More properties here
}
There can be multiple books in the table with the same reference, but there can only be ONE active book per reference at a time. The first time a user searches for a book reference, a new cache entry is created, and that same entry is reused by every later search for the reference. The entry becomes inactive once it is used in an order transaction. Because of that, when we queried the table to get the active book by reference, we called the GetNextSetAsync method on the result of IDynamoDBContext.FromQueryAsync instead of GetRemainingAsync, on the assumption that the active book would come back in the first result set.
public async Task<Book> GetByReference(string reference)
{
    var queryFilter = new QueryFilter();
    queryFilter.AddCondition("Reference", QueryOperator.Equal, reference);
    queryFilter.AddCondition("IsActive", QueryOperator.Equal, 1);

    var results = await _dbContext.FromQueryAsync<Book>(new QueryOperationConfig
        {
            IndexName = "Reference_GSI",
            Filter = queryFilter
        })
        .GetNextSetAsync();

    return results.FirstOrDefault();
}
Now that we have a large enough amount of data in the table, I noticed that there are many active records for the same book reference. After a bit of debugging, I found that GetByReference always returns null. This is strange, as I can see the active records for that reference in the table. So I changed the code to keep fetching the next result set until the query is finished.
public async Task<Book> GetByReference(string reference)
{
    var queryFilter = new QueryFilter();
    queryFilter.AddCondition("Reference", QueryOperator.Equal, reference);
    queryFilter.AddCondition("IsActive", QueryOperator.Equal, 1);

    // FromQueryAsync returns an AsyncSearch<Book> directly, so it is not awaited;
    // only the calls that actually fetch a page are.
    var query = _dbContext.FromQueryAsync<Book>(new QueryOperationConfig
    {
        IndexName = "Reference_GSI",
        Filter = queryFilter
    });

    var results = await query.GetNextSetAsync();
    while (!query.IsDone)
    {
        results.AddRange(await query.GetNextSetAsync());
    }

    return results.FirstOrDefault();
}
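What I found is that the first iteration returns an empty result, then 13 items for the next, 24 for the one after, and so on. This is consistent with how DynamoDB applies filters: a query reads one page of items for the key first and only then applies the filter, so a page can come back empty even though later pages contain matches. In our case the first page of items for the reference most likely held only inactive books. It also explains why there are so many active records for the same book reference: the first result set is empty, so a new cache entry is created for every search. Given this behaviour, it is safer to use the GetRemainingAsync method, which keeps fetching until the query is exhausted. The method now looks like this: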
public async Task<Book> GetByReference(string reference)
{
    var queryFilter = new QueryFilter();
    queryFilter.AddCondition("Reference", QueryOperator.Equal, reference);
    queryFilter.AddCondition("IsActive", QueryOperator.Equal, 1);

    var results = await _dbContext.FromQueryAsync<Book>(new QueryOperationConfig
        {
            IndexName = "Reference_GSI",
            Filter = queryFilter
        })
        .GetRemainingAsync();

    return results.FirstOrDefault();
}
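Since we only ever need the first active book, one possible refinement is to stop paging as soon as a page contains a match, rather than loading every remaining page. The following is only a sketch of that idea, not something we run in production. PagedQuery, FirstOrDefaultAcrossPagesAsync and FakeSearch are hypothetical names made up for illustration; FakeSearch is an in-memory stand-in that mimics the GetNextSetAsync/IsDone shape of the SDK's search object so the example runs without AWS.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PagedQuery
{
    // Fetches pages one at a time and returns the first item of the first
    // non-empty page. Returns default(T) only once the source reports that
    // there are no more pages, so an empty first page is not mistaken for
    // "no match".
    public static async Task<T> FirstOrDefaultAcrossPagesAsync<T>(
        Func<Task<List<T>>> getNextPage,
        Func<bool> isDone)
    {
        do
        {
            List<T> page = await getNextPage();
            if (page.Count > 0)
            {
                return page[0];
            }
        } while (!isDone());

        return default;
    }
}

// Hypothetical in-memory stand-in for a paged search (assumption: it only
// mirrors the two members used above, not the real SDK type).
public sealed class FakeSearch
{
    private readonly Queue<List<string>> _pages;

    public FakeSearch(IEnumerable<List<string>> pages)
    {
        _pages = new Queue<List<string>>(pages);
    }

    // Number of pages requested so far.
    public int PagesFetched { get; private set; }

    // True once every page has been handed out.
    public bool IsDone => _pages.Count == 0;

    public Task<List<string>> GetNextSetAsync()
    {
        PagesFetched++;
        var page = _pages.Count > 0 ? _pages.Dequeue() : new List<string>();
        return Task.FromResult(page);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var search = new FakeSearch(new[]
        {
            new List<string>(),                      // first page: everything filtered out
            new List<string> { "book-a", "book-b" }, // second page: first matches
            new List<string> { "book-c" },           // third page: never fetched
        });

        string first = await PagedQuery.FirstOrDefaultAcrossPagesAsync(
            () => search.GetNextSetAsync(),
            () => search.IsDone);

        // Prints: book-a found after 2 page(s)
        Console.WriteLine($"{first} found after {search.PagesFetched} page(s)");
    }
}
```

Against the real context, the same helper could presumably be called as `PagedQuery.FirstOrDefaultAcrossPagesAsync(() => query.GetNextSetAsync(), () => query.IsDone)`, where `query` is the object returned by `_dbContext.FromQueryAsync<Book>(...)`. The trade-off compared with GetRemainingAsync is a little more code in exchange for fewer reads when the match appears early.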