I am building an Azure Durable Functions app with an orchestrator and four different functions. The first function is a startup task that removes containers inside a Cosmos DB database. The second function, which is the one I am having a problem with, scans a Cosmos DB container that holds multiple links and extracts some of them. It then saves the extracted links to a new Cosmos DB container.
Inside the orchestrator function, I use a for loop that runs this second function four times with different parameters (in reality there will be nine or more). I want the app to take A, B, C, and D and finish all of them before moving on. Right now, execution is going back and forth.
Thank you for your help
This is my orchestration function:
[Function(nameof(CosmosDBFunction))]
public async Task<string> RunOrchestrator(
    [OrchestrationTrigger] TaskOrchestrationContext context)
{
    ILogger logger = context.CreateReplaySafeLogger(nameof(CosmosDBFunction));
    logger.LogInformation("----Function Started-----");
    try
    {
        await context.CallActivityAsync("StartupTask");
    }
    catch (Exception ex)
    {
        logger.LogError($"Error happened StartupTask: {ex.Message}");
    }
    var parallelTasks = new List<Task<string>>();
    var sites = new List<string>() {
        "A", "B", "C", "D"
    };
    // Start one scan activity per site without awaiting it yet.
    for (int i = 0; i < sites.Count; i++)
    {
        try
        {
            var task = context.CallActivityAsync<string>("ScanSafetoAContainer", sites[i]);
            parallelTasks.Add(task);
        }
        catch (TaskFailedException ex) when (ex.InnerException is CosmosException cosmosException)
        {
            logger.LogError($"Error calling for site {sites[i]}: {ex.Message}");
        }
        catch (TaskFailedException ex)
        {
            logger.LogError($"Error calling {sites[i]}: {ex.Message}");
        }
    }
    // Wait for every scan activity to finish before continuing.
    await Task.WhenAll(parallelTasks);
    await context.CallActivityAsync("SaveToAContainer");
    try
    {
        logger.LogInformation("----Function B Started-----");
        await context.CallActivityAsync("SaveToXContainer");
    }
    catch (TaskFailedException ex)
    {
        logger.LogError($"Error happened : {ex.Message}");
    }
    try
    {
        logger.LogInformation("----Function clinksToBlinks Started-----");
        await context.CallActivityAsync("SaveToYContainer");
    }
    catch (TaskFailedException ex)
    {
        logger.LogError($"Error happened: {ex.Message}");
    }
    logger.LogInformation("----Function Finished-----");
    try
    {
        logger.LogInformation($"Operation is completed");
        return "Operation is completed";
    }
    catch (Exception ex)
    {
        return $"Operation is completed: exception at the end line {ex.Message}";
    }
}
This is the function that is called inside the loop:
[Function(nameof(Something))]
[CosmosDBOutput(databaseName: "DB", containerName: "AB",
    Connection = "CosmosConnection",
    CreateIfNotExists = true, PartitionKey = "/something")]
public async Task<string> SaveAToContainer(
    [ActivityTrigger] string site,
    FunctionContext executionContext)
{
    ConcurrentBag<Object> links = new ConcurrentBag<Object>();
    ILogger _logger = executionContext
        .GetLogger(nameof(Something));
    var container = await CreateContainer();
    // Assumes _httpClientFactory is an injected IHttpClientFactory.
    var client = _httpClientFactory.CreateClient();
    client.Timeout = TimeSpan.FromSeconds(60);
    // Query the pages to scan; `prop` is a placeholder filter value.
    using FeedIterator<Page> pages = container.GetItemLinqQueryable<Page>()
        .Where(s => s.prop == prop)
        .ToFeedIterator();
    while (pages.HasMoreResults)
    {
        long counter = 0;
        foreach (var page in await pages.ReadNextAsync())
        {
            string href = string.Empty;
            try
            {
                // Download the page and collect the absolute links in its main content.
                HttpResponseMessage responseFromUrl = await client.GetAsync(page.Url);
                var htmlContent = await responseFromUrl.Content.ReadAsStringAsync();
                var doc = new HtmlDocument();
                doc.LoadHtml(htmlContent);
                HtmlNode divNode = doc.DocumentNode.SelectSingleNode("//div[@id='ga-maincontent']");
                HtmlNodeCollection anchorTags = doc.DocumentNode
                    .SelectNodes("//div[@id='ga-maincontent']//a[@href]");
                if (anchorTags != null)
                {
                    foreach (HtmlNode anchorTag in anchorTags)
                    {
                        href = anchorTag.GetAttributeValue("href", string.Empty);
                        if (href.StartsWith("http"))
                        {
                            long uniqueId = Interlocked.Increment(ref counter);
                            links.Add(new Link
                            {
                                Id = $"{Guid.NewGuid().ToString()}{Interlocked.Increment(ref counter)}",
                                PropA= something
                            });
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                // Record the failed page as a link entry too, then log the failure.
                links.Add(new Link
                {
                    Id = $"{Guid.NewGuid().ToString()}{Interlocked.Increment(ref counter)}",
                    PropA= something
                });
                _logger.LogInformation($"Http request exception :" +
                    $" Message :{ex.Message} | Source {ex.Source} ");
            }
            //if (i >= pages.) break;
        }
    }
    string scannedLinks = string.Empty;
    try
    {
        scannedLinks = JsonConvert.SerializeObject(links);
    }
    catch (Exception ex)
    {
        _logger.LogInformation($"Parallel Func Insert Links : {ex.Message} ");
    }
    return scannedLinks;
}
I used to use a List instead of a ConcurrentBag, but then I thought that List not being thread-safe might be why the app is not executing as I expect. However, I believe I am using the async and await keywords correctly to make sure that the app runs and waits for the tasks to complete.
What I really want is this: when I loop through the tasks, A, B, C, and D should all complete and be saved to Cosmos DB, and only then should we move on to the third function.
Any suggestions about what I am doing wrong would be much appreciated.