Pages

How to Unzip Automatically your Files with Azure Function v2

I published a video that explains how to UnZip files without any code by using Logic Apps. However, that solution didn't work for bigger files or different archive type. This post is to explain how you can use the Azure Function to cover those situations. This first iteration supports "Zip" files; all the code is available in my GitHub.

Prerequisites


To create the Azure Function, I will use the excellent Azure Function extension of Visual Studio Code. You don't "need" it. However, it makes thing very easy.


You can easily install the extension from Inside Visual Studio Code by clicking on the extension button in the left menu. You will also need to install the Azure Function Core Tools

Creating the Function


Once the extension installed, you will find a new button in the left menu. That opens a new section with four new option: Create New Project, Create Function, Deploy to Function App, and Refresh.


Click on the first option Create New Project. Select a local folder and a language; for this demo, I will use C#. This will create a few files and folder. Now let's create our Function. From the extension menu, select the second option Create Function. Create a Blob Trigger named UnzipThis into the folder we just created, and select (or create) Resource Group, Storage Account, and location in your subscription. After a few seconds, another question will pop asking the name of the container that our blob trigger monitors. For this demo, input-files is used.

Once the function is created you will see this warning message.


What that means is that to be able to debug locally we will need to set the setting AzureWebJobsStorage to UseDevelopmentStorage=true in the local.settings.json file. It will look like this.

{
    "IsEncrypted": false,
    "Values": {
        "AzureWebJobsStorage": "UseDevelopmentStorage=true",
        "FUNCTIONS_WORKER_RUNTIME": "dotnet",
        "unziptools_STORAGE": "DefaultEndpointsProtocol=https;AccountName=unziptools;AccountKey=XXXXXXXXX;EndpointSuffix=core.windows.net",
    }
}

Open the file UnzipThis.cs; this is our function. On the first line of the function, you can see that the Blob trigger is defined.

[BlobTrigger("input-files/{name}", Connection = "cloud5mins_storage")]Stream myBlob

The binding is attached to the container named input-files, from the storage account reachable by the connection "cloud5mins_storage". The real connectionString is in the local.settings.json file.

Now, let's put the code we need for our demo:

[FunctionName("Unzipthis")]
public static async Task Run([BlobTrigger("input-files/{name}", Connection = "cloud5mins_storage")]CloudBlockBlob myBlob, string name, ILogger log)
{
    log.LogInformation($"C# Blob trigger function Processed blob\n Name:{name}");

    string destinationStorage = Environment.GetEnvironmentVariable("destinationStorage");
    string destinationContainer = Environment.GetEnvironmentVariable("destinationContainer");

    try{
        if(name.Split('.').Last().ToLower() == "zip"){

            CloudStorageAccount storageAccount = CloudStorageAccount.Parse(destinationStorage);
            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
            CloudBlobContainer container = blobClient.GetContainerReference(destinationContainer);
            
            using(MemoryStream blobMemStream = new MemoryStream()){

                await myBlob.DownloadToStreamAsync(blobMemStream);

                using(ZipArchive archive = new ZipArchive(blobMemStream))
                {
                    foreach (ZipArchiveEntry entry in archive.Entries)
                    {
                        log.LogInformation($"Now processing {entry.FullName}");

                        //Replace all NO digits, letters, or "-" by a "-" Azure storage is specific on valid characters
                        string valideName = Regex.Replace(entry.Name,@"[^a-zA-Z0-9\-]","-").ToLower();

                        CloudBlockBlob blockBlob = container.GetBlockBlobReference(valideName);
                        using (var fileStream = entry.Open())
                        {
                            await blockBlob.UploadFromStreamAsync(fileStream);
                        }
                    }
                }
            }
        }
    }
    catch(Exception ex){
        log.LogInformation($"Error! Something went wrong: {ex.Message}");

    }            
}

UPDATED: Thanks to Stefano Tedeschi who found a bug and suggested a fix.

The source of our compressed file is defined in the trigger. To define the destination destinationStorage and destinationContainer are used. Their value are saved into local.settings.json. Then because this function only supports .zip file a little validation was required.

Next, we create an archive instance using the new System.IO.Compression library. We then create references to the storage account, blob, and container. It not possible to used second binding here because for one archive file you have a variable number of potential extracted files. The bindings are static; therefore we need to use the regular storage API.

Then for every file (aka entry) in the archive the code upload it to the destination storage.

Deploying


To deploy the function, from the Azure Function extension click on the third option: Deploy to Function App. Select your subscription and Function App name.

Now we need to configure our settings in Azure. By default, the local.setting are NOT used. Once again the extension is handy.


Under the subscription expand the freshly deployed Function App AzUnzipEverything, and right-click on Application Settings. Use Add New Setting to create cloud5mins_storage, destinationStorage and destinationContainer.

The function is now deployed and the settings are set, now we only need to create the blob storage containers, and we will be able to test the function. You can easily do that directly from the Azure portal (portal.azure.com).

You are now ready to upload a file into the input-files container.

Let's Code Together


This first iteration only supports "Zip" files. All the code is available on GitHub. Feel free to use it. If you would like to see or add support for other archive types join me on GitHub!.

In a video, please!


I also have a video of this post if you prefer.



I also have an extended version where I introduce more the Visual Studio Extension to work with Azure Function. And explain more details about the Azure Function V2.