From SFTP/FTPS to Data Lake with Azure Functions in C#

From SFTP/FTPS to Data Lake with Azure Functions in C#

This article describes how to extract reports from a SFTP or FTPS with an Azure App Service, and saves them in a data lake.

The infrastructure

Recently we built a robot (RPA) with Softomotive 1 , to extract some excel reports from our facility management system; Dalux FM.

The RPA uploads the files to a FTP server. I choose to use SFTP instead of FTPS, because SFTP has a better usability with firewalls. I show how to hook up with both protocols.

In Azure, I have a service app, as a Timer trigger, that looks every hour at the FTP for any new files.

The data gets cleaned up and converted from xlsx to csv.

Finally the service app, uploads the files, directly to a data lake in Azure. I could have send them to an event hub, and made a more separated code, but it will do for now.

The system flow::img-120

Azure Functions timer trigger

I created this Azure Function version V3 (full version on GitHub):

1[FunctionName("DaluxFMFromFTP_TimerTrigger")]
2public static void Run([TimerTrigger("0 0 * * * *")] TimerInfo myTimer, ILogger log)
3{}

The TimerTrigger determines how often the trigger should be called: {second} {minute} {hour} {day} {month} {day-of-week}. So this will be called every hour.

In local.settings.json, I have all my variables and secrets for testing on my computer. To reach it, I write:

1var config = new ConfigurationBuilder().SetBasePath(Directory.GetCurrentDirectory())
2.AddJsonFile("local.settings.json", optional: true, reloadOnChange: true)
3.AddEnvironmentVariables().Build();

The AddEnvironmentVariable, assures that variables defined within Azure, gets transferred.

In local.settings.json, I have:

1{
2  "Values": {
3    "FTPConnectionStringDaluxFM": "host=url.com;user=username;pass=password;path=/daluxfm;"
4  }
5}

Instead of having host, user, password and path as individual variables, I have gathered them into one connection-string, and then extended ssh.net with a helper. Build it like this (full version on GitHub):

 1using Renci.SshNet;
 2
 3namespace Warehouse.Fileup.Functions.Helpers
 4{
 5    public class FTPClientHelper : SftpClient
 6    {
 7        private static string _path;
 8
 9        public string Path { get { return _path; } }
10
11        private static string _host;
12        public string Host { get { return _host; } }
13
14
15        public FTPClientHelper(string connectionString) : base(CreateConnectionInfo(connectionString))
16        {
17
18        }
19
20        private static ConnectionInfo CreateConnectionInfo(string connectionString)
21        {
22            var user = "";
23            var pass = "";
24            int port = 0;
25            foreach (var item in connectionString.Split(';'))
26            {
27                var pair = item.Split('=');
28                if (pair.Length == 2)
29                {
30                    switch (pair[0].ToLower())
31                    {
32                        case "host": _host = pair[1]; break;
33                        case "user": user = pair[1]; break;
34                        case "pass": pass = pair[1]; break;
35                        case "port": int.TryParse(pair[1], out port); break;
36                        case "path": _path = pair[1]; break;
37                    }
38                }
39            }
40
41            if (user != null && pass != null)
42                return new ConnectionInfo(_host, user, new PasswordAuthenticationMethod(user, pass));
43
44            return default;
45        }
46    }
47}

SFTP setup

SFTP has a better usability with firewalls, than FTPS.

I call the SFTP, by using the NuGet package SSH.NET with this implementation:

 1using var client = new FTPClientHelper(config["FTPConnectionStringDaluxFM"]);
 2try
 3{
 4    client.Connect();
 5}
 6catch (Exception)
 7{
 8    log.LogError("Could not connect to FTP.");
 9    throw;
10}
11
12log.LogInformation($"Connected to ftp: {client.Host}, at path: {client.Path}.");
13
14foreach (var item in client.ListDirectory(client.Path).Where(o => !o.IsDirectory))
15{
16    log.LogInformation(
17      $"- File: {item.FullName}. Created: {item.LastWriteTimeUtc}. Bytes: {item.Length}");
18    var sourceFilePath = client.Path + "/" + item.Name;
19    using var stream = new MemoryStream();
20    client.DownloadFile(sourceFilePath, stream);
21    //Use the stream
22    client.DeleteFile(sourceFilePath);
23}

using no longer needs to be packed in to curly brackets like using(...){code}, after introduction of C# 8.0.

In the foreach, I only loop through files, by writing .Where(o => !o.IsDirectory).

FTPS setup

If you are using FTPS, then I recommend FluentFTP:

 1using var client = new FTPClientHelper(config["FTPConnectionStringDaluxFM"]);
 2client.AutoConnect();
 3log.LogInformation($"Connected to ftp: {client.Host}, at path: {client.Path}.");
 4
 5foreach (FtpListItem item in client.GetListing(client.Path))
 6{
 7    log.LogInformation($"- File: {item.FullName}. Created: {item.Modified}. Bytes: {item.Size}");
 8
 9    if (item.Type == FtpFileSystemObjectType.File)
10    {
11        using var stream = new MemoryStream();
12        client.Download(stream, item.FullName);
13        //Use the stream
14    }
15}

Where the FTPClientHelper is:

 1using FluentFTP;
 2
 3namespace Warehouse.Fileup.Functions.Helpers
 4{
 5    public class FTPClientHelper : FtpClient
 6    {
 7        public readonly string Path;
 8
 9        public FTPClientHelper(string connectionString)
10        {
11            var user = "";
12            var pass = "";
13            foreach (var item in connectionString.Split(';'))
14            {
15                var pair = item.Split('=');
16                if (pair.Length == 2)
17                {
18                    switch (pair[0].ToLower())
19                    {
20                        case "host": Host = pair[1]; break;
21                        case "user": user = pair[1]; break;
22                        case "pass": pass = pair[1]; break;
23                        case "port": if (int.TryParse(pair[1], out int port)) Port = port; break;
24                        case "path": Path = pair[1]; break;
25                    }
26                }
27            }
28
29            if (user != null && pass != null)
30                Credentials = new System.Net.NetworkCredential(user, pass);
31
32            this.EncryptionMode = FtpEncryptionMode.Explicit;
33            this.ValidateAnyCertificate = true;
34        }
35    }
36}

  1. I should mention that Microsoft recently acquired Softomotive, which is becoming WinAutomation as a part of Microsoft Power Platform, and that would be awesome, if we could have our RPA's hosted in a Microsoft environment - however we have evaluated that it is not time yet, to switch over. ↩︎