Scheduled JSON POST in Linux using cron and du – to Logic Apps and Power BI

This post digs into getting regular disk-space data out of Linux and sending it to an Azure Logic App, for submission into Power BI for visualisation.

So in my last post, I set up an rsync server to take a load of files from my “on-premises” storage. I have a reasonable internet connection, but I'm impatient and was always checking progress by logging into my rsync server and checking directory sizes with the du command.
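The command in question looks something like this (the path is just my example):

du -h --max-depth=1 /mnt/backup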

-h gives you a human-readable size, and --max-depth=1 reports only on the top-level directories.

du output to JSON

So the first step is to take the output I like and get it into a better format.
Here's what I came up with. It's a little hacky: to create valid JSON, I put an empty object at the end of the array to avoid messing around with removing the last comma.
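The original script isn't reproduced here, so this is a sketch of the idea under my own assumptions (the path, the field names and the hostname field are mine). Note it uses plain du rather than du -h, so the sizes come out as KB numbers – which matters later:

#!/bin/bash
# Emit one JSON object per directory that du reports on; the trailing
# empty object {} keeps the array valid without last-comma surgery.
echo '{ "machine": "'"$(hostname)"'", "sizes": ['
du --max-depth=1 /mnt/backup | while read -r size dir; do
  echo '  { "size": "'"$size"'", "directory": "'"$dir"'" },'
done
echo '  {} ] }'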

This outputs valid JSON, which I can then curl out to a waiting Logic App, Power BI dataset, or just a web service.

Logic App

So I chose to post the data to an Azure Logic App. This way I can take advantage of its pre-built connectors for email/Power BI whenever I change my mind about what to do with the data.

I new up a Logic App in Azure, choosing the HTTP Request/Response trigger template, and paste in the JSON – it creates the schema for me and I'm ready to go.

logic app new

Curling the JSON

So now that I've got a URL from Logic Apps to post to, I can create the curl command on my Linux box.
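Something along these lines does the job, piping the output of the JSON script above into curl (the script name and the trigger URL are placeholders for my own values):

./du-to-json.sh | curl -X POST -H "Content-Type: application/json" --data-binary @- "<logic-app-trigger-url>"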

Let's see if that worked by checking the Logic App run history.

Scheduling the curl

OK, so all works as planned. Let's get this reporting the data every 30 minutes.

First, let's put the code in a shell script file.
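For example (the location and filename are my choices):

nano /home/me/disk-report.sh     # paste in the du-to-JSON code plus the curl line
chmod +x /home/me/disk-report.sh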

Then let's schedule it to run every 30 minutes.
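A crontab entry along these lines does it (crontab -e as the same user; the script path is assumed from above):

*/30 * * * * /home/me/disk-report.sh >/dev/null 2>&1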

Logic App run history – oh good, it's working every 30 mins 🙂

Developing the Logic App

So up until this point the Logic App contains just one step: the trigger that receives the data, which it does nothing with.
Let's send the data over to Power BI so I can quickly check it from the mobile app whenever I'm interested.

First up, head to Power BI, click on a workspace and then add a dataset. I'm going to use a streaming dataset with the API model.
You provide the field names and that's it.

Next, we add a few more actions to the Logic App:
– Filter Array (so we're just working with the total-size item)
– Add rows to a Power BI dataset
– A simple calculation to work out the GB size from the KB size provided in the request (see the expression sketch below)
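For the KB-to-GB step, a Logic Apps expression something like this should work (the size field name comes from my JSON sketch earlier, item() assumes it's evaluated per item over the filtered array, and 1048576 is the number of KB in a GB):

div(float(item()?['size']), 1048576)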

Power BI Reports

So, once we have data going into a Power BI hybrid streaming dataset, we can craft some reports.

Streaming dashboard

Data over time

Troubleshooting rsync with ReadyNAS, Azure and CentOS

Years ago I bought a couple of Netgear ReadyNAS devices: a Duo, and subsequently a Duo v2. They're pretty basic, but they offered good Squeezebox support and a cheap way of storing terabytes of data in a RAID config.

Both ReadyNAS devices support backup operations to send their data off on a scheduled basis. I'd normally opt for the simplicity of CIFS/Samba, but my internet provider has decided to block those ports, and the ReadyNAS devices don't allow you to use a non-standard port. Thus the only other way to get the job done is rsync.

My desired location for the backup is Azure (naturally!), ideally Azure Files, as the data will be most accessible to me over an SMB share, the same way I've always accessed my ReadyNAS devices.

Here's a rundown of the errors I received along the way, and how to get around them.

rsync: getaddrinfo: 873: Name or service not known

It turned out that my rsync daemon wasn’t listening correctly.

Check with
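(Assuming CentOS 7 with systemd; these are my usual checks rather than the original commands.)

ss -tln | grep 873          # is anything listening on the rsync port?
systemctl status rsyncd     # is the daemon actually running?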

The quick command to get it running is
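Again assuming systemd, something like:

sudo systemctl start rsyncd
sudo systemctl enable rsyncd    # and have it survive a reboot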

rsync: failed to set times on “.” (in frontbut): Operation not permitted (1)

At first I thought this problem was down to the way I was mounting Azure Files, and that its filesystem didn't support setting times. Most of the solutions on the web tell you to use the -O flag to omit updating directory times.

However, the actual solution was that the username my ReadyNAS was using was not the owner of the directory.
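Something like this fixes it (the path is my example – use whatever directory your module points at):

sudo chown -R user1 /mnt/backup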

This statement changes the ownership (recursively) of the directory to user1. This should match the username you are using in the ReadyNAS and in the rsyncd.conf file.

ERROR: The remote path must start with a module name not a /

Pretty easy one here. The path must only reference the module defined in the rsyncd.conf file – not the directory path.
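So, assuming a module named [backup] in rsyncd.conf, the backup destination field wants just:

backup

and not a filesystem path like /mnt/backup.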
ReadyNAS backup destination

@ERROR: Unknown module ‘mnt’

I was having an issue whereby the config file I was editing wasn't the one being picked up by rsync (a typo).
I was editing /etc/rsync.conf when it should have been /etc/rsyncd.conf.
Inside this configuration file are the various module definitions (specifying the path etc.); it's the module name that must be used.
rsync module

@ERROR: chroot failed

In your rsyncd.conf file, make sure that use chroot = false.

@ERROR: chdir failed

Ensure that the directory has the correct permissions allocated.

My final rsyncd.conf file
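The file itself isn't reproduced here, but a minimal version consistent with all the fixes above would look something like this (the module name, path and user are my examples):

# /etc/rsyncd.conf
uid = user1
gid = user1
use chroot = false

[backup]
    path = /mnt/backup
    comment = ReadyNAS backup target
    read only = false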

Create Kubernetes Cluster in Azure


Azure has two container service offerings: ACS and AKS.
ACS was the first to be released; it gives a choice of orchestrators but is little more than an ARM template, with no management capability. These are some of the issues that AKS addresses. I'm confident that when AKS is Generally Available, ACS will be deprecated. Until that point, however, I like to stay with the GA container service.

I have a shell script that creates my cluster with my optimal “cheapo” settings. Probably worth noting that this config is pretty slow, and not great at taking load tests – but hey, you get what you pay for.

I usually kick this off in the Azure Cloud Shell, and I pass in just one parameter: the name of the Resource Group.
The reasons for the script are as follows (a sketch of it follows the list).
1) I want to consistently add tags to my resource group for automation.
2) I use a service principal to access Azure which has a much lower set of permissions. At the point of creation I want it to automatically have Contributor access.
3) I want the cluster to be small, and sized to be cheap.
4) I want the SSH credentials zipped and ready for me to download to other clients to access the cluster. I do this partly so I can easily get away from the Cloud Shell and its aggressive timeouts. It's probably worth saying that this is a sledgehammer approach; I could just go into the ~/.kube/ directory and copy out the specific kube config file.
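The script itself isn't shown here, so below is a sketch covering those four points, using the az CLI of the ACS era (the location, VM size, tags and the service principal variables are all my assumptions):

#!/bin/bash
# Usage: ./create-cluster.sh <resource-group-name>
RG=$1

# 1) Resource group, consistently tagged for automation
az group create --name "$RG" --location westeurope --tags owner=me purpose=k8s

# 2) The low-privilege service principal gets Contributor on the new group
az role assignment create --assignee "$SP_APP_ID" --role Contributor --resource-group "$RG"

# 3) Small, cheap cluster
az acs create --orchestrator-type kubernetes \
  --resource-group "$RG" --name "${RG}-k8s" \
  --master-count 1 --agent-count 1 --agent-vm-size Standard_A1_v2 \
  --service-principal "$SP_APP_ID" --client-secret "$SP_SECRET" \
  --generate-ssh-keys

# 4) Zip the SSH and kube credentials to pull down to other clients
az acs kubernetes get-credentials --resource-group "$RG" --name "${RG}-k8s"
zip -r "${RG}-creds.zip" ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/.kube/config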

Hope this proves useful.

Azure ARM Template – VM Domain Join Automation


In this post we're going to look at the ARM template and the steps needed to Domain Join a Windows VM using PowerShell DSC and Azure Automation.

The ARM template will do the following:

  • Create an Automation Account
  • Add an Automation Credential
  • Add an Automation Variable
  • Import a PowerShell module
  • Add a DSC Configuration

We'll then use a simple PowerShell script to:

  • Specify the ConfigData
  • Compile the DSC Node Configuration

Then we'll head to the portal and tell a VM to use this config.

ARM template

Take particular note of the parameters. They set up the Automation variable that holds the domain name, and the credential for the user account that has permission in the domain to perform the domain join.
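The full template isn't reproduced here, but a cut-down sketch of its shape might look like the following – resource names, the module URI and the configuration source URI are placeholders of mine. One quirk worth noting: Automation string variables want their value JSON-encoded, hence the concat of quotes around the domain name.

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "accountName": { "type": "string" },
    "domainName": { "type": "string" },
    "domainJoinUserName": { "type": "string" },
    "domainJoinUserPassword": { "type": "securestring" }
  },
  "resources": [
    {
      "name": "[parameters('accountName')]",
      "type": "Microsoft.Automation/automationAccounts",
      "apiVersion": "2015-10-31",
      "location": "[resourceGroup().location]",
      "properties": { "sku": { "name": "Basic" } },
      "resources": [
        {
          "name": "DomainJoinCredential",
          "type": "credentials",
          "apiVersion": "2015-10-31",
          "dependsOn": [ "[resourceId('Microsoft.Automation/automationAccounts', parameters('accountName'))]" ],
          "properties": {
            "userName": "[parameters('domainJoinUserName')]",
            "password": "[parameters('domainJoinUserPassword')]"
          }
        },
        {
          "name": "DomainName",
          "type": "variables",
          "apiVersion": "2015-10-31",
          "dependsOn": [ "[resourceId('Microsoft.Automation/automationAccounts', parameters('accountName'))]" ],
          "properties": { "value": "[concat('\"', parameters('domainName'), '\"')]" }
        },
        {
          "name": "xDSCDomainjoin",
          "type": "modules",
          "apiVersion": "2015-10-31",
          "dependsOn": [ "[resourceId('Microsoft.Automation/automationAccounts', parameters('accountName'))]" ],
          "properties": { "contentLink": { "uri": "<uri-to-the-module-package>" } }
        },
        {
          "name": "DomainJoin",
          "type": "configurations",
          "apiVersion": "2015-10-31",
          "dependsOn": [ "[resourceId('Microsoft.Automation/automationAccounts', parameters('accountName'))]" ],
          "properties": { "source": { "type": "uri", "value": "<uri-to-the-dsc-configuration-ps1>" } }
        }
      ]
    }
  ]
}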

The script can also be found here on GitHub:

Config compilation script

This PowerShell script will connect to Azure and submit a compilation job for the DSC Configuration defined in the ARM template.
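The script isn't shown here; a sketch using the AzureRM cmdlets of the era would look something like this (the resource group, account and configuration names are mine):

# Log in to Azure
Login-AzureRmAccount

# ConfigurationData controls which node blocks get compiled
$ConfigData = @{
    AllNodes = @(
        @{
            NodeName                    = 'localhost'
            PSDscAllowPlainTextPassword = $true
        }
    )
}

# Submit the compilation job for the configuration the ARM template created
Start-AzureRmAutomationDscCompilationJob `
    -ResourceGroupName 'my-rg' `
    -AutomationAccountName 'my-automation-account' `
    -ConfigurationName 'DomainJoin' `
    -ConfigurationData $ConfigData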

Assign the configuration to an existing VM

Now that the Configuration has compiled, it's ready to be used.
To assign it to a VM that already exists, the quickest way is the Portal.
Follow the wizard instructions to complete the enrolment, and wait whilst the config is applied.
DSC Nodes

Applying the VM config during an ARM template build

If you want to domain join a VM or VM Scale Set created from an ARM template, you can leverage this script (a trimmed sketch follows the list). The three variables that need to be set:

  1. automationRegistrationUrl – the registration URL from your Azure Automation account
  2. automationKey – an access key from your Azure Automation account
  3. automationConfiguration – the name of the configuration to apply
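That script boils down to the Azure DSC extension resource. A trimmed sketch for a standalone VM might look like this – it follows the shape of the quickstart templates that register a node with the Azure Automation pull server, with the three values above surfaced as variables (vmName and the exact typeHandlerVersion are my assumptions):

{
  "type": "Microsoft.Compute/virtualMachines/extensions",
  "name": "[concat(parameters('vmName'), '/Microsoft.Powershell.DSC')]",
  "apiVersion": "2017-03-30",
  "location": "[resourceGroup().location]",
  "dependsOn": [
    "[resourceId('Microsoft.Compute/virtualMachines', parameters('vmName'))]"
  ],
  "properties": {
    "publisher": "Microsoft.Powershell",
    "type": "DSC",
    "typeHandlerVersion": "2.76",
    "autoUpgradeMinorVersion": true,
    "protectedSettings": {
      "Items": { "registrationKeyPrivate": "[variables('automationKey')]" }
    },
    "settings": {
      "Properties": [
        {
          "Name": "RegistrationKey",
          "Value": { "UserName": "PLACEHOLDER_DONOTUSE", "Password": "PrivateSettingsRef:registrationKeyPrivate" },
          "TypeName": "System.Management.Automation.PSCredential"
        },
        { "Name": "RegistrationUrl", "Value": "[variables('automationRegistrationUrl')]", "TypeName": "System.String" },
        { "Name": "NodeConfigurationName", "Value": "[variables('automationConfiguration')]", "TypeName": "System.String" }
      ]
    }
  }
}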

ARM templates – Importing modules from the Azure Automation gallery


In this post we're going to look at referencing PowerShell modules from the gallery in your own ARM templates. This will save having manual steps to complete after your Azure Automation Account has been deployed.
If you are looking to package up your own modules, then this guide is not for you.

Using ARM templates to deploy any kind of infrastructure is foundational when it comes to cloud skills.
As part of an ARM template I've been creating, I want to deploy an Azure Automation Account fully configured to domain join VMs. A good first step in creating an ARM template is to manually create the infrastructure in the Azure Portal and copy out the Automation Script before neatening it up. Unfortunately this doesn't work so well with several of the assets in an Automation Account, modules being the first issue.

I tend to make heavy use of modules from the gallery, but it's a little awkward to insert them into an ARM template. The Automation Script generated in Azure offers you something like this:
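(Reconstructed from memory of the generated script; the key point is that there's no contentLink telling Azure where to fetch the module from.)

{
  "name": "[concat(parameters('accountName'), '/xDSCDomainjoin')]",
  "type": "Microsoft.Automation/automationAccounts/modules",
  "apiVersion": "2015-10-31",
  "location": "[parameters('accountLocation')]",
  "properties": {}
}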

Which, if you look at it, seems reasonable. Its type has been set to Microsoft.Automation/automationAccounts/modules and the version looks good. However, this won't pull the module down from the gallery.
Here's what you need to do…

Find the Module

For me, the module is v1.1 of xDSCDomainJoin.

Root Template

So on this page there's a link to “Deploy to Azure Automation”.
Clicking the link takes you to the Azure Portal with the template pre-loaded to deploy.

If you extract the TemplateUrl out and URL-decode it, you're left with a plain URL. When you hit that URL you get prompted to download.

Once it's downloaded, rename the file's extension back to .json from .zip.
Here's what's inside:

This is a pretty simple template that calls another template at the URL held in this variable:
"templatelink": "[concat('', parameters('New or existing Automation account'), 'AccountTemplate.json')]",

Because we're going to be using this with a new Automation Account, let's compose the URL properly:

As parameters for this template, it passes in the Name and Uri for the xDSCDomainjoin module. We’ll come back for these variables later.

New account template

Follow the URL from above; you'll be prompted to download. Again, rename the extension back to .json.
Open the file and you'll see:

Snipping the bits that matter

It's simply the module resource that we want to lift into our own Automation Account template. Grab that definition, and the two variables from the last file, and you have enough to plug into your own ARM template.

"xDSCDomainjoin:1.1.0:Name": "xDSCDomainjoin",
"xDSCDomainjoin:1.1.0:Uri": ""

"name": "[parameters('xDSCDomainjoin:1.1.0:Name')]",
"type": "modules",
"apiVersion": "2015-10-31",
"location": "[parameters('accountLocation')]",
"dependsOn": [
"[concat('Microsoft.Automation/automationAccounts/', parameters('accountName'))]"
"tags": {},
"properties": {
"contentLink": {
"uri": "[parameters('xDSCDomainjoin:1.1.0:Uri')]"