Webinar: What are the best Data Engineering tools of 2023?

What are the best Data Engineering tools of 2023? Every year, BARC publishes a report that covers a wide range of Data Engineering technologies and evaluates them on several aspects. In half an hour, we'll break down the most interesting parts of this report and bring you up to speed on the latest trends.

Within Data Engineering, there are many technologies you can use in your organization. But which technologies keep up with the latest developments, and what are the current trends in the Data Engineering space?

Every year, BARC writes a data management summary that covers various technologies and reviews them on different aspects. We reviewed this report for you and will tell you all the interesting details. Fill out the form below to discover the most interesting trends and technologies in the field of Data Engineering.

What will you learn in this webinar?

  • The key takeaways from the BARC report
  • Most interesting new insights

Watch the webinar

Webinar: What are the best Low Code apps in 2023?

Looking for a low-code approach to application development? The Gartner Magic Quadrant on Low Code platforms is a whopping 125-page report that lists all the platforms. Having a hard time dissecting every detail in the report? Don't worry, our experts delved into every little detail!

In the report “Magic Quadrant for Enterprise Low-Code Application Platform,” Gartner explains the differences between various Low Code platforms. But how do you get the important information from this 125-page report? Luckily, you don’t have to do this yourself. In half an hour, our Low Code specialist will update you on the most important developments and trends for the upcoming year.

What will you learn in this webinar?

  • The key takeaways from Gartner’s report: Magic Quadrant for Enterprise Low-Code Application Platform
  • Most interesting new insights

Watch the webinar


Questions regarding the Gartner Magic Quadrant?

Databricks for Data Science

Picture this: your business is booming, and you need to make informed decisions quickly to stay ahead of the competition. But forecasting can be a tedious and time-consuming process, leaving you with less time to focus on what’s important. That's where Azure Databricks comes in! We used this powerful technology to automate our internal forecasting process and save precious time. In this blog post, we'll show you the steps we took to streamline our workflow and make better decisions with confidence. So make yourself a coffee and enjoy the read!

Initial Setup

Before we dive into the nitty-gritty of coding with Databricks, there are a few important setup steps to take.

First, we created a repository on Azure DevOps, where we could easily track and assign tasks to team members, make comments on specific items, and link them to our Git commits. This helped us stay organized and focused on our project goals.

Next, we set up a new resource group on Azure with three resources: Azure Databricks Services, a blob storage account, and a key vault. Although we already have clean data in our Rockfeather Database (thanks to our meticulous data engineers), we wanted to keep our intermediate files separate in this resource group to ensure version control and maintain a clean workflow. Within our blob storage, we created containers to store our formatted historical actuals, exogenous features, and predictions.

Finally, we sketched out a high-level project architecture to get a bird’s-eye view of the project. This helped us align on our deliverables and encouraged discussion within the team about what a realistic outcome would look like. By taking these initial setup steps, we were able to hit the ground running with Databricks and tackle our forecasting process with confidence.

Project Architecture

Moving on to Databricks

Setting up Azure Databricks

Now we’re ready to dive into Databricks and its sleek interface. But before we start coding, there are a few more setup steps to take.

First, we want to link the Azure DevOps repo we set up earlier to Databricks. Kudos to Azure Databricks, the integration between these two tools is seamless! To link the repo, we simply go to User Settings > Git Integration and drop the repo link there. For more information, check out this link.

To keep our database and blob storage keys and passwords secure, we use the key vault to store our secrets, which we link to Databricks. You can read more about that here!
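
To make this concrete, here is a minimal sketch of how secrets can be read inside a notebook once a Key Vault-backed secret scope has been set up; the scope, secret, and storage account names below are hypothetical placeholders.

```python
# Minimal sketch: read secrets from a Key Vault-backed secret scope in a Databricks notebook.
# The scope, secret name, and storage account are hypothetical placeholders.
storage_key = dbutils.secrets.get(scope="rockfeather-kv", key="blob-storage-key")

# Make the blob storage account accessible to Spark using the retrieved key.
spark.conf.set(
    "fs.azure.account.key.rockfeatherstorage.blob.core.windows.net",
    storage_key,
)
```

A nice bonus is that secret values are redacted in notebook output, so keys never end up on screen or in Git.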

Lastly, we need to create a compute resource that will run our code. Unlike Azure Machine Learning, Databricks doesn’t have compute instances, only clusters. While this means it takes a minute to spin up the cluster, we don’t have to worry about forgetting to terminate it since it automatically does so after a pre-defined time period of inactivity! It’s also super easy to install libraries on our compute: just head over to the Libraries tab and click on “⛓️ Install new”!

From Database Query to Forecast

Now that our setup is complete, we move on to using Databricks notebooks for data loading, data engineering, and forecasting. We break this down into three sub-sections:

  1. Query Data: We create a notebook which reads the formatted data from the blob storage containers we created earlier. We then transform this data and create data frames for further data engineering.
  2. Preprocess Data: Here, we perform data cleaning, feature engineering, and data aggregation on our data frames to get them into a shape suitable for forecasting. This is also the place to apply transformations such as normalisation, scaling, and one-hot encoding to prepare our data for modelling.
  3. Forecast: We use machine learning models such as Random Forest, Gradient Boosting, and Time Series models to forecast future values based on our prepared data. For our baseline model, we use the classic Exponential Smoothing model. We then store these predictions back in our blob storage for further analysis and visualisation.

By breaking down the process into these three sub-sections, we can work more efficiently and focus on specific tasks without getting overwhelmed. As shown in the Project Architecture above, each step is one notebook. Let’s have a closer look at each notebook.

Notebook 1: Query Data

The code we’ll be writing in Databricks is pretty much in standard notebook format, which is familiar territory for all data scientists. Our data engineers in the audience will also appreciate that we can write, for example, SQL code in the notebook. All we have to do is include the %sql magic command (https://docs.databricks.com/notebooks/notebooks-code.html) at the beginning of the cell, as shown below. We use this approach for our first notebook, where we query the data we need. In our case, that's the past billable hours and the available hours of our lovely consultants. Once we have the data we need, aggregated to the right level, we save it in our blob storage for the next step.
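
To give you an idea of what that looks like, here is a minimal sketch of the query step; the table, column, and container names are hypothetical placeholders, and we show the spark.sql equivalent so the query and the write step fit in one Python snippet (in the notebook itself, the query can just as well live in its own cell starting with %sql).

```python
# Minimal sketch of Notebook 1: query the hours data and stage it in blob storage.
# Table, column, and container names are hypothetical placeholders.
hours_df = spark.sql("""
    SELECT consultant_id,
           DATE_TRUNC('month', entry_date) AS month,
           SUM(billable_hours)  AS billable_hours,
           SUM(available_hours) AS available_hours
    FROM   rockfeather.consultant_hours
    GROUP BY consultant_id, DATE_TRUNC('month', entry_date)
""")

# Write the aggregated history to a container for the next notebook to pick up.
hours_df.write.mode("overwrite").parquet(
    "wasbs://historical-actuals@rockfeatherstorage.blob.core.windows.net/hours"
)
```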

Notebook 2: Preprocess data

In Notebook 2, we use pandas and numpy to preprocess our data and do our feature engineering. We need to make sure we have all the data on the features we’ve identified for the whole forecast period before we start training any models. For example, if we’re forecasting billable hours and using total available hours of our consultants as an exogenous feature, we need to have that data available for the entire forecast period. While this may seem obvious in this case, it’s an important step to take before we jump into any modelling! Once we’ve got our data formatted the way we want it, we save it back to blob storage and move on to the next notebook.
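
As a rough illustration of this step, the sketch below loads the staged data, aggregates it to monthly level, and adds a simple calendar feature; the paths, column names, and features are hypothetical placeholders rather than our actual feature set.

```python
# Minimal sketch of Notebook 2: load staged data, build features, and save the result.
# Paths, column names, and features are hypothetical placeholders.
import pandas as pd

hours = spark.read.parquet(
    "wasbs://historical-actuals@rockfeatherstorage.blob.core.windows.net/hours"
).toPandas()

# Aggregate to one row per month; the exogenous feature (available hours)
# must be known for the entire forecast horizon before we train anything.
monthly = (
    hours.groupby("month", as_index=False)[["billable_hours", "available_hours"]]
         .sum()
         .sort_values("month")
)
monthly["month_of_year"] = pd.to_datetime(monthly["month"]).dt.month  # simple calendar feature

# Stage the model-ready data for the forecasting notebook.
spark.createDataFrame(monthly).write.mode("overwrite").parquet(
    "wasbs://historical-actuals@rockfeatherstorage.blob.core.windows.net/hours-preprocessed"
)
```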

Notebook 3: Forecast

Here’s where the magic happens – we finally get to do some forecasting! We use the darts package to train multiple models and let them compete against our baseline Exponential Smoothing model. We love this package because it’s super easy to use and makes backtesting a breeze. To evaluate model accuracy, we use the mean absolute percentage error (MAPE) – it’s a simple metric that’s easy to understand.

We try out different models like linear regression, random forest, and XGBoost and compare their performance against our baseline. Our baseline model had an MAPE of 44%, which isn’t great, but we’re not deterred. By adding in our single exogenous feature and leveraging our three ML models, we were able to decrease our MAPE to 12% – a huge improvement! And of course, we save our results back to blob storage for future reference.
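
To show the shape of this notebook, here is a hedged sketch of the baseline-versus-challenger comparison with darts; the column names, holdout length, lags, and paths are hypothetical placeholders, and the exact models and settings we used may differ.

```python
# Minimal sketch of Notebook 3: baseline vs. an ML model with an exogenous feature, using darts.
# Column names, holdout length, lags, and paths are hypothetical placeholders.
import pandas as pd
from darts import TimeSeries
from darts.metrics import mape
from darts.models import ExponentialSmoothing, XGBModel

monthly = spark.read.parquet(
    "wasbs://historical-actuals@rockfeatherstorage.blob.core.windows.net/hours-preprocessed"
).toPandas()
monthly["month"] = pd.to_datetime(monthly["month"])

series = TimeSeries.from_dataframe(monthly, time_col="month", value_cols="billable_hours")
available = TimeSeries.from_dataframe(monthly, time_col="month", value_cols="available_hours")

train, test = series[:-6], series[-6:]  # hold out the last six months for evaluation

# Baseline: classic Exponential Smoothing, no exogenous information.
baseline = ExponentialSmoothing()
baseline.fit(train)
baseline_pred = baseline.predict(len(test))

# Challenger: XGBoost with the available hours as a future covariate.
challenger = XGBModel(lags=12, lags_future_covariates=[0])
challenger.fit(train, future_covariates=available)
challenger_pred = challenger.predict(len(test), future_covariates=available)

print(f"Baseline MAPE:   {mape(test, baseline_pred):.1f}%")
print(f"Challenger MAPE: {mape(test, challenger_pred):.1f}%")
```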

With our forecasting pipeline up and running in Databricks, we can sit back and watch the predictions roll in. It’s amazing what you can do with a little data and a lot of creativity!


💡 Forecast backtesting is a method used to check how accurate a forecast is by comparing its predictions with what actually happened. This helps identify any errors or biases in the forecasting model, which can be used to improve future predictions. It’s a useful tool in many industries, such as finance or weather forecasting, and helps decision-makers make better-informed decisions.
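
In darts, a backtest along these lines is essentially a one-liner; here is a hedged sketch that reuses the hypothetical billable-hours series from the forecasting sketch above.

```python
# Minimal backtesting sketch with darts: re-fit at each step and compare the rolling
# one-step-ahead forecasts against what actually happened.
from darts.metrics import mape
from darts.models import ExponentialSmoothing

backtest = ExponentialSmoothing().historical_forecasts(
    series,               # the hypothetical billable-hours series from the earlier sketch
    start=0.75,           # start backtesting after 75% of the history
    forecast_horizon=1,
    stride=1,
    retrain=True,
)
print(f"Backtest MAPE: {mape(series, backtest):.1f}%")
```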


Our thoughts on Databricks

We have got to give props to Databricks, it’s a tool that makes our lives easier. It’s like having a Swiss army knife in your pocket – it’s slim, versatile, and gets the job done. We love the collaboration feature – we can code with our team, and it’s like a real-time jam session. Plus, setting up and scheduling pipelines is as smooth as butter. The best part is the seamless integration with mlflow and PySpark – it’s like having your favorite sauce on your favorite dish. Let’s just say that Databricks has been a game-changer for us, and we’re excited to see what new features they’ll cook up in the future!

Next Steps

Although we’ve got a pipeline set up, our forecasting journey still has an exciting ride ahead. That’s always how it is with data science projects. The next step is generating maximum value from our forecasts.

We are discussing with our Data Viz team how best to integrate this forecast into our dashboards. Also, we’re scheduling meetings with our CFO to see how exactly we can make his job easier by, for example, introducing more exogenous features or reporting historical accuracy.

As a data-thinking organization, we’re committed to becoming more anti-fragile, and we treat our customers the same way. This means building resilience into our forecasting models to ensure they can withstand unexpected events and continue providing reliable forecasts. If you found this post inspiring and would like to know more, don’t hesitate to reach out!

Webinar: What are the best Data Science tools in 2023?

Where should you start when looking for a Data Science & Machine Learning solution? What information is important? And especially which sources are the most reliable? Our expert goes over these questions in just half an hour. Sign up now!

Choosing a Data Science solution for your business is a difficult task. Our Data Science experts have looked into all the options and will be talking about the best tools and the newest trends in the Data Science & Machine Learning market. All in a short and sweet 30-minute webinar. They’ll also tell you everything you need to think about when picking a tool and what the pros and cons of the most commonly used tools are.

What will you learn in this webinar?

  • The best Data Science and Machine Learning tools on the market
  • The most interesting new trends in the market

Fill out this form and view the webinar right away!



Questions regarding the Forrester Wave?

What if your dashboards turn out to be misleading?

An effective dashboard is a tool for getting clear insights into your data. But what if a dashboard is less effective than you imagined? Or even worse: what if your dashboards mislead users? In this blog, we discuss two new features of Power BI. One can mislead your users, while the other makes your dashboards more effective.

A highly requested solution with undesirable side effects

The first update we want to highlight allows you to turn off the shared axis on small multiples. A common complaint among Power BI users was: “because of the shared axis, I can’t see smaller values properly anymore”. This update makes turning it off possible, but how useful is turning off the shared axis, really?

Let’s see what happens when we turn off the shared axis.

As you can see, it is not very easy to tell which product group is the largest if the y-axes are not equal. In fact, at first glance, it all looks the same. Only when you look longer do you see that the axes are not equal.

We believe that you should always keep the axes equal, even when the values are small. After all, you are comparing different segments. But how can you visualize small multiples while keeping them easy to read? It’s simpler than it sounds – take a look:

In this case, we recommend using Zebra BI, a plug-in for BI tools such as Power BI. With this plug-in, the small multiples are placed in boxes of different sizes depending on their values. This allows users to properly compare the segments and thus draw the correct conclusion, as they are not misled by different axes. Sounds useful, right?

An update that simplifies reading a report

A feature from the same update that does make Power BI dashboards more effective is the addition of Dynamic Slicers. With Dynamic Slicers, you can use field parameters to dynamically change the measures or dimensions analyzed within a report. But why is this so useful?

With Dynamic Slicers, you can help readers of a BI report explore and customize the report, so they can use the information that is useful for their analysis. As shown in the GIF above, a user can filter by Customers, Product Family, Product Group, or other slicers that you’ve set up in the report.

In addition, you can parameterize slicers further to support dynamic filtering scenarios. In the GIF below, you can see how it works. You can see that the value comes from the dynamic slicer and changes dynamically. This gives your users even more opportunities to interact with the dashboard.

How do we make sure dashboards are not misleading?

Of course, Microsoft continuously tries to update and improve Power BI, but some updates can have unwanted effects. We have been using IBCS standards for years as our Power BI report guidelines, and we also believe that if these standards are applied properly, you’ll prevent unintended consequences. To test whether our reports comply with these standards, we use Zebra BI’s IBCS-proof plug-in while building dashboards in Power BI. Want to learn more about how IBCS standards can help your organization?

Learn more about IBCS reporting

Fill out this form below and find out the basics of IBCS reporting in a quick 20-minute webinar!


The Power of Microsoft's Low Code platform

As the low-code platform that integrates with Microsoft 365, Azure, and Dynamics 365, the Microsoft Power Platform is perfect for professionals who want to develop apps but lack the necessary programming knowledge. In addition, because of its integration with other Microsoft platforms, the Power Platform is perfect for companies that use Microsoft tools. But what are the different apps? And why is using the Power Platform so useful? What's so convenient about Low Code? In this blog, we try to answer all these questions.

What are Power Apps?

To put it in Microsoft’s words: “Power Apps is a suite of apps, services, and connectors, as well as a data platform, that provides a rapid development environment to build custom apps for your business needs.”

Power Apps enables you (an accountant, engineer, chef, CEO) to build applications that solve your problems regardless of their size.

What is low-code?

Low-code is a way to build applications quickly in a “drag and drop” environment. It enables anyone with the desire to build apps to quickly create something without having to understand coding languages like Java or C++.

On top of that, low-code also enables you to connect to a variety of back-ends or third-party services without having to manually connect to an API or use other complex connection tools.

Overview of the Microsoft Power Platform and its connection to other apps in the Microsoft portfolio.

What can low-code offer me?

  • Automation
  • Re-usability
  • Scalability
  • Rapid development
  • Cross-platform applications

What are the low-code apps within the Microsoft Stack?

Power Apps

The bread and butter of the low-code Power Platform. It enables you to create applications that display data and allow you to interact with your data directly. Power Apps lets you build applications for both mobile and web using a drag-and-drop interface.

Power Automate

If you want to create automated processes running in the background, this is how you do it. If you want to send emails after new records are created, while at the same time notifying users, Power Automate does this with just a few clicks.

Power Virtual Agents

One of Microsoft’s newest offerings allows you to create chatbots using a low-code approach. Chatbots can be deployed internally on Teams or on your own website. They enable users or customers to find answers to problems without the need for a human-run support centre.

Power BI

With Power BI, you get even more out of Business Intelligence. Create stunning visuals that provide deep insights into your data and financials. Use forecasting to better predict changes in upcoming time periods and adjust your business planning accordingly.

Power Platform use cases

Imagine you’re an accountant who has to modify data in Excel based on parameters that someone emails to you daily. Based on these emails, you want an overview of all your data to show to management during your monthly meetings, and on top of that, you want to be able to modify something in the data at the last minute if needed. The Power Platform and its low-code tools can help you automate a large part of this workflow. Here is how:

  • We’ll use Power Automate to scrape the values from the email and modify the Excel sheet.
  • Once the sheet is modified, we can create a dashboard in Power BI for a full overview to present at your monthly meetings.
  • Next, we use Power Apps to create a small app from which you can modify your data, in case last-minute changes are needed.
  • Lastly, we could create a Virtual Agent so that colleagues or clients can request copies of documents from a certain case file using a specified password.

Benefits of Low-code

  • Save time and money
  • No need to hire a full stack developer
  • Custom solutions for custom problems
  • Easy to implement

Rockfeather & Low-code

Here at Rockfeather, we know that not everyone has the budget to turn their business ideas into reality. That’s why we offer both low-code training and building services. We empower everyone to become a citizen developer – this enables you to build things in your own time, with your own tools. If you get stuck or need help, you can be sure that Rockfeather will be there to lend a hand.

For the more advanced projects, we use all of the tools above and more to create fully fledged desktop and mobile applications that can be exported and deployed to your environment seamlessly.


Power Automate or Logic Apps

Power Automate or Logic Apps? Two similar automation tools, both on the Microsoft platform and with a similar look. With many of the same features, the real differences lie in the details.

Comparing features

Most notably, Logic Apps is aimed at more technically proficient users, whereas Power Automate is really aimed at citizen developers, with fewer in-depth features and more user-friendly options.

For most companies, Power Automate is included in the Microsoft Office license with the standard connectors, though customers interested in premium connectors will need a premium license. Logic Apps is a pay-as-you-go service, meaning that you pay while the app is running.

Below are three key differences:

  • Power Automate integrates well with the Power Platform, whereas Logic Apps integrates with Azure resources
  • Logic Apps supports version control
  • Power Automate supports robotic process automation

To further expand on the differences in licensing between Power Automate and Logic Apps, we summarized some key takeaways below:

  • Logic Apps: pay as you go. You only pay when your application is actively running
  • Power Automate: pay per month. You pay a fixed fee per user per month, and it’s often already included in your Microsoft Office license (E1, E3 and E5)

Important details

Power Automate is part of the Microsoft 365 environment and the Power Platform, and its main aim is to automate tasks and work within the Power Platform. By comparison, Logic Apps is one of the solutions within Microsoft Azure Integration Services and is therefore more commonly used for ETL processes and data integration. As a result, Logic Apps integrates well within Azure but lacks this integration with the Power Platform. What’s best for your business simply depends on what type of capability you need and what you want.

For instance, consider the following examples:

  • A manager wants to get an email when a certain KPI value is reached. The KPI is calculated and presented in Power BI; depending on the circumstances, this could be done with either Power Automate or Logic Apps.
  • A finance employee needs to click a few buttons in a legacy desktop application once a day. With Power Automate RPA, this can be automated.
  • Data needs to be extracted from a system and loaded into a data warehouse. For this process, Logic Apps provides the best pricing and the most flexibility.
  • A monthly survey needs to be sent out. This is best done via Power Automate.
  • A process must be approved through a button on a dashboard. Because of the Power Platform integration, Power Automate is the best fit here.

Compare for yourself

Interested in the strengths and weaknesses of both platforms? We have articles that dive deeper into both Logic Apps and Power Automate and all their pros and cons.


Microsoft’s cloud-based machine learning service

  • Notebooks
  • Auto ML
  • A drag & drop tool called Designer

Machine learning in a nutshell

Machine learning is a form of artificial intelligence in which a computer uses software and algorithms to learn from historical data and make predictions about the future. These pieces of computer code, also known as algorithms, are used to create models. Machine learning models use algorithms to find patterns or relationships in the data, which the model can then use to predict values based on new data.
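
To make that tangible, here is a tiny, generic example (with made-up numbers, not data from any project on this page): a model learns the relationship between a feature and a target from historical data and then predicts a value for new data.

```python
# Tiny, generic illustration of "learn from historical data, predict for new data".
# The numbers are made up for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Historical data: advertising spend (feature) and the sales that followed (target).
spend = np.array([[10], [20], [30], [40]])
sales = np.array([105, 198, 310, 402])

model = LinearRegression().fit(spend, sales)  # the algorithm finds the pattern
print(model.predict([[50]]))                  # predict sales for a new, unseen spend level
```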

What is Azure Machine Learning?

Azure Machine Learning is Microsoft’s answer to the increasing demand for Machine Learning as a Service solutions. With Azure Machine Learning, Microsoft provides a cloud-based machine learning environment within the Azure platform through which data science projects can be run and managed. The machine learning services integrate seamlessly with the other cloud computing services Microsoft offers through the Azure platform. Azure Machine Learning gives you the ability to develop models based on open-source machine learning tools such as PyTorch, TensorFlow, scikit-learn and many other resources.

Azure Machine Learning infrastructure

What does Azure Machine Learning Workspace offer?

  • Auto ML
  • A drag and drop interface for citizen data scientists called Designer
  • Scalability of computing power and data storage
  • Integration with the Azure platform
  • Wide variety of algorithms for different machine learning purposes
  • Easy conversion of your machine learning model into a web service
  • Version management per model

The different features of Azure Machine Learning

  1. Notebooks
    In the machine learning workspace, you can create a notebook that works similarly to a Jupyter Notebook. Jupyter Notebooks have been used for years to develop code for data science projects. In a notebook, pieces of code can be written in cells to manipulate data and train a model.
  2. Automated ML
    If you have structured and clean data, you can use Auto ML. With Auto ML, you enter the data step by step while going through a menu where you can determine various settings. For example, you indicate what value you want to predict with your model and what error metric should be used to compare the different models. When the set-up is done, Auto ML starts training and testing models. After this process, you will be presented with the best model. Running Auto ML for simple use cases works well through the menu interface, but when things get more complicated you will need to run Auto ML from a notebook, as sketched below this list.
  3. Designer
    Designer is the drag-and-drop interface of Azure Machine Learning. This component can be compared to the way a machine learning project can be built intuitively in Alteryx. On the Designer canvas, so-called pipelines can be created where you can visually lay out the flow from raw input data to trained model.
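
For those who want to see what the notebook route looks like, here is a hedged sketch using the Azure Machine Learning Python SDK (v1); the workspace, dataset, compute, and column names are hypothetical placeholders, and the exact configuration depends on your use case and SDK version.

```python
# Minimal sketch: running Auto ML from a notebook with the Azure ML Python SDK (v1).
# Workspace, dataset, compute, and column names are hypothetical placeholders.
from azureml.core import Dataset, Experiment, Workspace
from azureml.train.automl import AutoMLConfig
from azureml.automl.core.forecasting_parameters import ForecastingParameters

ws = Workspace.from_config()                              # reads a downloaded config.json
training_data = Dataset.get_by_name(ws, name="sales-history")

automl_config = AutoMLConfig(
    task="forecasting",
    training_data=training_data,
    label_column_name="sales",
    primary_metric="normalized_root_mean_squared_error",
    compute_target="cpu-cluster",
    experiment_timeout_hours=1,
    forecasting_parameters=ForecastingParameters(
        time_column_name="date",
        forecast_horizon=6,
    ),
)

run = Experiment(ws, "forecast-automl").submit(automl_config)
run.wait_for_completion(show_output=True)
best_run, best_model = run.get_output()                   # the best model Auto ML found
```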

Use-case for Auto ML: Training multiple models

A major development in the field of data science in recent years has been the emergence of Auto ML. This development automates parts of the data science process that were previously repetitive and time-consuming. In fact, the most modern Auto ML solutions can perform almost the entire development process on well-structured data. However, in most cases data is not that structured, and preparatory work is still needed: data has to be cleaned and structured before it can be used in a model.

Where Auto ML really adds value is in training and testing different models. Previously, after cleaning and structuring the data, the data scientist had to decide which models to train and test. Training and testing different models is a time-consuming task and often requires a lot of computing power. This is one of the reasons why data scientists often choose to train and test only a limited selection of algorithms. Auto ML allows the data scientist to train and test a much larger selection of algorithms on scalable cloud computers, so the data scientist does not have to train and test every algorithm themselves. With Auto ML, the data scientist indicates which algorithms should be trained and tested, and which error metrics and/or statistics should be used to compare them.

As an analogy, think of it like cycling. Previously, a data scientist cycled around on a city bike without gears, but with the development of Auto ML, the data scientist gets an electric bike with pedal assistance. You still have to ride the bike, but technology supports you and gets you to your destination faster.

Benefits of Microsoft Azure Machine Learning

  1. Scalability
  2. Clear documentation
  3. Easy to implement
  4. Integration with Microsoft’s Azure platform
  5. ML as a service

Rockfeather & Azure Machine Learning

At Rockfeather, we are well aware of the advantages that Azure Machine Learning offers over conventional data science tools. Using Azure Machine Learning allows us to deliver better and faster results to our customers in the field of data science. One of the projects where we have leveraged Azure Machine Learning effectively is predicting sales numbers for one of our clients. These predictions, or forecasts as they are officially called, belong to a branch of data science that uses historical data to make predictions about the future. Good forecasting is important to this client because it saves costs and waste. The challenge with this project was that there was a large assortment of different products, and sales numbers had to be predicted for all of them. Manually training and testing models for such a large assortment takes a lot of time, which is why we applied Auto ML in this situation. Thanks to the scalability of Azure’s cloud-based Machine Learning, we can train and test a selection of different algorithms per product to find the most effective algorithm for each product. These models are all stored in Azure Machine Learning and, once trained, can be used to make actual predictions about future sales numbers.
