Microsoft SSIS vs Azure Data Factory (ADF)
SQL Server Integration Services (SSIS) and Azure Data Factory (ADF) both serve as ETL tools to extract data, transform data, and load that data into a target system. However, there are some key differences between SSIS vs ADF, which we’ll go over.
Let’s start with SSIS, which was first established in 2005. SSIS has been evolving over the last 15-20 years in the data space. As a result, this ETL application has gone through several iterations. There’s no doubt that SSIS has greater capabilities as a tool than ADF today.
However, we shouldn’t look at these ETL tools as a matchup between SSIS vs ADF. Instead, we can look at them as complements to each other. ADF didn’t have any UI (user-interface) and was just functions and commands only a few years ago. Today, we can add SSIS packages into ADF so that we can automate our data flows in the Azure cloud.
Your skills aren’t lost to ADF if you’re already familiar with SSIS. Instead, we can utilize those skills to complement the Azure stack and supplement the ADF data flow. SSIS now has functionality to connect to an Azure database and directly deploy its packages for use in Data Factory.
Azure Data Factory
ADF is an ETL cloud tool in the Azure ecosystem. In order to use ADF, we must have an Azure account set up and permissions to create a data factory. ADF continues to grow as Microsoft Azure continues to grow and expand its rich feature-set.
Although we can supplement our ADF workflows with SSIS packages, it’s not necessary. Sometimes ADF is unable to accomplish what we need by itself, which makes SSIS a nice complement. Data Factory is still ever expanding and growing. As a result, the cloud tool is plagued with strange bugs and issues through its web UI. Trying to use multiple tabs for development can prove costly in the end.
As a result of this, SSIS is still heavily used to develop ETL data flows. We can very easily develop our data flows locally and deploy our packages to the cloud. ADF is leveraged for it’s cloud capabilities and automation within the Azure framework.
SQL Server Integration Services
Although ADF is the newer “go-to” for Microsoft cloud ETL, it’s not necessary. There are work-arounds to automating SSIS in the cloud by using Azure Virtual Machines. With an Azure VM, we can develop and execute our SSIS packages within a cloud server. It’s a bit of a cheat, but it does get around using ADF as your primary ETL cloud service.
SSIS is not going away and will continue to be used. Many companies have heavily developed packages and spent countless hours on development. They aren’t ready to just throw it all away for a new cloud solution. In these cases, deploying packages to the cloud make a lot of sense.
Final Thoughts on SSIS vs ADF
Where ADF excels is in it’s integration to other cloud services. As an example, we can utilize other cloud services to send failure or success emails if our packages fail. We can even run API calls within our ADF packages to extract data from some other service.
SSIS has been built out for a decade longer than ADF and excels is in its rich ETL feature-set . It also excels in its familiarity. The talent pool for SSIS developers is much larger and many of them can likely pick up ADF as a new skill-set. Going the other way around would arguably be more difficult.
Whichever tool you decide to use depends on the situation. Both can be used independently or complementary to each other. As a result, this makes it easy to start with one tool and add the other if lacking the features needed.