Dec 23, 2020

Should you build or buy your marketing data pipelines?

By Evan Kaeding and Pinja Virtanen

[ Updated Apr 22, 2024 ]

If you’re looking to move your cross-channel marketing and sales data into a data warehouse (like Google BigQuery, Snowflake, or Azure Synapse Analytics), a cloud storage solution or data lake (like Google Cloud Storage Bucket, Amazon S3, or Azure Storage Container), or a business intelligence tool (like Tableau, Power BI, Looker, or Qlik), you basically have three options:

  • Build and maintain your own API connections
  • Contract an external partner to build and maintain the pipelines for you
  • Invest in a managed data pipeline (like Supermetrics)

In this article, we’ll discuss the merits and drawbacks of each of these options across three associated costs: setup cost, maintenance cost, and opportunity cost.

1. The upfront cost of setting up marketing data pipelines…

Chronologically speaking, the first cost you’ll need to consider is the upfront cost of getting your data flowing from your marketing data sources into your destination of choice.

[ Figure: The upfront cost of setting up a marketing data pipeline ]

… by building API connections in-house

If you’re looking to transfer data from a single marketing source (like Google Analytics, Google Ads, or Facebook), building your own API connection may be a viable option. 

However, each new data source requires its own data pipeline, so the cost of the project grows roughly in proportion to the number of sources.

On average, building each new API connection takes 3-6 weeks of full-time software development work (depending on the complexity of the API and the required fields).

The person (or team) responsible for the project should:

  • Be competent in at least one data engineering language (typically either Python or Java)
  • Have ETL experience (usually with SQL)
  • Have strong data modeling skills
  • Have experience in monitoring data-intensive applications for errors

Considering that in late 2020 the average annual salary of a senior data engineer in the US was $118,810 (about $2,285 per week), building each new data integration would cost somewhere between $6,854 and $13,709 in salary alone.

With five API connections, you would be looking at an upfront cost of roughly $34,000 to $69,000, and a project timeline of several months, to get your data flowing into your destination of choice.

And when you factor in the cost of data modeling, quality assurance, and infrastructure, you could easily end up with an upfront cost in six figures.
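The salary arithmetic above can be reproduced with a short Python sketch. The salary and the 3-6 week estimate come from the article; spreading the salary over a 52-week year is an assumption, chosen because it reproduces the per-integration range quoted above. The function name `build_cost` is illustrative only.

```python
# Figures taken from the article; the 52-week year is an assumption that
# reproduces the article's per-integration range.
ANNUAL_SALARY_USD = 118_810
WEEKS_PER_YEAR = 52


def build_cost(weeks_per_connection: float, connections: int = 1) -> float:
    """Return the engineering salary cost of building the given connections."""
    weekly_rate = ANNUAL_SALARY_USD / WEEKS_PER_YEAR
    return weekly_rate * weeks_per_connection * connections


if __name__ == "__main__":
    # Compare a single connection with the five-connection scenario.
    for connections in (1, 5):
        low = build_cost(3, connections)
        high = build_cost(6, connections)
        print(f"{connections} connection(s): ${low:,.0f} to ${high:,.0f}")
```

Note that this covers salary only. Data modeling, quality assurance, and infrastructure are not modeled here and would come on top.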

… by contracting the work to an external partner

If you don’t have in-house data engineers to spare, the other logical option would be outsourcing the work to an external partner.

Given the development time and skills described above, the upfront price tag of a single pipeline in the US would likely be around $20K-50K (assuming a fair rate of $150/hour).

Offshoring the work to a country where development resources are cheaper might help bring down the price.

As a rule of thumb, offshoring a pipeline development project will typically cost you between 25% and 50% of what domestic labor would. For example, Ukrainian developers typically charge about 50% of what US developers would. And if you outsource the project to an Indian company, you might be able to get the price down to 25%.

However, it’s good to remember that price usually follows quality. The lower the price, the more likely it is that your team will end up spending a lot of time on quality assurance and project management. 
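To put rough numbers on the offshoring rule of thumb, here is a minimal Python sketch. It applies the article's 50% and 25% shares to the article's $20K-50K US estimate. The names `US_PIPELINE_COST_USD`, `OFFSHORE_SHARE`, and `offshore_range` are illustrative, and the result ignores the extra quality assurance and project management time mentioned above.

```python
# All figures are the article's estimates, not quotes from real vendors.
US_PIPELINE_COST_USD = (20_000, 50_000)  # per pipeline, at $150/hour
US_HOURLY_RATE_USD = 150
OFFSHORE_SHARE = {"Ukraine": 0.50, "India": 0.25}  # share of the US price


def offshore_range(share: float) -> tuple:
    """Scale the US low/high estimate by an offshore price share."""
    low, high = US_PIPELINE_COST_USD
    return low * share, high * share


if __name__ == "__main__":
    # Hours implied by the US estimate at the stated hourly rate.
    hours = [cost / US_HOURLY_RATE_USD for cost in US_PIPELINE_COST_USD]
    print(f"Implied US effort: {hours[0]:.0f} to {hours[1]:.0f} hours")
    for country, share in OFFSHORE_SHARE.items():
        low, high = offshore_range(share)
        print(f"{country}: ${low:,.0f} to ${high:,.0f} per pipeline")
```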

Before you commit to outsourcing pipeline building, you’ll also want to answer the following questions:

  • Can you find a partner that has prior experience in completing similar projects?
  • How will you make sure that the solution will scale as your organization’s needs evolve?
  • Who will take care of maintaining the API connections when the initial project is over? How much will that cost?

If you’re happy with the answers to all of these questions, outsourcing might be a viable option for you.

… with a managed data pipeline like Supermetrics

The third and final option for getting your data flowing into a data warehouse, cloud storage, or BI tool is investing in a managed data pipeline like Supermetrics.

The immediate benefit is that you can schedule daily data transfers from multiple sources in just a few clicks. In fact, setting up all your transfers should take you no more than a few hours, while a custom data pipeline project can take anywhere from a few months to a year.

And instead of spending $100,000+ in upfront costs alone, you’ll only be spending $10-30K a year on software, depending on where you’re moving the data, the number of data sources, and whether or not you need custom schemas.

Even though setting up basic transfers only takes a few minutes with Supermetrics, it’s also a good idea to reserve some time from an analyst to model the data. 

So let’s say you’ll need one day a week of an experienced data analyst’s time. With an annual average salary of $71,170, 20% of a data analyst’s time would cost $14,234 per year.

Combined with the pipeline software cost, the total cost for the entire first year would be roughly $24,000 to $44,000.
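The first-year total is simply the software cost plus the analyst's time. A small Python sketch of that sum, using only the figures quoted above (the function name `first_year_cost` is illustrative):

```python
# Figures from the article: software price range, analyst salary, and the
# one-day-a-week (20%) allocation for data modeling.
SOFTWARE_COST_USD = (10_000, 30_000)
ANALYST_SALARY_USD = 71_170
ANALYST_TIME_SHARE = 0.20


def first_year_cost(software_cost: float) -> float:
    """Software subscription plus the analyst's share of salary."""
    return software_cost + ANALYST_SALARY_USD * ANALYST_TIME_SHARE


if __name__ == "__main__":
    low, high = (first_year_cost(cost) for cost in SOFTWARE_COST_USD)
    print(f"First-year total: ${low:,.0f} to ${high:,.0f}")
```

The exact endpoints land slightly above $24,000 and $44,000, which lines up with the rounded range quoted above.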

And if you don’t yet have a data analyst in-house, Supermetrics’ professional services team can help you get up and running until you’ve made that key hire.

2. The cost of pipeline maintenance…

The second part of the cost puzzle is, of course, the cost of maintaining your data pipelines and making sure that if something changes in the source API, those same changes are quickly applied to your pipeline.

[ Figure: Maintenance cost of marketing data pipelines ]

… by building API connections in-house

The problem with maintaining your home-grown data pipelines is that API changes in marketing and sales platforms like Facebook, Google, and Salesforce are frequent yet fairly unpredictable.

Needless to say, unpredictability makes resourcing difficult.

On average, however, it’s good practice to allocate 10-20 hours a month of an experienced data engineer’s time for troubleshooting and fixing API issues per API connection.

That would mean an annual maintenance cost of roughly $7,425 to $14,851 per API connection (calculated with an hourly rate of $61.88).
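Because the estimate is per connection, it scales with the number of sources you maintain. The Python sketch below reproduces the calculation from the article's hourly rate and monthly hours. The five-connection scenario is an assumption carried over from the setup example earlier in the article, and `annual_maintenance_cost` is an illustrative name.

```python
# Figures from the article: hourly rate and 10-20 maintenance hours per
# month for each API connection.
HOURLY_RATE_USD = 61.88
MONTHS_PER_YEAR = 12


def annual_maintenance_cost(hours_per_month: float, connections: int = 1) -> float:
    """Yearly engineer cost of keeping the given connections running."""
    return hours_per_month * MONTHS_PER_YEAR * HOURLY_RATE_USD * connections


if __name__ == "__main__":
    # One connection, then the five-connection scenario (an assumption).
    for connections in (1, 5):
        low = annual_maintenance_cost(10, connections)
        high = annual_maintenance_cost(20, connections)
        print(f"{connections} connection(s): ${low:,.2f} to ${high:,.2f} per year")
```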

Another cost that is typical for home-grown pipelines but isn’t easily quantifiable in monetary terms is the cost of downtime. This can materialize in two ways:

  • When an API connection breaks and a stakeholder is expecting an important report, your analytics and/or marketing team will have to scramble to pull the required data together manually until the connection is fixed.
  • When you’d like to make a change to one or more of your reports but have to wait for weeks — or even months — for internal resources to be freed up.

Especially if you’re offering reporting as a part of a marketing agency’s services, it’s also a good idea to include the cost of pipeline downtime in your calculations. At best, you’ll end up wasting your employees’ valuable time on manual report building, and at worst, you’ll lose customers as a result of broken reports and promises.
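One way an in-house team might catch a breaking API change before a stakeholder sees a broken report is to compare each incoming record against the fields the pipeline expects. The Python sketch below is a generic illustration, not any platform's actual API: the field names, the sample record, and the function `detect_schema_drift` are all hypothetical.

```python
# Hypothetical field names; a real pipeline would list the fields it
# actually requests from the source API.
EXPECTED_FIELDS = {"date", "campaign", "impressions", "clicks", "cost"}


def detect_schema_drift(record: dict, expected: set) -> dict:
    """Report fields that disappeared from, or newly appeared in, a record."""
    received = set(record)
    return {
        "missing": expected - received,
        "unexpected": received - expected,
    }


if __name__ == "__main__":
    # Hypothetical response in which the source renamed "cost" to "spend".
    sample = {
        "date": "2020-12-01",
        "campaign": "brand",
        "impressions": 1200,
        "clicks": 45,
        "spend": 87.5,
    }
    drift = detect_schema_drift(sample, EXPECTED_FIELDS)
    if drift["missing"] or drift["unexpected"]:
        print(f"Schema drift detected: {drift}")
```

A check like this only flags the problem early. Someone still has to spend the maintenance hours updating the connection.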

… by contracting the work to an external partner

If you’re considering contracting out the pipeline building part, we would strongly recommend that you also include pipeline maintenance in your contract from the get-go. 

In fact, any decent software development shop will make you pay to keep them on retainer. This means that you’ll end up paying them even if everything is running smoothly. The problem is, without a retainer, you won’t be able to simply pick up the phone and get 10 developer hours.

That matters because nothing is more expensive for an in-house engineering team than troubleshooting someone else’s code, especially if the code is built with unfamiliar technologies.

While the price of continuous maintenance contracts will vary dramatically based on the provider (and their location), it’s a good idea to reserve 40-60% of the initial setup cost for annual maintenance.

If you’re also expecting the partner to add new data pipelines in the maintenance phase, you’ll want to reserve closer to 100% of the original setup costs for this phase.
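As a rough illustration, the Python sketch below applies these percentages to the $20K-50K per-pipeline setup estimate given earlier for a US partner. Pairing those two figures is an assumption made for this example, and `maintenance_budget` is an illustrative name.

```python
# Setup estimate and maintenance shares are the article's figures;
# combining them this way is an assumption made for illustration.
SETUP_COST_USD = (20_000, 50_000)
MAINTENANCE_ONLY_SHARE = (0.40, 0.60)
WITH_NEW_PIPELINES_SHARE = (1.00, 1.00)


def maintenance_budget(setup_cost: tuple, share: tuple) -> tuple:
    """Pair the low setup cost with the low share, and high with high."""
    return setup_cost[0] * share[0], setup_cost[1] * share[1]


if __name__ == "__main__":
    scenarios = {
        "Maintenance only": MAINTENANCE_ONLY_SHARE,
        "Maintenance plus new pipelines": WITH_NEW_PIPELINES_SHARE,
    }
    for label, share in scenarios.items():
        low, high = maintenance_budget(SETUP_COST_USD, share)
        print(f"{label}: ${low:,.0f} to ${high:,.0f} per year")
```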

You’ll also want to make sure that you have a service level agreement in place, ensuring that any possible changes in the source API will be handled within an agreed timeframe.

… with a managed data pipeline like Supermetrics

With a managed pipeline like Supermetrics, pipeline maintenance is a non-issue. Because our full-time job is to make sure that our customers’ marketing data transfers work smoothly, our team of developers is here to fix any possible issues.

Additionally, the cost of pipeline maintenance is automatically included in the price of your Supermetrics plan.

3. The opportunity cost of building your own marketing data pipelines

Finally, let’s quickly consider the opportunity cost of building your marketing data pipelines in-house.

Depending on how your engineering team is set up, there’s also a real chance that dedicating a large chunk of your engineers’ time to pipeline building is not the best use of their hours or their skills.

So before you commit to a data pipeline project, ask yourself: would your developers’ time be better spent building automation for departments other than marketing?

And how will you avoid battling for internal resources if and when one of the API connections takes more time to build and/or maintain than you had anticipated?

After all, the rule of thumb is that you should only spend your team’s time on a) things that can’t be automated and b) things that only your engineering team can do — as opposed to doing the same things other companies can do for you.

TL;DR: to build or to buy?

No matter how you slice it, building and maintaining your own marketing data pipelines is a costly exercise with few, if any, advantages over outsourcing the work or buying a managed pipeline.

[ Figure: Managed vs. outsourced vs. home-grown data pipelines ]

While hiring an external (potentially offshore) vendor is always an option, the truth is that it’s hard to find a partner with demonstrated experience. This unfortunately means that the initial scope of work and project timeline are unlikely to hold, often rendering this approach both slow and expensive.

And while you would expect a data pipeline company like Supermetrics to advocate for managed pipelines, we trust that the cost/benefit calculations above were enough to convince you.

So if you’re interested in seeing Supermetrics’ data pipelines in action, get in touch with our experts to book a demo or start a trial.
