In a previous role, I was asked to build a data team. The task was entirely new to me, so I did what anyone in my situation would do: I Googled it. I found many resources on the topic, but most of the blogs and articles focused on the technical aspects. Although some of the information was helpful, I still felt lost about where to begin.
This frustration is what motivated me to write this article. The aim here is to give pointers on what to focus on, how to prioritize, and other considerations that will benefit modern leaders of organizations taking their first steps towards data maturity.
The objectives of the data team should be set before the team is put in place. Defining them should involve the key stakeholders from the different business divisions (engineering, product, marketing, finance, etc.), as many of them will ultimately be the consumers of the data products.
It is also crucial that the objectives of the data team reflect the company’s level of data literacy. Considering that you are building the data team from scratch, chances are your company is at the early stages of data maturity. More on this later.
The objectives of a data team can typically be categorized into the following:
Now that you’ve defined the business objectives, you should decide where the data team sits within the company’s organizational structure. This step is crucial, as it lays the foundation for avoiding silos and unclear ownership. A few popular setups:
With defined objectives and organizational reporting for your data team, you now need to consider several aspects: the company’s stage of data maturity, the data stack, and the data platform.
A Modern Data Stack (MDS) is a collection of tools and technologies that help businesses collect, transform, store and utilize data for analytics and ML use cases. The Modern Data Stack is cloud-based, modular and typically includes the following layers:
In general, data platforms can be:
Let’s now focus on the human resources aspect. There are three core technical capabilities in a data team: Data Engineering, Data Analytics, and Data Science. Variations and combinations of these have led to the emergence of roles like Analytics Engineer, ML Engineer, BI Developer, etc. In more data-mature organizations, positions like DataOps, MLOps, DataSecOps, etc., are often sought after.
Let’s go through the three prominent roles in detail.
If you are just getting started, I advise you to follow the “less is more” rule. Start small by favoring “Full Data Stack” capabilities and keeping your data team’s objectives in mind; you can grow the team one member at a time as your needs evolve.
At an early stage, the data team tends to focus on experimentation and initial proofs of concept (POCs) rather than on bringing one big project into production. In this case, a Data Analyst or a Data Engineer with analytics skills (Python, SQL, etc.) will be more valuable as a first hire. This person could work alongside Software Engineers on a first POC, which would help identify the first pipeline needs. This paves the way for the second hire, who should be someone with stronger Data & Architecture Engineering skills, to proceed with building the platform and making appropriate infrastructure choices. After this, further recruitment should follow the needs of ongoing projects.
Soft skills are essential when evaluating data professionals. The data practice is cross-functional by default; the data team’s core mission is to help the business extract the maximum amount of value from data and become data-driven. Therefore, proximity to the business is indispensable. At the same time, and especially in the early stages of data maturity, the data team also works closely with IT and Software Engineering to ensure the robustness and sustainability of the data infrastructure. A good data hire will have the following skills:
Building a data practice is not only about making technological choices, and you will likely have to start with a first iteration and expect it to evolve as your business grows. Although there is no one-size-fits-all approach, there are some best practices that I have gathered from my experience and from the many conversations I have had with data leaders on the topic.
Starting with the “why,” it is essential to set the right objectives for your modern data team by assessing your organization’s data maturity. Companies at different stages of data maturity have different needs, which should drive the choices you make when creating your team. In addition, you need to define a clear role for the data team within the organization to avoid silos and unclear ownership. On top of this, you need to pick the data stack and data platform best suited to your organization.
While it cannot cover every consideration, this blog aims to provide you with an overview of best practices and a non-exhaustive list of recommendations for overcoming some of the non-technical challenges that may arise when building a modern data team. The technical aspects also deserve detailed attention, and I will discuss them at length in another blog.