How to Build an Actionable Data Strategy Framework

Step 1: Identify Key Stakeholders

The first step of formulating a proper data strategy is to identify the key players. The key stakeholders ideally should have a vested interest in the data platform, a healthy dose of excitement, and a genuine passion to make more data-driven decisions across your organization. 

Who Leads the Data Strategy Engagements?

The best organizations create a cross-functional team that is typically led by someone in a Data/Analytics role or a leader within the IT organization. In some cases, this team is led by a leader within a business unit or a central business team. This individual serves as the “point person” who is responsible for driving success.

It’s important to ensure the point person has a clear line of sight into, and understanding of, your current data platform architecture. They should be comfortable making technology decisions for the organization and, ideally, should not develop the data strategy in a vacuum.

Who are the Key Data Strategy Stakeholders?

This team is very collaborative, working cross-functionally to gather input and support from numerous stakeholders across your organization. Here are a few examples of primary stakeholders: 

Internal IT Teams – These teams are typically made up of architects and software or data engineering leads who run the technology that supports the business.

Business Units – Professionals in this role help align corporate strategy with data strategy. They also help identify use cases and prioritize capabilities and features.

Data Consumers – Data consumers help provide insights into how teams use data within the business.  

Project Management – Individuals in this position help coordinate the cross-functional team to ensure deliverables and timelines are met.

Executive Sponsorship – This pivotal role is often filled by the Chief Data Officer, who oversees the entire data strategy operation.

Finance – Finance plays an important role in developing a data strategy by providing a clear understanding of the financial value created. Finance teams are also consumers of data applications.

What to Expect as a Stakeholder?

As a valued stakeholder, you will be asked to participate in many activities throughout the data strategy project. Not every stakeholder takes action in every aspect of the project; rather, each adds value within their span of control and helps influence the overall success of the project.

Step 2: Discovery

The initial discovery sessions are intended to catalog the current state of data assets, data platform technologies, and any current data use cases. Once all of this information is captured, the next step is to identify any gaps or challenges and then create a prioritized list of potential future use cases. 

Discovery is performed through a series of interviews and documentation reviews. Each stakeholder group is interviewed, and detailed notes are taken to document relevant findings. Additional interviews may be scheduled when new information is uncovered in a related discovery session. Interviews are complemented with documentation that covers things such as:

  • Architecture
  • Roadmaps
  • Process flows
  • Business Requirements
  • Business Plans
  • Org charts

Common Example Questions to Ask During Interviews

Non-IT Questions

  • How do you get value out of the data you use?
  • What tools do you use to answer difficult questions that come from your BU leadership?

IT Questions

  • Describe the architecture of the current data platform.
  • What source systems are in scope for the platform? Describe the data domains available, their size, and any transformations that happen within each source system.

Near the end of Discovery, it is important to catalog and summarize use cases, gaps, and priorities. This summarization will allow for the identification of a Primary Use Case that can drive the development of the platform.   

Identify a Primary Use Case That Drives Action and Decision-Making

Your primary use case is the focal point of your data strategy; it’s what’s going to drive the value home. The ideal primary use case aligns with your business’s top priorities and goals and has the potential to be supercharged by data. The primary use case should achieve two main objectives:

  1. Exercise enough of the platform’s capabilities that implementing it accelerates subsequent use cases.
  2. Be impactful enough to share with executive leadership, so they see the value of the platform and are more willing to approve subsequent use cases.

Questions for Consideration as You Identify Your Primary Use Case:

  1. How long would it take to solve?
  2. How likely are you to deliver?
  3. Can you tackle it within your existing tech stack?
  4. Is this solvable with your existing team or do you need to hire more people?
  5. Are there any regulatory risks involved?
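One way to make these questions comparable across candidates is to turn the answers into a rough score. The sketch below is only an illustration: the `UseCase` fields, the weights in `score`, and the two sample candidates are assumptions made for this example, not a standard method, and the weights should be tuned to your own priorities.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    months_to_solve: float       # question 1: how long would it take to solve?
    delivery_confidence: float   # question 2: 0.0 (unlikely) to 1.0 (certain)
    fits_existing_stack: bool    # question 3
    needs_new_hires: bool        # question 4
    regulatory_risk: bool        # question 5


def score(uc: UseCase) -> float:
    """Higher is better. All weights are illustrative assumptions."""
    points = 10.0 * uc.delivery_confidence
    points -= 0.5 * uc.months_to_solve
    points += 2.0 if uc.fits_existing_stack else 0.0
    points -= 2.0 if uc.needs_new_hires else 0.0
    points -= 3.0 if uc.regulatory_risk else 0.0
    return points


def rank(use_cases: list[UseCase]) -> list[UseCase]:
    """Return the candidates ordered from highest to lowest score."""
    return sorted(use_cases, key=score, reverse=True)


# Hypothetical answers for two candidate use cases
candidates = [
    UseCase("Basket analysis", 3, 0.9, True, False, False),
    UseCase("Real-time equipment monitoring", 9, 0.5, False, True, False),
]
ranked = rank(candidates)
for uc in ranked:
    print(f"{uc.name}: {score(uc):.1f}")
```

A score like this does not replace the stakeholder conversation; it only gives the team a consistent starting point for comparing candidates.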
Analytical vs. Operational Use Cases

Use cases fall along a spectrum: at one end are Operational Data Products and at the other are Analytical Data Products. While these use cases rely on the same underlying data, they have very different requirements. The primary dimension that differentiates them is their impact on generating revenue, producing products, or interacting with clients.

What Are Analytical Data Products? 

Analytical data products are most commonly used to inform decision-making and analyze certain business functions. When they are not functioning, there is little impact on customers, revenue, or production. These are the use cases we typically recommend prioritizing: they tend to have a significant impact on the overall business but do not require a large upfront investment or much support to manage.

What Are Operational Data Products? 

Operational data products are used to run day-to-day business operations. Typically, when they go down, there is a large impact on customers, revenue, or production.

Example Use Cases

Analytical – BI

  • Basket Analysis – Ad hoc analysis that determines which products customers typically purchase together.
  • Product Drill Downs – Help determine product or feature sales by region/customer/distributor.

Operational

  • Inventory Forecasting – Estimates inventory levels required in a specified period.
  • Real-Time Equipment & Process Monitoring – Monitors the health of manufacturing equipment in real or near real-time to increase efficiency.
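To make the basket analysis example above concrete, here is a minimal sketch that counts how often pairs of products appear in the same transaction. The `pair_counts` helper and the sample baskets are made up for illustration; a real analysis would run against your own transaction data and would usually add measures such as support or lift.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicates and gives each pair a stable order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Made-up transactions, for illustration only
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]

top_pair, times = pair_counts(baskets).most_common(1)[0]
print(top_pair, times)
```

Because this is an ad hoc, read-only analysis, an outage would not interrupt sales or production, which is what places it at the analytical end of the spectrum.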

Step 3: Data Platform Architecture

Drafting an initial architecture of the solution based on information gathered in the discovery phase gives life to the platform. Even if it is incomplete, it will still bring visual representation to subsequent discussions. It also helps frame people’s thinking and tells the story of the platform. Without a visual representation, conversations often end up repeating, causing confusion and ultimately slowing down progress.

The architecture is organized by capabilities. Capabilities represent the logical components of the platform necessary to deliver on a requirement. For example, most data platforms require data warehousing as a capability. The data warehouse allows for efficient storage and querying of data for business intelligence, advanced analytics, and machine learning. 

Architecture Diagram #1

The first architecture diagram focuses on the capabilities of the platform. The capabilities are laid out in the order in which data will be processed. Like a good story, this architecture diagram tells the audience how data gets into the data platform, how it is processed, and how it is consumed. The architecture highlights specific capabilities and data requirements. The goal of this diagram is to get buy-in on a capability view of the data platform.

Capabilities are composed of technologies whose features align with the specific business requirements encapsulated in the capability. For example, the ingestion capability might use different technologies for real-time data ingestion vs. batch.
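The relationship between capabilities, requirements, and technologies can also be captured as a small data structure alongside the diagram. The sketch below is one hypothetical way to do that: the `Capability` class, the requirement names, and the single technology filled in are assumptions made for illustration, not a recommended stack.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Capability:
    name: str
    # requirement -> chosen technology; None until the assessment selects one
    requirements: dict[str, Optional[str]] = field(default_factory=dict)

    def unresolved(self) -> list[str]:
        """Requirements that do not have a technology assigned yet."""
        return [req for req, tech in self.requirements.items() if tech is None]


# Listed in the order data flows through the platform (illustrative values)
platform = [
    Capability("Ingestion", {"batch": None, "real-time": None}),
    Capability("Data warehousing", {"BI queries": "Snowflake"}),
    Capability("Consumption", {"dashboards": None}),
]

open_items = {c.name: c.unresolved() for c in platform if c.unresolved()}
print(open_items)
```

Keeping the open items visible makes it clear which technology decisions still need to be made before a more detailed diagram can be drawn.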

Architecture Diagram #2

The second architecture diagram incorporates another level of detail, specifying the technologies that support each capability. The technology assessment should provide reasoning and justification for technology choices (see Technology Assessment). Justification is derived from discovery phase interviews.

Technologies can be deployed and configured in a number of different ways. Everyone must understand how the technology will be operated and leveraged. This often requires a deeper technical representation of the technology. For example, Airflow can run on a standalone server, be deployed as a service on Kubernetes, or be consumed “As a Service” through Amazon Managed Workflows for Apache Airflow (MWAA). Having a detailed representation will make it clear how the technology will actually be deployed in the data platform.

Architecture Diagram #3

The final architecture diagram will be a deep technical document that will be a reference for how the actual platform is used. This version will be most useful to technology domain owners and it should offer clear guidance on the scope and magnitude of deployment. It will also allow for more accurate cost estimation of the platform for initial setup and on-going maintenance.

The architecture is meant to be a living document. Each architecture diagram should be kept up to date as the discussion with business and technology stakeholders progresses, and the architecture is refined iteratively until there is general acceptance. High-level diagrams can give the illusion that the platform is simple to set up, whereas the detailed diagrams let the reader understand how much work it takes to get all the configurations just right.

Step 4: Technology Assessment

As the capability architecture diagram comes into focus, technology assessments will be conducted to determine the technology stack of the platform. This will involve the creation of a document detailing the technology options, selection criteria, and applicable business factors that determine the selected technology. 

For example, a company might be deciding whether to continue leveraging Hadoop for its data warehouse or move over to Snowflake.

The assessment may reveal the need for a proper Proof of Concept (POC) to build a better understanding of how the proposed technologies stack up against the selection criteria. This is usually noted in the proposal and would represent a phase 0 implementation scope to solidify technology selection.

At AKIRA, we’ve been fortunate enough to have a strong background in implementing data platforms using a variety of technologies. The selection criteria list below is what we use to help our customers maximize their technology investments.

Selection Criteria:

  • Create a pros and cons list
  • Create a cost profile
  • Create a migration calculator
  • Create a feature comparison
  • Identify alternatives
  • Assess internal comfort and experience with the tool

Technology decisions are critical to a successful data platform. The decision drives everything from costs to recruiting for roles on the platform team. 
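One common way to combine selection criteria like those above into a single comparison is a weighted scoring matrix. The sketch below is a generic illustration, not AKIRA's actual tooling: the criteria names, the weights, and the ratings for "Option A" and "Option B" are all hypothetical and say nothing about any real product.

```python
def weighted_score(ratings: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of 1-5 ratings; the weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(ratings[c] * w for c, w in weights.items()) / total_weight


# Hypothetical weights: a higher weight means the criterion matters more
weights = {"cost": 3, "features": 2, "migration_effort": 2, "team_experience": 1}

# Hypothetical ratings (5 = best), not an assessment of any real product
options = {
    "Option A": {"cost": 2, "features": 3, "migration_effort": 5, "team_experience": 5},
    "Option B": {"cost": 4, "features": 5, "migration_effort": 2, "team_experience": 2},
}

scores = {name: weighted_score(r, weights) for name, r in options.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The numbers matter less than the discussion they force: agreeing on the weights makes the business factors behind the decision explicit.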

Step 5: Data Governance

As data platforms mature, the cost of missing specific components of a Data Governance program grows. On the other hand, over-engineering a Data Governance program can slow down progress and limit business value. The key is balance.

Pay special attention to business units that have non-negotiable Data Governance requirements. In certain industries, compliance obligations dictate specific governance of the platform. Identify the aspects of Data Governance that are non-negotiable and those that can be developed later in the data platform life cycle. Core Data Governance capabilities are:

  • Authentication & Authorization: Ensuring the right users have the right access to the right data.
  • Information Architecture: Organizing, structuring, and labeling data effectively and sustainably.
  • Provisioning and Rights Management (Data Stewards & Data Asset Definitions): Supplying data to users to ensure high data quality and a clear understanding of data assets. 
  • Data Catalog and Classification: Providing data summaries and metadata for easy access and understanding.
  • Data Lineage: Understanding the impacts of changing source systems and the downstream effects.
  • Data Mastering: A set of defined activities that establish a single source of truth across the enterprise for all data required to run the business.
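To illustrate the Data Lineage capability in the list above, the sketch below walks a lineage graph to find everything downstream of a changed source. The `downstream` helper and the asset names are hypothetical; in practice the graph would come from a data catalog or lineage tool rather than a hand-written dictionary.

```python
from collections import deque


def downstream(lineage: dict[str, list[str]], source: str) -> list[str]:
    """Return every asset reachable from `source`, in breadth-first order."""
    seen, order = {source}, []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Hypothetical lineage: each key feeds the assets in its list
lineage = {
    "crm.accounts": ["staging.accounts"],
    "staging.accounts": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.sales_by_region", "ml.churn_features"],
}

impacted = downstream(lineage, "crm.accounts")
print(impacted)
```

A change to the source system in this example would touch four downstream assets, which is the kind of impact a lineage capability is meant to surface before the change is made.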

Step 6: Organizational Structure

Organizations evolve through growth, contraction, acquisitions, mergers, and a whole host of other factors. The teams that support the platform will need to ebb and flow with the organization and platform. 

Typically, organizations have an executive leader who is responsible for data and analytics broadly. As organizations grow, sub-teams can form to support specific business units or functions within the organization, and a central IT team can manage fundamental aspects of the platform infrastructure. Below is a simple view of an organizational structure vs. a more complex one.

Simple Vertical

Complex Vertical

Step 7: Implementation Plan

All of this work culminates in a clear plan for how to get from where you are to where you want to be. Typically, platforms go through a similar implementation process of platform build, migration, use case development, testing/validation, deployment to production, and management. Each of these phases needs clear timelines and costs. The right balance of quality, speed, and cost will vary by organization.
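A simple way to keep timelines and costs visible for each phase is to track them in a small structure and roll them up. In the sketch below the phase names follow the text, but the durations and costs are placeholders, and the assumption that phases run strictly one after another is a simplification, since real phases often overlap.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    weeks: int
    cost: float


# Phase names follow the text; durations and costs are placeholders
plan = [
    Phase("Platform build", 6, 60_000),
    Phase("Migration", 4, 40_000),
    Phase("Use case development", 8, 80_000),
    Phase("Testing/validation", 3, 20_000),
    Phase("Deployment to production", 1, 10_000),
]


def schedule(phases):
    """Lay phases end to end and return (name, start_week, end_week) tuples."""
    out, start = [], 0
    for p in phases:
        out.append((p.name, start, start + p.weeks))
        start += p.weeks
    return out


timeline = schedule(plan)
total_weeks = timeline[-1][2]
total_cost = sum(p.cost for p in plan)
print(timeline, total_weeks, total_cost)
```

Ongoing management is left out of the roll-up because it has no fixed end date; it is better budgeted as a recurring cost.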

Phased Implementation Plan

Listed below is an example of a phased implementation plan that AKIRA uses for most customers:

Migrations

If your business is ready for the migration step, we have a vast library of common migration approaches including On-Prem to Cloud, Simple DW to Hadoop, Hadoop to AWS, and much more!

Step 8: Recommendations

The last step brings everything together. The recommendation lays out the target future state, justifies which data assets will be developed, and explains how the business will create a strategic advantage through the use of data. Here are the five key components that should be included in the recommendations:

  • Summary of Discovery
    • Data platform current state 
    • Platform delivery model & team structure
    • Gaps/challenges
    • Identification of use cases
  • Recommended Data Platform Architecture
    • Key technologies
    • Technology justification
    • Initial platform roadmap
  • Recommended Delivery Organization
    • Roles and responsibilities
    • Team size
    • Required technology skills 
    • BU partnership model
    • Execution method (Agile vs. Waterfall vs. Hybrid)
  • Recommended Data Governance Model
    • Initial data governance requirements
    • Recommendation for how data governance should progress
  • Implementation Plan
    • Phases
    • Timelines
    • Costs

The intention of the recommendation is to convey the direction stakeholders should be heading and to convince them of it. It will be the basis for any needed investment in technology, people, process changes, or potential changes in organizational structure to support this new direction.

When developing the recommendation, it is important to bring stakeholders along in the process. They should not be caught off guard by the recommendation. Not all stakeholders will agree with the recommendation but getting their input, understanding their concerns, and addressing them is key to managing change within your organization. 

Even though not all stakeholders will need to agree with the recommendations, key decision-makers will. Identifying these decision-makers and including them early in the recommendation process will increase your chances of success.

Conclusion

The general principles and best practices in this guide truly apply to organizations of all sizes, industries, and data/analytic maturities. 

At AKIRA, we know this work is foundational to your success and are here to help. Drawing from years of experience, learning, iterating, and doing this for customers at every stage of their life cycle, we’ve built a set of tools, processes, reference architectures, and a team that is ready to help you get on the right path towards better utilizing data and analytics within your organization(s).