How to Plan a Data Warehouse Migration to the Cloud

An on-premises data warehouse (often shortened to “on-prem”) is a repository that businesses and organizations run on their own infrastructure to store data for analytics, business intelligence, and reporting. While a database is mainly used to process daily transactions, a data warehouse stores current and historical data for reporting and analysis.

Recent years have seen many organizations move their data warehouses to the cloud for a variety of reasons, including cost, scalability, and improved integration. Following in their footsteps is a significant undertaking that requires careful planning before, during, and after the move.

Organizations stand to benefit from breaking their cloud migration procedure into groups of related workloads that can be managed or delegated individually.

Define The Goals

One of the first things an organization needs to do is decide whether data migration is even necessary. To do so, it should work through the following questions and tasks.

  • What is the purpose of this data migration? Is it forced by governance or security? Is it business-critical? Is it about making long-needed changes in data infrastructure?
  • Define a Business Intelligence (BI) strategy. How will this move enhance performance and competitiveness, and does it comply with that strategy?
  • Understand which part or parts of your organization stand to benefit from cloud migration. Is the move driven purely by cost savings, or is it required for growth?
  • Define which parts of your organization will be responsible for each stage.
  • Define data governance and stakeholder considerations.
  • Compare existing reporting with must-have future reporting to make sure all of it is achievable on the new cloud data warehouse.

Data migration is a time-intensive and costly undertaking that requires skilled specialists and carries some risk of information loss. Organizations should take the time to consider exactly what the benefits will be. Many of the issues facing an existing data warehouse have solutions that do not require migration, so due diligence is recommended before any action is taken.

Assess Your Current Solution

A full assessment of your current analytics solution should be the first port of call. Clearly define which parts of your organization stand to gain from migration. Many migrations benefit from being done incrementally to minimize business disruption. 

At this stage, you can estimate how much cloud capacity you need by working out your current application resource requirements and then factoring in any processing improvements or enhancements you have planned.
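
The estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the workload names, figures, the `growth_factor` for planned enhancements, and the `headroom` safety margin are all hypothetical assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    """Resource footprint of one current workload (figures are hypothetical)."""
    name: str
    storage_gb: float
    vcpus: int


def estimate_cloud_capacity(workloads, growth_factor=1.0, headroom=0.0):
    """Sum current needs, scale for planned enhancements, then add headroom."""
    storage = sum(w.storage_gb for w in workloads)
    vcpus = sum(w.vcpus for w in workloads)
    scale = growth_factor * (1 + headroom)
    return {
        "storage_gb": round(storage * scale, 1),
        "vcpus": math.ceil(vcpus * scale),  # round up: partial vCPUs cannot be provisioned
    }


# Hypothetical current workloads.
current = [
    Workload("nightly_etl", storage_gb=2000.0, vcpus=16),
    Workload("bi_reporting", storage_gb=500.0, vcpus=8),
]

# Assume planned enhancements add 20% load and a 25% safety margin is wanted.
estimate = estimate_cloud_capacity(current, growth_factor=1.2, headroom=0.25)
print(estimate)
```

Keeping growth and headroom as separate inputs makes it easier to show stakeholders which part of the estimate comes from planned improvements and which is a safety margin.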

You need a full understanding of the services your data warehouse currently provides and how it is used. Also assess which operations your current solution cannot perform that a cloud data warehouse could.

Which components are already on the cloud?

Identify which components of your analytics solution are already on the cloud. Then make a list of which parts of your data warehouse you plan to move to the cloud, for example, BI tools, ETL, or the database.

Which elements are stored on-premises?

Once you’ve figured out which elements are already on the cloud and which parts you would like to move, you need a comprehensive list of the elements stored on-premises. Mapping dependencies is essential to avoid downtime. Data and applications tend to become entangled over time, a phenomenon often called “data gravity”, so identifying which applications rely on each other, and re-establishing those connections on the cloud, requires forethought and planning.
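
One way to make the dependency list actionable is to record it as a graph and derive an order in which each component is handled only after the things it relies on. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the component names and the "dependencies first" ordering are illustrative assumptions, not a prescribed sequence.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical components; each maps to the set of components it depends on.
dependencies = {
    "bi_dashboards": {"warehouse_db"},
    "warehouse_db": {"etl_jobs"},
    "etl_jobs": {"source_extracts"},
    "source_extracts": set(),
}


def migration_order(deps):
    """Return components so that each appears after everything it depends on."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes forming the cycle.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


order = migration_order(dependencies)
print(order)
```

A circular dependency raises an error instead of producing an order, which is a useful early signal that two applications may have to move together.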

What applications are you running in the cloud, and which need advanced analytics?

At this stage, determine which of your data sources are currently on the cloud and how your organization uses them. These should align with the goals defined earlier.

Another important consideration is exactly which advanced analytics you currently use or plan to use. Settling this early in the process gives you the oversight needed for planning. The migration should not replicate old inefficiencies.

Once all of this information is accounted for, you can begin the next stage of planning.

Plan a Budget

Though moving to the cloud promises savings on maintenance staff, data warehouse space, and equipment, a successful transfer incurs its own costs. Draw up an initial budget and compare it with the cost of your current solution before proceeding.
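
As a rough illustration of that comparison, the sketch below computes how many months it takes for monthly savings to repay a one-off migration cost. All figures are hypothetical, and the model deliberately ignores factors such as parallel running, discounts, and staff changes.

```python
import math


def break_even_months(migration_cost, on_prem_monthly, cloud_monthly):
    """Months until cumulative savings cover the one-off migration cost.

    Returns None when the cloud option is not cheaper per month,
    because the migration cost is then never recovered.
    """
    monthly_saving = on_prem_monthly - cloud_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(migration_cost / monthly_saving)


# Hypothetical budget figures, in one currency unit.
months = break_even_months(
    migration_cost=120_000,
    on_prem_monthly=25_000,
    cloud_monthly=15_000,
)
print(months)
```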

Careful attention must be paid to the spending culture within your organization. When allocating costs, are operational expenses preferred over capital expenses? Which stands a better chance of board, stakeholder, or management approval?

How to Design Your Cloud

Once you know why you are migrating, what your budget is and what your current solution does — and how you can improve it — you are ready to move on to the design of your cloud.

Data migration involves three steps: extracting data, transforming data, and then loading data. The design of your cloud is an opportunity to examine current formats and databases and reimagine them to suit your business analysis needs.
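
The three steps can be sketched end to end with Python's standard library. This is a toy example under stated assumptions: the CSV text stands in for a legacy export, its column names and DD/MM/YYYY date format are hypothetical, and an in-memory SQLite database stands in for the cloud data warehouse.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical export from the legacy warehouse (dates assumed DD/MM/YYYY).
LEGACY_CSV = """order_id,order_date,amount
1,03/01/2021, 19.99
2,04/01/2021,5.00
"""


def extract(text):
    """Extract: read raw rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cast types and normalise dates to ISO 8601."""
    records = []
    for row in rows:
        day = datetime.strptime(row["order_date"].strip(), "%d/%m/%Y").date()
        records.append((int(row["order_id"]), day.isoformat(), float(row["amount"])))
    return records


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the cloud warehouse
load(conn, transform(extract(LEGACY_CSV)))
print(conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
```

Keeping the three steps as separate functions means the transform logic, where formats are reimagined, can be reviewed and tested without touching either the source or the target system.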

There are a lot of specifics to be decided upon. With no one-size-fits-all approach, how you plan to use the cloud will dictate the design. Some important points to consider are as follows.

To really reap the benefits of the cloud, organizations should use cloud-optimized databases. Snowflake should be considered first. Other recommended options are Amazon Redshift, Azure SQL Database, and Azure Synapse Analytics. Consultation is essential at this stage to make sure you’ve got the right solution for your specific uses.

Choosing the right cloud platform for your migration is equally crucial, and consultation is again recommended to avoid costly wrong choices. AWS, Microsoft Azure, and Google Cloud are market leaders for good reason and are highly recommended.

A geographical map of users’ locations is worthy of consideration in the planning of your cloud. Cloud servers have a physical location, so choosing a place close to your market or markets may be useful when maximizing speed.
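
A simple way to reason about location is to compare candidate regions by their user-weighted distance. The sketch below uses the haversine great-circle formula; the region labels, coordinates, and user counts are hypothetical, and geographic distance is only a rough proxy for network latency.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius


def closest_region(regions, user_sites):
    """Return the region name with the smallest user-weighted mean distance."""
    total_users = sum(count for _, count in user_sites)

    def weighted_distance(name):
        centre = regions[name]
        return sum(haversine_km(centre, site) * n for site, n in user_sites) / total_users

    return min(regions, key=weighted_distance)


# Hypothetical candidate regions (approximate coordinates, not provider identifiers).
regions = {"region-eu": (50.1, 8.7), "region-us": (38.9, -77.4)}
# Hypothetical user locations with user counts.
user_sites = [((51.5, -0.1), 800), ((40.7, -74.0), 200)]

best = closest_region(regions, user_sites)
print(best)
```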

Another factor to consider is your cloud approach. Will it be hosted, managed, or SaaS? Identifying the trade-offs of each method will serve an organization well. Again, these decisions will be business-specific: SaaS will be sufficient for some, but cloud storage provides a more customizable option.

Finally, organizations should choose among public cloud, hybrid cloud, private cloud, and multi-cloud options. Again, this will be business- or industry-specific. Some organizations choose to keep some data on-prem, in which case a hybrid cloud solution is the preferred option.

Identify Who Will be Interrupted

It’s critical to define exactly who will be interrupted by this migration and to plan and schedule around them. After this, figure out which departments will be using the cloud. What applications or programs will they need access to, and will they, or your IT department, be responsible for the long-term management of the cloud?

Evaluate what skills the day-to-day running of the cloud will involve and consider appropriate training for those responsible. With a move from an on-premises data warehouse to the cloud, specific jobs or roles may become obsolete, which is an important consideration. Any data governance and stakeholder considerations must also be addressed at this juncture.

Also keep in mind that migrating incrementally, rather than all at once, helps minimize business disruption.

Review of Security and Privacy Policy

Security policy must be thoroughly reviewed before any data is transferred to the cloud. Specifically, who has access or authorization to use the migrated data? With user privacy regulations being enforced by governments and regulatory bodies, data security must be factored into this move.

Robust backup strategies and recovery processes should be implemented to guard against disastrous data loss or compatibility issues.

Perform a Data Audit

An audit of the data currently stored on-prem and on the cloud is necessary, especially the data that is loaded into the data warehouse. Organizations must define how much information they have and how it is currently accessed. Is it structured or unstructured? What are the sources of your data? What formats is this data found in?
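
The audit questions above map naturally onto a small inventory that can be summarized by any attribute. The sketch below is illustrative only: the dataset names, field names, and sizes are hypothetical.

```python
from collections import defaultdict

# Hypothetical audit inventory: one record per dataset.
inventory = [
    {"name": "sales_orders", "source": "erp", "format": "table",
     "structured": True, "size_gb": 120.0, "location": "on-prem"},
    {"name": "support_emails", "source": "helpdesk", "format": "text",
     "structured": False, "size_gb": 40.0, "location": "on-prem"},
    {"name": "crm_accounts", "source": "crm", "format": "table",
     "structured": True, "size_gb": 15.0, "location": "cloud"},
]


def summarize(items, key):
    """Count datasets and total their size, grouped by the given attribute."""
    totals = defaultdict(lambda: {"datasets": 0, "size_gb": 0.0})
    for item in items:
        bucket = totals[item[key]]
        bucket["datasets"] += 1
        bucket["size_gb"] += item["size_gb"]
    return dict(totals)


by_location = summarize(inventory, "location")
by_structure = summarize(inventory, "structured")
print(by_location)
print(by_structure)
```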

Some business-application data (for example, CRM data in Salesforce) will already be on the cloud. Other data will not. If your organization is planning to keep some data on-prem, defining which data is kept, and for what purpose, is essential. This will largely depend on the business intelligence uses of this data. Consultation with an expert is recommended so that the migration produces the best results.

Depending on the size of your operation, your audit should identify server groups that can be moved together to minimize business interruption when the migration occurs. This is a key part of the planning process, as business should not be disturbed by the migration of the data warehouse to the cloud.
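
One possible way to identify such groups is to treat servers as nodes and known dependencies as links, then collect the connected components: servers that are linked, directly or indirectly, land in the same move group. The server names and links below are hypothetical, and this grouping rule is an assumption rather than the only valid approach.

```python
from collections import defaultdict


def move_groups(servers, links):
    """Group servers that are connected through any chain of dependencies."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for server in servers:
        if server in seen:
            continue
        # Depth-first walk collecting everything reachable from this server.
        stack, group = [server], []
        seen.add(server)
        while stack:
            current = stack.pop()
            group.append(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Hypothetical servers and the dependencies found during the audit.
servers = ["etl01", "dw01", "bi01", "crm01", "mail01"]
links = [("etl01", "dw01"), ("dw01", "bi01")]

groups = move_groups(servers, links)
print(groups)
```

Servers with no recorded links end up in single-member groups, which makes them natural candidates for early, low-risk moves.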

Establish The Migration Pathway

Organizations must assign a skilled leader to be responsible for managing the migration. This is complex, specialist work that requires an experienced hand, so outsourcing the job to consultants or third parties is a common approach. When a ‘lift and shift’ of historical data is needed, this again requires careful planning. The migration can be done in small chunks or in one go. Interdependencies among components are another consideration during migration, so application dependency mapping is crucial.

You may also need an additional data store to take advantage of the new capabilities of a cloud enterprise data warehouse (EDW): a data lake can hold raw and detailed historical data, instead of keeping a volume of barely used data in the EDW.

As you can see, planning a data migration is a complex, multi-step task that benefits greatly from advance planning and consultation.

