Moving to the Cloud

7 Expert Tips for Selecting the Right Strategy and Tools


Moving your data to the cloud can help manage costs and increase agility. In the cloud, you can scale up or down as needed to handle spikes in demand and control costs. The cloud also offers more choices for SaaS applications: Cloud provider marketplaces are filled with tools that can help you move your data, transform it, analyze it, and meet just about any other need you have.

Peter Choe, Data Practice Lead at Ippon Technologies USA, and Shawn Johnson, Solution Architect at Matillion, chatted on a webinar about the right strategies and tools for moving to the cloud. Here are some of their top tips for a successful cloud implementation and migration.


1. Build your cloud strategy around people, processes, and technology.

When building a data strategy, it’s important to look closely at your people, processes, and technology. When you understand the existing state of your organization, you can come up with a strategic plan for a future state.

People

When looking at your people, you need to understand the skill sets your employees already have. You don’t necessarily want a strategy that forces everyone to re-skill. Building off your employees’ existing skills will allow you to pivot to the cloud more quickly and better assess which gaps to fill right away and which skills to develop over time.

One great way to identify all your existing skills is to build a RACI chart. Also known as a responsibility assignment matrix, a RACI chart shows the skills, roles, and responsibilities of your employees: whether they need to be Responsible, Accountable, Consulted, or Informed on particular initiatives.
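To make the idea concrete, here is a minimal sketch of a RACI matrix represented as a simple data structure. The initiative names, roles, and assignments below are purely illustrative, not taken from any real organisation:

```python
# A RACI matrix as a nested mapping: initiative -> role -> assignment.
# All names here are hypothetical examples.
raci = {
    "Migrate sales data to cloud": {
        "Data Engineer": "Responsible",
        "Data Practice Lead": "Accountable",
        "Security Team": "Consulted",
        "Finance": "Informed",
    },
    "Select ETL tooling": {
        "Solution Architect": "Responsible",
        "CTO": "Accountable",
        "Data Engineer": "Consulted",
        "Business Analysts": "Informed",
    },
}

def roles_with(matrix, assignment):
    """Return {initiative: [roles]} holding the given RACI assignment."""
    return {
        initiative: [role for role, a in roles.items() if a == assignment]
        for initiative, roles in matrix.items()
    }

# For example, list who is Responsible for each initiative:
print(roles_with(raci, "Responsible"))
```

Even a small structure like this makes skill gaps visible: an initiative with no Responsible role, or one person Accountable for everything, is an immediate flag.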

Processes

It’s important to understand how your existing processes intersect so that when you move one thing to the cloud, you understand what else might be impacted. With a thorough understanding of your existing processes, you can create a strategic migration plan.

Technologies

There are numerous technologies that can easily be migrated from on-premises to the cloud. For example, Unix and Linux scripts can easily run on cloud platforms. Also, SQL is used extensively in on-premises solutions as well as in the cloud.

Again, assessing existing technologies can take advantage of skills your employees already have. Do research to determine if you can simply move your existing technologies over or if there’s a better technology option available in the cloud.


2. Get buy-in from your entire organization.

If you want to move your business to the cloud, you need to have buy-in from everyone from low-level developers to C-level executives. Everyone needs to have a clear vision of what the goals of the project are.

When talking to developers, you may want to talk about the day-to-day: efficiency and ease of management. Data teams are overloaded, and increased productivity is an immediate benefit. When talking to C-level executives, you'll want to focus more on the big picture and business value: future development, speed to analytics, ROI, delivering value to end consumers, and the actual dollar value of the project.


3. Start small to achieve quick wins.

When migrating to the cloud, avoid the “Big Bang” approach. Start small. Doing so can help quickly demonstrate the ROI of moving to the cloud. Starting small also helps developers build up their confidence in working with new tools and technologies.

If you start with smaller, more tangible projects that yield immediate business value, you’re more likely to reinforce the importance of a bigger initiative. Also, in your first endeavor, you’ll no doubt run into bumps in the road that require you to correct course. These shifts are easier to make on a smaller project.


4. Determine the right method for moving to the cloud.

There are three common methods for moving to the cloud that the majority of organizations employ for their first cloud migration:

Lift and shift

Lift and shift, or load and transfer, is exactly what it sounds like: You basically move an application and its associated data to the cloud as-is, with no redesign. When you are just beginning to learn about the cloud, lift and shift might be the easiest, fastest, and most cost-effective way (in the short term) to get an existing on-premises application or process moved to the cloud. It’s also a great way to become more familiar with the cloud.

Load, transfer & sync

This is similar to lift and shift, but once you’ve loaded and transferred to the cloud, you then try the different cloud services that are available and swap them out for increased efficiency. For example, you might move your application to the cloud, but swap out the database for a cloud-native database. With this approach, you can benefit from the automated backup and operations that cloud services provide.

Re-architect and re-platform

This approach requires the most time and effort. It involves re-imagining how your application will run on a cloud platform, then re-designing it to take full advantage of cloud-native capabilities.

This method may be useful if your current architecture is unable to scale to meet future business needs. It can also help you achieve cost savings over lift and shift in the long run. However, this method is the most time-consuming and difficult up front.


5. Choose the right cloud provider.

All the major cloud providers offer a variety of managed services and components that you can use to expedite your move to the cloud. Review the marketplaces for each provider to determine if they offer the applications and microservices you need. Some providers also offer a cloud adoption framework to help support your cloud migration plan.

Even as you choose a provider, keep in mind that you may need to make a change at some point in the future or need a multi-cloud strategy. Consider ways to make your applications cloud-agnostic.


6. Use cloud-native data loading and ETL

Whether you are performing data transformation or simply loading data into the cloud, cloud-native ETL products can help you increase productivity and accelerate time to value. Trying to adapt on-premises ETL tools and processes to the cloud won’t take advantage of the platform’s speed and scalability like a cloud-native solution will.

Matillion supports all the major cloud data warehouses, and it provides a graphical, low-code/no-code user interface that can generate SQL for you. Pre-built connectors help you get your data into the cloud from common data sources, and the ability to ‘Create Your Own Connector’ using REST API ensures that you can bring data into the cloud from virtually any source. Data teams are overloaded and Matillion helps them move faster and more efficiently, increasing the speed to analytics.


7. Always be evolving.

The cloud is not a static technology. It’s always changing. Your data journey will continue to evolve as well. Also, your move to the cloud may be gradual, and you may be maintaining some on-premises applications for years. Be prepared to continually evaluate your services in and out of the cloud to improve efficiency and take advantage of new and emerging technologies wherever your data resides.


Learn more about planning a move to the cloud.

To learn more about selecting the right strategy and tools to support your cloud transformation, click here or contact us. We're happy to provide more links to articles and research, or to answer your questions on moving to the cloud.

How is CI/CD (Continuous Integration / Continuous Delivery) Used to Modernise a Data Warehouse? 


A data warehouse is a specially designed data management system built to enable and support business intelligence and analytics. For example, a data warehouse will read large amounts of data to understand relationships and trends within an organisation.

Data warehouses contain large amounts of historical data and are intended to perform queries and carry out analysis. The data within a data warehouse is often derived from various sources, such as application log files and transaction details. You’re centralising and consolidating large amounts of data into one location by utilising a data warehouse. Over time, this data can be invaluable to data scientists and business analysts, allowing informed business decisions to be made using valuable business data insights. As this data record builds, a data warehouse is often considered an organisation’s single source of truth.  

Analytics and data have become indispensable for businesses to stay competitive, and companies rely on reports, dashboards, and analytic tools to extract data, monitor business performance, and support future business decisions. The power behind these processes is the data warehouse, which stores data efficiently to minimise the input and output of data and deliver query results quickly.


What’s the Architecture of a Data Warehouse?  

A data warehouse architecture is created in tiers. The top tier is the front-end, where results are presented through reporting, analysis, and data mining. The middle tier is where the analytics engine used to access and analyse data sits. The bottom tier of the architecture is the database server, where data is loaded and stored.  

Data is typically stored in one of two ways:

  1. Data that is accessed regularly is stored in fast storage, such as an SSD.
  2. Data that is accessed less frequently is stored in a cheaper object store.

The data warehouse differentiates between the two types of data and automatically ensures that frequently accessed data is available in fast storage, thereby optimising query speed.
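As a toy illustration (not how any particular warehouse product implements it), the tiering decision can be thought of as a simple rule based on access frequency; the threshold here is an arbitrary assumed value:

```python
# Illustrative sketch of storage tiering: frequently accessed data goes
# to fast (SSD-like) storage, the rest to a cheaper object store.
# The threshold of 10 accesses/day is an assumption for the example.
def choose_tier(accesses_per_day, threshold=10):
    return "fast_storage" if accesses_per_day >= threshold else "object_store"

print(choose_tier(50))  # hot table
print(choose_tier(1))   # cold archive
```

Real warehouses track access statistics continuously and move data between tiers automatically, but the underlying trade-off is the same: query speed for hot data versus storage cost for cold data.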

So, how does a Data Warehouse work?

A data warehouse might contain multiple databases, and within each database the data is organised into tables and columns. Each column has a defined data type, for example integer, string, or date field. Tables can be organised within schemas, using a structure similar to files and folders. When data is added to the warehouse, it is collected and stored in the tables defined by the schema, and query tools use the schema to identify which tables to access and analyse.
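The structure described above can be sketched with an in-memory SQLite database: a table with typed columns defined by a schema, which a query then uses to find and aggregate data. Table and column names are hypothetical:

```python
import sqlite3

# Define a schema: a table with typed columns (integer, string, numeric).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales (
        order_id INTEGER,  -- integer column
        region   TEXT,     -- string column
        amount   REAL      -- numeric column
    )
""")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "North", 120.0), (2, "South", 80.0), (3, "North", 50.0)],
)

# A query tool uses the schema to know which table and columns to read.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('North', 170.0), ('South', 80.0)]
```

A warehouse does the same thing at vastly greater scale, but the principle is identical: the schema tells the query engine where the data lives and what type it is.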

What are the benefits of using a Data Warehouse?

As we mentioned above, there are multiple benefits to using a data warehouse, for example, making more informed business decisions based on data and insight. Here are some of the additional benefits of using a data warehouse:  

  • Consolidated data from many sources 
  • Historical data analysis 
  • Data quality, consistency, and accuracy 
  • Separation of analytics processing from transactional databases, which improves the performance of both systems 

What’s a Modern Data Warehouse?

Across your organisation, multiple teams and users will have different needs for a data warehouse. Traditional data warehousing can’t always keep up with the demands of rapidly growing volumes of data, processing workloads and analysing data. In contrast, a modern data warehouse architecture addresses the different needs of your organisation by providing a way to manage everything you need with various integrated components that all work together. 

A modern data warehouse would typically include:  

  • A streamlined database that simplifies the management of all data types and provides different ways to use data across your organisation 
  • Self-service data ingestion and transformation services  
  • Support for SQL, machine learning, graph, and spatial processing 
  • Multiple analytics options that make it easy to use data without moving it 
  • Automated management for simple provisioning, scaling, and administration 

Ultimately, a modern data warehouse can efficiently streamline data workflows in a way that traditional data warehouses can’t. This allows everyone to perform their jobs more effectively and efficiently, from analysts and data engineers to IT teams and data scientists.  

If you have a lot of data, and multiple teams that all need to access it, then modernising your data warehouse should be a key component of your data strategy. But how can you update your data warehouse? This is where CI/CD, which stands for Continuous Integration (CI) and Continuous Delivery (CD), comes into play. CI/CD creates a faster, more precise, and overall efficient way of combining the work of multiple teams into one streamlined product. For example, in app development and operations (DevOps), CI/CD would streamline coding, testing, and deployment by creating a single space for storing work and tools, thereby consistently combining and testing code to ensure it works.  


What is the CI/CD pipeline, and how can it support modernising a Data Warehouse?  

A CI/CD pipeline is any software development or engineering process that combines automated code building with testing and deployment. By utilising a CI/CD pipeline, you can deploy new and updated software safely and precisely.  

In other words, a CI/CD pipeline is the behind-the-scenes plumbing of your data and analytics, making your life easier and your work more consistent. Here at Engaging Data, we have developed a solution that integrates WhereScape 3D & RED with CI/CD and DevOps pipelines in one ecosystem to modernise the world of data warehousing. If you’d like to know more about our CI/CD pipeline, take a look here.  

A CI/CD pipeline isn’t just a linear process: it allows DevOps teams to write code, integrate it, test it, deliver updates and releases, and make changes to the software in real time. In addition, the ability to automate critical parts of the CI/CD pipeline allows development teams to work more efficiently and effectively and to improve other DevOps metrics.  
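As a toy illustration (not a real CI/CD tool), the code-test-deploy cycle described above can be sketched as three automated stages, where a failing test blocks the release before it ever reaches deployment. All function and artifact names are invented for the example:

```python
# Minimal sketch of a CI/CD pipeline: build -> test -> deploy.
# Each stage runs automatically; a failed test stops the release.
def build(source):
    # Stand-in for compiling/packaging the change into an artifact.
    return f"artifact({source})"

def run_tests(artifact):
    # Stand-in for the automated test suite gating the release.
    return artifact.startswith("artifact(")

def deploy(artifact):
    return f"deployed {artifact}"

def run_pipeline(source):
    artifact = build(source)
    if not run_tests(artifact):
        return "release blocked: tests failed"
    return deploy(artifact)

print(run_pipeline("warehouse_model.sql"))
```

The value is in the gate: every change takes the same automated path, so a warehouse model change is tested the same way as any other code before it goes live.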

What are the benefits of the CI/CD pipeline in modernising a Data Warehouse? 

The most significant benefit of a CI/CD pipeline is the automation of releases from the initial testing to deployment. Additional benefits of the CI/CD pipeline for DevOps include:  

  • Automated testing makes the development time more efficient; CD and automation mean that a developer’s changes to a cloud application could go live within minutes.  
  • Thanks to faster, more efficient testing and development, less time needs to be spent in the development phase, reducing cost.  
  • The CI/CD pipeline is a continuous code, test, and deploy cycle. Every time code is tested, developers can react to feedback and improve the code.  
  • A CI/CD pipeline allows a more collaborative and integrated process with everyone across the organisation who needs to access the data warehouse.   

If you have any questions about a CI/CD pipeline or the deployment process, we have diagrams of the flow here.  


Conclusion

Overall, a modern data warehouse utilising CI/CD should be an essential part of your data strategy if your organisation has multiple touchpoints requiring access to data, insight, and analytics.  

Do you have any questions about modernising your data warehouse? Or about creating and implementing a CI/CD pipeline? Our expert team of data specialists can implement a modern data warehouse for your organisation.

Fill out our form below, and one of our team will be in touch.  


The Gold Standard – Part 2


Engaging Data Explains:

Creating The Gold Standards in Data –

Part II : Assessing the Gold


In this second part of our four-part series on gold standards in data, we’re going to examine the importance of instilling a review process.

Reviewing is never the most exciting nor relished procedure! It often feels like an unnecessary grind that slows down your whole operation. But it’s actually a critical part of any good business, enabling you to identify major faults before they become ingrained.


Reviewing Processes

We’ve already seen in the first part of our gold standards blog that the cake shop reviewed their existing processes before making a decision to change their order form. Often it’s only from self-assessing in this way that you can uncover new ways of working that enhance your productivity and efficiency.

It’s also important to emphasise that this needs to happen across the business. It’s normal to have a review as part of project governance for new projects, but it’s always worthwhile to reassess existing projects as well. They may be ‘good enough’, but they also might not be reaching the gold standard.

So in our cake shop scenario, how will the business assess the standard of the cakes being produced? Well, when the business reviewed the kitchen they found that each cook bakes one order, while also being responsible for checking their own cakes before they move to the decoration stage. This all seemed fine, but then the retailer received a few complaints about burnt edges and sub-par ingredients. Consequently, the cake shop reflected that its existing process of ‘marking your own homework’ was not sufficiently robust to identify problems.

In order to address this issue, several ways of reviewing the production of cakes were decided upon:

  • A simple visual inspection – does the cake look uncooked or overcooked?
  • A thorough test, such as breaking the centre of the cake to see if it is cooked. 
  • Pressing the centre of the cake to see if it springs back.
  • Hiring Paul Hollywood as a tester!

All of these checks are designed to see if the cake has been cooked satisfactorily. The shop decided to adopt all of them, with the exception of hiring Paul Hollywood! Instead, each baker will review one another’s baking, with the hope that the business will grow to support a head baker who reviews all cakes.

Just as the cake shop reviewed its processes to put a more stringent review structure in place, the same can also be implemented in a data-driven environment. The following questions are examples of some that you can ask yourself as part of this process:

  • Does the team need training?
  • Do you need to recruit new people with different skill sets?
  • Do the products need to change?
  • Is the supply line quick enough? 
  • Would more people or different processes help with efficiency?
  • Is the product still worth the effort that is invested in producing it?

Producing Gold

Once you’ve baked some quality cakes, you then need to take steps to market the product. It’s not enough to just produce the cakes and leave your customers to eat them if you want to maximise your marketing efforts. Building advocacy and influence via your customers is a great way of marketing your cakes to the right people. Word-of-mouth feedback from advocates is trusted by other potential customers, and is therefore far more effective than other forms of marketing.

However, there are two sides to the coin here. Negative feedback can be dangerous if it’s not managed effectively. Negative comments about the burnt edges of cakes will spread like wildfire to existing and prospective customers. But there are ways of recovering from this. Making courtesy calls to customers can provide you valuable insight into the process of ordering and consumption. 

Getting the right team in place, tailoring products for your target market, and taking feedback onboard in an active process are all important facets of contemporary marketing. 

What are the Considerations?

Gold standards always begin with what you are looking to deliver and who this will benefit. Gold standards should be designed to support outcomes, having considered both the internal and external factors that will influence design. Creating steps in the process to continually challenge the functionality of the end product and ensure that standards are still relevant to the end user should therefore be considered essential.

Sponsors and influencers can also play an extremely important role. Both can become prime advocates of your product, with the added benefit that sponsors pay you to advertise your goods or service!

Internal Considerations

Data

Data can be compared to ingredients within a cake. Naturally, good quality ingredients are critical to producing the best cake possible. The same applies to data. As we mentioned in part one, if you put rubbish data into your systems, you can expect rubbish outcomes!

At some point, the cake shop realised that its bakers were not periodically reviewing the ingredients in their cupboards. To address this, the manager inspects everything on hand, ensuring that any poor-quality ingredients are replaced and anything out of date is thrown away. Labelling is updated, and processes are put in place to ensure that these mistakes are not repeated.

The key point here is that while complaints were registered about burnt edges, it may have been the ingredients that contributed to the final product that were the problem. Going forward, the team at the bakery put in place a series of key questions that would inform their processes in future baking:

  • Do we have enough data to make the size of cake required?
  • Are we getting our ingredients from the right suppliers? 
  • Does this product contain nuts?
  • Have I mixed sugar up with salt?
  • The milk smells as if it’s on the turn, should I use it?
  • Do we store the ingredients in the right place, in the correct containers? 

Resources and Teams

When baking your cake, you can select from many different types of bakers or specialist chefs to assist with the process. Or you may decide that you wish to train yourself, or an existing employee, so that they can handle the most challenging baking tasks.

In some cases, if the cake shop utilises industrial equipment, people who have been trained to use this equipment can be deployed, as opposed to bakers or specialist cake makers. Having the right team with the right skills, and/or the aptitude to learn them, can be critical to successfully achieving gold standards. Instilling this in your team culturally is critically important in providing direction to your whole operation.

Achieving this can be as simple as asking yourself the following questions:

  • Is the team right to build the end product? If not, what needs to change?
  • Is the team open to changing or evolving in order to improve the product or efficiency?
  • Do we have the right skills? If not, do we need to second or buy them in?

Company Culture

Finally, failing to understand the company culture will lead to failure. Gold standards must fit into the existing culture, or the direction the company is moving towards.

Understanding how your customers think, managing their expectations and developing a standard to consistently perform to those expectations is simple to conceive. However, the human element of this could result in you developing hundreds of different gold standards for multiple different customers. 

Important questions to ask yourself here:

  • Are the customers knowledgeable about your products? If not, can you educate them?
  • Do you share your practice? Would it help your customers to know what you do and how you do it?
  • Are there any expectations that you can manage? 
  • Are there any difficult expectations you have to work towards?

Implementing a gold standard for data may seem like an all-encompassing and intimidating goal. But it instead should be seen as a granular process. Breaking down the ingredients and individual components that collectively create gold standards is the best way to achieve this aim.



Data Warehousing Concepts Explained


Engaging Data Explains:

Data Warehousing Concepts


Modern commerce is an environment in which companies are increasingly being required to make complex, data-backed decisions. Dealing with vast amounts of information has become an essential feature of a business, which can often lead to siloed data. This is difficult enough to store, let alone analyse or understand.

In many cases, business demands require a more sophisticated system, improving data management and providing a holistic overview of essential aspects of the company. One of the best ways to achieve this is to invest in a data warehouse. Yet many companies are still unaware of what this entails or how it can help their business.


What is Data Warehousing?

In simple terms, a data warehouse is a system that helps an organisation aggregate data from multiple sources. Instead of experiencing the sort of separation and siloing discussed previously, data warehousing makes it possible to draw together information from disparate sources. It’s almost akin to a universal translator of languages. Typically, data warehouses store vast amounts of historical data, and this can then be utilised by engineers and business analysts as required.

Data warehousing is particularly valuable as it essentially provides joined-up information to a company or organisation. This was all but impossible until relatively recently, as data has traditionally been held in separate sources. Transactional systems, relational databases, and operational databases are often kept entirely separate, and until recently it was almost unthinkable that the data from these systems could be effectively combined.

But in this Information Age, companies are seeking a competitive advantage via the leveraging of information. By combining the vast amount of data generated together into one source, businesses can better understand and analyse key customer indicators, giving them a real insight into the determining factors of the company. Data warehousing can build more robust information systems from which businesses can make superior predictions and better business decisions.

In recent years, the escalation and popularisation of the cloud has changed the potential of data warehousing. Historically, it was more usual to have an on-premise solution, which would be designed and maintained by a company at its own physical location. But this is no longer necessary. Cloud data architecture makes it possible to data warehouse without hardware, while the cloud structure also makes implementation and scaling more feasible.

Data Lakes

However, those who are uninitiated in deep data topics may encounter terminology that can be somewhat baffling! The concept of a data lake seems rather surreal and tends to conjure up imagery that is, ultimately, completely useless! Inevitably, people who have never encountered the concept of data lakes before find themselves imagining an expanse of azure water glittering in the sunlight. Well, data lakes aren’t quite like that.

A data lake is used for storing any raw data that does not currently have an intended use case. It can be seen as similar to the wine lakes that used to be in the news quite regularly (though they no longer seem to be a talking point!). You can equally view a data lake as a surplus of information: data that may become useful in the future but does not have an immediate use at this point in time. Thus, it is stored away in a lake until it can be consumed adequately.

This differs from data warehousing, which is used to deal more efficiently with information that is known to be useful. A data warehouse may hold data stored in an impenetrable format, but there is a clear use case for understanding this information, or it needs to be stored for a particular reason.

When to use a Data Warehouse

There are a variety of reasons that a company or organisation would choose to utilise a data warehouse. The most obvious would be as follows:

  • If you need to store a large amount of historical data in a central location.
  • If you need to analyse data from your web, mobile, CRM, and other applications together in a single place.
  • If you need deeper business analysis than traditional analytic tools can deliver, for example by querying and analysing data directly with SQL.
  • If you need to allow simultaneous access to a dataset for multiple people or groups.

Data warehousing makes it possible to address a set of analytical questions that would be impossible with traditional data analytics tools. Collecting all of your data into one location and source makes it possible to run queries that would otherwise be completely unfeasible. Instead of asking an analytical program to continually run back and forth between several locations, the software can get to grips with one data source and deliver efficient, more holistic results.

Data Warehouse Factors

Many businesses now require data warehousing services to deal with the vast amount of data that is now generated. And that ‘many businesses’ will rapidly become ‘most businesses’, and then ‘virtually all businesses’ in the near future. But those that are inexperienced in this field are often confused about which factors to take into consideration.

Thus, we would recommend looking at these six key elements when considering warehousing:

  • The sheer scale of data that you wish to store.
  • The type of information that you need to store in the warehouse.
  • The dynamic nature of your scaling requirements.
  • How fast you require any queries to be carried out.
  • Whether manual or automatic maintenance is required.
  • Any compatibility issues with your existing system.

Concerning the second of these factors, the type of information you need to store, data can differ considerably in its basic structure. Some data may be highly complex yet still quantifiable and easily organised. However, in the era of Big Data there is a vast amount of unstructured data, which cannot be so easily managed and analysed. Companies that generate large volumes of unstructured data and need to collate and understand it are certainly excellent candidates for a data warehousing solution.

There is a lot to learn when it comes to the subject of data. And it can frankly be a little daunting at times. But what is certain is that this topic isn’t going anywhere. Big Data is here to stay. That’s why we have created our Data Vault 2.0 solution. Data Vault can ideally serve your organisations’ data needs when this is becoming an issue of paramount importance.