Shared with the kind permission of WhereScape Europe Ltd.
In January 2020, Gartner published the first official Data Warehouse Automation Quadrant. This marked the creation of a definitive category for an area of Data Management that, until recently, was always on the fringe of the mainstream. “Automating Data Warehouse Development” was written by Gartner analyst Henry Cook (Gartner subscribers can download the full report from the Gartner website). As an organisation that recognises the value of Data Infrastructure Automation and all that it supports, Engaging Data shares the view that such publicity reflects a shift in the data world from seeing automation as a nice-to-have to seeing it as a necessity for companies’ ability to compete in the digital age.
Henry’s technical background means he can describe the gains Data Warehouse Automation (DWA) creates from a business angle, but also go into granular detail about what DWA means for developers day to day.
The Gartner Take On DWA
“Few companies can implement all their potential requirements — limits on resources typically preclude this. Therefore, a DWA tool is often used as a productivity enhancer, which helps to meet more requirements and thus deliver more benefits. The dependence on external staff — such as contractors — can also be reduced through the use of DWA.
“Developers gain agility through the ability to generate code, and — equally important — the ability to regenerate it to make changes. Aside from generating code for the original state and for the new state, the DWA tool can also generate the special code to move from one state to the other.
“For example, consider the process of adding columns to a presentation layer data mart. The DWA tool generated the data description for the original mart, then the ETL to load it. Done manually, the developers will also need to write some special, one-off code to unload the mart, reshape it, and then reload the data in its new form, merging in data for the new columns. The DWA can automatically generate this transitional code too. This is faster and much more productive than having to do this manually. Over the lifetime of a DW, the developers will repeatedly perform this type of process. The principle is to build once and then repeatedly reuse.”
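The idea of generating code for the original state, the new state, and the transition between them can be illustrated with a minimal Python sketch. It is not how WhereScape or any particular DWA tool works: the table name `sales_mart`, the column metadata, and the helper functions are all hypothetical, and the transition uses simple `ALTER TABLE ... ADD COLUMN` statements rather than the unload, reshape and reload cycle described in the report.

```python
# Hypothetical metadata: column name -> SQL type for one data mart table.
def create_table_sql(table, columns):
    """Generate the CREATE TABLE statement for a given metadata state."""
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return f"CREATE TABLE {table} (\n  {cols}\n);"


def transition_sql(table, old_columns, new_columns):
    """Generate the statements that move a table from the old state to the new one.

    Only added columns are handled here; dropped or retyped columns
    would need extra logic in a real tool.
    """
    return [
        f"ALTER TABLE {table} ADD COLUMN {name} {sql_type};"
        for name, sql_type in new_columns.items()
        if name not in old_columns
    ]


# Original state of the mart, and the new state with two extra columns.
old_state = {"customer_id": "INTEGER", "total_sales": "REAL"}
new_state = {**old_state, "region": "TEXT", "last_order_date": "TEXT"}

original_ddl = create_table_sql("sales_mart", old_state)
new_ddl = create_table_sql("sales_mart", new_state)
migration = transition_sql("sales_mart", old_state, new_state)

for statement in migration:
    print(statement)
```

The point of the sketch is that all three outputs come from the same metadata, so a change to the column definitions regenerates the transitional code without anyone writing it by hand.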
Later in the report, Henry describes how DWA affects the Data Warehouse Development lifecycle in the long term, not just in its initial development.
“On the left-hand side of Figure 3, the DWA developers are defining the system and the linkages between the components in metadata.
“Once defined, the DWA tool will then automatically generate all the components shown on the right. Generating those components manually, then integrating and testing them (and doing this repeatedly) will require significantly more effort than using the DWA tool. Organizations using these tools commonly see productivity gains of 400%. Of course, productivity gains will vary depending on the tool and the nature of the system.
“Developers can change metadata and then quickly deploy the generated changes in the subsequent automated routines. This means that your development staff can devote the time that they had allocated for administrative work to address much more valuable and leveraged design work. Equally importantly, developers are much less likely to introduce mistakes, since the automatic generation of changes is always exact.”
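As a rough illustration of defining a system in metadata and generating its components, the sketch below produces a table definition, a load statement and a line of documentation per column from a single specification. Everything in it is an assumption made for illustration: the `Column` and `TableSpec` classes, the `generate_components` function and the table names are hypothetical and do not reflect the internals of any real DWA product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str          # target column name
    sql_type: str      # target column type
    source_expr: str   # expression evaluated against the source table
    description: str   # business description used for documentation


@dataclass(frozen=True)
class TableSpec:
    name: str
    source_table: str
    columns: tuple


def generate_components(spec):
    """Generate DDL, load SQL and documentation from one metadata definition."""
    ddl_cols = ", ".join(f"{c.name} {c.sql_type}" for c in spec.columns)
    names = ", ".join(c.name for c in spec.columns)
    exprs = ", ".join(f"{c.source_expr} AS {c.name}" for c in spec.columns)
    return {
        "ddl": f"CREATE TABLE {spec.name} ({ddl_cols});",
        "load": (
            f"INSERT INTO {spec.name} ({names}) "
            f"SELECT {exprs} FROM {spec.source_table};"
        ),
        "docs": "\n".join(
            f"{spec.name}.{c.name}: {c.description}" for c in spec.columns
        ),
    }


spec = TableSpec(
    name="dim_customer",
    source_table="stage_customer",
    columns=(
        Column("customer_id", "INTEGER", "id", "Source system customer key"),
        Column("customer_name", "TEXT", "UPPER(name)", "Customer name, upper-cased"),
    ),
)

components = generate_components(spec)
for kind, text in components.items():
    print(f"-- {kind}\n{text}\n")
```

Changing one entry in `spec` and calling `generate_components` again regenerates every dependent artefact consistently, which is the property the report credits for fewer manually introduced mistakes.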
What is Data Warehouse Automation?
A few years ago, the only way to quickly explain Data Automation was to compare it to an ETL tool, which is not a like-for-like comparison given the holistic reach of tools such as WhereScape. WhereScape extracts, transforms and loads, but in a different order: it extracts and loads first, then transforms (ELT). ELT uses the power of modern databases to transform, often using massively parallel processing (MPP) to complete the process in a fraction of the time of traditional ETL tools. So data is loaded 5-6x faster than with an old-school ETL tool, but this is just the tip of the iceberg.
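The difference in ordering can be shown with a small sketch using Python's built-in `sqlite3` module. SQLite is only a stand-in here for an MPP database, and the table names and sample rows are made up for illustration. The raw data is landed unchanged first, and the transformation then runs as a single SQL statement inside the database rather than row by row in an external tool.

```python
import sqlite3

# Hypothetical raw extract: every value arrives as untyped, untidy text.
raw_rows = [("1", " alice ", "10.50"), ("2", "BOB", "20.00"), ("3", "carol", "5.25")]

conn = sqlite3.connect(":memory:")

# Extract + Load: land the raw data as-is in a staging table.
conn.execute("CREATE TABLE stage_orders (id TEXT, customer TEXT, amount TEXT)")
conn.executemany("INSERT INTO stage_orders VALUES (?, ?, ?)", raw_rows)

# Transform: one set-based SQL statement executed by the database engine.
conn.execute(
    """
    CREATE TABLE orders AS
    SELECT CAST(id AS INTEGER)   AS id,
           UPPER(TRIM(customer)) AS customer,
           CAST(amount AS REAL)  AS amount
    FROM stage_orders
    """
)

result = conn.execute("SELECT id, customer, amount FROM orders ORDER BY id").fetchall()
print(result)
```

On a platform that parallelises SQL execution, the same pattern lets the database spread the transformation across its own compute, which is the effect the paragraph above describes.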
Ten or even five years ago we had to explain what we do from scratch, but now the conversations we have with data professionals are changing. The industry is much better educated on the role automation has to play in the development process. Statistics on the leaps in productivity that companies are making catch the eye of IT and business staff at all levels who are looking to do more with fewer resources.
Data Automation isn’t just a tool to plug in and forget about. It’s not just a quick fix for a small problem. It is a conduit for data warehouse modernization and a new way of working altogether. As Will Mealing, Head of Data & Analytics at Legal & General, recently told us, “WhereScape will force you to think differently. Your acceleration is huge once you rework how you historically have done things.”
Data Warehouse Modernization
WhereScape software has been around since the turn of the century and was designed as an automated code generator and integrated development environment to take the repetitive manual work out of 1990s ETL coding methodologies. It enables you to design and build data infrastructures with a drag-and-drop GUI, then writes hundreds of lines of code in seconds and documents everything at the click of a button. WhereScape simply sits on top of your existing infrastructure and writes the SQL code to make it fit together faster and more efficiently, so developers can focus on higher-value work.
Some ambitious companies have been automating their data infrastructure for years and achieving rapid time to value. But only now, 20 years later, are the majority of companies catching up, and DWA is quickly becoming the norm. The days of IT staff sitting in siloed offices, staring at screens all day and on into evenings and weekends, are gone. Today we have stand-up meetings, Scrum sessions and sprints, then let automation do the grunt work because we don’t have to anymore.
Torgil Hellman, Chief Architect at Atea Sweden, talks about life BA (before automation) and AA (after automation). Before WhereScape, he used to spend 90% of his working life at a screen and 10% collaborating with the business; now it is the other way around. He doesn’t have to wake up early when he’s on holiday to check his emails because he trusts the automation tool.
Data Vault Automation
Data Vault has gone from the latest fad to a mainstream modeling style in just a few years. The agility it enables has endless implications for new streams of actionable data, but its complexity needs automation to implement and maintain. Today around half of our new customers are doing Data Vault, and it could be said that this is pulling DWA into the mainstream. Many customers also want to migrate data and architecture onto Cloud platforms for further agility; Aptus Health, for example, operates a Data Vault on Snowflake.
In order for companies to succeed today, the puzzles and firefighting that accompany a continued reliance on hand coding must be removed. Modern data architectures are complex enough without also having to deal with problems arising from outdated 1990s methodologies. Only then can companies focus forward on innovation and agile collaboration to supply reliable data to the business at the right time.
Once lost between categories of data tools that address very specific challenges, end-to-end data warehouse automation tools such as WhereScape are now finally getting recognition from top analysts like Gartner, and we hope to bring many more reports like Henry’s to your attention over the next couple of years.