One of our clients had an interesting data masking requirement: how to mask Production data to comply with GDPR and IT security policies. The data needed to remain human readable, enabling the development and testing teams to create a data feed for a new Client Portal system. However, the core system did not have the ability to mask the data, only to scramble or obfuscate it. The core system was extremely complex, having been built & expanded on over 10 years. It was difficult to understand the system & how data was stored because documentation didn’t exist!
Furthermore, architectural constraints meant there was not enough storage space to hold a second (in-line) database with masked production data.
The more companies we speak to, the more complex or complicated situations we find. From our experience, we’ve found a pattern emerging in the common problems or requirements:
We had a lot of options to solve this problem, but selected Redgate Data Masker and here is why:
We created a simple plan to extract the data, load it into a SQL database and then mask it. Taking only the required data made more efficient use of storage and reduced processing time. This would allow the Client’s development team to export the masked data and transfer it into the Client Portal.
Identifying the data was a difficult manual process because of the
At the end of this exercise, WhereScape 3D produced detailed documents of the core system’s data structure as well as analysis of the data cardinality/profiles. It uncovered some interesting points about the system, including some parts that held personally identifiable data which the client had not known existed.
Using the information within the metadata, WhereScape’s Red imported the physical structure of the system and automated the extraction of data into a SQL database on a scheduled basis. We started off daily, but later increased this to every hour.
Now that the data was at rest in the SQL database, our consultant used Redgate’s Data Masker to convert the personally identifiable data to a masked data set, based on the agreed rules held within the metadata. Once the rules had been designed, WhereScape’s Red scheduler automated the masking so that it started as soon as the loading had completed.
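To illustrate the idea of masking driven by rules held in metadata, here is a minimal Python sketch. It is not Redgate Data Masker’s rule format or API; the column names, the replacement surname list and the function names are all hypothetical. It assumes each column flagged as personally identifiable has one masking rule, and that the output should stay human readable.

```python
import hashlib

# Hypothetical replacement values; a real project would use a much larger list.
FAKE_SURNAMES = ["Archer", "Bennett", "Clarke", "Dawson", "Ellis", "Foster"]


def _bucket(value: str, size: int) -> int:
    """Map a value to a stable index so the same input always masks the same way."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % size


def mask_surname(value: str) -> str:
    """Swap a real surname for a readable stand-in from the list."""
    return FAKE_SURNAMES[_bucket(value, len(FAKE_SURNAMES))]


def mask_email(value: str) -> str:
    """Replace an email address with a readable address on a reserved domain."""
    return f"user{_bucket(value, 1_000_000):06d}@example.com"


# Hypothetical metadata: column name -> masking rule for columns flagged as PII.
MASKING_RULES = {"surname": mask_surname, "email": mask_email}


def mask_row(row: dict) -> dict:
    """Return a copy of the row with PII columns masked and other columns untouched."""
    masked = {}
    for column, value in row.items():
        rule = MASKING_RULES.get(column)
        masked[column] = rule(value) if rule is not None and value is not None else value
    return masked
```

Because the stand-in is derived from a hash of the original value, the same input masks to the same output on every run, which keeps repeated loads consistent. A substitution like this can still return the original value by coincidence, which is one reason to check the result after masking.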
Data processing, including masking and being loaded into the target database, took place within 4 hours (initially). Not too onerous and very timely compared to other options. More importan
Using WhereScape Red, the Engaging Data consultant was able to build a comparison process that utilised the metadata (using only those fields marked as containing personally identifiable data) and compared the values before and after the process.
The process ends with an automatic email of the data masking comparison report. This report contains a summary of field error analysis as well as the number of field errors per record. The latter was used to fail the process & prevent the data from being transferred to the target database. Automating this enabled the Client to feel confident that the process was working correctly.
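The comparison check can be sketched in a few lines of Python. This is an illustration only, not the WhereScape Red process we built; it assumes that rows are dictionaries sharing an `id` key, that a "field error" means a personally identifiable value that is identical before and after masking, and that the function names and the threshold parameter are hypothetical.

```python
from collections import Counter


def compare_masking(before, after, pii_fields, key="id"):
    """Count PII values that came through masking unchanged.

    Returns (errors_per_field, errors_per_record) as Counters.
    """
    after_by_key = {row[key]: row for row in after}
    per_field = Counter()
    per_record = Counter()
    for original in before:
        masked = after_by_key.get(original[key])
        if masked is None:
            continue  # row missing from the masked set; not handled in this sketch
        for field in pii_fields:
            value = original.get(field)
            if value is not None and masked.get(field) == value:
                per_field[field] += 1
                per_record[original[key]] += 1
    return per_field, per_record


def check_or_fail(before, after, pii_fields, max_errors_per_record=0):
    """Raise if any record has more unmasked PII fields than the allowed threshold."""
    per_field, per_record = compare_masking(before, after, pii_fields)
    worst = max(per_record.values(), default=0)
    if worst > max_errors_per_record:
        raise RuntimeError(f"Masking check failed, unchanged fields: {dict(per_field)}")
    return per_field, per_record
```

Raising an error on the per-record count is what stops the run before any transfer to the target database, while the per-field totals are the kind of summary that could go into the emailed report.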
All sorts of tools can be used to mask data. We find the best of them automate the process, allowing you to decide how to mask, when to mask & how frequently to do it.
Instead of putting our logo on the shirt, we chose to support the charity Young Minds, the UK’s leading charity fighting for children and young people’s mental health.
Engaging Data will continue to support the team & its charity Young Minds going into the 2020/21 season!
COVID-19 cut the season short, but we hope the team get back to playing football, safely. Who knows what football will look like next season, at a professional level or grassro
Engaging Data has started to work on Data Vault projects. If you are not aware, Data Vault is a great methodology for data development (DevOps/DataOps). It’s a good fit for how we like to work with clients!
We have seen it work really well with cloud technologies such as Snowflake, but it is very adaptable to most architectures.
The Data Vault user group has set up a free meet up. This meeting will be a great introduction to how Data Vault 2.0 is being used. John Giles, a Data Vault guru, will be making a guest appearance, so you know there will be good content!
See you there!
We are really excited to go into partnership with Redgate. Find out more about them here. Redgate’s software can help from development to environment controls. Moreover, these tools are perfect for any data developer!
Redgate develops tools for developers and data professionals. These products are a natural fit for Engaging Data, providing suitable tools for data development teams and enhancing or streamlining processes. Redgate produces specialised database management tools for Microsoft SQL Server, Oracle, MySQL and Microsoft Azure.
All of these platforms and tools driven by Engaging Data Consultants create engaging data solutions.
Our consultants are already using Redgate software in a project to mask data for a data warehouse. As a result, we will share some of the challenges and how we’ve built robust solutions using Redgate’s tools!
If you would like to know more about the tools we use, or have a question about Redgate’s products, please get in touch!
Output from the development team has doubled, engagement with business teams has increased & Power BI dashboards are being rolled out within days, rather than weeks.
Engaging Data consultants have worked with an existing WhereScape client to kick start their data warehouse project. This involved implementing bespoke development standards and a life cycle, designed for a small development team with a mixture of data and application developers. These standards help the development team cross train & develop code using different technologies, without a large amount of effort.
Furthermore, the time taken to review code & publish to production has been reduced. We are now investigating how to automate the release process!
If you’re interested in our WhereScape Red development standard document, please get in touch.
Our consultants are helping an investment company within the City of London to form their data strategy.
The COO has identified the value that data can bring to the company during a data warehouse proof of concept. The work has started to create good data & produce engaging data solutions to drive better insights for clients.
If you’re interested in learning how our consultants can help you, please get in contact.
Our Director, Simon Meacher, was part of a panel to discuss the importance of data governance. The series of talks will be framed with “what I wish I’d known…”. It’s a clever phrase, and we think it helps to engage with people who would like to know more about the subject.
It was a very well run evening, and we hope everyone who attended was able to take something away.
If you are interested in how these questions were answered, please get in contact with us.
Special thank you to Paul Goldring & the team at Lawrence Harvey for arranging the event.
Simon would be happy to help with the next event!