Yearly Archive 2020

By Marketing

Data Masking

The data masking challenge 

One of our clients had an interesting data masking requirement: how to mask Production data to comply with GDPR and IT security policies. The data needed to remain human readable so that the development and testing teams could create a data feed for a new Client Portal system. However, the core system could not mask the data; it could only scramble or obfuscate it. The core system was extremely complex, built & expanded on over 10 years. It was difficult to understand the system & how it stored data because no documentation existed!

Furthermore, architectural constraints meant there was not enough storage space to hold a second (in-line) database with masked production data.

Is this a common problem?

The more companies we speak to, the more complicated situations we find. In our experience, a pattern is emerging in the common problems and requirements:

  • Old Tech – Ageing trading platforms/core systems or sources of data often don’t have the functionality to mask data. Those that do, or that have extensions/plug-ins to mask the data, often take a long time to process or do not have the flexibility to fit every scenario.
  • Quick turnaround – Near realtime data is nice to have, but not always a real requirement.
  • Specific/varied masking – Different types of masking are needed: obfuscated, scrambled, encrypted or human readable & randomised.
  • Storage – Limitations on storage or infrastructure make it difficult to store an entire copy of production.
  • Cost – Large database providers offer alternative tools with the same effect but also command a very large price tag.
  • Time – Developers can develop hand-cranked specific solutions which take reasonable amounts of time to develop but much longer to test to ensure the solution is working as expected.
  • Doing the right thing – Most clients want to do the right thing to meet regulatory requirements, but see this as a complicated housekeeping chore. Some recognise the risk but choose to ignore it.

Engaging data discovery

We had a lot of options to solve this problem, but selected Redgate Data Masker and here is why:

  • A review of the underlying data structure showed it would be too difficult, costly & time intensive to transfer the data into the Test environment and then apply masking rules.
  • We discovered that it would take 32 to 48 hours to copy the “majority” of the data from Production to UAT environments. This would copy most but not all of the data, with the risk of leaving things behind. On top of that, running the system’s own obfuscation processes would take another 8 hours.
  • Masking, not obscuring. We needed to create human-readable values, e.g. Mr. Smith converts to Mr. Jones. This was not available from the trading platform’s masking function.
  • Defined values. We needed to create predictable values, such as a telephone number or date of birth in a set format.
  • There was a lack of documentation regarding the location of personally identifiable data. This could result in the process missing part of the system if we processed the whole database.
  • We had a requirement to build in a verification process, comparing the masked data against the source. This report would answer the question – “have we missed masking any records?”
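To make the "masking, not obscuring" and "defined values" points concrete, here is a minimal Python sketch of the general idea. It is not how Redgate Data Masker works internally. The replacement pool, the salt and the `07700 900` phone prefix are all assumptions chosen purely for illustration. The same input always maps to the same readable output, so joins between tables would still line up after masking.

```python
import hashlib

# Hypothetical replacement pool; a real project would use a much larger list.
SURNAMES = ["Jones", "Taylor", "Brown", "Wilson", "Evans", "Walker"]


def _digest(value: str, salt: str) -> int:
    """Turn a value into a stable integer using a salted SHA-256 hash."""
    raw = hashlib.sha256((salt + value).encode("utf-8")).digest()
    return int.from_bytes(raw[:8], "big")


def mask_surname(surname: str, salt: str = "demo-salt") -> str:
    """Replace a surname with a different, human-readable one from the pool."""
    # Exclude the original so the result is never identical to the input.
    candidates = [s for s in SURNAMES if s.lower() != surname.lower()]
    return candidates[_digest(surname.lower(), salt) % len(candidates)]


def mask_phone(phone: str, salt: str = "demo-salt") -> str:
    """Return a fake number in a set format: a fixed prefix plus three digits."""
    return f"07700 900{_digest(phone, salt) % 1000:03d}"
```

Calling `mask_surname("Smith")` twice returns the same replacement surname each time, and `mask_phone` always produces a 12-character value in the same format, whatever the format of the input.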

We created a simple plan to extract the data, load it into a SQL database and then mask it. Taking only the required data made more efficient use of storage and reduced processing time. The Client’s development team could then export the masked data and transfer it into the Client Portal.

Choosing the right tool

Identifying the data was a difficult manual process because of the core system’s table/column naming convention. Engaging Data’s Consultant used the WhereScape 3D product, which documented the structure of the system into a metadata layer. The consultant worked with the business teams to update the metadata layer & highlight fields that contained personally identifiable data. In addition, we added business definitions. Using an agile approach, the type of data masking required for each column was agreed, along with how data was joined and stored/reused in different tables. Helpfully, WhereScape 3D provided all the known diagrams and suggested relationships, helping to reduce the investigation time.

At the end of this exercise, WhereScape 3D produced detailed documents of the core system’s data structure as well as analysis of the data cardinality/profiles. It uncovered some interesting points about the system, including some parts that held personally identifiable data that the client had not known existed.

Putting the Data Masking solution together

Using the information within the metadata, WhereScape RED imported the physical structure of the system and automated the extraction of data into a SQL database on a scheduled basis. We started off daily, but later increased this to every hour.

Now that the data was at rest in the SQL database, our consultant used Redgate’s Data Masker to convert the personally identifiable data into a masked data set, based on the agreed rules held within the metadata. Once the rules had been designed, the WhereScape RED scheduler automated the masking so that it started as soon as the loading had completed.

Data processing, including masking and loading into the target database, initially took place within 4 hours. This was not too onerous and very timely compared to other options. More importantly, we later reduced the processing time by a further hour.

Did the data masking work?

Using WhereScape RED, the Engaging Data consultant built a comparison process that used the metadata (only those fields marked as containing personally identifiable data) to compare the values before and after the process.

The process ends with an automatic email of the data masking comparison report. This report contains a summary of field error analysis as well as the number of field errors per record. The latter was used to fail the process & prevent the data from being transferred to the target database. Automating this enabled the Client to feel confident that the process was working correctly.
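The comparison step described above can be sketched in a few lines of Python. This is an illustrative assumption of how such a check might look, not the process built in WhereScape RED. The record layout, the `id` key and the function and class names are hypothetical. A "field error" here means a field flagged as personally identifiable whose value is unchanged after masking.

```python
from dataclasses import dataclass


@dataclass
class ComparisonReport:
    errors_per_field: dict   # field name -> number of records left unmasked
    errors_per_record: dict  # record key -> number of unmasked fields

    @property
    def passed(self) -> bool:
        """The run passes only if no record has any unmasked field."""
        return not self.errors_per_record


def compare_masked(source, masked, pii_fields, key="id"):
    """Compare source and masked rows on the fields flagged as PII."""
    masked_by_key = {row[key]: row for row in masked}
    errors_per_field = {field: 0 for field in pii_fields}
    errors_per_record = {}
    for src in source:
        out = masked_by_key.get(src[key])
        if out is None:
            continue  # record was not carried over, so there is nothing to leak
        for field in pii_fields:
            value = src.get(field)
            # Empty values carry no personal data, so they are not counted.
            if value not in (None, "") and out.get(field) == value:
                errors_per_field[field] += 1
                errors_per_record[src[key]] = errors_per_record.get(src[key], 0) + 1
    return ComparisonReport(errors_per_field, errors_per_record)
```

A scheduler could then stop the transfer to the target database whenever `report.passed` is false, and email the two dictionaries as the summary and per-record sections of the report.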

In conclusion

All sorts of tools can be used to mask data. We find the best of them automate the process, allowing you to decide how to mask, when to mask & how frequently to do it.

If you would like to learn more about Redgate’s Data Masker, WhereScape RED or how we can help with your data project, please feel free to contact office@engagingdata.co.uk


Would you like to know more?

Engaging Data Telephone

Call us on

+44 (0)203 488 4774

Engaging Data Question

Send a message

Here

By Simon Meacher

Supporting Girls Football

In the 2019/20 season we sponsored Pace Youth’s U16 Bobcats girls football team in Southampton, who play in the Hampshire Girls Football League.

Instead of putting our logo on the shirt, we chose to support Young Minds, the UK’s leading charity fighting for children and young people’s mental health.

Engaging Data will continue to support the team & its charity Young Minds going into the 2020/21 season!

Young Minds is a great place for parents and children to find support. If you are able to, please support this fantastic charity: Donate Here.

COVID-19 cut the season short, but we hope the team get back to playing football, safely. Who knows what football will look like next season, at a professional level or grassroots. We can’t wait to hear how the team get on, good luck Bobcats!

By Carl Richards

Agile data warehouse development

One of the most common problems we are approached to resolve is agile data warehouse development. Clients understand the importance of data and of building the right data platform to service their business’s needs. Delivering a new platform, or amending an existing one, at pace is one of the keys to providing true flexibility and enabling change within organisations.

Engaging Data has built data solutions using our three pillars: people, process and technology. With highly skilled development and consultancy teams, we provide real added value to our clients.

Technology – Picking the right tool for the job

Quick and nimble!

Fantastic partners like WhereScape enable us to build data platforms in days. Imagine: within the first few days of starting development, the business will have a data platform that they can start interrogating. WhereScape RED allows you to continue developing and amending your data platform without any complications. RED offers several data platform solutions, allowing you to choose the right solution for your business.

If you are planning a project to create or alter a data warehouse, data lake, data vault or a combination of all of these, allow us to show you the power of data automation using WhereScape.

Do not just take our word for it; have a look at what our clients are saying.

People

Having the right team

Our services will help with all aspects of data solutions, from the development of a data platform, data model design and visualisation to day-to-day solution support.

Our design process uses the BEAM (Business Event Analysis & Modelling) methodology. This approach alongside WhereScape 3D provides a quick and effective data platform from the source system to report or visualisation.

Process

Having the right process for your company

Our support services offer a wide range of support levels that give comfort and peace of mind at a cost-effective price.

Achieve the goals you have set out for your data project with our experienced team who can help with your data strategy, governance and SDLC (Software Development Life Cycle) processes.

Training

Keep everyone up to date with the latest thinking and methodologies – don’t be afraid to change or try new ideas!

We provide training and support for your data platform & can arrange bespoke training courses for all WhereScape products. In our experience, learning a new tool using familiar data makes the topic more digestible.

If you would like to read more about WhereScape RED and 3D, please click here.

Looking for help with geo-relational or mapping data? Read more about our partnership with the Ordnance Survey and the products they offer.


By Simon Meacher

Data Vault 2.0 Meet Up

Engaging Data has started to work on Data Vault projects. If you are not aware, Data Vault is a great methodology for data development (DevOps/DataOps). It’s a good fit for how we like to work with clients!

We have seen it work really well with cloud technologies such as Snowflake, but it is very adaptable to most architectures. 

You would not believe the speed at which we can deliver projects using Data Vault and WhereScape! We talk in minutes and hours, not days and months! Don’t believe it? Get in contact for a demo.

What can you expect?

The Data Vault user group has set up a free meet up. This meeting will be a great introduction to how Data Vault 2.0 is being used. John Giles, a Data Vault guru, will be making a guest appearance, so you know there will be good content!

See you there!

By Simon Meacher

Redgate Partnership!

Using the right tools for the data job.

We are really excited to go into partnership with Redgate. Find out more about them here. Redgate’s software can help from development to environment controls. Moreover, these tools are perfect for any data developer!

Redgate develops tools for developers and data professionals. These products are a natural fit for Engaging Data, providing suitable tools for data development teams and enhancing or streamlining processes. Redgate produces specialised database management tools for Microsoft SQL Server, Oracle, MySQL and Microsoft Azure.

All of these platforms and tools, driven by Engaging Data consultants, create engaging data solutions.

Our consultants are already using Redgate software in a project to mask data for a data warehouse. As a result, we will share some of the challenges and how we’ve built robust solutions using Redgate’s tools!

If you would like to know more about the tools we use, or have a question about Redgate’s products, please get in touch!

