How Jenkins Takes WhereScape to Another Level


Data warehousing has grown in importance and popularity as the global market for analytical systems continues to increase. The global market for data warehousing is expected to reach $30 billion by 2025, based on annual growth of around 12%, and in one survey 76% of IT managers and executives said they are investing more in their analytics platforms.

As more businesses adopt data warehouses, efficiency savings and improvements are expected to follow. Data automation is a concept that will benefit many companies, but it’s still important to choose the best solution.


Game-Changing Solution

That’s why using Jenkins to deploy WhereScape solutions is a game-changer. This integration tool, used with WhereScape data warehouse automation software, is rocket fuel for an already powerful package.

With Jenkins, it’s possible for developers to build and test software projects continuously, thanks to actions built into the tool. This makes it easier for developers to integrate changes to any project, increasing flexibility in working practices. This can be hugely advantageous in the fast-moving contemporary data climate.

And this is just the beginning of the integration offered by Jenkins. The tool can also integrate with other apps and software solutions by installing plugins for the external tool – examples include Git and PowerShell. There are over 1,000 plugins available for Jenkins, meaning that the platform supports the building and testing of virtually any WhereScape project.
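As a minimal sketch of what this looks like in practice, the script below is the kind of build-and-test step a Jenkins job might run via the PowerShell plugin. The script name, the paths, and the stand-in "test" are all hypothetical placeholders, not part of any Jenkins or WhereScape API:

```powershell
# build-and-test.ps1 - a hypothetical build step for a Jenkins job
# using the PowerShell plugin. All paths and checks are placeholders.
param(
    [string]$SourceDir = "$env:WORKSPACE\src",    # WORKSPACE is set by Jenkins
    [string]$OutputDir = "$env:WORKSPACE\build"
)

New-Item -ItemType Directory -Path $OutputDir -Force | Out-Null

# 'Build' step: package the project sources (placeholder action)
Write-Host "Building from $SourceDir..."
Copy-Item -Path "$SourceDir\*" -Destination $OutputDir -Recurse -Force

# 'Test' step: a stand-in check; a non-zero exit code fails the Jenkins build
if (-not (Test-Path "$OutputDir\deploy.xml")) {
    Write-Error "Expected deployment artefact not found"
    exit 1
}

Write-Host "Build and tests succeeded"
exit 0
```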

Low-Maintenance Software

Another key advantage of Jenkins is its low-maintenance nature. The tool requires very little attention once installed, and when updates are required, a built-in GUI makes the process as painless as possible.

Yet while it offers an ideal platform, Jenkins also benefits from continual improvement, thanks to its open-source nature. There is already an enthusiastic community contributing to the tweaking and evolution of the software, and this is expected to grow further still in the years to come.

Jenkins is a shining example of continuous integration, delivery and deployment, often referred to as CI/CD. Applied to data warehousing, this approach means that code changes that translate into real-world improvements can be made more frequently and reliably, thanks to the automation of deployment steps.

Easy Plug-Ins

The process for plugging Jenkins into WhereScape begins with downloading the Java SE Development Kit, at which point you will also need to add JAVA_HOME to your environment variables. That is the only technical part; you then simply download Jenkins using the installer and follow the on-screen instructions. Before you can use the software, you will need to create an admin username and password. Then you’re ready to go!
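For the JAVA_HOME step, one quick way to set the variable machine-wide is from an elevated PowerShell prompt (the JDK path below is an example; point it at wherever your JDK actually installed):

```powershell
# Set JAVA_HOME for all users (requires an elevated prompt).
# The JDK path is an example - adjust it to your installed JDK.
[Environment]::SetEnvironmentVariable(
    'JAVA_HOME',
    'C:\Program Files\Java\jdk-17',
    'Machine'
)

# Verify the value (new sessions will pick it up automatically)
[Environment]::GetEnvironmentVariable('JAVA_HOME', 'Machine')
```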

Among the palette of useful features included in the software is a list view of open projects, which provides an instantaneous insight into the status of everything important that you’re dealing with. This is the sort of feature that has ensured that, as well as being powerful and flexible, Jenkins has also earned kudos in the data warehousing world for being user-friendly.

Jenkins incorporates a user interface that is simple to pick up and navigate. There is a vast range of online tutorials available, while the active community that contributes to the tool is always on hand to offer assistance.

Configure and Customise

Another important aspect of Jenkins is the scope of configuration and customisation that it makes possible. Users can be managed by creating groups and roles, all handled elegantly via some straightforward menu prompts. Jobs can also be configured; for example, the tool enables them to be executed at timed intervals.

Every aspect of the Jenkins software has been cultivated to ensure maximum functionality with minimum effort, while still enabling users to customise and monitor everything extensively at all stages of the process. You can even set up automatic email notifications, ensuring that everyone involved with a data warehousing project is kept in the loop.

At a time when the amount of information that companies deal with is escalating rapidly, data warehousing is becoming increasingly important. It’s simply not possible to ignore big data any longer; doing so is tantamount to being left behind by your competitors. Jenkins and WhereScape together form an elegant data warehousing solution that has helped a multitude of businesses get to grips with their data warehousing requirements, without investing a huge amount of effort in training, onboarding, or hiring experts.

WhereScape was already a market leader in its field, but with the advent of CI/CD tools such as Jenkins, this top solution just became even more compelling.

Self-Service BI


Self-service business intelligence is making a huge difference to companies across a variety of sectors, by helping to optimise data analysis.

However, some businesses perceive that implementing a business intelligence approach can be challenging, due to perceived barriers to entry. This is something of a false impression, but it is why business intelligence tools should make data simple to explore and questions quick to answer.


Dealing with Data Management

Many traditional data teams build data pipelines to handle their data management procedures. But the approach is not always ideal. It’s common for data engineers and analysts to explore data and build solutions with specific results in mind, which is rather putting the cart before the horse. More logical would be to make the data generally available and appropriately linked. This enables end-users to explore the data more readily, drawing more nuanced conclusions from the assembled information.

Truly valuable business intelligence should always be an end-to-end iterative process. Business teams need relevant, timely data in order to uncover accurate insights. Deploying platforms to achieve this is merely one crucial component of business intelligence. 

As you develop your business intelligence approach, it’s important to understand that self-service business intelligence doesn’t require everyone in an organisation to train as business analysts. Nor should it mean removing responsibility from your IT department. Instead, it’s about encouraging and educating your business teams to understand and interact with the data they generate throughout their work – creating a joined-up process across your organisation.

Using Power BI

One of the best ways to achieve this is via Power BI. This cloud package has been on a long journey since it was first released, but it is now an ideal app for ensuring that companies can engage with data safely. Probably its most valuable aspect is the incredible range of options and guided analysis it provides, making your business intelligence process much more flexible.

Did you know you can achieve 90% of this using Power BI Report Server, the on-premises version of Power BI?

Using Power BI Report Server prompted an excellent reaction from end-users, who were excited at their ability to slice data and retrieve answers to client queries rapidly. This is one of the big advantages of business intelligence; it enables companies to get to the core of what drives customer demand quickly. You are accelerating the ability of clients to use data to solve critical business problems.

Dealing with self-service business intelligence can be intimidating, though. This is particularly true if an organisation needs to achieve it without using the cloud, or without spending a significant amount of money on new software. Historically, this form of business intelligence has posed problems with identifying required data, such as reports with little or no description or business knowledge behind their purpose. For this reason, self-service business intelligence has been known to reduce IT departments to cold sweats!

In early experiences of self-service business intelligence, there was often no gold standard for design. Imagine using SSRS with hundreds of folders and sub-folders, dozens of data connections, and thousands of generated reports – and every report in a different format. Understanding how each report was being used was nigh-on impossible, and general data analysis was far from effective.

Range of Options

But things are changing. New self-service business intelligence solutions offer clients a range of options and enable data to be provided safely and in a controlled format. Several approaches are available to companies – decentralised, centralised, and hybrid – and the choice will depend on the make-up of the IT team, with the required level of governance and control having a major influence.

The great thing about Power BI desktop is that it is an adaptable tool, enabling users to get started with data transformation quickly. Most experienced Excel users can quickly get to grips with this innovation, meaning that it can be easily and widely implemented across an organisation.

The tool itself allows IT to extract the data transformation and, if required, reverse engineer it back into a data warehouse – providing business users with a tool that acts like a Rosetta Stone between business and IT!

It’s important to emphasise, though, that regardless of the technology you always need to understand who you are catering for with data. With this in mind, there are several ways that you can delegate the permissions, rules, and responsibilities with Power BI, as this adds to the flexibility of the platform. Power users are also important, as these credentialed individuals utilise their experience with Excel to produce reports for an organisation.

As you develop your self-service business intelligence strategy, it is vital to implement proper governance. This will help you to avoid creating data silos, data sprawl, poor performance, and lax security. Unless you implement appropriate governance, business users will have unrestricted access to source systems and Power BI folders. This can lead to inappropriate sharing of data.
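As one hedged illustration of scripting that governance on Power BI Report Server, Microsoft’s open-source ReportingServicesTools PowerShell module can grant folder-level roles to AD groups instead of leaving folders open. The server URL, folder, and group name below are placeholders:

```powershell
# Requires the open-source ReportingServicesTools module:
#   Install-Module -Name ReportingServicesTools
Import-Module ReportingServicesTools

$portal = 'http://reports.example.local/ReportServer'   # placeholder server

# Grant a finance AD group read-only (Browser) access to its own folder,
# rather than leaving report folders open to every business user.
Grant-RsCatalogItemRole -ReportServerUri $portal `
    -Identity 'EXAMPLE\Finance-Readers' `
    -RoleName 'Browser' `
    -Path '/Finance'
```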

Key Factors

It’s also important to consider the following factors:

  • Service size and data storage – you do not have unlimited resources with Power BI Report Server, and you could therefore experience larger datasets consuming a significant portion of resources, impacting the performance of the entire service.
  • Risks to source systems – allowing users to connect with any source system or raw files can create problems.
  • Access and permissions – security must always be taken into consideration. Failing to pay proper attention to access and permissions can result in numerous ad-hoc groups being created, which can then be problematic. Controlling groups, and tweaking which users can be added into each AD group, is definitely advisable.
  • Many versions of the truth – if you have several different people involved in creating the same report, you’re likely to get numerous different answers. This is why it is important to populate a data warehouse, effectively creating a single source of universal truth.
  • Reduced audit and tracking – if end-users fail to provide adequate details or purposeful dashboards, the ‘who, what & why’ regarding the purpose of the report is lost, undermining the whole process.

Summary

Implementing Self-Service Business Intelligence has become far more feasible, but it’s still important to impose some control over how your services are being utilised, in order to generate the maximum and most accurate insight. Power BI Report Server can be an excellent tool in enhancing business intelligence, and definitely one we recommend for clients who are reliant on data.


If you are interested in Self-Service BI, book a call with us and find out how it could help your team.



High-Level Design Documentation


Have you ever needed to create a high-level document of your data automation that explains a project or sprint within your WhereScape RED repository? Maybe something like the sample described below?

We recently worked with a client who wanted to represent the amount of change undertaken within a single project. They required something simple that still demonstrated the amount of change within each layer of the data automation.

Instead of creating something new, we re-used the WhereScape RED overview design that WhereScape itself uses to illustrate the architecture.

A sample data solution shaped into the high-level design.

Engaging Data consultants worked with the client to create a solid naming convention and development standards. With this foundation and the metadata repository, we developed a script that produced an HTML document containing the details the client was looking for (a sketch of the approach appears after the list below).

The idea continued to develop and now has options to include the following details:

  • Number of jobs, procedures, and host scripts that support each layer
  • Data volume per object, summarised per layer
  • Processing time for each layer, with average completion time and average run time
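As a hedged sketch of how such a script might work, the following PowerShell queries the RED metadata repository and renders a simple HTML summary. The table and column names follow typical WhereScape RED metadata conventions but should be verified against your repository version; the server and database names are placeholders:

```powershell
# A sketch, not production code: ws_obj_object / ws_obj_type follow the
# usual RED metadata naming but must be checked against your repository.
Import-Module SqlServer   # provides Invoke-Sqlcmd

$server   = 'SQLSERVER01'      # placeholder server
$database = 'RED_Metadata'     # placeholder repository database

# Count objects per type (e.g. load, stage, data store, dimension, fact)
$query = @"
SELECT ot.ot_description AS ObjectType,
       COUNT(*)          AS ObjectCount
FROM   dbo.ws_obj_object o
JOIN   dbo.ws_obj_type   ot ON ot.ot_type_key = o.oo_type_key
GROUP  BY ot.ot_description
ORDER  BY ObjectCount DESC
"@

$rows = Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $query

# Render a simple HTML overview document from the metadata
$rows | ConvertTo-Html -Title 'High-Level Design Overview' `
        -PreContent '<h1>Objects per Layer</h1>' |
    Out-File -FilePath '.\HighLevelDesign.html'
```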

WhereScape RED and 3D speed up the development and documentation of the technical design. This solution utilises the metadata to create support or narrative documents for other business areas.

Build re-usable scripts, dashboards or reports for non-technical business teams and provide clarity around the technical function of your automation.


If you are interested in receiving a copy of the script that produced this report, please email simon.meacher@engagingdata.co.uk

Data Warehousing Concepts Explained


Modern commerce is an environment in which companies are increasingly being required to make complex, data-backed decisions. Dealing with vast amounts of information has become an essential feature of a business, which can often lead to siloed data. This is difficult enough to store, let alone analyse or understand.

In many cases, business demands require a more sophisticated system, improving data management and providing a holistic overview of essential aspects of the company. One of the best ways to achieve this is to invest in a data warehouse. Yet many companies are still unaware of what this entails or how it can help their business.


What is Data Warehousing?

In simple terms, a data warehouse is a system that helps an organisation aggregate data from multiple sources. Instead of experiencing the sort of separation and siloing discussed previously, data warehousing makes it possible to draw together information from disparate sources. It’s almost akin to a universal translator of languages. Typically, data warehouses store vast amounts of historical data, and this can then be utilised by engineers and business analysts as required.

Data warehousing is particularly valuable as it essentially provides joined-up information to a company or organisation. This was quite impossible until relatively recently, as data has always been held in separate sources of information. Transactional systems, relational databases, and operational databases are often held entirely separately, and it was almost unthinkable until recently that the data from these systems could be effectively combined.

But in this Information Age, companies are seeking a competitive advantage via the leveraging of information. By combining the vast amount of data generated together into one source, businesses can better understand and analyse key customer indicators, giving them a real insight into the determining factors of the company. Data warehousing can build more robust information systems from which businesses can make superior predictions and better business decisions.

In recent years, the escalation and popularisation of the cloud has changed the potential of data warehousing. Historically, it was more usual to have an on-premises solution, designed and maintained by a company at its own physical location. But this is no longer necessary. Cloud data architecture makes it possible to run a data warehouse without your own hardware, while the cloud structure also makes implementation and scaling more feasible.

Data Lakes

However, those who are uninitiated in deep data topics may encounter terminology that can be somewhat baffling! The concept of a data lake seems rather surreal and tends to conjure up imagery that is, ultimately, completely useless! Inevitably, people who have never encountered the concept of data lakes before find themselves imagining an expanse of azure water glittering in the sunlight. Well, data lakes aren’t quite like that.

A data lake is used for storing any raw data that does not currently have an intended use case. It can be seen as similar to the wine lakes that used to be in the news quite regularly, although those don’t seem to be a talking point any longer! You can equally view a data lake as a surplus of information; it is data that may become useful in the future but has no immediate use at this point in time. Thus, it is stored away in a lake until it can be consumed adequately.

This differs from data warehousing, which is used to deal more efficiently with information that is known to be useful. A data warehouse may hold data stored in an impenetrable format, but there is a clear use case for understanding this information, or it needs to be retained for a particular reason.

When to use a Data Warehouse

There are a variety of reasons that a company or organisation would choose to utilise a data warehouse. The most obvious would be as follows:

  • If you need to store a large amount of historical data in a central location.
  • If you need to analyse your web, mobile, CRM, and other applications together in a single place.
  • If you need more profound business analysis than has been possible with traditional analytic tools – by querying and analysing data directly with SQL, for example.
  • If you need to allow simultaneous access to a dataset for multiple people or groups.

Data warehousing makes it possible to address a set of analytical questions that would be impossible with traditional data analytics tools. Collecting all of your data into one location makes it possible to run queries that would otherwise be completely unfeasible. Instead of asking an analytical program to continually run back and forth between several locations, the software can get to grips with one data source and deliver efficient, more holistic results.

Data Warehouse Factors

Many businesses now require data warehousing services to deal with the vast amount of data being generated. And that ‘many businesses’ will rapidly become ‘most businesses’, and then ‘virtually all businesses’ in the near future. But those inexperienced in this field are often confused about which factors to take into consideration.

Thus, we would recommend looking at these six key elements when considering warehousing:

  • The sheer scale of data that you wish to store.
  • The type of information that you need to store in the warehouse.
  • The dynamic nature of your scaling requirements.
  • How fast you require any queries to be carried out.
  • Whether manual or automatic maintenance is required.
  • Any compatibility issues with your existing system.

Concerning the type of information you need to store, data can differ considerably in its basic structure. Some data may be highly complex yet still quantifiable and easily organised. However, in the era of Big Data there is a vast amount of unstructured data, which cannot be easily managed and analysed. Companies that generate a vast amount of unstructured data and need to collate and understand it are certainly excellent candidates for a data warehousing solution.

There is a lot to learn when it comes to the subject of data, and it can frankly be a little daunting at times. But what is certain is that this topic isn’t going anywhere. Big Data is here to stay. That’s why we have created our Data Vault 2.0 solution, which can ideally serve your organisation’s data needs at a time when this is becoming an issue of paramount importance.

Automated Deployment


Continuous Delivery using Octopus Deploy

With WhereScape RED

Many companies are looking to make code changes and deployments easier. Often the ability to deploy code to production is surrounded by red tape and audit controls. If you don’t have this, count yourself lucky!

Jenkins and Octopus Deploy are two of several tools helping to automate the deployment of code to production, allowing companies to adopt a continuous deployment/delivery approach.

For a long time, WhereScape RED has had its own method of automating deployment, using the command line functions.

Why Automate?

Tools such as WhereScape RED allow elements of deployment automation; however, companies like to use a common toolset for their code deployments – a single picture of all deployments – and in most cases they want to release code to multiple platforms, because RED doesn’t do everything.

Git?

No problem! There are several ways to do this. Our preferred option is to push the deployment application to the code store repository. After all, it is more practical to store the changes you want to push to production, and not every change to every object, including those that are not meant for production!

Can I do This Now?

WhereScape RED uses a command prompt file to send commands to the admin EXE. All changes are applied to the destination database (via ODBC). Installation settings and config are set using XML, and a log file is created as part of the process. The XML file contains the DSN of the destination database – let’s come back to this point later. The XML contains all of the settings that are applied when deploying the application, such as whether to alter or recreate a job. Please make sure you have this set correctly: you do not want to re-create a Data Store table and lose the history!
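As an illustrative sketch only – the executable name, switches, and XML schema vary between RED versions, so treat every value below as a placeholder and check your version’s command-line reference – a PowerShell wrapper around this command-line deployment might look like:

```powershell
# Illustrative only: paths, switches and file names are placeholders,
# not documented WhereScape RED arguments.
$admExe    = 'C:\Program Files\WhereScape\RED\adm.exe'   # placeholder path
$deployApp = 'D:\Deploy\ProjectRelease_1.2'              # deployment application
$settings  = 'D:\Deploy\deploy_settings.xml'             # DSN + alter/recreate options
$logFile   = 'D:\Deploy\deploy.log'

# Run the deployment and wait for it to finish
$proc = Start-Process -FilePath $admExe `
    -ArgumentList "/deploy `"$deployApp`" /settings `"$settings`" /log `"$logFile`"" `
    -Wait -PassThru

# Fail the pipeline step if the exit code or the log indicates a problem
if ($proc.ExitCode -ne 0 -or (Select-String -Path $logFile -Pattern 'ERROR' -Quiet)) {
    throw "WhereScape deployment failed - see $logFile"
}
```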

Permissions are important. The key to running the command line to publish changes to production is that the service account executing the commands has permissions to change the underlying database.

Integration with Octopus

Octopus Deploy uses PowerShell as its common execution language, so we adapted all of our WhereScape BAT files to PowerShell in order to get everything working.

Building a list of repeatable tasks within Octopus is easy and provides an opportunity to create a standard release process that meets your company’s standards and processes – tasks like database backups, metadata backups, and much, much more!

It can even run test scripts!

We used a PowerShell script to create a full backup of the database, to be used should the deployment fail. With a larger database, this may not always be the best solution; depending on your environment setup, you may have options to use OS snapshots or other methods to roll back the changes. The good news is that Octopus Deploy works with most technology, so you should find something that works for your technology stack.
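For reference, a minimal sketch of such a pre-deployment backup step (the server, database, and backup path are placeholders):

```powershell
# Minimal sketch of a pre-deployment full backup step.
# Server, database and backup path are placeholders.
Import-Module SqlServer   # provides Invoke-Sqlcmd

$server   = 'SQLSERVER01'
$database = 'DataWarehouse'
$backup   = "E:\Backups\$database-predeploy-$(Get-Date -Format yyyyMMddHHmmss).bak"

Invoke-Sqlcmd -ServerInstance $server -Query @"
BACKUP DATABASE [$database]
TO DISK = N'$backup'
WITH INIT, COMPRESSION, CHECKSUM;
"@ -QueryTimeout 65535   # large backups can exceed the default timeout

Write-Host "Backup written to $backup"
```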

Recently, we’ve been playing with creating rollback WhereScape applications on the target data warehouse. This is great for restoring the structure of the objects quickly and easily. Reducing risk is a great value add!

Go, Go, Go!

Triggering the deployment was easy. We could have set this up in many ways, but we used a “do the application files exist?” trigger to get things started – until the humans learned to trust the Octopus process.
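A hedged sketch of that kind of file-based trigger, using the Octopus CLI’s create-release command (the drop folder, file filter, server URL, and project name are all placeholders):

```powershell
# Sketch of a simple file-based trigger: when new deployment application
# files appear, create an Octopus release. All values are placeholders.
$dropFolder = '\\fileserver\wherescape\deploy-apps'
$octoServer = 'https://octopus.example.local'
$apiKey     = $env:OCTOPUS_API_KEY        # never hard-code API keys

# Filter is a placeholder - match your deployment application file extension
if (Get-ChildItem -Path $dropFolder -Filter '*.xml' -ErrorAction SilentlyContinue) {
    # octo.exe is the Octopus Deploy command-line tool
    & octo create-release `
        --project 'Data Warehouse' `
        --server  $octoServer `
        --apiKey  $apiKey
}
```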

However, linking the release to Jira is just as simple. Imagine: you’ve completed development and want to send the code to UAT. You click the button to update the ticket… wait a few seconds… and the code is deployed! It’s a little more complicated to set up than that, but you get the idea.

Final Thoughts

Octopus is a great tool, and the automation really helps to control the process of deployments. Coupled with WhereScape automation, this provides an excellent end-to-end solution for data warehousing.


If you are interested in CI/CD and WhereScape RED/3D, book a call with us and find out how it could help your team.



Documenting the Modern Day Data Warehouse


When you’re operating a modern-day data warehouse, documentation is simply part of the job. But it’s not necessarily the easiest or most logistically straightforward part of the process, despite its importance. Documentation is, in fact, invaluable to the continued development, expansion, and enhancement of a data warehouse. It’s therefore important to understand everything that adequate documentation entails, in order to ensure that your data warehouse processes run smoothly.

Understanding your Audience

One of the first things to understand is who you are compiling the documentation for. Support staff, developers, data visualisation experts, and business users could all be possible recipients. Before you can answer this question, you really need to fully understand the way that your organisation operates, and open the lines of communication with the appropriate departments.

A two-way dialogue will be productive in this ongoing process, and will help ensure that you keep the documents in line with the design. This is vitally important, as any conflicts here can render the whole process less constructive than is ideal.

And it’s especially vital considering how fast documentation moves nowadays. Everything has gone online and become wiki-based. Whether it’s Confluence, SharePoint, or Teams, all sorts of wiki documents are being produced by businesses with the intention of sharing important information. These shareable documents are updated with increasing regularity, meaning it is important to get your strategy in place before beginning.

Different approaches to data warehouse design can also affect how long a document stays live before being updated. If you are lucky enough to make weekly changes to your data warehouse, you will be making incremental changes to the documentation itself. Development teams can end up spending hours updating documentation rather than doing what they are good at: developing data solutions! Naturally, minimising this where possible is always preferable.

Self-Service Business Intelligence

Documentation is also crucial in self-service business intelligence. Integrating private and local data into existing reports, analyses or data models requires accurate documentation. Data in this area can be drawn from Excel documents, flat files, or a variety of external sources.

By creating self-service functionality, business users can quickly integrate data into what can often be vital reports. Local data can even be used to extend the information delivered by data warehousing, which will limit the workload that is inevitably incumbent on data management. The pressure on business intelligence can be quite intense, so anything that lessens the load is certainly to be welcomed.

Another important aspect of documentation is that it reduces the number of questions that are typically directed at the IT and data warehousing teams. One thing anyone who works in IT knows only too intimately is the vast amount of pressure that can be heaped upon them by both internal and external enquiries. Again, anything that reduces this will certainly be favourable.

The data warehouse team also has huge responsibility within any organisation. They are required to produce a vast amount of information for front-end business users, and getting documentation right can certainly assist with this process.

Importance of Transparency

One important aspect of documentation that can sometimes be overlooked is transparency. This works at every level of an organisation, with sharing everything related to documentation being absolutely vital. Once this level of transparency is implemented, people who understand the data deeply can improve the documentation, or suggest changes to the Extract, Transform, and Load (ETL) or Extract, Load, and Transform (ELT) processes, if this is indeed deemed necessary.

Conversely, it’s also important to understand that not all technology is suitable for documentation. As much as businesses and organisations would love this process to be completely holistic, this is not always possible.

Thus, packages such as Power BI, QlikView and Qlik Sense, and even Microsoft’s trusty Excel, are not necessarily ready to be documented. These software packages can use data, but often lack the ability to provide a document set that explains how the data is being used, and for what purpose. Recently, Power BI has taken steps to ensure that the app can help with data lineage, but this remains better suited to IT teams than to business users.

Attempting to document data across multiple technologies is tricky, but Wikis can provide IT teams with the ability to collate all of this information into a central hub of knowledge, making access much more logistically convenient.

Conclusion

Ultimately, IT departments, data warehousing teams, and report developers should all be encouraged to produce documentation that contributes to the overall aims of their organisations. Anything excessively technical is not good enough for modern business requirements, especially considering the importance of communication, and of ensuring that everyone within an organisation is acquainted with as much vital data as possible.

Modern-day technology makes this goal a reality, and this means that it is increasingly an expectation of end-users. Failing to prepare properly in this area could indeed mean preparing to fail, as organisations will simply have failed to meet the compelling desires of the market. It is therefore vital for documentation to be dealt with diligently.

Getting this piece right will go a long way to helping with data governance!


If you would like to know more about how Engaging Data helps companies to automate documentation, please contact us via the details below.