How the Cloud is Changing the Role of Metadata in Industrial Intelligence

When it comes to industrial machines, context looms large, especially when the difference in data signatures across operating contexts, original equipment manufacturers (OEMs), maintenance histories, and productivity thresholds reveals whether one asset needs inspection and another is performing optimally.

To borrow a pet analogy, it is the difference between knowing that a temperature of 102 degrees Fahrenheit is perfectly normal for a puppy and knowing that the same reading means a person has a fever.

Context matters, and it is why process-intensive operations in industries like chemicals, oil and gas, mining and metals, manufacturing, and energy and utilities are facing something like the puppy-or-person question at scale. Metadata is taking on an increasingly important role in industrial intelligence.

Issues with valuable industrial equipment factor into asset and personnel utilization decisions. Competitive pressures demand that companies make smarter decisions in every function of their business. And it’s not just the balance sheet that shows the importance of data-backed decisions.

With environmental, social, and corporate governance (ESG) concerns top-of-mind, intelligence for decision-makers at each echelon of the organization begins with a precise understanding of the metadata — the context. For different stakeholders with various decisions to make, context is (excuse us) contextual.

Right now, though, many companies have trouble seeing that context in existing datasets. Much of that difficulty owes to the original design of operational technology (OT) systems like supervisory control and data acquisition (SCADA) systems or data historians.

These on-premise collection systems were made for plant staff with a deep familiarity with asset performance and maintenance. They did not place a premium on context because their users already knew how equipment operated. But preserving context, especially for those without a working knowledge of industrial assets, can help teams uncover valuable use cases in asset planning, monitoring, and reporting. It also opens up the possibility of multi-purpose solutions built from a single source of industrial data for decision-makers.

More Power Users of OT Data Wanted

Industrial asset expertise is hard to come by and expensive to develop, but it is critical as organizations look to share and delegate decision-making responsibilities to a growing set of data consumers and stakeholders. It takes equipment and site expertise to put datasets and industrial assets in their proper context, and process-intensive operations are counting on the people who have it, working through legacy OT systems, to extract information and make it useful for the rest of the organization.

As a large part of the workforce across heavy industries prepares to retire within the next few years, that dependence will become unworkable. Industrial organizations need a better way.

But since on-premise collection systems require more hardware and licenses to process data for more users, many companies must make decisions about who has access to data. These licensing restrictions make it difficult for companies to train various departments in the management of data. They also keep organizations from taking advantage of the breadth of industrial applications available to them once their data is in the cloud. The payment model of traditional on-premise systems now jeopardizes that progress by favoring a narrow user base of OT data.

Even when a business unit or organization clears those high-cost barriers to an enterprise data lake strategy, business analysts, data analysts, and executives face another, similarly costly challenge: metadata loss. And for engineering and operations leaders overseeing the performance of individual sites or facilities, metadata and data quality are critical for building out best practices.

Shared access to datasets remains expensive, in spite of organizational ownership of the data. A new form of metadata management is in order: one that scales and adapts to data consumption needs as the priorities of industrial intelligence change. OT systems must cater to the growing appetite for smarter business decisions from all corners of the modern industrial organization: management, engineers, technicians, data scientists, IT, and business analysts.

The Evolving Role of OT Systems

The modern data historian emerged in the 1980s as a response to higher data intake. Industrial equipment, outfitted with better instrumentation and streaming higher and higher volumes of data, outmatched existing record-keeping. Earlier archival methods, from individual spreadsheets to records kept (and sometimes still) with paper and pen, could not keep up with the volume and velocity of the newly available data. OT engineers adopted the on-premise historian as an archive of single-site information.

The new systems were a gold mine of information. OT environments, in addition to collecting asset-produced information, stamped each data point with descriptive information that distinguished that figure among other strands of data for each tagged asset. Metadata included attributes like pressure, volume, location, and temperature. Together, they placed industrial asset data into perspective. It was this data about data that enriched the value of a dataset with greater granularity, more accurately reflecting the entire operating context of an asset.
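As a rough illustration of what that tagging looks like in practice, here is a minimal sketch in Python; the tag name, asset details, and attribute fields are invented for the example rather than drawn from any particular historian’s schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class TagReading:
    """One time-series value plus the metadata that gives it context."""
    tag: str             # historian tag name for the asset sensor
    value: float         # the raw measurement
    timestamp: datetime  # when the value was recorded
    metadata: dict = field(default_factory=dict)  # descriptive attributes

# A hypothetical discharge-temperature reading from a tagged pump.
reading = TagReading(
    tag="PUMP_07.DISCH_TEMP",
    value=102.0,
    timestamp=datetime(2023, 5, 1, 14, 30, tzinfo=timezone.utc),
    metadata={
        "unit": "degF",
        "asset": "Pump 07",
        "site": "Plant A",
        "oem": "Example OEM",
        "measurement": "discharge temperature",
    },
)

# Without the metadata, 102.0 is just a number; with it, the figure can be
# judged against the right operating context (the puppy-or-person test).
print(reading.metadata["asset"], reading.value, reading.metadata["unit"])
```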

Today, the story around the collection of data in OT systems is much the same. Each of these descriptive points about the data could paint a more holistic view of asset performance.

Except: hardware limitations kept these systems primarily as onsite technologies, suited to the analysis of on-premise assets. The data historian was a feasible solution for plant staff to quickly collect, query, and analyze OT data.

Different Standards in Metadata Management Persist

Before the emergence of the cloud, metadata from industrial control systems took on the individual characteristics of data collection at a single site. Since then, varied metadata collection practices have created a patchwork of different standards. That variation has obscured operations teams’ visibility into asset performance.

Ideally, metadata management includes detailed and understandable definitions, code values, data quality metrics, and data profiles. For some departments and organizations, this proves to be the case.

As often happens, however, metadata standards are local, most understandable to the power users of on-premise control systems. Beyond those consistent users, there was little need for translation. The same goes for asset frameworks and taxonomies: individual facilities developed different norms for collection and aggregation.

While these data collection, tagging, and naming conventions are familiar and useful for the power users of on-premise historians, many interested data consumers lack the onsite or engineering backgrounds to know what to make of the metadata. That specificity can render datasets meaningless for people outside the environment.

It would be difficult, for example, to associate one value or another with this or that pump. For others without asset expertise, the difference in temperature readings between a puppy, person, and pump is not that clear. Without a reference system for context, metadata can contribute to imprecision and confusion for the enterprise.
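To make the pump example concrete, here is a minimal sketch, with invented tag names and values, of why a site-local naming convention means little outside the plant unless a reference system maps it to shared context.

```python
# Invented, site-specific historian tag names: obvious to local power users,
# opaque to everyone else in the enterprise.
site_a_tags = {"P-101.DIS.PRS": 6.2, "P-101.MTR.TMP": 71.4}
site_b_tags = {"PMP07_OUT_PRESS": 6.1, "PMP07_MOTOR_T": 70.9}

# A shared reference system translates each local convention into one
# enterprise vocabulary so the same question can be asked of both sites.
reference = {
    "P-101.DIS.PRS":   {"asset": "Feed Pump", "site": "A", "measurement": "discharge pressure", "unit": "bar"},
    "P-101.MTR.TMP":   {"asset": "Feed Pump", "site": "A", "measurement": "motor temperature", "unit": "degC"},
    "PMP07_OUT_PRESS": {"asset": "Feed Pump", "site": "B", "measurement": "discharge pressure", "unit": "bar"},
    "PMP07_MOTOR_T":   {"asset": "Feed Pump", "site": "B", "measurement": "motor temperature", "unit": "degC"},
}

def discharge_pressures(tag_values: dict) -> list:
    """Return (site, value) pairs for discharge pressure, whatever the local tag name."""
    return [
        (reference[tag]["site"], value)
        for tag, value in tag_values.items()
        if reference[tag]["measurement"] == "discharge pressure"
    ]

print(discharge_pressures(site_a_tags) + discharge_pressures(site_b_tags))
# [('A', 6.2), ('B', 6.1)]
```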

Metadata Compression and Asset Hierarchy Losses

Hardware limitations and differences in metadata management between sites made data access difficult. The loss of context can be an even more fundamental challenge for power users of on-premise systems.

In many historians, preconfigured settings for time-series data compression save licensees from having to buy the additional hardware required to handle more metadata. Instead, the historian reduces quality, storing metadata like temperature and pressure at lower granularity, often by reducing the frequency of collection or by applying predefined parameter levels to each of those data tags based on groupings of assets.

In other cases, the historian removes select metadata values entirely from storage. Compression leads to the loss of important variables in the entire operating context. It leaves the enterprise short of the full picture.
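As a simplified sketch of that trade-off (real historians typically use more sophisticated approaches such as swinging-door compression, and the readings and threshold below are invented), a basic deadband filter keeps a new value only when it moves far enough from the last stored one.

```python
def deadband_compress(values, threshold):
    """Keep a value only if it differs from the last stored value by more than threshold."""
    stored = [values[0]]
    for v in values[1:]:
        if abs(v - stored[-1]) > threshold:
            stored.append(v)
    return stored

# One hour of hypothetical pump temperature readings, a few minutes apart.
raw = [70.0, 70.2, 70.4, 70.5, 70.9, 71.5, 72.8, 74.0, 73.1, 72.0, 71.0, 70.5]

compressed = deadband_compress(raw, threshold=1.0)
print(len(raw), "raw points ->", len(compressed), "stored points")
print(compressed)
# The brief excursion toward 74.0 survives, but the gradual drift that led
# up to it does not: the stored series is cheaper to keep, and poorer in context.
```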

Real-time values are just a fraction of the data available to operators. Most organizations also have many years of historical data — and metadata — in their on-premise systems. With enterprise imperatives to migrate that data to the cloud, they face similar challenges to preserve high-resolution metadata and historical data.

Though cloud connectivity add-ons from on-premise collection system providers have eased the migration of data beyond individual sites, operators have faced challenges with the degradation of metadata through compression and a lack of support for coexisting data models. Recurring payments for cloud migration and integration services contribute to higher data management costs. And those fees fail to treat the basic issue of metadata loss: the sacrifice of quality for quantity of data.

Asset framework hierarchies are also regular casualties of OT data transfer to the cloud. Users of on-premise systems with cloud connections must repeatedly rebuild these hierarchies in the cloud. Even then, many data historians cannot support coexisting data models. In the cloud, multiple coexisting data models allow different consumers to see the same dataset within a context that makes sense for their decision-making responsibilities.
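As a rough sketch of what coexisting data models can mean in practice (the tags, assets, and groupings below are invented), the same underlying tag set can be organized one way for a maintenance engineer and another way for a line-level or sustainability report, without copying the data.

```python
# One underlying set of tags, each carrying its own metadata.
tags = {
    "PMP07_MOTOR_T": {"asset": "Pump 07", "line": "Line 2", "kind": "temperature"},
    "PMP07_KWH":     {"asset": "Pump 07", "line": "Line 2", "kind": "energy"},
    "CMP03_VIB":     {"asset": "Compressor 03", "line": "Line 1", "kind": "vibration"},
    "CMP03_KWH":     {"asset": "Compressor 03", "line": "Line 1", "kind": "energy"},
}

def group_by(attribute):
    """Build one view (data model) over the same tags, keyed by a metadata attribute."""
    view = {}
    for tag, meta in tags.items():
        view.setdefault(meta[attribute], []).append(tag)
    return view

# Maintenance view: tags grouped by physical asset.
print(group_by("asset"))
# Reporting view: the same tags grouped by production line.
print(group_by("line"))
# Sustainability view: only the energy tags, grouped by line.
energy_by_line = {
    line: [t for t in tag_list if tags[t]["kind"] == "energy"]
    for line, tag_list in group_by("line").items()
}
print(energy_by_line)
```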

Metadata compression and single data models restrict visibility into operations. Cloud integrations and plug-ins from on-premise providers meant to promote data integrity cannot treat the root issue of unscalable metadata management. Add inconsistent metadata standards, and companies lose the context critical for the development of industrial intelligence.

Scaling Metadata for Enterprise Use

As many process businesses turn to a data lake strategy to leverage the value of their data, preserving metadata as OT data moves to their cloud environment represents a significant opportunity to optimize the maintenance, productivity, sustainability, and safety of critical assets.

The loss of metadata has been among the most severe limiting factors in the value of OT data. By one estimate, industrial businesses are losing out on 20-30 percent of the value of their data from regular compression of metadata or losses in their asset hierarchy models. With an expertise shortage sweeping across process-intensive operations, many companies will need to digitize and conserve institutional (puppy-or-person) knowledge, beginning with their own data.

They now can — cost-effectively and at scale.

Liberate your OT Data