Integration Functionality

This document outlines the system context and the major functionality of integration. It reflects the author's viewpoint, based on many years of experience. In general, every integration problem is unique, and each can be approached in many ways. Having a system context as a frame of reference helps in analyzing the integration problem at hand. The functionality section presents a collection of the main requirements and features and is by no means complete. The list gives an impression of the complexity of integration and of the variety of integration problems.

System Context

The system context defines the frame of reference for the systems that are being integrated. The systems considered for integration here are HAD systems: heterogeneous, autonomous, and distributed.

A note on heterogeneity. Although heterogeneity is often seen as a data-specific system property, it can also be found in the remote access interfaces of those systems. Different systems use different technologies to expose their functionality, and that causes heterogeneity far beyond data-level heterogeneity. Integration has to overcome this type of heterogeneity, too.
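One common way to overcome interface heterogeneity is to wrap each system in an adapter that exposes a single shared interface. The following is a minimal sketch under stated assumptions: both systems (`LegacyCsvSystem`, `JsonServiceSystem`), their methods, and the record layout are hypothetical and exist only for illustration.

```python
import json
from typing import Protocol


class CustomerSource(Protocol):
    """Common interface the integration logic programs against (assumed)."""

    def get_customer(self, customer_id: int) -> dict: ...


class LegacyCsvSystem:
    """Hypothetical system that exposes records as semicolon-separated text."""

    def fetch(self, key: str) -> str:
        return f"{key};Ada Lovelace"


class JsonServiceSystem:
    """Hypothetical system that exposes records as JSON documents."""

    def lookup(self, customer_id: int) -> str:
        return json.dumps({"customerId": customer_id, "fullName": "Ada Lovelace"})


class LegacyCsvAdapter:
    """Hides the text-based access interface behind the common interface."""

    def __init__(self, system: LegacyCsvSystem) -> None:
        self.system = system

    def get_customer(self, customer_id: int) -> dict:
        raw_id, name = self.system.fetch(str(customer_id)).split(";")
        return {"id": int(raw_id), "name": name}


class JsonServiceAdapter:
    """Hides the JSON-based access interface behind the common interface."""

    def __init__(self, system: JsonServiceSystem) -> None:
        self.system = system

    def get_customer(self, customer_id: int) -> dict:
        payload = json.loads(self.system.lookup(customer_id))
        return {"id": payload["customerId"], "name": payload["fullName"]}
```

With the adapters in place, the integration logic can treat both systems uniformly, even though their native interfaces and data formats differ.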

As a borderline case, the systems to be integrated may be homogeneous, coordinated, and centralized. While this case is certainly "nicer", it is not the usual one. Therefore, HAD systems are the target and the context.

System vs. Data Integration?

In some cases, authors and professionals make a case for the distinction between system and data integration. Is there really a fundamental difference?

Data integration usually refers to transferring data from one system to another. In the extreme case this is database synchronization (replication or master-slave). In many cases, however, the data is exchanged between systems through application programming interfaces (APIs). In this case the data is extracted from the database into a main-memory representation before being sent to the target system (using web services, queues, or other mechanisms).
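The API-based variant can be sketched as follows. This is an illustration only, not a definitive implementation: it assumes two in-memory SQLite databases standing in for the source and target systems, a hypothetical `customer` table, and a local `queue.Queue` standing in for the real transport (web service, message queue, or similar).

```python
import json
import queue
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the (assumed) customer table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    return db


def extract_customers(source: sqlite3.Connection) -> list:
    """Read rows from the source database into plain in-memory dictionaries."""
    rows = source.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
    return [{"id": row[0], "name": row[1]} for row in rows]


def send(records: list, channel: queue.Queue) -> None:
    """Serialize each record and hand it to the transport (here: a local queue)."""
    for record in records:
        channel.put(json.dumps(record))


def receive_into(target: sqlite3.Connection, channel: queue.Queue) -> int:
    """Drain the channel and store each record in the target database."""
    count = 0
    while not channel.empty():
        record = json.loads(channel.get())
        target.execute(
            "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
            (record["id"], record["name"]),
        )
        count += 1
    target.commit()
    return count


# Usage: move two records from the source system to the target system.
source_db = make_db()
source_db.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
source_db.commit()
target_db = make_db()
channel = queue.Queue()
send(extract_customers(source_db), channel)
transferred = receive_into(target_db, channel)
```

The intermediate dictionaries are the main-memory representation; the JSON strings on the queue are what actually crosses the system boundary.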

System integration is usually a superset that includes data integration as well as remote functionality integration. In this case one system "invokes" the other system's functionality by sending data, in some cases together with processing directives. Because data can also be sent as part of this approach, system integration is the superset.
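A remote invocation that carries both data and a processing directive might look like the sketch below. All names are hypothetical: the `Request` message shape, the `InventorySystem`, its `reserve` operation, and the `allow_partial` directive are assumptions made for illustration, and the network hop between the systems is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Message sent from the invoking system to the invoked system."""

    operation: str                                   # which functionality to invoke
    data: dict                                       # the payload
    directives: dict = field(default_factory=dict)   # optional processing hints


class InventorySystem:
    """Hypothetical target system exposing a 'reserve' operation."""

    def __init__(self) -> None:
        self.stock = {"widget": 10}

    def handle(self, request: Request) -> dict:
        if request.operation != "reserve":
            return {"status": "error", "reason": "unknown operation"}
        item = request.data["item"]
        quantity = request.data["quantity"]
        available = self.stock.get(item, 0)
        if quantity > available:
            # The directive changes how the same data is processed.
            if request.directives.get("allow_partial"):
                quantity = available
            else:
                return {"status": "rejected", "reserved": 0}
        self.stock[item] = available - quantity
        return {"status": "ok", "reserved": quantity}
```

The same payload leads to different outcomes depending on the directive: without `allow_partial`, a request for more than the available stock is rejected; with it, the available quantity is reserved.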

Direct vs. Indirect Integration

Systems can communicate with each other directly to achieve integration, or indirectly through a third system, a so-called integration middleman.

Each approach has pros and cons. Direct invocation is quicker to achieve, but it requires the systems to know about each other. It is also difficult to reason about the overall system state in terms of protocol consistency and data consistency.

Indirect integration takes more effort to implement, but since the integration logic is separate from the integrated systems, the overall system state and consistency are easier to reason about.
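The trade-off between the two approaches can be illustrated with a small sketch. Everything here is an assumption made for illustration: the order and billing systems, the `order.placed` topic, and the in-process `Middleman` class, which stands in for a real integration middleman.

```python
class BillingSystem:
    """Hypothetical target system."""

    def __init__(self) -> None:
        self.invoices = []

    def create_invoice(self, order_id: str, amount: float) -> None:
        self.invoices.append((order_id, amount))


class DirectOrderSystem:
    """Direct integration: the order system must know the billing interface."""

    def __init__(self, billing: BillingSystem) -> None:
        self.billing = billing

    def place_order(self, order_id: str, amount: float) -> None:
        self.billing.create_invoice(order_id, amount)


class Middleman:
    """Indirect integration: routes messages and records each delivery."""

    def __init__(self) -> None:
        self.handlers = {}
        self.log = []

    def subscribe(self, topic: str, handler) -> None:
        self.handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self.handlers.get(topic, []):
            handler(message)
            # The delivery log lives in one place, outside the integrated systems.
            self.log.append((topic, message["order_id"], "delivered"))


class IndirectOrderSystem:
    """Knows only the middleman and the topic, not the billing system."""

    def __init__(self, middleman: Middleman) -> None:
        self.middleman = middleman

    def place_order(self, order_id: str, amount: float) -> None:
        self.middleman.publish("order.placed", {"order_id": order_id, "amount": amount})
```

In the direct variant there is less code, but the order system is coupled to the billing interface. In the indirect variant the wiring (a `subscribe` call that maps the message to `create_invoice`) lives with the middleman, whose log gives a single place to inspect what has been delivered.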

Functionality

Integration functionality can range from a very simple best-effort file transfer to a very complex data replication process that ensures data consistency and integrity.

In principle, there are several different integration functionalities:

This is only a high-level characterization of systems integration. Each can be subdivided into more specialized cases. At the same time, it can be argued that a client-server relationship is also an integration of systems.