Research in the Information Systems Engineering (ISE) research group
chaired by Prof. Dr. rer. nat. habil. Bernhard Thalheim



Co-design of structuring, functionality, interaction and distribution of information systems
Traditional software engineering and information systems engineering follow a structured process comprising requirements analysis and definition, systems design, systems implementation and testing, and systems operation and maintenance. For web information systems this traditional approach suffers from three obstacles: late integration of architectural decisions, neglect of user expectations, and late implementation.
The co-design approach integrates the description of the application domain with the development of the presentation and information systems. At the same time the specification is executable thanks to our simulation system. The co-design methodology has been assessed by the SPICE committee and evaluated as one of the first methodologies to reach maturity level 3.
The methodology has been extended to web information systems. Coherence and co-existence of UML diagrams can be based on a global ASM-backed systems model. This model supports co-evolution and co-development of sets of UML diagrams.
Component systems are becoming the main approach for efficient and effective development of large systems. Based on the approaches to application modelling that have been developed in the department in the past, an approach to component-based information systems has been developed and tested in application projects. The theory of component systems has been extended by facilities for view exchange among components.

Theory of conceptual models and modelling

Conceptual modelling is one of the central activities in Computer Science. A theory of conceptual models and a theory of modelling acts have been developed in our group. They are based on a general theory of modelling as an art, an apprenticeship, and a technology. Modelling is based on an explicit choice of languages, on application of restrictions, on negotiation, and on methodologies. Languages are defined through their syntactics, their semantics, and their pragmatics. Modelling is a process and is based on modelling acts. These modelling acts are governed by the purpose of modelling itself and of the model or models.
Conceptual modelling has changed over the years. Nowadays small-scale conceptual modelling is state of the art for specialists and educated application engineers. Large-scale conceptual modelling has mainly been developed within companies that handle large and complex applications. It covers a wide variety of aspects such as models of structures, of business processes, of interaction among applications and with users, of components of systems and abstractions, or of derived models such as data warehouses and OLAP applications. We have developed new architectural techniques for large-scale conceptual modelling.
In software and information systems development, different aspects and facets of the system under development are usually analyzed and modelled independently of each other. A recurring challenge is the integration of the different partial models of the software system into one single consistent model. With the notion of model suites we introduce an approach that can be used to integrate heterogeneous models, to check consistency between them, and to facilitate their consistent evolution. Model suites are based on explicit controllers for the maintenance of coherence. They apply application schemata for explicit maintenance and evolution, use tracers to establish coherence, and thus support the co-evolution of information system models. The use of model suites helps to minimize or completely avoid the risks, ambiguities, and contradictions that normally result from the parallel use of different modelling languages and modelling tools.
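To illustrate the idea, the following minimal sketch (in Python, with purely hypothetical names and structures) couples sub-models through explicit controllers that check coherence and tracers that propagate changes; it shows the principle only, not the group's formalism.

  # Hypothetical sketch of a model suite: sub-models are kept coherent by
  # explicit controllers (consistency checks) and tracers (propagation rules).
  class ModelSuite:
      def __init__(self):
          self.models = {}          # name -> sub-model (any structured object)
          self.controllers = []     # callables: models -> list of violations
          self.tracers = []         # callables: (changed_name, models) -> None

      def add_model(self, name, model):
          self.models[name] = model

      def add_controller(self, check):
          self.controllers.append(check)

      def add_tracer(self, propagate):
          self.tracers.append(propagate)

      def update(self, name, change):
          change(self.models[name])            # apply a local change
          for propagate in self.tracers:       # co-evolve the other sub-models
              propagate(name, self.models)
          return self.check()                  # report remaining violations

      def check(self):
          violations = []
          for check in self.controllers:
              violations.extend(check(self.models))
          return violations

  # Example controller: the process model may only read entities the schema knows.
  suite = ModelSuite()
  suite.add_model("schema", {"entities": {"Customer", "Order"}})
  suite.add_model("process", {"reads": {"Customer"}})
  suite.add_controller(
      lambda m: [f"unknown entity {e}" for e in m["process"]["reads"]
                 - m["schema"]["entities"]])
  print(suite.check())   # [] -- the two sub-models are coherent
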

Theory of models and modelling in sciences

The main achievement is the development of a novel notion of model, of modelling activities and model deployment scenarios, and a general model of a model.
A model is a well-formed, adequate and dependable artefact that represents other artefacts within some context, based on criteria of adequacy and dependability commonly accepted by its community of practice. As an artefact, a model has a background with an undisputed grounding in the sub-discipline and with a basis consisting of elements chosen from the sub-discipline. A model is functioning if it is combined with utilisation or deployment methods. A functioning model is effective if it can be successfully deployed according to its deployment scenarios and its portfolio. Models thus function in application scenarios ('deployment games').
This notion has been validated and verified against the model notions of many disciplines. It extends the model notion developed in an interdisciplinary effort at CAU Kiel during 2009-2014. The notion has been tested against the notions of models used in archeology, arts, biology, chemistry, computer science, economics, electrical engineering, environmental sciences, farming and agriculture, geosciences, historical sciences, humanities, languages and semiotics, mathematics, medicine, ocean sciences, pedagogy, philosophy, physics, political sciences, sociology, and sport science.
The validation provided insight into the specific understanding of adequacy, dependability, functioning, and effectiveness used in each of these disciplines. It also resulted in an understanding of the added value of the model within the discipline, in an evaluation of model maturity, in the detection of features that are missing and should be added to the model or that can be deleted from it, and in restrictions on model deployment that must be observed.

Data mining design
Data mining algorithms aim to expose the information hidden in data. For a particular problem statement, however, the questions arise which algorithm should be employed and, moreover, how and which processing steps should be nested to obtain a goal-directed knowledge discovery process. Present approaches such as CRISP-DM mainly focus on the management or description of such processes but do not really describe how a discovery process should be designed. A novel framework has been developed that aims at the design of knowledge discovery processes in which the user's prior knowledge and goals are central to the process design.
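As an illustration of goal-driven process design, the following sketch (with hypothetical step names and dependencies, not the framework itself) assembles a knowledge discovery pipeline from the user's goal and prior knowledge instead of fixing the pipeline up front.

  # Candidate processing steps declare which analysis goals they serve and
  # which other steps they depend on; the designer orders the relevant steps.
  STEPS = {
      "clean":      {"serves": {"classification", "clustering"}, "needs": set()},
      "discretize": {"serves": {"classification"},               "needs": {"clean"}},
      "cluster":    {"serves": {"clustering"},                   "needs": {"clean"}},
      "classify":   {"serves": {"classification"},               "needs": {"discretize"}},
  }

  def design_process(goal, known_steps=frozenset()):
      """Order the steps serving the goal, skipping what the user already covered."""
      plan = []
      pending = [s for s, d in STEPS.items()
                 if goal in d["serves"] and s not in known_steps]
      while pending:
          ready = [s for s in pending
                   if STEPS[s]["needs"] <= set(plan) | set(known_steps)]
          if not ready:
              raise ValueError("no admissible ordering for goal " + goal)
          plan.append(ready[0])
          pending.remove(ready[0])
      return plan

  print(design_process("classification"))             # ['clean', 'discretize', 'classify']
  print(design_process("classification", {"clean"}))  # ['discretize', 'classify']
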

BPMN (Business Process Model and Notation)
An abstract model for the dynamic semantics of the core process modelling concepts in the OMG standard BPMN 2.0 has been created. It builds on a complete formalization of BPMN 1.0 and 1.1 that resulted from an international collaboration over the last few years. The UML class diagrams associated in the standard with each flow element are extended with a rigorous behaviour definition that reflects the inheritance hierarchy through refinement steps. The correctness of the resulting precise algorithmic model of an execution semantics for BPMN can be checked by comparing the model directly with the verbal explanations in the BPMN standard. Thus the model can be used to test reference implementations and to verify properties of interest for (classes of) BPMN diagrams. Based on the model, a native BPMN 2.0 process engine and a BPMN debugger have been implemented.
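The sketch below is only a toy token-game interpreter for a few core elements (tasks, an exclusive gateway, an end node); it illustrates the kind of behaviour an execution semantics has to fix, but it is not the ASM formalization described above, and all node names are invented.

  # Walk a process graph: execute tasks, branch at an exclusive gateway,
  # stop at a node without successor (the end node).
  def run(process, data):
      node, trace = process["start"], []
      while node is not None:
          kind = process["nodes"][node]["kind"]
          trace.append(node)
          if kind == "task":
              process["nodes"][node]["action"](data)
          if kind == "xor":                      # exclusive gateway: first branch
              node = next(t for cond, t in process["nodes"][node]["branches"]
                          if cond(data))         # whose condition holds
          else:
              node = process["nodes"][node].get("next")   # None at the end node
      return trace

  process = {
      "start": "receive",
      "nodes": {
          "receive": {"kind": "task", "action": lambda d: None, "next": "check"},
          "check":   {"kind": "xor",
                      "branches": [(lambda d: d["amount"] > 100, "approve"),
                                   (lambda d: True, "auto")]},
          "approve": {"kind": "task", "action": lambda d: d.update(ok=True), "next": "done"},
          "auto":    {"kind": "task", "action": lambda d: d.update(ok=True), "next": "done"},
          "done":    {"kind": "end"},
      },
  }
  print(run(process, {"amount": 250}))   # ['receive', 'check', 'approve', 'done']
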

Moving object databases and analysis
Current research in moving object databases focuses on data structures that allow the efficient storage and analysis of fine-grained data: trajectories are mostly indexed and analyzed by their spatial and/or temporal attributes, e.g. position and time. Analysis, however, often requires associating such fine-grained data with more coarse-grained queries such as "return all trajectories in which a turn occurs and is followed by a speed-up". To close the resulting gap, the fundamentals of a framework for the classification of moving objects based on their "behaviour" have been developed. Classification is here defined as the assignment of trajectory streams to predefined scenarios that represent interactions between arbitrary moving objects. To allow the efficient association of trajectory data with coarse-grained scenario descriptions such as the one above, a novel index structure for trajectories of moving objects has been proposed and implemented using techniques from computational movement analysis. The proposed index has the advantage that it uses not only the spatio-temporal domain of trajectories but also their topologies, where the topology is understood as the relation between characteristic events during the life span of a moving object. Providing and using this kind of meta-information allows the efficient computation of similarities between trajectories at a high level of abstraction.
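The following sketch illustrates the underlying idea with hypothetical thresholds and event names: characteristic events ("turn", "speed up") are derived from a raw trajectory, and the resulting event sequence is matched against a coarse-grained scenario. It does not reproduce the proposed index structure.

  import math

  def events(trajectory, turn_deg=30.0, speedup=1.5):
      """trajectory: list of (t, x, y).  Returns a time-ordered event list."""
      evs = []
      for (t0, x0, y0), (t1, x1, y1), (t2, x2, y2) in zip(trajectory, trajectory[1:], trajectory[2:]):
          v1 = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
          v2 = math.hypot(x2 - x1, y2 - y1) / (t2 - t1)
          a1 = math.atan2(y1 - y0, x1 - x0)
          a2 = math.atan2(y2 - y1, x2 - x1)
          heading_change = (math.degrees(a2 - a1) + 180.0) % 360.0 - 180.0
          if abs(heading_change) > turn_deg:
              evs.append((t1, "turn"))
          if v1 > 0 and v2 / v1 > speedup:
              evs.append((t1, "speed_up"))
      return evs

  def matches(evs, scenario):
      """True if the scenario occurs as a subsequence of the event labels."""
      labels = iter(label for _, label in evs)
      return all(any(label == want for label in labels) for want in scenario)

  track = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 2, 1), (4, 2, 4)]
  print(matches(events(track), ["turn", "speed_up"]))   # True
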

Database technology
Many modern applications are becoming performance critical. At the same time, the size of some databases has grown to levels that cannot be well supported by current technology. Performance engineering has in the past been dominated by reactive techniques such as performance monitoring. A new, active method for performance improvement has been developed. One of the potential methods for active performance improvement is performance forecasting based on assumptions about future operations and on extrapolations from the current situation.
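A minimal sketch of forecast-based, rather than purely reactive, performance engineering might look as follows; the workload figures, the linear extrapolation, and the service-level threshold are illustrative assumptions only, not the method developed here.

  # Fit observed response times against data volume and extrapolate to an
  # expected future volume, alerting before the objective is expected to be hit.
  def linear_fit(xs, ys):
      n = len(xs)
      mx, my = sum(xs) / n, sum(ys) / n
      slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
              sum((x - mx) ** 2 for x in xs)
      return slope, my - slope * mx

  volumes = [10, 20, 40, 80]            # millions of rows observed so far
  latency = [0.12, 0.21, 0.43, 0.88]    # seconds per query

  slope, intercept = linear_fit(volumes, latency)
  forecast_volume = 200                 # assumed volume next year
  forecast = slope * forecast_volume + intercept
  print(f"forecast latency at {forecast_volume}M rows: {forecast:.2f}s")
  if forecast > 1.0:                    # assumed service-level objective: 1 second
      print("plan re-tuning (indexes, partitioning) before the limit is reached")
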
Exceptions are unusual states that could, but need not, be taken into account from the start. They form exclusions, represent cases to which a rule does not apply, constitute specific states that are not going to be handled (at least by the current system), or may represent legal objections to the typical state. Information systems architectures can be made more flexible so that these systems are exception-aware, exception-reactive, and manage exceptions in a coherent form.
Modernization of information systems is a fundamental but sometimes neglected aspect of conceptual modelling. The management of evolution, migration, and refinement and the ability for information systems to deal with modernization is an essential component in developing and maintaining truly useful systems that minimize service disruption and downtime, and maximize availability of data and applications. Migration and evolution are interwoven aspects. Migration strategies such as 'big bang', 'chicken little', and 'butterfly' can be based on systematic evolution steps. Evolution steps use the theory of model suites.
Classical software development methodologies take architectural issues for granted or as pre-determined. Web information systems pay far more attention to user support and thus require sophisticated layout and playout systems, which go beyond what has been known for presentation systems. A framework has been developed that is based either on early architectural decisions or on the integration of new solutions into existing architectures. It allows the co-evolution of architectures and software systems.

Database theory
The theory of integrity constraints has led to a large body of knowledge and to many applications. Integrity constraints are however often misunderstood, are given in the wrong database context or within the wrong database models, often combine a number of very different facets of semantics in databases, and are difficult to specify. A unifying approach to specification and treatment of integrity constraints has been developed.
NULL is a special marker used in SQL to indicate that a value for an attribute of an object does not exist in the database. The three-valued and many-valued logics developed in the past do not properly reflect the nature of this special marker. To capture it properly, we introduce a non-standard generalization of paraconsistent logics that reflects the nature of these markers. The solutions developed can be used without changing database technology.
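For illustration, SQL's familiar three-valued logic with an UNKNOWN truth value already shows why NULL markers do not fit two-valued reasoning; the paraconsistent-style generalization referred to above goes further and is not reproduced here.

  # Truth tables of SQL-style three-valued logic.
  T, F, U = "true", "false", "unknown"

  def not3(a):
      return {T: F, F: T, U: U}[a]

  def and3(a, b):
      if a == F or b == F:
          return F
      if a == T and b == T:
          return T
      return U

  def or3(a, b):
      if a == T or b == T:
          return T
      if a == F and b == F:
          return F
      return U

  # A comparison with a NULL marker evaluates to UNKNOWN, so a tuple with
  # salary = NULL is selected neither by "salary > 1000" nor by its negation.
  salary_gt_1000 = U
  print(or3(salary_gt_1000, not3(salary_gt_1000)))   # unknown, not true
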
Modelling with multi-level abstraction refers to representing objects at multiple levels of one or more abstraction hierarchies, mainly classification, aggregation, and generalization. Multiple representation, however, leads to accidental complexity, complicating modelling and extension. A theory of m-objects has been developed that offers powerful techniques for modular and redundancy-free models, for query flexibility, for heterogeneous level-hierarchies, and for multiple relationship-abstraction.
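The following sketch illustrates the general idea of multi-level objects with hypothetical level names and attributes; it is not the formal m-object theory. A single object carries attributes at several abstraction levels and is concretised level by level, so no parallel, redundant hierarchies are needed.

  class MObject:
      def __init__(self, name, levels, attrs=None, parent=None):
          self.name, self.levels = name, levels          # remaining levels, top first
          self.attrs = dict(parent.attrs) if parent else {}
          self.attrs.update(attrs or {})                 # inherit, then refine

      def concretize(self, name, attrs):
          """Create the m-object one level below, inheriting all attributes."""
          if len(self.levels) < 2:
              raise ValueError("already at the most concrete level")
          return MObject(name, self.levels[1:], attrs, parent=self)

  car = MObject("Car", ["catalog", "model", "physical_entity"],
                {"tax_class": "vehicle"})
  golf = car.concretize("VW Golf", {"doors": 5})
  my_golf = golf.concretize("K-XY 123", {"mileage": 42000})
  print(my_golf.levels[0], my_golf.attrs)
  # physical_entity {'tax_class': 'vehicle', 'doors': 5, 'mileage': 42000}
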
Local database normalization aims at the derivation of database structures that can easily be supported by the DBMS. Global normalization has not received appropriate attention in research despite the interest in its implementations. Our research on systematic treatment of this normalization resulted in new ER-based normalization techniques.
A general theory of database transformations defines the background for queries and updates, the two fundamental types of computation in any database: the first provides the capability to retrieve data, and the second is used to maintain databases in the light of ever-changing application domains. In theoretical studies of database transformations, considerable effort has been directed towards exploiting the close ties between database queries and mathematical logic. It is widely acknowledged that a logic-based perspective on database queries can provide a yardstick for measuring the expressiveness and complexity of query languages.
Practical experience shows that the maintenance of very large database schemata causes severe problems and that no systematic support is provided. The analysis in a recent study suggests that larger schemata may be built by composing smaller ones and frequently recurring meta-structures. Our approach leads to a category of schemata that is finitely complete and co-complete. We show that all constructors of the recently introduced schema algebra are well-defined in the sense that they give rise to schema morphisms. The algebra is also complete in the sense that it captures all universal constructions in the category of schemata.

Information privacy
Privacy is becoming a major issue of social, ethical and legal concern on the Internet. The development of information technology and of the Internet has major implications for the privacy of individuals. A new conceptual model for databases that contain exclusively private information has been developed. The model uses the theory of infons to define 'private infons' and develops a taxonomy of these private infons based on the notions of proprietorship and possession. The proposed model also specifies different privacy rules and principles, derives their enforcement, and develops and tests an architecture for this type of database. The model allows several variants of privacy-supporting systems. The concept of privacy wallets has been implemented.
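A sketch in the spirit of this model, with purely illustrative names, represents a private infon together with its proprietor and possessors and checks a simple disclosure rule; the actual model and its rule system are considerably richer.

  from dataclasses import dataclass, field

  @dataclass
  class PrivateInfon:
      content: str
      proprietor: str                                 # the individual the infon is about
      possessors: set = field(default_factory=set)    # who currently holds it
      consented: set = field(default_factory=set)     # parties with consent

  def may_disclose(infon: PrivateInfon, recipient: str) -> bool:
      """Disclosure is admissible only to the proprietor or with consent."""
      return recipient == infon.proprietor or recipient in infon.consented

  diagnosis = PrivateInfon("diagnosis: ...", proprietor="alice",
                           possessors={"clinic"}, consented={"insurer"})
  print(may_disclose(diagnosis, "insurer"))     # True
  print(may_disclose(diagnosis, "employer"))    # False
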

Knowledge bases and knowledge web
The internet and web applications have changed business and human life. Nowadays almost everybody is used to obtaining data through the internet. Most applications are still Web 1.0 applications. Web 2.0 community collaboration and data annotated on the basis of Web 3.0 technologies support new businesses and applications. The quality dimension of the web is, however, one of the main challenges. Knowledge web information systems target high-quality data on safe grounds, with good references to established science and technology and with data adapted to the user's needs and demands. They can be built on existing and novel technologies.
The knowledge web approach has been applied to the management of processes that allow flexible handling of catastrophes. Another application targets the delivery of actionable information on demand in such a way that users in legal environments can easily assimilate it to perform their tasks.
Our knowledge web approach is based on advanced content management and on the theory of media types. Content management is the process of handling information within an organization or community. We developed, applied, and implemented a novel data model for content which does not treat semantic information merely as descriptive metadata but incorporates, on the same level, the data itself, the intention behind it, its usage, and its origin.
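The following sketch only illustrates the shape of such a content model with hypothetical field names: every content item carries, on the same level, the data itself, the intention behind it, its usage, and its origin.

  from dataclasses import dataclass, field

  @dataclass
  class ContentItem:
      data: dict                                    # the data itself
      intention: str                                # why the content exists
      usage: list = field(default_factory=list)     # how and by whom it is used
      origin: str = ""                              # provenance of the content

  report = ContentItem(
      data={"river": "Elbe", "level_cm": 612},
      intention="warn downstream municipalities",
      usage=["crisis dashboard", "press briefing"],
      origin="gauge station, illustrative example")
  print(report.intention, "-", report.origin)
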

Random databases
We consider stochastic modelling for databases with uncertain data and for basic database operations (for example, join and selection) with exact and approximate matching. Approximate joins are used for merging data or removing duplicates in large databases. The distribution and mean of join sizes are studied for random databases. A random database is treated as a table with independent random records drawn from a common distribution (or as a set of such random tables). Our results can be used for the integration of information from different databases, multiple join optimization, and various probabilistic algorithms for structured random data.
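For the simplest case, exact matching on one join attribute with values drawn independently from a common distribution p, the expected join size is n * m * sum_i p_i^2. The short simulation below merely illustrates this standard expectation under the stated independence assumptions; it does not reproduce the distributional results mentioned above.

  import random

  def simulate_join_size(n, m, p, runs=2000):
      values = list(range(len(p)))
      total = 0
      for _ in range(runs):
          r = random.choices(values, p, k=n)       # n independent records of R
          s = random.choices(values, p, k=m)       # m independent records of S
          counts = {v: s.count(v) for v in set(s)}
          total += sum(counts.get(v, 0) for v in r)   # exact-match join size
      return total / runs

  n, m, p = 50, 80, [0.5, 0.3, 0.2]
  expected = n * m * sum(pi ** 2 for pi in p)
  print("analytic :", expected)                    # 50 * 80 * 0.38 = 1520.0
  print("simulated:", simulate_join_size(n, m, p)) # close to 1520
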

Quality management and assessment for information and software systems
Software and information systems design and development coexist and co-evolve with quality provision, assessment and enforcement. However, most research, including current work, provides only checklists of useful properties without giving a systematic structure for evaluating them. Software engineers have put forward a large number of metrics for software products, processes and resources, but a theoretical foundation is still missing. We developed and applied a framework for quality property specification, quality control, quality utilization, and quality establishment. Our framework has a theoretical basis that is adaptable to all stages of software development.

Web information systems
We developed a general specification method for clouds. Technically, we understand a cloud as a federation of software services that are made available via the web and can be used by any application. A common understanding in the web services community is that a service is a function or operation with an appropriate input/output specification. We take a more general view and regard a service as a piece of software that provides not only functionality but also data. Services thus combine a hidden database layer with an operation-equipped view layer, and can be anything from a simple function to a fully fledged web information system or a data warehouse.
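A minimal sketch of this service notion, with illustrative names only: the stored data is hidden and reachable only through an operation-equipped view layer.

  class WeatherService:
      def __init__(self):
          self._storage = {}                 # hidden database layer

      # operation-equipped view layer: the only way in and out of the service
      def record(self, city, temperature):
          self._storage.setdefault(city, []).append(temperature)

      def current_view(self, city):
          """A view derived from the hidden data, not the data itself."""
          readings = self._storage.get(city, [])
          return {"city": city,
                  "latest": readings[-1] if readings else None,
                  "average": sum(readings) / len(readings) if readings else None}

  service = WeatherService()
  service.record("Kiel", 14.0)
  service.record("Kiel", 17.0)
  print(service.current_view("Kiel"))   # {'city': 'Kiel', 'latest': 17.0, 'average': 15.5}
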
Web information systems should also support speech dialogues. Their workflow and supporting infrastructure can be specified by storyboards. The integration of speech dialogues is, however, an unsolved issue because of the required flexibility, the wide variety of responses, and the expected naturalness. Speech dialogues must be very flexible both in the recognition of questions and in the generation of appropriate answers. We therefore introduce a pattern-based approach to the specification and utilization of speech dialogues. These patterns reflect the dialogue structure, since answers and responses within a speech dialogue are instantiations or refinements of them. It is possible to create patterns for common dialogue forms. The results of this work show that only small adaptations of the storyboard concept are necessary and that extending the presentation layer with a channel-dependent renderer is sufficient to model natural language dialogues.
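The sketch below illustrates the pattern idea with hypothetical question patterns and answer templates: the utterance is matched against patterns, and the matching answer template is instantiated from context data that would come from the storyboard.

  import re

  PATTERNS = [
      (re.compile(r"when .*(open|opening hours)", re.I),
       "We are open from {opens} to {closes}."),
      (re.compile(r"where .*(located|address)", re.I),
       "You find us at {address}."),
  ]

  def answer(utterance, context):
      for pattern, template in PATTERNS:
          if pattern.search(utterance):
              return template.format(**context)
      return "Sorry, could you rephrase that?"

  context = {"opens": "9:00", "closes": "18:00", "address": "Example Street 1, Kiel"}
  print(answer("When do you open?", context))   # We are open from 9:00 to 18:00.
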
The design and reification of web information systems is a complex task for which many integrated development methods have been proposed. While all these methods ultimately lead to the construction of web pages, very little attention is paid to the layout of these pages. Screenography, developed in our group, provides principles and rules for page layout that originate from knowledge of visual perception and communication, and investigates how layout can support the intentions associated with the WIS. This amounts to guidelines for partitioning pages and for using layout objects, colour, light, and texture to obtain rhythm, contrast, and perspective as the carriers of web page comprehension. We use a pattern approach for the systematic development of layout and playout. These patterns can be combined into larger, complex patterns; an algebra for pattern construction will therefore be developed.
On a high level of abstraction, the storyboard of a web information system specifies who will be using the system, in what way, and for which goals. Storyboard pragmatics deals with the question of what the storyboard means for its users. One part of pragmatics is concerned with usage analysis by means of life cases, user models, and contexts. We also addressed another part of pragmatics that complements usage analysis by WIS portfolios. These comprise two parts: the information portfolio and the utilization portfolio. The former is concerned with the information consumed and produced by the WIS users, which leads to content chunks. The latter captures functionality requirements, which depend on the specific category to which the WIS belongs.

