Research in the Information Systems Engineering (ISE) research group
chaired by Prof. Dr. rer. nat. habil. Bernhard Thalheim
My books: survey 1 and survey 2.
See also: my papers on ResearchGate.
ISE research directions:
Most recent results:
Co-design of structuring, functionality, interaction and distribution of information systems
Traditional software engineering and information systems engineering follow a structured life cycle comprising requirements
analysis and definition, systems design, systems implementation and testing, and systems operation and maintenance. For web
information systems this traditional approach suffers from three obstacles: late integration of architectural decisions, neglect
of user expectations, and late implementation.
The co-design approach integrates application domain description with development of presentation and information
systems. At the same time the specification is executable due to our simulation system. The co-design methodology has
been assessed by the SPICE committee and has been evaluated to be one of the first methodologies at maturity level 3.
The methodology has been extended to web information systems. Coherence and co-existence of UML diagrams can be based
on a global ASM-backed systems model. This model supports co-evolution and co-development of sets of UML diagrams.
Component systems are becoming the main approach for efficient and effective development of large systems. Based
on the approaches to application modelling that have been developed in the department in the past, an approach to
component-based information systems has been developed and tested in application projects. The theory of component
systems has been extended by facilities for view exchange among components.
Theory of conceptual models and modelling
Conceptual modelling is one of the central activities in Computer Science. A theory of conceptual models and a theory
of modelling acts have been developed in our group. They are based on a general theory of modelling as an art, an
apprenticeship, and a technology. Modelling is based on an explicit choice of languages, on application of restrictions, on
negotiation, and on methodologies. Languages are defined through their syntactics, their semantics, and their pragmatics.
Modelling is a process and is based on modelling acts. These modelling acts are governed by the purpose of modelling
itself and of the model or models.
Conceptual modelling has changed over the years. Nowadays small scale conceptual modelling has become state-of-the-art
for specialists and educated application engineers. Large scale conceptual modelling has been mainly developed within
companies that handle large and complex applications. It covers a large variety of aspects such as models of structures,
of business processes, of interaction among applications and with users, of components of systems and abstractions, or
of derived models such as data warehouses and OLAP applications. We developed new architectural techniques for large
scale conceptual modelling.
In software and information systems development different aspects and facets of the system being developed are usually
analyzed and modelled independently from each other. A recurring challenge is the integration of the different partial
models of the software system into one single consistent model. With the notion of model suites we introduce an approach
which can be used to integrate heterogeneous models, to check consistency between those models, and to facilitate a
consistent evolution of them. Model suites are based on explicit controllers for maintenance of coherence. They apply
application schemata for their explicit maintenance and evolution, use tracers for establishment of their coherence, and
thus support co-evolution of information system models. The use of model suites helps to minimize or completely avoid the
risks, ambiguities, and contradictions that normally result from the parallel use of different modelling languages.
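The coherence mechanism of model suites can be sketched in a few lines. The following is a minimal illustration, not the group's implementation: two partial models (a structural model and a process model) are kept coherent by an explicit controller that propagates changes and by a tracer log; all names are hypothetical.

```python
# Illustrative sketch of a model suite: two partial models kept coherent
# by an explicit controller; names and schemata are hypothetical.

class ModelSuite:
    def __init__(self):
        # Partial model 1: a structural model (entity -> attributes)
        self.structure = {"Customer": ["id", "name"]}
        # Partial model 2: a process model referring to entities
        self.process = {"RegisterCustomer": {"uses": ["Customer"]}}
        self.log = []  # tracer: records propagated changes

    def rename_entity(self, old, new):
        """Controller: a change in one model is propagated to the others."""
        self.structure[new] = self.structure.pop(old)
        for step in self.process.values():
            step["uses"] = [new if e == old else e for e in step["uses"]]
        self.log.append((old, new))

    def is_coherent(self):
        """Consistency check: every entity used by a process step exists."""
        entities = set(self.structure)
        return all(set(s["uses"]) <= entities for s in self.process.values())

suite = ModelSuite()
suite.rename_entity("Customer", "Client")
print(suite.is_coherent())   # True: the change reached both models
```

Without the controller, renaming the entity in the structural model alone would silently break the process model; the explicit controller is what turns a set of models into a suite.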
Theory of models and modelling in sciences
The main achievement is the development of a novel notion of model, of modelling activities and model deployment scenarios, and a general model of a model.
A model is a well-formed, adequate, and dependable artefact that represents other artefacts within some context,
based on criteria of adequacy and dependability commonly accepted by its community of practice.
As an artefact, a model has its background, with an undisputed grounding in the sub-discipline and with a basis
consisting of chosen elements from the sub-discipline.
A model is functioning if it is combined with its deployment scenarios. A functioning model is effective if it can be
successfully deployed according to its deployment scenarios and its portfolio. Models thus function in their
application scenarios ('deployment games').
This notion has been validated and verified against the model notions of many disciplines. This notion extends the model notion developed in an interdisciplinary effort at CAU Kiel during 2009-2014. The notion has been tested against the notions of models used in archeology, arts, biology, chemistry, computer science, economics, electrotechnics, environmental sciences, farming and agriculture, geosciences, historical sciences, humanities, languages and semiotics, mathematics, medicine, ocean sciences, pedagogical science, philosophy, physics, political sciences, sociology, and sport science.
The validation brought an insight into the specific understanding of adequacy, dependability, functioning, and effectiveness used in each of these disciplines. The validation has also resulted in an understanding of the added value of the model within the discipline, in an evaluation of the model maturity, in detection of features which are missing and should be added to the model or which can be deleted from the model, and in restrictions to model deployment which must be observed.
Data mining design
Data mining algorithms aim to expose the hidden information behind data. A particular problem statement, however,
raises the question of which algorithm should be employed and, moreover, how and which processing steps should be
combined to obtain a goal-oriented knowledge discovery process. Present approaches, such as CRISP-DM, mainly focus
on the management or description of such processes but do not really describe how such a discovery process should be
designed. A novel framework has been developed that aims at the design of knowledge discovery processes in which the
prior knowledge of a user and the user's goals are central to the process design.
BPMN (Business Process Model and Notation)
An abstract model for the dynamic semantics of the core process modelling concepts in the OMG standard for BPMN 2.0
has been created based on the development of a complete formalization of BPMN 1.0 and 1.1 that is the result of an
international collaboration over the last few years.
The UML class diagrams associated therein with each flow element are
extended with a rigorous behaviour definition, which reflects the inheritance hierarchy structure by refinement steps.
The correctness of the resulting precise algorithmic model for an execution semantics for BPMN can be checked by comparing
the model directly with the verbal explanations in the BPMN standard. Thus, the model can be used to test reference
implementations and to verify properties of interest for
(classes of) BPMN diagrams. Based on the model a native BPMN
2.0 Process Engine and a BPMN debugger have been implemented.
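The core of such an execution semantics is token movement over flow elements. As a rough illustration only, and far smaller than the ASM-based model described above, the following sketch moves a single token through a hypothetical process with an exclusive gateway; node names and conditions are invented.

```python
# Illustrative token-based execution of a tiny BPMN-like fragment
# (start event -> exclusive gateway -> one of two tasks -> end event).
# This is a toy sketch, not the BPMN 2.0 metamodel or the group's engine.

PROCESS = {
    "start":   {"type": "startEvent", "next": ["gateway"]},
    "gateway": {"type": "exclusiveGateway",
                # first sequence flow whose condition holds is taken;
                # the always-true branch plays the role of the default flow
                "branches": [("amount > 100", "approve"), ("True", "auto")]},
    "approve": {"type": "task", "next": ["end"]},
    "auto":    {"type": "task", "next": ["end"]},
    "end":     {"type": "endEvent", "next": []},
}

def run(process, data):
    """Move a single token from the start event to an end event."""
    node, trace = "start", []
    while True:
        trace.append(node)
        spec = process[node]
        if spec["type"] == "endEvent":
            return trace
        if spec["type"] == "exclusiveGateway":
            node = next(t for cond, t in spec["branches"]
                        if eval(cond, {}, data))
        else:
            node = spec["next"][0]

print(run(PROCESS, {"amount": 250}))  # ['start', 'gateway', 'approve', 'end']
```

The value of a precise model of this kind is exactly what the paragraph above describes: the execution trace can be compared directly against the verbal rules of the standard.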
Moving object databases and analysis
Current research in moving object databases focuses on data structures allowing the efficient storage and analysis of
fine-grained data, i.e. trajectories are mostly indexed and analyzed by their spatial and/or temporal attributes, e.g.
position and time. Analysis itself, however, often requires the association of such fine-grained data to more coarse-grained
queries such as "return all trajectories where a turn has occurred and is followed by a speed up". To close the resulting
gap, the fundamentals of a framework for the classification of moving objects based on their "behaviour" have been developed.
In this case, classification is defined as the assignment of trajectory streams to predefined scenarios that represent
interactions between arbitrary moving objects. To allow efficient association of trajectory data with coarse-grained scenario
descriptions as above, a novel index structure for trajectories of moving objects has been proposed and implemented using
techniques from the area of computational movement analysis. The proposed index has the advantage that it uses not
only the spatiotemporal domain of trajectories but also their topologies. In that context, the notion of topology is provided
as the relation between characteristic events during the life span of a moving object. Providing and using that kind of
meta-information allows for the efficient computation of similarities between trajectories at a high level of abstraction.
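The idea of classifying trajectories by an ordered sequence of characteristic events can be sketched concretely. The event definitions and thresholds below are illustrative assumptions, not the index structure developed by the group:

```python
import math

# Illustrative sketch: derive characteristic events from a trajectory and
# match a coarse-grained scenario ("a turn followed by a speed up").
# Event definitions and thresholds are hypothetical.

def events(points):
    """Derive characteristic events from (x, y, t) samples."""
    evs = []
    for i in range(1, len(points) - 1):
        (x0, y0, t0), (x1, y1, t1), (x2, y2, t2) = points[i - 1:i + 2]
        h1 = math.atan2(y1 - y0, x1 - x0)       # heading before
        h2 = math.atan2(y2 - y1, x2 - x1)       # heading after
        v1 = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        v2 = math.hypot(x2 - x1, y2 - y1) / (t2 - t1)
        if abs(h2 - h1) > math.pi / 4:
            evs.append("turn")
        if v2 > 1.5 * v1:
            evs.append("speed_up")
    return evs

def matches(evs, scenario):
    """Scenario = ordered subsequence of events (the 'topology')."""
    it = iter(evs)
    return all(e in it for e in scenario)

track = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (2, 1, 3), (2, 4, 4)]
print(matches(events(track), ["turn", "speed_up"]))  # True
```

Matching on event order rather than on raw coordinates is what lifts the comparison of trajectories to the higher level of abstraction mentioned above.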
Many modern applications are becoming performance critical. At the same time, the size of some databases has been
increasing to levels that cannot be well supported by current technology. Performance engineering has in the past been
dominated by reactive techniques such as performance monitoring. A new active method for performance improvement
has been developed. One of the potential methods for active performance improvement is performance forecasting, based
on assumptions about future operations and on extrapolations from the current situation.
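In its simplest form, such forecasting extrapolates a monitored quantity to predict when a limit will be reached. The following sketch uses a naive linear extrapolation with invented numbers, purely to illustrate the active (predict-then-act) stance:

```python
# Illustrative sketch of active performance forecasting: extrapolate from
# monitored table sizes to predict when a capacity limit will be reached.
# The linear growth model and all numbers are hypothetical.

def forecast_limit(history, capacity):
    """history: [(day, rows)]; returns the day on which `capacity` is
    reached, assuming linear growth through first and last observation."""
    (d0, r0), (d1, r1) = history[0], history[-1]
    rate = (r1 - r0) / (d1 - d0)              # rows added per day
    return d1 + (capacity - r1) / rate

observed = [(0, 1_000_000), (30, 1_600_000)]  # monitored rows over 30 days
print(forecast_limit(observed, 2_000_000))    # 50.0: act before day 50
```

A reactive monitor would only raise an alarm once the limit is hit; the forecast gives the administrator a window in which to repartition or archive in advance.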
Exceptions are unusual states that may, but need not, be taken into account. They form
exclusions, represent cases to which a rule does not apply, and form specific states that are not going to be handled (at
least by the current system) or that might represent legal objections against the typical state. Information systems architectures
can be made more flexible to cope with exceptions in a way that these systems are exception-aware, exception-reactive,
and provide a management of exceptions in a coherent form.
Modernization of information systems is a fundamental but sometimes neglected aspect of conceptual modelling. The
management of evolution, migration, and refinement and the ability for information systems to deal with modernization
is an essential component in developing and maintaining truly useful systems that minimize service disruption and downtime,
and maximize availability of data and applications. Migration and evolution are interwoven aspects. Migration
strategies such as 'big bang', 'chicken little', and 'butterfly' can be based on systematic evolution steps. Evolution steps
use the theory of model suites.
Classical software development methodologies take architectural issues for granted or as pre-determined. Web information
systems pay far more attention to user support and thus require sophisticated layout and playout systems. These systems
go beyond what has been known for presentation systems. A framework has been developed that is based either on early
architectural decisions, or on integration of new solutions into existing architectures. It allows co-evolution of architectures
and software systems.
The theory of integrity constraints has led to a large body of knowledge and to many applications. Integrity constraints
are however often misunderstood, are given in the wrong database context or within the wrong database models, often
combine a number of very different facets of semantics in databases, and are difficult to specify. A unifying approach to
specification and treatment of integrity constraints has been developed.
NULL is a special marker used in SQL to indicate that a value for an attribute of an object does not exist in the database.
The three-valued and many-valued logics developed in the past do not properly reflect the nature of this special marker.
To remedy this, we introduce a non-standard generalization of paraconsistent logics that reflects the nature of
these markers. The solutions developed can be used without changing database technology.
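For context, the classical three-valued (Kleene-style) treatment that SQL uses, and that the paraconsistent generalization refines, can be written down in a few lines; `None` stands in for the NULL marker:

```python
# SQL-style three-valued logic with None as the NULL marker (Kleene logic).
# This shows the baseline behaviour that the generalization above refines.

def and3(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None            # unknown
    return True

def or3(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None            # unknown
    return False

def not3(a):
    return None if a is None else not a

# "x = NULL" evaluates to unknown, so it never selects rows; and a
# condition combined with unknown may still be decided:
print(and3(None, False))   # False: unknown AND false is false
print(or3(None, True))     # True:  unknown OR true is true
print(not3(None))          # None:  NOT unknown remains unknown
```

The limitation visible here is that a single "unknown" value cannot distinguish the different reasons why a value is missing (not applicable, not known, withheld), which is what motivates a richer logic.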
Modelling with multi-level abstraction refers to representing objects at multiple levels of one or more abstraction
hierarchies, mainly classification, aggregation, and generalization. Multiple representation, however, leads to accidental
complexity, complicating modelling and extension. A theory of m-objects has been developed that offers powerful
techniques for modular and redundancy-free models, for query flexibility, and for heterogeneous level-hierarchies.
Local database normalization aims at the derivation of database structures that can easily be supported by the DBMS.
Global normalization has not received appropriate attention in research despite the interest in its implementations. Our
research on systematic treatment of this normalization resulted in new ER-based normalization techniques.
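The ER-based techniques themselves are not reproduced here, but both local and global normalization rest on functional-dependency reasoning. A minimal attribute-closure sketch, with an invented example schema, shows the kind of check involved:

```python
# Attribute closure under functional dependencies (FDs): the core test used
# in classical normalization. The example schema and FDs are illustrative.

def closure(attrs, fds):
    """Compute the closure of an attribute set under a set of FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# R(course, lecturer, room) with: course -> lecturer, lecturer -> room
fds = [({"course"}, {"lecturer"}), ({"lecturer"}, {"room"})]
print(closure({"course"}, fds))   # {'course', 'lecturer', 'room'}
# 'course' is a key; the transitive dependency course -> lecturer -> room
# is the classical signal that the relation should be decomposed.
```

Global, ER-based normalization applies such reasoning across a whole schema rather than to one relation at a time.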
A general theory of database transformations defines the background for queries and updates, which are two fundamental
types of computation in any database: the first provides the capability to retrieve data, and the second is used to
maintain databases in the light of ever-changing application domains. In theoretical studies of database transformations,
considerable effort has been directed towards exploiting the close ties between database queries and mathematical logics.
It is widely acknowledged that a logic-based perspective for database queries can provide a yardstick for measuring the
expressiveness and complexity of query languages.
Practical experience shows that the maintenance of very large database schemata causes severe problems, and no systematic
support is provided. A recent study shows that larger schemata may be built by composing smaller ones
and frequently recurring meta-structures. Our approach leads to a category of schemata that is finitely complete and
co-complete. We show that all constructors of the recently introduced schema algebra are well-defined in the sense that
they give rise to schema morphisms. The algebra is also complete in the sense that it captures all universal constructions in
the category of schemata.
Privacy is becoming a major issue of social, ethical and legal concern on the Internet. The development of information
technology and the Internet have major implications for the privacy of individuals. A new conceptual model for databases
that contain exclusively private information has been developed. The model utilizes the theory of infons to define "private
infons" and develops a taxonomy of these private infons based on the notions of proprietorship and possession. The proposed
model also specifies different privacy rules and principles, derives their enforcement, and develops and tests an architecture
for this type of database. The model allows several variants of privacy-supporting systems. The concept of privacy wallets
has been implemented.
Knowledge bases and knowledge web
The internet and web applications have changed business and human life. Nowadays almost everybody is used to
obtaining data through the internet. Most applications are still Web 1.0 applications. Web 2.0 community collaboration
and annotated data on the basis of Web 3.0 technologies support new businesses and applications. The quality dimension
of the web is however one of the main challenges. Knowledge web information systems target high-quality data on safe
grounds, with a good reference to established science and technology and with data adaptation to the user's needs and
demands. They can be built based on existing and novel technologies.
The knowledge web approach has been applied to management of processes that allow flexible handling of catastrophes.
Another application targets the delivery of actionable information on demand, in a way that users in juridical environments
can easily assimilate to perform their tasks.
Our knowledge web approach is based on advanced content management and on the theory of media types. Content
management is the process of handling information within an organization or community. We developed, applied, and
implemented a novel data model for content, which treats semantic information not only as describing metadata but also
incorporates on the same level the data itself, the intention behind it, its usage, and its origin.
We consider stochastic modelling for databases with uncertain data and for some basic database operations (for example,
join, selection) with exact and approximate matching. Approximate join is used for merging data or removing duplicates
in large databases. Distribution and mean of the join sizes are studied for random databases. A random database is
treated as a table with independent random records with a common distribution (or a set of random tables). Our results
can be used for integration of information from different databases, multiple join optimization, and various probabilistic
algorithms for structured random data.
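An approximate join matches tuples by a similarity predicate instead of exact equality, so the join size becomes a random quantity worth studying. The similarity measure below (normalized common-prefix length) is an illustrative stand-in, not the measure analysed in the stochastic setting:

```python
import itertools

# Illustrative sketch of an approximate join on a similarity predicate.
# The similarity measure and the tables are hypothetical.

def similarity(a, b):
    """Normalized length of the common prefix of two strings."""
    common = len(list(itertools.takewhile(lambda p: p[0] == p[1], zip(a, b))))
    return common / max(len(a), len(b))

def approx_join(r, s, key, threshold=0.6):
    """Join tuples of r and s whose key values are sufficiently similar."""
    return [(t1, t2) for t1 in r for t2 in s
            if similarity(t1[key], t2[key]) >= threshold]

customers = [{"name": "Mueller"}, {"name": "Schmidt"}]
orders    = [{"name": "Muellar"}, {"name": "Schulz"}]
pairs = approx_join(customers, orders, "name")
print(len(pairs))   # 1: "Mueller" matches the misspelled "Muellar"
```

With random tables, the number of such pairs is exactly the join size whose distribution and mean are studied above; knowing it in advance helps a query optimizer size intermediate results.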
Quality management and assessment for information and software systems
Software and information systems design and development coexist and co-evolve with quality provision, assessment and
enforcement. However, most research, including current work, provides only bread-and-butter lists of useful properties without
giving a systematic structure for evaluating them. Software engineers have put forward a large number of
metrics for software products, processes, and resources, but a theoretical foundation is still missing. We developed and
applied a framework for quality property specification, quality control, quality utilization, and quality establishment. Our
framework has a theoretical basis that is adaptable to all stages of software development.
Web information systems
We developed a general specification method for clouds. Technically, we understand a cloud as a federation of software
services that are made available via the web and can be used by any application. A common understanding in the web
services community is that a service is defined as a function or operation with the appropriate input/output specification.
We take a general view regarding a service as a piece of software that not only provides functionality but also data.
Services thus combine a hidden database layer with an operation-equipped view layer, and can be anything from a simple
function to a fully-fledged web information system or a data warehouse.
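This view of a service, a hidden database layer behind an operation-equipped view layer, can be made concrete with a small sketch; the service, its data, and its operations are all invented for illustration:

```python
# Illustrative sketch of a service as described above: clients interact
# only with views and operations, never with the stored data directly.
# The WeatherService and its schema are hypothetical.

class WeatherService:
    def __init__(self):
        # Hidden database layer: not exposed to clients
        self._db = {"Kiel": [(1, 8.5), (2, 9.1)]}   # city -> [(day, temp)]

    # Operation-equipped view layer: the service offers data AND functions
    def latest(self, city):
        """A view: the most recent reading for a city."""
        day, temp = max(self._db[city])
        return {"city": city, "day": day, "temp": temp}

    def record(self, city, day, temp):
        """An operation: updates the hidden layer through the service."""
        self._db.setdefault(city, []).append((day, temp))

svc = WeatherService()
svc.record("Kiel", 3, 7.8)
print(svc.latest("Kiel"))   # {'city': 'Kiel', 'day': 3, 'temp': 7.8}
```

Because the data layer is hidden, the same interface can front anything from a single function to a full web information system or data warehouse, which is exactly the generality the cloud specification method exploits.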
Web information systems should also support speech dialogues. Their workflow and supporting infrastructure can be
specified by storyboards. The integration of speech dialogues is however an unsolved issue due to the required flexibility,
the wide variety of responses, and the expected naturalness. Speech dialogues must be very flexible both in the recognition
of questions and in the generation of appropriate answers. We thus introduce a pattern-based approach to the specification
and utilization of speech dialogues. These patterns reflect the structure of the dialogue, since answers and responses within
a speech dialogue are instantiations or refinements of these patterns. It is possible to create patterns for common dialogue forms.
The results of this work show that only small adaptations regarding the storyboard concept are necessary and the extension
of the presentation layer with a channel-dependent renderer is sufficient to be able to model natural language dialogues.
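A dialogue pattern in this sense pairs a question template with an answer template, and a concrete dialogue turn instantiates a pattern. The following toy sketch illustrates the idea; the patterns, facts, and matching strategy are all hypothetical:

```python
import re

# Illustrative sketch of pattern-based speech dialogues: each pattern pairs
# a question template with an answer template. Patterns and facts invented.

PATTERNS = [
    (re.compile(r"when does (?P<what>.+) open"), "{what} opens at {opening}."),
    (re.compile(r"where is (?P<what>.+)"), "{what} is located at {location}."),
]

FACTS = {"the library": {"opening": "8 am", "location": "the main campus"}}

def answer(question):
    q = question.lower().rstrip("?!. ")
    for pattern, template in PATTERNS:
        m = pattern.match(q)
        if m:
            what = m.group("what")
            # Instantiate the answer template with the matched slot + facts
            return template.format(what=what, **FACTS.get(what, {}))
    return None   # no pattern matched

print(answer("When does the library open?"))  # the library opens at 8 am.
```

Because both recognition and generation are driven by the same pattern, adding a new dialogue form means adding one pattern rather than touching the storyboard, which is consistent with the finding that only small adaptations of the storyboard concept are needed.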
The design and reification of web information systems is a complex task, for which many integrated development methods
have been proposed. While all these methods ultimately lead to the construction of web pages, very little attention is
paid to the layout of these pages. Screenography developed in our group provides principles and rules for page layout
that originate from knowledge of visual perception and communication and then investigates how layout can support the
intentions associated with the WIS. This amounts to guidelines for partitioning pages and using layout objects, colour,
light, and texture to obtain rhythm, contrast, and perspective as the carriers of web page comprehension. We use a pattern
approach for the systematic development of layout and playout. These patterns can be combined into larger complex patterns.
Therefore, an algebra for pattern construction will be developed.
On a high level of abstraction the storyboard of a web information system specifies who will be using the system, in
what way, and for which goals. Storyboard pragmatics deals with the question as to what the storyboard means for its
users. One part of pragmatics is concerned with usage analysis by means of life cases, user models, and contexts. We also
addressed another part of pragmatics that complements usage analysis by WIS portfolios. These comprise two parts: the
information portfolio, and the utilization portfolio. The former is concerned with information consumed and produced by
the WIS users, which leads to content
chunks. The latter captures functionality requirements, which depend on the specific
category to which the WIS belongs.