M.P. Papazoglou, H.A. (Erik) Proper, and J. Yang. Landscaping the information space of large multi-database networks. In: Data & Knowledge Engineering, Nr: 3, Vol: 36, Pages: 251-281, 2001.
The promises of network-accessible information are increasingly difficult to achieve. These difficulties are due to a variety of causes, such as the rapid growth in the volume of network-available information and the increasing complexity, diversity and terminological fluctuations of the different information sources available.
This paper presents a conceptual architecture for the organisation of the information space across collections of component systems in multi-databases. The architecture provides serendipity, exploration and contextualisation support so that users can achieve logical connections between concepts they are familiar with and schema terms employed in multi-database systems. Large-scale searching for multi-database schema information is guided by a combination of lexical, structural and semantic aspects of schema terms in order to reveal more meaning both about the contents of a requested information term and about its placement within the distributed information space.
J.J. Sarbo, S.J.B.A. (Stijn) Hoppenbrouwers, and J.I. Farkas. Towards thought as a logical picture of signs. In: International Journal of Computing Anticipatory Systems, Vol: 8, Pages: 1-16, 2001.
We are concerned with the problem of summarising the content of a coherent text. In this paper we argue that complex units of symbols like sentences, for example, are signs and the meaning of a text arises via their interaction. We introduce a model for the generation of summaries and illustrate its potential by a realistic example.
B.C.M. Wondergem, P. van Bommel, and Th.P. van der Weide. Combining Boolean Logic and Linguistic Structure. In: Information & Software Technology, Nr: 43, Pages: 53-59, 2001.
Keywords still seem to form the basis for document content and query representation. Approaches that use more advanced linguistic structures, such as noun phrases, are still in an experimental phase. In addition, Boolean descriptor languages have often been applied for Information Retrieval. However, the synthesis of logic and linguistics in one descriptor language is still an open issue. In this paper, Boolean index expressions, combining Boolean logic and linguistic structure, are proposed as a good balance between expressiveness and practical issues. Boolean index expressions are obtained by augmenting regular index expressions with logical operators for disjunction, conjunction, and negation. Boolean index expressions are more expressive than both index expressions and the Boolean query language based on keywords. They allow a compact representation of logical combinations of index expressions. In addition, Boolean index expressions are still efficiently parsable and their meaning can be determined through their structure. It is shown how Boolean index expressions can be brought into normal form, allowing fast numerical matching. Matching strategies for Boolean index expressions are obtained by adapting matching strategies for index expressions and adding a case for negations. Our implementation of Boolean index expressions illustrates the issues mentioned.
ISP for Large-scale Migrations. Edited by: H.A. (Erik) Proper. Information Services Procurement Library, ten Hagen & Stam, Den Haag, The Netherlands, EU, 2001, ISBN 9076304882.
This book aims to provide insight into the procurement of projects dealing with large-scale migrations. Chapter 1 defines the scope of this book more precisely, by defining what is meant by a 'large-scale migration'.
An overview of the acquisition process for large-scale migrations is provided in chapter 2. The ensuing four chapters home in on specific aspects of the acquisition process. In chapter 3 we focus on the description of the initial and final states of projects. Chapter 4 is concerned with risk analysis in a migration context. It is based on an analysis of the factors that characterise the current situation, the potential risks associated with this situation and these factors, and the probability and impact of those risks. Mitigation of these risks in terms of actions and project strategies is discussed in chapter 5. Finally, chapter 6 is concerned with the identification of decision points to re-evaluate the progress of migration projects, the status of risks, and their mitigation.
A.T. Arampatzis. Adaptive and temporally-dependent document filtering. University of Nijmegen, 2001, ISBN 9090148914.
A.T. Arampatzis, and Th.P. van der Weide. Document Filtering as an Adaptive and Temporally-dependent Process. In: Proceedings of BCS-IRSG European Colloquium on IR Research (ECIR01), April, 2001.
The filtering task has traditionally been defined as a special case of the information retrieval task, and undeniably, it can be performed by applying retrieval techniques. This theoretical study summarizes our experiences in viewing filtering as an adaptive and temporally-dependent task: a task that, in contrast to traditional retrieval, takes into account the dynamic nature of the availability of information and its temporal location. We investigate the nature of information streams and user requests, and formulate some useful types of adaptivity. We discuss the effectiveness of different types of adaptivity in filtering environments where responsiveness is more important than convergence. To deal with this, we introduce the notion of the half-life of information. Moreover, we pay special attention to the implementational and efficiency aspects of incrementality.
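The abstract names the half-life of information without giving a formula. One common reading is exponential decay, where evidence loses half its weight every half-life; the sketch below assumes that reading. The function names and the (timestamp, value) representation are hypothetical, not taken from the paper.

```python
def decay_weight(age, half_life):
    """Weight of a piece of evidence that is `age` time units old."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (age / half_life)


def decayed_sum(observations, now, half_life):
    """Sum (timestamp, value) pairs, discounting older values more."""
    return sum(value * decay_weight(now - t, half_life)
               for t, value in observations)


# Two observations of equal value, one a full half-life older than the other.
print(decayed_sum([(0, 1.0), (10, 1.0)], now=10, half_life=10))
```

A short half-life makes a filter responsive to recent documents, while a long one favours convergence. This matches the trade-off the abstract mentions.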
A.T. Arampatzis, and A. van Hameren. The Score-Distributional Threshold Optimization for Adaptive Binary Classification Tasks. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September, 2001.
The thresholding of document scores or probabilities of relevance has proved critical for the effectiveness of classification tasks. We review the most important approaches to thresholding, and introduce the score-distributional (S-D) threshold optimization method. The method is based on score distributions and is capable of optimizing any effectiveness measure defined in terms of the traditional contingency table.
As a by-product, we provide a model for score distributions, and demonstrate its high accuracy in describing empirical data. The estimation method can be performed incrementally, a highly desirable feature for adaptive environments. Our work in modeling score distributions is useful beyond threshold optimization problems. It directly applies to other retrieval environments that make use of score distributions, e.g., distributed retrieval, or topic detection and tracking.
The most accurate version of S-D thresholding -- although incremental -- can be computationally heavy. Therefore, we also investigate more practical solutions. We suggest practical approximations and discuss adaptivity, threshold initialization, and incrementality issues. The practical version of S-D thresholding has been tested in the context of the TREC-9 Filtering Track and found to be very effective.
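To make concrete what optimizing a measure defined on the contingency table means, the sketch below sweeps candidate thresholds over a small labelled sample and keeps the one that maximizes F1. This is a plain empirical sweep, not the S-D method itself, which works from modelled score distributions. All names and the sample data are illustrative assumptions.

```python
def contingency(scores, labels, threshold):
    """Return (tp, fp, fn, tn) when documents scoring >= threshold are accepted."""
    tp = fp = fn = tn = 0
    for score, relevant in zip(scores, labels):
        if score >= threshold:
            if relevant:
                tp += 1
            else:
                fp += 1
        elif relevant:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1(tp, fp, fn, tn):
    """F1 expressed purely in contingency-table counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores, labels, measure=f1):
    """Try every observed score as a threshold and keep the best one."""
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: measure(*contingency(scores, labels, t)))


scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False]
print(best_threshold(scores, labels))
```

Any other measure that can be written as a function of the four counts can be passed as `measure` in place of `f1`.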
F.A. Grootjen. Indexing using a grammarless parser. In: 2001 IEEE International Conference on Systems, Man & Cybernetics (SMC2001), 2001, ISBN 0780370899.
This article proposes an alternative view on natural language parsing. Instead of looking for some predefined (phrase) structure, it takes inter-word relations as its starting point. The reason for this is twofold: firstly, it circumvents traditional parsing and linguistic problems, and secondly, it offers the possibility to extract information specifically needed by IR applications. The close relationship with index expressions opens the door to feedback mechanisms like 'Query By Navigation' and conceptual knowledge extraction. The presented ideas are accompanied by an implementation and a small-scale experiment.
H.A. (Erik) Proper, and Th.P. van der Weide. Information coverage - Incrementally satisfying a searcher's information need. In: Universal Access in HCI: Towards an Information Society for All, Edited by: C. Stephanidis. Pages: 719-722, August, Lawrence Erlbaum, Hillsdale, New Jersey, USA, 2001, ISBN 0805836098.
The Internet has become the virtual reality of mankind - a world that we shape without many of the imperfections of reality. We can jump to literally every place in no time and reach every resource anywhere anytime. In particular this last promise of information at your fingertips is under siege. The growing complexity of information space overwhelms the wired consumer and the vast increase in information is outpacing the improvement of retrieval tools.
J.J. Sarbo, and J.I. Farkas. A Peircean Ontology of Language. In: ICCS'2001, Stanford, California, USA, Edited by: H. Delugach, and G. Stumme. Lecture Notes in AI, Vol: 2120, Pages: 1-14, Springer, 2001.
Formal models of natural language often suffer from excessive complexity. A reason for this, we think, may be the underlying approach itself. In this paper we introduce a novel, semiotic-based model of language which provides us with a simple algorithm for language processing.
J.I. Farkas, and J.J. Sarbo. A Peircean Ontology of Semantics. Technical report: CSI-R0120, Nijmegen Institute for Information and Computing Sciences, University of Nijmegen, Nijmegen, The Netherlands, EU, 2001.
Peirce's semiotics can be effectively used for modeling different types of signs. In this paper it is argued that semantic signs, which are signs from the semantic point of view, are no exception. It turns out, however, that a proper modeling of semantic signs needs a better understanding of the concept of qualisigns, as well as of the relation between Peirce's categories and his theory of signs.
R.D.T. Janssen, H.A. (Erik) Proper, H. Bosma, D. Verhoef, and S.J.B.A. (Stijn) Hoppenbrouwers. Developing an Architecture Method Library. Technical report, January, Ordina Institute, Gouda, The Netherlands, EU, 2001.
Today, there are millions of professionals worldwide acting as a designer, architect or engineer in the design, realization, and implementation of information systems. At this moment there is no well established and clearly identified body of knowledge that defines their profession in a 'standard' way.
This article discusses a conceptual framework for architecture-driven information system development. Rather than defining a completely new framework, the conceptual framework is synthesized out of relevant pre-existing frameworks for system development and architecture.
Before discussing the actual framework, we briefly discuss the necessity for an architecture-driven approach to system development.
R.D.T. Janssen, and H.A. (Erik) Proper. A functionality taxonomy for document search engines. Technical report, June, Ordina Institute, Gouda, The Netherlands, EU, 2001.
In this paper a functionality taxonomy for document search engines is proposed. It can be used to assess the features of a search engine, to position search engines relative to each other, or to select which search engine 'fits' a certain situation. It also helps to identify areas for improvement. During development, we were guided by the viewpoint of the user. We use the term 'search engine' in the broadest sense possible, including library and web-based (meta) search engines.
The taxonomy distinguishes seven functionality areas: an indexing service, user profiling, query composition, query execution, result presentation, result refinement, and history keeping. Each of these relates and provides services to other functionality areas. It can be extended whenever necessary.
To illustrate the validity of our taxonomy, it has been used for comparing various document search engines existing today (ACM Digital Library, PiCarta, Copernic, AltaVista, Google, and GuideBeam). It appears that the functionality aspects covered by our taxonomy can be used for describing these search engines.
J.J. Sarbo, and J.I. Farkas. A linearly complex model for knowledge representation. Technical report: CSI-R0121, Nijmegen Institute for Information and Computing Sciences, University of Nijmegen, Nijmegen, The Netherlands, EU, 2001.
We present two results which complete and extend our Peircean semiotic model of signs introduced earlier. The first result is concerned with the potential of that model for the representation of knowledge in general. The second one formally proves that such a model can be linearly complex.