Last Call for Papers - INSTANCE SELECTION
A Special Issue of the Data Mining and Knowledge Discovery Journal
http://www.comp.nus.edu.sg/~liuh/dmkd.html
Due date: 18 Sept 1999, electronic submission

INTRODUCTION

Knowledge discovery and data mining (KDD) is growing rapidly as computer technologies advance. However, no matter how powerful computers are now or will be in the future, KDD researchers and practitioners must consider how to manage ever-growing data, which is, ironically, a product of the extensive use of computers and the ease of collecting data with them. Many different approaches have been used to address the data explosion issue. Algorithm scale-up is one and data reduction is another.

Instance, example, or tuple selection concerns algorithms that select or search for a representative portion of data that can fulfill a KDD task as if the whole data set were used. Instance selection is directly related to data reduction and is becoming increasingly important in many KDD applications due to the need for processing efficiency and/or storage efficiency.

One of the major means of instance selection is sampling, whereby a sample is selected for testing and analysis, and randomness is a key element in the process. Instance selection also covers other methods that require search. Examples can be found in density estimation (finding the representative instances, or data points, for each cluster) and boundary hunting (finding the critical instances that form boundaries to differentiate data points of different classes). Other important issues related to instance selection extend to unwanted precision, focusing, concept drifts, noise/outlier removal, data smoothing, etc.

OBJECTIVES

This special issue on instance selection brings researchers and practitioners together to report new developments and applications, share hard-learned experiences to avoid similar pitfalls, and shed light on the future development of instance selection.
Several critical questions are of interest to KDD practitioners and are practically useful in real-life applications:

* What are the existing methods?
* Are they the same or just different names coined by researchers in different fields?
* Are they application dependent or stand-alone?
* Are new methods needed?
* If there is no generic selection algorithm, are these algorithms specific to tasks such as classification, clustering, association, parallelization?
* Are there common and reusable components in instance selection methods?
* How can we reconfigure some components of instance selection for a particular task/application?
* What are the new challenging issues of instance selection in the context of KDD?

Sensible answers to these questions can greatly advance the field of KDD in handling large databases. This special issue hopes to answer these questions and to provide an easy reference point for further research and development.

COVERAGE

All aspects of instance selection will be considered: theories, methodologies, algorithms, and applications. Also of interest are issues such as the costs of selection, the gains and losses due to selection, how to balance the gains and losses, and when to use which method. Researchers and practitioners in KDD-related fields (Statistics, Databases, Machine Learning, etc.) are encouraged to submit their work to this special issue to share and exchange ideas and problems in any form: survey, research manuscript, experimental comparison, theoretical study, or report on applications.

IMPORTANT DATES

18 September, 1999 - Submissions due
15 November, 1999 - Reviews due (mainly peer review; the guest editors will review all submissions)
22 January, 2000 - Revised papers due
13 February, 2000 - To Editor-in-Chief

FORMAT and PAGE LIMIT

Each submission should be no more than 25 pages, have a line spacing of 1.5, use a font no smaller than 12pt, and have a margin of at least 1 inch on each side.

CONTACT INFORMATION

Please direct any enquiries to the guest editors:

Huan Liu, liuh@comp.nus.edu.sg, National University of Singapore
Hiroshi Motoda, motoda@sanken.osaka-u.ac.jp, Osaka University, Japan

Please submit your work electronically (postscript file) to either guest editor. If you have to submit it in hard copy, please discuss it with the guest editors first.

INFORMATION about the JOURNAL

Data Mining and Knowledge Discovery, Kluwer Academic Publishers.
http://www.wkap.nl/journalhome.htm/1384-5810
Editors-in-Chief: Usama Fayyad, Gregory Piatetsky-Shapiro, Heikki Mannila.