
Yuni Xia

Assistant Professor
Department of Computer Science
Indiana University - Purdue University Indianapolis


Address:  723 W. Michigan St, SL280E, Indianapolis, IN 46202, U.S.A.
Phone:     (317) 274-9738
Fax:         (317) 274-9742
Email :      email


   Bio    Research    Publications    Teaching    Services    Awards   Personal


News

I am looking for PhD students in data mining and databases.  Please contact me with your resume if interested.

SIGMOD/PODS Volunteers Needed



Bio

Yuni Xia is an Assistant Professor in the Computer and Information Science Department at Indiana University - Purdue University Indianapolis (IUPUI). She received her B.S. in Computer Science from Huazhong University of Science and Technology in China in 1996, and her M.S. and Ph.D. in Computer Science from Purdue University in 2002 and 2005, respectively. She has worked as a research intern at the IBM T.J. Watson Research Center.

Xia's research is on data mining and databases, focusing on the mining and management of uncertain data and constantly evolving data such as sensor data and moving object data. She also works on data storage, retrieval, management and mining in data-intensive applications, and on managing uncertainty in the decision support process. Her research is supported by the National Science Foundation, the State of Indiana and IBM. She received the IBM Real Time Innovation Award in 2008.


Research

        Data Mining: Uncertain Data Mining, Data Stream Mining
        Databases: Constantly Evolving Data Management, Sensor and Moving Object Databases, Data Uncertainty Management
        Medical Informatics, Bioinformatics, Microarray Data Mining

    Research Projects:

1. TrafficAnalyzer: A Real-time Traffic Stream Processing and Analyzing System, Supported by IBM, PI.
Modern traffic monitoring systems are required to process and analyze petabit-scale continuous data streams in real time. In this project, we propose to design and develop a real-time traffic stream processing and analysis system. The most important feature of TrafficAnalyzer is its real-time performance. Results need to be produced with virtually zero latency, because in a traffic monitoring system a real-time response is crucial for reducing accident rates and smoothing traffic flow. TrafficAnalyzer must support sophisticated time-windowed processing operations, since streaming data changes continually, often at high rates. These operations should produce results incrementally as new data arrives, since the complete data set is never available at once. TrafficAnalyzer also manages historical data carefully, as it needs to compare and combine “present” data with the “past” to study how traffic flow changes over time. TrafficAnalyzer is also resilient to inaccuracy and uncertainty in the data streams, because inherent variations, losses, or reordering can cause data to arrive out of order or with variable delays.

2. DisProt Database: A Central Repository of Information on Intrinsically Disordered Proteins, Supported by NSF, Co-PI,  2009-2012
The goal of this project is to fully develop DisProt, a database that provides an essential repository of information about intrinsically disordered proteins (IDPs). DisProt will be not only a collection of data on intrinsically disordered proteins and their functions, but also a unique research tool for conducting computational studies on these proteins and for helping design better research strategies for studying individual IDPs in the laboratory. DisProt is expected to see widespread use, both for carrying out bioinformatics experiments and across the wider community working to understand cell and molecular biology.

3. Development of SYMBIOTE: A Reconfigurable Logic Assisted Data Stream Management System for Multimedia Sensor Networks, Supported by NSF, Co-PI,  2008-2010
Numerous emerging applications require real-time processing of high-bandwidth multimedia data streams. In this project, we propose a novel class of data stream management systems called Reconfigurable Logic Assisted DSMS (RLADSMS). The prototype RLADSMS, called SYMBIOTE, will provide one of the first comprehensive demonstrations of using Reconfigurable Logic coprocessors as data stream accelerators. This project will investigate key issues for this new class of data stream management systems: data models, query languages, hardware DSMS operators, and the corresponding cost models of query execution, which consider the hardware complexity of database operators, the run-time complexity of hardware and software operators, interconnect latencies, bandwidth, and resource allocation, as well as optimization techniques.

4. Invention of a Consumer-Side Geriatric Health Care Knowledge Management and Decision Support System, Supported by 21st Century Research and Development Fund,  State of Indiana, Co-PI,  2008-2010
This project proposes to build an innovative knowledge management system that is unique in the geriatric care management industry. The system will accelerate the adoption of standards of care and accumulate knowledge from current social science, psychology, and health disciplines. It will also build a basis, comparable to the health care industry model, for evidence-based outcomes validation.

5. Innovative Anomaly Detection and Diagnosis for Aviation Safety, Supported by IUPUI Internal Grant, PI, 2008-2009
This project proposes to design and develop innovative data mining techniques and apply them to massive aviation data to automatically detect operationally significant anomalous events or trends. The goal is to discover failure precursors before the system or one of its components fails. The results of such analyses can support both strategic and tactical decisions. We will develop efficient tools for anomaly detection and diagnosis, and build an open platform and framework to accommodate various algorithms and data models. We will study various algorithms for anomaly detection, such as time series analysis, clustering-based anomaly detection, neural networks and dynamic Bayesian networks.
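
As a rough illustration of the time series style of anomaly detection mentioned in project 5, the following Python sketch flags readings that deviate strongly from a sliding window of recent values. It is not the project's implementation. The function name detect_anomalies, the window and threshold parameters, and the sample readings are all assumptions made for illustration.

```python
from collections import deque
from statistics import fmean, pstdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Return (index, value) pairs for readings that lie more than
    `threshold` standard deviations from the mean of the previous
    `window` readings. Hypothetical helper, for illustration only."""
    recent = deque(maxlen=window)  # holds only the last `window` readings
    anomalies = []
    for index, value in enumerate(readings):
        # Only score a reading once a full window of history exists.
        if len(recent) == window:
            mean = fmean(recent)
            std = pstdev(recent)
            # Skip scoring when the window is constant (std == 0).
            if std > 0 and abs(value - mean) > threshold * std:
                anomalies.append((index, value))
        # The reading joins the window whether or not it was flagged.
        recent.append(value)
    return anomalies


if __name__ == "__main__":
    # Made-up sensor readings with one obvious spike.
    sample = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 25.0, 10.0, 9.9]
    print(detect_anomalies(sample))
```

In this sketch a flagged reading still enters the window, so a spike temporarily inflates the standard deviation and can mask nearby anomalies. A more careful detector might exclude flagged readings from the window or use robust statistics such as the median.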


Publications

  1. Jiaqi Ge, Yuni Xia, Chandima Nadungodage, "Classify Uncertain Data with Neural Network", the 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2010.
  2. Biao Qin, Yuni Xia, Fang Li, "A Bayesian Classifier for Uncertain Data", the 25th ACM Symposium on Applied Computing (SAC), 2010. (Acceptance Rate: 25%)
  3. Biao Qin, Yuni Xia, Rakesh Sathyesh, Sunil Prabhakar, Yicheng Tu, "uRule: A Rule Based Classifier for Data with Uncertainty", Demo, the IEEE International Conference on Data Mining (ICDM), 2009.
  4. Sandeep Raghuram, Yuni Xia, Mathew Palakal, Josette Jones, Dave Pecenka, Eric Tinsley, Jean Bandos, and Jerry Geesaman, "Bridging Text Mining and Bayesian Networks", Proc. of the Workshop on Intelligent Biomedical Information Systems (IBIS), 2009.
  5. Biao Qin, Yuni Xia, Fang Li, "DTU: A Decision Tree for Classifying Uncertain Data", the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2009. (Acceptance Rate: 11.5%)
  6. Biao Qin, Yuni Xia, Sunil Prabhakar, Yicheng Tu, "A Rule-Based Classification Algorithm for Uncertain Data", the IEEE Workshop on Management and Mining of Uncertain Data (MOUND), in conjunction with the International Conference on Data Engineering, 2009.
  7. Jiangang Liu, Andrew Campen, Shuguang Huang, Sheng-Bin Peng, Xiang Ye, Mathew Palakal, A. Keith Dunker, Yuni Xia and Shuyu Li, "Identification of a gene signature in cell cycle pathway for breast cancer prognosis using gene expression profiling data", BMC Medical Genomics, 2008, 1:39.
  8. Biao Qin, Yuni Xia, "Generating Efficient Safe Query Plans for Probabilistic Databases", Journal of Data and Knowledge Engineering (DKE), Volume 67, Issue 3, Pages 485-503, 2008.
  9. Yuni Xia, Sunil Prabhakar, Shan Lei, Reynold Cheng and Rahul Shah, "Indexing Continuously Changing Data with Mean Variance Tree", International Journal of High Performance Computing and Networking, Vol. 5, No. 4, pages 263-272, 2008.
  10. Andrew Campen, Yuni Xia, Dan Rigsby, Ying Guo, Xingdong Feng, Eric Su, Mathew Palakal and Shuyu Li, "Mining Gene Expression Database for Primary Human Disease Tissues", Demo, the IEEE 24th International Conference on Data Engineering (ICDE), 2008.
  11. Yuni Xia, Bowei Xi, "Conceptual Clustering Categorical Data with Uncertainty", the IEEE 19th International Conference on Tools with Artificial Intelligence (ICTAI), Patras, Greece, 2007. (Acceptance Rate: 28%)
  12. Yuni Xia, Andrew Campen, Dan Rigsby, Ying Guo, Xingdong Feng, Eric Su, Mathew Palakal, Shuyu Li, "DGEM - a Microarray Gene Expression Database for Primary Human Disease Tissues", Molecular Diagnosis and Therapy, Issue 3, 2007.
  13. Yuni Xia, Yicheng Tu, Mikhail Atallah, Sunil Prabhakar, "Reducing Data Redundancy in Location-based Services", the International Conference on Geosensor Networks (GeoSensor 2006), pp. 30-35, Boston, USA, 2006.
  14. Reynold Cheng, Sarvjeet Singh, Sunil Prabhakar, Rahul Shah, Jeffrey Scott Vitter, Yuni Xia, "Efficient Join Processing over Uncertain Data", the ACM 15th Conference on Information and Knowledge Management (ACM CIKM 2006), pp. 738-747, Arlington, USA, 2006. (Acceptance Rate: 15%)
  15. Yicheng Tu, Mohamed Hefeeda, Yuni Xia, Sunil Prabhakar, Song Liu, "Control-Based Quality Adaptation in Data Stream Management Systems", the International Conference on Database and Expert Systems Applications (DEXA), pp. 746-755, Copenhagen, Denmark, 2005. (Acceptance Rate: 23%)
  16. Yuni Xia, Sunil Prabhakar, Shan Lei, Reynold Cheng, Rahul Shah, "Indexing Continuously Changing Data with Mean Variance Tree", the 20th ACM Symposium on Applied Computing (SAC), pp. 1125-1132, Santa Fe, New Mexico, USA, 2005. (Acceptance Rate: 30%)
  17. Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, "Change Tolerant Indexing for Constantly Evolving Data", the International Conference on Data Engineering (ICDE), pp. 391-402, Tokyo, Japan, 2005. (Acceptance Rate: 13%)
  18. Yuni Xia, Sunil Prabhakar, Jiangzhong Sun, Shan Lei, "Indexing and Query Constantly Evolving Data Using Time Series Analysis", the 10th International Conference on Database Systems for Advanced Applications (DASFAA), pp. 637-648, Beijing, China, 2005. (Acceptance Rate: 22%)
  19. Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, Jeffrey Scott Vitter, "Efficient Indexing Methods for Probabilistic Threshold Queries over Uncertain Data", the 30th International Conference on Very Large Data Bases (VLDB), pp. 876-887, Toronto, Canada, 2004. (Acceptance Rate: 16%)
  20. Yuni Xia, Sunil Prabhakar, "Efficient VNG Indexing in Location-aware Services", the International Workshop on Mobile and Distributed Computing (MDC), pp. 414-419, Providence, Rhode Island, USA, 2003.
  21. Yuni Xia, Sunil Prabhakar, "Q+Rtree: Efficient Indexing for Moving Object Databases", the 8th International Conference on Database Systems for Advanced Applications (DASFAA), pp. 175-182, Kyoto, Japan, 2003. (Acceptance Rate: 25%)
  22. Sunil Prabhakar, Yuni Xia, Dmitri Kalashnikov, Walid Aref, Susanne Hambrusch, "Query Indexing and Velocity Constrained Indexing: Scalable Techniques for Continuous Queries on Moving Objects", IEEE Transactions on Computers, Vol. 51, No. 10, pp. 1124-1140, 2002.

Book Chapters

  1. Yuni Xia, Jonathon Munson, David Wood, Alan Cole, "Location-based Service System (LBS) Analysis and Design", Handbook of Research on Modern Systems Analysis and Design Technologies and Applications, ISBN: 978-1-59904-887-1, 698 pp, 2008.
  2. Meeta Pradhan and Yuni Xia, "Bioterrorism and Biosecurity", Handbook of Research on Information Security and Assurance, ISBN: 978-1-59904-855-0, 586 pp, 2008.
  3. Sunil Prabhakar, Dmitri V. Kalashnikov, and Yuni Xia, "Query Indexing and Velocity Constrained Indexing", Encyclopedia of GIS, Springer Science, 2008.


Teaching


Please log into Oncourse for lecture notes, readings, assignments, projects, etc.

CSCI590: Data Mining, Fall 2009, Fall 2008, Fall 2006
CSCI541: Database Management Systems, Spring 2009, Spring 2008, Spring 2007, Spring 2006
CSCI441: Client Server Databases, Spring 2007
CSCI481: Data Mining, Spring 2006
CSCI590: Advanced Database Systems, Fall 2005


Professional Services

Program Committee:
        The IEEE International Conference on Computer and Information Technology (CIT), 2010
        The International Conference on Frontier Computing (FC), 2010
        The IEEE 12th International Conference on Computational Science and Engineering (CSE), 2009
        The International Workshop on Smart Homes for Tele-Health (SmarTel), 2009
        The International Workshop on Information Fusion and Dissemination in Wireless Sensor Networks (SensorFusion), 2009
        The International Conference on Intelligent Pervasive Computing (IPC), 2008
        The IEEE 11th International Conference on Computational Science and Engineering (CSE), 2008
        The IEEE 21st International Conference on Advanced Information Networking and Applications (AINA), 2007
        The IEEE/ACS 5th International Conference on Computer Systems and Applications (AICCSA), 2007
        The Third International Conference on Intelligent Environments (IE), 2007
        The International Workshop on Information Fusion and Dissemination in Wireless Sensor Networks (SensorFusion), 2007
        The International Workshop on Knowledge Management and Discovery for Ubiquitous and Pervasive Applications (KUPA), 2007

Local Chair:
       ACM SIGMOD/PODS Conference, 2010

Journal Review:
        IEEE Transactions on Knowledge and Data Engineering
        IEEE Transactions on Parallel and Distributed Systems
        Information Systems
        The International Journal of Telemedicine and Applications
        The Information Fusion Journal
        The Journal of Systems and Software
        The International Journal of Data Mining and Bioinformatics
        The Electronics and Telecommunication Research Institute Journal
        The International Journal of Computer Science and Technology
        The Journal of Ubiquitous Computing and Intelligence

Panelist:
        Panelist, National Science Foundation, CISE, 2007, 2009

Program Co-Chair:
        The ACM workshop on Health Information and Knowledge Management (HIKM) 2006
 

Awards

         Trustees Teaching Award, IUPUI, 2009
         IBM Real Time Innovation Award, 2008
         TechPoint MIRA Award, with Purdue University Knowledge Projection Team, 2005
         Leading Light Award / Ice Miller Graduate Student Scholarship, 2004
         IBM Grace Hopper /Anita Borg Scholarship, 2004
         Excellent Graduate Student Scholarship, Huazhong Univ. of Science and Technology, 1998
         Outstanding Graduate Award, Huazhong University of Science and Technology, 1996
       

