
Yuni Xia
Bio
Yuni Xia is an Assistant Professor of the Computer
and
Information Science Department at Indiana University - Purdue
University
Indianapolis (IUPUI). She received her B.S. in Computer
Science from Huazhong
University of Science and Technology in China in 1996, and her M.S. and
Ph.D. in Computer
Science from Purdue University in 2002 and 2005, respectively. She has worked as a
research intern at the IBM T.J.
Watson Research Center.
Xia's research is on data mining and databases,
focusing on
mining and management of uncertain data and constantly evolving data
such as
sensor data and moving object data. She also works on data storage,
retrieval, management and mining in data-intensive applications, and
managing uncertainty in
the decision support process. Her research
is supported by the National Science Foundation, the State of Indiana, and IBM.
She received the IBM Real Time Innovation Award in 2008.
Research
Data Mining: Uncertain Data
Mining, Data Stream Mining
Databases: Constantly Evolving
Data Management, Sensor and Moving Object Databases, Data Uncertainty
Management
Medical Informatics,
Bioinformatics, Microarray
Data Mining
Research Projects:
1.
TrafficAnalyzer:
A Real-time Traffic Stream Processing and Analyzing System, Supported
by IBM, PI.
Modern traffic monitoring systems are
required to perform real-time processing and analysis of petabit
continuous data streams. In this project, we propose to design and
develop a real-time traffic stream processing and analyzing
system. The most important feature of TrafficAnalyzer is its
real-time performance. Results need to be produced
with virtually zero latency, because in a traffic monitoring system,
real-time response is crucial for reducing accident rates and
smoothing traffic flow. TrafficAnalyzer must support sophisticated
time-windowed processing operations, since streaming data changes
continually, often at high rates. These operations should be executed in a
way that produces results incrementally as new data arrives, since the
complete data set is never available at once.
TrafficAnalyzer also provides careful management of historical
data, as it needs to compare and combine “present” data with the “past” to
study how traffic flow changes over time. TrafficAnalyzer is also
designed to be resilient to inaccuracy and uncertainty in the data streams, because
inherent variations, losses, or reordering can cause
data to arrive out of order or with variable delays.
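The incremental, time-windowed processing described above can be illustrated with a minimal sketch. It is not TrafficAnalyzer's implementation: the class name `SlidingWindowAverage`, the choice of a mean-speed aggregate, and the assumption that readings arrive in timestamp order are all hypothetical simplifications made for illustration.

```python
from collections import deque


class SlidingWindowAverage:
    """Mean of the readings seen in the last `window_seconds` (hypothetical example)."""

    def __init__(self, window_seconds: float) -> None:
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self._readings = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0         # running sum of the values in the window

    def add(self, timestamp: float, value: float) -> float:
        """Add one reading and return the updated window mean.

        Assumes timestamps are non-decreasing; out-of-order arrivals
        would need extra buffering that this sketch does not include.
        """
        self._readings.append((timestamp, value))
        self._total += value
        # Evict readings that have fallen out of the time window.
        cutoff = timestamp - self.window_seconds
        while self._readings[0][0] <= cutoff:
            _, old_value = self._readings.popleft()
            self._total -= old_value
        return self._total / len(self._readings)


if __name__ == "__main__":
    speeds = SlidingWindowAverage(window_seconds=10)
    for t, v in [(0, 60.0), (5, 40.0), (12, 20.0)]:
        print(t, speeds.add(t, v))
```

Each call updates the running sum and evicts expired readings rather than rescanning the stream, so a result is available as soon as a new reading arrives.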
2.
DisProt Database: A Central
Repository of Information on Intrinsically Disordered Proteins,
Supported by NSF, Co-PI,
2009-2012
The goal of this project is to fully
develop DisProt, a database that serves as an essential repository of
information about intrinsically disordered proteins (IDPs). DisProt
will be not only a collection of data on intrinsically disordered
proteins and their functions, but also a unique research tool for
conducting computational studies on these proteins and for helping to
design better research strategies for studying individual IDPs in
the laboratory. DisProt is expected to see widespread
use, both for carrying out bioinformatics experiments
and by the entire community involved in understanding cell and
molecular biology.
3. Development
of SYMBIOTE: A Reconfigurable Logic Assisted Data Stream Management
System for Multimedia Sensor Networks, Supported by NSF, Co-PI, 2008-2010
Numerous emerging applications require
real-time processing of high-bandwidth multimedia data streams. In this
project, we propose a novel class of data stream management systems
called Reconfigurable Logic Assisted DSMS (RLADSMS). The prototype
RLADSMS, called SYMBIOTE, will provide one of the first comprehensive
demonstrations of using reconfigurable logic coprocessors as data stream
accelerators. This project will investigate key
issues such as data models, query languages, hardware DSMS operators, and
corresponding cost models of query execution that consider the hardware
complexity of database operators, the run-time complexity of hardware and
software operators, interconnect latencies, bandwidth, and resource
allocation, as well as optimization techniques for this new class of
data stream management systems.
4.
Invention of a Consumer-Side
Geriatric Health Care Knowledge Management and Decision Support System,
Supported by 21st Century Research and Development Fund, State of
Indiana, Co-PI, 2008-2010
This project proposes to build an
innovative Knowledge Management system unique in the Geriatric Care
Management Industry. This system will accelerate the adoption of
standards of care and provide the accumulation of knowledge from
current Social Science, Psychology, and Health disciplines. It will
also build a basis, comparable to the Health Care Industry model, for
evidence-based outcomes validation.
5.
Innovative Anomaly Detection and
Diagnosis for Aviation Safety, Supported by IUPUI Internal Grant, PI, 2008-2009
This project proposes to design and
develop innovative data mining techniques for massive aviation data
that automatically detect operationally significant anomalous events or
trends. The goal is to discover failure precursors before the
system or one of its components fails. The results of such analyses
can support strategic as well as tactical decisions. We
will develop efficient tools for anomaly detection and diagnosis, and
build an open platform and framework to accommodate various algorithms
and data models. We will study various algorithms for anomaly
detection, such as time series analysis, clustering-based anomaly
detection, neural networks, and dynamic Bayesian networks.
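As a rough illustration of the time series flavor of anomaly detection mentioned above, the sketch below flags readings that deviate sharply from the recent past. The rolling z-score rule, the function name `zscore_anomalies`, and the `window` and `threshold` parameters are assumptions chosen for this example; the project description does not specify the algorithms actually used.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of values that lie more than `threshold` standard
    deviations from the mean of the `window` values preceding them.

    Hypothetical example: a rolling z-score is one simple stand-in
    for time series anomaly detection, not the project's method.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # A flat history gives sigma == 0, so no score can be computed; skip it.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Made-up sensor readings with one obvious spike at index 6.
    readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 25.0, 10.1]
    print(zscore_anomalies(readings))
```

Because each score uses only the preceding window, a reading can be assessed as soon as it arrives, which fits the goal of spotting precursors before a failure occurs.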
Publications
- Jiaqi Ge, Yuni Xia, Chandima Nadungodage, "Classify Uncertain Data
with Neural Network", the 14th Pacific-Asia Conference on
Knowledge Discovery and Data Mining (PAKDD), 2010.
- Biao Qin, Yuni Xia, Fang Li, "A Bayesian Classifier for Uncertain
Data", the 25th ACM Symposium on Applied Computing (SAC),
2010. (Acceptance Rate: 25%)
- Biao Qin, Yuni Xia, Rakesh Sathyesh, Sunil Prabhakar, Yicheng Tu,
"uRule: A Rule Based Classifier for Data with Uncertainty", Demo,
the IEEE International
Conference on Data Mining (ICDM), 2009.
- Sandeep Raghuram, Yuni Xia, Mathew Palakal, Josette Jones, Dave
Pecenka, Eric Tinsley, Jean Bandos, and Jerry Geesaman. "Bridging Text
Mining and Bayesian Networks", Proc. of the Workshop on Intelligent
Biomedical Information Systems (IBIS), 2009.
- Biao Qin, Yuni Xia, Fang Li, "DTU: A Decision Tree for
Classifying Uncertain Data", the Pacific-Asia Conference on Knowledge
Discovery and Data Mining (PAKDD), 2009. (Acceptance Rate: 11.5%)
- Biao Qin, Yuni Xia, Sunil Prabhakar, Yicheng Tu, "A Rule-Based
Classification Algorithm for Uncertain Data", the IEEE Workshop on
Management and Mining of Uncertain Data (MOUND), in conjunction with
the International Conference on Data Engineering, 2009.
- Jiangang Liu, Andrew Campen, Shuguang Huang, Sheng-Bin Peng,
Xiang Ye, Mathew Palakal, A. Keith Dunker, Yuni Xia and Shuyu Li,
"Identification of a gene signature in cell cycle pathway for breast
cancer prognosis using gene expression profiling data", BMC Medical
Genomics, 2008, 1:39.
- Biao Qin, Yuni Xia, "Generating Efficient Safe Query Plans
for Probabilistic Databases", Journal of Data and Knowledge Engineering
(DKE), Volume 67, Issue 3, Pages 485-503, 2008.
- Yuni Xia, Sunil Prabhakar, Shan Lei, Reynold Cheng and Rahul
Shah, "Indexing Continuously Changing Data with Mean Variance Tree",
International Journal of High Performance
Computing and Networking, Vol. 5, No. 4, pages 263-272, 2008.
- Andrew Campen, Yuni Xia, Dan Rigsby, Ying Guo, Xingdong Feng,
Eric Su, Mathew Palakal and Shuyu Li, "Mining Gene Expression Database
for Primary Human Disease Tissues", Demo, the IEEE 24th International
Conference on Data Engineering (ICDE), 2008.
- Yuni Xia, Bowei Xi, "Conceptual Clustering Categorical Data with
Uncertainty", the IEEE 19th International Conference on Tools with
Artificial Intelligence (ICTAI), Patras, Greece, 2007. (Acceptance Rate: 28%)
- Yuni Xia, Andrew Campen, Dan Rigsby, Ying Guo, Xingdong Feng,
Eric Su, Mathew Palakal, Shuyu Li, "DGEM - a Microarray Gene Expression
Database for Primary Human Disease Tissues", Molecular Diagnosis and
Therapy, Issue 3, 2007.
- Yuni Xia, Yicheng Tu, Mikhail Atallah, Sunil Prabhakar, "Reducing
Data Redundancy in Location-based Services", the International
Conference on Geosensor Networks (GeoSensor 2006), pp. 30-35, Boston,
USA, 2006.
- Reynold Cheng, Sarvjeet Singh, Sunil Prabhakar, Rahul Shah,
Jeffrey Scott Vitter, Yuni Xia, "Efficient Join Processing over
Uncertain Data", the ACM 15th Conference on Information and Knowledge
Management (ACM CIKM 2006), pp. 738-747, Arlington, USA, 2006.
(Acceptance Rate: 15%)
- Yicheng Tu, Mohamed Hefeeda, Yuni Xia, Sunil Prabhakar, Song Liu,
"Control-Based Quality Adaptation in Data Stream Management Systems",
the International Conference on Database and Expert Systems
Applications (DEXA), pp. 746 - 755, Copenhagen, Denmark, 2005.
(Acceptance Rate: 23%)
- Yuni Xia, Sunil Prabhakar, Shan Lei, Reynold Cheng, Rahul Shah,
"Indexing Continuously Changing Data with Mean Variance Tree", the 20th
ACM Symposium on Applied Computing (SAC), pp. 1125 - 1132, Santa Fe,
New Mexico, USA, 2005. (Acceptance Rate: 30%)
- Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, "Change
Tolerant Indexing for Constantly Evolving Data", the International
Conference on Data Engineering (ICDE), pp. 391-402, Tokyo, Japan,
2005. (Acceptance Rate: 13%)
- Yuni Xia, Sunil Prabhakar, Jiangzhong Sun, Shan Lei, "Indexing and
Query Constantly Evolving Data Using Time Series Analysis", the 10th
International Conference on Database Systems for Advanced Applications
(DASFAA), pp.637-648, Beijing, China 2005. (Acceptance Rate: 22%)
- Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, Jeffrey
Scott Vitter, "Efficient Indexing Methods for Probabilistic Threshold
Queries over Uncertain Data", the 30th International Conference on Very
Large Data Bases (VLDB), pp. 876 - 887, Toronto, Canada, 2004. (Acceptance
Rate: 16%)
- Yuni Xia, Sunil Prabhakar, "Efficient VNG Indexing in
Location-aware Services", the International Workshop on Mobile and
Distributed Computing (MDC), pp.414 - 419, Providence, Rhode Island,
USA, 2003.
- Yuni Xia, Sunil Prabhakar, "Q+Rtree: Efficient Indexing for Moving
Object Databases", the 8th International Conference on Database Systems
for Advanced Applications (DASFAA), pp.175 - 182, Kyoto, Japan, 2003.
(Acceptance Rate: 25%)
- Sunil Prabhakar, Yuni Xia, Dmitri Kalashnikov, Walid Aref,
Susanne Hambrusch, "Query Indexing and Velocity Constrained Indexing:
Scalable Techniques for Continuous Queries on Moving Objects", IEEE
Transactions on Computers, Vol.51, No.10, pp.1124 - 1140, 2002.
Book Chapters
- Yuni Xia, Jonathon Munson, David Wood, Alan Cole, "Location-based
Service System (LBS) Analysis and Design", Handbook of Research
on Modern Systems Analysis and Design Technologies
and Applications, ISBN: 978-1-59904-887-1; 698 pp, 2008.
- Meeta Pradhan and Yuni Xia, "Bioterrorism and Biosecurity",
Handbook of Research on Information Security
and Assurance, ISBN: 978-1-59904-855-0, 586 pp, 2008.
- Sunil Prabhakar, Dmitri V. Kalashnikov, and Yuni Xia, "Query
Indexing and Velocity Constrained Indexing", Encyclopedia of GIS,
Springer Science, 2008.
Teaching
Please log into
Oncourse for lecture
notes, readings, assignments, projects, etc.
CSCI590: Data Mining, Fall 2009, Fall 2008, Fall 2006
CSCI541: Database Management Systems, Spring 2009, Spring
2008, Spring 2007, Spring 2006
CSCI441: Client Server Databases,
Spring 2007
CSCI481: Data Mining, Spring 2006
CSCI590: Advanced Database Systems, Fall 2005
Professional Services
Program Committee:
The IEEE International
Conference on Computer and Information Technology (CIT), 2010
The International Conference on
Frontier Computing (FC), 2010
The IEEE 12th International
Conference on Computational Science and Engineering (CSE), 2009
The International Workshop on
Smart Homes for Tele-Health (SmarTel), 2009
The International Workshop on
Information Fusion and Dissemination in Wireless Sensor Networks
(SensorFusion), 2009
The International Conference on
Intelligent Pervasive Computing (IPC), 2008
The IEEE 11th International
Conference on Computational Science and Engineering (CSE), 2008
The IEEE 21st International
Conference on Advanced Information Networking and Applications (AINA),
2007
The IEEE/ACS 5th
International Conference on Computer Systems and Applications (AICCSA),
2007
The Third International
Conference on Intelligent Environments (IE), 2007
The International
Workshop on Information Fusion and Dissemination in Wireless Sensor
Networks (SensorFusion), 2007
The International Workshop
on Knowledge Management and Discovery for Ubiquitous and Pervasive
Applications (KUPA), 2007
Local Chair:
ACM SIGMOD/PODS Conference, 2010
Journal Review:
IEEE Transactions on Knowledge and
Data Engineering
IEEE Transactions on Parallel and
Distributed Systems
Information Systems
The International Journal of
Telemedicine and Applications
The Information Fusion
Journal
The Journal of Systems and Software
The International Journal of
Data Mining and Bioinformatics
The Electronics and
Telecommunication Research Institute Journal
The International Journal of
Computer Science and Technology
The Journal of Ubiquitous
Computing and Intelligence
Panelist:
Panelist, National Science
Foundation, CISE, 2007, 2009
Program Co-Chair:
The ACM workshop on Health
Information and Knowledge Management (HIKM) 2006
Awards
Trustees Teaching
Award, IUPUI, 2009
IBM Real Time
Innovation Award, 2008
TechPoint MIRA Award,
with Purdue University Knowledge Projection Team, 2005
Leading
Light Award / Ice Miller Graduate Student Scholarship, 2004
IBM Grace
Hopper / Anita Borg Scholarship, 2004
Excellent Graduate
Student Scholarship, Huazhong Univ. of Science and Technology, 1998
Outstanding Graduate
Award, Huazhong University of Science and Technology, 1996