
Yuni Xia
Bio
Yuni Xia is an Assistant Professor in the Computer and Information
Science Department at Indiana University - Purdue University
Indianapolis (IUPUI). She received her B.S. in Computer Science from
Huazhong University of Science and Technology in China, and her MS
and PhD in Computer Science from Purdue University. She previously
worked as a research intern at the IBM T.J. Watson Research Center.
Xia's research is on data mining and databases, focusing on the
mining and management of uncertain data and constantly evolving data
such as sensor data and moving object data. She also works on data
storage, retrieval, management and mining in data-intensive
applications, and on managing uncertainty in the decision support
process. Her research is supported by the National Science
Foundation, IBM and the State of Indiana. She received IBM's Real
Time Innovation Award in 2008 and its Scalable Data Analytics
Innovation Award in 2010.
Research
Data Mining: Uncertain Data Mining, Data Stream Mining
Databases: Constantly Evolving Data Management, Sensor and Moving
Object Databases, Data Uncertainty Management
Current Research Projects:
1. Large Scale Sensor Stream Analysis and Mining for Geriatric Care,
Supported by IBM.
Many countries face the problem of population aging. According to
the U.S. Census, the population of seniors in the United States will
be 71.5 million by 2030, more than doubling in just 30 years [1].
While it is evident that the demand for medical care specific to this
age group will increase, the number of physicians and nurses skilled
and specialized in geriatric medicine is predicted to fall far short
of the need. Furthermore, geriatric caregiving has an enormous
economic impact. There is a growing need to use advanced technology
to provide high-quality geriatric care at a reduced cost. Recent
advances in wireless sensor technology suggest a low-cost alternative
for caring for the elderly [3]. We plan to design and develop a
real-time distributed sensor stream monitoring and analysis system
for geriatric care. Medical sensors such as vital sign sensors,
continuous glucose sensors, and electrocardiograph sensors will be
used to continuously monitor the conditions of patients.
Environmental sensors, such as motion sensors and location sensors,
will be used to monitor the daily activities of patients as
necessary, especially those with a certain degree of physical or
mental deterioration. The continuous sensor data stream will be
analyzed automatically. In case of an emergency, an alarm will be
sent to urgent care and to the patient's family. This enables
effective home-based continuous geriatric care, which not only saves
costs but also improves the quality of life of the elderly and their
families.
2. DisProt Database: A Central Repository of Information on
Intrinsically Disordered Proteins, Supported by NSF, Co-PI, 2009-2012
The goal of this project is to fully develop DisProt, a database that
provides an essential repository of information about intrinsically
disordered proteins (IDPs). DisProt will be not only a collection of
data on intrinsically disordered proteins and their functions, but
also a unique research tool for conducting computational studies on
these proteins and for helping to design better research strategies
for studying individual IDPs in the laboratory. DisProt is expected
to see widespread use, both for carrying out bioinformatics
experiments and by the entire community involved in understanding
cell and molecular biology.
3. TrafficAnalyzer: A Real-time Traffic Stream Processing and
Analyzing System, Supported by IBM.
Modern traffic monitoring systems are required to perform real-time
processing and analysis of petabit-scale continuous data streams. In
this project, we propose to design and develop a real-time traffic
stream processing and analysis system. The most important feature of
TrafficAnalyzer is real-time performance. Results need to be produced
with virtually zero latency, because in a traffic monitoring system
real-time response is crucial for reducing accident rates and
smoothing traffic flow. TrafficAnalyzer must support sophisticated
time-windowed processing operations, since streaming data changes
continually, often at high rates. These operations should be executed
in a way that produces results incrementally as new data arrives,
since the complete data set is never available. TrafficAnalyzer also
provides careful management of historical data, as it needs to
compare and combine “present” data with “past” data to study how
traffic flow changes over time. TrafficAnalyzer is also resilient to
inaccuracy and uncertainty in the data streams, because inherent
variations, losses, or reordering can cause data to arrive in the
wrong order or with variable delays.
4. Development of SYMBIOTE: A Reconfigurable Logic Assisted Data
Stream Management System for Multimedia Sensor Networks, Supported by
NSF, Co-PI, 2008-2011
Numerous emerging applications require real-time processing of
high-bandwidth multimedia data streams. In this project, we propose a
novel class of data stream management systems called Reconfigurable
Logic Assisted DSMS (RLADSMS). The prototype RLADSMS, called
SYMBIOTE, will provide one of the first comprehensive and
demonstrative approaches to using Reconfigurable Logic coprocessors
as data stream accelerators. This project will investigate key issues
such as data models, query languages, hardware DSMS operators, and
corresponding cost models of query execution that consider the
hardware complexity of database operators, the run-time complexity of
hardware and software operators, interconnect latencies, bandwidth,
and resource allocation, as well as optimization techniques for this
new class of data stream management systems.
5. Invention of a Consumer-Side Geriatric Health Care Knowledge
Management and Decision Support System, Supported by 21st Century
Research and Development Fund, State of Indiana, Co-PI, 2008-2010
This project proposes to build an innovative Knowledge Management
system unique in the Geriatric Care Management Industry. This system
will accelerate the adoption of standards of care and support the
accumulation of knowledge from current Social Science, Psychology,
and Health disciplines. It will also build a basis, comparable to the
Health Care Industry model, for evidence-based outcomes validation.
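The time-windowed, incremental processing described for
TrafficAnalyzer (project 3) can be illustrated with a minimal Python
sketch. This is not the TrafficAnalyzer implementation; the class
name SlidingWindowMean, its parameters, and the sample speed readings
are hypothetical, and the sketch assumes readings arrive in timestamp
order, which a real stream does not guarantee.

```python
from collections import deque


class SlidingWindowMean:
    """Running mean over readings from the last `window_seconds`.

    Hypothetical illustration only: assumes timestamps are
    non-decreasing, so out-of-order arrival is not handled.
    """

    def __init__(self, window_seconds):
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self.readings = deque()  # (timestamp, value), oldest first
        self.total = 0.0         # sum of the values currently in the window

    def add(self, timestamp, value):
        """Add one reading, expire old ones, and return the window mean."""
        self.readings.append((timestamp, value))
        self.total += value
        # Drop readings that have fallen out of the time window.
        cutoff = timestamp - self.window_seconds
        while self.readings[0][0] <= cutoff:
            _, old_value = self.readings.popleft()
            self.total -= old_value
        # The newest reading is always inside the window, so len >= 1.
        return self.total / len(self.readings)


if __name__ == "__main__":
    # Made-up (timestamp in seconds, speed) readings for one road segment.
    window = SlidingWindowMean(window_seconds=60)
    for t, speed in [(0, 50.0), (30, 40.0), (70, 10.0)]:
        print(t, window.add(t, speed))
```

Each new reading updates the result with a constant amount of work
per arriving or expiring reading, rather than rescanning the stream,
which is the sense in which the result is produced incrementally.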
Publications
- Biao Qin, Yuni Xia, Shan Wang, Xiaoyong Du, A Novel Bayesian
Classification
Method for Uncertain Data, Knowledge-Based Systems,
Accepted.
- Omkar Tilak, Andrew Hoblitzell, Snehasis Mukhopadhyay, Qian
You,
Shiaofen Fang, Yuni Xia, Joseph Bidwell, Multi-Level Text
Mining for
Bone Biology, Concurrency and Computation: Practice and
Experience,
Accepted.
- Yu Chen, Pranav Vaidya, Jaehwan John Lee, Chandima Hewa
Nadungodage, Yuni Xia, Renfa Li, Qiang Wu, A New Hardware/Software
Partitioning Methodology Combining Search Space Smoothing and
Discrete Particle Swarm Optimization, International Conference on
Engineering of Reconfigurable Systems and Algorithms (ERSA), 2011.
- Chandima Hewa Nadungodage, Yuni Xia, Fang Li, Jaehwan John Lee,
Jiaqi Ge, StreamFitter: A Real Time Linear Regression Analysis System
for Continuous Data Streams, Demo, International Conference on
Database Systems for Advanced Applications (DASFAA) 2011.
- Biao Qin, Yuni Xia, Rakesh Sathyesh, Jiaqi Ge, Sunil Prabhakar,
Classify Uncertain Data with Decision Tree, Demo, International
Conference on Database Systems for Advanced Applications (DASFAA)
2011.
- Sandeep Raghuram, Yuni Xia, Jiaqi Ge, Mathew Palakal, Josette
Jones, Dave Pecenka, Eric Tinsley, Jean Bandos, and Jerry Geesaman.
AutoBayesian: Developing Bayesian Networks Based on Text Mining,
Demo, International Conference on Database Systems for Advanced
Applications (DASFAA) 2011. (Best Demo Award)
- Biao Qin, Yuni Xia, Sunil Prabhakar, Rule Induction for Uncertain
Data, Knowledge and Information Systems (KAIS), Accepted.
- Shaoping Chen, Yi-Cheng Tu, Yuni Xia, Performance Analysis of a
Dual-tree Algorithm for Computing Spatial Distance Histograms, VLDB
Journal, Accepted.
- Pranav Vaidya, Y. Chen, Jaehwan John Lee, Chandima Hewa
Nadungodage, and Yuni Xia, "A
General Purpose FPGA Data Filter For Data Stream Processing",
International
Conference on Engineering of Reconfigurable Systems and
Algorithms (ERSA), pp. 247-250, 2010.
- Jiaqi Ge, Yuni Xia, Yicheng Tu, A Discretization Algorithm for
Uncertain Data, the 21st International Conference on Database and
Expert Systems Applications (DEXA), 2010. (Acceptance Rate: 22.7%)
- Andrew Hoblitzell, Snehasis Mukhopadhyay, Qian You, Shiaofen Fang,
Yuni Xia, Joseph Bidwell, Text Mining for Bone Biology, Proceedings
of the Workshop on Emerging Computational Methods for the Life
Sciences, 2010.
- Pranav S. Vaidya, Jaehwan John Lee, Francis Bowen, Yingzi Du,
Chandima H. Nadungodage, Yuni Xia. Symbiote - A Reconfigurable Logic
Assisted Data Stream Management System (RLADSMS), Demo, the ACM
Conference on Management of Data (SIGMOD), 2010.
- Biao Qin, Yuni Xia, Fang Li. A Bayesian Classifier for Uncertain
Data, the 25th ACM Symposium on Applied Computing (SAC), 2010.
(Acceptance Rate: 25%)
- Jiaqi Ge, Yuni Xia, Chandima Nadungodage, Classify Uncertain Data
with Neural Network, the 14th Pacific-Asia
Conference on
Knowledge Discovery and Data Mining (PAKDD), 2010. (Acceptance
Rate:
10.2%)
- Biao Qin, Yuni Xia, Rakesh Sathyesh, Sunil Prabhakar, Yicheng Tu,
"uRule: A Rule Based Classifier for Data with Uncertainty", Demo, the
IEEE International Conference on Data Mining (ICDM), 2009.
- Sandeep Raghuram, Yuni Xia, Mathew Palakal, Josette Jones,
Dave
Pecenka, Eric Tinsley, Jean Bandos, and Jerry Geesaman.
"Bridging Text
Mining and Bayesian Networks", Proc. of the Workshop on
Intelligent
Biomedical Information Systems (IBIS), 2009.
- Biao Qin, Yuni Xia, Fang Li, "DTU: A Decision Tree for Classifying
Uncertain Data", the Pacific-Asia Conference on Knowledge Discovery
and Data Mining (PAKDD), 2009. (Acceptance Rate: 11.5%)
- Biao Qin, Yuni Xia, Sunil Prabhakar, Yicheng Tu, "A Rule-Based
Classification Algorithm for Uncertain Data", the IEEE Workshop on
Management and Mining of Uncertain Data (MOUND), in conjunction with
the International Conference on Data Engineering, 2009.
- Jiangang Liu, Andrew Campen, Shuguang Huang, Sheng-Bin Peng,
Xiang Ye, Mathew Palakal, A. Keith Dunker, Yuni Xia and Shuyu
Li,
"Identification of a gene signature in cell cycle pathway for
breast
cancer prognosis using gene expression profiling data", BMC
Medical
Genomics, 2008, 1:39.
- Yuni Xia, Sunil Prabhakar, Shan Lei, Reynold Cheng and Rahul Shah,
"Indexing Continuously Changing Data with Mean Variance Tree",
International Journal of High Performance Computing and Networking,
Vol. 5, No. 4, pages 263-272, 2008.
- Biao Qin, Yuni Xia, "Generating Efficient Safe Query Plans for
Probabilistic Databases", Journal of Data and Knowledge Engineering
(DKE), Volume 67, Issue 3, Pages 485-503, 2008.
- Andrew Campen, Yuni Xia, Dan Rigsby, Ying Guo, Xingdong Feng, Eric
Su, Mathew Palakal and Shuyu Li, "Mining Gene Expression Database for
Primary Human Disease Tissues", Demo, the IEEE 24th International
Conference on Data Engineering (ICDE), 2008.
- Yuni Xia, Bowei Xi, "Conceptual Clustering Categorical Data with
Uncertainty", the IEEE 19th International Conference on Tools with
Artificial Intelligence (ICTAI), Patras, Greece, 2007. (Acceptance
Rate: 28%)
- Yuni Xia, Andrew Campen, Dan Rigsby, Ying Guo, Xingdong
Feng,
Eric Su, Mathew Palakal, Shuyu Li, "DGEM - a Microarray Gene
Expression
Database for Primary Human Disease Tissues", Molecular
Diagnosis and
Therapy, Issue 3, 2007.
- Yuni Xia, Yicheng Tu, Mikhail Atallah, Sunil Prabhakar,
"Reducing
Data Redundancy in Location-based Services", the International
Conference on Geosensor Networks (GeoSensor), pp. 30-35,
Boston,
USA, 2006.
- Reynold Cheng, Sarvjeet Singh, Sunil Prabhakar, Rahul Shah,
Jeffrey Scott Vitter, Yuni Xia, "Efficient Join Processing
over
Uncertain Data", the ACM 15th Conference on Information and
Knowledge
Management (CIKM), pp. 738-747, Arlington, USA, 2006.
(Acceptance Rate: 15%)
- Yicheng Tu, Mohamed Hefeeda, Yuni Xia, Sunil Prabhakar, Song Liu,
"Control-Based Quality Adaptation in Data Stream Management Systems",
the International Conference of Database and Expert Systems
Applications (DEXA), pp. 746-755, Copenhagen, Denmark, 2005.
(Acceptance Rate: 23%)
- Yuni Xia, Sunil Prabhakar, Shan Lei, Reynold Cheng, Rahul
Shah,
"Indexing Continuously Changing Data with Mean Variance Tree",
the 20th
ACM Symposium on Applied Computing (SAC), pp. 1125 - 1132,
Santa Fe,
New Mexico, USA, 2005. (Acceptance Rate: 30%)
- Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, "Change
Tolerant Indexing for Constantly Evolving Data", the International
Conference on Data Engineering (ICDE), pp. 391-402, Tokyo, Japan,
2005. (Acceptance Rate: 13%)
- Yuni Xia, Sunil Prabhakar, Jiangzhong Sun, Shan Lei, "Indexing and
Query Constantly Evolving Data Using Time Series Analysis", the 10th
International Conference on Database Systems for Advanced
Applications (DASFAA), pp. 637-648, Beijing, China, 2005. (Acceptance
Rate: 22%)
- Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, Jeffrey Scott
Vitter, "Efficient Indexing Methods for Probabilistic Threshold
Queries over Uncertain Data", the 30th International Conference on
Very Large Data Bases (VLDB), pp. 876-887, Toronto, Canada, 2004.
(Acceptance Rate: 16%)
- Yuni Xia, Sunil Prabhakar, "Efficient VNG Indexing in
Location-aware Services", the International Workshop on Mobile and
Distributed Computing (MDC), pp. 414-419, Providence, Rhode Island,
USA, 2003.
- Yuni Xia, Sunil Prabhakar, "Q+Rtree: Efficient Indexing for Moving
Object Databases", the 8th International Conference on Database
Systems for Advanced Applications (DASFAA), pp. 175-182, Kyoto,
Japan, 2003. (Acceptance Rate: 25%)
- Sunil Prabhakar, Yuni Xia, Dmitri Kalashnikov, Walid Aref, Susanne
Hambrusch, "Query Indexing and Velocity Constrained Indexing:
Scalable Techniques for Continuous Queries on Moving Objects", IEEE
Transactions on Computers, Vol. 51, No. 10, pp. 1124-1140, 2002.
Book Chapters
- Yuni Xia, Jonathon Munson, David Wood, Alan Cole, "Location-based
Service System (LBS) Analysis and Design", Handbook of Research on
Modern Systems Analysis and Design Technologies and Applications,
ISBN: 978-1-59904-887-1, 698 pp, 2008.
- Meeta Pradhan and Yuni Xia, "Bioterrorism and Biosecurity",
Handbook of Research on Information Security and Assurance, ISBN:
978-1-59904-855-0, 586 pp, 2008.
- Sunil Prabhakar, Dmitri V. Kalashnikov, and Yuni Xia, "Query
Indexing and Velocity Constrained Indexing", Encyclopedia of
GIS,
Springer Science, 2008.
Teaching
Please log in to Oncourse for lecture notes, readings, assignments,
projects, etc.
CSCI590: Data Mining
CSCI541: Database Management Systems
CSCI590: Advanced Database Systems
CSCI441: Client Server Databases
CSCI481: Data Mining
CSCI443: Database Systems
Professional Services
Program Committee:
The International Conference on Collaborative Computing (CollaborateCom), 2010, 2011
The IEEE International Conference on Computer and Information Technology (CIT), 2010
The International Conference on Frontier Computing (FC), 2010
The International Workshop on Smart Homes for Tele-Health (SmarTel), 2010
The IEEE 12th International Conference on Computational Science and Engineering (CSE), 2009
The International Workshop on Smart Homes for Tele-Health (SmarTel), 2009
The International Workshop on Information Fusion and Dissemination in Wireless Sensor Networks (SensorFusion), 2009
The International Conference on Intelligent Pervasive Computing (IPC), 2008
The IEEE 11th International Conference on Computational Science and Engineering (CSE), 2008
The IEEE 21st International Conference on Advanced Information Networking and Applications (AINA), 2007
The IEEE/ACS 5th International Conference on Computer Systems and Applications (AICCSA), 2007
The Third International Conference on Intelligent Environments (IE), 2007
The International Workshop on Information Fusion and Dissemination in Wireless Sensor Networks (SensorFusion), 2007
The International Workshop on Knowledge Management and Discovery for Ubiquitous and Pervasive Applications (KUPA), 2007
Local Chair:
ACM SIGMOD/PODS Conference, 2010
Journal Review:
IEEE Transactions on Knowledge and Data Engineering
IEEE Transactions on Parallel and Distributed Systems
Information Systems
Information Sciences
Knowledge and Information Systems
Data and Knowledge Engineering
The International Journal of Telemedicine and Applications
The Information Fusion Journal
The Journal of Systems and Software
The International Journal of Data Mining and Bioinformatics
The Electronics and Telecommunications Research Institute Journal
The International Journal of Computer Science and Technology
The Journal of Ubiquitous Computing and Intelligence
Panelist:
Panelist, National Science Foundation, CISE, 2007, 2009, 2011
Program Co-Chair:
The ACM Workshop on Health Information and Knowledge Management (HIKM), 2006
Awards
IBM Scalable Data Analytics Innovation Award, 2010
Trustees Teaching Award, IUPUI, 2009
Research Venture Award, IUPUI, 2009
IBM Real Time Innovation Award, 2008
TechPoint MIRA Award, with Purdue University Knowledge Projection Team, 2005
Leading Light Award / Ice Miller Graduate Student Scholarship, 2004
IBM Grace Hopper / Anita Borg Scholarship, 2004
Excellent Graduate Student Scholarship, Huazhong University of Science and Technology, 1998
Outstanding Graduate Award, Huazhong University of Science and Technology, 1996
Personal
I was born in Hubei, China. My husband and I have a son and a
daughter. The kids keep us busy and keep life fun and interesting. We
live in Carmel, Indiana. In our free time, we love spending time as a
family and with friends, and we enjoy the outdoors, hiking, swimming,
biking, and traveling.