April 9, 2015 - 10:00am PDT
DataBridge - Arcot (Raja) Rajasekar
DataBridge (http://databridge.web.unc.edu/) is an NSF-funded Big Data project exploring socio-metric network algorithms (SNA) to cluster scientific data across multiple modalities. DataBridge aims to solve the first-mile problem in big data by providing a venue for linking and locating the large volume and diversity of long-tail science data. Built on a message-oriented architecture, DataBridge provides a platform for launching diverse, domain-centric analytical, relevance, similarity, clustering and networking algorithms that are honed towards recognizing salient aspects of distinctive datasets and clustering them into like-minded virtual collections in multiple dimensions. DataBridge will enable discovery of relevant datasets and methods by computing metrics in multiple spaces of relevancy – the different ways data can be related to each other – by metadata and ontology, by pattern analysis and feature extraction, through usage tools and models, and via human connections. Work is underway to develop this framework and to implement and evaluate the DataBridge system.
Arcot Rajasekar is a Professor in the School of Information and Library Science at the University of North Carolina at Chapel Hill, a Chief Scientist at the Renaissance Computing Institute (RENCI) and co-Director of the Data Intensive Cyber Environments (DICE) Center at the University of North Carolina at Chapel Hill. Previously he was at the San Diego Supercomputer Center at the University of California, San Diego, leading the Data Grids Technology Group. He has been involved in research and development of data grid middleware systems for over a decade and is a lead originator of the concepts behind the Storage Resource Broker (SRB) and the integrated Rule-Oriented Data System (iRODS), two premier data grid middleware systems developed by the Data Intensive Cyber Environments Group. A leading proponent of policy-oriented large-scale data management, Rajasekar has several research projects funded by the National Science Foundation, the National Archives, the National Institutes of Health and other federal agencies. Rajasekar has a PhD in Computer Science from the University of Maryland at College Park and has more than 150 publications in the areas of data grids, digital libraries, persistent archives, logic programming and artificial intelligence. His latest projects include the DataNet Federation Consortium and DataBridge, which is building a social network platform for scientific data.