The 3rd International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) 2021
in conjunction with the 2021 IEEE International Conference on Big Data (IEEE BigData 2021)

December 15-18, 2021 @ Taking place virtually

Workshop Date/Time:  TBD

Call for Papers
Program Chairs
Sangkeun (Matt) Lee

Pravallika Devineni

Jong Youl Choi

Organizers’ Background
Sangkeun (Matt) Lee received his Ph.D. degree in computer science and engineering from Seoul National University in 2012. He is currently an R&D Associate in the Computer Science and Mathematics Division at Oak Ridge National Laboratory. He studies big data, data science, and machine learning, and has applied state-of-the-art data analysis technologies in many application domains. He has developed many data analytics software packages, one of which, ORiGAMI, won the 2016 DOE R&D 100 Award. He has contributed to many leading computer science conferences and journals, such as ACM WWW, ACM RecSys, and Expert Systems with Applications. For the last few years, he has collaborated with scientists across various domains, including materials science, nuclear science, and mechanical engineering, and has published papers in scientific journals such as Journal of Nuclear Materials, Acta Materialia, The Electricity Journal, and Advanced Theory and Simulations.

Pravallika (Pravi) Devineni is a Research Scientist in the Computing Directorate at Oak Ridge National Laboratory, TN, USA. She received her Ph.D. from the University of California, Riverside in 2018, where her dissertation focused on mining patterns and anomalies in dynamic graph networks. Her research interests include tensor decompositions for machine learning and data science, explainable AI using HPC, large-scale network mining, natural language processing and its applications, and applying computing techniques across a variety of scientific domains. Pravi actively serves on conference and journal committees such as IJCAI, KDD, PAKDD, WSDM, and IEEE TMC. She is a passionate advocate for women in tech: she is an organizing committee member for Women in High Performance Computing (WHPC) and the co-chair of the AI track for vGHC 2021.

Jong Youl (Jong) Choi is a researcher in the Discrete Algorithms Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory (ORNL), Oak Ridge, Tennessee, USA. He earned his Ph.D. degree in Computer Science at Indiana University Bloomington in 2012 and his MS degree in Computer Science from New York University in 2004. His research interests span data mining and machine learning algorithms, high-performance data-intensive computing, and parallel and distributed systems. More specifically, he focuses on researching and developing data-centric machine learning algorithms for large-scale data management, in situ/in transit data processing, and data management for code coupling. Jong Choi actively serves on conference committees and as a journal reviewer for venues such as ParaMo, CCPE, and CLUS.

Introduction to Workshop
Advances in big data technology, artificial intelligence, and machine learning have produced many success stories in a wide range of areas, especially in industry. These success stories have motivated scientists in physics, chemistry, materials science, medicine, and many other fields to explore new ways of using big data tools in their scientific work.

However, there are barriers to overcome. Most existing big data tools, systems, and methodologies have been developed without considering scientific purposes or scientists’ specific requirements. They were not originally developed for scientists who have little or no knowledge of programming or computer science. Computer scientists, on the other hand, often find it very challenging to understand the domain problem because they lack sufficient background knowledge.

We expect that big data technologies can contribute to scientific innovation in many ways. Many ongoing scientific projects around the world already aim to discover novel hypotheses, analyze big multidimensional data that could not be handled manually, and reduce the time required for complex calculations through automation. This workshop intends to bring domain scientists and computer scientists together to explore and extend opportunities in the development of big data tools, systems, and methodologies for scientific discovery, to share success stories and lessons learned, and to discuss challenges which, if overcome, would enable successful collaboration across different domains, especially between domain scientists and computer/data scientists.

In this workshop, we discuss the following questions:

What makes big data tools for scientists different from the existing tools?

What specific needs and challenges do domain scientists face when they try to adopt big data tools?

How can computer scientists and domain scientists communicate to define a feasible problem together?

What are the barriers to using big data for scientific discovery, and how do these barriers differ across scientific domains?

Workshop History
The International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) was first held in December 2019 in conjunction with the IEEE Big Data 2019 conference, organized by Matt Lee and Travis Johnston. A total of 26 submissions were received, and 12 papers were accepted. It was a great start toward building a strong scientific collaboration community. The second BTSD workshop was held in December 2020 as a virtual workshop. A total of 26 submissions were received, and 11 papers were accepted and presented. It was a great opportunity to exchange ideas and learn from experiences across many scientific domains.

Research Topics Included in the Workshop 
Big data tools, systems, and methods related to, but not limited to:

Scientific data processing

Artificial intelligence/Deep neural networks/Machine learning

Text mining/Graph mining

Database/Query processing/Query Optimization

Parallel computation/High Performance Computing

Visualization/User Interface/HCI

Parallelization/Performance/Scalability

High Performance Computing …

that facilitate innovation and discovery in a scientific domain, such as:

Physics

Chemistry

Material science

Mechanical engineering

Nuclear engineering

Biomedical science …

Use cases, success stories, and lessons learned in scientific discovery using big data tools, systems, and methods

Program Committee Members
Ramakrishnan Kannan, Oak Ridge National Laboratory, kannanr@ornl.gov

Yan Da, University of Alabama Birmingham, yanda@uab.edu

Seungha Shin, University of Tennessee, sshin@utk.edu 

Feng Bao, Florida State University, fbao@fsu.edu

Youngjae Kim, SOGANG University, Seoul, Republic of Korea, youkim@sogang.ac.kr

Supriya Chinthavali, Oak Ridge National Laboratory

Michael Churchill, Princeton Plasma Physics Laboratory

Pei Zhang, Oak Ridge National Laboratory

Ivy Peng, Lawrence Livermore National Laboratory

Shaden Smith, Microsoft 

Priyanka Ghosh, Pacific Northwest National Laboratory

Christine Klymko, Lawrence Livermore National Laboratory

Gopinath Chennupati, Los Alamos National Laboratory

Ralph Kube, Princeton Plasma Physics Laboratory

Paper Submission
Please submit a short paper (up to 4 pages in IEEE 2-column format) or a full paper (up to 8 pages in IEEE 2-column format) through the online submission system.

TBD

Papers should be formatted to IEEE Computer Society Proceedings Manuscript Formatting Guidelines (see link to "formatting instructions" below). 

Formatting Instructions

8.5" x 11" (DOC, PDF) 

LaTeX Formatting Macros

Important Dates 
Oct 1, 2021: Due date for full workshop paper submissions

Nov 1, 2021: Notification of paper acceptance to authors

Nov 15, 2021: Camera-ready versions of accepted papers due

Location
Taking Place Virtually

Details on how to participate in the workshop will be announced.

Workshop Primary Contact 
Sangkeun (Matt) Lee, Computational Data Analytics Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory, TN, USA. Tel: +1 865 574 8858. Email: lees4@ornl.gov