DAVID ASTLING, PhD
Scientist, Bioinformatics
SomaLogic Inc.
Colorado, USA
Biography
Dr. David Astling is a research scientist in the Bioinformatics department at SomaLogic Inc., working to build new tools and methodologies for cutting-edge healthcare applications. The Bioinformatics team is primarily responsible for the statistical analysis and machine learning of high-dimensional proteomic data to characterize a variety of clinical health issues, ranging from cardiovascular disease to kidney function to Type II diabetes. David collaborates extensively with the SomaLogic assay team and SomaLogic clinical teams in support of internal R&D projects, discovering proteomic insights and developing methodologies for the processing and normalization of high-dimensional proteomic data.
Previously, David was a research associate at the University of Colorado School of Medicine for six years, where he was a member of the Bioinformatics and Genomics Shared Resource group. He collaborated with over 30 different research groups and developed custom analysis pipelines for next-gen sequencing applications.
David’s background in biochemistry gives him a strong biological foundation for his bioinformatics work. He received his Ph.D. from UC Berkeley in the Department of Molecular and Cell
Beyond Genomics: Deriving Actionable Health Insights from the Human Proteome
The circulating human proteome offers a unique and dynamic perspective into a person's physiological and health status and presents a great opportunity for rapid and accurate health diagnostics. Genomics by contrast fails in applications where diagnostic fingerprints of environmental impact, disease progression, or infection are needed. SomaLogic’s proteomic assay utilizes a library of over 5,000 SOMAmers for the simultaneous measurement of thousands of protein-analytes in a single blood sample. SomaLogic has shown that analysis of the proteome can provide indicators of patient risk for occurrence of a secondary cardiovascular event. To further this work, SomaLogic has embarked on a collaboration with major academic institutions to discover indicators of primary cardiovascular events, type 2 diabetes, kidney function, and lifestyle characteristics of pre-diabetic patients, as targets for incorporation into actionable insights that are of medical significance. Machine learning and statistical modeling techniques are used to develop insights that can provide rapid feedback to patients to inform strategies of managing aspects of cardio-metabolic syndrome. Additional collaborations are underway to discover insights for other disease states, physiological indicators of health and wellness, and non-blood related sample types. This presentation will examine co-regulatory networks to further our understanding of the existing models, to explore and understand the biomarkers underlying each disease model.
BENJAMIN M. GOOD, PhD
Consultant
Lawrence Berkeley National Labs
California, USA
Integrating Pathway Databases with Gene Ontology Causal Activity Models
The Gene Ontology (GO) Consortium (GOC) is developing a new knowledge representation approach called ‘causal activity models’ (GO-CAM). A GO-CAM describes how one or several gene products contribute to the execution of a biological process. In these models (implemented as OWL instance graphs anchored in Open Biological Ontology (OBO) classes and relations), gene products are linked to molecular activities via semantic relationships like ‘enables’, molecular activities are linked to each other via causal relationships such as ‘positively regulates’, and sets of molecular activities are defined as ‘parts’ of larger biological processes. This approach provides the GOC with a more complete and extensible structure for capturing knowledge of gene function. It also allows for the representation of knowledge typically seen in pathway databases.
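The model structure described above can be pictured as a small labeled graph. The following is a minimal sketch in plain Python, not the GOC's OWL implementation: the identifiers (`gene_product:A`, `activity:kinase_1`, `process:signaling`) and the helper functions are hypothetical and chosen only for illustration, while the relation names are taken from the abstract.

```python
from collections import namedtuple

# One edge of the model: subject --relation--> object.
Edge = namedtuple("Edge", ["subject", "relation", "object"])

# Hypothetical toy model; none of these identifiers are real GO or OBO terms.
edges = [
    Edge("gene_product:A", "enables", "activity:kinase_1"),
    Edge("gene_product:B", "enables", "activity:tf_1"),
    Edge("activity:kinase_1", "positively regulates", "activity:tf_1"),
    Edge("activity:kinase_1", "part of", "process:signaling"),
    Edge("activity:tf_1", "part of", "process:signaling"),
]


def activities_in(process, edges):
    """Return the molecular activities that are parts of a biological process."""
    return sorted(e.subject for e in edges
                  if e.relation == "part of" and e.object == process)


def enablers_of(activity, edges):
    """Return the gene products that enable a molecular activity."""
    return sorted(e.subject for e in edges
                  if e.relation == "enables" and e.object == activity)


def regulated_by(activity, edges):
    """Return the activities that the given activity positively regulates."""
    return sorted(e.object for e in edges
                  if e.relation == "positively regulates" and e.subject == activity)


for act in activities_in("process:signaling", edges):
    print(act, "enabled by", enablers_of(act, edges),
          "regulates", regulated_by(act, edges))
```

A real GO-CAM carries far more than this (typed instances, evidence, and ontology classes that a reasoner can check), so the sketch only conveys the shape of the graph.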
Here, we present details and results of a rule-based transformation of pathways represented using the BioPAX exchange format into GO-CAMs. We have automatically converted all Reactome pathways into GO-CAMs and are currently working on the conversion of additional resources available through Pathway Commons. By converting pathways into GO-CAMs, we can leverage OWL description logic reasoning over OBO ontologies to infer new biological relationships and detect logical inconsistencies. Further, the conversion helps to increase standardization for the representation of biological entities and processes. The products of this work can be used to improve source databases, for example by inferring new GO annotations for pathways and reactions, and can help with the formation of meta-knowledge bases that integrate content from multiple sources.
AARON VON HOOSER, PhD
Principal Scientist, Computational Biology
PatientsLikeMe, Inc.
Massachusetts, USA
Building a Learning System that Helps Individuals to Thrive by Connecting Their Experiences and Goals with Molecular Measures of Health
Through the PatientsLikeMe (PLM) network, patients connect with others who have the same disease or condition and track and share their experiences. In the process, they generate data about the real-world nature of disease that help researchers, pharmaceutical companies, regulators, providers, and non-profits develop more effective products, services and care. Studies have shown that members of PLM report tangible benefits from the connectedness and sharing that is part of the PLM community experience. With more than 500,000 members, PLM is a trusted source for real-world disease information and a clinically robust resource that has published more than 60 peer-reviewed research studies.
The Biocomputing team at PLM is combining digitized, person-generated experiential data with deep molecular analyses and machine learning to help patients understand and evaluate their own molecular biology and how they may be able to change their daily lives to optimally thrive. To this end, participants in PLM’s DigitalMe™ program have donated thousands of biospecimens, building a massive health data set that spans dozens of disease conditions, including SLE, fibromyalgia, MS, ALS, PD, and RA, captured on an ever-increasing list of big data platforms, including DNAseq, RNAseq, metabolomics, proteomics, and antibody immunosignatures. Here, we report results from several pilot “n of 1” studies, providing deep molecular biological characterization of longitudinal timepoints from the same individuals, tracking normal physiological systems perturbed by health interventions, as well as indications that a spectrum of processes tightly associated with specific disease activities may be perturbed in “healthy” individuals under various circumstances.
KIRK E. JORDAN, Ph.D.
IBM Distinguished Engineer
Data Centric Solutions
IBM T.J. Watson Research
and
Chief Science Officer
IBM Research UK
Biography
Dr. Kirk E. Jordan is an IBM Distinguished Engineer, an IBM executive position, in the IBM Research Division's Data Centric Solutions group at the IBM T.J. Watson Research Center, and is the Chief Science Officer for IBM Research United Kingdom (UK). In the UK, he established the IBM Research presence at the Science and Technology Facilities Council's (STFC) Daresbury Laboratory in collaboration with the STFC Hartree Centre, focusing on data centric cognitive computing. He has vast experience in high performance and parallel computing. The Data Centric Solutions group is addressing the challenges involved in achieving Petascale and Exascale performance on IBM's very high-end system platforms, running real workflows and workloads to obtain significant results in science, engineering, business and social policy, and partnering and collaborating with key IBM clients on the most challenging applications and workloads on these large systems. Dr. Jordan oversees development of applications for IBM's advanced computing architectures, investigates and develops concepts for new areas of growth involving high performance computing (HPC), and provides leadership in high-end computing, data centric cognitive computing and simulation in such areas as computational fluid dynamics, systems biology and high-end visualization. At IBM, he has held several positions promoting HPC and high performance visualization, including leading technical efforts in the Deep Computing organization within IBM's Systems and Technology Group, managing IBM's University Relations SUR (Shared University Research) Program and leading IBM's Healthcare and Life Sciences Strategic Relationships and Institutes of Innovation Programs. He is a member of the IBM Academy of Technology.
In addition to his IBM responsibilities, Jordan maintains his visibility as a computational applied mathematician in the high-performance computing community. He is a Fellow of SIAM (Society for Industrial and Applied Mathematics) and of AAAS (American Association for the Advancement of Science). He is active on national and international committees on science and high-performance computing issues and has received several awards for his work on supercomputers. His main research interests lie in the efficient use of advanced-architecture computers for simulation and modeling, especially in the areas of systems biology and physical phenomena. He has authored numerous papers on performance analysis of advanced computer architectures and investigated methods that exploit these architectures. Areas in which he has published include interactive visualization on parallel computers, parallel domain decomposition for reservoir/groundwater simulation, turbulent convection flows, parallel spectral methods, multigrid techniques, wave propagation, systems biology and tumor modeling.
Algorithm Exploitation & Evolving AI/Cognitive Examples on IBM’s Data Centric Systems
The volume, variety, velocity and veracity of data are pushing how we think about computer systems. IBM Research’s Data Centric Solutions organization has been developing systems that handle large data sets, shortening time to solution. This group has created a data centric architecture initially delivered to the DoE labs at the end of 2017 and being completed in 2018. As various features to improve data handling now exist in these systems, we need to begin to rethink the algorithms and their implementations to exploit these features. This data centric view is also relevant for Artificial Intelligence (AI) and Machine Learning (ML). In this talk, I will briefly describe the architecture and point out some of the hardware and software features ready for exploitation. As case studies, I will show how we are using these data centric AI/cognitive computing systems to address some challenges in the life sciences in new ways.
JOSLYNN S. LEE, Ph.D.
Science Education Fellow
Howard Hughes Medical Institute
Maryland, USA
Training and Engaging URM Undergraduate Students in Genomics Research Through a Place-based Microbiome Research Project
The participation of American Indian/Alaskan Native (AIAN) people and other underrepresented minority (URM) populations in STEM fields remains shockingly low. In the computational field, it is even lower. AIAN students face various barriers that impede them from pursuing or continuing careers in genomics. At the same time, there is a demand for integrating bioinformatics and data science into the life sciences curriculum. I am presenting a training workshop that allows students to gain hands-on laboratory and computational experience to understand the diversity of local environmental microbiomes in Colorado and New Mexico. This workshop targets early-career undergraduate students from Southwest regional PUIs, two-year and tribal colleges. Core competencies incorporated in the workshop are computational concepts (algorithms and file formats), statistics, accessing genomic data and running bioinformatics tools to analyze data. I will discuss some of the successes and pitfalls that I have encountered and the adaptation of the workshop for a one-semester course.
ZHIYONG LU, Ph.D.
Deputy Director for Literature Search
National Center for Biotechnology Information (NCBI)
Senior Investigator, NCBI/NLM/NIH
Maryland, USA
Biography
Dr. Zhiyong Lu is a Senior Investigator at the National Library of Medicine’s (NLM) Intramural Research Program, leading research in biomedical text and image processing, information retrieval, and AI/machine learning. As Deputy Director for Literature Search at the National Center for Biotechnology Information (NCBI), Dr. Lu also directs the overall R&D efforts to improve literature search and information access in resources like PubMed and LitCovid that are used by millions worldwide. Dr. Lu is a Fellow of the American College of Medical Informatics (ACMI). Over the years, Dr. Lu has mentored over 50 trainees and is a highly cited author with over 300 peer-reviewed articles in leading scientific journals such as Nature, Nature Biotechnology, and PLoS Biology. According to Google Scholar, he has an h-index over 70 with ~30,000 citations.
Machine Learning in Biomedicine: from PubMed Search to Autonomous Disease Diagnosis
The explosion of biomedical big data and information in the past decade or so has created new opportunities for discoveries to improve the treatment and prevention of human diseases. But the large body of knowledge, which mostly exists as free text in journal articles for humans to read, presents a grand new challenge: individual scientists around the world are increasingly finding themselves overwhelmed by the sheer volume of research literature and are struggling to keep up to date and to make sense of this wealth of textual information. Our research aims to break down this barrier and to empower scientists towards accelerated knowledge discovery. In this talk, I will present our work on developing open-source NLP and image analysis tools based on machine learning. Moreover, I will demonstrate their uses in some real-world applications such as improving PubMed searches, scaling up human curation for precision medicine, and enabling image-based autonomous disease diagnosis.
DEBORAH L. MCGUINNESS, PhD
Tetherless World Senior Constellation Chair
Professor of Computer Science and Cognitive Science
Rensselaer Polytechnic Institute
New York, USA
Biography
Deborah L. McGuinness, Ph.D., is the Tetherless World Senior Constellation Chair and Professor of Computer, Cognitive, and Web Sciences at RPI. She is also the founding director of the Web Science Research Center. Deborah has been recognized with awards as a fellow of the American Association for the Advancement of Science (AAAS) for contributions to the Semantic Web, knowledge representation, and reasoning environments and as the recipient of the Robert Engelmore award from the Association for the Advancement of Artificial Intelligence (AAAI) for leadership in Semantic Web research and in bridging Artificial Intelligence (AI) and eScience, significant contributions to deployed AI applications, and extensive service to the AI community. Deborah currently leads a number of large, diverse, data-intensive resource efforts, and her team is creating next-generation ontology-enabled research infrastructure for work in large interdisciplinary settings. Prior to joining RPI, Deborah was the acting director of the Knowledge Systems, Artificial Intelligence Laboratory and Senior Research Scientist in the Computer Science Department of Stanford University, and before that she was at AT&T Bell Laboratories.
Deborah has also consulted with numerous large corporations as well as emerging startup companies wishing to plan, develop, deploy, and maintain semantic web and/or AI applications. She has also worked as an expert witness in a number of cases, and has deposition and trial experience. Some areas of recent work include: exposure science, data science, next-generation health advisors, ontology design and evolution environments, semantically enabled virtual observatories, semantic integration of scientific data, context-aware mobile applications, search, eCommerce, configuration, and supply chain management. Deborah holds a Bachelor of Math and Computer Science from Duke University, a Master of Computer Science from the University of California at Berkeley, and a Ph.D. in Computer Science from Rutgers University.
Semantic Data Resources Enabling Science: Building, Using, and Maintaining Ontology-Enabled Biology Data Resources
Ontologies are seeing a resurgence of interest and usage as big data proliferates, machine learning advances, and integration of data becomes paramount. The previous models of sometimes labor-intensive, centralized ontology construction and maintenance do not mesh well with today’s interdisciplinary world, which is in the midst of a big data, information extraction, and machine learning explosion. Today many high-quality ontologies exist that can and should be utilized. We will describe our approach to building maintainable and reusable semantics-enabled health and life science data ecosystems. We will introduce our method in the context of our National Institute of Environmental Health Sciences-funded Child Health Exposure Analysis Resource, and we will describe how our community built and maintains a broad interdisciplinary ontology that spans exposure science and health and integrates with numerous long-standing, well-used ontologies. We will also describe how this ontology powers an integrated data resource, and give examples of how the same methodology is being used in an IBM-funded Health Empowerment using Analysis, Learning and Semantics project as well as a semantics-aware drug repurposing effort. We will conclude by discussing today’s requirements for choosing, reusing, and interlinking existing, evolving resources and the resulting requirements for new methodologies and systems that can be used and maintained by large, diverse communities to accelerate scientific discovery.
NICOLE A. VASILEVSKY, Ph.D.
Research Assistant Professor
Department of Medical Informatics and Clinical Epidemiology (DMICE)
Oregon Health & Science University
Oregon, USA
LOINC2HPO: Improving Translational Informatics by Standardizing EHR Phenotypic Data Using the Human Phenotype Ontology
Electronic Health Record (EHR) data are often encoded using Logical Observation Identifiers Names and Codes (LOINC), which is a universal standard for coding clinical laboratory tests. LOINC codes encode clinical tests and not the phenotypic outcomes, and multiple codes can be used to describe laboratory findings that may correspond to one phenotype. However, LOINC-encoded data are an untapped resource in the context of deep phenotyping with the Human Phenotype Ontology (HPO). The HPO describes phenotypic abnormalities encountered in human diseases, and is primarily used for research and diagnostic purposes. As part of the Center for Data to Health (CD2H)’s effort to make EHR data more translationally interoperable, our group developed a curation tool that is used to convert EHR observations into HPO terms for use in clinical research. To date, over 1,000 LOINC codes have been mapped to HPO terms. To demonstrate the utility of these mapped codes, we performed a pilot study with de-identified data from asthma patients. We were able to convert 70% of real-world laboratory tests into HPO-encoded phenotypes. Analysis of the LOINC2HPO-encoded data showed that the HPO term eosinophilia was enriched in patients with severe asthma and prednisone use. This preliminary evidence suggests that LOINC data converted to HPO can be used for machine learning approaches to support genomic phenotype-driven diagnostics for rare disease patients, and to perform EHR-based mechanistic research.
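The conversion step described above can be illustrated with a small lookup. This is a minimal sketch under stated assumptions, not the LOINC2HPO tool itself: it assumes each observation arrives as a test code plus an interpretation flag ("H" for high, "L" for low), and every code and term identifier below is a made-up placeholder rather than a real LOINC code or HPO ID.

```python
# Hypothetical curated mapping: (test code, interpretation flag) -> phenotype term.
# All identifiers are placeholders, not real LOINC codes or HPO IDs.
MAPPING = {
    ("LOINC:EXAMPLE-EOS", "H"): "HP:EXAMPLE-eosinophilia",
    ("LOINC:EXAMPLE-EOS-ALT", "H"): "HP:EXAMPLE-eosinophilia",  # second test, same phenotype
    ("LOINC:EXAMPLE-GLU", "H"): "HP:EXAMPLE-hyperglycemia",
}


def convert(observations, mapping=MAPPING):
    """Split observations into mapped phenotype terms and unmapped leftovers."""
    terms, unmapped = [], []
    for obs in observations:
        term = mapping.get(obs)
        if term is None:
            unmapped.append(obs)
        else:
            terms.append(term)
    return terms, unmapped


observations = [
    ("LOINC:EXAMPLE-EOS", "H"),
    ("LOINC:EXAMPLE-EOS-ALT", "H"),
    ("LOINC:EXAMPLE-GLU", "H"),
    ("LOINC:EXAMPLE-UNMAPPED", "L"),
]

terms, unmapped = convert(observations)
coverage = len(terms) / len(observations)  # fraction of tests converted
print(f"converted {coverage:.0%} of observations; unmapped: {unmapped}")
```

The two eosinophil entries show how multiple test codes can collapse onto one phenotype term, and the coverage figure is the same kind of quantity as the conversion rate reported for the pilot study.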