ICML-98 Submission #144

        Theory Refinement of Bayesian Networks with Hidden Variables
    
    Sowmya Ramachandran,                      Raymond J. Mooney,
    Stottler Henke and Associates, Inc.,      Department of Computer Sciences,
    1660, So. Amphlett Blvd. Ste 350,         University of Texas at Austin,
    San Mateo, CA, 94402.                     Austin, TX, 78712.

                Abstract

    While interest in learning Bayesian networks from data has been growing,
    no technique for learning or revising Bayesian networks with hidden
    variables (i.e., variables not represented in the data) has been shown to
    be efficient, effective, and scalable through evaluation on real data.
    The few techniques that exist for revising such networks
    perform a blind search through a large space of revisions, and are therefore
    computationally expensive.  This paper presents BANNER, a technique that
    uses data to revise a given Bayesian network with noisy-or and noisy-and
    nodes so as to improve its classification accuracy.  The initial network can be
    derived directly from a logical theory expressed as propositional rules.
    BANNER can revise networks with hidden variables, and add hidden variables
    when necessary.  Unlike previous approaches, BANNER uses the data to focus
    the search for effective modifications, employing mechanisms similar to
    those of logical theory refinement techniques.  Experiments on real-world problems in the
    domain of molecular biology demonstrate that BANNER can effectively
    revise fairly large networks and significantly improve their accuracy.
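
    As a rough illustration of the noisy-or and noisy-and nodes mentioned
    above, the sketch below uses one common textbook formulation of each.
    The function names, the parameter names, and the numeric values are
    assumptions made for illustration only; they are not taken from BANNER.

```python
from math import prod

def noisy_or(parent_states, strengths):
    """P(child is true) under a noisy-or model (assumed formulation).

    Each active parent i independently turns the child on with
    probability strengths[i]; the child is off only if all of them fail.
    """
    return 1.0 - prod(
        (1.0 - s for x, s in zip(parent_states, strengths) if x), start=1.0
    )

def noisy_and(parent_states, inhibitions):
    """P(child is true) under a noisy-and model (assumed formulation).

    Each inactive parent i independently blocks the child with
    probability inhibitions[i]; the child is on only if none of them blocks.
    """
    return prod(
        (1.0 - c for x, c in zip(parent_states, inhibitions) if not x), start=1.0
    )

# Hypothetical parameters: three parents of an OR node, two of an AND node.
p_or = noisy_or([True, False, True], [0.8, 0.9, 0.5])
p_and = noisy_and([True, False], [0.7, 0.6])
```

    With every parameter set to 1.0 the two functions reduce to logical OR
    and logical AND, which suggests why a propositional rule base can serve
    as the starting point for such a network.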

    Keywords: Bayesian Networks, Theory Refinement, Probabilistic Reasoning

    Email address of contact author: sowmya@shai.com

    Phone number of contact author: (650) 655-7242