Fig. 1.
Overview of the steps in the svMIL method. (A) Rules applied by our model to link non-coding SVs to their effect on genes, with biological examples of how these effects could be caused. (B) Each SV–gene pair is a bag, which contains instances representing regulatory elements. Each instance has its own feature vector. All instances have the same number of features, but bags can differ in the number and types of instances they contain. In this example, positive bags are identified by shared affected enhancers with a specific regulatory mark. (C) In the MILES approach, bags are mapped to a feature space by constructing a bag-to-instance similarity matrix. Positive bags will have smaller distances to positive instances than to negative instances. (D) This allows positive and negative bags to be separated in feature space using a standard classifier.
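The bag-to-instance mapping in panel (C) can be illustrated with a minimal sketch. It assumes the Gaussian similarity from the original MILES formulation, in which a bag's similarity to an instance is set by the closest instance in that bag; the similarity actually used in svMIL may differ. The function name `miles_embedding`, the parameter `sigma`, and the toy bags are hypothetical and chosen only for illustration.

```python
import numpy as np

def miles_embedding(bags, sigma=1.0):
    """Map each bag to a vector of similarities to every pooled instance.

    Assumption: Gaussian similarity as in the original MILES formulation.
    """
    # Pool the instances of all bags; each one serves as a reference instance.
    concepts = np.vstack(bags)
    embedding = np.empty((len(bags), concepts.shape[0]))
    for i, bag in enumerate(bags):
        # Squared Euclidean distance from every bag instance to every reference.
        diff = bag[:, None, :] - concepts[None, :, :]
        sq_dist = np.sum(diff ** 2, axis=2)
        # The closest instance in the bag determines the bag-level similarity.
        embedding[i] = np.exp(-sq_dist.min(axis=0) / sigma ** 2)
    return embedding

# Toy bags (hypothetical data): same feature count, different instance counts.
bags = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),
    np.array([[5.0, 5.0]]),
    np.array([[0.1, 0.0], [4.0, 4.0], [9.0, 9.0]]),
]
features = miles_embedding(bags, sigma=2.0)
print(features.shape)  # (3, 6): one row per bag, one column per pooled instance
```

Each row of `features` is a fixed-length representation of one bag, regardless of how many instances the bag contains, so the rows could be passed to any standard classifier as in panel (D).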