Team publications
Publication year: 2017
The Functional Impact of Alternative Splicing in Cancer.
Cell reports : 2215-2226 : DOI : S2211-1247(17)31104-X
Alternative splicing changes are frequently observed in cancer and are starting to be recognized as important signatures for tumor progression and therapy. However, their functional impact and relevance to tumorigenesis remain mostly unknown. We carried out a systematic analysis to characterize the potential functional consequences of alternative splicing changes in thousands of tumor samples. This analysis revealed that a subset of alternative splicing changes affect protein domain families that are frequently mutated in tumors and potentially disrupt protein-protein interactions in cancer-related pathways. Moreover, there was a negative correlation between the number of these alternative splicing changes in a sample and the number of somatic mutations in drivers. We propose that a subset of the alternative splicing changes observed in tumors may represent independent oncogenic processes that could be relevant to explain the functional transformations in cancer, and some of them could potentially be considered alternative splicing drivers (AS drivers).
NetNorM: Capturing cancer-relevant information in somatic exome mutation data with gene networks for cancer stratification and prognosis.
PLoS computational biology : e1005573 : DOI : 10.1371/journal.pcbi.1005573
Genome-wide somatic mutation profiles of tumours can now be assessed efficiently and promise to move precision medicine forward. Statistical analysis of mutation profiles is, however, challenging due to the low frequency of most mutations, the varying mutation rates across tumours, and the presence of a majority of passenger events that hide the contribution of driver events. Here we propose a method, NetNorM, to represent whole-exome somatic mutation data in a form that enhances cancer-relevant information using a gene network as background knowledge. We evaluate its relevance for two tasks: survival prediction and unsupervised patient stratification. Using data from 8 cancer types from The Cancer Genome Atlas (TCGA), we show that it improves over the raw binary mutation data and network diffusion for these two tasks. In doing so, we also provide a thorough assessment of somatic mutations' prognostic power, which has been overlooked by previous studies because of the sparse and binary nature of mutations.
Functional Toll-Like Receptor (TLR)2 polymorphisms in the susceptibility to inflammatory bowel disease.
PloS one : e0175180 : DOI : 10.1371/journal.pone.0175180
The recent genome-wide association studies (GWAS) in inflammatory bowel disease (IBD) suggest significant genetic overlap with complex mycobacterial diseases like tuberculosis or leprosy. TLR variants have previously been linked to susceptibility for mycobacterial diseases. Here we investigated the contribution to IBD risk of two TLR2 polymorphisms, the low-prevalence variant Arg753Gln and the GTn microsatellite repeat polymorphism in intron 2. We studied association with disease, possible correlations with phenotype and gene-gene interactions.
Nuclei segmentation in histopathology images using deep neural networks
2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017)
Publication year: 2016
A Stromal Immune Module Correlated with the Response to Neoadjuvant Chemotherapy, Prognosis and Lymphocyte Infiltration in HER2-Positive Breast Carcinoma Is Inversely Correlated with Hormonal Pathways.
PloS one : e0167397 : DOI : 10.1371/journal.pone.0167397
HER2-positive breast cancer (BC) is a heterogeneous group of aggressive breast cancers, the prognosis of which has greatly improved since the introduction of treatments targeting HER2. However, these tumors may display intrinsic or acquired resistance to treatment, and classifiers of HER2-positive tumors are required to improve the prediction of prognosis and to develop novel therapeutic interventions.
smiFISH and FISH-quant – a flexible single RNA detection approach with super-resolution capability.
Nucleic acids research : e165
Single molecule FISH (smFISH) allows studying transcription and RNA localization by imaging individual mRNAs in single cells. We present smiFISH (single molecule inexpensive FISH), an easy-to-use and flexible RNA visualization and quantification approach that uses unlabelled primary probes and a fluorescently labelled secondary detector oligonucleotide. The gene-specific probes are unlabelled and can therefore be synthesized at low cost, which allows more probes to be used per mRNA and results in a substantial increase in detection efficiency. smiFISH is also flexible since differently labelled secondary detector probes can be used with the same primary probes. We demonstrate that this flexibility allows multicolor labelling without the need to synthesize new probe sets. We further demonstrate that the use of a specific acrydite detector oligonucleotide allows smiFISH to be combined with expansion microscopy, enabling the resolution of transcripts in 3D below the diffraction limit on a standard microscope. Lastly, we provide improved, fully automated software tools from probe design to quantitative analysis of smFISH images. In short, we provide a complete workflow to automatically obtain counts of individual RNA molecules in single cells.
Crowdsourced assessment of common genetic contribution to predicting anti-TNF treatment response in rheumatoid arthritis.
Nature communications : 12460 : DOI : 10.1038/ncomms12460
Rheumatoid arthritis (RA) affects millions worldwide. While anti-TNF treatment is widely used to reduce disease progression, treatment fails in approximately one-third of patients. No biomarker currently exists that identifies non-responders before treatment. A rigorous community-based assessment of the utility of SNP data for predicting anti-TNF treatment efficacy in RA patients was performed in the context of a DREAM Challenge (http://www.synapse.org/RA_Challenge). An open challenge framework enabled the comparative evaluation of predictions developed by 73 research groups using the most comprehensive available data and covering a wide range of state-of-the-art modelling methodologies. Despite a significant genetic heritability estimate of the treatment non-response trait (h² = 0.18, P value = 0.02), no significant genetic contribution to prediction accuracy is observed. Results formally confirm the expectations of the rheumatology community that SNP information does not significantly improve predictive performance relative to standard clinical traits, thereby justifying a refocusing of future efforts on collection of other data.
ARHGEF17 is an essential spindle assembly checkpoint factor that targets Mps1 to kinetochores.
The Journal of cell biology : 647-59 : DOI : 10.1083/jcb.201408089
To prevent genome instability, mitotic exit is delayed until all chromosomes are properly attached to the mitotic spindle by the spindle assembly checkpoint (SAC). In this study, we characterized the function of ARHGEF17, identified in a genome-wide RNA interference screen for human mitosis genes. Through a series of quantitative imaging, biochemical, and biophysical experiments, we showed that ARHGEF17 is essential for SAC activity, because it is the major targeting factor that controls localization of the checkpoint kinase Mps1 to the kinetochore. This mitotic function is mediated by direct interaction of the central domain of ARHGEF17 with Mps1, which is autoregulated by the activity of Mps1 kinase, for which ARHGEF17 is a substrate. This mitosis-specific role is independent of ARHGEF17’s RhoGEF activity in interphase. Our study thus assigns a new mitotic function to ARHGEF17 and reveals the molecular mechanism for a key step in SAC establishment.
Multitask feature selection with task descriptors
Pacific Symposium on Biocomputing : 261-72
Machine learning applications in precision medicine are severely limited by the scarcity of data to learn from. Indeed, training data often contains many more features than samples. To alleviate the resulting statistical issues, the multitask learning framework proposes to learn different but related tasks jointly, rather than independently, by sharing information between these tasks. Within this framework, the joint regularization of model parameters results in models that have few non-zero coefficients and share similar sparsity patterns. We propose a new regularized multitask approach that incorporates task descriptors, hence modulating the amount of information shared between tasks according to their similarity. We show on simulated data that this method outperforms other multitask feature selection approaches, particularly in the case of scarce data. In addition, we demonstrate on peptide MHC-I binding data the ability of the proposed approach to make predictions for new tasks for which no training data is available.
Network-Guided Biomarker Discovery
Lecture Notes in Computer Science, Machine Learning for Health Informatics : 9605 : 319-336 : DOI : 10.1007/978-3-319-50478-0_16
Identifying measurable genetic indicators (or biomarkers) of a specific condition of a biological system is a key element of precision medicine. Indeed, it allows diagnosis, prognosis, and treatment choice to be tailored to the individual characteristics of a patient. In machine learning terms, biomarker discovery can be framed as a feature selection problem on whole-genome data sets. However, classical feature selection methods are usually underpowered to process these data sets, which contain orders of magnitude more features than samples. This can be addressed by making the assumption that genetic features that are linked on a biological network are more likely to work jointly towards explaining the phenotype of interest. We review here three families of methods for feature selection that integrate prior knowledge in the form of networks.
Controlling the distance to a Kemeny consensus without computing it
Proceedings of The 33rd International Conference on Machine Learning : 2971-2980
New general features based on superpixels for image segmentation learning
International Symposium on Biomedical Imaging
Publication year: 2015
Large-scale machine learning for metagenomics sequence classification.
Bioinformatics (Oxford, England) : 1023-32 : DOI : 10.1093/bioinformatics/btv683
Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions.
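As a rough illustration of the compositional idea mentioned in this abstract, the sketch below counts the overlapping k-mers in a single DNA read. It is not the authors' implementation: the function name `kmer_counts`, the choice of k = 4, and the decision to skip k-mers containing ambiguous bases such as N are all assumptions made for this example. A classifier would typically take such a k-mer profile as its feature vector.

```python
from collections import Counter


def kmer_counts(read: str, k: int) -> Counter:
    """Count overlapping k-mers in a DNA read (hypothetical helper).

    k-mers containing anything other than A, C, G or T are skipped;
    this is an assumption of the sketch, not a rule from the paper.
    """
    read = read.upper()
    alphabet = set("ACGT")
    counts = Counter()
    # Slide a window of width k along the read, one base at a time.
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


# Toy read with one ambiguous base (N); k = 4 is an arbitrary choice.
profile = kmer_counts("ACGTACGTNAC", k=4)
print(profile)
```

Windows that overlap the ambiguous base are dropped, so this toy read contributes only the k-mers made entirely of A, C, G and T.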
Changes in correlation between promoter methylation and gene expression in cancer.
BMC genomics : 873 : DOI : 10.1186/s12864-015-1994-2
Methylation of high-density CpG regions known as CpG Islands (CGIs) has been widely described as a mechanism associated with gene expression regulation. Aberrant promoter methylation is considered a hallmark of cancer involved in silencing of tumor suppressor genes and activation of oncogenes. However, recent studies have also challenged the simple model of gene expression control by promoter methylation in cancer, and the precise mechanism of, and role played by, changes in DNA methylation in carcinogenesis remain elusive.
Φ-score: A cell-to-cell phenotypic scoring method for sensitive and selective hit discovery in cell-based assays.
Scientific reports : 14221 : DOI : 10.1038/srep14221
Phenotypic screening monitors phenotypic changes induced by perturbations, including those generated by drugs or RNA interference. Currently used methods for scoring screen hits have proven to be problematic, particularly when applied to physiologically relevant conditions such as low cell numbers or inefficient transfection. Here, we describe the Φ-score, which is a novel scoring method for the identification of phenotypic modifiers or hits in cell-based screens. Φ-score performance was assessed with simulations, a validation experiment and its application to gene identification in a large-scale RNAi screen. Using robust statistics and a variance model, we demonstrated that the Φ-score showed better sensitivity, selectivity and reproducibility compared to classical approaches. The improved performance of the Φ-score paves the way for cell-based screening of primary cells, which are often difficult to obtain from patients in sufficient numbers. We also describe a dedicated merging procedure to pool scores from small interfering RNAs targeting the same gene so as to provide improved visualization and hit selection.