HIT 2.0

Abstract: Literature-described targets of herbal ingredients have been explored to facilitate the mechanistic study of herbs as well as new drug discovery. Although several databases provide similar information, the majority of them are limited to literature published before 2010 and urgently need updating. HIT 2.0 was constructed as the latest curated dataset focusing on Herbal Ingredients' Targets, covering PubMed literature from 2000 to 2020. Currently, HIT 2.0 hosts 10,031 compound-target activity pairs with quality indicators between 2208 targets and 1237 ingredients from more than 1250 reputable herbs. The molecular targets cover genes/proteins that are directly/indirectly activated/inhibited, protein binders, and enzyme substrates or products. Also included are genes regulated under treatment with an individual ingredient. Crosslinks were made to TTD, DrugBank, KEGG, PDB, UniProt, Pfam, NCBI, TCM-ID and other databases. More importantly, HIT enables automatic Target-mining and My-target curation from daily released PubMed literature. Thus, users can retrieve and download the latest abstracts containing potential targets for compounds of interest, even for those not yet covered in HIT. Further, users can log into the 'My-target' system to curate personal target profiles online based on the retrieved abstracts. HIT can be accessed at http://hit2.badd-cao.net.

Citation: Deyu Yan, Genhui Zheng, Caicui Wang, Zikun Chen, Tiantian Mao, Jian Gao, Yu Yan, Xiangyi Chen, Xuejie Ji, Jinyu Yu, Saifeng Mo, Haonan Wen, Wenhao Han, Mengdi Zhou, Yuan Wang, Jun Wang, Kailin Tang, Zhiwei Cao. HIT 2.0: an enhanced platform for Herbal Ingredients' Targets, Nucleic Acids Research, Volume 50, Issue D1, 7 January 2022, Pages D1238-D1243.


Abstract: Although transcriptomics technologies have evolved rapidly over the past decades, integrative analysis of mixed microarray and RNA-seq data remains challenging because of the inherent difference in variability between the two. Here, Rank-In was proposed to correct the nonbiological effects across the two technologies, allowing freely blended data to be used for consolidated analysis. Rank-In was rigorously validated on public cell and tissue samples tested by both technologies. On the two reference samples of the SEQC project, Rank-In not only perfectly classified the 44 profiles but also achieved the best accuracy of 0.9 in predicting TaqMan-validated differentially expressed genes (DEGs). More importantly, on 327 Glioblastoma (GBM) profiles and 248, 523 heterogeneous colon cancer profiles respectively, only Rank-In successfully discriminated every single cancer profile from normal controls, while the other methods could not. Further, on mixed seq-array GBM profiles of different sizes, Rank-In robustly reproduced a median DEG overlap ranging from 0.74 to 0.83 among top genes, whereas the others never exceeded 0.72. As the first effective method enabling cross-technology analysis of mixed data, Rank-In accommodates hybrids of array and seq profiles for integrative study of large/small, paired/unpaired and balanced/imbalanced samples, opening the possibility of reducing the sampling space of clinical cancer patients. Rank-In can be accessed at http://www.badd-cao.net/rank-in/index.html.

Citation: Kailin Tang, Xuejie Ji, Mengdi Zhou, Zeliang Deng, Yuwei Huang, Genhui Zheng, Zhiwei Cao. Rank-in: enabling integrative analysis across microarray and RNA-seq for cancer, Nucleic Acids Research, Volume 49, Issue 17, 27 September 2021, Page e99.
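The DEG overlap figures quoted in the Rank-In abstract can be read as the share of top-ranked genes that two analyses have in common. The short Python sketch below illustrates one plausible way to compute such a score. The function name `top_k_overlap`, the definition (intersection size divided by k), and the gene lists are all assumptions made for illustration; they are not taken from the Rank-In paper or its code.

```python
def top_k_overlap(ranked_a, ranked_b, k):
    """Fraction of the top-k genes in ranked_a that also occur in the top-k of ranked_b.

    Both inputs are gene lists ordered from most to least significant.
    This definition is an illustrative assumption, not the published metric.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(ranked_a[:k])
    top_b = set(ranked_b[:k])
    return len(top_a & top_b) / k


# Hypothetical ranked DEG lists from a single-technology and a mixed analysis.
array_only = ["EGFR", "TP53", "PTEN", "MKI67", "CDK4"]
mixed = ["TP53", "EGFR", "CDK4", "NF1", "PTEN"]

print(top_k_overlap(array_only, mixed, 4))  # 2 shared genes out of 4 -> 0.5
```

A score of 1.0 means the two analyses agree completely on their top k genes, so values such as 0.74 to 0.83 would indicate that most, but not all, top genes are reproduced.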

[http://www.badd-cao.net/seppa3/] [https://www.biosino.org/seppa3/]

Abstract: B-cell epitope information is critical to immune therapy and vaccine design. Protein epitopes can be significantly affected by glycosylation, yet no method has considered this until now. Building on previous versions of Spatial Epitope Prediction of Protein Antigens (SEPPA), we here present an enhanced tool, SEPPA 3.0, which supports glycoprotein antigens. Parameters were updated based on the latest and largest dataset. Additional micro-environmental features of glycosylation triangles and glycosylation-related amino acid indexes were then added as important classifiers, coupled with a final calibration based on neighboring antigenicity. The logistic regression model was retained, as in SEPPA 2.0. An AUC value of 0.794 was obtained through 10-fold cross-validation on internal validation. Independent testing on general protein antigens resulted in an AUC of 0.740 with a BA (balanced accuracy) of 0.657 as the baseline of SEPPA 3.0. Most importantly, when tested on independent glycoprotein antigens only, SEPPA 3.0 gave an AUC of 0.749 and a BA of 0.665, the top performance among peers. As the first server enabling accurate epitope prediction for glycoproteins, SEPPA 3.0 shows significant advantages over popular peers on both general protein and glycoprotein antigens. It can be accessed at http://bidd2.nus.edu.sg/SEPPA3/ or at http://www.badd-cao.net/seppa3/index.html. Batch query is supported.

Citation: Chen Zhou, Zikun Chen, Lu Zhang, Deyu Yan, Tiantian Mao, Kailin Tang, Tianyi Qiu, Zhiwei Cao. SEPPA 3.0 - enhanced spatial epitope prediction enabling glycoprotein antigens, Nucleic Acids Research, Volume 47, Issue W1, 02 July 2019, Pages W388-W394.
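For readers unfamiliar with the two metrics reported for SEPPA 3.0, the Python sketch below shows how AUC and balanced accuracy (BA) are commonly computed for a binary epitope/non-epitope classifier. It is only an illustration of the standard definitions: the labels, scores and the 0.5 threshold are made-up assumptions, and nothing here reproduces SEPPA's model or data.

```python
def balanced_accuracy(labels, predictions):
    """Mean of sensitivity and specificity for binary labels (1 = epitope)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical residue-level labels and predicted scores.
labels = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(round(balanced_accuracy(labels, predictions), 3))
print(round(roc_auc(labels, scores), 3))
```

BA is useful alongside AUC because epitope residues are typically a small minority of surface residues, and plain accuracy would then be dominated by the non-epitope class.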

[https://www.biosino.org/ce_blast/] [http://badd.tongji.edu.cn/ce_blast/]

Abstract: Major challenges in vaccine development include rapidly selecting or designing immunogens for raising cross-protective immunity against different intra- or inter-subtypic pathogens, especially for newly emerging varieties. Here we propose a computational method, Conformational Epitope (CE)-BLAST, for calculating the antigenic similarity among different pathogens with stable and high performance. Unlike currently available models, which rely heavily on historical experimental data, it is independent of prior binding-assay information. Tool validation incorporates influenza-related experimental data sufficient for determining stability and reliability. Application to dengue-related data demonstrates high agreement between the computed clusters and the experimental serological data, which is undetectable by classical grouping. CE-BLAST identifies the potential cross-reactive epitope between the recent Zika pathogen and the dengue virus, precisely corroborated by experimental data. The high performance on pathogens without experimental binding data suggests the potential utility of CE-BLAST for rapidly designing cross-protective vaccines or promptly determining the efficacy of currently marketed vaccines against emerging pathogens, both critical factors in containing emerging disease outbreaks.

Citation: Tianyi Qiu, Yiyan Yang, Jingxuan Qiu, Yang Huang, Tianlei Xu, Han Xiao, Dingfeng Wu, Qingchen Zhang, Chen Zhou, Xiaoyan Zhang, Kailin Tang, Jianqing Xu, Zhiwei Cao. CE-BLAST: Making it possible to compute antigenic similarity for newly emerging pathogens, Nature Communications, 9(1), 2018, Article 1772.