Papers shared by user 颜林林.
130 shared papers found in total; this page shows entries 41 - 60.
41.
颜林林
(2022-10-02 15:26):
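#paper doi:10.1186/s12859-022-04948-9 BMC Bioinformatics, 2022, Visualizing the knowledge structure and evolution of bioinformatics. This paper uses data analysis and visualization methods that are common in bioinformatics to study the discipline of bioinformatics itself. By analyzing the abstract text of relevant papers published over the past several decades, it shows how research patterns have shifted (for example, from purely theoretical model computation to stacking machine learning models) and how the knowledge structure has changed accordingly. The approach is novel, and presenting the knowledge structure as UMAP scatter plots in the main text is also creative.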
Abstract:
BACKGROUND: Bioinformatics has gained much attention as a fast growing interdisciplinary field. Several attempts have been conducted to explore the field of bioinformatics by bibliometric analysis, however, such works did not elucidate the role of visualization in analysis, nor focus on the relationship between sub-topics of bioinformatics. RESULTS: First, the hotspot of bioinformatics has moderately shifted from traditional molecular biology to omics research, and the computational method has also shifted from mathematical model to data mining and machine learning. Second, DNA-related topics are bridge topics in bioinformatics research. These topics gradually connect various sub-topics that are relatively independent at first. Third, only a small part of topics we have obtained involves a number of computational methods, and the other topics focus more on biological aspects. Fourth, the proportion of computing-related topics hit a trough in the 1980s. During this period, the use of traditional calculation methods such as mathematical model declined in a large proportion while the new calculation methods such as machine learning have not been applied in a large scale. This proportion began to increase gradually after the 1990s. Fifth, although the proportion of computing-related topics is only slightly higher than the original, the connection between other topics and computing-related topics has become closer, which means the support of computational methods is becoming increasingly important for the research of bioinformatics. CONCLUSIONS: The results of our analysis imply that research on bioinformatics is becoming more diversified and the ranking of computational methods in bioinformatics research is also gradually improving.
42.
颜林林
(2022-10-01 23:28):
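#paper doi:10.1186/s12896-022-00758-2 BMC Biotechnology, 2022, A new method for screening acute/chronic lymphocytic leukemia: dual-label time-resolved fluorescence immunoassay. Based on findings from previous studies, this paper settles on two proteins, S100A8 and LRG1, as biomarkers for early detection of leukemia, and uses TRFIA (time-resolved fluorescence immunoassay; the method first appeared around 2002, see the article at doi: 10.1016/S0167-7012(01)00352-9) for high-sensitivity detection, thereby establishing a peripheral blood early-screening method for leukemia. The paper technically validates the screening method on samples at different concentrations (including inter-assay experiments) and clinically validates it in 120 healthy individuals plus 59 leukemia patients.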
Abstract:
BACKGROUND: Lymphocytic leukemia (LL) is a primary malignant tumor of hematopoietic tissue, which seriously affects the health of children and the elderly. The study aims to establish a new detection method for screening acute/chronic LL using time-resolved fluorescence immunoassay (TRFIA) via quantitative detection of S100 calcium binding protein A8 (S100A8) and leucine-rich alpha-2-glycoprotein 1 (LRG1) in serum. METHODS: Here a sandwich TRFIA was optimized and established: Anti-S100A8/LRG1 capture antibodies immobilized on 96-well plates captured S100A8/LRG1, and then banded together with the anti-S100A8/LRG1 detection antibodies labeled with Europium(III) (Eu3+)/samarium(III) (Sm3+) chelates. Finally time resolved fluorometry measured the fluorescence intensity. RESULTS: The sensitivity of S100A8 was 1.15 ng/mL (LogY = 3.4027 + 0.4091 × LogX, R2 = 0.9828, P < 0.001, dynamic range: 2.1-10,000 ng/mL), and 3.2 ng/mL for LRG1 (LogY = 3.3009 + 0.4082 × LogX, R2 = 0.9748, P < 0.001, dynamic range: 4.0-10,000 ng/mL). The intra-assay and inter-assay CVs were low, ranging from 5.75% to 8.23% for S100A8 and 5.30% to 9.45% for LRG1 with high specificity and affinity in serum samples. Bland-Altman plots indicated TRFIA and ELISA kits have good agreement in clinical serum samples. Additionally, the cutoff values for S100A8 and LRG1 were 1849.18 ng/mL and 588.08 ng/mL, respectively. CONCLUSION: The present TRFIA method could be used for the quantitative detection of S100A8 and LRG1 in serum, and it has high sensitivity, accuracy and specificity. Clinically, this TRFIA method could be suitable for screening of LL via the quantitative detection of S100A8 and LRG1.
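The calibration curves quoted in the abstract have the form LogY = a + b × LogX. As a rough illustration of how such a curve is used, the sketch below inverts it to recover a concentration from a measured signal. It assumes base-10 logarithms, that Y is the fluorescence reading and X the concentration in ng/mL; the abstract does not state these explicitly, and the function names are made up for this example.

```python
import math

# Coefficients (a, b) quoted in the abstract for LogY = a + b * LogX.
# Assumption: logs are base 10, Y is the fluorescence signal, X is ng/mL.
S100A8_CURVE = (3.4027, 0.4091)
LRG1_CURVE = (3.3009, 0.4082)


def signal_from_concentration(conc_ng_ml, curve):
    """Forward model: signal predicted by the curve for a concentration."""
    a, b = curve
    return 10 ** (a + b * math.log10(conc_ng_ml))


def concentration_from_signal(signal, curve):
    """Invert LogY = a + b * LogX to recover X from a measured Y."""
    a, b = curve
    return 10 ** ((math.log10(signal) - a) / b)


if __name__ == "__main__":
    # Round trip at an arbitrary concentration inside the dynamic range.
    y = signal_from_concentration(100.0, S100A8_CURVE)
    print(concentration_from_signal(y, S100A8_CURVE))
```

A recovered concentration would then be compared with the cutoff values reported in the abstract; values outside the stated dynamic range should not be trusted.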
43.
颜林林
(2022-09-26 23:20):
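#paper doi:10.1002/ajmg.a.62974 American Journal of Medical Genetics, 2022, Reduced resource utilization with early use of next-generation sequencing in rare genetic diseases in an Asian cohort. This paper from Singapore reviews patient data from a tertiary hospital between 2004 and 2020. It retrieves cases that had undergone genetic testing and had corresponding billing data, and ends up with nearly one hundred rare-disease patients covering four categories of genetic disease: GDD (global developmental delay), MCA (multiple congenital anomalies), NMD (neuromuscular disorder), and PID (primary immunodeficiency disorder). Based on the tests recorded in the medical records, combined with standard clinical practice pathways, the authors compare running the multiple tests in sequence as the pathway prescribes against reasonably dropping one or more of them, down to a strategy that keeps only whole exome sequencing (WES). They evaluate each strategy in terms of economics and diagnostic accuracy and offer some practical recommendations. Although the cases were collected over a long time span, the number of usable cases is still limited, which reduces the value of the results. Still, the approach is worth learning from: it is a reasonable and persuasive strategy for promoting WES or WGS (whole genome sequencing) to a first-line or early diagnostic method. In a country with a population as large as China's, building and maintaining detailed long-term clinical records to support retrospective studies of this kind would be even more valuable.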
Abstract:
Children with genetic diseases endure a prolonged and costly "diagnostic odyssey." The use of whole exome sequencing (WES) and whole genome sequencing (WGS) has improved the diagnosis rate, ending the odyssey. However, the additional costs associated with WES/WGS has impeded their adoption in Asian settings. We aim to estimate the expected change to the mean number of diagnostic tests used, and the associated costs from a decision to use WES early in the diagnostic pathways of pediatric phenotypes, as compared to Existing Practice. Retrospective data from a patient cohort recruited under the Singapore Undiagnosed Disease Program from a tertiary hospital in Singapore, for the period October 2004 to September 2020, was analyzed. Four phenotype-specific subgroups were used: multiple congenital anomalies (MCA) without developmental delay; global developmental delay (GDD); neuromuscular disorder (NMD) and primary immunodeficiency disorder (PID). Patients had undergone a traditional diagnostic pathway and received a diagnosis either through clinical exome or WES or WGS. A costs only analysis was performed, by tabulating the outcomes "test quantity" and "test costs" incurred by patients. The outcomes were compared with alternate diagnostic pathways which incorporates the early introduction of WES trio testing. To include uncertainty in cost outcomes, simulation studies were done on uncertain parameters. Cost outcomes are reported in Singapore dollars (S$). The 92 included patients had MCA (n = 48), GDD (n = 29), NMD (n = 10), or PID (n = 5). Patients were aged between 18 days and 26 years, 52.2% were males. The majority were of Chinese ethnicity (81.5%). If patients had access to WES directly, test quantity reduced by 97.38% for MCA, 96.98% for GDD, 96.56% for NMD, and 99.84% for PID. The expected cost savings per patient were $5940 for MCA (US$4433), $5342 for GDD (US$3986), $4622 for NMD (US$3449), and $58,497 for PID (US$43,654). Uncertainty assessment for MCA and GDD patients showed a respective likelihood of 86.9% and 97.4% for cost savings. Adoption of alternate diagnostic pathways with early WES in selected pediatric subgroups are likely to reduce costs, when compared to Existing Practice. Benefits arising from earlier diagnosis, and the potential cost savings could mitigate the large initial cost of implementing WES in Asian settings.
44.
颜林林
(2022-09-25 15:32):
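#paper doi:10.1101/2022.09.20.22280143 medRxiv, 2022, Whole-Genome Promoter Profiling of Plasma Cell-Free DNA Exhibits Predictive Value for Preterm Birth. This paper tries to find preterm-birth biomarkers in cfDNA from maternal peripheral blood during pregnancy. Whole-genome sequencing was performed on 20 full-term and 20 preterm enrolled pregnancies, together with whole-transcriptome sequencing of the corresponding placenta and peripheral blood. Differentially expressed genes were identified and associated with the coverage depth of the upstream regulatory sequences of the same genes in peripheral-blood cfDNA. The resulting features were validated in NIPT data from 2,590 pregnancies (2,072 full-term versus 518 preterm), and the authors expect the test to add value to current NIPT services. This is a preprint. Its abstract mentions only the final model built on the two-thousand-plus cases and its performance, which differs somewhat from the overall logic of the main text, so the paper's argument clearly needs more polishing. The dataset and results are nonetheless worth watching.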
medRxiv, 2022. DOI: 10.1101/2022.09.20.22280143
Abstract:
Preterm birth (PTB) occurs in around 11% of all births worldwide, resulting in significant morbidity and mortality for both mothers and offspring. Identification of pregnancies at risk of preterm birth in early pregnancy may help improve intervention and reduce its incidence. However, there exist few methods for PTB prediction developed with large sample size, high throughput screening and validation in independent cohorts. Here, we established a large scale, multi center, and case control study that included 2,590 pregnancies (2,072 full term and 518 preterm pregnancies) from three independent hospitals to develop a preterm birth classifier. We implemented whole genome sequencing on their plasma cfDNA and then their promoter profiling (read depth spanning from -1 KB to +1 KB around the transcriptional start site) was analyzed. Using three machine learning models and two feature selection algorithms, classifiers for predicting preterm delivery were developed. Among them, a classifier based on the support vector machine model and backward algorithm, named PTerm (Promoter profiling classifier for preterm prediction), exhibited the largest AUC value of 0.878 (0.852-0.904) following LOOCV cross validation. More importantly, PTerm exhibited good performance in three independent validation cohorts and achieved an overall AUC of 0.849 (0.831-0.866). Taken together, PTerm could be based on current noninvasive prenatal test (NIPT) data without changing its procedure or adding detection cost, which can be easily adapted for preclinical tests.
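The abstract defines promoter profiling as read depth from -1 KB to +1 KB around the transcriptional start site. The sketch below shows one simple way such a per-promoter depth could be computed from fragment coordinates. It is only an illustration under assumed conventions (0-based half-open intervals, hypothetical function names) and is not the PTerm pipeline.

```python
def promoter_window(tss, flank=1000):
    """Return the 0-based half-open interval [start, end) covering TSS +/- flank."""
    start = max(0, tss - flank)
    end = tss + flank + 1
    return start, end


def mean_depth(fragments, window):
    """Average per-base depth of (start, end) half-open fragments over a window."""
    w_start, w_end = window
    covered_bases = 0
    for f_start, f_end in fragments:
        overlap = min(f_end, w_end) - max(f_start, w_start)
        if overlap > 0:
            covered_bases += overlap
    return covered_bases / (w_end - w_start)


if __name__ == "__main__":
    window = promoter_window(5000)
    # Two toy cfDNA fragments; only their overlap with the window counts.
    print(window, mean_depth([(3900, 4100), (5500, 5667)], window))
```

In a classifier like the one described, one such value per gene promoter would form the feature vector for each sample, typically after normalising for sequencing depth.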
45.
颜林林
(2022-09-23 22:56):
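#paper doi:10.1371/journal.pgen.1010404 PLOS Genetics, 2022, Analysis of low-level somatic mosaicism reveals stage and tissue-specific mutational features in human development. This paper includes 498 samples from 190 individuals, among them patients with neurological disease, patients with brain tumors, and healthy controls. Sample types include peripheral blood and tissues such as brain, heart, and liver. Matched whole-exome sequencing (average depth ~500x) was performed on these samples to study the somatic mutations in each sample, their distribution across tissue types and developmental stages, and differences in mutational signatures. The mutations were also validated by Sanger sequencing and ultra-deep targeted amplicon sequencing, and their distribution across cell types was validated by flow cytometry. The analysis methods are fairly routine, but as a deep whole-exome dataset covering several hundred samples from different tissues, together with the somatic mutation distributions it describes, it is well worth reanalysis and mining.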
Abstract:
Most somatic mutations that arise during normal development are present at low levels in single or multiple tissues depending on the developmental stage and affected organs. However, the effect of human developmental stages or mutations of different organs on the features of somatic mutations is still unclear. Here, we performed a systemic and comprehensive analysis of low-level somatic mutations using deep whole-exome sequencing (average read depth ~500×) of 498 multiple organ tissues with matched controls from 190 individuals. Our results showed that early clone-forming mutations shared between multiple organs were lower in number but showed higher allele frequencies than late clone-forming mutations [0.54 vs. 5.83 variants per individual; 6.17% vs. 1.5% variant allele frequency (VAF)] along with less nonsynonymous mutations and lower functional impacts. Additionally, early and late clone-forming mutations had unique mutational signatures that were distinct from mutations that originated from tumors. Compared with early clone-forming mutations that showed a clock-like signature across all organs or tissues studied, late clone-forming mutations showed organ, tissue, and cell-type specificity in the mutation counts, VAFs, and mutational signatures. In particular, analysis of brain somatic mutations showed a bimodal occurrence and temporal-lobe-specific signature. These findings provide new insights into the features of somatic mosaicism that are dependent on developmental stage and brain regions.
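To make the variant allele frequency (VAF) figures in the abstract concrete, here is a small sketch of how VAF relates to read counts. The function names are made up for this note, and real callers also model sequencing error and strand bias, which this ignores.

```python
def variant_allele_frequency(alt_reads, ref_reads):
    """Fraction of reads at a site that carry the alternate allele."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def expected_alt_reads(depth, vaf):
    """Expected number of alternate-allele reads at a given depth and VAF."""
    return depth * vaf


if __name__ == "__main__":
    # At the ~500x average depth reported in the abstract:
    print(expected_alt_reads(500, 0.015))   # late clone-forming, 1.5% VAF
    print(expected_alt_reads(500, 0.0617))  # early clone-forming, 6.17% VAF
```

At 500x a 1.5% VAF corresponds to only about 7 or 8 supporting reads, which helps explain why deep sequencing and orthogonal validation matter for low-level mosaic variants.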
46.
颜林林
(2022-09-21 07:48):
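#paper doi:10.1002/humu.24458 Human Mutation, 2022, A survey of current methods to detect and genotype inversions. Inversions are a special class of genomic variant. A growing number of techniques can discover and characterize them, and as a result inversions have been found to be widespread in the genomes of many species. From a technical perspective, this review covers six classes of methods for identifying inversions: PCR, NGS read alignment, haplotype identification, template-strand sequencing (Strand-seq), optical mapping (Bionano), and genome assembly, along with the research progress each has enabled.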
Abstract:
Polymorphic inversions are ubiquitous in humans and they have been linked to both adaptation and disease. Following their discovery in Drosophila more than a century ago, inversions have proved to be more elusive than other structural variants. A wide variety of methods for the detection and genotyping of inversions have recently been developed: multiple techniques based on selective amplification by PCR, short- and long-read sequencing approaches, principal component analysis of small variant haplotypes, template strand sequencing, optical mapping, and various genome assembly methods. Many methods apply complex wet lab protocols or increasingly refined bioinformatic analyses. This review is an attempt to provide a practical summary and comparison of the methods that are in current use, with a focus on metrics such as the maximum size of segmental duplications at inversion breakpoints that each method can tolerate, the size range of inversions that they recover, their throughput, and whether the locations of putative inversions must be known beforehand.
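For readers less familiar with the variant type itself: on the forward strand, an inversion replaces a segment with its reverse complement. The sketch below shows just that definition on a toy sequence. It is not one of the detection methods surveyed in the review, and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def apply_inversion(seq, start, end):
    """Invert seq[start:end] (0-based, half-open) as seen on the forward strand."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


if __name__ == "__main__":
    reference = "AAACCGTTT"
    inverted = apply_inversion(reference, 3, 6)
    print(inverted)
    # Applying the same inversion twice restores the reference.
    print(apply_inversion(inverted, 3, 6) == reference)
```

Because the total sequence content is unchanged and only the breakpoints differ from the reference, short reads that do not span a breakpoint carry no signal, which is one reason inversions are harder to detect than other structural variants.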
47.
颜林林
(2022-09-20 06:54):
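#paper doi:10.1002/humu.24465 Human Mutation, 2022, Long-read sequencing for molecular diagnostics in constitutional genetic disorders. This is a review on using third-generation long-read sequencing for genetic testing of inherited diseases, from Children's Hospital of Philadelphia. The paper uses the hearing-loss gene testing offered by the hospital as an example of how, in practice, multiple testing technologies are combined to detect different types of disease-related variants across more than a hundred genes. It also works through real cases to analyze systematically the problems that can lead to misinterpreted results, such as repetitive segments, pseudogenes, and multiple distant variants in the same gene (which require phasing), and how long-read sequencing addresses each of them. For clinical genetic testing, the biggest bottleneck of third-generation sequencing today is the accumulated evidence and population data, but that is exactly what time can gradually build up. Given the cases shown in this paper, which can almost only be solved with long-read technologies, we can expect a wave of long-read genetic tests to reach practical use in the near future.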
Abstract:
Long-read sequencing (LRS) has been around for more than a decade, but widespread adoption of the technology has been slow due to the perceived high error rates and high sequencing cost. This is changing due to the recent advancements to produce highly accurate sequences and the reducing costs. LRS promises significant improvement over short read sequencing in four major areas: (1) better detection of structural variation (2) better resolution of highly repetitive or nonunique regions (3) accurate long-range haplotype phasing and (4) the detection of base modifications natively from the sequencing data. Several successful applications of LRS have demonstrated its ability to resolve molecular diagnoses where short-read sequencing fails to identify a cause. However, the argument for increased diagnostic yield from LRS remains to be validated. Larger cohort studies may be required to establish the realistic boundaries of LRS's clinical utility and analytical validity, as well as the development of standards for clinical applications. We discuss the limitations of the current standard of care, and contrast with the applications and advantages of two major LRS platforms, PacBio and Oxford Nanopore, for molecular diagnostics of constitutional disorders, and present a critical argument about the potential of LRS in diagnostic settings.
48.
颜林林
(2022-09-19 22:00):
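#paper doi:10.1038/s41598-022-17585-2 Scientific Reports, 2022, Recursive integration of synergised graph representations of multi-omics data for cancer subtypes identification. With high-throughput sequencing applied at multiple omics levels, cancer research has long since entered the multi-omics era. Integrating high-dimensional multi-omics data effectively has always been challenging. Most related methodological work focuses on dimensionality reduction and feature extraction for single-omics data. This paper develops a method named RISynG (Recursive Integration of Synergised Graph-representations), which extracts two representation matrices, Gramian and Laplacian, from the raw omics data so that different omics layers can be integrated more effectively. Compared with most earlier approaches, which simply concatenate multi-omics data, it achieves better classification results and enables cancer subtyping from multi-omics tumor data such as TCGA.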
Abstract:
Cancer subtypes identification is one of the critical steps toward advancing personalized anti-cancerous therapies. Accumulation of a massive amount of multi-platform omics data measured across the same set of samples provides an opportunity to look into this deadly disease from several views simultaneously. Few integrative clustering approaches are developed to capture shared information from all the views to identify cancer subtypes. However, they have certain limitations. The challenge here is identifying the most relevant feature space from each omic view and systematically integrating them. Both the steps should lead toward a global clustering solution with biological significance. In this respect, a novel multi-omics clustering algorithm named RISynG (Recursive Integration of Synergised Graph-representations) is presented in this study. RISynG represents each omic view as two representation matrices that are Gramian and Laplacian. A parameterised combination function is defined to obtain a synergy matrix from these representation matrices. Then a recursive multi-kernel approach is applied to integrate the most relevant, shared, and complementary information captured via the respective synergy matrices. At last, clustering is applied to the integrated subspace. RISynG is benchmarked on five multi-omics cancer datasets taken from The Cancer Genome Atlas. The experimental results demonstrate RISynG's efficiency over the other approaches in this domain.
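The abstract says each omic view is represented by a Gramian and a Laplacian matrix. The sketch below computes the textbook versions of both for a tiny sample-by-feature matrix. How RISynG actually builds its similarity graph and combines the two matrices is not described in the abstract, so the Gaussian affinity and all function names here are assumptions made purely for illustration.

```python
import math


def gram_matrix(X):
    """Gramian G = X X^T, with samples stored as the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def gaussian_affinity(X, sigma=1.0):
    """Assumed similarity graph: W_ij = exp(-||xi - xj||^2 / (2 sigma^2)), W_ii = 0."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return W


def graph_laplacian(W):
    """Unnormalised Laplacian L = D - W, where D is the diagonal degree matrix."""
    n = len(W)
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three samples, two features
    print(gram_matrix(X))
    print(graph_laplacian(gaussian_affinity(X)))
```

Both matrices are sample-by-sample, so views with different numbers of features end up in a common shape, which is what makes them convenient inputs for integration across omics layers.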
49.
颜林林
(2022-09-16 23:18):
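#paper doi:10.1016/j.molcel.2022.08.019 Molecular Cell, 2022, Developmental and housekeeping transcriptional programs in Drosophila require distinct chromatin remodelers. This paper caught my attention because, while skimming it, I saw two words, "Drosophila" (fruit fly) and "auxin" (the plant growth hormone), and I was curious how the two were connected. Back in my biology courses I had heard of auxin's supreme status in plant research. The paper mentions a technique called "auxin-inducible degradation (AID)", which originates from a 2009 Nature Methods paper (doi:10.1038/nmeth.1401). By adding a specific sequence to the target protein, the technique triggers ubiquitin-mediated degradation of that protein in the presence of auxin, so that protein degradation can be controlled artificially. Because ubiquitin-mediated degradation is widespread across species, the technique can be applied in all kinds of non-plant biological systems. Using this technique, the paper studies housekeeping genes and developmental genes in Drosophila. The former are expressed broadly in all cell types, while the latter are expressed only in cells of specific tissues and organs. By artificially controlling the degradation of the relevant proteins, it reveals differences between the two gene classes in chromatin remodeling and other related features.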
Abstract:
Gene transcription is a highly regulated process in all animals. In Drosophila, two major transcriptional programs, housekeeping and developmental, have promoters with distinct regulatory compatibilities and nucleosome organization. However, it remains unclear how the differences in chromatin structure relate to the distinct regulatory properties and which chromatin remodelers are required for these programs. Using rapid degradation of core remodeler subunits in Drosophila melanogaster S2 cells, we demonstrate that developmental gene transcription requires SWI/SNF-type complexes, primarily to maintain distal enhancer accessibility. In contrast, wild-type-level housekeeping gene transcription requires the Iswi and Ino80 remodelers to maintain nucleosome positioning and phasing at promoters. These differential remodeler dependencies relate to different DNA-sequence-intrinsic nucleosome affinities, which favor a default ON state for housekeeping but a default OFF state for developmental gene transcription. Overall, our results demonstrate how different transcription-regulatory strategies are implemented by DNA sequence, chromatin structure, and remodeler activity.
50.
颜林林
(2022-09-15 22:35):
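#paper doi:10.1002/humu.24455 Human Mutation, 2022, de novo variant calling identifies cancer mutation signatures in the 1000 Genomes Project. This paper develops a GPU-accelerated tool that detects de novo variants from trio (a family of three: both parents and one child) whole-genome sequencing data. The authors used it to reanalyze three large trio cohorts: the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G), whose sample types are peripheral blood, saliva, and cell lines, respectively. The number and characteristics of de novo variants in the cell lines clearly deviated from expectations. Signature analysis of these 1000G de novo variants showed that they resemble B-cell lymphoma, which suggests that most of them are artifacts introduced during cell line preparation (that is, EBV treatment).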
Abstract:
Detection of de novo variants (DNVs) is critical for studies of disease-related variation and mutation rates. To accelerate DNV calling, we developed a graphics processing units-based workflow. We applied our workflow to whole-genome sequencing data from three parent-child sequenced cohorts including the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G) that were sequenced using DNA from blood, saliva, and lymphoblastoid cell lines (LCLs), respectively. The SSC and SPARK DNV callsets were within expectations for number of DNVs, percent at CpG sites, phasing to the paternal chromosome of origin, and average allele balance. However, the 1000G DNV callset was not within expectations and contained excessive DNVs that are likely cell line artifacts. Mutation signature analysis revealed 30% of 1000G DNV signatures matched B-cell lymphoma. Furthermore, we found variants in DNA repair genes and at Clinvar pathogenic or likely-pathogenic sites and significant excess of protein-coding DNVs in IGLL5; a gene known to be involved in B-cell lymphomas. Our study provides a new rapid DNV caller for the field and elucidates important implications of using sequencing data from LCLs for reference building and disease-related projects.
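As a reminder of what a de novo variant call means at the genotype level, the sketch below flags sites where the child is heterozygous and both parents are homozygous reference. This is only the basic Mendelian-inconsistency rule. It is not the authors' GPU workflow, and the genotype encoding and function name are assumptions for this example.

```python
def is_candidate_dnv(child_gt, father_gt, mother_gt):
    """True when the child is heterozygous and both parents are homozygous reference.

    Genotypes are tuples of allele indices such as (0, 1); 0 is the reference allele.
    """
    parents_hom_ref = set(father_gt) == {0} and set(mother_gt) == {0}
    child_het = len(set(child_gt)) == 2 and 0 in child_gt
    return parents_hom_ref and child_het


if __name__ == "__main__":
    trio_sites = [
        ((0, 1), (0, 0), (0, 0)),  # candidate de novo
        ((0, 1), (0, 1), (0, 0)),  # inherited from the father
        ((0, 0), (0, 0), (0, 0)),  # no variant
    ]
    for child, father, mother in trio_sites:
        print(child, father, mother, is_candidate_dnv(child, father, mother))
```

Real callers add depth, genotype-quality, and allele-balance filters on top of this rule. The quality metrics listed in the abstract (number of DNVs, percent at CpG sites, paternal phasing, allele balance) are then used to check whether a callset looks biologically plausible.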
51.
颜林林
(2022-09-14 05:52):
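#paper doi:10.1002/humu.24460 Human Mutation, 2022, CIC missense variants contribute to susceptibility for spina bifida. Previous studies have found that folate intake plays an important role in nervous system development and that folate deficiency can lead to severe congenital malformations such as neural tube defects (NTDs). This paper appears to build on another study: from whole-genome sequencing of 140 enrolled sporadic spina bifida cases, it finds eight rare missense variants in the CIC gene. Sequence conservation across closely related species confirmed that these variants may be functionally important. In cell lines, through plasmid transfection and folate-deficient culture, plasmids carrying wild-type CIC or CIC with the above variants were introduced, and immunofluorescence was used to observe the variants' effects on expression level and subcellular localization. Western blot, qPCR, and other methods were also used to measure the expression of genes regulated by CIC, confirming that the identified CIC variants do affect the relevant pathways. This is a typical study that uses wet-lab methods to validate the function of newly found gene variants.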
Abstract:
Neural tube defects (NTDs) are congenital malformations resulting from abnormal embryonic development of the brain, spine, or spinal column. The genetic etiology of human NTDs remains poorly understood despite intensive investigation. CIC, homolog of the Capicua transcription repressor, has been reported to interact with ataxin-1 (ATXN1) and participate in the pathogenesis of spinocerebellar ataxia type 1. Our previous study demonstrated that CIC loss of function (LoF) variants contributed to the cerebral folate deficiency syndrome by downregulating folate receptor 1 (FOLR1) expression. Given the importance of folate transport in neural tube formation, we hypothesized that CIC variants could contribute to increased risk for NTDs by depressing embryonic folate concentrations. In this study, we examined CIC variants from whole-genome sequencing (WGS) data of 140 isolated spina bifida cases and identified eight missense variants of CIC gene. We tested the pathogenicity of the observed variants through multiple in vitro experiments. We determined that CIC variants decreased the FOLR1 protein level and planar cell polarity (PCP) pathway signaling in a human cell line (HeLa). In a murine cell line (NIH3T3), CIC loss of function variants downregulated PCP signaling. Taken together, this study provides evidence supporting CIC as a risk gene for human NTD.
52.
颜林林
(2022-09-13 07:21):
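#paper doi:10.1016/j.vaccine.2022.08.036 Vaccine, 2022, Serious adverse events of special interest following mRNA COVID-19 vaccination in randomized trials in adults. This paper follows up on the phase III clinical trials of the COVID-19 mRNA vaccines from Pfizer and Moderna, performing a secondary analysis of the reported serious adverse events to determine the increase in risk of each vaccine relative to placebo. The results suggest that a more detailed, formal harm-benefit analysis is needed. The paper closes by again calling for the release of participant-level data, to ensure clinical trial transparency and to allow the various assessments to be carried out correctly.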
Abstract:
INTRODUCTION: In 2020, prior to COVID-19 vaccine rollout, the Brighton Collaboration created a priority list, endorsed by the World Health Organization, of potential adverse events relevant to COVID-19 vaccines. We adapted the Brighton Collaboration list to evaluate serious adverse events of special interest observed in mRNA COVID-19 vaccine trials. METHODS: Secondary analysis of serious adverse events reported in the placebo-controlled, phase III randomized clinical trials of Pfizer and Moderna mRNA COVID-19 vaccines in adults (NCT04368728 and NCT04470427), focusing analysis on Brighton Collaboration adverse events of special interest. RESULTS: Pfizer and Moderna mRNA COVID-19 vaccines were associated with an excess risk of serious adverse events of special interest of 10.1 and 15.1 per 10,000 vaccinated over placebo baselines of 17.6 and 42.2 (95 % CI -0.4 to 20.6 and -3.6 to 33.8), respectively. Combined, the mRNA vaccines were associated with an excess risk of serious adverse events of special interest of 12.5 per 10,000 vaccinated (95 % CI 2.1 to 22.9); risk ratio 1.43 (95 % CI 1.07 to 1.92). The Pfizer trial exhibited a 36 % higher risk of serious adverse events in the vaccine group; risk difference 18.0 per 10,000 vaccinated (95 % CI 1.2 to 34.9); risk ratio 1.36 (95 % CI 1.02 to 1.83). The Moderna trial exhibited a 6 % higher risk of serious adverse events in the vaccine group: risk difference 7.1 per 10,000 (95 % CI -23.2 to 37.4); risk ratio 1.06 (95 % CI 0.84 to 1.33). Combined, there was a 16 % higher risk of serious adverse events in mRNA vaccine recipients: risk difference 13.2 (95 % CI -3.2 to 29.6); risk ratio 1.16 (95 % CI 0.97 to 1.39). DISCUSSION: The excess risk of serious adverse events found in our study points to the need for formal harm-benefit analyses, particularly those that are stratified according to risk of serious COVID-19 outcomes. These analyses will require public release of participant level datasets.
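The abstract reports both risk differences and risk ratios. As a quick reminder of how the two measures relate, the sketch below derives them from event rates expressed per 10,000 participants, using the per-trial placebo baselines and excess risks quoted above as inputs. The function names are illustrative, and the confidence intervals in the abstract require the underlying counts, which are not reproduced here.

```python
def risk_difference(rate_treated, rate_control):
    """Absolute excess risk, in the same units as the inputs (here per 10,000)."""
    return rate_treated - rate_control


def risk_ratio(rate_treated, rate_control):
    """Relative risk: how many times higher the treated rate is than the control rate."""
    if rate_control == 0:
        raise ValueError("control rate must be non-zero")
    return rate_treated / rate_control


if __name__ == "__main__":
    # Placebo baseline and excess risk per 10,000, as quoted in the abstract
    # for serious adverse events of special interest.
    for name, baseline, excess in [("Pfizer", 17.6, 10.1), ("Moderna", 42.2, 15.1)]:
        treated = baseline + excess
        print(name, risk_difference(treated, baseline), round(risk_ratio(treated, baseline), 2))
```

The same absolute excess gives a larger ratio when the baseline is smaller, which is why the two trials can rank differently on the two measures.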
53.
颜林林
(2022-09-11 23:59):
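#paper doi:10.1101/2022.09.09.453067 bioRxiv, 2022, HexSE: Simulating evolution in overlapping reading frames. Overlapping genes are an interesting phenomenon first found in viruses (plasmids): the same stretch of nucleic acid sequence yields different proteins because translation starts at different positions (that is, in different reading frames). Research so far has found this phenomenon in many species. This paper analyzes rates of sequence evolution to look for such overlapping genes in the large body of genome data that has already been sequenced. The basic assumption is that, where genes overlap, the selective pressure on the sequence differs, which shows up as a different evolutionary rate. This is a very interesting idea and research topic.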
bioRxiv, 2022. DOI: 10.1101/2022.09.09.453067
Abstract:
Motivation: Gene overlap occurs when two or more genes are encoded by the same nucleotides. This phenomenon is found in all taxonomic domains, but is particularly common in viruses, where it may provide a mechanism to increase the information content of compact genomes. The presence of overlapping reading frames (OvRFs) can skew estimates of selection based on the rates of non-synonymous and synonymous substitutions, since a substitution that is synonymous in one reading frame may be non-synonymous in another, and vice versa. Results: To understand the impact of OvRFs on molecular evolution, we implemented a versatile simulation model of nucleotide sequence evolution along a phylogeny with an arbitrary distribution of reading frames. We use a custom data structure to track the substitution rates at every nucleotide site, which is determined by the stationary nucleotide frequencies, transition bias, and the distribution of selection biases (dN/dS) in the respective reading frames. Availability and implementation: Our simulation model is implemented in the Python scripting language. All source code is released under the GNU General Public License (GPL) version 3, and is available at https://github.com/PoonLab/HexSE.
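The abstract's key point is that a substitution can be synonymous in one reading frame and non-synonymous in another. The sketch below demonstrates this on a nine-base toy sequence using the standard genetic code. It is a standalone illustration with made-up function names, not code from HexSE.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame):
    """Translate seq starting at offset `frame`, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[seq[p:p + 3]] for p in range(frame, len(seq) - 2, 3))


def substitute(seq, pos, base):
    """Return seq with the base at 0-based position pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


def effect_per_frame(seq, pos, base, frames=(0, 1)):
    """Classify one substitution as synonymous or not in each reading frame."""
    mutated = substitute(seq, pos, base)
    return {
        f: "synonymous" if translate(seq, f) == translate(mutated, f) else "non-synonymous"
        for f in frames
    }


if __name__ == "__main__":
    # T -> C at position 5: GCT -> GCC in frame 0, CTA -> CCA in frame 1.
    print(effect_per_frame("ATGGCTAAA", 5, "C"))
```

In frame 0 the change leaves alanine unchanged, while in frame 1 it turns leucine into proline, so a single dN/dS estimate for the site would mix two different selective regimes.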
54.
颜林林
(2022-08-30 23:47):
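#paper doi:10.1016/j.gpb.2022.08.003 Genomics, Proteomics & Bioinformatics, 2022, Dynamic Spatial-temporal Expression Ratio of X Chromosome to Autosomes but Stable Dosage Compensation in Mammals. In mammals the sex chromosomes are XX in females and XY in males, so the number of X chromosomes differs twofold between the sexes. Yet genes on the X chromosome do not show large expression differences between the sexes as a result, a phenomenon called dosage compensation. By collecting and analyzing multi-omics data, including the transcriptome (RNA-seq), translatome (Ribo-seq), and proteome (mass spectrometry), across different species (human, platypus, opossum; with chicken as the outgroup) and different developmental stages (in mouse models), the paper studies this phenomenon in depth. It computes the ratio of X-linked gene expression to autosomal gene expression and to the expression of orthologous genes in order to quantify dosage compensation, and finds that the expression ratio shows spatial-temporal dynamics across tissues and developmental stages and is related to evolution. A very interesting bioinformatics paper.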
Genomics, Proteomics & Bioinformatics, 2023-06. DOI: 10.1016/j.gpb.2022.08.003. PMID: 36031057. PMCID: PMC10787176
Abstract:
In the evolutionary model of dosage compensation, per-allele expression level of the X chromosome has been proposed to have twofold up-regulation to compensate its dose reduction in males (XY) compared to females (XX). However, the expression regulation of X-linked genes is still controversial, and comprehensive evaluations are still lacking. By integrating multi-omics datasets in mammals, we investigated the expression ratios including X to autosomes (X:AA ratio) and X to orthologs (X:XX ratio) at the transcriptome, translatome, and proteome levels. We revealed a dynamic spatial-temporal X:AA ratio during development in humans and mice. Meanwhile, by tracing the evolution of orthologous gene expression in chickens, platypuses, and opossums, we found a stable expression ratio of X-linked genes in humans to their autosomal orthologs in other species (X:XX ≈ 1) across tissues and developmental stages, demonstrating stable dosage compensation in mammals. We also found that different epigenetic regulations contributed to the high tissue specificity and stage specificity of X-linked gene expression, thus affecting X:AA ratios. It could be concluded that the dynamics of X:AA ratios were attributed to the different gene contents and expression preferences of the X chromosome, rather than the stable dosage compensation.
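To give a feel for what an X:AA ratio measures, the sketch below compares the median expression of X-linked genes with the median expression of autosomal genes in one sample. The abstract does not spell out the paper's exact estimator or filtering, so the use of medians, the expression threshold, and the function name are all assumptions for illustration.

```python
from statistics import median


def x_to_autosome_ratio(x_expr, autosome_expr, min_expr=0.0):
    """Median X-linked expression divided by median autosomal expression.

    Genes at or below min_expr are dropped first (an assumed filter for
    unexpressed genes, which would otherwise drag the medians down).
    """
    xs = [v for v in x_expr if v > min_expr]
    autos = [v for v in autosome_expr if v > min_expr]
    if not xs or not autos:
        raise ValueError("need at least one expressed gene in each group")
    return median(xs) / median(autos)


if __name__ == "__main__":
    # Toy expression values; a ratio near 1 suggests compensation,
    # a ratio near 0.5 suggests none.
    print(x_to_autosome_ratio([2.0, 4.0, 6.0], [4.0, 8.0, 12.0]))
    print(x_to_autosome_ratio([5.0, 9.0, 0.0], [9.0, 5.0, 7.0, 11.0]))
```

The abstract's conclusion is that this kind of ratio moves with tissue and stage because of which genes sit on the X and where they are expressed, so it should be read alongside the X:XX comparison against orthologs rather than on its own.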
55.
颜林林
(2022-08-26 23:18):
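#paper doi:10.1101/2022.08.24.505159 bioRxiv, 2022, A genome-wide atlas of recurrent repeat expansions in human cancer. This one comes from Michael Snyder's group at Stanford University. By reanalyzing 2,622 cancer whole-genome sequencing datasets from ICGC and TCGA, covering 29 cancer types, the authors identify 160 recurrent repeat expansion (rRE) events, the vast majority of which are associated with specific cancer subtypes. The genomic regions containing these repeats are also enriched near the regulatory elements of certain genes, suggesting a possible role in gene regulation. One of them, a GAAA repeat in an intron of UGT2B7, is observed in 34% of renal cell carcinoma samples. To follow up, 12 kidney cancer cases were enrolled through the Stanford Cancer Center, and their samples were sequenced with second-generation (Illumina NovaSeq) and third-generation (PacBio) sequencing, which validated the rRE event.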
bioRxiv, 2022. DOI: 10.1101/2022.08.24.505159
Abstract:
Expansion of a single repetitive DNA sequence, termed a tandem repeat (TR), is known to cause more than 50 diseases. However, repeat expansions are often not explored beyond neurological and neurodegenerative disorders. In some cancers, mutations accumulate in short tracts of TRs (STRs), a phenomenon termed microsatellite instability (MSI); however larger repeat expansions have not been systematically analyzed in cancer. Here, we identified TR expansions in 2,622 cancer genomes, spanning 29 cancer types. In 7 cancer types, we found 160 recurrent repeat expansions (rREs); most of these (155/160) were subtype specific. We found that rREs were non-uniformly distributed in the genome with an enrichment near candidate cis-regulatory elements, suggesting a role in gene regulation. One rRE located near a regulatory element in the first intron of UGT2B7 was detected in 34% of renal cell carcinoma samples and was validated by long-read DNA sequencing. Moreover, targeting cells harboring this rRE with a rationally designed, sequence-specific DNA binder led to a dose-dependent decrease in cell proliferation. Overall, our results demonstrate that rREs are an important but unexplored source of genetic variation in human cancers, and we provide a comprehensive catalog for further study.
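A tandem repeat expansion is, at its simplest, an increase in the number of back-to-back copies of a motif. The sketch below counts the longest uninterrupted run of a motif such as GAAA in a sequence. It only illustrates the quantity being compared between tumor and normal. The paper's actual detection on whole-genome reads is far more involved, and the function name is made up for this example.

```python
def longest_tandem_run(seq, motif):
    """Largest number of back-to-back copies of motif found anywhere in seq."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    step = len(motif)
    for start in range(len(seq) - step + 1):
        copies = 0
        pos = start
        # Extend the run while the motif keeps matching at the next offset.
        while seq.startswith(motif, pos):
            copies += 1
            pos += step
        best = max(best, copies)
    return best


if __name__ == "__main__":
    normal = "TT" + "GAAA" * 3 + "CC"
    expanded = "TT" + "GAAA" * 12 + "CC"
    print(longest_tandem_run(normal, "GAAA"), longest_tandem_run(expanded, "GAAA"))
```

Short reads cannot span long expansions end to end, which is why the abstract highlights validation of the UGT2B7 repeat by long-read sequencing.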
56.
颜林林
(2022-08-18 00:34):
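#paper doi:10.1186/s12859-022-04876-8 BMC Bioinformatics, 2022, IMSE: interaction information attention and molecular structure based drug drug interaction extraction. Having machines automatically read large volumes of papers and distill useful information from them is a dream for many, and models such as BERT are gradually making it a reality. This paper builds on BioBERT pre-training over PubMed abstracts and PMC full text to improve performance on the DDIExtraction 2013 task, which aims to extract drug-drug interactions (DDIs) from free text in the biomedical domain. Much work has already been done on this task. This paper introduces an interaction attention vector and adds drug molecular structure (to exploit its feature-space information), among other changes, to improve model performance and interpretability, with good results.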
Abstract:
BACKGROUND: Extraction of drug drug interactions from biomedical literature and other textual data is an important component to monitor drug-safety and this has attracted attention of many researchers in healthcare. Existing works are more pivoted around relation extraction using bidirectional long short-term memory networks (BiLSTM) and BERT model which does not attain the best feature representations. RESULTS: Our proposed DDI (drug drug interaction) prediction model provides multiple advantages: (1) The newly proposed attention vector is added to better deal with the problem of overlapping relations, (2) The molecular structure information of drugs is integrated into the model to better express the functional group structure of drugs, (3) We also added text features that combined the T-distribution and chi-square distribution to make the model more focused on drug entities and (4) it achieves similar or better prediction performance (F-scores up to 85.16%) compared to state-of-the-art DDI models when tested on benchmark datasets. CONCLUSIONS: Our model that leverages state of the art transformer architecture in conjunction with multiple features can bolster the performances of drug drug interaction tasks in the biomedical domain. In particular, we believe our research would be helpful in identification of potential adverse drug reactions.
57.
颜林林
(2022-08-17 23:55):
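#paper doi:10.1016/j.xgen.2022.100168 Cell Genomics, 2022, Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. The development, falling cost, and spread of high-throughput sequencing have driven a wave of human population genomics studies. This paper is another such release of whole-exome data and analysis results for a large cohort, drawn from the UK Biobank with more than 390,000 participants. The authors developed and applied a mixed-model framework, SAIGE-GENE, which considers single-variant effects, gene-level mutational burden, and the combination of the two, to analyze rare variants associated with 4,529 diseases or phenotypes (including type 2 diabetes and cardiometabolic traits). On top of this, the paper provides an online browser, Genebass, for exploring these phenotype-associated rare variants. As an example, the results section specifically highlights one gene, SCRIB, and its relationship with MRI brain-imaging features. Large-scale population genomics papers like this keep appearing, each with a different analytical emphasis. If possible, it would be worth studying how their methods differ and whether those differences could affect the reported results.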
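The idea of a gene-level mutational burden can be illustrated with a toy sketch. This is not SAIGE-GENE, which is a mixed-model framework; it only shows the collapsing step in which rare variants within one gene are summed into a single score per individual. The function name `gene_burden`, the gene labels, the genotype values, and the 1% allele-frequency cutoff are all assumptions made for illustration.

```python
from typing import Dict, List


def gene_burden(
    genotypes: List[List[int]],
    variant_gene: List[str],
    allele_freq: List[float],
    max_af: float = 0.01,
) -> List[Dict[str, int]]:
    """Collapse rare variants into one burden score per gene per individual.

    genotypes[i][j] is the alternate-allele count (0, 1 or 2) of individual i
    at variant j; variant_gene[j] and allele_freq[j] annotate variant j.
    Variants with allele frequency >= max_af are treated as common and skipped.
    """
    burdens = []
    for person in genotypes:
        score: Dict[str, int] = {}
        for count, gene, af in zip(person, variant_gene, allele_freq):
            if af < max_af:
                # Sum rare alternate alleles that fall in the same gene.
                score[gene] = score.get(gene, 0) + count
        burdens.append(score)
    return burdens


# Hypothetical toy data: two individuals, four variants in two genes.
toy_genotypes = [[1, 0, 2, 0], [0, 0, 0, 1]]
toy_genes = ["GENE_A", "GENE_A", "GENE_A", "GENE_B"]
toy_freqs = [0.001, 0.2, 0.005, 0.0001]
toy_burden = gene_burden(toy_genotypes, toy_genes, toy_freqs)
```

In a real analysis the per-gene scores would then be tested for association with each phenotype, with covariates and relatedness handled by the statistical model; that part is deliberately left out here.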
Abstract:
Genome-wide association studies have successfully discovered thousands of common variants associated with human diseases and traits, but the landscape of rare variations in human disease has not been explored at scale. Exome-sequencing studies of population biobanks provide an opportunity to systematically evaluate the impact of rare coding variations across a wide range of phenotypes to discover genes and allelic series relevant to human health and disease. Here, we present results from systematic association analyses of 4,529 phenotypes using single-variant and gene tests of 394,841 individuals in the UK Biobank with exome-sequence data. We find that the discovery of genetic associations is tightly linked to frequency and is correlated with metrics of deleteriousness and natural selection. We highlight biological findings elucidated by these data and release the dataset as a public resource alongside the Genebass browser for rapidly exploring rare-variant association results.
58.
颜林林
(2022-08-13 23:36):
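#paper doi:10.1038/s41586-022-04774-2 Nature, 2022, Stromal changes in the aged lung induce an emergence from melanoma dormancy. Age is well known to be one of the most important risk factors for cancer. In this study, cultured melanoma cells (some cell lines overexpressing WNT-pathway genes through a plasmid system) were injected into young and aged mice to observe tumor formation and phenotypic changes, interleaved with interventions such as intraperitoneal injection. Lung tissue was then sampled and examined by immunohistochemistry, proteomics (mass spectrometry), and other assays to reveal the relationship between aging and tumorigenesis. The study found that the aged lung microenvironment provides a permissive niche in which dormant disseminated melanoma cells reactivate and grow out efficiently, in contrast to aged skin, where age-related changes suppress melanoma growth but drive dissemination. The paper also examines in detail the role of the WNT pathway in this process: WNT5A activates dormancy, which initially enables melanoma cells to disseminate and seed efficiently in metastatic niches. It further identifies the tyrosine kinase receptors AXL and MER as promoting a dormancy-to-reactivation axis. These results provide important information for follow-up work on the relationship between tumor dormancy and the lung microenvironment, and they suggest that the influence of age deserves attention during cancer treatment.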
Abstract:
Disseminated cancer cells from primary tumours can seed in distal tissues, but may take several years to form overt metastases, a phenomenon that is termed tumour dormancy. Despite its importance in metastasis and residual disease, few studies have been able to successfully characterize dormancy within melanoma. Here we show that the aged lung microenvironment facilitates a permissive niche for efficient outgrowth of dormant disseminated cancer cells, in contrast to the aged skin, in which age-related changes suppress melanoma growth but drive dissemination. These microenvironmental complexities can be explained by the phenotype switching model, which argues that melanoma cells switch between a proliferative cell state and a slower-cycling, invasive state. It was previously shown that dermal fibroblasts promote phenotype switching in melanoma during ageing. We now identify WNT5A as an activator of dormancy in melanoma disseminated cancer cells within the lung, which initially enables the efficient dissemination and seeding of melanoma cells in metastatic niches. Age-induced reprogramming of lung fibroblasts increases their secretion of the soluble WNT antagonist sFRP1, which inhibits WNT5A in melanoma cells and thereby enables efficient metastatic outgrowth. We also identify the tyrosine kinase receptors AXL and MER as promoting a dormancy-to-reactivation axis within melanoma cells. Overall, we find that age-induced changes in distal metastatic microenvironments promote the efficient reactivation of dormant melanoma cells in the lung.
59.
颜林林
(2022-08-12 07:42):
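#paper doi:10.1016/j.ccell.2022.07.002 Cancer Cell, 2022, Integrative analysis of drug response and clinical outcome in acute myeloid leukemia. This is a decade-long real-world clinical study of AML (acute myeloid leukemia) that collected 805 patients (942 specimens) from multiple centers, sequenced their genomes and transcriptomes, ran ex vivo cell-culture drug-response assays, and used NLP techniques to curate and analyze patient medical records. On the analysis side, a deconvolution approach infers each sample's cell-type composition from the transcriptomic data. Combined with clinical information and the omics results, this identifies factors that affect drug response (such as age, gene expression, and cell differentiation state). The resulting model reveals a single gene, PEAR1, to be among the strongest predictors of patient survival. The dataset is also presented through an online interactive website. The analyses are mostly the standard repertoire of bioinformatics data-mining papers, with nothing especially novel, but thanks to a cohort accumulated over many years with complete clinical information, the work is still valuable as an important data resource and as an example of a single-disease real-world study. In addition, the drug-response cell experiments are relatively self-contained and not easy to relate to patient prognosis; they were probably included to add weight to the paper.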
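The deconvolution step can be illustrated with a toy sketch. The summary above does not say which deconvolution method the paper uses, so the following assumes the generic reference-based formulation: bulk expression is modeled as a non-negative mixture of per-cell-type reference profiles, and the proportions are estimated by non-negative least squares. The function name `deconvolve`, the signature matrix, and the mixture values are hypothetical.

```python
from typing import List


def deconvolve(
    signature: List[List[float]],
    mixture: List[float],
    n_iter: int = 2000,
) -> List[float]:
    """Estimate cell-type proportions by non-negative least squares.

    signature[g][k] is the reference expression of gene g in cell type k;
    mixture[g] is the bulk expression of gene g. The problem
    min ||S p - m||^2 subject to p >= 0 is solved by projected gradient
    descent, and the result is normalized to sum to 1.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # 1 / ||S||_F^2 is a conservative step size: the squared Frobenius norm
    # is an upper bound on the largest eigenvalue of S^T S.
    step = 1.0 / sum(v * v for row in signature for v in row)
    props = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual r = S p - m.
        resid = [
            sum(s * p for s, p in zip(row, props)) - m
            for row, m in zip(signature, mixture)
        ]
        # Gradient of 0.5 * ||S p - m||^2 is S^T r.
        grad = [
            sum(signature[g][k] * resid[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step, then projection onto the constraint p >= 0.
        props = [max(0.0, p - step * d) for p, d in zip(props, grad)]
    total = sum(props)
    return [p / total for p in props] if total > 0 else props


# Hypothetical reference profiles: 4 genes x 3 cell types.
toy_signature = [
    [10.0, 0.0, 1.0],
    [0.0, 8.0, 1.0],
    [1.0, 1.0, 9.0],
    [4.0, 4.0, 4.0],
]
true_props = [0.5, 0.3, 0.2]
# Simulate a noise-free bulk sample from the known proportions.
toy_mixture = [sum(s * p for s, p in zip(row, true_props)) for row in toy_signature]
estimated = deconvolve(toy_signature, toy_mixture)
```

On this noise-free toy input the estimate should match `true_props` closely. Real data would need a curated signature matrix and some handling of noise and normalization, none of which is shown here.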
Abstract:
Acute myeloid leukemia (AML) is a cancer of myeloid-lineage cells with limited therapeutic options. We previously combined ex vivo drug sensitivity with genomic, transcriptomic, and clinical annotations for a large cohort of AML patients, which facilitated discovery of functional genomic correlates. Here, we present a dataset that has been harmonized with our initial report to yield a cumulative cohort of 805 patients (942 specimens). We show strong cross-cohort concordance and identify features of drug response. Further, deconvoluting transcriptomic data shows that drug sensitivity is governed broadly by AML cell differentiation state, sometimes conditionally affecting other correlates of response. Finally, modeling of clinical outcome reveals a single gene, PEAR1, to be among the strongest predictors of patient survival, especially for young patients. Collectively, this report expands a large functional genomic resource, offers avenues for mechanistic exploration and drug development, and reveals tools for predicting outcome in AML.
60.
颜林林
(2022-08-08 07:54):
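#paper doi:10.1038/s41596-022-00728-0 Nature Protocols, 2022, I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. At present, most protein structure prediction tools can handle only single-domain proteins. However, proteins found widely in nature more often have multiple domains that work cooperatively, so algorithms and tools for predicting the structure and function of such proteins are urgently needed. This paper provides a pipeline, I-TASSER-MTD, for multi-domain protein structure and function prediction. It integrates four steps: sequence-based domain parsing, single-domain structure folding, inter-domain structure assembly, and structure-based function annotation. Deep learning is incorporated into every step, and experimental data such as protein cross-linking and cryo-electron microscopy can also be integrated, improving accuracy at each stage and thus the overall quality of the structure and function predictions. The whole is packaged as a fully automated analysis pipeline.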
Abstract:
Most proteins in cells are composed of multiple folding units (or domains) to perform complex functions in a cooperative manner. Relative to the rapid progress in single-domain structure prediction, there are few effective tools available for multi-domain protein structure assembly, mainly due to the complexity of modeling multi-domain proteins, which involves higher degrees of freedom in domain-orientation space and various levels of continuous and discontinuous domain assembly and linker refinement. To meet the challenge and the high demand of the community, we developed I-TASSER-MTD to model the structures and functions of multi-domain proteins through a progressive protocol that combines sequence-based domain parsing, single-domain structure folding, inter-domain structure assembly and structure-based function annotation in a fully automated pipeline. Advanced deep-learning models have been incorporated into each of the steps to enhance both the domain modeling and inter-domain assembly accuracy. The protocol allows for the incorporation of experimental cross-linking data and cryo-electron microscopy density maps to guide the multi-domain structure assembly simulations. I-TASSER-MTD is built on I-TASSER but substantially extends its ability and accuracy in modeling large multi-domain protein structures and provides meaningful functional insights for the targets at both the domain- and full-chain levels from the amino acid sequence alone.