1276 literature shares found in total; this page shows entries 1021 - 1040.
1021.
四叶草
(2022-07-31 10:05):
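#paper DOI: 10.1038/s41586-020-2352-3 Hair-bearing human skin generated entirely from pluripotent stem cells
A 2020 Nature paper on differentiating human pluripotent stem cells into skin organoids, accompanied by the formation of skin appendage structures. By controlling the TGFβ, BMP, and FGF pathways, the authors drive stem-cell-derived embryoid bodies (EBs) through a non-neural ectoderm stage and stepwise into skin, then induce the skin to self-organize into a multilayered structure. After 3 months of in vitro culture, hair follicle growth is clearly visible. Transplantation into nude mice further confirmed that the organoids can stratify in vivo and form sebaceous glands and hair follicles containing sensory receptor cells. The system offers a model for skin development and a potential donor source for skin grafts.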
Abstract:
The skin is a multilayered organ, equipped with appendages (that is, follicles and glands), that is critical for regulating body temperature and the retention of bodily fluids, guarding against external stresses and mediating the sensation of touch and pain. Reconstructing appendage-bearing skin in cultures and in bioengineered grafts is a biomedical challenge that has yet to be met. Here we report an organoid culture system that generates complex skin from human pluripotent stem cells. We use stepwise modulation of the transforming growth factor β (TGFβ) and fibroblast growth factor (FGF) signalling pathways to co-induce cranial epithelial cells and neural crest cells within a spherical cell aggregate. During an incubation period of 4-5 months, we observe the emergence of a cyst-like skin organoid composed of stratified epidermis, fat-rich dermis and pigmented hair follicles that are equipped with sebaceous glands. A network of sensory neurons and Schwann cells form nerve-like bundles that target Merkel cells in organoid hair follicles, mimicking the neural circuitry associated with human touch. Single-cell RNA sequencing and direct comparison to fetal specimens suggest that the skin organoids are equivalent to the facial skin of human fetuses in the second trimester of development. Moreover, we show that skin organoids form planar hair-bearing skin when grafted onto nude mice. Together, our results demonstrate that nearly complete skin can self-assemble in vitro and be used to reconstitute skin in vivo. We anticipate that our skin organoids will provide a foundation for future studies of human skin development, disease modelling and reconstructive surgery.
1022.
小W
(2022-07-31 09:59):
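#paper doi: https://doi.org/10.1016/j.cell.2022.05.013
Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq
Perturb-seq is an experimental method that maps the transcriptional effects of genetic perturbations by combining CRISPR-based genetic screens with single-cell RNA-sequencing phenotypes. This paper uses CRISPRi to target all expressed genes in chronic myeloid leukemia cells (K562) and all DepMap (Cancer Dependency Map) essential genes in retinal pigment epithelial cells (RPE1), and relies on the intrinsic interpretability of the CRISPRi gene-to-RNA phenotypes to link each gene to its role in the cell. It illustrates applications of genome-scale Perturb-seq screens in four directions: 1. predicting which features of genetic perturbations cause transcriptional phenotypes; 2. annotating gene function from transcriptional phenotypes; 3. hypothesis-driven study of composite phenotypes; 4. stress-specific regulation of the mitochondrial genome. The work is a gene-by-gene genetic perturbation analysis using Perturb-seq. Its sequencing data, expression (and differential-analysis) data, and sgRNA library have been released (though I could not find the sgRNA library). The main experimental methods and analysis scripts follow another paper, "Scalable single-cell CRISPR screens by direct guide RNA capture and targeted library enrichment", Nature Biotechnology 2020.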
Abstract:
A central goal of genetics is to define the relationships between genotypes and phenotypes. High-content phenotypic screens such as Perturb-seq (CRISPR-based screens with single-cell RNA-sequencing readouts) enable massively parallel functional genomic mapping but, to date, have been used at limited scales. Here, we perform genome-scale Perturb-seq targeting all expressed genes with CRISPR interference (CRISPRi) across >2.5 million human cells. We use transcriptional phenotypes to predict the function of poorly characterized genes, uncovering new regulators of ribosome biogenesis (including CCDC86, ZNF236, and SPATA5L1), transcription (C7orf26), and mitochondrial respiration (TMEM242). In addition to assigning gene function, single-cell transcriptional phenotypes allow for in-depth dissection of complex cellular phenomena, from RNA processing to differentiation. We leverage this ability to systematically identify genetic drivers and consequences of aneuploidy and to discover an unanticipated layer of stress-specific regulation of the mitochondrial genome. Our information-rich genotype-phenotype map reveals a multidimensional portrait of gene and cellular function.
1023.
笑对人生
(2022-07-31 09:15):
#paper Phasing analysis of lung cancer genomes using a long read sequencer. Nat Commun. 2022 Jun 16;13(1):3464. doi: 10.1038/s41467-022-31133-6
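Background: An SNV (single nucleotide variant) is a site in the genome where a single base has changed. An SNP (single nucleotide polymorphism) is a DNA sequence polymorphism caused by variation at a single nucleotide. An SNV describes a base change in an individual's genome, whereas an SNP is more of a population-level property. A more accessible definition in English: A haplotype is a physical grouping of genomic variants (or polymorphisms) that tend to be inherited together. A specific haplotype typically reflects a unique combination of variants that reside near each other on a chromosome. In other words, a haplotype is a combination of associated allelic variant sites located in a region of one chromosome, and each combination represents one haplotype. The process of typing haplotypes, that is, determining whether variants come from the same chromosome, is called phasing (also known as haplotype estimation). The typing or variation mentioned here is usually the result of a comparison; in population genetics the comparison may be between one individual and others in the population, or between offspring and parents, and it describes the outcome of evolution or variation (my own understanding). An allele (also called allelomorph) generally refers to one of a pair of genes located at the same position on a pair of homologous chromosomes (one paternal, one maternal) that control different states of the same trait. English definitions: Any one of a series of two or more different genes that may occupy the same locus on a specific chromosome; An allele is a variant form of a gene. In current second- and third-generation sequencing, each read comes from a single chromosome, but a read by itself does not reveal whether it is of paternal or maternal origin. Compared with second-generation sequencing, however, third-generation sequencing can use its long reads to span most neighboring SNP sites and so achieve more accurate haplotype phasing. In addition, third-generation sequencing can detect copy number variants (CNVs) accurately and, while phasing the sequence, also carries base-modification information such as methylation.
Aim: Previous detection of SNVs, indels, and CNVs in tumors has mostly relied on second-generation sequencing. Because of its short reads, however, second-generation sequencing cannot accurately resolve high-GC regions, repetitive regions, or large chromosomal variants. Exploiting the ultra-long reads of third-generation sequencing should therefore help reveal the variation events in tumors more comprehensively. The recently announced ONT Q20+ chemistry achieves >99% raw (single-strand) read accuracy, or roughly Q30 duplex accuracy. The aim of this study is to use ONT nanopore technology for phasing analysis at the tissue and cell level, together with analyses of copy number variation and chromothripsis, in non-small cell lung cancer.
Methods: Normal tissue from 20 non-small cell lung cancer patients underwent both second- and third-generation whole-genome sequencing, while tumor tissue underwent third-generation sequencing only. In addition, methylation signals from the sequencing data and ONT-based full-length transcriptome sequencing were used to explore the relationship between variants and phenotype. To further explore the clonal structure of tumor cells, ONT-based scDNA-seq was performed on 2 samples.
Results: By correcting third-generation data with second-generation data, the study obtained phased blocks with an N50 length of 834 kb, with SNV calls showing >99% concordance with public second-generation WGS databases. Combining methylation data and full-length transcriptome sequencing, only two samples showed phased-block variants (haplotype variants) that correlated with methylation and transcriptional regulation. In addition, analysis of large chromosomal variants found chromothripsis-like events characteristic of EGFR mutation-positive lung adenocarcinoma tissue, suggesting that aberrations in the EGFR pathway may affect telomerase activity.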
Abstract:
Chromosomal backgrounds of cancerous mutations still remain elusive. Here, we conduct the phasing analysis of non-small cell lung cancer specimens of 20 Japanese patients. By the combinatory use of short and long read sequencing data, we obtain long phased blocks of 834 kb in N50 length with >99% concordance rate. By analyzing the obtained phasing information, we reveal that several cancer genomes harbor regions in which mutations are unevenly distributed to either of two haplotypes. Large-scale chromosomal rearrangement events, which resemble chromothripsis events but have smaller scales, occur on only one chromosome, and these events account for the observed biased distributions. Interestingly, the events are characteristic of EGFR mutation-positive lung adenocarcinomas. Further integration of long read epigenomic and transcriptomic data reveal that haploid chromosomes are not always at equivalent transcriptomic/epigenomic conditions. Distinct chromosomal backgrounds are responsible for later cancerous aberrations in a haplotype-specific manner.
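The N50 length quoted above (834 kb for the phased blocks) can be made concrete with a short calculation. The sketch below is a generic N50 implementation under the usual definition: the length L such that blocks of length at least L together contain at least half of the total phased length. The function name `n50` and the block lengths are illustrative assumptions, not code or data from the paper.

```python
def n50(lengths):
    """Return the N50 of a list of block lengths."""
    if not lengths:
        raise ValueError("need at least one block length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest block down until half of the total is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical phased-block lengths in kb (not data from the paper).
blocks_kb = [1200, 900, 834, 400, 300, 100]
print(n50(blocks_kb))  # prints 900
```

A higher N50 means that most of the phased sequence sits in long blocks, which is why it is the usual summary statistic for phasing contiguity.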
1024.
孤舟蓑笠翁
(2022-07-31 08:58):
#paper doi:10.1016/j.ccell.2015.09.018 Cancer Cell, 2015, RNA-Seq of Tumor-Educated Platelets Enables Blood-Based Pan-Cancer, Multiclass, and Molecular Pathway Cancer Diagnostics. The study finds that mRNA carried by platelets can predict whether a person has cancer (96% accuracy) and can further predict the tissue of origin (71% accuracy).
IF: 48.800 (Q1)
Cancer cell,
2015-Nov-09.
DOI: 10.1016/j.ccell.2015.09.018
PMID: 26525104
PMCID:PMC4644263
Abstract:
Tumor-educated blood platelets (TEPs) are implicated as central players in the systemic and local responses to tumor growth, thereby altering their RNA profile. We determined the diagnostic potential of TEPs by mRNA sequencing of 283 platelet samples. We distinguished 228 patients with localized and metastasized tumors from 55 healthy individuals with 96% accuracy. Across six different tumor types, the location of the primary tumor was correctly identified with 71% accuracy. Also, MET or HER2-positive, and mutant KRAS, EGFR, or PIK3CA tumors were accurately distinguished using surrogate TEP mRNA profiles. Our results indicate that blood platelets provide a valuable platform for pan-cancer, multiclass cancer, and companion diagnostics, possibly enabling clinical advances in blood-based "liquid biopsies".
1025.
颜林林
(2022-07-31 07:26):
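#paper doi:10.1016/j.ccell.2022.07.003 Cancer Cell, 2022, Dark genome, bright ideas: Recent approaches to harness transposable elements in immunotherapies. Transposable elements (TEs), which make up nearly half of the human genome, deserve continued in-depth study. This commentary gives a quick overview of the relationship between TEs and immunity: TEs are immunogenic, can activate DNA or RNA sensors, and can trigger immune responses, which may lead to new immunotherapies. It describes in turn how TE expression affects anti-tumor immunity, and how tumors might be treated by modulating TE expression, modulating TE immunogenicity, or assisting CAR-T cells. A personal thought: studying repetitive sequences at the DNA level has always been quite difficult, which is why these regions are often called the "dark genome". The difficulty resembles trying to infer a large number of objects floating in the air from their shadows on the ground, where many shadows overlap and cannot be told apart. Fortunately, new technologies such as long reads and multi-omics let us begin to clear the fog layer by layer.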
Abstract:
Transposable elements (TEs), which make up almost half of the human genome, often display altered expression in cancers. Here, we review recent progress in elucidating the role of TEs as mediators of immune responses in cancer and discuss how novel therapeutic strategies can harness TE immunogenicity for cancer immunotherapy.
1026.
尹志
(2022-07-30 22:41):
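#paper https://doi.org/10.48550/arXiv.2205.01529 Masked Generative Distillation, ECCV 2022. This is a knowledge distillation paper that performs distillation by generating features in a way reminiscent of contrastive learning. Knowledge distillation is a general technique that has been applied to all kinds of machine learning tasks, including classification, segmentation, and detection in vision. Distillation algorithms usually improve the representational power of the student's features by having the student imitate the teacher's features. This paper proposes instead that the student need not imitate the teacher's features but can generate them: the student's features are randomly masked, and the remaining part is used to generate the teacher's features. The student's features thereby acquire stronger representational power. The idea is interesting. To give an analogy (perhaps not quite apt): originally the student would learn the teacher's every move, but now the teacher seldom shows up and cannot be imitated directly, so the student, under supervision, guesses blindly what the teacher's features look like. After enough guesses, once every guess is accurate, the student knows the teacher well, which means the student's representational power is strong. Using this approach, the authors ran extensive experiments on image classification, object detection, semantic segmentation, and instance segmentation, across different datasets and models, and found that performance improved in all cases (mostly by 2-3 points; see the paper for exact numbers).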
arXiv,
2022.
DOI: 10.48550/arXiv.2205.01529
Abstract:
Knowledge distillation has been applied to various tasks successfully. The current distillation algorithm usually improves students' performance by imitating the output of the teacher. This paper shows that teachers can also improve students' representation power by guiding students' feature recovery. From this point of view, we propose Masked Generative Distillation (MGD), which is simple: we mask random pixels of the student's feature and force it to generate the teacher's full feature through a simple block. MGD is a truly general feature-based distillation method, which can be utilized on various tasks, including image classification, object detection, semantic segmentation and instance segmentation. We experiment on different models with extensive datasets and the results show that all the students achieve excellent improvements. Notably, we boost ResNet-18 from 69.90% to 71.69% ImageNet top-1 accuracy, RetinaNet with ResNet-50 backbone from 37.4 to 41.0 Boundingbox mAP, SOLO based on ResNet-50 from 33.1 to 36.2 Mask mAP and DeepLabV3 based on ResNet-18 from 73.20 to 76.02 mIoU. Our codes are available at this https URL.
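The masking-and-generation idea described above can be illustrated with a toy example. This is only a sketch under loud assumptions: features are flat lists of floats rather than feature maps, the "simple block" is replaced by a stand-in element-wise affine map (`generation_block`, hypothetical), and the loss is taken to be a mean squared error against the full teacher feature. None of the names or values come from the authors' code.

```python
import random


def random_mask(feature, mask_ratio, rng):
    """Zero out each element of the student feature with probability mask_ratio."""
    return [0.0 if rng.random() < mask_ratio else x for x in feature]


def generation_block(masked, weight, bias):
    """Stand-in for the paper's 'simple block': an element-wise affine map (assumption)."""
    return [weight * x + bias for x in masked]


def mgd_loss(student, teacher, mask_ratio, weight, bias, rng):
    """Mean squared error between the generated feature and the full teacher feature."""
    masked = random_mask(student, mask_ratio, rng)
    generated = generation_block(masked, weight, bias)
    return sum((g - t) ** 2 for g, t in zip(generated, teacher)) / len(teacher)


# Toy 1-D features (illustrative values only).
rng = random.Random(0)
student = [0.2, 0.5, 0.1, 0.9]
teacher = [0.3, 0.6, 0.2, 1.0]
print(mgd_loss(student, teacher, mask_ratio=0.5, weight=1.0, bias=0.1, rng=rng))
```

In a real setup the generation block would be trainable and the loss would be minimized jointly with the task loss; here the point is only that the student must reconstruct the whole teacher feature from a partially hidden version of its own.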
1027.
哪有情可长
(2022-07-30 21:34):
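#paper doi: 10.1126/science.abl7392 Gametophyte genome activation occurs at pollen mitosis I in maize. The sporophyte undergoes meiosis to produce haploid spores, which then proliferate and differentiate to form the gametophyte. The main function of the gametophyte generation is to form haploid gametes, and the fusion of sperm and egg cells produces a new sporophyte, completing one life cycle. Maternal genes control most early events after fertilization in plants, followed by the maternal-to-zygotic transition, in which degradation of maternal products is coordinated with activation of the zygotic genome. This study sequenced the RNA of single maize pollen precursors and pollen grains over the 26 days from the start of meiosis to pollen shed, and found that about halfway through pollen development the haploid genome of the pollen grain takes over control from the parental diploid genome. This sporophyte-to-gametophyte transition lays the foundation for the growth and development of the next generation.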
Abstract:
Flowering plants alternate between multicellular haploid (gametophyte) and diploid (sporophyte) generations. Pollen actively transcribes its haploid genome, providing phenotypic diversity even among pollen grains from a single plant. In this study, we used allele-specific RNA sequencing of single pollen precursors to follow the shift to haploid expression in maize pollen. We observed widespread biallelic expression for 11 days after meiosis, indicating that transcripts synthesized by the diploid sporophyte persist long into the haploid phase. Subsequently, there was a rapid and global conversion to monoallelic expression at pollen mitosis I, driven by active new transcription from the haploid genome. Genes showed evidence of increased purifying selection if they were expressed after (but not before) pollen mitosis I. This work establishes the timing during which haploid selection may act in pollen.
1028.
张浩彬
(2022-07-30 17:14):
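#paper doi:10.1287/ijoc.2021.1147, Improving Sales Forecasting Accuracy: A Tensor Factorization Approach with Demand Awareness
The paper addresses sales forecasting for multiple products across multiple stores. Borrowing from collaborative filtering, it treats the data as a high-dimensional tensor and factorizes it to better extract relevant information and contextual relationships. The factorized features are then fed into time-series frameworks (SARIMA and LSTM), achieving better results than traditional methods.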
Abstract:
Because of the accessibility of big data collections from consumers, products, and stores, advanced sales forecasting capabilities have drawn great attention from many businesses, especially those in retail, because of the importance of forecasting in decision making. Improvement of forecasting accuracy, even by a small percentage, may have a substantial impact on companies’ production and financial planning, marketing strategies, inventory controls, and supply chain management. Specifically, our research goal is to forecast the sales of each product in each store in the near future. Motivated by tensor factorization methodologies for context-aware recommender systems, we propose a novel approach called the advanced temporal latent factor approach to sales forecasting, or ATLAS for short, which achieves accurate and individualized predictions for sales by building a single tensor factorization model across multiple stores and products. Our contribution is a combination of a tensor framework (to leverage information across stores and products), a new regularization function (to incorporate demand dynamics), and extrapolation of the tensor into future time periods using state-of-the-art statistical (seasonal autoregressive integrated moving-average models) and machine-learning (recurrent neural networks) models. The advantages of ATLAS are demonstrated on eight product category data sets collected by Information Resources, Inc., where we analyze a total of 165 million weekly sales transactions of over 15,560 products from more than 1,500 grocery stores. Summary of Contribution: Sales forecasting has been a task of long-standing importance. Accurate sales forecasting provides critical managerial implications for companies’ decision making and operations. Improvement of forecasting accuracy may have a substantial impact on companies’ production planning, marketing strategies, inventory controls, and supply chain management, among other things. 
This paper proposes a novel computational (machine-learning-based) approach to sales forecasting and thus is positioned directly at the intersection of computing and business/operations research.
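To make the "factorize, then extrapolate in time" idea above more tangible, here is a minimal sketch. It assumes a CP-style factorization in which each store, product, and week has a latent vector, and a sale is approximated by the sum over latent dimensions of their three-way product. The forecaster `extrapolate_naive` is a deliberately trivial placeholder for the SARIMA and recurrent-network models mentioned in the abstract, and all names and numbers are hypothetical rather than taken from ATLAS.

```python
def cp_predict(store_vec, product_vec, time_vec):
    """CP-style reconstruction: sum over latent factors of the three-way product."""
    return sum(s * p * t for s, p, t in zip(store_vec, product_vec, time_vec))


def extrapolate_naive(time_factors):
    """Placeholder forecaster: repeat the last temporal factor vector.

    The paper extrapolates with SARIMA / recurrent networks; this is only a stand-in.
    """
    return list(time_factors[-1])


# Hypothetical rank-2 factors (illustrative values only).
store_factors = {"store_A": [1.0, 0.5]}
product_factors = {"milk": [2.0, 1.0]}
time_factors = [[1.0, 0.0], [1.2, 0.4]]  # latent vectors for weeks 0 and 1

next_week = extrapolate_naive(time_factors)
forecast = cp_predict(store_factors["store_A"], product_factors["milk"], next_week)
print(forecast)
```

Because every store and product shares the same temporal factors, information from well-observed series can inform sparse ones, which is the collaborative-filtering intuition the commentary refers to.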
1029.
颜林林
(2022-07-30 01:17):
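#paper doi:10.15252/msb.202211017 Molecular Systems Biology, 2022, Computational estimation of quality and clinical relevance of cancer cell lines. This is a review of cancer cell lines that examines the quality of publicly available, widely used lines. It first outlines current public cell line resources for different cancer types, including the associated multi-omics data. It then introduces factors that may affect cell line quality, such as cross-contamination, mutations accumulated during passaging, the absence of microenvironmental factors, and heterogeneity at the molecular and cellular-state levels. Next, it reviews the computational methods (including tools) available for assessing these problems. Finally, the discussion looks ahead to future improvements, such as multi-omics integration, transfer learning, the use of single-cell data, and better interpretability. Cell lines are an important system for cancer research, and this paper systematically summarizes both resource selection and the methods for analysis and evaluation.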
Abstract:
Immortal cancer cell lines (CCLs) are the most widely used system for investigating cancer biology and for the preclinical development of oncology therapies. Pharmacogenomic and genome-wide editing screenings have facilitated the discovery of clinically relevant gene-drug interactions and novel therapeutic targets via large panels of extensively characterised CCLs. However, tailoring pharmacological strategies in a precision medicine context requires bridging the existing gaps between tumours and in vitro models. Indeed, intrinsic limitations of CCLs such as misidentification, the absence of tumour microenvironment and genetic drift have highlighted the need to identify the most faithful CCLs for each primary tumour while addressing their heterogeneity, with the development of new models where necessary. Here, we discuss the most significant limitations of CCLs in representing patient features, and we review computational methods aiming at systematically evaluating the suitability of CCLs as tumour proxies and identifying the best patient representative in vitro models. Additionally, we provide an overview of the applications of these methods to more complex models and discuss future machine-learning-based directions that could resolve some of the arising discrepancies.
1030.
洪媛媛
(2022-07-29 14:23):
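#paper https://doi.org/10.1038/s41467-022-31765-8 Nat Commun 13, 4248 (2022). Accurate somatic variant detection using weakly supervised deep learning. Somatic mutation calling in tumors usually relies on statistical methods combined with filtering criteria. This paper uses a deep learning method named "VarNet" to identify somatic mutations from paired tumor and normal DNA data. VarNet is trained on tumor DNA with known mutated and non-mutated labels together with the matched normal DNA sequence: for each site, the base, base quality, mapping quality, strand bias, and reference base are assembled into a multi-dimensional matrix, and the model predicts the probability that a mutation exists at each position. The authors then compare VarNet with 4 other published methods on 4 publicly available benchmark datasets in terms of the precision and recall of mutation calling, showing that VarNet performs better than the 4 existing methods.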
Abstract:
Identification of somatic mutations in tumor samples is commonly based on statistical methods in combination with heuristic filters. Here we develop VarNet, an end-to-end deep learning approach for identification of somatic variants from aligned tumor and matched normal DNA reads. VarNet is trained using image representations of 4.6 million high-confidence somatic variants annotated in 356 tumor whole genomes. We benchmark VarNet across a range of publicly available datasets, demonstrating performance often exceeding current state-of-the-art methods. Overall, our results demonstrate how a scalable deep learning approach could augment and potentially supplant human engineered features and heuristic filters in somatic variant calling.
1031.
沈么是快乐星球
(2022-07-29 08:53):
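#paper doi:10.1038/s41467-020-19681-1 Nature Communications, 2020, Genome-enabled discovery of anthraquinone biosynthesis in Senna tora. Senna tora is a Chinese medicinal herb whose main active compounds are its abundant anthraquinones, found mainly in the seeds. Through whole-genome sequencing and comparative genomics, the paper finds that the CHS gene family has undergone lineage-specific rapid expansion in S. tora and is concentrated on chromosome 7. Metabolite and transcriptome measurements of seeds at different developmental stages yielded 3 candidate genes; one candidate was selected based on expression pattern, phylogenetic relationship, and gene structure, and another, more distantly related CHS family member was chosen as a negative control. Finally, in vitro enzymatic assays validated the result (candidate protein, inactivated candidate protein, and negative control protein were tested; only the candidate protein converted the substrate into the downstream product). The approach is simple and clear. In screening candidates, the authors started from gene clusters whose expression patterns resembled the metabolite accumulation patterns, built a "metabolic library", and analyzed the metabolic pathways that were mainly enriched. Because the enzymatic reactions involve a great deal of metabolism knowledge, I have not yet studied that part in detail.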
Abstract:
Senna tora is a widely used medicinal plant. Its health benefits have been attributed to the large quantity of anthraquinones, but how they are made in plants remains a mystery. To identify the genes responsible for plant anthraquinone biosynthesis, we reveal the genome sequence of S. tora at the chromosome level with 526 Mb (96%) assembled into 13 chromosomes. Comparison among related plant species shows that a chalcone synthase-like (CHS-L) gene family has lineage-specifically and rapidly expanded in S. tora. Combining genomics, transcriptomics, metabolomics, and biochemistry, we identify a CHS-L gene contributing to the biosynthesis of anthraquinones. The S. tora reference genome will accelerate the discovery of biologically active anthraquinone biosynthesis pathways in medicinal plants.
1032.
颜林林
(2022-07-29 08:21):
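#paper doi:10.1093/nar/gkac586 Nucleic Acids Research, 2022, De novo assembly of human genome at single-cell levels. The authors previously developed a technique called SMOOTH-seq. Roughly: Tn5 transposons are inserted into genomic DNA to fragment it randomly; barcoded primers then perform strand displacement and amplification of the fragments; a sequence is ligated to each end of the duplex to circularize it; rolling-circle amplification then yields long fragments suitable for long-read sequencing. Each long fragment carries multiple copies of the original fragment, so the bases can be corrected accurately. This paper improves on that method and uses two platforms, PacBio HiFi and Oxford Nanopore Technologies (ONT), to sequence single cells from the K562 and HG002 cell lines. It completes, for the first time, a highly contiguous human genome assembly at the single-cell level. Results include: 95 K562 cells with a total sequencing depth of about 37x (if I understand correctly, each cell's depth should be 37/95 ≈ 0.4x), NG50 about 2 Mb; and 30 HG002 cells with about 1 Gb of sequencing per cell (equivalent to 0.33x), NG50 about 1.3 Mb. In the words of the abstract, this "opened a new chapter on the practice of single-cell genome de novo assembly". The topic looks highly novel, but on closer inspection it raises some questions. Single-cell genome sequencing can hardly distinguish cells of different populations, so assembly should only be possible cell by cell; mixing many cells from different populations for assembly would defeat the original purpose. Yet the genome coverage of a single cell cannot be very complete (the paper reports an average coverage of about 41.7%, and I suspect more sequencing data would not improve this much), which in turn severely limits the assembly itself, so in the end one can only focus on the structural variants identified. Moreover, single-cell genome results are hard to validate: results from other cells can hardly be used to judge whether the result for the cell at hand is accurate, which is a logical weakness. So the paper's contribution, beyond the single-cell genome sequencing data for the two cell lines, probably lies mainly in the optimization of the experimental protocol and the establishment of the technique; its data analysis workflow is also worth consulting.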
IF: 16.600 (Q1)
Nucleic acids research,
2022-07-22.
DOI: 10.1093/nar/gkac586
PMID: 35819189
PMCID:PMC9303314
Abstract:
Genome assembly has been benefited from long-read sequencing technologies with higher accuracy and higher continuity. However, most human genome assembly require large amount of DNAs from homogeneous cell lines without keeping cell heterogeneities, since cell heterogeneity could profoundly affect haplotype assembly results. Herein, using single-cell genome long-read sequencing technology (SMOOTH-seq), we have sequenced K562 and HG002 cells on PacBio HiFi and Oxford Nanopore Technologies (ONT) platforms and conducted de novo genome assembly. For the first time, we have completed the human genome assembly with high continuity (with NG50 of ∼2 Mb using 95 individual K562 cells) at single-cell levels, and explored the impact of different assemblers and sequencing strategies on genome assembly. With sequencing data from 30 diploid individual HG002 cells of relatively high genome coverage (average coverage ∼41.7%) on ONT platform, the NG50 can reach over 1.3 Mb. Furthermore, with the assembled genome from K562 single-cell dataset, more complete and accurate set of insertion events and complex structural variations could be identified. This study opened a new chapter on the practice of single-cell genome de novo assembly.
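The per-cell depth estimates in the commentary above (about 0.4x per K562 cell and about 0.33x per HG002 cell) are simple arithmetic, reproduced here as a sketch. The genome size of 3.0e9 bp is a rounded assumption that matches the commentary's 0.33x figure, and "1 Gb per cell" is the commentary's reading of the data, not a number verified against the paper.

```python
GENOME_SIZE_BP = 3.0e9  # assumption: rounded size of the human haploid genome


def per_cell_depth_from_total(total_depth_x, n_cells):
    """Average depth per cell when a total depth is spread evenly over n_cells."""
    return total_depth_x / n_cells


def depth_from_bases(bases_sequenced, genome_size=GENOME_SIZE_BP):
    """Sequencing depth implied by a number of sequenced bases."""
    return bases_sequenced / genome_size


# K562: about 37x in total across 95 cells.
print(round(per_cell_depth_from_total(37, 95), 2))  # prints 0.39
# HG002: about 1 Gb of sequence per cell.
print(round(depth_from_bases(1.0e9), 2))  # prints 0.33
```

Depth well below 1x per cell is consistent with the commentary's point that no single cell can cover the whole genome on its own.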
1033.
李翛然
(2022-07-28 13:15):
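#paper DOI: 10.1126/science.aba2374 Preventing Engrailed-1 activation in fibroblasts yields wound regeneration without scarring. This skin wound repair target, published in Science in April 2021, is very interesting: it is claimed to leave no scar. So far one old drug has been found to act on this target; no other drugs have entered the clinic, although other inhibitors exist. The old drug, verteporfin, is used with laser irradiation of the eye to treat ruptured ocular blood vessels. It is unclear whether its efficacy on the new skin-injury target would be constrained by the old drug's original target (assuming the targets differ). This target has successfully caught our company's attention.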
Abstract:
Skin scarring, the end result of adult wound healing, is detrimental to tissue form and function. Engrailed-1 lineage-positive fibroblasts (EPFs) are known to function in scarring, but Engrailed-1 lineage-negative fibroblasts (ENFs) remain poorly characterized. Using cell transplantation and transgenic mouse models, we identified a dermal ENF subpopulation that gives rise to postnatally derived EPFs by activating Engrailed-1 expression during adult wound healing. By studying ENF responses to substrate mechanics, we found that mechanical tension drives Engrailed-1 activation via canonical mechanotransduction signaling. Finally, we showed that blocking mechanotransduction signaling with either verteporfin, an inhibitor of Yes-associated protein (YAP), or fibroblast-specific transgenic YAP knockout prevents Engrailed-1 activation and promotes wound regeneration by ENFs, with recovery of skin appendages, ultrastructure, and mechanical strength. This finding suggests that there are two possible outcomes to postnatal wound healing: a fibrotic response (EPF-mediated) and a regenerative response (ENF-mediated).
1034.
muton
(2022-07-28 11:58):
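#paper DOI: 10.1371/journal.pcbi.1009267 Unveiling functions of the visual cortex using task-specific deep neural networks. Human visual perception is a complex cognitive ability, controlled and regulated by different cortical regions of the brain. However, the exact functions of these regions are still not fully understood, so there is no definite answer as to how they coordinate visual perception. The current view is that visual information is transformed through hierarchical computations in functionally distinct regions, usually summarized as the ventral and dorsal visual pathways. Both discovering the exact function of each visual cortical region and reproducing that function with computational models are challenging, yet they are the ultimate goals. Deep neural networks (DNNs) are a promising approach to modeling and predicting the responses of visual regions. This paper relates fMRI data to a set of DNN models optimized for different visual tasks (the authors chose 18 DNNs trained on the Taskonomy dataset, each optimized for one of 18 different indoor scene understanding tasks) and finds a structured mapping of visual information along the ventral and dorsal visual pathways: low-level visual tasks map onto early visual cortex, 3D scene perception tasks map onto the dorsal stream, and semantic tasks map onto the ventral stream. The highlight of the paper is probably the idea that comparing the similarity between models and actual human brain data can reveal which brain regions contribute to which tasks.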
Abstract:
The human visual cortex enables visual perception through a cascade of hierarchical computations in cortical regions with distinct functionalities. Here, we introduce an AI-driven approach to discover the functional mapping of the visual cortex. We related human brain responses to scene images measured with functional MRI (fMRI) systematically to a diverse set of deep neural networks (DNNs) optimized to perform different scene perception tasks. We found a structured mapping between DNN tasks and brain regions along the ventral and dorsal visual streams. Low-level visual tasks mapped onto early brain regions, 3-dimensional scene perception tasks mapped onto the dorsal stream, and semantic tasks mapped onto the ventral stream. This mapping was of high fidelity, with more than 60% of the explainable variance in nine key regions being explained. Together, our results provide a novel functional mapping of the human visual cortex and demonstrate the power of the computational approach.
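One common way to compare model and brain representations, in the spirit of the similarity comparison described above, is representational similarity analysis (RSA). The sketch below is a generic RSA illustration and an assumption on my part: the text above does not spell out the exact pipeline the paper uses. Each row is the response pattern to one stimulus; a dissimilarity matrix is built for the model and for the brain region, and the two are then correlated. All function names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def rdm_upper(responses):
    """Upper triangle of a dissimilarity matrix: 1 - Pearson r per stimulus pair."""
    return [1.0 - pearson(responses[i], responses[j])
            for i, j in combinations(range(len(responses)), 2)]


def rsa_score(model_responses, brain_responses):
    """Correlation between the model's and the brain region's dissimilarity structure."""
    return pearson(rdm_upper(model_responses), rdm_upper(brain_responses))


# Toy responses: 3 stimuli x 4 units/voxels (illustrative values only).
model = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1]]
brain = [[3, 5, 7, 9], [5, 3, 9, 7], [9, 7, 5, 3]]  # 2 * model + 1
print(rsa_score(model, brain))
```

Repeating such a comparison for each task-specific network and each brain region yields a table of scores from which a task-to-region mapping can be read off.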
1035.
前进
(2022-07-28 11:54):
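#paper doi: 10.1109/TMI.2019.2953788 IEEE Transactions on Medical Imaging, 2019
Progressively trained convolutional neural networks for deformable image registration
Existing deep-learning-based registration algorithms often perform poorly on tasks with large deformations. Current approaches to large deformations fall into two groups: (1) pre-register the images with a conventional method (affine, rigid) before registration; (2) cascade multiple networks that deform the image step by step to produce a large-deformation field. Both have drawbacks: (1) conventional methods take too long, which weakens the speed advantage of the subsequent deep-learning registration; (2) cascaded networks interpolate the moving image multiple times, and the accumulated interpolation error degrades the final deformation field. The authors therefore propose using a single network with progressive training for large-deformation registration. Progressive training was first used to increase the resolution of GAN-generated images and is transferred here to the registration problem. Put simply, once one layer of the network has converged, new layers are added and training continues, until the final deformation field is produced. The paper makes 3 contributions:
1. It proposes a progressive learning model that learns deformations at different image scales within a single convolutional network.
2. It shows that no pre-registration is needed before registering two images with a neural network.
3. It shows that a neural network can be trained with supervision from synthetic deformation fields and still generalize to real registration problems.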
IF: 8.900 (Q1)
IEEE transactions on medical imaging,
2020-05.
DOI: 10.1109/TMI.2019.2953788
PMID: 31751269
Abstract:
Deep learning-based methods for deformable image registration are attractive alternatives to conventional registration methods because of their short registration times. However, these methods often fail to estimate larger displacements in complex deformation fields, for which a multi-resolution strategy is required. In this article, we propose to train neural networks progressively to address this problem. Instead of training a large convolutional neural network on the registration task all at once, we initially train smaller versions of the network on lower resolution versions of the images and deformation fields. During training, we progressively expand the network with additional layers that are trained on higher resolution data. We show that this way of training allows a network to learn larger displacements without sacrificing registration accuracy and that the resulting network is less sensitive to large misregistrations compared to training the full network all at once. We generate a large number of ground truth example data by applying random synthetic transformations to a training set of images, and test the network on the problem of intrapatient lung CT registration. We analyze the learned representations in the progressively growing network to assess how the progressive learning strategy influences training. Finally, we show that a progressive training procedure leads to improved registration accuracy when learning large and complex deformations.
1036.
芝麻
(2022-07-28 09:52):
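#paper doi: 10.1016/j.tranon.2021.101016. Epub 2021 Jan 16. PMID: 33465745; PMCID: PMC7815805. Machine learning analysis using 77,044 genomic and transcriptomic profiles to accurately predict tumor type. Transl Oncol. Metastasis is one of the main threats to the lives of cancer patients. For some patients with metastatic tumors, the primary site cannot be determined from morphology alone; such tumors are clinically called cancer of unknown primary (CUP). Because CUP is highly invasive and has no identifiable site of origin, physicians struggle to choose a treatment, so precision treatment of CUP is a clinical challenge. In 2021, Jim Abraham and colleagues trained machine learning models on more than 20,000 cancer samples, combining genomic mutation and transcriptome expression features. They tried more than 300 different machine learning models, and on an independent validation set of 19,555 samples the final model reached 97% accuracy when the second-highest prediction is also counted, as the abstract specifies.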
IF: 4.500 (Q1)
Translational oncology,
2021-Mar.
DOI: 10.1016/j.tranon.2021.101016
PMID: 33465745
PMCID:PMC7815805
Abstract:
Cancer of Unknown Primary (CUP) occurs in 3-5% of patients when standard histological diagnostic tests are unable to determine the origin of metastatic cancer. Typically, a CUP diagnosis is treated empirically and has very poor outcomes, with median overall survival less than one year. Gene expression profiling alone has been used to identify the tissue of origin but struggles with low neoplastic percentage in metastatic sites which is where identification is often most needed. MI GPSai, a Genomic Prevalence Score, uses DNA sequencing and whole transcriptome data coupled with machine learning to aid in the diagnosis of cancer. The algorithm trained on genomic data from 34,352 cases and genomic and transcriptomic data from 23,137 cases and was validated on 19,555 cases. MI GPSai predicted the tumor type in the labeled data set with an accuracy of over 94% on 93% of cases while deliberating amongst 21 possible categories of cancer. When also considering the second highest prediction, the accuracy increases to 97%. Additionally, MI GPSai rendered a prediction for 71.7% of CUP cases. Pathologist evaluation of discrepancies between submitted diagnosis and MI GPSai predictions resulted in change of diagnosis in 41.3% of the time. MI GPSai provides clinically meaningful information in a large proportion of CUP cases and inclusion of MI GPSai in clinical routine could improve diagnostic fidelity. Moreover, all genomic markers essential for therapy selection are assessed in this assay, maximizing the clinical utility for patients within a single test.
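The abstract above reports two accuracy figures: one for the top prediction and a higher one when the second-highest prediction is also considered. The difference between these two metrics can be sketched as a top-k accuracy calculation. The function name `top_k_accuracy` and the example labels are hypothetical; they are not MI GPSai outputs or data from the paper.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label is among the k highest-ranked predictions."""
    if len(ranked_predictions) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for ranked, truth in zip(ranked_predictions, true_labels)
               if truth in ranked[:k])
    return hits / len(true_labels)


# Hypothetical ranked tumor-type predictions (best guess first); not data from the paper.
ranked = [
    ["lung", "breast"],
    ["colon", "gastric"],
    ["breast", "ovarian"],
    ["gastric", "colon"],
]
truth = ["lung", "colon", "ovarian", "colon"]

print(top_k_accuracy(ranked, truth, k=1))  # prints 0.5
print(top_k_accuracy(ranked, truth, k=2))  # prints 1.0
```

Top-2 accuracy can never be lower than top-1 accuracy, so the two numbers should be compared like for like when this classifier is set against other methods.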
1037.
王昊
(2022-07-28 09:51):
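#paper doi:10.48550/arXiv.2207.04630 Yi Ma, Doris Tsao, and Heung-Yeung Shum. 2022. On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence. Yi Ma, who has a strong mathematical background, wrote this paper with the neuroscientist Doris Tsao to lay out what they regard as two fundamental principles of AI. The paper proposes a new framework for understanding deep neural networks, compressive closed-loop transcription, and addresses two questions: what is the objective of learning from data and how can it be measured (information and coding theory), and how can that objective be achieved through efficient and effective computation (control)? From these it derives two basic principles for understanding AI: parsimony and self-consistency.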
arXiv,
2022.
DOI: 10.48550/arXiv.2207.04630
Abstract:
Ten years into the revival of deep networks and artificial intelligence, we propose a theoretical framework that sheds light on understanding deep networks within a bigger picture of Intelligence in general. We introduce two fundamental principles, Parsimony and Self-consistency, that address two fundamental questions regarding Intelligence: what to learn and how to learn, respectively. We believe the two principles are the cornerstones for the emergence of Intelligence, artificial or natural. While these two principles have rich classical roots, we argue that they can be stated anew in entirely measurable and computable ways. More specifically, the two principles lead to an effective and efficient computational framework, compressive closed-loop transcription, that unifies and explains the evolution of modern deep networks and many artificial intelligence practices. While we mainly use modeling of visual data as an example, we believe the two principles will unify understanding of broad families of autonomous intelligent systems and provide a framework for understanding the brain.
1038.
李欣
(2022-07-28 09:45):
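#paper Fertil Steril. 2022 Apr;117(4):792-800. doi: 10.1016/j.fertnstert.2021.12.025. Epub 2022 Jan 31. PMID: 35109980
Endometrial thickness is routinely measured in IVF cycles, and a thin endometrium is associated with increased risks of miscarriage, ectopic pregnancy, placenta previa, low birth weight, and other obstetric complications. Previous studies have shown that in fresh embryo transfer cycles, greater endometrial thickness is associated with better pregnancy outcomes. In frozen embryo transfer (FET) cycles, findings on the relationship between endometrial thickness and IVF pregnancy outcomes are inconsistent, and some studies conclude that endometrial thickness does not predict live birth rate in FET cycles. It is therefore unclear whether pregnancy and live birth rates plateau at some point or continue to rise as endometrial thickness increases. Whether the optimal endometrial thickness is the same for FET and fresh ET also remains to be determined. This study examined whether an optimal endometrial thickness exists in fresh and frozen cycles.
Objective: The primary aim was to determine whether, in fresh IVF-ET and FET cycles, there is an endometrial thickness at which live birth rates peak and one beyond which live birth rates decline. The study also examined whether patient age, embryo stage, and number of oocytes retrieved affect the relationship between endometrial thickness and live birth rate.
Data: Drawn from the Canadian Assisted Reproductive Technology Registry Plus (CARTR Plus) database, covering 96,760 autologous cycles between January 2013 and December 2019: 43,383 fresh cycles and 53,377 frozen cycles.
Design: Retrospective cohort study. Frozen and fresh cycles were analyzed separately to identify the appropriate endometrial thickness for each. In fresh cycles, the recorded thickness was measured on the day of trigger; in frozen cycles, it was mainly measured before the start of progesterone, or before the LH surge or hCG trigger.
This is the largest study on this question to date comparing the effect of endometrial thickness on live birth rates in fresh and frozen-thaw IVF cycles. In fresh cycles, greater endometrial thickness was associated with significant increases in the mean number of oocytes retrieved, mean peak estradiol level, and mean number of usable embryos, so the apparent benefit of a thicker endometrium may be confounded by these patients having a better prognosis. The "optimal" endometrial thickness appears to differ between fresh and frozen cycles, possibly because of the effect of controlled ovarian hyperstimulation (COH) on the endometrium.
Conclusion: In fresh embryo transfer cycles, live birth rates increase significantly until the endometrium reaches 10-12 mm, whereas in FET cycles live birth rates plateau after 7-10 mm.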
Abstract:
OBJECTIVE: To study the effect of increasing endometrial thickness on live birth rates in fresh and frozen-thaw embryo transfer (FET) cycles. DESIGN: Retrospective cohort study. SETTING: National data from Autologous in vitro fertilization (IVF) embryo transfer and FET cycles in Canada from the Canadian Assisted Reproductive Technology Registry Plus (CARTR Plus) database for records between January 2013 and December 2019. PATIENTS: Thirty-three Canadian clinics participated in voluntary reporting of IVF and pregnancy outcomes to the Canadian Assisted Reproductive Technology Registry Plus database, and a total of 43,383 fresh and 53,377 frozen transfers were included. INTERVENTION(S): None. MAIN OUTCOME MEASURE(S): Clinical pregnancy, pregnancy loss, and live birth rates. RESULTS: In fresh IVF-embryo transfer cycles, increasing endometrial thickness is associated with significant increases in the mean number of oocytes retrieved, peak estradiol levels, number of usable embryos, clinical pregnancy rates, live birth rates, and mean term singleton birth weights, and a decrease in pregnancy loss rates. However, live birth rates plateau after 10-12 mm. In contrast, in FET cycles live birth rates plateau after the endometrium measures 7-10 mm. The improvement in live birth rates with increasing endometrial thickness was independent of patient age, timing of embryo transfer (e.g., cleavage stage vs. blastocyst stage), or the number of oocytes at retrieval. CONCLUSIONS: In cycles with a fresh embryo transfer, live birth rates increase significantly until an endometrial thickness of 10-12 mm, while in FET cycles live birth rates plateau after 7-10 mm. However, an endometrial thickness <6 mm was associated clearly with a dramatic reduction in live birth rates in fresh and frozen embryo transfer cycles.
1039.
颜林林
(2022-07-28 08:50):
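#paper doi:10.1093/bioinformatics/btac137 Bioinformatics, 2022, BWA-MEME: BWA-MEM emulated with a machine learning approach. I saw Heng Li retweet this paper on Twitter and assumed he had upgraded bwa-mem2 again, only to find it was someone else's work that had earned his endorsement. A successor to a well-known tool has to improve substantially on some front, and here the improvement is mainly in performance. Short-read aligners for high-throughput sequencing data typically first find exact-match seeds (almost always a table lookup done in constant time) and then extend them. Choosing the seed length takes some skill: too short, and there may be too many repeated hits; too long, and many words with no match (sequences absent from the genome) still occupy the dictionary, making it too large. Some earlier algorithms therefore used variable-length seeds (a strategy I once considered myself but, to my embarrassment, never implemented). Variable-length seeds, however, bring variable memory block sizes and frequent memory accesses, which create a performance bottleneck. This paper uses machine learning to preprocess during seed-index construction so that the index adapts to the genome sequence and the number of memory accesses is fixed across seed lengths, which yields the speedup. In the final evaluation, bwa-meme produces output identical to bwa-mem2 while running up to 3.45× faster in seeding throughput. The algorithm deserves a closer read.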
Abstract:
MOTIVATION: The growing use of next-generation sequencing and enlarged sequencing throughput require efficient short-read alignment, where seeding is one of the major performance bottlenecks. The key challenge in the seeding phase is searching for exact matches of substrings of short reads in the reference DNA sequence. Existing algorithms, however, present limitations in performance due to their frequent memory accesses. RESULTS: This article presents BWA-MEME, the first full-fledged short read alignment software that leverages learned indices for solving the exact match search problem for efficient seeding. BWA-MEME is a practical and efficient seeding algorithm based on a suffix array search algorithm that solves the challenges in utilizing learned indices for SMEM search which is extensively used in the seeding phase. Our evaluation shows that BWA-MEME achieves up to 3.45× speedup in seeding throughput over BWA-MEM2 by reducing the number of instructions by 4.60×, memory accesses by 8.77× and LLC misses by 2.21×, while ensuring the identical SAM output to BWA-MEM2. AVAILABILITY AND IMPLEMENTATION: The source code and test scripts are available for academic use at https://github.com/kaist-ina/BWA-MEME/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
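To make the phrase "learned indices" concrete, the sketch below shows the generic idea on a sorted array of encoded k-mers: a fitted model predicts where a key should sit, and the search is confined to a small window whose width is the model's worst-case error. This is only an illustration under stated assumptions. It is not BWA-MEME's actual index or its SMEM search, and the names `encode`, `build_learned_index`, and `lookup`, the single linear model, and the toy reference string are all hypothetical.

```python
import bisect


def encode(kmer):
    """Pack a DNA k-mer into an integer, 2 bits per base (A=0, C=1, G=2, T=3)."""
    code = 0
    for base in kmer:
        code = code * 4 + "ACGT".index(base)
    return code


def build_learned_index(keys):
    """Least-squares fit of position ~ slope * key + intercept over sorted keys.

    Returns the model plus an integer bound on its worst-case position error.
    """
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = (n - 1) / 2
    var = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
    slope = cov / var if var else 0.0
    intercept = mean_p - slope * mean_k
    max_err = max(abs(i - (slope * k + intercept)) for i, k in enumerate(keys))
    return slope, intercept, int(max_err) + 1


def lookup(keys, model, key):
    """Return the position of key in keys, or -1, searching only near the prediction."""
    slope, intercept, err = model
    guess = int(round(slope * key + intercept))
    # Clamp the search window to valid positions; it spans at most 2 * err + 1 slots.
    lo = min(max(0, guess - err), len(keys))
    hi = max(lo, min(len(keys), guess + err + 1))
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else -1


# Toy reference and seed length (assumptions for illustration only).
reference = "ACGTACGGTCA"
K = 4
keys = sorted({encode(reference[i:i + K]) for i in range(len(reference) - K + 1)})
model = build_learned_index(keys)

print(lookup(keys, model, encode("GGTC")))  # position of a k-mer present in the reference
print(lookup(keys, model, encode("TTTT")))  # -1, k-mer absent from the reference
```

The point of the bounded window is that the number of probes per lookup depends on the model's error rather than on the size of the index, which is the property that reduces memory accesses.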
1040.
徐炳祥
(2022-07-27 21:51):
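#paper International Conference on Learning Representations, 2020, Hyper-SAGNN: a self-attention based graph neural network for hypergraphs. Representation learning on hypergraphs, which have higher-order connections, is a necessary step for extracting useful patterns in many real-world problems, yet the hypergraph representation learning algorithms of the time (2020) could not handle hypergraphs with hyperedges of varying size well. The authors designed a self-attention-based graph neural network, Hyper-SAGNN, that handles learning on hypergraphs with variable hyperedge sizes. The architecture first uses a single-layer neural network to map input features to "static embeddings", then uses a multi-head attention structure to map the nodes within the same hyperedge to "dynamic embeddings", then uses a Hadamard product to characterize the similarity between the static and dynamic representations, and passes the result to a single-layer neural network that predicts the probability that the hyperedge exists. The model outperformed the prevailing models of the time on standard benchmark datasets and also performed well on representation and cell classification of single-cell Hi-C data. In 2022, the authors published Higashi, a single-cell Hi-C representation method based on this architecture, in Nature Biotechnology (doi: 10.1038/s41587-021-01034-y).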
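The pipeline described in this comment (static embedding, attention-based dynamic embedding, element-wise comparison, single-layer scorer) can be sketched as a forward pass. This is a simplified illustration, not the authors' implementation: it uses one attention head instead of multi-head attention, takes the element-wise comparison to be the Hadamard product of the difference with itself, averages the per-node scores, and uses random untrained weights. The names `make_params` and `hyperedge_probability` are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_params(in_dim, emb_dim, seed=0):
    """Random, untrained weights for the sketch (assumption: no bias terms except the scorer)."""
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(emb_dim)]

    return {"W_s": mat(), "W_q": mat(), "W_k": mat(), "W_v": mat(),
            "w_o": [rng.uniform(-0.5, 0.5) for _ in range(emb_dim)], "b_o": 0.0}


def hyperedge_probability(features, params):
    """Score a candidate hyperedge from the feature vectors of its nodes (needs >= 2 nodes)."""
    n = len(features)
    # Static embedding: depends on each node alone.
    static = [[math.tanh(v) for v in matvec(params["W_s"], x)] for x in features]
    q = [matvec(params["W_q"], x) for x in features]
    k = [matvec(params["W_k"], x) for x in features]
    v = [matvec(params["W_v"], x) for x in features]
    node_scores = []
    for i in range(n):
        # Dynamic embedding: attention over the other nodes of the same candidate hyperedge.
        others = [j for j in range(n) if j != i]
        logits = [sum(a * b for a, b in zip(q[i], k[j])) for j in others]
        top = max(logits)
        weights = [math.exp(l - top) for l in logits]
        total = sum(weights)
        dynamic = [math.tanh(sum(weights[t] / total * v[j][d] for t, j in enumerate(others)))
                   for d in range(len(v[0]))]
        # Hadamard product of (dynamic - static) with itself, then a one-layer sigmoid scorer.
        diff_sq = [(dv - sv) ** 2 for dv, sv in zip(dynamic, static[i])]
        logit = sum(w * o for w, o in zip(params["w_o"], diff_sq)) + params["b_o"]
        node_scores.append(1.0 / (1.0 + math.exp(-logit)))
    # Averaging over nodes lets the same weights score hyperedges of any size.
    return sum(node_scores) / n


params = make_params(in_dim=4, emb_dim=3, seed=0)
rng = random.Random(1)
for size in (2, 3, 5):  # variable hyperedge sizes share the same parameters
    nodes = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(size)]
    print(size, hyperedge_probability(nodes, params))
```

Because attention and the final average both operate over whatever nodes are supplied, nothing in the parameters depends on the hyperedge size, which is the property the comment highlights.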
IF: 33.100 Q1
Nature biotechnology,
2022-02.
DOI: 10.1038/s41587-021-01034-y
PMID: 34635838
PMCID:PMC8843812
Abstract:
Single-cell Hi-C (scHi-C) can identify cell-to-cell variability of three-dimensional (3D) chromatin organization, but the sparseness of measured interactions poses an analysis challenge. Here we report Higashi, an algorithm based on hypergraph representation learning that can incorporate the latent correlations among single cells to enhance overall imputation of contact maps. Higashi outperforms existing methods for embedding and imputation of scHi-C data and is able to identify multiscale 3D genome features in single cells, such as compartmentalization and TAD-like domain boundaries, allowing refined delineation of their cell-to-cell variability. Moreover, Higashi can incorporate epigenomic signals jointly profiled in the same cell into the hypergraph representation learning framework, as compared to separate analysis of two modalities, leading to improved embeddings for single-nucleus methyl-3C data. In an scHi-C dataset from human prefrontal cortex, Higashi identifies connections between 3D genome features and cell-type-specific gene regulation. Higashi can also potentially be extended to analyze single-cell multiway chromatin interactions and other multimodal single-cell omics data.