刘昊辰 (2024-08-20 15:24):
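#paper arXiv:2406.00741v1 [cs.AI] 2 Jun 2024, Learning to Play 7 Wonders Duel Without Human Supervision. This paper introduces ZeusAI, an AI program that plays the board game 7 Wonders Duel. Inspired by the AlphaZero reinforcement learning algorithm, ZeusAI combines MCTS with a Transformer and learns the game without human supervision. In matches against human players it reached a very high competitive level, winning 26 of 38 games. The paper also uses ZeusAI as a tool to study the game's balance. The community widely believes that the first player has a significant advantage, and ZeusAI's self-play games confirm this. The paper proposes several rule variants to reduce the imbalance, such as changing the initial number of coins or changing the wonder selection phase. Download: https://arxiv.org/pdf/2406.00741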
arXiv, 2024-06-02T13:28:57Z. DOI: 10.48550/arXiv.2406.00741
This paper introduces ZeusAI, an artificial intelligence system developed to play the board game 7 Wonders Duel. Inspired by the AlphaZero reinforcement learning algorithm, ZeusAI relies on a combination of Monte Carlo Tree Search and a Transformer Neural Network to learn the game without human supervision. ZeusAI competes at the level of top human players, develops both known and novel strategies, and allows us to test rule variants to improve the game's balance. This work demonstrates how AI can help in understanding and enhancing board games.
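For readers unfamiliar with how an AlphaZero-style search combines a network prior with MCTS statistics, here is a minimal sketch of the PUCT child-selection rule. It is not taken from the ZeusAI paper: the function name, the constant `c_puct=1.5`, and the choice to count the parent's visits as the sum of child visits plus one are all assumptions made for illustration.

```python
import math

def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the child maximizing Q + U (AlphaZero-style PUCT).

    priors       -- network prior probability for each child move
    visit_counts -- number of times each child has been visited
    value_sums   -- sum of backed-up values for each child
    c_puct       -- exploration constant (illustrative value, an assumption)
    """
    # Assumption: the parent was visited once more than its children combined.
    parent_visits = sum(visit_counts) + 1
    best_index, best_score = -1, -math.inf
    for i, (p, n, w) in enumerate(zip(priors, visit_counts, value_sums)):
        q = w / n if n > 0 else 0.0                           # mean value so far
        u = c_puct * p * math.sqrt(parent_visits) / (1 + n)   # exploration bonus
        if q + u > best_score:
            best_index, best_score = i, q + u
    return best_index
```

With no visits yet, the rule follows the prior; as a move accumulates visits, its exploration bonus shrinks and less-visited moves get their turn.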
颜林林 (2024-08-18 05:49):
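#paper doi:10.1038/s41597-024-03701-6, Scientific data, 2024, ChineseMPD: A Semantic Segmentation Dataset of Chinese Martial Arts Classic Movie Props. Cleaning and curating data and releasing it as a public dataset is enough to get published; Scientific Data carries many articles of this kind. The data shared here is fun: it comes from a large collection of Chinese martial arts films, annotated with semantic segmentation labels for props such as guns, swords, sticks, knives, hooks, and arrows. A team of 21 people, including 11 undergraduates, spent half a year on manual annotation and review, filling a gap in existing semantic segmentation datasets regarding action-movie props. The dataset is released under the CC BY 4.0 license, which permits redistribution, modification, adaptation, and derivative works with attribution. Download: https://www.scidb.cn/en/anonymous/SlpaelFy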
IF:5.800Q1 Scientific data, 2024-Aug-14. DOI: 10.1038/s41597-024-03701-6 PMID: 39143093 PMCID:PMC11325024
Recent advances in computer vision and deep learning techniques have facilitated significant progress in video scene understanding, thus helping film and television practitioners achieve accurate video editing. However, so far, publicly available semantic segmentation datasets are mostly limited to indoor scenes, city streets, and natural images, often ignoring example objects in action movies, which is a research gap that needs to be urgently filled. In this paper, we introduce a large-scale, high-precision semantic segmentation dataset of props in Chinese martial arts movie clips, named ChineseMPD. Specifically, this dataset first establishes segmentation rules and general review criteria for audiovisual data, and then provides semantic segmentation annotations for six weapon props (Gun, Sword, Stick, Knife, Hook, and Arrow) with a summary of 32,992 objects. To the best of our knowledge, this dataset is the largest semantic segmentation dataset for movie props to date. ChineseMPD dataset not only significantly expands the application of traditional tasks of computer vision such as object detection and scene understanding, but also opens up new avenues for interdisciplinary research.
张浩彬 (2024-08-16 20:33):
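#paper SMDE: Unsupervised representation learning for time series based on signal mode decomposition and ensemble doi: https://doi.org/10.1016/j.knosys.2024.112369 This month I'm reading my own newly published paper, as a bit of promotion. In this paper we propose SMDE, a new contrastive learning framework for time series. On top of instance-level contrast, it is the first to bring mode-level contrast into contrastive learning, which deepens the understanding of complex time-series dynamics. We further propose pretext tasks tailored to time series, namely global signal consistency and local mode consistency, and based on them a new loss function, DE Circle loss. We achieve SOTA results across extensive semi-supervised experiments. Honestly, although the fully supervised results are also good, I personally think the semi-supervised setting is where our work does best.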
龙海晨 (2024-08-15 22:28):
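#paper Kashyap J, Tyagi RK. Mitotic genome bookmarking by nuclear receptor VDR advocates transmission of cellular transcriptional memory to progeny cells. Exp Cell Res. 2022 Aug 1;417(1):113193. doi: 10.1016/j.yexcr.2022.113193. Epub 2022 May 4. PMID: 35523304. Mitosis is an essential process for cellular self-renewal and is accompanied by dynamic changes in nuclear structure and chromatin organization. Cells nevertheless manage to re-establish all parental epigenetic marks after mitosis. Some sequence-specific transcription factors remain attached to mitotic chromatin during cell division to ensure timely reactivation of the transcription factors needed to maintain cell identity. These mitosis-associated factors are known as "genome bookmarking factors", and the phenomenon is called "genome bookmarking". This study discusses another classical nuclear receptor, the Vitamin D Receptor (VDR, NR1I1): its relevance to genome bookmarking and its possible role in lineage commitment and cell identity. VDR remains constitutively associated with mitotic chromatin throughout cell division. VDR promotes the binding of its heterodimeric partner, the Retinoid X Receptor (RXR), to mitotic chromatin. VDR binds DR3 sequences in target gene promoters during both interphase and mitosis. The VDR DNA-binding domain (DBD) plays a central role in constitutive genome bookmarking.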
DeDe宝 (2024-08-02 14:36):
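#paper DOI:https://doi.org/10.7554/eLife.88095.3, eLife, 2024, An allocentric human odometer for perceiving distances on the ground plane. This paper analyzes how humans use an allocentric spatial reference frame to resolve the optical illusion of static objects shifting as we move. The researchers compared participants' perception of target location while stationary and while walking, and found that in the walking condition judgments were biased nearer than in the stationary baseline, which supports allocentric coding. They also tested whether performing a cognitive task while walking (such as counting backward) affects the path-integration mechanism; target-location judgments were affected under distracted walking, suggesting that path integration may need some attentional resources. Next, by moving participants passively, they tested whether path integration works under passive motion and in different directions; it worked for both forward and backward movement, indicating that it can use vestibular signals. A verbal-report task was then used to check whether observers in self-motion show the same spatial perception effect as stationary observers, and the effect was consistent across the action task and the stationary task. Finally, the researchers found that path integration is more effective along the horizontal direction than along the vertical direction. In sum, the results indicate that humans use an allocentric spatial reference frame to perceive distances on the ground plane.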
IF:6.400Q1 eLife, 2024-7-18. DOI: 10.7554/eLife.88095.3
We reliably judge locations of static objects when we walk despite the retinal images of these objects moving with every step we take. Here, we showed our brains solve this optical illusion by adopting an allocentric spatial reference frame. We measured perceived target location after the observer walked a short distance from the home base. Supporting the allocentric coding scheme, we found the intrinsic bias, which acts as a spatial reference frame for perceiving location of a dimly lit target in the dark, remained grounded at the home base rather than traveled along with the observer. The path-integration mechanism responsible for this can utilize both active and passive (vestibular) translational motion signals, but only along the horizontal direction. This asymmetric path-integration finding in human visual space perception is reminiscent of the asymmetric spatial memory finding in desert ants, pointing to nature’s wondrous and logically simple design for terrestrial creatures.
半面阳光 (2024-07-31 23:48):
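#paper DOI:https://doi.org/10.1016/j.gim.2024.101137, Genet Med, 2024, Laboratory testing for preconception/prenatal carrier screening: A technical standard of the American College of Medical Genetics and Genomics (ACMG). This is a technical standard newly released by ACMG, intended as a technical reference for laboratory preconception/prenatal carrier screening. It updates and supplements the 2013 technical standard on autosomal recessive and X-linked conditions. The standard considers many factors, including population carrier frequency, optimal panel size and gene content, and the recommendation to divide carrier screening into 4 tiers. It establishes criteria for the design and validation of carrier screening tests, defines the scope and limitations of such tests, sets guidelines for interpreting and reporting results, and recommends appropriate follow-up testing where applicable. Note, however, that this technical standard is not intended to serve as a clinical practice guideline.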
Carrier screening has historically assessed a relatively small number of autosomal recessive and X-linked conditions selected based on frequency in a specific subpopulation and association with severe morbidity or mortality. Advances in genomic technologies enable simultaneous screening of individuals for several conditions. The American College of Medical Genetics and Genomics recently published a clinical practice resource that presents a framework when offering screening for autosomal recessive and X-linked conditions during pregnancy and preconception and recommends a tier-based approach when considering the number of conditions to screen for and their frequency within the US population in general. This laboratory technical standard aims to complement the practice resource and to put forth considerations for clinical laboratories and clinicians who offer preconception/prenatal carrier screening.
白鸟 (2024-07-31 22:52):
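#paper, DOI: 10.1186/s13059-020-02116-x, Integrative analyses of single-cell transcriptome and regulome using MAESTRO. A tool paper published in 2020 by the lab of Xiaole Liu (刘小乐). I read it mainly to see what is new in its scATAC analysis and how it compares with other software. 1. The MAESTRO workflow supports end-to-end analysis of single-cell transcriptome plus ATAC data, accommodates different single-cell platforms, and connects upstream and downstream analysis; 2. Chromatin accessibility: it models chromatin accessibility at the gene level and offers strong transcriptional regulator prediction; 3. Automatic cell-type annotation, with an optimized differential gene analysis step and transcriptional regulator inference; 4. It runs as a Snakemake workflow, and several analysis steps are worth borrowing; I have not yet read the scATAC code. Drawbacks: the software has not been maintained since, and the paper's citation count is low. When studying the code, note that it calls various other packages, which also need to be understood.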
IF:10.100Q1 Genome biology, 2020-08-07. DOI: 10.1186/s13059-020-02116-x PMID: 32767996 PMCID:PMC7412809
We present Model-based AnalysEs of Transcriptome and RegulOme (MAESTRO), a comprehensive open-source computational workflow ( http://github.com/liulab-dfci/MAESTRO ) for the integrative analyses of single-cell RNA-seq (scRNA-seq) and ATAC-seq (scATAC-seq) data from multiple platforms. MAESTRO provides functions for pre-processing, alignment, quality control, expression and chromatin accessibility quantification, clustering, differential analysis, and annotation. By modeling gene regulatory potential from chromatin accessibilities at the single-cell level, MAESTRO outperforms the existing methods for integrating the cell clusters between scRNA-seq and scATAC-seq. Furthermore, MAESTRO supports automatic cell-type annotation using predefined cell type marker genes and identifies driver regulators from differential scRNA-seq genes and scATAC-seq peaks.
小W (2024-07-31 22:45):
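#paper doi:10.1016/S0140-6736(23)02799-X An empowerment model for managing menopause. Half of all people will go through menopause, and I happened to read one of the four papers in the Lancet 2024 Series on menopause. For severe adverse symptoms of menopause such as despair and anxiety, treating it purely as a medical problem with medication is not enough; what is needed is an approach that builds self-knowledge and confidence so that people can self-manage their health and make informed decisions about care. The paper also outlines what menopause is, how it can be predicted, its symptoms, and management before and after menopause, giving confidence to caregivers and to anyone seeking knowledge about menopause.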
Menopause eventually happens to all people with typically functioning ovaries, and almost one billion women worldwide are postmenopausal. Although the biology of typical menopause is ubiquitous, the experience varies substantially. Factors contributing to the experience include not only individual factors, such as the nature and severity of symptoms, but also psychological, social, and contextual considerations, many of which are modifiable. In this first paper in the Lancet Series on menopause, we argue for a new approach that goes beyond the treatment of specific symptoms, to encompass a broad model to support women transitioning this life stage, using the model of empowerment. WHO defines empowerment as an active process of gaining knowledge, confidence, and self-determination to self-manage health and make informed decisions about care. Rather than focusing on menopause as an endocrine deficiency, we propose an empowerment model that recognises factors modifying the experience, in which the patient is an expert in their own condition and the health-care worker supports the patient to become an equal and active partner in managing their own care.
庞庞 (2024-07-31 22:26):
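#paper doi: 10.1016/j.jad.2019.09.067, Altered Brain Entropy as a predictor of antidepressant response in major depressive disorder. This paper highlights brain entropy (BEN) in the MOFC/sgACC as a potential marker for predicting MDD diagnosis and treatment response. MDD may increase BEN in the MOFC/sgACC while decreasing BEN in visual and sensorimotor circuits, which corresponds to imbalanced emotional and sensorimotor information processing. Reversing this BEN imbalance would improve the disease state in MDD.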
林海onrush (2024-07-31 22:12):
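#paper, DOI: 10.1109/LCSYS.2022.3166446, Deep Reinforcement Learning for Market Making Under a Hawkes Process-Based Limit Order Book Model. This paper studies a market-making strategy based on deep reinforcement learning (DRL). It trains the controller on a multivariate Hawkes process simulator to address the optimal market-making problem under a limit order book (LOB). Within a simplified LOB framework, the model accounts for how order arrival rates respond dynamically to the market maker's control strategy, which keeps the model tractable. The DRL strategy outperforms traditional market-making benchmarks, such as the Avellaneda-Stoikov model and other linear strategies, in both returns and risk management, with higher mean returns, more favorable Sharpe ratios, and lower inventory risk. This demonstrates the feasibility of DRL-based market making under a Hawkes process-based LOB model. Future work could consider more complex market-making models, Hawkes processes with other kernel types, and adversarial reinforcement learning to improve generalization and robustness under uncertainty.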
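To make the "Hawkes process simulator" in the comment above more concrete, here is a minimal sketch of simulating order arrival times with Ogata's thinning algorithm. It is a deliberate simplification and not the paper's simulator: it is univariate rather than multivariate, it assumes an exponential kernel, and the function names and parameter values (`mu`, `alpha`, `beta`) are illustrative assumptions.

```python
import math
import random

def hawkes_intensity(t, events, mu, alpha, beta):
    """Intensity mu + sum(alpha * exp(-beta * (t - s))) over past events s <= t."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s <= t)

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on (0, horizon] with Ogata's thinning algorithm.

    Assumption: exponential kernel, so the intensity only decays between
    events and its current value is a valid upper bound until the next event.
    """
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        lam_bar = hawkes_intensity(t, events, mu, alpha, beta)  # upper bound
        t += rng.expovariate(lam_bar)                           # candidate time
        if t > horizon:
            return events
        # Accept the candidate with probability intensity(t) / lam_bar.
        if rng.random() * lam_bar <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)

# Illustrative parameters; alpha / beta < 1 keeps the process stable.
arrivals = simulate_hawkes(mu=1.0, alpha=0.8, beta=1.2, horizon=50.0)
print(len(arrivals), "simulated order arrivals")
```

Each accepted arrival raises the intensity by `alpha`, so arrivals cluster in bursts, which is the self-exciting behaviour that makes Hawkes processes a common choice for order flow.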
钟鸣 (2024-07-31 22:11):
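#paper doi:10.3389/frsps.2024.1359672 The influence of recall direction on judgments of subjective temporal distance from the beginning of the COVID-19 pandemic lockdowns. People can recall a stretch of history in two ways: forward recall (from past to present) and backward recall (reverse chronological order, from present to past). In general, backward recall makes the past feel psychologically closer than forward recall does. But in the authors' study of about 80 university students, covering 7 landmark events around the start of COVID-19, this result was reversed: participants who recalled forward felt that the events were closer in time than those who recalled backward. The authors analyze four possible causes: 1) people were rather bored during COVID; 2) the spotlight effect (the more vivid an event, the more recent it feels); 3) event valence, that is, emotion is intensified by forward chronological narration rather than by reverse narration; 4) the open-endedness of the pandemic makes backward recall feel more effortful, which leads to judgments of greater temporal distance.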
Introduction: In a series of 5 studies, Lam and Buehler found that first-year university students felt closer to a target event (the day they learned that they were accepted into university) when they recalled a stream of related events in a backward direction (a reverse-chronological order ending with the target event) than when they recalled those events in a forward direction (a forward-chronological order beginning with the target event). Methods: In a conceptual replication of their Study 2, we asked participants how close they felt to the first day that lockdowns were imposed in response to the Covid-19 pandemic in the U.S. (federally mandated on March 13, 2020) following either backward or forward recall of a stream of related events. Results: The results of the present study ran directly counter to those of Lam and Buehler: participants rated the first day of lockdowns as feeling closer following forward recall than following backward recall. Discussion: Potential explanations for this reversal of Lam and Buehler's effect are discussed that focus on the temporal distortions that people have been found to experience when they think about autobiographical events that occurred at the beginning of the pandemic.
符毓 Yu (2024-07-31 21:51):
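#paper doi.org/10.48550/arXiv.2312.06512, 2024, Stoch BiRo: Design and Control of a low cost bipedal robot. The bipedal platform proposed in this paper stands out for its capable walking, low computational demands, and lightweight hardware design. The reinforcement learning reward function is built on motion-imitation rewards (mirroring and following a reference animation) and does not prioritize keeping the whole robot's IMU level, which cut down a lot of the torque simulation data.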
arXiv, 2023-12-11T16:39:11Z. DOI: 10.48550/arXiv.2312.06512
This paper introduces the Stoch BiRo, a cost-effective bipedal robot designed with a modular mechanical structure having point feet to navigate uneven and unfamiliar terrains. The robot employs proprioceptive actuation in abduction, hips, and knees, leveraging a Raspberry Pi4 for control. Overcoming computational limitations, a Learning-based Linear Policy controller manages balance and locomotion with only 3 degrees of freedom (DoF) per leg, distinct from the typical 5DoF in bipedal systems. Integrated within a modular control architecture, these controllers enable autonomous handling of unforeseen terrain disturbances without external sensors or prior environment knowledge. The robot's policies are trained and simulated using MuJoCo, transferring learned behaviors to the Stoch BiRo hardware for initial walking validations. This work highlights the Stoch BiRo's adaptability and cost-effectiveness in mechanical design, control strategies, and autonomous navigation, promising diverse applications in real-world robotics scenarios.
muton (2024-07-31 18:12):
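#paper: Hu, H., Li, A., Zhang, L., Liu, C., Shi, L., Peng, X., ... & Xue, G. (2024). Goal-directed attention transforms both working and long-term memory representations in the human parietal cortex. PLoS biology, 22(7), e3002721. In daily life we process a great deal of information and store it in long-term memory, but when multiple pieces of information are present at once, how does selective attention handle the competing information and reduce the influence of distractors? Existing working-memory research has found that attention can protect target representations through flexible coding and mapping mechanisms. It remains unclear, however, whether these mechanisms also support memory encoding and long-term retrieval of target representations with more complex picture stimuli and over longer time scales. Combining a selective attention task with a recognition task one day later, this study reveals the neural mechanisms by which selective attention affects working memory and long-term memory. First, using classifier decoding, the study found that both forms of top-down attention, perceptual and reflective, enhance target representations and suppress distractor representations after the effective cue appears, with perceptual attention having the stronger modulatory effect. In addition, compared with the visual cortex, the parietal cortex suppresses distractors more strongly: in the perceptual attention condition, the decoding probability for distractors did not even differ significantly from that for picture categories that were never shown, indicating that the parietal cortex is more resistant to interference and more subject to top-down attentional modulation. Moreover, although both the parietal and the visual cortex represented target information, in the perceptual attention condition with a distractor picture the target was represented more in the dorsolateral parietal cortex, whereas in the baseline condition without distractors it was represented more in the visual cortex. This suggests that, in the face of interference, the regions maintaining the target representation partially shift. The study also found that the lower the similarity between target and distractor representations during recognition, the better participants' overall memory performance. These results indicate that human long-term memory processing is modulated by attention, and they also demonstrate the dynamic nature of episodic memory.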
IF:7.800Q1 PLoS biology, 2024-Jul. DOI: 10.1371/journal.pbio.3002721 PMID: 39008524 PMCID:PMC11271952
The abundance of distractors in the world poses a major challenge to our brain's limited processing capacity, but little is known about how selective attention modulates stimulus representations in the brain to reduce interference and support durable target memory. Here, we collected functional magnetic resonance imaging (fMRI) data in a selective attention task in which target and distractor pictures of different visual categories were simultaneously presented. Participants were asked to selectively process the target according to the effective cue, either before the encoding period (i.e., perceptual attention) or the maintenance period (i.e., reflective attention). On the next day, participants were asked to perform a memory recognition task in the scanner in which the targets, distractors, and novel items were presented in a pseudorandom order. Behavioral results showed that perceptual attention was better at enhancing target memory and reducing distractor memory than reflective attention, although the overall memory capacity (memory for both target and distractor) was comparable. Using multiple-voxel pattern analysis of the neural data, we found more robust target representation and weaker distractor representation in working memory for perceptual attention than for reflective attention. Interestingly, perceptual attention partially shifted the regions involved in maintaining the target representation from the visual cortex to the parietal cortex. Furthermore, the targets and distractors simultaneously presented in the perceptual attention condition showed reduced pattern similarity in the parietal cortex during retrieval compared to items not presented together. This neural pattern repulsion positively correlated with individuals' recognition of both targets and distractors. These results emphasize the critical role of selective attention in transforming memory representations to reduce interference and improve long-term memory performance.
小年 (2024-07-31 15:47):
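#paper DOI: 10.1016/j.csbj.2024.03.030 A novel framework for human leukocyte antigen (HLA) genotyping using probe capture-based targeted next-generation sequencing and computational analysis. This paper presents an analysis workflow for human leukocyte antigen (HLA) genotyping that uses probe capture-based targeted next-generation sequencing and computational analysis. Instead of the commonly used IPD-IMGT/HLA database, the team used Human Pangenome Reference Consortium (HPRC) resources as the benchmark HLA reference, which enriches the available HLA reference data and offers a new way to address problems with traditional references. On the algorithm side, the team compared five open-source tools (OptiType, HLA-VBseq, HISAT-genotype, SpecHLA, and T1K). No single tool typed everything correctly, but a combined analysis with T1K, DRAGEN, and QzType brought HLA genotyping accuracy to 100%. The study demonstrates the effectiveness of probe capture-based targeted sequencing for HLA typing, and shows that an integrated multi-tool analysis in particular can improve typing accuracy.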
Human leukocyte antigen (HLA) genes play pivotal roles in numerous immunological applications. Given the immense number of polymorphisms, achieving accurate high-throughput HLA typing remains challenging. This study aimed to harness the human pan-genome reference consortium (HPRC) resources as a potential benchmark for HLA reference materials. We meticulously annotated specific four field-resolution alleles for 11 HLA genes (HLA, , , , , , , , , and ) from 44 high-quality HPRC personal genome assemblies. For sequencing, we crafted HLA-specific probes and conducted capture-based targeted sequencing of the genomic DNA of the HPRC cohort, ensuring focused and comprehensive coverage of the HLA region of interest. We used publicly available short-read whole-genome sequencing (WGS) data from identical samples to offer a comparative perspective. To decipher the vast amount of sequencing data, we employed seven distinct software tools: OptiType, HLA-VBseq, HISAT genotype, SpecHLA, T1K, QzType, and DRAGEN. Each tool offers unique capabilities and algorithms for HLA genotyping, allowing comprehensive analysis and validation of the results. We then compared these results with benchmarks derived from personal genome assemblies. Our findings present a comprehensive four-field-resolution HLA allele annotation for 44 HPRC samples. Significantly, our innovative targeted next-generation sequencing (NGS) approach for HLA genes showed superior accuracy compared with conventional short-read WGS. An integrated analysis involving QzType, T1K, and DRAGEN was developed, achieving 100% accuracy for all 11 HLA genes. In conclusion, our study highlighted the combination of targeted short-read sequencing and astute computational analysis as a robust approach for HLA genotyping. Furthermore, the HPRC cohort has emerged as a valuable assembly-based reference in this realm.
尹志 (2024-07-31 15:46):
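#paper Machine learning-aided generative molecular design, nature machine intelligence, DOI: 10.1038/s42256-024-00843-5. This paper reviews generative models for molecular design. It summarizes the field in terms of representations, generative methods, and optimization strategies, and does so very clearly. If you are interested, go straight to the tables in the paper; they are very helpful for understanding how the field has developed and for finding research questions to work on.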
哪有情可长 (2024-07-31 15:18):
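#paper Biofortification of iron content by regulating a NAC transcription factor in maize, Science, 7 Dec 2023, DOI: 10.1126/science.adf3256. The iron content of maize kernels matters for human health. This study found that a NAC transcription factor can raise Fe content in kernels; the gene shows enriched expression in the basal endosperm transfer layer of the kernel, and the protein directly regulates the mRNA abundance of iron transporters. The group also bred iron-rich maize varieties. The workflow, which starts from GWAS to find candidate genes, is logically tight and verifies step by step that this gene is indeed the candidate behind iron-rich kernels; the approach is worth borrowing.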
Iron (Fe) deficiency remains widespread among people in developing countries. To help solve this problem, breeders have been attempting to develop maize cultivars with high yields and high Fe concentrations in the kernels. We conducted a genome-wide association study and identified a gene, (), that regulates Fe concentrations in maize kernels. We cultivated maize varieties with both high yield and high Fe concentrations in their kernels by using a molecular marker developed from a 42-base pair insertion or deletion (indel) in the promoter of . expression is enriched in the basal endosperm transfer layer of kernels, and the ZmNAC78 protein directly regulates messenger RNA abundance of Fe transporters. Our results thus provide an approach to develop maize varieties with Fe-enriched kernels.
徐炳祥 (2024-07-31 13:54):
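#paper doi: 10.1101/2024.04.18.590148 bioRxiv, 2024, Droplet Hi-C for Fast and Scalable Profiling of Chromatin Architecture in Single Cells. Single-cell Hi-C is currently the main technique for single-cell 3D genome research, but existing single-cell Hi-C methods suffer from low throughput, complex experimental workflows, and poor-quality single-cell libraries. In this preprint, the authors present a microfluidics-based improvement of single-cell Hi-C called Droplet Hi-C. It moves the barcoding step of single-cell Hi-C onto an automated microfluidic platform, which greatly increases the automation and efficiency of library construction. Droplet Hi-C can build single-cell Hi-C libraries for more than 40,000 cells in parallel. Using this technique, the authors analyzed chromatin conformation maps in mouse brain neurons, studied the distribution of extrachromosomal DNA in colorectal cancer cell lines and tissue, and achieved high-throughput joint profiling of single-cell Hi-C and the transcriptome. It should be noted that the paper only improves the efficiency of library construction and does not improve the quality of each individual single-cell library, which may be the next major technical bottleneck for the field.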
Comprehensive analysis of chromatin architecture is crucial for understanding the gene regulatory programs during development and in disease pathogenesis, yet current methods often inadequately address the unique challenges presented by analysis of heterogeneous tissue samples. Here, we introduce Droplet Hi-C, which employs a commercial microfluidic device for high-throughput, single-cell chromatin conformation profiling in droplets. Using Droplet Hi-C, we mapped the chromatin architecture at single-cell resolution from the mouse cortex and analyzed gene regulatory programs in major cortical cell types. Additionally, we used this technique to detect copy number variation (CNV), structural variations (SVs) and extrachromosomal DNA (ecDNA) in cancer cells, revealing clonal dynamics and other oncogenic events during treatment. We further refined this technique to allow for joint profiling of chromatin architecture and transcriptome in single cells, facilitating a more comprehensive exploration of the links between chromatin architecture and gene expression in both normal tissues and tumors. Thus, Droplet Hi-C not only addresses critical gaps in chromatin analysis of heterogeneous tissues but also emerges as a versatile tool enhancing our understanding of gene regulation in health and disease.
前进 (2024-07-31 11:35):
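#paper DOI:https://doi.org/10.48550/arXiv.2006.16236 Katharopoulos A, Vyas A, Pappas N, et al. Transformers are RNNs: Fast autoregressive transformers with linear attention[C]//International conference on machine learning. PMLR, 2020: 5156-5165. This paper proposes a linear Transformer model. By expressing self-attention as a linear dot-product of kernel feature maps and exploiting the associativity of matrix products, it reduces the complexity of a standard Transformer on long sequences from O(N^2) to O(N). The authors show that the new model not only performs similarly to standard Transformers but is also up to 4000x faster on autoregressive prediction of long sequences. The paper also examines the relationship between Transformers and recurrent neural networks (RNNs), showing that, with an appropriate reformulation, a Transformer can perform autoregressive prediction as efficiently as an RNN.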
arXiv, 2020-06-29T17:55:38Z. DOI: 10.48550/arXiv.2006.16236
Transformers achieve remarkable performance in several tasks but due to their quadratic complexity, with respect to the input's length, they are prohibitively slow for very long sequences. To address this limitation, we express the self-attention as a linear dot-product of kernel feature maps and make use of the associativity property of matrix products to reduce the complexity from $\mathcal{O}\left(N^2\right)$ to $\mathcal{O}\left(N\right)$, where $N$ is the sequence length. We show that this formulation permits an iterative implementation that dramatically accelerates autoregressive transformers and reveals their relationship to recurrent neural networks. Our linear transformers achieve similar performance to vanilla transformers and they are up to 4000x faster on autoregressive prediction of very long sequences.
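The associativity trick and the RNN view are easier to see in code. The sketch below is a toy, pure-Python illustration rather than the authors' implementation: it computes causal attention once in the quadratic way and once with a running state, using elu(x) + 1 as the feature map, and checks that the two agree. Function names, sizes, and the random inputs are illustrative assumptions.

```python
import math
import random

def phi(x):
    """Feature map elu(x) + 1 applied elementwise; keeps every feature positive."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def causal_attention_quadratic(Q, K, V):
    """Reference: position i attends over all j <= i, O(N^2) work in total."""
    out = []
    for i, q in enumerate(Q):
        fq = phi(q)
        weights = [dot(fq, phi(K[j])) for j in range(i + 1)]
        norm = sum(weights)
        out.append([sum(w * V[j][c] for j, w in enumerate(weights)) / norm
                    for c in range(len(V[0]))])
    return out

def causal_attention_linear(Q, K, V):
    """RNN view: carry S = sum phi(k_j) v_j^T and z = sum phi(k_j), O(N) work."""
    d, m = len(K[0]), len(V[0])
    S = [[0.0] * m for _ in range(d)]   # running d x m state matrix
    z = [0.0] * d                       # running normalizer
    out = []
    for q, k, v in zip(Q, K, V):
        fk = phi(k)
        for a in range(d):              # state update for the new key/value
            z[a] += fk[a]
            for c in range(m):
                S[a][c] += fk[a] * v[c]
        fq = phi(q)
        norm = dot(fq, z)
        out.append([sum(fq[a] * S[a][c] for a in range(d)) / norm
                    for c in range(m)])
    return out

# Toy check on random inputs: sequence length 6, key size 3, value size 2.
rng = random.Random(0)
Q = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(6)]
K = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(6)]
V = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(6)]
slow = causal_attention_quadratic(Q, K, V)
fast = causal_attention_linear(Q, K, V)
print(max(abs(x - y) for r1, r2 in zip(slow, fast) for x, y in zip(r1, r2)))
```

The state `(S, z)` has a fixed size that does not depend on the sequence length, which is why each autoregressive step costs constant time in this formulation.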
李翛然 (2024-07-30 20:10):
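#paper DOI:10.1101/2023.08.08.552403 Identifying small-molecules binding sites in RNA conformational ensembles with SHAMAN. How to put it: this paper was clearly written by computational people. Let me explain why. It introduces a computational technique called SHAMAN that identifies potential small-molecule binding sites in RNA structural ensembles. Unlike other computational tools that rely on static structures, SHAMAN aims to address the challenges posed by the dynamic nature of RNA molecules. It identifies potential binding sites by analyzing a conformational ensemble of the RNA rather than a single static structure, which is especially useful for understanding how small molecules interact with the flexibility and dynamics of RNA. The key point is how the RNA conformations are determined, and this is how they do it: 1. Use molecular dynamics (MD) simulation to generate the RNA conformational ensemble. The paper mentions 100 ns of MD with the Amber force field and the TIP3P water model. 2. Extract a representative set of RNA conformations from the MD trajectory. The authors cluster the trajectory and take the cluster centers as representative conformations. 3. Analyze these representative conformations to identify sites where small molecules might bind. SHAMAN is the tool that analyzes these ensembles and predicts likely binding sites. This is quite a stretch: using clustering to pick the most likely RNA structures, really? The TIP3P water model is already the bare minimum that biology can tolerate, and they simulate RNA under those conditions and then pick conformations by mathematical clustering. A bit of a stretch! It could use some heckling from bench scientists, haha.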
Abstract: The rational targeting of RNA with small molecules is hampered by our still limited understanding of RNA structural and dynamic properties. Most in silico tools for binding site identification rely on static structures and therefore cannot face the challenges posed by the dynamic nature of RNA molecules. Here, we present SHAMAN, a computational technique to identify potential small-molecule binding sites in RNA structural ensembles. SHAMAN enables exploring the conformational landscape of RNA with atomistic molecular dynamics and at the same time identifying RNA pockets in an efficient way with the aid of probes and enhanced-sampling techniques. In our benchmark composed of large, structured riboswitches as well as small, flexible viral RNAs, SHAMAN successfully identified all the experimentally resolved pockets and ranked them among the most favorite probe hotspots. Overall, SHAMAN sets a solid foundation for future drug design efforts targeting RNA with small molecules, effectively addressing the long-standing challenges in the field.
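For context on the step the comment above objects to, here is a minimal sketch of "cluster the trajectory, keep one representative frame per cluster". It only illustrates the general idea as the comment describes it; the abstract does not spell this step out, and nothing here is SHAMAN's code. Frames are assumed to be already reduced to small feature vectors, and the function names, the farthest-point initialisation, and the plain k-means loop are all illustrative assumptions.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def representative_frames(frames, k, iterations=20):
    """Return the index of one representative frame per cluster.

    frames -- list of per-frame feature vectors (hypothetical input, e.g.
              a few distances or angles extracted from each MD snapshot)
    k      -- number of clusters to form
    """
    # Farthest-point initialisation keeps the sketch deterministic.
    centers = [list(frames[0])]
    while len(centers) < k:
        farthest = max(frames,
                       key=lambda f: min(squared_distance(f, c) for c in centers))
        centers.append(list(farthest))
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda i: squared_distance(f, centers[i]))
            clusters[nearest].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster went empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    # For each cluster center, report the closest real frame.
    return [min(range(len(frames)), key=lambda j: squared_distance(frames[j], c))
            for c in centers]
```

Whether frames chosen this way reflect biologically relevant RNA states is exactly the point in dispute: the clustering guarantees geometric coverage of the sampled trajectory and nothing more.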
张浩彬 (2024-07-29 13:18):
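#paper DOI: https://doi.org/10.1038/s41586-024-07566-y, AI models collapse when trained on recursively generated data. A Nature paper on synthetic corpora for large models. It discusses how adding synthetic text to the training data (possibly unintentionally, since online material already contains a large amount of LLM-generated text) leads to model collapse. Of course, synthetic corpora are still an effective way to train large models, but the synthetic data has to be filtered carefully.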
IF:50.500Q1 Nature, 2024-07-24T15:01:51. DOI: 10.1038/s41586-024-07566-y
Abstract: Stable diffusion revolutionized image creation from descriptive text. GPT-2 (ref. 1), GPT-3(.5) (ref. 2) and GPT-4 (ref. 3) demonstrated high performance across a variety of language tasks. ChatGPT introduced such language models to the public. It is now clear that generative artificial intelligence (AI) such as large language models (LLMs) is here to stay and will substantially change the ecosystem of online text and images. Here we consider what may happen to GPT-{n} once LLMs contribute much of the text found online. We find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models, in which tails of the original content distribution disappear. We refer to this effect as ‘model collapse’ and show that it can occur in LLMs as well as in variational autoencoders (VAEs) and Gaussian mixture models (GMMs). We build theoretical intuition behind the phenomenon and portray its ubiquity among all learned generative models. We demonstrate that it must be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web. Indeed, the value of data collected about genuine human interactions with systems will be increasingly valuable in the presence of LLM-generated content in data crawled from the Internet.
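The "tails disappear" effect can be reproduced in a toy setting that is much simpler than an LLM. The sketch below is an illustration under stated assumptions, not the paper's experiment: each generation fits a single Gaussian to a small sample drawn from the previous generation's fit, with no fresh real data mixed in. The function name, the sample size of 10, and the 200 generations are arbitrary illustrative choices.

```python
import random
import statistics

def recursive_gaussian_fit(generations, sample_size, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation after each generation, starting
    with the true value 1.0 at index 0.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0                      # generation 0: the "real" data
    history = [sigma]
    for _ in range(generations):
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)        # refit on purely synthetic data
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history

spread = recursive_gaussian_fit(generations=200, sample_size=10)
print(f"std after   0 generations: {spread[0]:.3g}")
print(f"std after 200 generations: {spread[-1]:.3g}")
```

Because every refit slightly underestimates the spread on average and the errors compound, the fitted distribution narrows generation after generation, which is a small-scale analogue of the loss of distribution tails described in the abstract. Mixing real data back in at each step is the obvious variation to try.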