1,435 paper shares found; this page shows entries 1 - 20.
1.
林海onrush (2026-01-31 23:55):
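#paper, DOI: arXiv:2406.03816, ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search. This paper proposes ReST-MCTS*, a self-training framework for large language models that combines process rewards with a modified Monte Carlo tree search (MCTS*). It targets a weakness of existing self-training methods, which filter only on the final answer and therefore tend to admit low-quality intermediate reasoning. Given only the correct final answers, the method uses repeated rollouts during tree search to estimate the probability that each intermediate step leads to a correct solution. The resulting process reward signals are used to train both the policy model and the process reward model. Experiments show that, under the same search budget, ReST-MCTS* achieves higher reasoning accuracy than Best-of-N and Tree-of-Thought, keeps improving the model over multiple self-training iterations, and significantly outperforms earlier self-training approaches such as ReST^EM and Self-Rewarding, supporting its effectiveness at collecting high-quality reasoning traces and at stable self-improvement.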
arXiv, 2024-06-06T07:40:00Z. DOI: 10.48550/arXiv.2406.03816
Abstract:
Recent methodologies in LLM self-training mostly rely on LLM generating responses and filtering those with correct output answers as training data. This approach often yields a low-quality fine-tuning training set (e.g., incorrect plans or intermediate reasoning). In this paper, we develop a reinforced self-training approach, called ReST-MCTS*, based on integrating process reward guidance with tree search MCTS* for collecting higher-quality reasoning traces as well as per-step value to train policy and reward models. ReST-MCTS* circumvents the per-step manual annotation typically used to train process rewards by tree-search-based reinforcement learning: Given oracle final correct answers, ReST-MCTS* is able to infer the correct process rewards by estimating the probability this step can help lead to the correct answer. These inferred rewards serve dual purposes: they act as value targets for further refining the process reward model and also facilitate the selection of high-quality traces for policy model self-training. We first show that the tree-search policy in ReST-MCTS* achieves higher accuracy compared with prior LLM reasoning baselines such as Best-of-N and Tree-of-Thought, within the same search budget. We then show that by using traces searched by this tree-search policy as training data, we can continuously enhance the three language models for multiple iterations, and outperform other self-training algorithms such as ReST$^\text{EM}$ and Self-Rewarding LM. We release all code at https://github.com/THUDM/ReST-MCTS.
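The idea in the note above, inferring a per-step reward from rollouts when only the final answer can be checked, can be illustrated with a toy Monte Carlo estimate. This is a minimal sketch under stated assumptions, not the paper's MCTS* procedure: the task (reach a target sum with steps of +1 or +2), the random `rollout` policy, and every function name are hypothetical.

```python
import random

def rollout(partial_steps, target, max_steps, rng):
    """Hypothetical policy: finish the trace with random +1/+2 steps.

    Only the final result is checked, mirroring the setting where
    just the correct final answer is known.
    """
    total = sum(partial_steps)
    for _ in range(max_steps - len(partial_steps)):
        if total >= target:
            break
        total += rng.choice([1, 2])
    return total == target

def estimate_step_values(trace, target, max_steps, n_rollouts=2000, seed=0):
    """For each prefix of `trace`, estimate P(final answer correct | prefix)."""
    rng = random.Random(seed)
    values = []
    for k in range(1, len(trace) + 1):
        prefix = trace[:k]
        wins = sum(rollout(prefix, target, max_steps, rng) for _ in range(n_rollouts))
        values.append(wins / n_rollouts)
    return values

# Per-step value estimates for a trace that ends at the target.
values = estimate_step_values(trace=[2, 2, 1], target=5, max_steps=4)
print(values)
```

The estimates play the role of inferred process rewards: a step whose prefix rarely leads to a correct final answer gets a low value, even though no step was labelled by hand.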
2.
尹志 (2026-01-31 23:53):
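#paper https://arxiv.org/abs/2601.21571. arXiv 2026. Shaping capabilities with token-level data filtering. Moving from document-level to token-level filtering is a fairly straightforward idea, but drawing real insight out of solid engineering is very much Alec's style.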
arXiv, 2026-01-29T11:34:01Z. DOI: 10.48550/arXiv.2601.21571
Abstract:
Current approaches to reducing undesired capabilities in language models are largely post hoc, and can thus be easily bypassed by adversaries. A natural alternative is to shape capabilities during pretraining itself. On the proxy task of removing medical capabilities, we show that the simple intervention of filtering pretraining data is highly effective, robust, and inexpensive at scale. Inspired by work on data attribution, we show that filtering tokens is more effective than filtering documents, achieving the same hit to undesired capabilities at a lower cost to benign ones. Training models spanning two orders of magnitude, we then demonstrate that filtering gets more effective with scale: for our largest models, token filtering leads to a 7000x compute slowdown on the forget domain. We also show that models trained with token filtering can still be aligned on the forget domain. Along the way, we introduce a methodology for labeling tokens with sparse autoencoders and distilling cheap, high-quality classifiers. We also demonstrate that filtering can be robust to noisy labels with sufficient pretraining compute.
3.
钟鸣 (2026-01-31 23:42):
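#paper doi:10.1038/s41598-024-54874-4 Python farming as a flexible and efficient form of agricultural food security. This study evaluates the growth efficiency of two large python species (the reticulated python and the Burmese python) on farms and explores their potential as a new form of livestock. The authors periodically measured snout-vent length, body mass, and food intake over 12 months, and recorded body-mass changes in pythons that fasted for 20 days. Burmese pythons reached much higher daily mass gains than reticulated pythons (up to 42.6 g/day vs. up to 19.7 g/day), females grew faster than males, and early growth rate (first 2 months) and total food intake were the key predictors of growth over 12 months. The fasting experiment showed an average loss of only 0.004% of body mass per day, with rapid growth resuming once feeding restarted, so production is little affected by fluctuations in feed supply. Pythons are far more energy-efficient than endothermic livestock, probably because they are ectotherms; other ectotherms such as salmon and crickets have comparable, slightly higher, energy efficiency. High energy efficiency and tolerance of fasting highlight the potential of python farming, although cultural attitudes toward snakes and the large-scale supply of their food (chicks, rodents) still need to be considered.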
Abstract:
Diminishing natural resources and increasing climatic volatility are impacting agri-food systems, prompting the need for sustainable and resilient alternatives. Python farming is well established in Asia but has received little attention from mainstream agricultural scientists. We measured growth rates in two species of large pythons (Malayopython reticulatus and Python bivittatus) in farms in Thailand and Vietnam and conducted feeding experiments to examine production efficiencies. Pythons grew rapidly over a 12-month period, and females grew faster than males. Food intake and growth rates early in life were strong predictors of total lifetime growth, with daily mass increments ranging from 0.24 to 19.7 g/day for M. reticulatus and 0.24 to 42.6 g/day for P. bivittatus, depending on food intake. Pythons that fasted for up to 4.2 months lost an average of 0.004% of their body mass per day, and resumed rapid growth as soon as feeding recommenced. Mean food conversion rate for dressed carcasses was 4.1%, with useable products (dressed carcass, skin, fat, gall bladder) comprising 82% of the mass of live animals. In terms of food and protein conversion ratios, pythons outperform all mainstream agricultural species studied to date. The ability of fasting pythons to regulate metabolic processes and maintain body condition enhances food security in volatile environments, suggesting that python farming may offer a flexible and efficient response to global food insecurity.
4.
李翛然 (2026-01-31 23:04):
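#paper doi:10.1038/s41586-025-10014-0 Nature. Advancing regulatory variant effect prediction with AlphaGenome. AlphaGenome is a deep learning model for uniformly interpreting the non-coding "dark matter" of DNA. It takes DNA sequences of up to 1 megabase (1 Mb) as input and simultaneously predicts thousands of functional genomic signals (such as chromatin accessibility, transcription factor binding, and splicing) at single-base resolution. In terms of performance, AlphaGenome reaches state of the art on 22 of 24 genome track prediction tasks and on 25 of 26 variant effect prediction tasks. It accurately predicts how non-coding variants affect gene regulation; for example, it recovers the mechanism of variants near the leukemia-associated oncogene TAL1. Personally, much like KEGG, it did not strike me as particularly eye-opening.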
IF: 50.500, Q1. Nature, 2026-01-29. DOI: 10.1038/s41586-025-10014-0 PMID: 41606153
Abstract:
Deep learning models that predict functional genomic measurements from DNA sequences are powerful tools for deciphering the genetic regulatory code. Existing methods involve a trade-off between input sequence length and prediction resolution, thereby limiting their modality scope and performance (refs. 1–5). We present AlphaGenome, a unified DNA sequence model, which takes as input 1 Mb of DNA sequence and predicts thousands of functional genomic tracks up to single-base-pair resolution across diverse modalities. The modalities include gene expression, transcription initiation, chromatin accessibility, histone modifications, transcription factor binding, chromatin contact maps, splice site usage and splice junction coordinates and strength. Trained on human and mouse genomes, AlphaGenome matches or exceeds the strongest available external models in 25 of 26 evaluations of variant effect prediction. The ability of AlphaGenome to simultaneously score variant effects across all modalities accurately recapitulates the mechanisms of clinically relevant variants near the TAL1 oncogene (ref. 6). To facilitate broader use, we provide tools for making genome track and variant effect predictions from sequence.
5.
半面阳光 (2026-01-31 22:47):
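#paper doi: https://doi.org/10.1038/s41467-025-67218-1. Nature Communications. 2025. Flexible read-aware genotype imputation from sequence using biobank sized reference panels. Building on the earlier QUILT genotype imputation method, this paper develops a new method, QUILT2. It uses large-scale haplotype reference panels to perform fast haplotype inference and genotype imputation from low-coverage whole-genome sequencing reads and cell-free DNA. QUILT2 also includes a methodological innovation aimed at imputing both the maternal and the fetal genome from NIPT data.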
Abstract:
Inexpensive and accurate genotyping methods are essential to modern genomics and health risk prediction. Here we introduce QUILT2, a scalable and read-aware imputation method that can efficiently use biobank scale haplotype reference panels. This allows for fast and accurate imputation using short reads, as well as long reads (e.g. Oxford Nanopore Technologies (ONT) 1X, r2 = 0.937 at common SNPs), linked-reads and ancient DNA. In addition, QUILT2 contains a methodological innovation that is designed to enable imputation of the maternal and fetal genome using cell free non-invasive prenatal testing (NIPT) data. Using a UK Biobank reference panel and simulated NIPT data, we see accurate imputation of the mother (0.25X, r2 = 0.966, common SNPs) and modest imputation of the fetus (0.25X, r2 = 0.465, fetal fraction of 10%) at low coverage, with fetal imputation accuracy rising with coverage (4.0X, fetal r2 = 0.894). We show using simulated data that this could enable both GWAS and PRS for the mother and fetus, which could create clinical opportunities, and if phenotypes can be collected alongside clinical NIPT, the potential for large GWAS.
6.
哪有情可长 (2026-01-31 21:29):
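#paper Multi-omics identifies key genetic and metabolic networks regulating spike organ development in wheat. Plant Cell, 18 October 2025. doi.org/10.1093/plcell/koaf250. Wheat is a major global food crop, and spike development is the core process determining key yield traits such as grain number per spike and grain size, yet the complex regulatory interplay between genes and metabolites remains unclear. Using the cultivar "Longchun 35", the authors sampled spikelets, rachis, florets (including ovary and anther), awns, and other tissues across 12 key developmental stages, and combined LC-MS/MS metabolomics with transcriptome sequencing to build a multi-omics atlas of wheat spike development with high spatiotemporal resolution. They found tissue-specific enrichment of metabolites and showed how the spatiotemporal distribution of hormones affects spike morphology. They identified TaOPR3, GL1, and GL2 as key genes regulating grain size and showed that their superior haplotypes have been used in modern breeding. The atlas dissects the interaction between metabolites and gene expression networks and offers a new perspective on the molecular basis of wheat yield.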
Abstract:
Wheat (Triticum aestivum L.) spike development is tightly regulated by genetic and metabolic programs that drive organ growth and morphological changes. However, the dynamic interplay between metabolic shifts, gene expression patterns, and their regulatory roles during spike development, remains poorly characterized. To address this knowledge gap, we performed integrated metabolomic and transcriptomic profiling across 12 stages of wheat spike organ development. Our analysis detected 1,105 metabolites in 233 spike, spikelet, and floret samples, uncovering an uneven distribution of phytohormone-related metabolites. The exogenous phytohormone treatments validated the regulatory roles of phytohormones in spike morphogenesis. High-resolution spatiotemporal data from carpel organs enabled the reconstruction of a regulatory network, identifying key genes (including 12-oxo-phytodienoic acid reductase3 (TaOPR3), Grain Length1 (GL1), and Grain Length2 (GL2)) as critical determinants of grain size. Genomic analyses revealed geographical differentiation in gene haplotypes and their selective retention during breeding, with superior alleles associated with increased grain size. This comprehensive dataset provides a valuable resource for understanding the molecular basis of wheat grain yield and offers potential targets for crop improvement.
7.
徐炳祥 (2026-01-31 21:09):
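#paper doi: 10.1038/s41592-018-0033-z Nature Methods, 2018, SAVER: gene expression recovery for single-cell RNA sequencing. This is one of the classic papers on imputing missing values in single-cell RNA-seq. The authors present an algorithm that fills in missing data for an individual cell by borrowing information from gene expression levels in other cells of the same single-cell library. The algorithm models gene expression counts with a Poisson distribution and models the Poisson mean with a Gamma distribution. The mean of this Gamma distribution is estimated by regressing gene expression levels against one another across cells, and its shape parameter is estimated by assuming that the degree of variation is constant across the cell population. The result is both imputation of missing expression levels and bias correction of the measured counts. Performance is validated on simulated and real single-cell RNA-seq datasets. The paper provides a workable theoretical framework for missing-value imputation in single-cell data and underlies many later studies.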
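The Poisson-Gamma structure described above has a closed-form posterior, which is why the shrinkage estimate is cheap to compute. The sketch below shows only that conjugate update for one gene in one cell. It assumes the prior mean and variance are already given (the numbers are made up), whereas SAVER obtains the prior mean from a regression on other genes, which is not reproduced here; the function name is hypothetical.

```python
def saver_style_posterior(y, size_factor, prior_mean, prior_var):
    """Conjugate update for y ~ Poisson(size_factor * lam), lam ~ Gamma(alpha, beta).

    The Gamma prior is parameterised by shape alpha and rate beta, chosen to
    match the given prior mean and variance. The posterior is
    Gamma(alpha + y, beta + size_factor).
    """
    alpha = prior_mean ** 2 / prior_var   # shape
    beta = prior_mean / prior_var         # rate
    post_shape = alpha + y
    post_rate = beta + size_factor
    post_mean = post_shape / post_rate    # recovered expression estimate
    post_var = post_shape / post_rate ** 2
    return post_mean, post_var

# A dropout-like zero count is pulled toward the prior mean,
# while a large observed count is pulled down toward it.
mean_zero, _ = saver_style_posterior(y=0, size_factor=1.0, prior_mean=2.0, prior_var=4.0)
mean_high, _ = saver_style_posterior(y=10, size_factor=1.0, prior_mean=2.0, prior_var=4.0)
print(mean_zero, mean_high)
```

The posterior mean always lies between the observed normalised count and the prior mean, which is the "recovery plus bias correction" behaviour the note describes.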
8.
Vincent (2026-01-31 17:31):
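#paper https://arxiv.org/abs/2201.11903 arXiv 2022. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. This paper first proposed the Chain-of-Thought (CoT) approach: explicitly providing intermediate natural-language reasoning steps in few-shot prompts can significantly improve the performance of large language models on complex reasoning tasks. The authors show notable gains from CoT across several reasoning benchmarks, especially for models with 100B+ parameters, where it appears as an ability that emerges with scale. Ablations indicate that the improvement does not come merely from "extra computation" but from the sequential, readable reasoning process itself. The method needs no additional training or fine-tuning, only prompting, which explains its wide adoption, and it opened a new direction for research on interpretable reasoning in large models.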
arXiv, 2022-01-28T02:33:07Z. DOI: 10.48550/arXiv.2201.11903
Abstract:
We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chain of thought prompting, where a few chain of thought demonstrations are provided as exemplars in prompting. Experiments on three large language models show that chain of thought prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks. The empirical gains can be striking. For instance, prompting a 540B-parameter language model with just eight chain of thought exemplars achieves state of the art accuracy on the GSM8K benchmark of math word problems, surpassing even finetuned GPT-3 with a verifier.
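Since the method is purely a matter of how the prompt is written, a small sketch of assembling a few-shot chain-of-thought prompt may help. The exemplar text, the `Q:`/`A:` layout, and the function name are illustrative assumptions; no model is called here, and the exact formatting used in the paper's experiments may differ.

```python
def build_cot_prompt(exemplars, question):
    """Join worked exemplars (question, reasoning, answer) and a new question.

    Each exemplar shows its intermediate reasoning before the final answer,
    so the model is nudged to write out its own reasoning for the last question.
    """
    parts = []
    for ex in exemplars:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['reasoning']} The answer is {ex['answer']}."
        )
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# One illustrative arithmetic exemplar with its reasoning spelled out.
exemplars = [
    {
        "question": "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
                    "Each can has 3 tennis balls. How many tennis balls does he have now?",
        "reasoning": "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
                     "6 tennis balls. 5 + 6 = 11.",
        "answer": "11",
    }
]

prompt = build_cot_prompt(
    exemplars,
    "A baker made 24 muffins and sold 9. How many muffins are left?",
)
print(prompt)
```

Dropping the `reasoning` field from the exemplars turns this into an ordinary few-shot prompt, which is the baseline the chain-of-thought variant is compared against.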
9.
符毓 (2026-01-31 13:25):
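#paper doi:10.1109/ECCE47101.2021.9595683 IEEE, 2021, Application of Flat Rectangular Wire Concentrated Winding for AC loss Reduction in Electrical Machines. As power and speed requirements for electrical machines rise, losses at high speed, especially winding AC losses, have become a pressing problem. Flat wire can minimize the winding AC losses caused by the skin effect and the proximity effect. This paper analyzes the AC losses of conventional round conductors and compares them with the proposed flat rectangular wire structure. Open-slot and semi-closed-slot stator structures are also evaluated. The results show that the flat rectangular wire structure significantly reduces AC losses. Analysis and simulation show that, compared with the open-slot stator, the semi-closed-slot stator has lower winding AC losses but higher core losses and permanent-magnet losses. The open-slot stator can accommodate preformed flat rectangular wire windings and lends itself to automated winding.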
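For a rough sense of why conductor geometry matters at high electrical frequency, the classical skin-depth formula can be evaluated directly. This is a generic textbook calculation, not taken from the paper; the copper resistivity is an approximate room-temperature value and the frequencies are arbitrary examples.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # vacuum permeability, H/m
RHO_COPPER = 1.68e-8        # approximate copper resistivity at 20 °C, ohm·m

def skin_depth(frequency_hz, resistivity=RHO_COPPER, mu_r=1.0):
    """Classical skin depth: delta = sqrt(rho / (pi * f * mu_r * mu_0)), in metres."""
    return math.sqrt(resistivity / (math.pi * frequency_hz * mu_r * MU_0))

# Skin depth shrinks as frequency rises, so a conductor that is thick
# relative to delta carries current mostly near its surface.
for f in (50, 1_000, 10_000):
    print(f"{f:>6} Hz: {skin_depth(f) * 1e3:.2f} mm")
```

When a conductor dimension is comparable to or larger than the skin depth, the current crowds toward the surface and the effective AC resistance rises, which is the loss mechanism the note refers to.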
10.
小年 (2026-01-30 11:57):
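#paper doi:10.1038/s41565-025-02080-2, Goerzen D, Heller DA, et al. Machine perception liquid biopsy identifies brain tumours via systemic immune and tumour microenvironment signature (Nature Nanotechnology, 2025). This study develops a new technique called "machine perception liquid biopsy" (MPLB). The team built a sensor array of 21 quantum-well-defect-modified single-walled carbon nanotubes and combined near-infrared fluorescence spectroscopy with the CatBoost machine learning algorithm to analyze 739 plasma samples (four brain tumour types, including glioma and meningioma, plus non-tumour samples). They report 98% accuracy for brain tumour detection, 71% accuracy for tumour typing, 89.8% accuracy in external cohort validation, and detection of low-grade (WHO grade 1-2) tumours. Quantitative proteomics of the "protein corona" on the sensor surface identified 2,017 enriched proteins, including new markers such as ENPP2 and the S100A family, which originate from tumour cells, the tumour microenvironment, and the systemic immune system. The technique gets around the scarcity of biomarkers caused by the blood-brain barrier and needs no special sample processing. It offers a new route to non-invasive early diagnosis of brain tumours, subtype discrimination, and discovery of personalized therapeutic targets, and lays technical groundwork for detecting other diseases that lack effective biomarkers.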
11.
cellsarts (2026-01-30 00:13):
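#paper DOI: 10.3389/fmicb.2019.00849, 2019-04-24. The Biogeochemical Sulfur Cycle of Marine Sediments. Abstract: Microbial dissimilatory sulfate reduction, which converts sulfate to sulfide, is the main terminal pathway of organic matter mineralization in the anoxic seabed. Chemical or microbial oxidation of the resulting sulfide forms a complex network of pathways in the sulfur cycle, producing a range of intermediate sulfur compounds and partly regenerating sulfate. These intermediates include elemental sulfur, polysulfides, thiosulfate, and sulfite, all of which can serve as substrates for further microbial oxidation, reduction, or disproportionation. Recent microbiological discoveries, such as long-distance electron transfer by sulfide-oxidizing cable bacteria, add further complexity. Isotope exchange reactions play an important role in stable isotope geochemistry and in experiments that use radiotracers to study sulfur transformations. The microbially catalyzed processes are partly reversible, and the back-reactions affect how radiotracer experiments are interpreted and provide a mechanism for isotope fractionation. The review covers the progress and current state of understanding of the seabed sulfur cycle across microbial ecology, biogeochemistry, and isotope geochemistry.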
12.
白鸟 (2026-01-29 16:02):
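#paper DOI:10.1126/science.ads9530 Title: Deep contrastive learning enables genome-wide virtual screening. Journal: Science, 2026. Summary: DrugCLIP is a deep contrastive learning framework for ultra-large-scale, ultrafast genome-wide virtual screening. Core problem: conventional molecular docking is so computationally expensive that it cannot efficiently cover the combination of the whole human genome (about 10,000+ protein targets) with a massive compound library (e.g., 500 million molecules), which amounts to trillions of interactions. Approach: contrastive learning embeds protein pockets (binding sites) and small molecules into a shared latent space in which similarity directly encodes the likelihood of protein-molecule binding. This enabled a pioneering trillion-scale genome-wide screen, a new paradigm for the post-AlphaFold era that moves from one-to-one "target-compound" screening toward systematic exploration of "whole genome × whole chemical space". Highlights. Speed: DrugCLIP can complete trillions of interactions within a day, reaching true genome scale. Accuracy: strong results on multiple benchmarks, leading on metrics such as EF1% (enrichment factor in the top 1%), with support for multi-target screening and generalization. Interpretability: visualizing the embedding space (t-SNE, etc.) shows protein-molecule matching patterns intuitively. Openness: the authors released a large-scale screening database that researchers can query or download directly, and code for an early version is open source (the NeurIPS 2023 DrugCLIP repository). Deployment and use: (1) online version: submit input files to get results; (2) GitHub open-source version: the early version is open source and callable from Python. Limitations: dependence on structures, computing resources, and experimental validation.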
Abstract:
Recent breakthroughs in protein structure prediction have opened new avenues for genome-wide drug discovery, yet existing virtual screening methods remain computationally prohibitive. We present DrugCLIP, a contrastive learning framework that achieves ultrafast and accurate virtual screening, up to 10 million times faster than docking, while consistently outperforming various baselines on in silico benchmarks. In wet-lab validations, DrugCLIP achieved a 15% hit rate for norepinephrine transporter, and structures of two identified inhibitors were determined in complex with the target protein. For thyroid hormone receptor interactor 12, a target that lacks holo structures and small-molecule binders, DrugCLIP achieved a 17.5% hit rate using only AlphaFold2-predicted structures. Finally, we released GenomeScreenDB, an open-access database providing precomputed results for ~10,000 human proteins screened against 500 million compounds, pioneering a drug discovery paradigm in the post-AlphaFold era.
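The screening step described in the note, ranking molecules by similarity to a pocket in a shared embedding space, reduces to a nearest-neighbour lookup once embeddings exist. The sketch below uses made-up 3-dimensional vectors and hypothetical names; in the real system the embeddings come from trained pocket and molecule encoders, which are not reproduced here.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def rank_molecules(pocket_vec, molecule_vecs):
    """Return (name, cosine similarity) pairs sorted from best to worst match."""
    p = normalize(pocket_vec)
    scores = {
        name: sum(a * b for a, b in zip(p, normalize(vec)))
        for name, vec in molecule_vecs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy embeddings (assumed values, for illustration only).
pocket = [0.9, 0.1, 0.0]
library = {
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.0, 1.0, 0.0],
    "mol_c": [0.5, 0.5, 0.0],
}

ranking = rank_molecules(pocket, library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because scoring is just a dot product between precomputed vectors, the cost per protein-molecule pair is tiny compared with docking, which is what makes trillion-pair screens plausible.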
13.
惊鸿 (2026-01-29 10:07):
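#paper DOI: 10.1056/NEJMoa2504747 English title: Customized Base-Editing Therapy for CPS1 Deficiency (paraphrased from the case content). Published: 2025 (New England Journal of Medicine). Core breakthrough: this study is the first to tailor an in vivo base-editing therapy to a single patient, a 6-month-old boy (KJ Muldoon) with lethal carbamoyl-phosphate synthetase 1 (CPS1) deficiency, and it completed the path from design to clinical administration within 7 months, a milestone for "N-of-1" personalized gene therapy. After two intravenous infusions (0.1 mg/kg and 0.3 mg/kg), the patient's blood ammonia fell markedly, he tolerated a higher protein intake, and he had no further hyperammonemic crises. Technical highlights: (1) Rapid platform-based development: an adenine base editor (ABE) is delivered by lipid nanoparticles (LNP), and only the guide RNA (gRNA) has to be customized to the patient's mutation, which greatly shortens development. (2) Precise editing without cutting: ABE chemically modifies the DNA base directly (A·T→G·C) without cleaving both strands, avoiding the off-target and insertion/deletion risks of conventional CRISPR-Cas9. (3) Repeat dosing: LNP delivery allows dose adjustment (a second infusion), overcoming the high immunogenicity and single-dose limitation of AAV vectors. Limitations and outlook: follow-up is short, with only 7 weeks of clinical data reported so far, so long-term safety and durability need further validation; cost and access are a concern, since personalized therapies are expensive to develop and platform-based production is needed to bring costs down; and there is room to expand, since the "platform plus customization" model could extend to other inherited liver metabolic diseases caused by point mutations, offering new hope to millions of rare-disease patients. Summary: the study not only rescued a critically ill infant but also showed that a personalized gene-editing therapy can go from concept to clinic in a very short time, marking a shift in gene therapy from "one size fits all" to "made to measure". Future work needs to focus on long-term monitoring, cost reduction, and broader indications so that precision medicine reaches more rare-disease patients. Original article: https://doi.org/10.1056/NEJMoa2504747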
14.
颜林林 (2026-01-29 01:05):
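#paper doi:10.1016/j.jbi.2025.104971, Journal of Biomedical Informatics, 2026, Augmented intelligence for multimodal virtual biopsy in breast cancer using generative artificial intelligence. This paper applies generative AI to improve diagnostic accuracy. In breast cancer diagnosis, biopsy is the gold standard, but because it is invasive and slow, practice still needs diagnostic methods based on imaging alone. The current standard image is full-field digital mammography (FFDM), which often struggles to identify lesions accurately in dense breasts. A second modality, contrast-enhanced spectral mammography (CESM), uses a contrast agent to make lesions much more visible, but its radiation dose and the risk of contrast allergy make it hard to offer to every patient. To address this, the paper uses generative AI (CycleGAN) to synthesize high-quality "virtual" CESM images when only ordinary FFDM images are available, so that diagnosis can draw on both image types and become more accurate; the authors call this "virtual biopsy". The evaluation does show that adding these synthetic images improves diagnostic accuracy compared with using the standard images alone. In my view, though, the room for improvement probably exists because the model that relies only on standard images does not make full use of the information in them. Adding generated images to supply information, rather than improving the original model directly, reminds me of the joke about the mathematician as firefighter: What do you do if there is a fire? Take out the high-pressure hose and put it out. What do you do if there is no fire? Light one first, then take out the high-pressure hose and put it out.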
15.
孤舟蓑笠翁 (2026-01-21 22:40):
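#paper [DOI] 10.1038/s41588-025-02449-y; [Year] 2026; [Journal] Nature Genetics; [Title] Protein-protein interactions shape trans-regulatory impact of genetic variation on protein expression and complex traits. [Summary] This paper asks how genetic variation, acting through protein-protein interactions, determines protein abundance in blood (expression levels) and influences complex traits such as height and disease risk. The authors reanalyzed the UK Biobank Pharma Proteomics Project (UKB-PPP) dataset, compared the effect sizes of loci that affect protein levels (pQTLs) with those that affect messenger RNA levels (eQTLs), and developed a new method called trans-PCO to find variants that regulate protein expression "at a distance" (trans regulation), with strict controls in the analysis to avoid false signals. They found that trans effects on proteins are much stronger than trans effects on RNA (a higher trans-to-cis z-score ratio), and that 89-90% of the genetic variance in protein expression comes from trans regulation, compared with only 60-79% for RNA. Genes that act in trans tend to be more important and more evolutionarily constrained (27.2% are high-pLI genes), and they are associated with 36% of GWAS loci for complex diseases (for example, rheumatoid arthritis (RA) loci), whereas cis-acting genes are associated with only 10.1%. Most importantly, they show that the protein interaction network is the main route of trans regulation: the relevant signals are enriched 5.4-fold within the interaction network. On this basis they identified 17,662 new trans loci, which link multiple seemingly unrelated disease loci (such as 6 RA risk loci) to the same protein interaction system (such as the BAFF/APRIL system), suggesting possible pathways of disease.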
16.
龙海晨 (2026-01-15 22:55):
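#paper Nordling L. Are these the happiest PhD students in the world? Nature. 2025 Oct;646(8086):1013-1016. doi: 10.1038/d41586-025-03346-4. PMID: 41116079. This is a Nature survey of PhD students' satisfaction worldwide. It drew 3,785 self-selected respondents from 107 countries: 44% identified as women, 25% as members of a minority group in the country where they study, and 33% were studying outside their home country. The survey was designed by Nature together with Thinks Insight & Strategy, a research consultancy based in London.
Brazil stands out in this survey: 83% of respondents studying there said they were satisfied with their PhD programme, well above the global average of 75%. Brazilian students were also the most positive about their experience: 80% said they enjoy their degree and 78% feel fulfilled by their work, against global averages of 70% and 72%. Only Australia comes close. Students there match Brazil on enjoyment and fulfilment and are just one percentage point lower on satisfaction. Students in Australia and Brazil were also the most likely to say that their PhD experience met their expectations (68% in Australia, 65% in Brazil).
Some of the 107 countries returned very few responses. Eight had more than 100 participants: Australia (101), Brazil (113), China (312), Germany (247), India (430), Italy (111), the United Kingdom (201), and the United States (568), so comparisons among them are the most reliable. Another ten countries had 50 to 100 respondents and were also included in the analysis; the rest had too few participants to study separately. The Australian sample has a high share of international PhD students, which may affect its satisfaction score, because globally students doing a PhD abroad report significantly higher satisfaction than those studying at home. Living costs in Australia are high, however. Russell notes that the annual PhD stipend of A$33,500 is below Australia's minimum wage of about A$49,000.
Among the eight best-sampled countries, PhD students in China and Germany reported below-average satisfaction. The 312 respondents in China scored below average on supervisor relationship, research guidance, travel opportunities, independence, enjoyment of PhD study, and sense of fulfilment. The report also has a positive side: fewer PhD students in China reported mistreatment, with only 15% saying they had been bullied, compared with 22% in other countries, and only 12% reporting discrimination or harassment, compared with 22% elsewhere.
The survey covered 690 Chinese students, of whom only 55% said they were at least somewhat satisfied with their PhD; among students in other countries the figure was 72%. Asked how far the PhD had met their expectations, 45% of Chinese PhD students said it fell short, compared with only 36% elsewhere. Only 5% said it exceeded expectations, less than half the international level. As for regrets, 22% said they would change supervisor, 36% would switch research field, and 7% would give up altogether. In the survey, 40% of Chinese PhD students said they had sought help for depression or anxiety, slightly more than in other countries (36%). Such help does not seem readily available to them: only 10% said they received effective help from their own university or institute, whereas 28% of students elsewhere could get effective help from their institution.
Countries with the highest satisfaction: Brazil, where 83% of students are satisfied, far above the global average of 75%; Australia, where satisfaction is close to Brazil's, with especially strong enjoyment and fulfilment; and Italy, where 82% of students are satisfied but satisfaction with work-life balance, pay, and similar aspects is low.
🇧🇷 Brazil: resilience behind the optimism. Overall satisfaction is high even though many students face financial hardship and long working hours. Students commonly mention that supervisor support and solidarity within the research community are key, that they feel a strong sense of mission and of contributing to society, and that an improved political climate (the former president leaving office) has brought optimism.
🇦🇺 Australia: quality of life and inclusiveness. Students enjoy a good work-life balance and an outdoor culture. Social benefits (such as medical subsidies) and a diverse, inclusive campus environment raise satisfaction. Living costs are high, however, and PhD stipends are below the minimum wage, so financial pressure remains.
🇮🇹 Italy: satisfied but challenged. Students are generally proud of their PhD, but their satisfaction with pay, independence, mental health, and similar aspects is below average. Many keep going on their passion for research, but practical pressure is heavy and nearly half are at risk of anxiety or depression.
🇨🇳 China: lowest satisfaction. Only 60% of students are satisfied, far below the global average. The main reasons: extremely long working hours (more than 80 hours a week, exceeding "996" in some places); intense academic and job competition, with too few positions for the candidates; low stipends (compared with other countries, PhD students in China are also poorly paid, with government scholarships averaging about 42,000 yuan a year); and high pressure. Over the past decade the number of PhD students in China has doubled, while research positions have barely increased. There are more and more PhDs and fewer and fewer ways out. "Degree inflation" means a doctorate is no longer a golden ticket, and the hierarchical supervisor system and metric-driven evaluation have turned research into an "involution accurate to the decimal point". In this environment, PhD students' lives are almost entirely taken over by three things: papers, grants, and evaluations. You want to innovate but are told to "hit the metrics" first; you want to rest but fear falling behind; you want to talk about it but are told "that's just how research is". This "numb survival" under high pressure leaves many PhD students emotionally burned out. As some joke about themselves: "We are not doing a PhD; we are practising how to stay alive inside anxiety." This is not a problem of individuals but a signal from the whole research system. When a country's PhD students are broadly unhappy, it means the "people" in the system have been swallowed by the logic of efficiency for too long.
Interestingly, the Nature study also found that PhD students' well-being is not strongly correlated with economic conditions. In a high-welfare country like Germany, PhD satisfaction is below the global average, while a developing country like Brazil tops the list. In other words, the key to PhD well-being lies not in money but in "relationships". The survey shows that satisfaction is significantly higher when PhD students meet their supervisor for at least one hour a week, and drops sharply when they work more than 60 hours a week. These simple figures point to something basic: PhD students are people, not machines. Whether they are understood, supported, and able to find a sense of belonging affects their well-being more than pay does. A system that can make room for human vulnerability is where research draws its real strength.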
17.
DeDe宝 (2026-01-05 05:19):
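#paper doi: https://doi.org/10.1371/journal.pcbi.1013826 Long-term perceptual priors drive confidence bias that favors prior-congruent evidence. PLOS Computational Biology. This study examines how long-term perceptual priors affect human perceptual decisions and confidence judgments, and reveals a bias in confidence toward evidence that is congruent with the prior. Models based on the Bayesian framework hold that perceptual decisions (and their confidence) rest on a precision-weighted combination of prior and likelihood; some studies, however, have found that priors influence confidence more strongly. The study uses a Confidence Forced-Choice Task to probe the influence of a long-term prior in a perceptual task: participants judge the motion direction of a stimulus twice in a row and then indicate which of the two judgments they are more confident in. The long-term perceptual prior can be either perpendicular to the decision boundary (providing no additional bias) or fall on one side of it (providing additional bias). The results show that participants tend to report higher confidence for judgments that agree with the prior, indicating an additional confirmatory bias in how long-term perceptual priors influence confidence. The authors propose the WPPCE model (weighted posterior and prior-congruent evidence) to explain the observed confidence bias.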
Abstract:
According to the Bayesian framework, both our perceptual decisions and confidence about those decisions are based on the precision-weighted integration of prior expectations and incoming sensory information. While it is generally assumed that priors influence both decisions and confidence in the same way, previous work has found priors to have a stronger impact at the confidence level, challenging this assumption. However, these patterns were found for high-level probabilistic expectations that are flexibly induced in the task context. It remains unclear whether this generalizes to low-level perceptual priors that are naturally formed through long term exposure. Here we investigated human participants’ confidence in decisions made under the influence of a long-term perceptual prior: the slow-motion prior. Participants viewed tilted moving-line stimuli for which the slow-motion prior biases the perceived motion direction. On each trial, they made two consecutive motion direction decisions followed by a confidence decision. We contrasted two conditions – one in which the prior impacted discrimination performance, and one in which it did not. We found a confidence bias favoring the condition in which the prior influenced discrimination decisions, even after accounting for performance differences. Computational modeling revealed this effect to be best explained by confidence using the prior-congruent evidence as an additional cue, beyond the posterior evidence used in the perceptual decision. This is in agreement with a confirmatory confidence bias favoring evidence congruent with low-level perceptual priors, revealing that, in line with high-level expectations, even long-term priors have a greater influence on the metacognitive level than on perceptual decisions.
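The "precision-weighted integration" that the Bayesian account starts from has a simple closed form when the prior and the sensory likelihood are both Gaussian. The sketch below shows only that textbook baseline with made-up numbers; it is not the paper's WPPCE model, and the function name is hypothetical.

```python
def precision_weighted_posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior and a Gaussian observation.

    Each source is weighted by its precision (inverse variance), so the
    more reliable source pulls the posterior mean harder.
    """
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_precision * prior_mean + obs_precision * obs)
    return post_mean, post_var

# A broad prior centred on 0 (e.g. a preference for slow speeds) combined
# with a more reliable sensory measurement of 10.
post_mean, post_var = precision_weighted_posterior(
    prior_mean=0.0, prior_var=4.0, obs=10.0, obs_var=1.0
)
print(post_mean, post_var)
```

In this baseline the prior enters decision and confidence in the same way; the finding discussed above is that confidence appears to give prior-congruent evidence extra weight beyond this posterior.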
18.
刘昊辰 (2026-01-04 09:37):
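#paper Collapsi is strongly solved. Collapsi is a two-player perfect-information game released by Mark S. Ball in June 2025. It is played on a 4×4 wraparound board made of 16 cards (four Aces, four 2s, four 3s, two 4s, and two Jokers). Players take turns moving their piece according to the value of the card it stands on; after a move the starting card is flipped over, and a player with no legal move loses. Michael Young used symmetry breaking to reduce the 16! (about 2.1×10¹³) initial deals and built a solver based on minimax search with α-β pruning that finds the optimal move within 20 ms. On a 13th-generation Intel Core i5-13500 processor, analyzing all 47,297,250 equivalent deals took 7 hours 29 minutes. The first player (Red) has a forced win in only 37.5% of deals, and the second player (Blue) in 62.5%. The shortest forced win takes 7 turns, and in 6.4% of deals the losing side can drag the game out to the maximum of 14 turns. This establishes that the game is strongly solved. Download: https://arxiv.org/pdf/2507.16823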
arXiv, 4 Jul 2025. DOI: 10.48550/arXiv.2507.16823
Abstract: No abstract available.
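To make the search technique named above concrete, here is minimax with alpha-beta pruning (in negamax form) on a much simpler stand-in game that shares Collapsi's losing condition: a player with no legal move loses. The subtraction game, the move set, and the function name are assumptions chosen for brevity; this is not the Collapsi solver and includes none of its symmetry reduction.

```python
def solve(n, moves=(1, 2, 3), alpha=-1, beta=1):
    """Negamax with alpha-beta pruning on a subtraction game.

    `n` counters remain; a move removes 1, 2, or 3 of them.
    Returns +1 if the player to move can force a win, -1 otherwise.
    """
    best = -1  # with no legal move, the player to move loses
    for m in moves:
        if m > n:
            continue
        # The opponent's best outcome, negated, is our outcome for this move.
        score = -solve(n - m, moves, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # a winning move was found; remaining moves cannot do better
    return best

# Game-theoretic value of each starting position from 0 to 8 counters.
outcomes = [solve(n) for n in range(9)]
print(outcomes)
```

Solving a position means computing this value exactly; a game is strongly solved when the value and an optimal move are known for every reachable position, which is what the note says was achieved for Collapsi.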
19.
尹志 (2025-12-31 23:41):
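#paper doi: https://doi.org/10.1016/j.future.2024.04.060. Quantum-centric Supercomputing for Materials Science: A Perspective on Challenges and Future Directions. A large review of the quantum-centric computing paradigm and its algorithms, applications, and directions in materials science. It walks through several materials science case studies, and the survey of algorithms is systematic. It is arguably one of the best overviews of quantum computing for materials science, and it is a useful reference for similar fields such as drug discovery.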
Yuri Alexeev, Maximilian Amsler, Marco Antonio Barroca, Sanzio Bassini, Torey Battelle, Daan Camps, David Casanova, Young Jay Choi, Frederic T. Chong, Charles Chung, Christopher Codella, Antonio D. Córcoles, James Cruise, Alberto Di Meglio, Ivan Duran, Thomas Eckl, Sophia Economou, Stephan Eidenbenz, Bruce Elmegreen, Clyde Fare, Ismael Faro, Cristina Sanz Fernández, Rodrigo Neumann Barros Ferreira, Keisuke Fuji, Bryce Fuller, Laura Gagliardi, Giulia Galli, Jennifer R. Glick, Isacco Gobbi, Pranav Gokhale, Salvador de la Puente Gonzalez, Johannes Greiner, Bill Gropp, Michele Grossi, Emanuel Gull, Burns Healy, Matthew R. Hermes, Benchen Huang, Travis S. Humble, Nobuyasu Ito, Artur F. Izmaylov, Ali Javadi-Abhari, Douglas Jennewein, Shantenu Jha, Liang Jiang, Barbara Jones, Wibe Albert de Jong, Petar Jurcevic, William Kirby, Stefan Kister, Masahiro Kitagawa, Joel Klassen, Katherine Klymko, Kwangwon Koh, Masaaki Kondo, Doğa Murat Kürkçüoğlu, Krzysztof Kurowski, Teodoro Laino, Ryan Landfield, Matt Leininger, Vicente Leyton-Ortega, Ang Li, Meifeng Lin, Junyu Liu, Nicolas Lorente, Andre Luckow, Simon Martiel, Francisco Martin-Fernandez, Margaret Martonosi, Claire Marvinney, Arcesio Castaneda Medina, Dirk Merten, Antonio Mezzacapo, Kristel Michielsen, Abhishek Mitra, Tushar Mittal, Kyungsun Moon, Joel Moore, Sarah Mostame, Mario Motta, Young-Hye Na, Yunseong Nam, Prineha Narang, Yu-ya Ohnishi, Daniele Ottaviani, Matthew Otten, Scott Pakin, Vincent R. Pascuzzi, Edwin Pednault, Tomasz Piontek, Jed Pitera, Patrick Rall, Gokul Subramanian Ravi, Niall Robertson, Matteo A.C. Rossi, Piotr Rydlichowski, Hoon Ryu, Georgy Samsonidze, Mitsuhisa Sato, Nishant Saurabh, Vidushi Sharma, Kunal Sharma, Soyoung Shin, George Slessman, Mathias Steiner, Iskandar Sitdikov, In-Saeng Suh, Eric D. Switzer, Wei Tang, Joel Thompson, Synge Todo, Minh C. Tran, Dimitar Trenev, Christian Trott, Huan-Hsin Tseng, Norm M. Tubman, Esin Tureci, David García Valiñas, Sofia Vallecorsa, Christopher Wever, Konrad Wojciechowski, Xiaodi Wu, Shinjae Yoo, Nobuyuki Yoshioka, Victor Wen-zhe Yu, Seiji Yunoki, Sergiy Zhuk, Dmitry Zubarev
Abstract: No abstract available.
20.
半面阳光 (2025-12-31 22:50):
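#paper doi: https://doi.org/10.1038/s41598-019-50378-8. Scientific Reports. 2019. A novel high-throughput molecular counting method with single base-pair resolution enables accurate single-gene NIPT. In NIPT applications, whether conventional non-invasive testing or the non-invasive single-gene disorder testing that has developed in recent years, molecular counting is critical. This paper develops a molecular counting technique called Quantitative Counting Template (QCT). Put simply, before amplification and sequencing, a unique molecular tag called an Embedded Molecular Index (EMI) is added to the template DNA; amplification and sequencing follow, and once the sequencing data are available, the EMIs are used to identify template molecules and so obtain more accurate counts. The researchers then used this technique to develop single-gene NIPT (sgNIPT) assays for sickle cell disease, cystic fibrosis, spinal muscular atrophy, alpha-thalassemia, and beta-thalassemia. The analytical sensitivity and specificity of the assays were >98% and >99%, respectively. The sgNIPT assays were further validated with maternal blood samples collected during pregnancy, and the results were 100% concordant with newborn follow-up testing. Non-invasive single-gene disorder tests have become more numerous and more mature in recent years, but my impression is that NIPT for single-gene disorders depends largely on technical innovation at the wet-lab stage.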
Abstract:
Next-generation DNA sequencing is currently limited by an inability to accurately count the number of input DNA molecules. Molecular counting is particularly needed when accurate quantification is required for diagnostic purposes, such as in single gene non-invasive prenatal testing (sgNIPT) and liquid biopsy. We developed Quantitative Counting Template (QCT) molecular counting to reconstruct the number of input DNA molecules using sequencing data. We then used QCT molecular counting to develop sgNIPTs of sickle cell disease, cystic fibrosis, spinal muscular atrophy, alpha-thalassemia, and beta-thalassemia. The analytical sensitivity and specificity of sgNIPT was >98% and >99%, respectively. Validation of sgNIPTs was further performed with maternal blood samples collected during pregnancy, and sgNIPTs were 100% concordant with newborn follow-up.
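The counting idea described in the note, identifying original molecules by their index tag so that amplification duplicates are not counted twice, can be shown in a few lines. This is a toy illustration with invented reads and tags, assuming error-free tags; it is not the published QCT pipeline, and the function name is hypothetical.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct molecular index tags per allele.

    `reads` is an iterable of (allele, tag) pairs. Reads that share an
    allele and a tag are treated as copies of one input molecule.
    """
    tags_by_allele = defaultdict(set)
    for allele, tag in reads:
        tags_by_allele[allele].add(tag)
    return {allele: len(tags) for allele, tags in tags_by_allele.items()}

# Six reads, but only three distinct input molecules (invented data).
reads = [
    ("ref", "AACG"), ("ref", "AACG"), ("ref", "TTGA"),
    ("alt", "CCAT"), ("alt", "CCAT"), ("alt", "CCAT"),
]

counts = count_molecules(reads)
print(counts)
```

Raw read counts here would suggest equal ref and alt signal (3 vs. 3), while the tag-based count shows two ref molecules and one alt molecule, which is the kind of distortion molecular counting is meant to remove.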