#paper Zhang W, Ma Z, Wu Y, Shi X, Zhang Y, Zhang M, Zhang M, Wang L, Liu W. SARS-CoV-2 3C-like protease antagonizes interferon-beta production by facilitating the degradation of IRF3. Cytokine. 2021 Dec;148:155697. doi: 10.1016/j.cyto.2021.155697. Epub 2021 Sep 3. PMID: 34509038; PMCID: PMC8413301. The study transfected human 293T cells with SARS-CoV-2 3CL and then infected them with Sendai virus to activate the RLR pathway. It found that SARS-CoV-2 3CL significantly suppresses type I interferon production after infection.
The prevalence of SARS-CoV-2 is a great threat to global public health. However, the relationship between the viral pathogen SARS-CoV-2 and host innate immunity has not yet been well studied. The genome of SARS-CoV-2 encodes a viral protease called 3C-like protease. This protease is responsible for cleaving viral polyproteins during replication. In this investigation, 293T cells were transfected with SARS-CoV-2 3CL and then infected with Sendai virus (SeV) to induce the RIG-I like receptor (RLR)-based immune pathway. q-PCR, luciferase reporter assays, and western blotting were used for experimental analyses. We found that SARS-CoV-2 3CL significantly downregulated IFN-β mRNA levels. Upon SeV infection, SARS-CoV-2 3CL inhibited the nuclear translocation of IRF3 and p65 and promoted the degradation of IRF3. This effect of SARS-CoV-2 3CL on type I IFN in the RLR immune pathway opens up novel ideas for future research on SARS-CoV-2.
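#paper doi:10.1016/j.tcb.2023.11.002 Mechanism-aware and multimodal AI: beyond model-agnostic interpretation. This review describes how multimodal AI can combine multi-omics data, clinical data, and genome-scale metabolic models (GSMMs, i.e. fluxomics) to produce biomarkers with more accurate and transparent interpretations. It covers how GSMMs are constructed, AI modelling approaches for multimodal data integration, and graph neural network methods. GSMMs are built from omics data. One of the cited studies shows that a multimodal model combining transcriptomic data with a GSMM improves prediction of yeast growth and reveals functional patterns that cannot be inferred directly from gene expression alone.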
Artificial intelligence (AI) is widely used for exploiting multimodal biomedical data, with increasingly accurate predictions and model-agnostic interpretations, which are however also agnostic to biological mechanisms. Combining metabolic modelling, 'omics, and imaging data via multimodal AI can generate predictions that can be interpreted mechanistically and transparently, therefore with significantly higher therapeutic potential.
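#paper doi.org/10.1038/s41586-023-06306-y Nature, 2023, Solid-body trajectoids shaped to roll along desired pathways. This paper introduces trajectoids, solid bodies shaped so that they roll along a prescribed path. The shapes are designed algorithmically, and the designs are validated by 3D printing. The paper discusses how trajectoids move, how paths are designed, and the resulting shape morphology, and it relates the results to several physical systems, including quantum mechanics, classical optics, and robotics. The results matter for understanding the dynamics of moving bodies and for designing new optical devices.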
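#paper DOI: 10.1038/s41436-018-0295-y Genetics in Medicine, 2019, Performance of prenatal cfDNA screening for sex chromosomes. Copy-number variants in clinical genome sequencing: deployment and interpretation for rare and undiagnosed disease. This paper uses sequencing to detect disease-associated CNVs. The study compares CNV calls from sequencing against clinical microarray results on 17 reference samples. It then builds a family-based CNV calling method on sequencing data and validates it on samples from 79 rare or undiagnosed cases. The results show that sequencing performs on par with microarrays for CNV calling. The method can also detect UPD and mosaic trisomy. The paper focuses on sequencing for clinical CNV detection, and it is a useful primer and baseline reference for the CNV-Seq testing now widely used in clinical practice.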
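#paper Consistency Models https://doi.org/10.48550/arXiv.2303.01469 Diffusion models are now a core technique in generative AI, but their iterative generation makes sampling slow, which gets in the way of practical deployment. Consistency models (CM) are an efficient alternative to standard diffusion models. Building on the probability flow (PF) ODE trajectory, CM learns a mapping on that trajectory (which can be thought of as the sequence of iterative steps) that sends any point on it, i.e. any timestep, back to the trajectory's starting point, i.e. the original image. CM raises the quality of single-step sampling, which has enabled many practical applications, including image editing and image inpainting. A large number of current diffusion-based applications already use CM. This work was released early in the year by Yang Song together with Ilya Sutskever; all four authors are diffusion-model experts at OpenAI.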
Diffusion models have significantly advanced the fields of image, audio, and video generation, but they depend on an iterative sampling process that causes slow generation. To overcome this limitation, we propose consistency models, a new family of models that generate high quality samples by directly mapping noise to data. They support fast one-step generation by design, while still allowing multistep sampling to trade compute for sample quality. They also support zero-shot data editing, such as image inpainting, colorization, and super-resolution, without requiring explicit training on these tasks. Consistency models can be trained either by distilling pre-trained diffusion models, or as standalone generative models altogether. Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step sampling, achieving the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained in isolation, consistency models become a new family of generative models that can outperform existing one-step, non-adversarial generative models on standard benchmarks such as CIFAR-10, ImageNet 64x64 and LSUN 256x256.
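The self-consistency idea (every point on one probability-flow ODE trajectory maps to the same origin) can be illustrated on a toy case. This is a minimal sketch, not the paper's training code: it assumes 1-D Gaussian data with standard deviation `SIGMA_DATA` and the noising x_t = x_0 + t·z, for which the ODE and the consistency function have closed forms. All names and constants (`pf_ode_drift`, `consistency_fn`, `trajectory`, `EPS`, `T_MAX`) are illustrative assumptions.

```python
import math

SIGMA_DATA = 0.5   # assumed std of the toy data distribution N(0, SIGMA_DATA^2)
EPS = 0.002        # small positive time treated as the origin of a trajectory
T_MAX = 80.0       # largest noise level


def pf_ode_drift(x: float, t: float) -> float:
    """dx/dt of the probability-flow ODE for the Gaussian toy case.

    With x_t = x_0 + t*z the marginal is N(0, SIGMA_DATA^2 + t^2), whose score
    is -x / (SIGMA_DATA^2 + t^2); dx/dt = -t * score gives the line below.
    """
    return t * x / (SIGMA_DATA**2 + t**2)


def consistency_fn(x: float, t: float) -> float:
    """Closed-form consistency function for this toy case: (x_t, t) -> x_EPS."""
    return x * math.sqrt((SIGMA_DATA**2 + EPS**2) / (SIGMA_DATA**2 + t**2))


def trajectory(x_start: float, n_steps: int = 8000) -> list:
    """Integrate the ODE from T_MAX down to EPS with Heun (2nd-order) steps."""
    times = [T_MAX + (EPS - T_MAX) * i / n_steps for i in range(n_steps + 1)]
    x = x_start
    points = [(x, times[0])]
    for t0, t1 in zip(times, times[1:]):
        h = t1 - t0  # negative: we move backwards in time
        k1 = pf_ode_drift(x, t0)
        k2 = pf_ode_drift(x + h * k1, t1)
        x = x + 0.5 * h * (k1 + k2)
        points.append((x, t1))
    return points


# Along one trajectory, f(x_t, t) should stay (numerically) constant.
for x_t, t in trajectory(x_start=3.0)[::2000]:
    print(f"t={t:8.3f}  x_t={x_t:9.5f}  f(x_t, t)={consistency_fn(x_t, t):.5f}")
```

In the paper the consistency function is a neural network trained to satisfy this property; the closed form here only exists because the toy data distribution is Gaussian.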
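#paper Theta mediated dynamics of human hippocampal-neocortical learning systems in memory formation and retrieval https://doi.org/10.1038/s41467-023-44011-6 This study examines theta-rhythm dynamics in the human hippocampal-neocortical learning system during memory encoding and retrieval. The results indicate that theta plays an important modulatory role in memory formation and retrieval, and that information flow between hippocampus and neocortex changes dynamically. These findings are relevant to understanding the neural mechanisms of memory and related disorders. Specifically, the paper addresses three questions: 1) In which direction does information flow between hippocampus and neocortex during encoding and retrieval? 2) Does that information flow differ across conditions? 3) Do neural oscillations in the theta range play an important role in the hippocampal-neocortical learning system? The authors recorded intracranial data from 8 patients and used electrophysiological signals from a pattern separation task (participants judged items as old, lure, or new) to reveal the role of the 4-5 Hz theta rhythm in encoding and in retrieval, and how this rhythm contributes to hippocampal-cortical information exchange. They found that 4-5 Hz theta behaves differently in pattern completion and pattern separation, and that hippocampus and neocortex exchange information bidirectionally in both pattern completion and pattern separation, during both encoding and retrieval. There are, however, directional biases across these conditions: during pattern separation (successfully distinguishing similar items), 4-5 Hz theta mediates a neocortex→hippocampus bias, and when 4-5 Hz theta mediates a hippocampus→neocortex bias during encoding, participants later recognize the learned items more accurately at retrieval. Overall, hippocampal-neocortical interaction is highly dynamic across encoding and retrieval and across pattern separation and pattern completion, and 4-5 Hz theta oscillations play a part in it.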
Episodic memory arises as a function of dynamic interactions between the hippocampus and the neocortex, yet the mechanisms have remained elusive. Here, using human intracranial recordings during a mnemonic discrimination task, we report that 4-5 Hz (theta) power is differentially recruited during discrimination vs. overgeneralization, and its phase supports hippocampal-neocortical when memories are being formed and correctly retrieved. Interactions were largely bidirectional, with small but significant net directional biases; a hippocampus-to-neocortex bias during acquisition of new information that was subsequently correctly discriminated, and a neocortex-to-hippocampus bias during accurate discrimination of new stimuli from similar previously learned stimuli. The 4-5 Hz rhythm may facilitate the initial stages of information acquisition by neocortex during learning and the recall of stored information from cortex during retrieval. Future work should further probe these dynamics across different types of tasks and stimuli and computational models may need to be expanded accordingly to accommodate these findings.
#paper Brain aging in major depressive disorder: results from the ENIGMA major depressive disorder working group
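doi: 10.1038/s41380-020-0754-0 Using a large dataset (ENIGMA), the researchers examined how brain age in patients with depression differs from that of healthy controls. Specifically, they built a brain-age prediction model from structural brain measures of healthy controls and then applied it to MDD patients as a test set. They found that patients with depression had a higher brain age than controls, and that the difference was not related to clinical symptoms.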
Major depressive disorder (MDD) is associated with an increased risk of brain atrophy, aging-related diseases, and mortality. We examined potential advanced brain aging in adult MDD patients, and whether this process is associated with clinical characteristics in a large multicenter international dataset. We performed a mega-analysis by pooling brain measures derived from T1-weighted MRI scans from 19 samples worldwide. Healthy brain aging was estimated by predicting chronological age (18-75 years) from 7 subcortical volumes, 34 cortical thickness and 34 surface area, lateral ventricles and total intracranial volume measures separately in 952 male and 1236 female controls from the ENIGMA MDD working group. The learned model coefficients were applied to 927 male controls and 986 depressed males, and 1199 female controls and 1689 depressed females to obtain independent unbiased brain-based age predictions. The difference between predicted "brain age" and chronological age was calculated to indicate brain-predicted age difference (brain-PAD). On average, MDD patients showed a higher brain-PAD of +1.08 (SE 0.22) years (Cohen's d = 0.14, 95% CI: 0.08-0.20) compared with controls. However, this difference did not seem to be driven by specific clinical characteristics (recurrent status, remission status, antidepressant medication use, age of onset, or symptom severity). This highly powered collaborative effort showed subtle patterns of age-related structural brain abnormalities in MDD. Substantial within-group variance and overlap between groups were observed. Longitudinal studies of MDD and somatic health outcomes are needed to further assess the clinical value of these brain-PAD estimates.
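The two quantities reported in the abstract, brain-PAD and Cohen's d, are simple to compute once a model has produced predicted ages. The sketch below is illustrative only: the function names are hypothetical, the ages are made-up numbers, Cohen's d uses the pooled-standard-deviation form (an assumption; the abstract does not say which variant was used), and no covariate adjustment is applied.

```python
import math
from statistics import mean, variance


def brain_pad(predicted_age, chronological_age):
    """Brain-predicted age difference: predicted minus chronological age."""
    return [p - c for p, c in zip(predicted_age, chronological_age)]


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Made-up ages (years) for four controls and four patients.
controls_pad = brain_pad([30, 41, 53, 60], [31, 40, 53, 60])
patients_pad = brain_pad([33, 44, 51, 66], [31, 42, 51, 62])

print("mean brain-PAD, controls:", mean(controls_pad))
print("mean brain-PAD, patients:", mean(patients_pad))
print("Cohen's d (patients vs controls):", cohens_d(patients_pad, controls_pad))
```

With only four subjects per group the effect size here is meaningless; the point is the order of operations: predict age, subtract chronological age, then compare groups.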
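#paper Transcriptome-wide association analyses reveal the impact of regulatory variants on rice panicle architecture and causal gene regulatory networks, Nature Communications, 18 November 2023, https://doi.org/10.1038/s41467-023-43077-6. The study combines transcriptome and genome-wide association analyses to identify genes that control rice panicle architecture. Using transcriptomes of 1-2 mm young panicles from 275 rice varieties, the researchers identified genes whose expression levels are associated with panicle traits. They then showed that many small-effect loci act on panicle traits by regulating the expression of these genes, and that alleles at these loci have been under selection. Finally, they used the cis- and trans-expression components of genes to build a gene regulatory network for rice panicle development.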
Panicle architecture is a key determinant of rice grain yield and is mainly determined at the 1-2 mm young panicle stage. Here, we investigated the transcriptome of the 1-2 mm young panicles from 275 rice varieties and identified thousands of genes whose expression levels were associated with panicle traits. Multimodel association studies suggested that many small-effect genetic loci determine spikelet per panicle (SPP) by regulating the expression of genes associated with panicle traits. We found that alleles at cis-expression quantitative trait loci of SPP-associated genes underwent positive selection, with a strong preference for alleles increasing SPP. We further developed a method that integrates the associations of cis- and trans-expression components of genes with traits to identify causal genes at even small-effect loci and construct regulatory networks. We identified 36 putative causal genes of SPP, including SDT (MIR156j) and OsMADS17, and inferred that OsMADS17 regulates SDT expression, which was experimentally validated. Our study reveals the impact of regulatory variants on rice panicle architecture and provides new insights into the gene regulatory networks of panicle traits.
#paper doi:10.4161/bioe.28791 Bioengineered collagens: Emerging directions for biomedical materials
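1. The history and applications of collagen as a biomedical material.
2. The limitations of animal-derived collagen and the risk of disease transmission.
3. The development of recombinant collagen technology, in particular the properties and potential applications of bacterial collagens expressed in E. coli.
4. The potential of bioengineering approaches to improve collagen stability and function.
5. The different systems used to produce collagen (e.g. yeast, insect cells, plants, microorganisms).
6. The properties, stability, non-immunogenicity, and production methods of bacterial collagens.
7. Using bioengineering techniques to design collagen constructs with specific functions.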
Mammalian collagen has been widely used as a biomedical material. Nevertheless, there are still concerns about the variability between preparations, particularly with the possibility that the products may transmit animal-based diseases. Many groups have examined the possible application of bioengineered mammalian collagens. However, translating laboratory studies into large-scale manufacturing has often proved difficult, although certain yeast and plant systems seem effective. Production of full-length mammalian collagens, with the required secondary modification to give proline hydroxylation, has proved difficult in E. coli. However, recently, a new group of collagens, which have the characteristic triple helical structure of collagen, has been identified in bacteria. These proteins are stable without the need for hydroxyproline and are able to be produced and purified from E. coli in high yield. Initial studies indicate that they would be suitable for biomedical applications.
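#paper doi: 10.1186/s13073-020-00810-w Genome Medicine, 2020, DNA methylation and body mass index from birth to adolescence: meta-analyses of epigenome-wide association studies. It is well established that DNA methylation profiles are associated with obesity in adults, but it is unclear whether the association exists in childhood and adolescence and whether it is causal. This meta-analysis examined up to 4,133 children and adolescents from 23 studies worldwide and analysed the relationship between blood DNA methylation and body mass index (BMI). Cross-sectional models identified DNA methylation sites significantly associated with BMI in childhood and adolescence, and longitudinal models were used to explore causality. Only a few blood DNA methylation sites were significantly associated with BMI, and the evidence suggests they may be a consequence of high BMI. This is a typical molecular epidemiology study, and its design and methods have much to recommend them.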
Background: DNA methylation has been shown to be associated with adiposity in adulthood. However, whether similar DNA methylation patterns are associated with childhood and adolescent body mass index (BMI) is largely unknown. More insight into this relationship at younger ages may have implications for future prevention of obesity and its related traits. Methods: We examined whether DNA methylation in cord blood and whole blood in childhood and adolescence was associated with BMI in the age range from 2 to 18 years using both cross-sectional and longitudinal models. We performed meta-analyses of epigenome-wide association studies including up to 4133 children from 23 studies. We examined the overlap of findings reported in previous studies in children and adults with those in our analyses and calculated enrichment. Results: DNA methylation at three CpGs (cg05937453, cg25212453, and cg10040131), each in a different age range, was associated with BMI at Bonferroni significance, P < 1.06 × 10^-7, with a 0.96 standard deviation score (SDS) (standard error (SE) 0.17), 0.32 SDS (SE 0.06), and 0.32 BMI SDS (SE 0.06) higher BMI per 10% increase in methylation, respectively. DNA methylation at nine additional CpGs in the cross-sectional childhood model was associated with BMI at false discovery rate significance. The strength of the associations of DNA methylation at the 187 CpGs previously identified to be associated with adult BMI, increased with advancing age across childhood and adolescence in our analyses. In addition, correlation coefficients between effect estimates for those CpGs in adults and in children and adolescents also increased. Among the top findings for each age range, we observed increasing enrichment for the CpGs that were previously identified in adults (birth P_enrichment = 1; childhood P_enrichment = 2.00 × 10^-4; adolescence P_enrichment = 2.10 × 10^-7). Conclusions: There were only minimal associations of DNA methylation with childhood and adolescent BMI. With the advancing age of the participants across childhood and adolescence, we observed increasing overlap with altered DNA methylation loci reported in association with adult BMI. These findings may be compatible with the hypothesis that DNA methylation differences are mostly a consequence rather than a cause of obesity.
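Two pieces of arithmetic sit behind this abstract: the Bonferroni threshold and the pooling of per-study estimates. The sketch below is a hedged illustration. It assumes a family-wise alpha of 0.05 (not stated in the abstract) to back out the number of tests implied by the 1.06e-7 threshold, and it uses fixed-effects inverse-variance weighting as the pooling rule, which is a common choice for EWAS meta-analyses but is not confirmed by the abstract. Function names and the example numbers are made up.

```python
import math


def implied_number_of_tests(threshold: float, alpha: float = 0.05) -> int:
    """Bonferroni: threshold = alpha / n_tests, so n_tests = alpha / threshold."""
    return round(alpha / threshold)


def inverse_variance_meta(betas, ses):
    """Fixed-effects inverse-variance-weighted pooling (assumed method).

    Returns the pooled effect, its standard error, and a two-sided normal P.
    """
    weights = [1.0 / se**2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled_beta, pooled_se, p_value


print("tests implied by P < 1.06e-7:", implied_number_of_tests(1.06e-7))

# Two made-up studies reporting the same CpG with equal precision.
beta, se, p = inverse_variance_meta(betas=[0.2, 0.4], ses=[0.1, 0.1])
print(f"pooled beta={beta:.3f}, SE={se:.4f}, P={p:.2e}")
```

Under the 0.05 assumption, the threshold corresponds to several hundred thousand tested CpGs, which is the scale of a methylation array.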
#paper The quantum house of cards 10.1073/pnas.2313269120 Pub Date : 2023-12-26
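Quantum computers have been proposed for many important problems, such as discovering new drugs, finding new catalysts for fertilizer production, breaking encryption protocols, optimizing financial portfolios, and implementing new AI applications. Yet, to date, a task as simple as multiplying 3 by 5 is beyond existing quantum hardware. This article examines the difficulties that must be solved before quantum computers can deliver on their promises. The author discusses the whole technology stack for building a quantum computer, from the top layers (the actual algorithms and their applications) down to the bottom ones (the quantum hardware, its control electronics, cryogenics, etc.), without forgetting the crucial intermediate layer of quantum error correction.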
Quantum computers have been proposed to solve a number of important problems such as discovering new drugs, new catalysts for fertilizer production, breaking encryption protocols, optimizing financial portfolios, or implementing new artificial intelligence applications. Yet, to date, a simple task such as multiplying 3 by 5 is beyond existing quantum hardware. This article examines the difficulties that would need to be solved for quantum computers to live up to their promises. I discuss the whole stack of technologies that has been envisioned to build a quantum computer from the top layers (the actual algorithms and associated applications) down to the very bottom ones (the quantum hardware, its control electronics, cryogeny, etc.) while not forgetting the crucial intermediate layer of quantum error correction.
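#paper, Using sequences of life-events to predict human lives. Nat Comput Sci (2023), https://doi.org/10.1038/s43588-023-00573-5. Can large language models now tell fortunes accurately? Yes! This paper in Nature Computational Science proposes a model for predicting the course of a person's life. It represents human lives in a way that is structurally similar to language, building a life sequence out of each person's events. The paper proposes a deep learning model called life2vec (L2V) for predicting various outcomes of a life trajectory, such as early mortality risk and personality traits. The model is based on the Transformer architecture and learns dense vector representations of life-event sequences. The study built the life-event sequences from detailed labour and health records covering about 6 million residents of Denmark over nearly 10 years. The L2V model reached an accuracy of 78.8% (0.788 [0.782, 0.794]).
The model has three components: an embedding layer, an encoder, and task-specific decoders. It is first pre-trained on a masked language modeling task and a sequence ordering prediction task to learn event representations and sequence structure. It is then fine-tuned on downstream tasks such as early mortality prediction and personality trait prediction to learn a vector representation of the whole life trajectory. The results show that the model predicts outcomes in several different domains accurately and clearly outperforms current state-of-the-art methods on early mortality prediction.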
Here we represent human lives in a way that shares structural similarity to language, and we exploit this similarity to adapt natural language processing techniques to examine the evolution and predictability of human lives based on detailed event sequences. We do this by drawing on a comprehensive registry dataset, which is available for Denmark across several years, and that includes information about life-events related to health, education, occupation, income, address and working hours, recorded with day-to-day resolution. We create embeddings of life-events in a single vector space, showing that this embedding space is robust and highly structured. Our models allow us to predict diverse outcomes ranging from early mortality to personality nuances, outperforming state-of-the-art models by a wide margin. Using methods for interpreting deep learning models, we probe the algorithm to understand the factors that enable our predictions. Our framework allows researchers to discover potential mechanisms that impact life outcomes as well as the associated possibilities for personalized interventions.
#paper https://doi.org/10.48550/arXiv.2312.03701 , Self-conditioned Image Generation via Generating Representations
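This paper introduces a new image generation framework called Representation-Conditioned image Generation (RCG). RCG does not rely on human annotations; it generates images conditioned on a self-supervised representation distribution. A pre-trained encoder maps the image distribution to a representation distribution, a representation diffusion model (RDM) samples from it, and a pixel generator then produces the image conditioned on the sampled representation. On ImageNet 256×256, RCG achieves an FID of 3.31 and an IS of 253.4. The method substantially advances class-unconditional image generation and is also competitive with current leading class-conditional methods, closing the long-standing performance gap between the two tasks.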
This paper presents $\textbf{R}$epresentation-$\textbf{C}$onditioned image $\textbf{G}$eneration (RCG), a simple yet effective image generation framework which sets a new benchmark in class-unconditional image generation. RCG does not condition on any human annotations. Instead, it conditions on a self-supervised representation distribution which is mapped from the image distribution using a pre-trained encoder. During generation, RCG samples from such representation distribution using a representation diffusion model (RDM), and employs a pixel generator to craft image pixels conditioned on the sampled representation. Such a design provides substantial guidance during the generative process, resulting in high-quality image generation. Tested on ImageNet 256$\times$256, RCG achieves a Frechet Inception Distance (FID) of 3.31 and an Inception Score (IS) of 253.4. These results not only significantly improve the state-of-the-art of class-unconditional image generation but also rival the current leading methods in class-conditional image generation, bridging the long-standing performance gap between these two tasks. Code is available at https://github.com/LTH14/rcg.
#paper doi:10.1002/cyto.a.23690 , Best Practices for Preparing a Single Cell Suspension from Solid Tissues for Flow Cytometry
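Dispase, collagenase, and hyaluronidase are used to dissociate tissue into small cell clusters; of these, dispase may damage cell epitopes. There are three kinds of cell-cell junctions: 1) occluding junctions, 2) communicating junctions, and 3) anchoring junctions, and trypsin or papain is used to break them. Trypsin has a very severe effect on cell membrane proteins and causes cell aggregation induced by free DNA, so it should be avoided. Papain is an alternative, but it also causes free-DNA-induced cell aggregation. A DNase therefore needs to be added to degrade the free DNA; DNase-I is usually used rather than DNase-II, because the former does not trigger the apoptosis pathway. Calcium ions are required at this step because they act as an activator of the DNase.
On enzyme use: determining the optimal strength and concentration of the enzymes is empirical, and it is critical for separating cells properly and digesting the tissue successfully. Depending on the enzyme, digestion can also be carried out at 4°C or on ice. These lower temperatures may slow the enzymatic reaction and lengthen the incubation time, but they help minimize cell death.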
Preparing a single cell suspension is a critical step in any solid tissue flow cytometry experiment. Tissue dissection, enzymatic digestion, and mechanical dissociation are three significant steps leading to the degradation of the extracellular matrix and the isolation of single cells, allowing the generation of high-quality flow cytometry data. Cells and the extracellular matrix contain various proteins and other structures which must be considered when designing a tissue digestion protocol to preserve the viability of cells and the presence of relevant antigens while digesting matrix components and cleaving cell-cell junctions. Evaluation of the single cell suspension is essential before proceeding with the labeling of the cells as high viability and absence of cell debris and aggregates are critical for flow cytometry. The information presented should be used as a general guide of steps to consider when preparing a single cell suspension from solid tissues for flow cytometry experiments. © 2018 International Society for Advancement of Cytometry.
#paper doi:10.1021/jacs.2c04325,JACS,2022,DNA Strand-Displacement Temporal Logic Circuits
Molecular circuits capable of processing temporal information are essential for complex decision making in response to both the presence and history of a molecular environment. A particular type of temporal information that has been recognized to be important is the relative timing of signals. Here we demonstrate the strategy of temporal memory combined with logic computation in DNA strand-displacement circuits capable of making decisions based on specific combinations of inputs as well as their relative timing. The circuit encodes the timing information on inputs in a set of memory strands, which allows for the construction of logic gates that act on current and historical signals. We show that mismatches can be employed to reduce the complexity of circuit design and that shortening specific toeholds can be useful for improving the robustness of circuit behavior. We also show that a detailed model can provide critical insights for guiding certain aspects of experimental investigations that an abstract model cannot. We envision that the design principles explored in this study can be generalized to more complex temporal logic circuits and incorporated into other types of circuit architectures, including DNA-based neural networks, enabling the implementation of timing-dependent learning rules and opening up new opportunities for embedding intelligent behaviors into artificial molecular machines.
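#paper arXiv:2312.11514v1, 2023, LLM in a flash: Efficient Large Language Model Inference with Limited Memory. Large language models (LLMs) are central to modern natural language processing, but their heavy compute and memory requirements are a challenge for devices with limited memory. To run LLMs that exceed the available DRAM capacity efficiently, the paper stores the model parameters in flash memory and brings them into DRAM on demand. The method builds an inference cost model that matches flash memory behaviour and optimizes in two key areas: reducing the volume of data transferred from flash, and reading data in larger, more contiguous chunks. Within this framework it introduces two main techniques. The "windowing" strategy reduces data transfer by reusing previously activated neurons, and "row-column bundling" exploits flash memory's strength at sequential access to increase the size of the chunks read from flash. Together these methods make it possible to run models up to twice the size of the available DRAM, with inference 4-5x faster on CPU and 20-25x faster on GPU than naive loading.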
Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their intensive computational and memory requirements present challenges, especially for devices with limited DRAM capacity. This paper tackles the challenge of efficiently running LLMs that exceed the available DRAM capacity by storing the model parameters on flash memory but bringing them on demand to DRAM. Our method involves constructing an inference cost model that harmonizes with the flash memory behavior, guiding us to optimize in two critical areas: reducing the volume of data transferred from flash and reading data in larger, more contiguous chunks. Within this flash memory-informed framework, we introduce two principal techniques. First, "windowing" strategically reduces data transfer by reusing previously activated neurons, and second, "row-column bundling", tailored to the sequential data access strengths of flash memory, increases the size of data chunks read from flash memory. These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively. Our integration of sparsity awareness, context-adaptive loading, and a hardware-oriented design paves the way for effective inference of LLMs on devices with limited memory.
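The windowing idea can be sketched as bookkeeping over sets of neuron indices: keep in DRAM the neurons that were active for the last few tokens, and for each new token load only the ones not already resident. This is a minimal illustration under stated assumptions, not the paper's implementation. The class name `NeuronWindowCache`, the `step` method, and the idea that the active set arrives as a ready-made Python set are all hypothetical; how the active neurons are predicted is outside this sketch.

```python
from collections import deque


class NeuronWindowCache:
    """Tracks which FFN neurons stay resident in DRAM for a sliding token window."""

    def __init__(self, window_size: int):
        # Active-neuron sets of the most recent tokens; the deque drops the
        # oldest entry automatically once window_size is reached.
        self.window = deque(maxlen=window_size)

    def resident(self) -> set:
        """Union of active neurons over the current window."""
        out = set()
        for active in self.window:
            out |= active
        return out

    def step(self, active_neurons):
        """Advance by one token; return (neurons to load, neurons to evict)."""
        before = self.resident()
        to_load = set(active_neurons) - before   # only the delta is read from flash
        self.window.append(set(active_neurons))
        to_evict = before - self.resident()      # fell out of every window entry
        return to_load, to_evict


cache = NeuronWindowCache(window_size=2)
for token_active in [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]:
    load, evict = cache.step(token_active)
    print(f"active={sorted(token_active)} load={sorted(load)} evict={sorted(evict)}")
```

Because consecutive tokens tend to activate overlapping neurons, the set to load is usually much smaller than the full active set, which is where the reduction in flash traffic comes from.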
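#paper doi:10.1101/2023.10.04.560604. bioRxiv, 2023, Federated Learning for multi-omics: a performance evaluation in Parkinson's disease. This paper builds on datasets from two Parkinson's disease studies (PPMI and PDBP). Each enrolled several hundred patients and healthy controls, and each ran both WGS and RNA-seq, yielding multi-omics features. PPMI is split into K folds: one fold is held out, the remaining K-1 folds are used for training, and the model is then tested and evaluated on the held-out PPMI fold and on PDBP. Models were trained both with centralized machine learning and with federated learning, in which the data are split across multiple nodes (sites), using several federated learning strategies. The results show that although the dispersion of samples across sites and the choice of federated strategy both affect final performance, the best federated result is comparable to centralized training. The paper also evaluates training time: federated learning takes at least an order of magnitude longer than the centralized approach. Even so, because federated learning avoids sharing and transferring large datasets between sites, it still has an advantage for integrating broader data and improving model performance. The paper offers an in-depth analysis of federated learning for multi-omics, and for Parkinson's disease prediction in particular, showing its potential and challenges as a collaborative tool for large, heterogeneous data.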
While machine learning (ML) research has recently grown more in popularity, its application in the omics domain is constrained by access to sufficiently large, high-quality datasets needed to train ML models. Federated Learning (FL) represents an opportunity to enable collaborative curation of such datasets among participating institutions. We compare the simulated performance of several models trained using FL against classically trained ML models on the task of multi-omics Parkinson's Disease prediction. We find that FL model performance tracks centrally trained ML models, where the most performant FL model achieves an AUC-PR of 0.876 ± 0.009, 0.014 ± 0.003 less than its centrally trained variation. We also determine that the dispersion of samples within a federation plays a meaningful role in model performance. Our study implements several open source FL frameworks and aims to highlight some of the challenges and opportunities when applying these collaborative methods in multi-omics studies.
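For readers new to federated learning, the core aggregation step can be shown in a few lines. The sketch below uses FedAvg-style weighted averaging, which is a widely used baseline; the abstract does not name the specific strategies the paper evaluated, so treat the choice as an assumption. The function name `fedavg`, the flat parameter lists, and the site sizes are all made up for illustration.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample count.

    client_weights: one flat list of floats per site (same length each).
    client_sizes:   number of training samples held at each site.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical sites: the larger site pulls the average towards its weights.
site_a = [1.0, 2.0]   # trained on 1 sample
site_b = [3.0, 4.0]   # trained on 3 samples
global_weights = fedavg([site_a, site_b], client_sizes=[1, 3])
print("aggregated weights:", global_weights)
```

Only the parameter vectors leave each site, never the raw omics data. How samples are distributed across sites changes the weights in this average, which is one way the sample dispersion mentioned in the abstract can affect the aggregated model.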
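#paper, https://doi.org/10.7554/eLife.55389, A Bayesian and efficient observer model explains concurrent attractive and repulsive history biases in visual perception. Human visual perception is shaped by past experience, which produces both repulsive and attractive biases, and the two operate on different timescales. The attractive bias decays quickly and is driven only by the stimulus on the previous trial, while the repulsive bias decays slowly and can be influenced by stimuli from up to several minutes earlier. This suggests that the mechanisms behind the attractive and repulsive biases are separate. However, it is still unclear whether and how the two biases interact during perceptual decision-making. This study combines an efficient encoding framework with a Bayesian decoding model and captures the key features of both the attractive and the repulsive bias.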
Human perceptual decisions can be repelled away from (repulsive adaptation) or attracted towards recent visual experience (attractive serial dependence). It is currently unclear whether and how these repulsive and attractive biases interact during visual processing and what computational principles underlie these history dependencies. Here we disentangle repulsive and attractive biases by exploring their respective timescales. We find that perceptual decisions are concurrently attracted towards the short-term perceptual history and repelled from stimuli experienced up to minutes into the past. The temporal pattern of short-term attraction and long-term repulsion cannot be captured by an ideal Bayesian observer model alone. Instead, it is well captured by an ideal observer model with efficient encoding and Bayesian decoding of visual information in a slowly changing environment. Concurrent attractive and repulsive history biases in perceptual decisions may thus be the consequence of the need for visual processing to simultaneously satisfy constraints of efficiency and stability.
#paper, https://doi.org/10.1038/s41562-019-0804-2,Quantum reinforcement learning during human
Classical reinforcement learning (CRL) has been widely applied in neuroscience and psychology; however, quantum reinforcement learning (QRL), which shows superior performance in computer simulations, has never been empirically tested on human decision-making. Moreover, all current successful quantum models for human cognition lack connections to neuroscience. Here we studied whether QRL can properly explain value-based decision-making. We compared 2 QRL and 12 CRL models by using behavioural and functional magnetic resonance imaging data from healthy and cigarette-smoking subjects performing the Iowa Gambling Task. In all groups, the QRL models performed well when compared with the best CRL models and further revealed the representation of quantum-like internal-state-related variables in the medial frontal gyrus in both healthy subjects and smokers, suggesting that value-based decision-making can be illustrated by QRL at both the behavioural and neural levels.
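#paper doi:10.1038/s41551-023-01114-1 Detection of cellular traction forces via the force-triggered Cas12a-mediated catalytic cleavage of a fluorogenic reporter strand. This paper describes a method that uses the CRISPR-associated protein (Cas) Cas12a to detect molecular force events at surface receptors of living cells. The approach: the activator is an ssDNA immobilized on a surface (such as a glass slide) and is concealed by hybridization to a complementary strand, which in turn is attached to a ligand peptide. When cells are seeded on the surface, their surface receptors bind the ligand and apply force; a force exceeding the mechanical tolerance of the duplex ruptures it and exposes the activator. The activated Cas12a then efficiently cleaves a fluorogenic ssDNA reporter. In the platelet force measurements used as a test case, the assay has these advantages: 1. it works on living cells; 2. each measurement needs only ~5 μl of blood or less, which makes high-throughput screening easier; 3. the readout correlates more strongly with bleeding risk; 4. it is faster (30 min), gives a more easily recognized signal, and may cost less. I am not very familiar with CRISPR-based detection, so corrections are welcome.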
Molecular forces generated by cell receptors are infrequent and transient, and hence difficult to detect. Here we report an assay that leverages the CRISPR-associated protein 12a (Cas12a) to amplify the detection of cellular traction forces generated by as few as 50 adherent cells. The assay involves the immobilization of a DNA duplex modified with a ligand specific for a cell receptor. Traction forces of tens of piconewtons trigger the dehybridization of the duplex, exposing a cryptic Cas12-activating strand that sets off the indiscriminate Cas12-mediated cleavage of a fluorogenic reporter strand. We used the assay to perform hundreds of force measurements using human platelets from a single blood draw to extract individualized dose-response curves and half-maximal inhibitory concentrations for a panel of antiplatelet drugs. For seven patients who had undergone cardiopulmonary bypass, platelet dysfunction strongly correlated with the need for platelet transfusion to limit bleeding. The Cas12a-mediated detection of cellular traction forces may be used to assess cell state, and to screen for genes, cell-adhesion ligands, drugs or metabolites that modulate cell mechanics.