MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations
Tang X, Tran A, Tan J, Gerstein M. MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations. Bioinformatics 2024, 40: i357-i368. PMID: 38940177, PMCID: PMC11256921, DOI: 10.1093/bioinformatics/btae260.
Peer-Reviewed Original Research
Concepts: Transformer encoder; Downstream tasks; Language model; Biomedical text; Self-supervised pre-training; Explicit 3D representation; Representation improves performance; Deep learning models; Representation of molecules; Contrastive learning; Supervisory signal; Extract embeddings; Representation capability; Joint representation; Biomedical domain; Pre-training; Textual data; Learning models; Molecular representations; Model weights; Jupyter Notebook; Step-by-step guidance; Encoding; Property prediction; Structural information

FAVOR-GPT: a generative natural language interface to whole genome variant functional annotations
Li T, Zhou H, Verma V, Tang X, Shao Y, Van Buren E, Weng Z, Gerstein M, Neale B, Sunyaev S, Lin X. FAVOR-GPT: a generative natural language interface to whole genome variant functional annotations. Bioinformatics Advances 2024, 4: vbae143. PMID: 39387060, PMCID: PMC11461909, DOI: 10.1093/bioadv/vbae143.
Peer-Reviewed Original Research
Concepts: Variant functional annotation; Functional annotation; Natural language interface; Functional annotation data; Disease-associated variants; Language interface; Whole genome; Functional prioritization; Genome; User prompts; Retrieval framework; Language model; Raw annotations; Annotated data; Annotation; Users; Retrieval; Online resources; Chatbot; Information interpretation; Usability; Variants; Database