| Year | 2018 |
| Category | Physics and Astronomy |
| Country/State | United States of America |
Automated Identification and Inference of Organic Molecular Structure and Relative Concentrations from Infrared Spectral Data
The discovery of complex organic molecules in space is critical to the understanding of the reaction pathways leading to biomolecules and the origins of life. Existing techniques for the analysis of astronomical spectra require knowledgeable researchers and often struggle to identify and differentiate complex spectral signatures, such as those of polycyclic aromatic hydrocarbons (PAHs). My project applies machine learning (convolutional neural networks) to the problem of identifying complex organic molecules in IR spectroscopic data and proposes a novel method for creating synthetic training data to tune models to specific astronomical environments.
My project created: a) models to identify organic molecules from empirical IR spectroscopic data when trained on their approximate theoretical counterparts from NASA’s PAHdb v2 and v3 databases, and b) models to identify molecular compositions from spectra of random mixtures of theoretical molecules with realistic noise.
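The synthetic-mixture idea in (b) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the function name `make_mixture`, the Dirichlet draw for relative concentrations, the number of components, and the Gaussian noise model scaled by a signal-to-noise ratio are all hypothetical choices, and the random `library` array merely stands in for theoretical spectra such as those in PAHdb.

```python
import numpy as np

def make_mixture(spectra, rng, n_components=5, snr=50.0):
    """Mix randomly chosen library spectra and add Gaussian noise.

    spectra: array of shape (n_molecules, n_bins), one spectrum per row.
    Returns (noisy_spectrum, weights); weights has shape (n_molecules,)
    and sums to 1, with zeros for molecules that were not chosen.
    All modelling choices here are assumptions made for illustration.
    """
    n_molecules, n_bins = spectra.shape
    # Pick which molecules take part in this mixture.
    chosen = rng.choice(n_molecules, size=n_components, replace=False)
    # Draw relative concentrations that are non-negative and sum to 1.
    weights = np.zeros(n_molecules)
    weights[chosen] = rng.dirichlet(np.ones(n_components))
    # A mixture spectrum is modelled as a weighted sum of its constituents.
    clean = weights @ spectra
    # Additive Gaussian noise with standard deviation set by the peak signal.
    sigma = clean.max() / snr
    noisy = clean + rng.normal(0.0, sigma, size=n_bins)
    return noisy, weights

rng = np.random.default_rng(0)
library = rng.random((20, 128))  # stand-in for theoretical spectra
x, w = make_mixture(library, rng)
```

Each call yields one training pair: the noisy spectrum `x` as the model input and the weight vector `w` as the target. Changing `snr` or the noise distribution is one way such data could be tuned to a particular observing environment.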
My principal findings are: a) network models trained on theoretical spectra can accurately identify empirical molecules with ~73% accuracy, and b) models trained on random mixtures of 3,139 theoretical PAH spectra can identify molecular concentrations with weight vector correlations of ~85% and can correctly identify the largest constituent ~67% of the time. In all cases, my models (the best being ResNet5 with ~200M parameters) dramatically outperform standard linear models.
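The two mixture metrics quoted above can be made concrete with a short sketch. The abstract does not define "weight vector correlation", so this example assumes it means the Pearson correlation between the true and predicted concentration vectors; the function names and the sample vectors are hypothetical.

```python
import numpy as np

def weight_correlation(w_true, w_pred):
    """Pearson correlation between true and predicted weight vectors
    (an assumed reading of the abstract's 'weight vector correlation')."""
    return float(np.corrcoef(w_true, w_pred)[0, 1])

def top_constituent_match(w_true, w_pred):
    """True when the predicted largest constituent is the actual one."""
    return bool(np.argmax(w_true) == np.argmax(w_pred))

# Hypothetical example: four candidate molecules in one mixture.
w_true = np.array([0.6, 0.3, 0.1, 0.0])
w_pred = np.array([0.5, 0.35, 0.1, 0.05])

r = weight_correlation(w_true, w_pred)
hit = top_constituent_match(w_true, w_pred)
```

Averaging `r` and `hit` over many held-out mixtures would give figures comparable in kind to the ~85% and ~67% reported, under the assumption stated above.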
My convolutional network models can recognize complex spectral patterns and generalize across datasets with realistic noise. These models can significantly increase the scale and efficiency of analyzing astronomical IR spectra data and improve our understanding of the distribution of complex organic molecules in the universe.
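The Intel International Science and Engineering Fair (ISEF), organized by the Society for Science and the Public and title-sponsored by Intel, is the world's largest and highest-level research and innovation competition for secondary school students. ISEF categories cover all fields of mathematics, natural science, and engineering, as well as some social sciences. Often called the "World Cup" of youth science, ISEF aims to encourage students to collaborate, innovate, and pursue sustained, focused, in-depth research on topics that interest them.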
Physics is the science of matter and energy and of interactions between the two. Astronomy is the study of anything in the universe beyond the Earth.
Atomic, Molecular, and Optical Physics (AMO): The study of atoms, simple molecules, electrons, light, and their interactions. Projects studying non-solid state lasers and masers also belong in this subcategory.
Astronomy and Cosmology (AST): The study of space, the universe as a whole, including its origins and evolution, the physical properties of objects in space and computational astronomy.
Biological Physics (BIP): The study of the physics of biological processes and systems.
Condensed Matter and Materials (MAT): The study of the properties of solids and liquids. Topics such as superconductivity, semi-conductors, complex fluids, and thin films are studied.
Mechanics (MEC): Classical physics and mechanics, including the macroscopic study of forces, vibrations and flows; on solid, liquid and gaseous materials. Projects studying aerodynamics or hydrodynamics also belong in this subcategory.
Nuclear and Particle Physics (NUC): The study of the physical properties of the atomic nucleus and of fundamental particles and the forces of their interaction. Projects developing particle detectors also belong in this subcategory.
Theoretical, Computational, and Quantum Physics (THE): The study of nature, phenomena and the laws of physics employing mathematical or computational methods rather than experimental processes.
Other (OTH): Studies that cannot be assigned to one of the above subcategories. If the project involves multiple subcategories, the principal subcategory should be chosen instead of Other.

