Combining Knowledge and Data in Symbolic Regression

dc.contributor.advisor: Josephson, Tyler R
dc.contributor.author: Fox, Charles Elliott
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2023-04-05T14:17:32Z
dc.date.available: 2023-04-05T14:17:32Z
dc.date.issued: 2022-01-01
dc.description.abstract: Symbolic regression (SR) is a machine learning tool that aims to generate models that fit data, by constructing equations from variables of interest, constants, and mathematical operators. SR has inspired applications for data-driven discovery of scientific laws and equation-based physical models. When traditional SR algorithms generate models, they attempt to achieve accuracy to data and short equation length, but they do not relate models to background knowledge. This work explores the augmentation of SR methods by incorporating domain knowledge (in the form of symbolic constraints on the equations) to guide the search through equation space. Specifically, we apply this to the chemistry problem of adsorption (when a gas sticks to a material), whose governing equations must satisfy certain thermodynamic constraints in the form of limiting behavior. This work explores how Bayesian SR and genetic algorithm-based SR can be augmented with these constraints to aid in the search. We use a computer algebra system to check constraint satisfaction for each generated expression, and we find this helps both SR algorithms generate more accurate, concise, and constraint-consistent models.
dc.format: application:pdf
dc.genre: theses
dc.identifier: doi:10.13016/m2s3je-zf7h
dc.identifier.other: 12611
dc.identifier.uri: http://hdl.handle.net/11603/27364
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.source: Original File Name: Fox_umbc_0434M_12611.pdf
dc.subject: Adsorption
dc.subject: Genetic Algorithms
dc.subject: Machine Learning
dc.subject: Markov Chain Monte Carlo
dc.subject: Symbolic Regression
dc.title: Combining Knowledge and Data in Symbolic Regression
dc.type: Text
dcterms.accessRights: Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
dcterms.accessRights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
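
Illustrative note on the abstract: the abstract says a computer algebra system is used to check whether each generated expression satisfies thermodynamic constraints expressed as limiting behavior. The sketch below shows what such a check could look like. It is not taken from the thesis. The choice of SymPy, the function name `satisfies_limits`, the two constraints tested (zero loading at zero pressure, and a finite positive slope at zero pressure), and the example expressions are all assumptions made for illustration.

```python
import sympy as sp

# Pressure, assumed strictly positive (hypothetical symbol name).
p = sp.Symbol("p", positive=True)


def satisfies_limits(expr, p):
    """Check two assumed limiting-behavior constraints on a loading q(p).

    1. q(p) -> 0 as p -> 0+            (nothing adsorbs at zero pressure)
    2. dq/dp at p -> 0+ is finite, > 0 (a linear low-pressure regime)
    These constraints are illustrative, not quoted from the thesis.
    """
    at_zero = sp.limit(expr, p, 0, "+")
    slope_at_zero = sp.limit(sp.diff(expr, p), p, 0, "+")
    zero_loading = at_zero == 0
    finite_positive_slope = slope_at_zero.is_finite and slope_at_zero.is_positive
    # is_finite / is_positive may be None when SymPy cannot decide;
    # bool() treats that as "constraint not shown to hold".
    return bool(zero_loading and finite_positive_slope)


# Candidate expressions such as an SR search might propose (made-up constants).
langmuir_like = 6 * p / (1 + 3 * p)      # saturating form
power_law = p ** sp.Rational(1, 2)       # slope diverges as p -> 0+
offset_linear = 1 + p                    # nonzero loading at p = 0

results = {
    "langmuir_like": satisfies_limits(langmuir_like, p),
    "power_law": satisfies_limits(power_law, p),
    "offset_linear": satisfies_limits(offset_linear, p),
}
```

In a search loop, a boolean like this could be used to reject or penalize candidates, although the abstract does not say which of those strategies the thesis adopts.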

Files

Original bundle
Name: Fox_umbc_0434M_12611.pdf
Size: 3.65 MB
Format: Adobe Portable Document Format