SGDB: a database of synthetic genes re-designed for optimizing protein over-expression





Citation of Original Publication

Gang Wu and others, SGDB: a database of synthetic genes re-designed for optimizing protein over-expression, Nucleic Acids Research, Volume 35, Issue suppl_1, 1 January 2007, Pages D76–D79.


This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Attribution-NonCommercial 2.0 UK: England & Wales (CC BY-NC 2.0 UK)



Here we present the Synthetic Gene Database (SGDB): a relational database that houses sequences and associated experimental information on synthetic (artificially engineered) genes from all peer-reviewed studies published to date. At present, the database comprises information from more than 200 published experiments. This resource not only provides reference material to guide experimentalists in designing new genes that improve protein expression, but also offers a dataset for analysis by bioinformaticians who seek to test ideas regarding the underlying factors that influence gene expression. The SGDB was built using the MySQL database management system. We also offer an XML schema for standardized data description of synthetic genes. Users can access the database online, or batch download all information as XML files. Moreover, users may visually compare the coding sequences of a synthetic gene and its natural counterpart with an integrated web tool, and discuss questions, findings and related information on an associated e-forum.
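As an illustration only, the kind of codon-level comparison between a synthetic gene and its natural counterpart mentioned above might be sketched in Python as follows. This is not the SGDB web tool or its implementation: the function names (split_codons, compare_codons) and the two toy sequences are hypothetical, and the sketch assumes both coding sequences are in frame and encode the same number of codons.

```python
def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons (assumes it starts in frame)."""
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA alphabets
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def compare_codons(natural: str, synthetic: str) -> list[tuple[int, str, str]]:
    """Return (1-based codon position, natural codon, synthetic codon)
    for every position where the two sequences use a different codon."""
    nat = split_codons(natural)
    syn = split_codons(synthetic)
    if len(nat) != len(syn):
        raise ValueError("sequences must contain the same number of codons")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(nat, syn), start=1)
        if a != b
    ]


# Hypothetical toy sequences (not taken from SGDB); both encode Met-Leu-Arg-Gly-stop.
natural_cds = "ATGCTTAGAGGATAA"
synthetic_cds = "ATGCTGCGTGGTTAA"

differences = compare_codons(natural_cds, synthetic_cds)
for pos, nat_codon, syn_codon in differences:
    print(f"codon {pos}: {nat_codon} -> {syn_codon}")
```

Reporting differences per codon rather than per nucleotide keeps the output aligned with the unit that gene re-design typically changes, namely synonymous codon choice.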