I. CENTER FOR INFORMATION BIOLOGY AND DNA DATA BANK OF JAPAN
I-b. Laboratory for Gene-Product Informatics - Ken Nishikawa Group
RESEARCH ACTIVITIES
(1)
Eigenvalue analysis of amino acid substitution
matrices reveals a sharp transition of the mode of
sequence conservation in proteins
Akira R. Kinjo and Ken Nishikawa
--The pattern of
amino acid substitutions and sequence conservation
over many structure-based alignments of protein
sequences was analyzed as a function of percentage
sequence identity. The statistics of the amino acid
substitutions were converted into the form of
log-odds amino acid substitution matrices to which
eigenvalue decomposition was applied. The dominant
component of the substitution matrices exhibited a
sharp transition at a sequence identity of 30-35%,
which coincides with the twilight zone. Above the
transition point, the dominant component is related
to the mutability of amino acids and disfavors any
substitution, whereas below the transition point it
is related to the hydrophobicity of amino acids and
favors substitutions between residues of similar
hydrophobic character. Implications for protein
evolution and sequence analysis are discussed. See
Ref. 1 for details.
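
The analysis described above (aligned-pair statistics, log-odds matrix, eigenvalue decomposition) can be sketched on toy data as follows. This is a minimal illustration, not the procedure of Ref. 1: the 4-letter alphabet, the count tables, the function names, and the reading of "dominant" as "largest absolute eigenvalue" are all assumptions made for the example, and the Jacobi routine simply stands in for any symmetric eigensolver.

```python
import math


def log_odds_matrix(pair_counts):
    """Return S[i][j] = ln(q_ij / (p_i * p_j)) from a table of aligned-pair counts.

    Assumes every count is positive (real data would need pseudo-counts).
    """
    n = len(pair_counts)
    # Symmetrize: an aligned pair (i, j) carries no direction.
    sym = [[(pair_counts[i][j] + pair_counts[j][i]) / 2.0 for j in range(n)]
           for i in range(n)]
    total = sum(sum(row) for row in sym)
    q = [[value / total for value in row] for row in sym]  # joint frequencies
    p = [sum(row) for row in q]                            # background frequencies
    return [[math.log(q[i][j] / (p[i] * p[j])) for j in range(n)]
            for i in range(n)]


def jacobi_eigen(matrix, max_sweeps=100, tol=1e-20):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix,
    computed with cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes a[p][q], kept within [-pi/4, pi/4].
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def dominant_component(matrix):
    """Eigenvalue of largest magnitude and its eigenvector."""
    values, vectors = jacobi_eigen(matrix)
    k = max(range(len(values)), key=lambda i: abs(values[i]))
    return values[k], [row[k] for row in vectors]


# Invented counts: letters 0 and 1 play the "hydrophobic" role, 2 and 3 the "polar" role.
HIGH_IDENTITY_COUNTS = [[90, 4, 3, 3], [4, 90, 3, 3], [3, 3, 90, 4], [3, 3, 4, 90]]
LOW_IDENTITY_COUNTS = [[30, 25, 5, 5], [25, 30, 5, 5], [5, 5, 30, 25], [5, 5, 25, 30]]

for label, counts in (("high", HIGH_IDENTITY_COUNTS), ("low", LOW_IDENTITY_COUNTS)):
    eigenvalue, eigenvector = dominant_component(log_odds_matrix(counts))
    print(label, round(eigenvalue, 3), [round(x, 3) for x in eigenvector])
```

The two count tables were chosen so that the dominant eigenvector changes character between them (uniform in one, separating the two letter groups in the other). They only illustrate how such a change would show up in the decomposition and say nothing about real proteins.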
(2)
Estimation of the number of authentic orphan genes
in bacterial genomes
Satoshi Fukuchi and Ken Nishikawa
--Genome annotation
produces a considerable number of putative proteins
lacking sequence similarity to known proteins.
These are referred to as “orphans”. The proportion
of orphan genes varies among genomes, and is
independent of genome size. In the present study,
we show that the proportion of orphan genes roughly
correlates with the isolation index of organisms
(IIO), an indicator introduced in the present
study, which represents the degree of isolation of
a given genome as measured by sequence similarity.
However, some genomes are outliers from this linear
correlation; these are genomes that may contain an
excess of orphan genes.
Comparisons of genome sequences among closely
related strains revealed that some of the annotated
genes are not conserved, suggesting that they are
ORFs occurring by chance. Exclusion of these
non-conserved ORFs within closely related genomes
improved the correlation between the proportion of
orphan genes and the IIO values. Assuming that the
correlation holds in general, this relationship was
used to estimate the number of “authentic” orphan
genes in a genome. Using this definition of
authentic orphan genes, the anomalies arising from
over-assignments, e.g., the percentages of
structural annotations, were corrected for 16
genomes, including those of five archaea. See Ref.
2 for details.
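
The estimation step described above can be sketched as a least-squares fit followed by a prediction. This is only an illustration under stated assumptions: the IIO values, orphan fractions, function names, and the rule of capping the estimate at the annotated orphan count are hypothetical and are not taken from Ref. 2.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_authentic_orphans(iio, n_genes, n_annotated_orphans, slope, intercept):
    """Orphan count expected from the fitted line, never exceeding the
    number of orphans actually annotated (an assumption of this sketch)."""
    expected_fraction = min(1.0, max(0.0, slope * iio + intercept))
    expected_count = round(expected_fraction * n_genes)
    return min(n_annotated_orphans, expected_count)


# Made-up reference genomes: (IIO, orphan fraction) after removing non-conserved ORFs.
reference_iio = [0.1, 0.2, 0.3, 0.4]
reference_orphan_fraction = [0.10, 0.15, 0.20, 0.25]

slope, intercept = fit_line(reference_iio, reference_orphan_fraction)

# A hypothetical genome annotated with 500 orphans out of 2000 genes.
print(estimate_authentic_orphans(0.2, 2000, 500, slope, intercept))
```

Annotated orphans beyond the estimate would, in this reading, be candidates for ORFs occurring by chance.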
(3)
Alternative splice variants encoding unstable
protein domains exist in the human
brain
Keiichi Homma, Reiko F. Kikuno, Takahiro Nagase,
Osamu Ohara and Ken Nishikawa
--Alternative
splicing has been recognized as a major mechanism
by which protein diversity is increased without
significantly increasing genome size in animals and
has crucial medical implications, as many
alternative splice variants are known to cause
diseases. Assessing the significance of alternative
splicing requires knowing what structural changes it
introduces into the encoded proteins, yet this
problem has not been adequately explored.
Therefore, we systematically
examined the structures of the proteins encoded by
the alternative splice variants in the HUGE protein
database derived from long (>4 kb) human brain
cDNAs. Limiting our analyses to reliable
alternative splice junctions, we found alternative
splice junctions to have a slight tendency to avoid
the interior of SCOP domains and a strong,
statistically significant tendency to coincide with
SCOP domain boundaries. These findings reflect the
occurrence of some alternative splicing events that
utilize protein structural units as a cassette.
However, 50 cases were identified in which SCOP
domains are disrupted in the middle by alternative
splicing. In six of the cases, insertions are
introduced at the molecular surface, presumably
affecting protein functions, while in 11 of the
cases alternatively spliced variants were found to
encode pairs of stable and unstable proteins. The
mRNAs encoding such unstable proteins are much less
abundant than those encoding stable proteins and
tend not to have corresponding mRNAs in non-primate
species. We propose that most unstable proteins
encoded by alternative splice variants lack normal
functions and are an evolutionary dead-end. See
Ref. 3 for details.
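
The positional classification underlying the analysis above (junction at a domain boundary, inside a domain, or outside any domain) can be sketched as follows. The residue coordinates, the tolerance of 5 residues, and the category names are hypothetical choices for this example and are not the criteria used in Ref. 3.

```python
def classify_junction(position, domains, tolerance=5):
    """Classify a splice junction, given as a residue position, against
    domain intervals given as (start, end) pairs.

    Returns "boundary" if the junction lies within `tolerance` residues of
    any domain start or end, "interior" if it falls inside a domain away
    from its ends, and "linker" otherwise.
    """
    for start, end in domains:
        if abs(position - start) <= tolerance or abs(position - end) <= tolerance:
            return "boundary"
    for start, end in domains:
        if start < position < end:
            return "interior"
    return "linker"


# Hypothetical protein with two assigned domains.
domains = [(10, 120), (150, 300)]
for junction in (12, 60, 135, 148):
    print(junction, classify_junction(junction, domains))
```

Counting junctions per category and comparing the counts with those expected for randomly placed junctions would then indicate whether boundaries are preferred; that statistical test is left out of the sketch.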
(4)
Construction and characterization of chimeric
proteins composed of type-1 and type-2 periplasmic
binding proteins MglB and ArgT
Kenji Kashiwagi, Kaoru Fukami-Kobayashi,
Kiyotaka Shiba and Ken Nishikawa
--MglB and ArgT, type-1 and type-2 periplasmic
binding proteins (PBPs), respectively, are believed
to have evolved from a common ancestor into siblings
that differ topologically in their main-chain
connectivity. At first glance their structures look
similar, but closer examination reveals that the
chain connectivity of ArgT is more convoluted than
that of MglB. Reflecting that complexity, the
folding of ArgT is complicated and involves folding
intermediates, whereas the folding of MglB is a
simple two-state transition.
In the present study, we constructed and
characterized several chimeras made up of various
subdomains of MglB and ArgT with the aim of gaining
insight into the evolution of protein folding and
protein structure. Although these chimeras did not
fold as compactly as their parental proteins, some
did exhibit cooperative folding, which suggests
that novel proteins with new connectivity and new
folding pathways could have emerged at a fairly
high rate throughout the evolution of proteins. See
Ref. 4 for details.
PUBLICATIONS
Papers
1. Kinjo, A.R. and Nishikawa, K. (2004).
Eigenvalue analysis of amino acid substitution
matrices reveals a sharp transition of the mode of
sequence conservation in proteins.
Bioinformatics, 20 (16), 2504-2508.
2. Fukuchi, S. and Nishikawa, K. (2004). Estimation
of the number of authentic orphan genes in
bacterial genomes. DNA Res., 11,
219-231.
3. Homma, K., Kikuno, R.F., Nagase, T., Ohara, O.
and Nishikawa, K. (2004). Alternative splice
variants encoding unstable protein domains exist in
the human brain. J. Mol. Biol., 343,
1207-1220.
4. Kashiwagi, K., Fukami-Kobayashi, K., Shiba, K.
and Nishikawa, K. (2004). Construction and
characterization of chimeric proteins composed of
type-1 and type-2 periplasmic binding proteins MglB
and ArgT. Biosci. Biotechnol. Biochem., 68
(4), 808-813.
5. Imanishi, T., Itoh, T., Suzuki, Y., O'Donovan,
C., Fukuchi, S. et al. (2004). Integrative
annotation of 21,037 human genes validated by
full-length cDNA clones. PLoS Biology, 2
(6), 856-875.
6. Kinjo, A.R., Horimoto, K. and Nishikawa, K.
(2005). Predicting absolute contact numbers of
native protein structure from amino acid sequence.
Proteins, 58, 158-165.
7. Kinjo, A.R. and Nishikawa, K. Recoverable
one-dimensional encoding of protein
three-dimensional structures.
Bioinformatics, in press.
Reviews
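8. Yoshimune, K., Fukuchi, S., Moriguchi, M. and
Nishikawa, K. (2004). Environmental adaptation
strategies of extremophilic microorganisms as seen
from their proteins. Bioscience & Industry, 62,
17-22 (in Japanese).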
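9. Fukuchi, S. and Nishikawa, K. (2004). Programs
and databases for protein structure analysis. In:
Manual for the Use of High-Performance Instruments
and New Technologies in Bioscience (Ohara, O. et
al., eds., Kyoritsu Shuppan), Tanpakushitsu Kakusan
Koso (Protein, Nucleic Acid and Enzyme), special
issue, 49 (11), 1944-1948 (in Japanese).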
Database
GTOP (database of protein three-dimensional
structures in genomes):
http://spock.genes.nig.ac.jp/~genome/gtop.html
PMD (database of mutant proteins):
http://pmd.ddbj.nig.ac.jp/~pmd/pmd.html
TTDB (database of prokaryotic transcription
factors):
http://spock.genes.nig.ac.jp/~ttdb/
ORAL PRESENTATIONS
1. Kinjo, A.R.: Competition between protein
folding and aggregation inside the cell: Studies by
density functional theory (invited talk). NMRS 2004
Symposium on NMR, Drug Design, and Bioinformatics,
Saha Institute of Nuclear Physics, Kolkata, India,
Feb. 2004.
2. Nishikawa, K.: Genome-wide compositional changes
of DNA and proteins in thermophilic bacteria for
adaptation to higher temperatures. The 1st
Pacific-Rim International Conference on Protein
Science, Yokohama, Apr. 2004.
3. Nishikawa, K.: A study of comparative genomics
based on domain structures of proteins. Satellite
Symposium of PRICPS2004, Yokohama, Apr. 2004.
POSTER PRESENTATIONS
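1. Fukuchi, S., Fukami-Kobayashi, K., Homma, K.,
Ota, M. and Nishikawa, K.: Inserted amino acid
sequences found through analysis of H-Invitational
human cDNAs. The 27th Annual Meeting of the
Molecular Biology Society of Japan, Kobe, Dec. 2004.
2. Homma, K., Kikuno, R.F., Nagase, T., Ohara, O.
and Nishikawa, K.: Some alternative splice variants
encode unstable proteins. The 27th Annual Meeting of
the Molecular Biology Society of Japan, Kobe, Dec.
2004.
3. Nagashima, T., Mitsui, T., Kinjo, A.R. and
Nishikawa, K.: Exploring the conformational space of
proteins using Wang-Landau MD. The 42nd Annual
Meeting of the Biophysical Society of Japan, Kyoto,
Dec. 2004.
4. Kinjo, A.R. and Nishikawa, K.: Reproducing the
native structures of proteins from one-dimensional
information. The 42nd Annual Meeting of the
Biophysical Society of Japan, Kyoto, Dec. 2004.
5. Minezaki, Y. and Nishikawa, K.: Comprehensive
identification of transcription factors by an
automated classification method and comparative
genome analysis. The 42nd Annual Meeting of the
Biophysical Society of Japan, Kyoto, Dec. 2004.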
EDUCATION
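1. Fukuchi, S.: The 9th DDBJing training course,
Tokyo, Mar.
2. Nishikawa, K., Fukuchi, S. and Kinjo, A.R.:
Genome Literacy Course organized by the Japan
Science and Technology Corporation, "Protein
three-dimensional structure prediction using
databases", Tokyo, Jul.
3. Nishikawa, K.: Seminar at the Department of
Bioinformatics, College of Science and Engineering,
Ritsumeikan University, Nov.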