NAGRP Bioinformatics Arm Seeks Input from Livestock Genomics Community to Set
Future Priorities
Feb 05, 2015 | Uduak Grace Thomas
NEW YORK (GenomeWeb) - The bioinformatics coordination program
of the US Department of Agriculture's National Animal Genome Research Program
is soliciting feedback from the livestock genomics community to help chart its
future course.
At the recently concluded Plant and Animal Genome conference in San Diego,
James Reecy, an animal science professor at Iowa State University and
co-coordinator of the program, provided an update on his group's activities
and asked livestock genomics players for their help in prioritizing future
projects.
More specifically, Reecy and his colleagues are currently accepting feedback
about perceived gaps in the current informatics pipeline and how addressing
these gaps should be prioritized, he told GenomeWeb this week. Based on
submitted suggestions, the team will put together a survey and send it out to
the community hopefully by the end of the month, pending approval by the
Institutional Review Board at Iowa State, he said.
The bioinformatics coordination program operates under the umbrella of the
NAGRP's National Research Support Project, NRSP-8, which launched in 1993 to
initially coordinate US genome mapping efforts in cattle, sheep, swine, and
poultry, with horses and aquatic species added more recently. NAGRP, which is
part of the USDA's National Institute of Food and Agriculture, is one of two
programs that were established by the 1990 Farm Bill as a result of the
recognition of the potential of agricultural genomics.
The NRSP-8 was set up, among other reasons, to facilitate communication among
various interest groups, maintain genomic maps, and establish databases for
sharing information among various stakeholders. It supports the activities of
several collaborative research projects including one aimed at using genetic
and functional genomic approaches to improve pork production and quality; and
one that explores gene function as it pertains to immune response in poultry.
For its part, the bioinformatics coordination program seeks to help the
animal genomics community make use of available informatics infrastructure as
well as effectively share, manage, and analyze information gleaned from
genomics studies. Its largest resource to date, according to Reecy, is the
Animal Quantitative Trait Loci (QTL) database, which was set up to aggregate
publicly available trait mapping data, candidate genes and association data,
and copy number variations that have been mapped to livestock genomes. It
currently has QTL information from cattle, chicken, horse, pig, rainbow
trout, and sheep and is working to crosslink this information with data in
QTL resources for human, mouse, and rat, as well.
The bioinformatics group also works to provide computational resources that
members of the community can use to analyze their data. The list of
applications covers tools for tasks such as sequence searching, associating
genes with their unique GO terms, genetic linkage analysis of diploid
species, designing gene-specific primers based on known gene structure and
EST sequence information, gene ontology enrichment analysis, and
more. Also available is a genome browser with annotated tracks for multiple
fish species, cattle, chicken, horse, pig, sheep, and oyster. Active projects
for the group include the development of an ontology for animal phenotypes
and a clinical measurement ontology that's intended to standardize
morphological and physiological measurement records generated from clinical
and model organism research.
Furthermore, the researchers are working with members of the iPlant
Collaborative on a variant calling pipeline that will reside and run on the
iPlant infrastructure, allowing members of the livestock community to take
advantage of the compute resources and storage available on that system,
Reecy said. They've also developed resources that allow members of the
community to share data such as variant call files, he said.
As much as possible, the group tries to use existing bioinformatics
solutions, such as BWA-MEM, Platypus, SAMtools, and the Genome Analysis
Toolkit in its pipelines, Reecy said, optimizing them to work more
efficiently with livestock reference assemblies, many of which he said are
not as "pristine" as the human reference, for example.
Lower-quality assemblies make tasks such as sequence alignment and variant
calling extremely time consuming. For example, simply running GATK's
integrative genotyper out of the box on some animal genomes requires some 48
hours to align sequences and call variants — roughly 24 hours per task, Reecy
said. To shorten the time to results, the NRSP-8 group worked with members of
the iPlant consortium to parallelize the analyses on the iPlant
infrastructure and was able to cut alignment and variant calling time
requirements down to around six hours. There are also complementary efforts
by other groups within NAGRP to improve existing livestock reference
assemblies and to develop SNP chips for genotyping the different species, he
added.
Through its survey, the NRSP-8 bioinformatics group hopes to gain some
perspective about what its immediate next steps should be, although it has
some intuition about the most pressing needs. "One of the biggest gaps right
now is in terms of raw data and the systems biology integration of that data
... [to] get from genotype to phenotype," Reecy said. Basically, "the
equivalent of the ENCODE data but for livestock species." To that end, the
NRSP-8 is involved in an international initiative called the Functional
Annotation of Animal Genomes (FAANG) consortium, which has taken up the task
of identifying all functional elements in multiple domesticated species.
Having access to this functional data and being able to combine it with
existing genomic information will make it possible to run more kinds of
evolutionary analyses than are currently possible with existing resources,
Reecy said. It would also improve researchers' ability to predict early on
which animals are, for example, genetically predisposed to be healthier than
others or will produce more nutritious meat and milk products, he added.
"Our guess is that most of [the survey responses are] going to be headed
towards the systems biology integration of the different omics platforms in
order to better explain the variation of the phenotype that's there," he
said, adding that the group still wants to do the survey "to make sure that
we are not missing something obvious."
(from https://www.genomeweb.com/informatics/nagrp-bioinformatics-arm-seeks-input)