Corteva Agriscience™ Challenge: Methods for Determining Similar Sequences Across Genomes


Individuals of a species differ from one another at the genetic level to various degrees. These differences represent different genotypes, or genetic constitutions, within a species. To better understand the genetic content of each individual genome, it is important to understand similarities and differences of gene sequences and their sub-components when compared across genomes. Therefore, the Seeker is looking for a methodology to accurately identify similar gene sequences across genomes from individuals of a single species.

To characterize the genetic content of each individual genome in depth, it is important to understand which sequences of common ancestry have been inherited, possibly in a modified form, across the genomes. Existing knowledge about a gene variant from a well-characterized genome can be applied to better understand other variants, or alleles, of the same gene in different, uncharacterized genomes. Knowing which sequences represent the same genes in different individuals is necessary to understand the impact of any similarities or differences in the gene sequences of individuals from different genetic backgrounds.

The difficulty lies in determining which gene-derived sequences in the genomes are allelic. Transcription of a gene may produce many alternative transcripts that differ in sequence composition, so finding the best mapping between the transcripts of different genomes is difficult and time-consuming. Current methods rely on a combination of common software and proprietary techniques, but the reliability and accuracy of their results could be improved. Therefore, the Seeker is interested in a better methodology, based on new algorithms and/or a well-chosen combination of existing software, that can relate the transcript sets of two genotypes within a species quickly and accurately and identify the allelic relationships.
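To make the task concrete, the following is a minimal baseline sketch, not the Seeker's current method and not a proposed Solution. It assumes each genotype is given as a mapping from a gene identifier to its list of transcript sequences, scores two genes by the best k-mer Jaccard similarity over all of their transcript pairs, and reports reciprocal best hits as candidate allelic pairs. The function names, the toy sequences, and the choice of k = 5 are all illustrative assumptions.

```python
from itertools import product


def kmers(seq, k=5):
    """Return the set of k-mers in a nucleotide sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def gene_similarity(transcripts_a, transcripts_b, k=5):
    """Best k-mer Jaccard score over all transcript pairs of two genes."""
    return max(
        (jaccard(kmers(ta, k), kmers(tb, k))
         for ta, tb in product(transcripts_a, transcripts_b)),
        default=0.0,
    )


def best_hits(query, target, k=5):
    """Map each query gene to its highest-scoring target gene, if any."""
    hits = {}
    for q_gene, q_tx in query.items():
        scored = [(gene_similarity(q_tx, t_tx, k), t_gene)
                  for t_gene, t_tx in target.items()]
        if scored:
            score, t_gene = max(scored)
            if score > 0:  # ignore genes with no shared k-mers at all
                hits[q_gene] = t_gene
    return hits


def reciprocal_best_hits(genome_a, genome_b, k=5):
    """Gene pairs that are each other's best hit in both directions."""
    a_to_b = best_hits(genome_a, genome_b, k)
    b_to_a = best_hits(genome_b, genome_a, k)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


# Hypothetical toy data: two genotypes, two genes each, with
# alternative transcripts listed per gene.
genome_a = {
    "geneA1": ["ATGGCGTACGTTAGCCTAGGA", "ATGGCGTACGTTAGC"],
    "geneA2": ["TTGACCGGTATACCGGATTCA"],
}
genome_b = {
    "geneB1": ["ATGGCGTACGTAAGCCTAGGA"],
    "geneB2": ["TTGACCGGTATACCGGATTCA", "GGTATACCGGATTCA"],
}

pairs = reciprocal_best_hits(genome_a, genome_b)
print(pairs)
```

A baseline like this compares every gene against every other gene, so it scales quadratically, and it ignores paralogs, gene presence/absence variation, and genomic position. Those are the kinds of limitations a submitted methodology would be expected to address.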

The submitted proposal should include the following:

  1. A detailed description of the proposed Solution addressing the specific Technical Requirements presented in the Detailed Description of the Challenge. This should include a thorough description of the method used in the Solution, accompanied by a well-articulated rationale for the software or programs employed and/or the algorithms developed.
  2. Output from the proposed method applied to the test sets presented in the Challenge, in the required format described in DATA-Expected Output-Format.
  3. Upon request, a software/algorithm/package including source code and executable(s), with sufficient documentation to enable the Seeker to compile and execute the algorithm and to validate the method using additional validation data sets.

The Challenge award is contingent upon theoretical evaluation of the method/algorithm by the Seeker, and validation by the Seeker of the submitted software/algorithm/package.

To receive an award, the Solvers will not have to transfer their exclusive IP rights to the Seeker. Instead, Solvers will grant to the Seeker a non-exclusive license to practice their solutions.

Award: $50,000

Deadline: 31 December 2019
