Kākāpō125+ gene sequencing
Introduction
We’re sequencing the genomes of all living kākāpō to assist conservation. Apply to use the data in your own research.
Kākāpō125+ aims to sequence the genomes of all living kākāpō plus some important recently deceased individuals. The name refers to the 125 kākāpō living in 2015, when the project started.
The overarching aim of the project is to improve genetic management of kākāpō, particularly to address the key issues hampering recovery: infertility and disease.
The kākāpō is a unique, critically endangered parrot found only in New Zealand. It lives in the wild on predator-free islands. Intensive conservation management has recovered the population from a low of 51 individuals in 1995 to about 150 in 2018.
Conservation efforts include genetic management using relatedness scores from microsatellite markers. These are used to guide translocations, artificial insemination attempts and prioritisation of individuals.
Kākāpō125+ was established by Kākāpō Recovery and the Genetic Rescue Foundation following the initial sequencing of a reference kākāpō genome in 2015 at Duke University and Pacific Biosciences. The full chromosome-level assembly of the reference genome (of individual ‘Jane’) was completed and released on 13 September 2018 (GenBank accession No. TBC).
Kākāpō125+ is a collaboration of the following parties: DOC, Genetic Rescue Foundation, Ngāi Tahu, Science Exchange, Experiment.com, Otago University, Duke University, Rockefeller University and Genomics Aotearoa.
The project has been funded by a combination of private and public contributions, with all fundraising coordinated by the Genetic Rescue Foundation.
- The kākāpō reference genome (from individual ‘Jane’) is available via the VGP project.
- The remaining 171 population genomes have been generated by Kākāpō125+. See Table 1 below.
- Data are available as raw and mapped genomes.
- The data are the property of the New Zealand Department of Conservation, fulfilling its commitment to New Zealand Māori through the New Zealand Conservation Act 1987.
| Date generated | Host project | No. of genomes | Genome sequence numbers | Yield (mean and range; Gb per individual) | Sequencing technology | Location |
| --- | --- | --- | --- | --- | --- | --- |
| 2015 - 2018 | Genome 10K | 1 | Reference ('Jane') | | PacBio, BioNano, Arima Hi-C, 10X Genomics | Duke University & Rockefeller University |
| May 2016 | Kākāpō125+ | 39 | 1-39 | 19.1 (15.0-22.8) | HiSeq2500, TruSeq DNA Nano 2x125 bases PE | NZGL, Dunedin, New Zealand |
| Apr 2017 | Kākāpō125+ | 42 | 40-81 | 16.1 (13.1-21.7) | HiSeq2500, TruSeq DNA Nano 2x125 bases PE | NZGL |
| May 2018 | Kākāpō125+ | 88 | 82-169 | 30.8 (19.3-53.4) | HiSeqX, 2x150 bases PE v2.0. 10 | Genome.One, Sydney, Australia |
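As a quick consistency check on the table, the Python sketch below tallies the three Kākāpō125+ batches and derives an overall mean yield per individual. The figures are transcribed from the table rows. The function name `summarise` and the tuple layout are illustrative choices, not part of the project's pipelines. The calculation assumes each batch mean is a simple per-individual average, so weighting by batch size recovers the overall mean.

```python
# Batch figures transcribed from the table (population genomes only; the
# reference genome 'Jane' has no yield listed and is left out).
# Layout: (date, number of genomes, first ID, last ID, mean yield in Gb)
BATCHES = [
    ("May 2016", 39, 1, 39, 19.1),
    ("Apr 2017", 42, 40, 81, 16.1),
    ("May 2018", 88, 82, 169, 30.8),
]


def summarise(rows):
    """Return (total genomes, ID ranges match counts?, weighted mean yield)."""
    total = sum(n for _, n, _, _, _ in rows)
    # Each ID range should contain exactly as many IDs as genomes reported.
    ids_consistent = all(last - first + 1 == n for _, n, first, last, _ in rows)
    # Weight each batch mean by its number of individuals.
    weighted_mean = sum(n * y for _, n, _, _, y in rows) / total
    return total, ids_consistent, weighted_mean


if __name__ == "__main__":
    total, ids_ok, mean_yield = summarise(BATCHES)
    print(f"Population genomes listed: {total}")
    print(f"ID ranges match batch sizes: {ids_ok}")
    print(f"Mean yield per individual: {mean_yield:.1f} Gb")
```

The three batches listed in the table account for genomes 1 to 169, and the weighted mean comes to roughly 24.4 Gb per individual.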
For the reference genome standards see VGP Technology.
The Kākāpō125+ genome assembly will follow assembly standards developed by Genomics Aotearoa. These standards aim for a high-quality chromosome-level genome and prioritise high-quality contigs from resequenced birds, to facilitate the examination of structural variants and differences in genome content. Combining high-quality contigs with a high-quality chromosome-level assembly will allow the creation of both a pan-reference assembly and an estimated core-reference assembly for the entire species. A README file detailing the creation, use and statistics for each assembly will be made available.
Genome annotation standards
Genomes published by Kākāpō125+ will be annotated with genes, SNPs, indels, repetitive elements, structural variants (including copy-number variations and presence/absence variation), computed core and pan-genomes, assembly statistics, signatures of selection and other population statistics, and the estimated genic effects of SNPs and indels. Annotations will be supported by transcriptomics of a limited set of tissues. A README file detailing the creation, use and statistics of each analysis will be made available.
All scripts and pipelines used in the analysis of kākāpō data by the Kākāpō125+ consortium, led by Genomics Aotearoa, will be available on GitHub.
The primary aim of Kākāpō125+ is to improve kākāpō conservation, as guided by Kākāpō Recovery at DOC. The following analyses are priorities for kākāpō conservation:
- Genetic management: relatedness, inheritance and viability of offspring produced by artificial insemination.
- Disease: genetic signatures for cloacitis, disease resistance, dwarfism and vitamin D.
- Fertility: genetic basis for sperm quality, clutch size, incubation success, hatching failures and embryo deaths.
- Ageing: estimating the age of kākāpō using DNA methylation.
The Kākāpō125+ consortium led by Genomics Aotearoa will produce global analyses of the dataset covering the following topics:
- Annotation of the genomes using automated tools, and by hand for genes of interest in immunity, inbreeding, infertility, gigantism, flightlessness, and
- Comparisons of the genomes of kākāpō with closely related species (eg kea and kākā) to identify genomic changes associated with key adaptations in kākāpō.
- Comparisons of the genomic signatures of decline among several critically endangered species (eg kākāpō, black stilt/kakī, shore plover/tūturuatu, orange-fronted parakeet/kākāriki karaka, New Zealand fairy tern/tara iti).
- Develop measures of inbreeding and inbreeding depression for kākāpō.
- Investigate Mendelian and non-Mendelian inheritance of kākāpō traits.
- Identify markers of particular kākāpō lineages, and markers of sex.
- Examine the de novo mutation rate and estimate lethal equivalents and recombination rates.
- Identify significant copy number variations and structural variation in the data set.
- Carry out an association analysis to identify genetic variation associated with key traits.
- Determine population structure and carry out genome-wide assessments of heritability for key traits, both disease-related and physiology-related.
How to access the data
Kākāpō125+ population genome data will be publicly available in late 2018 once sequencing and mapping of all genomes is complete.
A key principle of Kākāpō125+ is that the data are made publicly available for non-commercial and appropriate use. The data will be shared in adherence to the principles of the Fort Lauderdale and Toronto Agreements for sharing large-scale genetic datasets.
The project is a collaborative effort and encourages involvement of all interested parties in analysis of the data, particularly for the benefit of kākāpō conservation.
The project is affiliated with the B10K Project and the G10K Vertebrate Genomes Project.
You'll need to agree to our data sharing agreement to get approval to use the data.
- In recognition of the special relationship Ngāi Tahu have as kaitiaki (guardians) of this taonga (treasured) species, and the nature of the active Treaty partnership between Ngāi Tahu and DOC, the decision-making for this approval process is a joint responsibility between DOC and Te Rūnanga o Ngāi Tahu.
- The Data Sharing Agreement is a simple electronic document to ensure non-commercial and appropriate use of the data.
- Permitted access to the data will be via an Amazon Web Services S3 bucket.
- The reference genome data are not covered by this process and are available from the VGP project at GenomeArk.
- Mātauranga Māori reflects the way Māori engage in the world, applying kawa (cultural practices) and tikanga (cultural principles) to enhance traditional or present-day knowledge. The data user may be invited by Ngāi Tahu to engage with them, to understand and consider their research through a Mātauranga Māori lens. This provides an opportunity to strengthen the research and increase its relevance to Aotearoa New Zealand.
- The Data Sharing Agreement also outlines requirements for reporting back to DOC and Ngāi Tahu prior to publication, in confidence, on research outcomes. This is to enable shared understanding of the research and its relevance for the kaitiakitanga (guardianship) of the kākāpō, and to enhance the relationship between DOC, Ngāi Tahu, Mātauranga Māori and researchers.
- The project consortium encourages others to use the data but asks users to respect a publication embargo covering the global analyses listed above. This embargo will apply for two years from the data release, or until the global paper is published. The embargo for Jane follows the G10K VGP data use policy.
- Users are free to access and use data during the embargo period, but may not publish results without the consent of the Kākāpō125+ analysis consortium.
- Exceptions to this embargo are: analyses of limited amounts of the data, such as small groups of genes or data from single individuals. Please contact the Kākāpō125+ analysis consortium to discuss: firstname.lastname@example.org.
- The Kākāpō125+ analysis consortium encourages collaboration: users with research interests overlapping those of the consortium will be encouraged to join the consortium.
- September 2018: Reference genome of Jane available
- December 2018: Raw and mapped genome sequences publicly available
- December 2020: Publication embargo ends
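The embargo date in this timeline follows from the two-year rule in the data sharing conditions. The Python sketch below reproduces that arithmetic. It rests on two assumptions not stated in the source: the release is treated as falling on the first day of the month, and the embargo is taken to end at whichever comes first, the two-year mark or publication of the global paper. The function name `embargo_end` is hypothetical.

```python
from datetime import date
from typing import Optional


def embargo_end(release: date,
                paper_published: Optional[date] = None,
                years: int = 2) -> date:
    """Estimate when the publication embargo lifts.

    Assumption: the embargo ends at the earlier of (release + `years`)
    and the publication date of the global paper, if one is known.
    """
    try:
        fixed_end = release.replace(year=release.year + years)
    except ValueError:
        # Release on 29 February and the target year is not a leap year.
        fixed_end = release.replace(year=release.year + years, day=28)
    if paper_published is not None and paper_published < fixed_end:
        return paper_published
    return fixed_end


if __name__ == "__main__":
    # Day of month is an assumption; the timeline gives only "December 2018".
    release = date(2018, 12, 1)
    print(embargo_end(release).strftime("%B %Y"))
```

With a December 2018 release and no earlier global paper, this gives December 2020, matching the timeline.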