Genetic #42

Open: wants to merge 14 commits into base: dev
20 changes: 16 additions & 4 deletions Project.toml
@@ -1,7 +1,7 @@
name = "Diversity"
uuid = "d3d5718d-52de-57ab-b67a-eca7fd6175a4"
-author = ["Richard Reeve <Richard.Reeve@glasgow.ac.uk>", "Claire Harris", "Isaac Peetom Heida"]
-version = "0.5.11"
+author = ["Richard Reeve <Richard.Reeve@glasgow.ac.uk>", "Claire Harris", "Sonia Mitchell", "Isaac Peetom Heida"]
+version = "0.6.0"

[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
@@ -15,41 +15,53 @@ Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

[weakdeps]
AxisArrays = "39de3d68-74b9-583c-8d2d-e117c070f3a9"
BioSequences = "7e6ae17a-c86d-528c-b3b9-7f778a29fe59"
Phylo = "aea672f4-3940-5932-aa44-993d1c3ff149"
PopGen = "af524d12-c74b-11e9-22a8-3b091653023f"
Requires = "ae029012-a4dd-5104-9daa-d747884805df"
StringDistances = "88034a9c-02f8-509d-84a9-84ec65e18404"

[extensions]
-DiversityPhyloExt = "Phylo"
DiversityAxisArraysExt = "AxisArrays"
+DiversityGeneticsExt = ["StringDistances", "BioSequences", "PopGen"]
+DiversityPhyloExt = "Phylo"

[compat]
AxisArrays = "0.4"
BioSequences = "3"
CSV = "0.10"
DataFrames = "0.21, 0.22, 1.0"
Distances = "0.10"
EcoBase = "0.1"
FASTX = "2"
InteractiveUtils = "1.8"
LinearAlgebra = "1.8"
Missings = "0.4, 1.0"
Phylo = "0.4, 0.5"
PopGen = "0.9"
RCall = "0.13"
RecipesBase = "0.6, 0.7, 0.8, 1"
Requires = "^1"
SpatialEcology = "0.9"
Statistics = "1.8"
StatsBase = "0.32, 0.33, 0.34"
StringDistances = "0.11"
julia = "1.8"

[extras]
AxisArrays = "39de3d68-74b9-583c-8d2d-e117c070f3a9"
BioSequences = "7e6ae17a-c86d-528c-b3b9-7f778a29fe59"
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
FASTX = "c2308a5c-f048-11e8-3e8a-31650f418d12"
Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
Phylo = "aea672f4-3940-5932-aa44-993d1c3ff149"
PopGen = "af524d12-c74b-11e9-22a8-3b091653023f"
RCall = "6f49c342-dc21-5d91-9882-a32aef131414"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
SpatialEcology = "348f2d5d-71a3-5ad4-b565-8af070f99681"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
StringDistances = "88034a9c-02f8-509d-84a9-84ec65e18404"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
-test = ["AxisArrays", "CSV", "Distances", "Phylo", "Random", "RCall", "SpatialEcology", "StatsBase", "Test"]
+test = ["AxisArrays", "BioSequences", "CSV", "Distances", "FASTX", "Phylo", "Random", "RCall", "SpatialEcology", "StatsBase", "Test"]
3 changes: 2 additions & 1 deletion docs/make.jl
@@ -9,4 +9,5 @@ makedocs(modules = [Diversity,
sitename = "Diversity.jl")

deploydocs(repo = "github.com/EcoJulia/Diversity.jl.git",
-devbranch = "dev")
+devbranch = "dev",
+push_preview = true)
73 changes: 73 additions & 0 deletions ext/DiversityGeneticsExt.jl
@@ -0,0 +1,73 @@
using Diversity
using Diversity.API

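import LinearAlgebra
import BioSequences
# PopGen and StringDistances are referenced by qualified name below, so they must be imported too
import PopGen
import StringDistances
using PopGen: PopData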

abstract type AbstractGenetic <:
Diversity.API.AbstractTypes
end

# The alphabet is left unconstrained: every BioSequence is already parameterised by an Alphabet
struct GeneticFASTA{GeneticData <: AbstractVector{<: BioSequences.BioSequence}} <:
Diversity.API.AbstractTypes
dat::GeneticData
ntypes::Int64
Zmatrix::Matrix{Float64}
end

struct GeneticVCF{PopData} <: Diversity.API.AbstractTypes
dat::PopData
ntypes::Int64
Zmatrix::Matrix{Float64}
end

function _hammingDistance(geno1, geno2)
# Parenthesised because && binds tighter than ||
(ismissing(geno1) || ismissing(geno2)) && return missing
if length(geno1) > 2
@warn "_hammingDistance may not work correctly for ploidy > 2"

# Codecov (codecov/patch): added lines #L24-L27 were not covered by tests.
end
#TODO Fix ploidy > 2 - e.g. (1, 1, 1, 2) ≠ (1, 2, 2, 2)

max(sum(geno1 .∉ Ref(geno2)), sum(geno2 .∉ Ref(geno1)))

# Codecov (codecov/patch): added line #L31 was not covered by tests.
end

function GeneticType(dat::PopData)

# Codecov (codecov/patch): added line #L34 was not covered by tests.
# Initialise objects
matrix_obj = PopGen.loci_matrix(dat)
ntypes = size(matrix_obj, 1)
output = zeros(Float64, ntypes, ntypes)
indices = PopGen.pairwise_pairs(1:ntypes)

# Codecov (codecov/patch): added lines #L36-L39 were not covered by tests.

# Calculate distance matrix
for (a, b) in indices
output[a, b] = sum(_hammingDistance.((@view matrix_obj[a, :]),
(@view matrix_obj[b, :])))
end
dist = LinearAlgebra.Symmetric(output)
dist /= maximum(dist)

# Codecov (codecov/patch): added lines #L42-L47 were not covered by tests.

# Calculate similarity matrix
Zmatrix = 1.0 .- dist

# Codecov (codecov/patch): added line #L50 was not covered by tests.

return GeneticVCF{PopData}(dat, ntypes, Zmatrix)

# Codecov (codecov/patch): added line #L52 was not covered by tests.
end

function GeneticType(dat::GeneticData) where {GeneticData <: AbstractVector{<: BioSequences.BioSequence}}
# Codecov (codecov/patch): added line #L55 was not covered by tests.
# Initialise objects
ntypes = length(dat)
output = zeros(Int64, ntypes, ntypes)
indices = PopGen.pairwise_pairs(1:ntypes)

# Codecov (codecov/patch): added lines #L58-L60 were not covered by tests.

# Calculate distance matrix
for (a, b) in indices
output[a, b] = StringDistances.evaluate(StringDistances.Hamming(), dat[a], dat[b])
end
dist = LinearAlgebra.Symmetric(output)
dist /= maximum(dist)

# Codecov (codecov/patch): added lines #L63-L67 were not covered by tests.

# Calculate similarity matrix
Zmatrix = 1.0 .- dist

# Codecov (codecov/patch): added line #L70 was not covered by tests.

return GeneticFASTA{GeneticData}(dat, ntypes, Zmatrix)

# Codecov (codecov/patch): added line #L72 was not covered by tests.
end
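
The TODO in `_hammingDistance` notes that presence/absence matching cannot tell (1, 1, 1, 2) from (1, 2, 2, 2). One possible direction, sketched here, is to compare allele counts as multisets. The names `allele_counts` and `multiset_distance` are hypothetical and not part of this PR, and the sketch assumes each genotype is a tuple or vector of allele codes. It is an illustration under those assumptions, not the maintainers' intended fix.

```julia
# Hypothetical helper (not in the PR): count each allele with its multiplicity.
function allele_counts(geno)
    counts = Dict{eltype(geno), Int}()
    for allele in geno
        counts[allele] = get(counts, allele, 0) + 1
    end
    return counts
end

# Hypothetical alternative to _hammingDistance: the number of allele copies
# in one genotype that cannot be paired with an identical copy in the other.
function multiset_distance(geno1, geno2)
    (ismissing(geno1) || ismissing(geno2)) && return missing
    c1, c2 = allele_counts(geno1), allele_counts(geno2)
    # Copies in geno1 left over after pairing with geno2, and vice versa
    unmatched1 = sum(max(n - get(c2, a, 0), 0) for (a, n) in c1; init = 0)
    unmatched2 = sum(max(n - get(c1, a, 0), 0) for (a, n) in c2; init = 0)
    return max(unmatched1, unmatched2)
end

multiset_distance((1, 1, 1, 2), (1, 2, 2, 2))  # → 2
multiset_distance((1, 2), (1, 1))              # → 1
```

On the tetraploid pair from the TODO this gives 2, where presence/absence matching gives 0 because every allele of one genotype occurs somewhere in the other. For diploid genotypes such as (1, 2) against (1, 1) it returns the same value as the existing function.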
10 changes: 9 additions & 1 deletion src/Diversity.jl
@@ -18,7 +18,7 @@
matching longer ASCII names (e.g. ```NormalisedAlpha()```), which are.
We also provide functions to calculate appropriate
```subcommunityDiversity()``` and ```metacommunityDiversity()```
-values for each measure, a general ```diversity()``` function for
+values for each measure, and a general ```diversity()``` function to
extract any diversity measure at a series of scales.
"""
module Diversity
@@ -130,9 +130,17 @@
function __init__()
@require Phylo = "aea672f4-3940-5932-aa44-993d1c3ff149" include("../ext/DiversityPhyloExt.jl")
@require AxisArrays = "39de3d68-74b9-583c-8d2d-e117c070f3a9" include("../ext/DiversityAxisArraysExt.jl")
@require StringDistances = "88034a9c-02f8-509d-84a9-84ec65e18404" begin
@require BioSequences = "7e6ae17a-c86d-528c-b3b9-7f778a29fe59" begin
@require PopGen = "af524d12-c74b-11e9-22a8-3b091653023f" begin
include("../ext/DiversityGeneticsExt.jl")

# Codecov (codecov/patch): added lines #L134-L136 were not covered by tests.
end
end
end
end
end

# From Phylo
abstract type AbstractPhyloTypes{Tree} <:
Diversity.API.AbstractTypes
end