Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overlap point segment patch #183 #184

Merged

Conversation

smitkadvani
Copy link
Contributor

@smitkadvani smitkadvani commented Feb 2, 2024

This PR addresses Issue. Overlap operations between point and segment is fixed.

import numpy as np
import bioframe as bf
import pandas as pd
df1 = pd.DataFrame(
        [
            ['chr1', 1, 1]
        ], 
        columns=['chrom','start','end']
    ).astype({"start": pd.Int64Dtype(), "end": pd.Int64Dtype()})

df2 = pd.DataFrame(
    [
        ['chr1', 1, 2]
    ], 
    columns=['chrom','start','end']
).astype({"start": pd.Int64Dtype(), "end": pd.Int64Dtype()})

b = bf.overlap(
    df1,
    df2,
    on=None,
    how="left",
    return_index=True,
    return_input=True,
)
print(b)
b = bf.overlap(
    df2,
    df1,
    on=None,
    how="left",
    return_index=True,
    return_input=True,
)
print(b)

Result :

   index chrom  start  end  index_ chrom_  start_  end_
0      0  chr1      1    1       0   chr1       1     2
   index chrom  start  end  index_ chrom_  start_  end_
0      0  chr1      1    2       0   chr1       1     1

@gfudenberg
Copy link
Member

this may break desired behavior of subtract-- let's try to write the subtract tests and see if they pass. If they do not, the place to think about modifying the code may be with the searchsorted for points: https://github.com/open2c/bioframe/blob/424d3a1609f960612591bcf04a2e3a983874d99c/bioframe/core/arrops.py#L333C1-L338C85

@smitkadvani
Copy link
Contributor Author

this may break desired behavior of subtract-- let's try to write the subtract tests and see if they pass. If they do not, the place to think about modifying the code may be with the searchsorted for points: https://github.com/open2c/bioframe/blob/424d3a1609f960612591bcf04a2e3a983874d99c/bioframe/core/arrops.py#L333C1-L338C85

expected behavior of subtract is contained and extra test cases added for subtraction operation

bioframe/core/arrops.py Outdated Show resolved Hide resolved
golobor
golobor previously requested changes Feb 8, 2024
Copy link
Member

@golobor golobor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems legit to me! It's a nice solution, hacky but not too much.
There would still be no overlap between ('chr1', 1, 2) and ('chr1, 2, 2), right? (maybe worth adding a small test, since our intervals are explicitly semi-open?)

bioframe/core/arrops.py Outdated Show resolved Hide resolved
bioframe/core/arrops.py Outdated Show resolved Hide resolved
bioframe/core/arrops.py Outdated Show resolved Hide resolved
bioframe/core/arrops.py Outdated Show resolved Hide resolved
bioframe/core/arrops.py Outdated Show resolved Hide resolved
tests/test_ops.py Outdated Show resolved Hide resolved
tests/test_ops.py Outdated Show resolved Hide resolved
tests/test_ops.py Outdated Show resolved Hide resolved
@gfudenberg
Copy link
Member

gfudenberg commented Mar 20, 2024

functionality added:

  • overlap now correctly deals with point intervals (e.g. ['chr1', 1, 1]). This is implemented internally by converting points to pseudo-len1 segments.

tests for overlap added:

  • no overlap ['chr1', 1, 2], ['chr1', 2, 2]
  • no overlap ['chr1', 2, 2], ['chr1', 1, 2]
  • yes overlap ['chr1', 1, 2], ['chr1', 1, 1]
  • yes overlap ['chr1', 1, 1], ['chr1', 1, 2]
  • no overlap ['chr1', 1, 1], ['chr1', 2, 2]

tests for subtract added:

  • subtract( ['chr1', 1, 2], ['chr1', 1, 1]) = ['chr1', 1, 2]
  • subtract( ['chr1', 0, 2], ['chr1', 1, 1]) = ['chr1', 0, 1], ['chr1', 0, 2]

closes #183

@gfudenberg gfudenberg merged commit ffceb61 into open2c:main Mar 20, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants