Skip to content

Generic sorting of large datasets (using temporary files as temporary space)

License

Notifications You must be signed in to change notification settings

luispedro/outsort

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Outsort: generic (Haskell-based) external sorting

Example

    import qualified Data.Conduit.Combinators as CC
    import qualified Data.Conduit.Binary as CB

    import Algorithms.OutSort (isolateBySize)
    import Algorithms.SortMain (sortMain)

    main :: IO ()
    main = sortMain
        CB.lines
        CC.unlinesAscii
        (isolateBySize (const 1) 500000)

All that is needed is a decoder (ConduitT ByteString a m ()), an encoder (ConduitT ByteString a m ()), and a function to split the input into blocks (ConduitT a a m ()). Given these elements, the result is a programme which can sort arbitrarily large inputs using external memory.

Licence: MIT

Author: Luis Pedro Coelho (email: coelho@embl.de) (on twitter: @luispedrocoelho)

About

Generic sorting of large datasets (using temporary files as temporary space)

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published