A weighted random item sampler (selector), where the probability of selecting an item is proportional to its weight. The sampling method utilizes a binary search optimization, making it suitable for performance-demanding applications where the set of items is large and the sampling frequency is high.

ori88c/weighted-random-item-sampler

Weighted Random Item Sampler

The WeightedRandomItemSampler class implements a random sampler where the probability of selecting an item is proportional to its weight.

For example, given items [A, B] with respective weights [5, 12], item B is 12/5 = 2.4 times as likely to be sampled as item A: A is selected with probability 5/17 and B with probability 12/17.

Weights must be positive numbers, but they need not be natural numbers: floating-point weights such as 0.95, 5.4, and 119.83 are also supported.
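
To make the arithmetic concrete, the sketch below normalizes a weights array into selection probabilities. It is an illustration only and not part of the package's API: the name `selectionProbabilities` is hypothetical, and rejecting non-finite values is an assumption layered on top of the positive-weights requirement stated above.

```typescript
// Hypothetical helper (NOT part of weighted-random-item-sampler):
// converts positive weights into selection probabilities.
function selectionProbabilities(weights: ReadonlyArray<number>): number[] {
  if (weights.length === 0) {
    throw new Error('At least one weight is required');
  }
  for (const weight of weights) {
    // Assumption: zero, negative, NaN and infinite weights are all rejected.
    if (!Number.isFinite(weight) || weight <= 0) {
      throw new Error(`Weights must be positive finite numbers, got ${weight}`);
    }
  }
  const total = weights.reduce((sum, weight) => sum + weight, 0);
  return weights.map(weight => weight / total);
}

// Weights [5, 12] sum to 17, so the probabilities are 5/17 and 12/17.
const probabilities = selectionProbabilities([5, 12]);
console.log(probabilities.map(p => p.toFixed(3)).join(' ')); // prints "0.294 0.706"
```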

Use case examples include:

  • Distributed Systems: The sampler can assist in distributing workloads among servers based on their capacities or current load, ensuring that more capable servers handle a greater number of tasks.
  • Surveys and Polls: The sampler can be used to select participants based on demographic weights, ensuring a representative sample.
  • Attack Simulation: Randomly select attack vectors for penetration testing based on their likelihood or impact.
  • Anomaly Detection: Sample data points from a dataset with weights based on their anomaly scores for further analysis.
  • ML Model Training: Select training samples with weights based on their importance or difficulty to ensure diverse and balanced training data.

Key Features ✨

  • Weighted Random Sampling: Items are sampled with probability proportional to their weight.
  • Efficiency ⚙️: O(log(n)) time and O(1) space per sample, making this class suitable for performance-demanding applications where the set of items is large and the sampling frequency is high.
  • Comprehensive documentation 📚: The class is thoroughly documented, enabling IDEs to provide helpful tooltips that enhance the coding experience.
  • Tests: Fully covered by unit tests.
  • No external runtime dependencies: Only development dependencies are used.
  • ES2020 Compatibility: The tsconfig target is set to ES2020, ensuring compatibility with ES2020 environments.
  • TypeScript support.
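
The O(log(n)) time per sample mentioned above is what a binary search over cumulative weights typically delivers. The following is a minimal sketch of that general technique under stated assumptions, not the package's actual source: the class name `PrefixSumSampler` and the injectable `random` parameter are hypothetical, introduced only to keep the example self-contained and testable.

```typescript
// Illustrative sketch only: NOT the source code of weighted-random-item-sampler.
// The class name and the injectable `random` parameter are hypothetical.
class PrefixSumSampler<T> {
  private readonly _items: ReadonlyArray<T>;
  private readonly _cumulativeWeights: number[] = [];
  private readonly _totalWeight: number;
  private readonly _random: () => number;

  constructor(
    items: ReadonlyArray<T>,
    weights: ReadonlyArray<number>,
    random: () => number = Math.random
  ) {
    if (items.length === 0 || items.length !== weights.length) {
      throw new Error('Items and weights must be non-empty and of equal length');
    }
    this._items = items;
    this._random = random;

    // One-time O(n) preprocessing: cumulativeWeights[i] = w[0] + ... + w[i].
    let runningTotal = 0;
    for (const weight of weights) {
      if (!(weight > 0)) { // Also rejects NaN.
        throw new Error(`Weights must be positive, got ${weight}`);
      }
      runningTotal += weight;
      this._cumulativeWeights.push(runningTotal);
    }
    this._totalWeight = runningTotal;
  }

  // O(log n) time and O(1) extra space per call.
  public sample(): T {
    const target = this._random() * this._totalWeight; // Uniform in [0, total).

    // Binary search for the first index whose cumulative weight exceeds target.
    let low = 0;
    let high = this._cumulativeWeights.length - 1;
    while (low < high) {
      const mid = (low + high) >>> 1;
      if (this._cumulativeWeights[mid] > target) {
        high = mid;
      } else {
        low = mid + 1;
      }
    }
    return this._items[low];
  }
}

// Cumulative weights are [5, 17]: targets in [0, 5) map to 'A', targets in [5, 17) to 'B'.
const sampler = new PrefixSumSampler(['A', 'B'], [5, 12]);
console.log(sampler.sample());
```

Each item owns a slice of the range [0, total) whose width equals its weight, so a uniformly drawn target lands in an item's slice with probability proportional to that weight. Precomputing the cumulative array costs O(n) once; every later sample pays only for the search.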

Use Case Example 👨‍💻

Consider a component responsible for selecting training samples for an ML model. By assigning weights based on the importance or difficulty of each sample, we ensure a diverse and balanced training dataset.

import { WeightedRandomItemSampler } from 'weighted-random-item-sampler';

interface TrainingSampleData {
  // ...
}

interface TrainingSampleMetadata {
  importance: number; // Weight for sampling.
  // ...
}

interface TrainingSample {
  data: TrainingSampleData;
  metadata: TrainingSampleMetadata;
}

class ModelTrainer {
  private readonly _trainingSampler: WeightedRandomItemSampler<TrainingSample>;

  constructor(samples: ReadonlyArray<TrainingSample>) {
    this._trainingSampler = new WeightedRandomItemSampler(
      samples, // Items array.
      samples.map(sample => sample.metadata.importance) // Respective weights array.
    );
  }

  public selectTrainingSample(): TrainingSample {
    return this._trainingSampler.sample();
  }
}
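
One way to sanity-check a weighting scheme is to draw many samples and compare the observed counts with the intended proportions. The sketch below relies only on the `sample()` method used in the example above; the `Sampleable` interface and the `tallySamples` helper are hypothetical names, and a deterministic stand-in replaces the real sampler so the snippet runs without installing the package.

```typescript
// Structural type: anything exposing sample(), such as a WeightedRandomItemSampler.
interface Sampleable<T> {
  sample(): T;
}

// Hypothetical helper: draws `draws` samples and counts how often each item appears.
function tallySamples<T>(sampler: Sampleable<T>, draws: number): Map<T, number> {
  const counts = new Map<T, number>();
  for (let i = 0; i < draws; ++i) {
    const item = sampler.sample();
    counts.set(item, (counts.get(item) ?? 0) + 1);
  }
  return counts;
}

// Deterministic stand-in (assumption, for illustration only): it cycles through
// a fixed sequence instead of sampling randomly.
const sequence = ['easy', 'hard', 'hard'];
let cursor = 0;
const standIn: Sampleable<string> = {
  sample: () => sequence[cursor++ % sequence.length],
};

const counts = tallySamples(standIn, 300);
console.log(counts.get('easy'), counts.get('hard')); // prints "100 200"
```

With a real sampler the counts fluctuate from run to run, but over many draws their ratios should approach the ratios of the weights.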

License 📜

Apache 2.0
