Documentation
PistolMagazine is a flexible and extensible tool for generating realistic data, suitable for testing, creating sample data for demonstrations, and populating databases. It offers various ways to customize and extend the data generation process to meet different needs.
Key Features✨:
- Highly Extensible: Easily extend PistolMagazine by defining custom data providers and hooks, allowing you to generate data sets tailored to your specific requirements.
- Custom Providers: Create your own provider classes to generate specific types of data, making mock data more realistic and relevant.
- Hook System: Use hooks to execute operations at various stages of data generation, such as preprocessing, data validation, and modification.
- Diverse Data Models: Construct complex data models using classes like Dict, List, and Timestamp, enabling you to represent and generate a wide range of data structures.
- Multiple Export Options: Support for exporting generated data in CSV, JSON, and XML formats, or directly importing it into a database to meet various usage needs.
PistolMagazine, with its flexible architecture and diverse functionality, helps developers, testers, and data scientists generate the mock data they need.
The Str class generates strings of a specific data type. By default, the data type is "word". The data_type parameter works much like the corresponding provider methods in the Faker library and supports various common types.
Str(data_type="word")
- data_type (optional): Specifies the type of data to generate, default is "word". The supported types are similar to those available in Faker.
mock()
Returns a random string of the specified data type.
match(value: str)
Returns the appropriate Str class instance based on the input string value. If value is a digit string, it returns a StrInt instance; if value is a float string, it returns a StrFloat instance; otherwise, it returns a Str instance.
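As a rough illustration of the dispatch that match describes, the sketch below classifies an input string using only the standard library. The classify function and the regular expression are assumptions made for this example; they are not PistolMagazine's implementation, which may treat signs, whitespace, or exponents differently.

```python
import re

# Assumed pattern for a "float string": digits, a dot, then digits.
_FLOAT_PATTERN = re.compile(r"\d+\.\d+")


def classify(value: str) -> str:
    """Return the name of the class that match would plausibly select."""
    if value.isdigit():
        return "StrInt"
    if _FLOAT_PATTERN.fullmatch(value):
        return "StrFloat"
    return "Str"


print(classify("123"))    # StrInt
print(classify("12.5"))   # StrFloat
print(classify("hello"))  # Str
```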
- word
- name
- address
- text
These types behave much like their counterparts in Faker.
from pistol_magazine import Str
# Generate a "word" type string by default
s = Str()
print(s.mock())
# Generate a "name" type string, e.g. Michelle Mendez
s = Str(data_type="name")
print(s.mock())
# Match and generate the corresponding type instance based on the input value
s = Str.match("123")
print(s.mock()) # Str.match("123") returns a StrInt instance, so mock() yields a digit string, e.g. "8907407153424553311"
The Int class is used to generate random integers. You can specify the bit width and whether the integer is unsigned.
Int(byte_nums=64, unsigned=False)
- byte_nums (optional): Specifies the width of the integer, default is 64. Despite the parameter name, the examples in this document treat the value as a number of bits (the default yields a 64-bit integer).
- unsigned (optional): Specifies whether the integer is unsigned, default is False.
mock()
Returns a random integer within the specified range.
These subclasses generate integers of a fixed bit width, either unsigned or signed.
- UInt8: Generates an 8-bit unsigned integer
- Int8: Generates an 8-bit signed integer
- UInt16: Generates a 16-bit unsigned integer
- Int16: Generates a 16-bit signed integer
- UInt32: Generates a 32-bit unsigned integer
- Int32: Generates a 32-bit signed integer
- UInt: Generates a 64-bit unsigned integer
UInt8()
Int8()
UInt16()
Int16()
UInt32()
Int32()
UInt()
mock()
Returns a random integer of the specified type and bit width.
from pistol_magazine import Int, UInt, Int8, UInt8, Int16, UInt16, Int32, UInt32
# Generate a 64-bit signed integer
i = Int()
print(i.mock())
# Generate a 64-bit unsigned integer
ui = UInt()
print(ui.mock())
# Generate an 8-bit unsigned integer
ui8 = UInt8()
print(ui8.mock())
# Generate an 8-bit signed integer
i8 = Int8()
print(i8.mock())
# Generate a 16-bit unsigned integer
ui16 = UInt16()
print(ui16.mock())
# Generate a 16-bit signed integer
i16 = Int16()
print(i16.mock())
# Generate a 32-bit unsigned integer
ui32 = UInt32()
print(ui32.mock())
# Generate a 32-bit signed integer
i32 = Int32()
print(i32.mock())
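The value ranges implied by these types can be sketched in plain Python. This assumes the conventional two's-complement bounds for a given bit width; int_bounds and random_int are hypothetical helpers written for this example, not part of PistolMagazine, and the library's exact bounds may differ.

```python
import random


def int_bounds(bits: int, unsigned: bool) -> tuple[int, int]:
    """Return the (lowest, highest) value representable in `bits` bits."""
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def random_int(bits: int = 64, unsigned: bool = False) -> int:
    """Draw a uniform random integer within the bounds for `bits`."""
    low, high = int_bounds(bits, unsigned)
    return random.randint(low, high)


print(int_bounds(8, unsigned=True))   # (0, 255)
print(int_bounds(8, unsigned=False))  # (-128, 127)
print(int_bounds(6, unsigned=True))   # (0, 63)
print(random_int(16, unsigned=True))  # some value between 0 and 65535
```

Under this assumption, a call such as Int(byte_nums=6, unsigned=True) would stay below 64, which is consistent with the small values shown in the sample outputs later in this document.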
The Float class is used to generate random floating-point numbers. You can specify the maximum number of digits to the left and right of the decimal point, and whether the number is unsigned.
Float(left=2, right=2, unsigned=False)
- left (optional): Specifies the maximum number of digits to the left of the decimal point, default is 2.
- right (optional): Specifies the maximum number of digits to the right of the decimal point, default is 2.
- unsigned (optional): Specifies whether the floating-point number is unsigned, default is False.
mock()
Returns a random floating-point number within the specified range. If unsigned is True, the returned value will be positive.
get_datatype()
Returns the data type name.
from pistol_magazine import Float
# Generate a floating-point number with default range
f = Float()
print(f.mock()) # e.g., 12.34
# Generate an unsigned floating-point number with specified range
f_unsigned = Float(left=3, right=4, unsigned=True)
print(f_unsigned.mock()) # e.g., 123.4567
# Get the data type name
print(f.get_datatype()) # Output: "Float"
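One way to picture how left and right bound the result is the standard-library sketch below. It is an assumption about the semantics (at most left digits before the decimal point and at most right digits after it), not the library's code; random_float is a hypothetical helper.

```python
import random


def random_float(left: int = 2, right: int = 2, unsigned: bool = False) -> float:
    """Return a float with at most `left` integer and `right` fractional digits."""
    # Draw an integer with at most left + right digits, then shift the
    # decimal point `right` places to the left.
    scaled = random.randint(0, 10 ** (left + right) - 1)
    if not unsigned and random.random() < 0.5:
        scaled = -scaled  # signed mode may return either sign
    return scaled / 10**right


print(random_float())                                # some value in [-99.99, 99.99]
print(random_float(left=3, right=4, unsigned=True))  # some value in [0, 999.9999]
```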
The Bool class is used to generate random boolean values. It can also check if a given value is of boolean type.
mock()
Returns a random boolean value (True or False).
match(value)
Checks if the given value is of boolean type.
from pistol_magazine import Bool
# Generate a random boolean value
b = Bool()
print(b.mock()) # e.g., True or False
# Check if a value is of boolean type
print(Bool.match(True)) # Output: True
print(Bool.match(1)) # Output: False
The Datetime class is used to generate random dates and times. You can specify the date format and time delta.
Datetime(date_format="%Y-%m-%d %H:%M:%S", **kwargs)
- date_format (optional): Specifies the format of the date and time, default is "%Y-%m-%d %H:%M:%S".
- kwargs (optional): Specifies time deltas like days, seconds, microseconds, milliseconds, minutes, hours, weeks.
mock()
Returns a date and time string in the configured format. Without a time delta, the value is based on the current time; if a time delta is specified, the result is a random time between the current time minus the delta and the current time plus the delta.
match(value)
Checks whether the given string value matches any of the defined date formats and returns the matching format string.
get_datatype()
Returns the data type name and date format.
from pistol_magazine import Datetime
# Generate current date and time with default format
dt = Datetime()
print(dt.mock()) # e.g., "2024-06-26 15:32:45"
# Generate a random date and time with specified format and time delta
dt_with_delta = Datetime(date_format="%Y-%m-%d %H:%M", days=2, hours=5)
print(dt_with_delta.mock()) # e.g., "2024-06-24 10:27"
# Check if a date string matches any of the defined date formats
print(Datetime.match("2024-06-26 15:32:45")) # Output: "%Y-%m-%d %H:%M:%S"
print(Datetime.match("2024-06-26T15:32:45")) # Output: "%Y-%m-%dT%H:%M:%S"
# Get the data type name and date format
print(dt.get_datatype()) # Output: "Datetime_%Y-%m-%d %H:%M:%S"
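The time-delta window and the format matching can be approximated with the standard library. This is a sketch under stated assumptions: the keyword arguments are passed straight to datetime.timedelta (which accepts the same names listed above), and KNOWN_FORMATS is a made-up two-entry list standing in for whatever formats the library defines.

```python
import random
from datetime import datetime, timedelta

# Assumed list of formats; the library's own list may be longer.
KNOWN_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S"]


def random_datetime(date_format: str = "%Y-%m-%d %H:%M:%S", **delta_kwargs) -> str:
    """Return a formatted time drawn from [now - delta, now + delta]."""
    span = timedelta(**delta_kwargs).total_seconds()
    offset = random.uniform(-span, span)  # 0.0 when no delta is given
    return (datetime.now() + timedelta(seconds=offset)).strftime(date_format)


def match_format(value: str):
    """Return the first known format that parses `value`, or None."""
    for fmt in KNOWN_FORMATS:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            continue
        return fmt
    return None


print(random_datetime(days=2, hours=5))
print(match_format("2024-06-26T15:32:45"))  # %Y-%m-%dT%H:%M:%S
```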
The Timestamp class is used to generate random timestamps with specified precision and time delta.
Timestamp(times=13, **kwargs)
- times (optional): Specifies the precision of the timestamp, can be 10 or 13, default is 13. 13 indicates millisecond precision, 10 indicates second precision.
- kwargs (optional): Specifies time deltas like days, seconds, microseconds, milliseconds, minutes, hours, weeks.
mock()
Returns a timestamp at the configured precision. If a time delta is specified, the returned timestamp is a random time between the current time minus the delta and the current time plus the delta.
match(value)
Checks if the given value is a valid timestamp and returns its precision.
get_datatype()
Returns the data type name and timestamp precision.
from pistol_magazine import Timestamp
# Generate a random timestamp with default precision
ts = Timestamp()
print(ts.mock()) # e.g., 1656089173123
# Generate a random timestamp with specified time delta
ts_with_delta = Timestamp(days=2, hours=5)
print(ts_with_delta.mock()) # e.g., 1656005173123
# Check if a value is a valid timestamp and return its precision
print(Timestamp.match(1656089173123)) # Output: 13
# Get the data type name and timestamp precision
print(ts.get_datatype()) # Output: "Timestamp_13"
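The difference between the two precisions comes down to seconds versus milliseconds since the Unix epoch. The sketch below shows that relationship with the standard library; current_timestamp and precision_of are hypothetical helpers, and inferring the precision from the digit count is an assumption about how such a check could work.

```python
import time


def current_timestamp(precision: int = 13) -> int:
    """Return the current Unix time with 10 (seconds) or 13 (milliseconds) digits."""
    if precision == 13:
        return int(time.time() * 1000)
    if precision == 10:
        return int(time.time())
    raise ValueError("precision must be 10 or 13")


def precision_of(value: int):
    """Return 10 or 13 based on the digit count, or None for anything else."""
    digits = len(str(value))
    return digits if digits in (10, 13) else None


print(current_timestamp())          # a 13-digit millisecond timestamp
print(current_timestamp(10))        # a 10-digit second timestamp
print(precision_of(1656089173123))  # 13
print(precision_of(1719483880))     # 10
```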
The List class is used to generate random lists containing different types of field objects.
List(list_fields=None)
- list_fields (optional): Specifies a list of field objects for the list. If not specified, it defaults to including a string, an integer, and a float field object.
mock(to_json=False)
Returns a list of randomly generated data from the list of field objects. If the to_json parameter is True, the returned data will be serialized into JSON format.
get_datatype()
Returns a list of data type names for each field object in the list.
from pistol_magazine import List, Datetime, Timestamp, Str, Int, Float
# Generate a random list with default fields
lst = List()
print(lst.mock()) # e.g. ["involve", 42, 3.14]
# Generate a random list with custom fields
custom_list_format = [
    Datetime(Datetime.D_FORMAT_YMD, days=2),
    Timestamp(Timestamp.D_TIMEE10, days=2),
    Float(left=2, right=4, unsigned=True),
    Str(data_type="file_name"),
    Int(byte_nums=6, unsigned=True)
]
lst_custom = List(list_fields=custom_list_format)
print(lst_custom.mock()) # e.g., ["2024-06-25 21:45:16", 1719483880, 76.4993, "coach.csv", 62]
# Convert the generated list into JSON format
print(lst.mock(to_json=True)) # Output: '["coach", 42, 3.14]'
# Get the data type names for each field object in the list
print(lst.get_datatype()) # Output: ["Str", "Int", "Float"]
The Dict class is used to generate random dictionaries containing different types of field objects.
Dict(dict_fields=None)
- dict_fields (optional): Specifies the field objects in the dictionary. If not specified, it defaults to including an integer, a string, and a timestamp field object.
mock(to_json=False)
Returns a dictionary of randomly generated data from the field objects. If the to_json parameter is True, the returned data will be serialized into JSON format.
get_datatype()
Returns a dictionary of data type names for each field object in the dictionary.
from pistol_magazine import Dict, Int, Str, Timestamp, Float, List, Datetime, StrInt
# Generate a random dictionary with default fields
d = Dict()
print(d.mock()) # e.g., {"a": 42, "b": "random_string", "c": 1656089173123}
# Generate a random dictionary with custom fields
custom_dict_format = {
    "a": Float(left=2, right=4, unsigned=True),
    "b": Timestamp(Timestamp.D_TIMEE10, days=2),
    "C": List(
        [
            Datetime(Datetime.D_FORMAT_YMD_T, weeks=2),
            StrInt(byte_nums=6, unsigned=True)
        ]
    )
}
d_custom = Dict(dict_fields=custom_dict_format)
print(d_custom.mock()) # e.g., {"a": 25.5595, "b": 1719450850, "C": ["2024-07-05T03:00:27", "11"]}
# Convert the generated dictionary into JSON format
print(d.mock(to_json=True)) # Output: '{"a": 42, "b": "land", "c": 1656089173123}'
# Get the data type names for each field object in the dictionary
print(d.get_datatype()) # Output: {"a": "Int", "b": "Str", "c": "Timestamp"}
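The way List and Dict nest can be pictured as each container calling mock() on its children and optionally serializing the result. The sketch below is a simplified stand-in, not the library's classes: Const, ListOf, and DictOf are hypothetical names, and Const returns fixed values so the output is predictable.

```python
import json


class Const:
    """Stand-in leaf field that always returns the same value."""

    def __init__(self, value):
        self.value = value

    def mock(self):
        return self.value


class ListOf:
    """Container that mocks each child field in order."""

    def __init__(self, fields):
        self.fields = fields

    def mock(self, to_json=False):
        data = [field.mock() for field in self.fields]
        return json.dumps(data) if to_json else data


class DictOf:
    """Container that mocks each child field under its key."""

    def __init__(self, fields):
        self.fields = fields

    def mock(self, to_json=False):
        data = {key: field.mock() for key, field in self.fields.items()}
        return json.dumps(data) if to_json else data


nested = DictOf({"a": Const(1.5), "C": ListOf([Const("2024-07-05T03:00:27"), Const("11")])})
print(nested.mock())              # {'a': 1.5, 'C': ['2024-07-05T03:00:27', '11']}
print(nested.mock(to_json=True))  # {"a": 1.5, "C": ["2024-07-05T03:00:27", "11"]}
```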
To define custom data providers, use the @provider decorator to designate a class as a data provider. Below is an example of defining a MyProvider class with a method user_status that returns either "ACTIVE" or "INACTIVE":
from pistol_magazine import provider
from random import choice
@provider
class MyProvider:
    def user_status(self):
        return choice(["ACTIVE", "INACTIVE"])
Hooks are functions executed at different stages of data generation. Use the @hook decorator to define hooks. Specify the hook_type, order, and hook_set parameters to control when and how hooks are triggered. For example:
from pistol_magazine.hooks.hooks import hook
@hook('pre_generate', order=1, hook_set='SET1')
def pre_generate_first_hook():
    print("Start Mocking User Data")
@hook('after_generate', order=1, hook_set="SET1")
def after_generate_first_hook(data):
    data['user_status'] = 'ACTIVE' if data['user_age'] >= 18 else 'INACTIVE'
    return data
@hook('final_generate', order=1, hook_set="SET1")
def final_generate_second_hook(data):
    # Suppose there is a function send_to_message_queue(data) to send data to the message queue
    pass
- hook_type: Type of hook ('pre_generate', 'after_generate', 'final_generate').
  - pre_generate: Executes operations before generating all data. Suitable for tasks like logging or starting external services.
  - after_generate: Executes operations after generating each data entry but before final processing. Suitable for tasks like data validation or conditional modifications.
  - final_generate: Executes operations after generating and processing all data entries. Suitable for final data processing, sending data to message queues, or performing statistical analysis.
- order: Execution order of the hook (lower values execute earlier).
- hook_set: Name of the hook set to group related hooks.
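The ordering and grouping rules above can be sketched as a small registry. This is a minimal illustration with hypothetical names (_HOOKS, run_pipeline) and a local hook decorator that shadows the library's; it only mirrors the described behavior (pre_generate once, after_generate per entry, final_generate once, lower order first) and is not PistolMagazine's implementation.

```python
from collections import defaultdict

# (hook_type, hook_set) -> list of (order, function) pairs
_HOOKS = defaultdict(list)


def hook(hook_type, order=1, hook_set="default"):
    """Register a function for one stage of one hook set."""
    def decorator(func):
        _HOOKS[(hook_type, hook_set)].append((order, func))
        return func
    return decorator


def run_pipeline(make_entry, num_entries, hook_set="default"):
    """Generate entries, running the hooks of `hook_set` at each stage."""
    def ordered(hook_type):
        pairs = sorted(_HOOKS[(hook_type, hook_set)], key=lambda pair: pair[0])
        return [func for _, func in pairs]

    for func in ordered("pre_generate"):
        func()
    entries = []
    for _ in range(num_entries):
        entry = make_entry()
        for func in ordered("after_generate"):
            entry = func(entry)
        entries.append(entry)
    for func in ordered("final_generate"):
        func(entries)
    return entries


log = []


@hook("pre_generate", order=2, hook_set="SET1")
def announce_second():
    log.append("pre-2")


@hook("pre_generate", order=1, hook_set="SET1")
def announce_first():
    log.append("pre-1")


@hook("after_generate", order=1, hook_set="SET1")
def tag_status(entry):
    entry["user_status"] = "ACTIVE" if entry["user_age"] >= 18 else "INACTIVE"
    return entry


result = run_pipeline(lambda: {"user_age": 20}, num_entries=2, hook_set="SET1")
print(log)     # ['pre-1', 'pre-2']
print(result)  # [{'user_age': 20, 'user_status': 'ACTIVE'}, {'user_age': 20, 'user_status': 'ACTIVE'}]
```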
Define a class that inherits from DataMocker, such as UserInfo, to generate structured data. Customize fields using the field types provided by the framework (e.g., Int, Str, Timestamp, ProviderField, Dict, List).
Then call the mock method of your data model class (e.g., UserInfo) to generate mock data. Its options control JSON serialization, the number of entries, key generation, the output format, and the hook set.
Example: UserInfo Class
from pistol_magazine import DataMocker, Str, Int, Timestamp, Bool, ProviderField, Dict, List, StrInt, Datetime, Float
# MyProvider is the custom provider class defined earlier; it is not imported from pistol_magazine.
class UserInfo(DataMocker):
    create_time: Timestamp = Timestamp(Timestamp.D_TIMEE10, days=2)
    user_name: Str = Str(data_type="name")
    user_email: Str = Str(data_type="email")
    user_age: Int = Int(byte_nums=6, unsigned=True)
    user_status: ProviderField = ProviderField(MyProvider().user_status)
    user_marriage: Bool = Bool()
    user_dict: Dict = Dict({
        "a": Float(left=2, right=4, unsigned=True),
        "b": Timestamp(Timestamp.D_TIMEE10, days=2)
    })
    user_list: List = List([
        Datetime(Datetime.D_FORMAT_YMD_T, weeks=2),
        StrInt(byte_nums=6, unsigned=True)
    ])
mock(
    to_json: bool = False,
    num_entries: Optional[int] = None,
    key_generator: Optional[Callable[[], str]] = None,
    as_list: bool = False,
    hook_set: Optional[str] = 'default'
)
- to_json (bool): Serialize generated data as JSON (default False).
- num_entries (int): Number of data entries to generate (default None for single entry).
- key_generator (Callable[[], str]): Function to generate dictionary keys (default lambda: str(uuid.uuid4())).
- as_list (bool): Return generated data as a list (default False).
- hook_set (str): Name of the hook set to use (default 'default').
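How these options shape the return value can be sketched as follows. The function shape_output is hypothetical and leaves out hooks entirely; the three result shapes (single entry, list of entries, dictionary keyed by generated keys) are inferred from the parameter descriptions and the sample outputs in this document, so treat this as an approximation rather than the library's logic.

```python
import json
import uuid
from itertools import count


def shape_output(make_entry, to_json=False, num_entries=None,
                 key_generator=None, as_list=False):
    """Combine generated entries into the shape selected by the options."""
    if num_entries is None:
        data = make_entry()  # a single entry
    elif as_list:
        data = [make_entry() for _ in range(num_entries)]
    else:
        key_generator = key_generator or (lambda: str(uuid.uuid4()))
        data = {key_generator(): make_entry() for _ in range(num_entries)}
    return json.dumps(data) if to_json else data


def make_entry():
    return {"user_age": 30}


ids = count(1)
print(shape_output(make_entry))
# {'user_age': 30}
print(shape_output(make_entry, num_entries=2, as_list=True))
# [{'user_age': 30}, {'user_age': 30}]
print(shape_output(make_entry, num_entries=2, key_generator=lambda: f"user-{next(ids)}"))
# {'user-1': {'user_age': 30}, 'user-2': {'user_age': 30}}
```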
from pprint import pprint
import uuid
pprint(
    UserInfo().mock(
        num_entries=2,
        as_list=False,
        to_json=False,
        hook_set='SET1',
        key_generator=lambda: str(uuid.uuid4())
    )
)
"""
e.g.
Start Mocking User Data
{'7f79875c-fe79-401b-9386-e68417dda747': {'create_time': 1719541419,
                                          'user_age': 33,
                                          'user_dict': {'a': 0.616,
                                                        'b': 1719549151},
                                          'user_email': 'fullercheryl@example.com',
                                          'user_list': ['2024-06-22T23:38:48',
                                                        '48'],
                                          'user_marriage': True,
                                          'user_name': 'Tiffany Blankenship',
                                          'user_status': 'ACTIVE'},
 '97a13622-0db4-4a1e-b176-bad1cd23777d': {'create_time': 1719458525,
                                          'user_age': 14,
                                          'user_dict': {'a': 79.333,
                                                        'b': 1719410667},
                                          'user_email': 'qnicholson@example.net',
                                          'user_list': ['2024-06-28T20:37:57',
                                                        '17'],
                                          'user_marriage': True,
                                          'user_name': 'James Nichols',
                                          'user_status': 'INACTIVE'}}
"""
PistolMagazine supports exporting generated data to CSV, JSON, and XML files and to a MySQL database. Exporters can be used in conjunction with hook functions.
The following examples demonstrate how to export data to CSV, JSON, and XML files, as well as how to export data to a MySQL database.
To export data to a CSV file, use the CSVExporter class:
from pistol_magazine import CSVExporter
data = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]
csv_exporter = CSVExporter()
csv_exporter.export(data, 'output.csv')
To export data to a JSON file, use the JSONExporter class:
from pistol_magazine import JSONExporter
data = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]
json_exporter = JSONExporter()
json_exporter.export(data, 'output.json')
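For a sense of what tabular and JSON output of such records looks like, the sketch below serializes the same kind of data with Python's standard csv and json modules into in-memory strings. It does not use CSVExporter or JSONExporter, and the exact layout those classes write is not specified here, so the result is only an approximation; to_csv_text is a hypothetical helper.

```python
import csv
import io
import json

rows = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
]


def to_csv_text(records):
    """Serialize a list of flat dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv_text(rows)
json_text = json.dumps(rows)

print(csv_text)
# name,age,city
# Alice,25,New York
# Bob,30,Los Angeles
print(json_text)
```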
To export data to an XML file, use the XMLExporter class:
from pistol_magazine import XMLExporter
data_xml = {
    "users": [
        {
            "id": 1,
            "name": "Alice",
            "email": "alice@example.com",
            "profile": {
                "age": 30,
                "city": "New York"
            }
        },
        {
            "id": 2,
            "name": "Bob",
            "email": "bob@example.com",
            "profile": {
                "age": 25,
                "city": "Los Angeles"
            }
        }
    ]
}
xml_exporter = XMLExporter()
xml_exporter.export(data_xml, 'output.xml')
To export data to a MySQL database, use the DBExporter class:
from pistol_magazine import DBExporter
data = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]
db_config = {
    "user": "User",
    "password": "Password",
    "host": "Localhost",
    "port": 3306,
    "database": "DB"
}
db_exporter = DBExporter(table_name='TableName', db_config=db_config)
db_exporter.export(data)
PistolMagazine provides several built-in providers for common use cases:
CyclicParameterProvider returns parameters from the given list in a repeating cycle. If no list is provided, it uses a default list of parameters.
from pistol_magazine import DataMocker, ProviderField, CyclicParameterProvider
class Param(DataMocker):
    param: ProviderField = ProviderField(
        CyclicParameterProvider(parameter_list=[10, 11, 12]).get_next_param
    )
    def param_info(self):
        return self.mock(num_entries=6, as_list=True)
param = Param()
print(param.param_info())
[{'param': 10}, {'param': 11}, {'param': 12}, {'param': 10}, {'param': 11}, {'param': 12}]
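The cycling behavior shown in this output can be reproduced with itertools.cycle. CyclicSketch below is a hypothetical stand-in that borrows the method name from the example; it is not the library's class.

```python
from itertools import cycle


class CyclicSketch:
    """Return the items of a list one at a time, wrapping around at the end."""

    def __init__(self, parameter_list):
        self._values = cycle(parameter_list)

    def get_next_param(self):
        return next(self._values)


provider = CyclicSketch([10, 11, 12])
print([provider.get_next_param() for _ in range(6)])  # [10, 11, 12, 10, 11, 12]
```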
FixedValueProvider always returns a fixed value. If no value is provided, it uses a default fixed value.
from pistol_magazine import DataMocker, ProviderField, FixedValueProvider
class Param(DataMocker):
    param: ProviderField = ProviderField(
        FixedValueProvider(fixed_value="STATIC").get_fixed_value
    )
    def param_info(self):
        return self.mock(num_entries=2, as_list=True)
param = Param()
print(param.param_info())
[{'param': 'STATIC'}, {'param': 'STATIC'}]
IncrementalValueProvider returns incrementing values, starting from a given value and advancing by a given step.
from pistol_magazine import DataMocker, ProviderField, IncrementalValueProvider
class Param(DataMocker):
    param: ProviderField = ProviderField(
        IncrementalValueProvider(start=0, step=2).get_next_value
    )
    def param_info(self):
        return self.mock(num_entries=3, as_list=True)
param = Param()
print(param.param_info())
[{'param': 0}, {'param': 2}, {'param': 4}]
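The same sequence can be produced with itertools.count, which makes the roles of start and step explicit. IncrementalSketch is a hypothetical stand-in and its default values are assumptions; only the method name is borrowed from the example.

```python
from itertools import count


class IncrementalSketch:
    """Return start, start + step, start + 2 * step, and so on."""

    def __init__(self, start=0, step=1):
        self._values = count(start, step)

    def get_next_value(self):
        return next(self._values)


provider = IncrementalSketch(start=0, step=2)
print([provider.get_next_value() for _ in range(3)])  # [0, 2, 4]
```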
RandomChoiceFromListProvider returns random values from the given list. If no list is provided, it uses a default list of values.
from pistol_magazine import DataMocker, ProviderField, RandomChoiceFromListProvider
class Param(DataMocker):
    param: ProviderField = ProviderField(
        RandomChoiceFromListProvider(value_list=["value1", "value2", "value3"]).get_random_value
    )
    def param_info(self):
        return self.mock(num_entries=4, as_list=True)
param = Param()
print(param.param_info())
[{'param': 'value3'}, {'param': 'value1'}, {'param': 'value2'}, {'param': 'value1'}]
RandomFloatInRangeProvider returns random float values within a specified range and precision.
from pistol_magazine import DataMocker, ProviderField, RandomFloatInRangeProvider
class Param(DataMocker):
    param: ProviderField = ProviderField(
        RandomFloatInRangeProvider(start=0.00, end=4.00, precision=4).get_random_float
    )
    def param_info(self):
        return self.mock(num_entries=3, as_list=True)
param = Param()
print(param.param_info())
[{'param': 3.8797}, {'param': 3.4613}, {'param': 2.193}]
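A plausible reading of range and precision is a uniform draw followed by rounding, sketched below with the standard library. The helper random_float_in_range and its defaults are assumptions, not the library's code. The sketch also shows why a value such as 2.193 can appear with precision=4: Python floats do not keep trailing zeros.

```python
import random


def random_float_in_range(start: float = 0.0, end: float = 1.0, precision: int = 2) -> float:
    """Pick a uniform value in [start, end] and round it to `precision` places."""
    return round(random.uniform(start, end), precision)


print(random_float_in_range(0.00, 4.00, precision=4))  # some value between 0 and 4
print(round(2.19300001, 4))                            # 2.193
```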