ceph.rados.core_workflows module
core_workflows is a RADOS-layer configuration module for a Ceph cluster. It supports various day-1 and day-2 operations, such as:
1. Creating, modifying, setting, getting, writing to, scrubbing, and reading from various pools, such as EC and replicated pools
2. Increasing and decreasing PG counts, and enabling, disabling, and configuring the modules that do this
3. Enabling logging to file, setting and resetting config params, and cluster checks
4. Setting up email alerts and other cluster operations
More operations will be added as needed.
- class ceph.rados.core_workflows.RadosOrchestrator(node: CephAdmin)
Bases: object
The RadosOrchestrator class contains methods that perform various day-1 and day-2 operations on the cluster.
Usage: the class is initialized with the CephAdmin object for various operations.
- autoscaler_pool_settings(**kwargs)
Sets various options on pools with respect to the PG autoscaler.
:param **kwargs: various kwargs to be sent
Supported kwargs:
1. pg_autoscale_mode: PG autoscaler mode for the individual pool. Values -> on, warn, off. (str)
2. target_size_ratio: ratio of the cluster that the pool will utilize. Values -> 0 - 1. (float)
3. target_size_bytes: size the pool is assumed to utilize. eg: 10T (str)
4. pg_num_min: minimum PGs for a pool. (int)
Returns:
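The documented value ranges can be checked before the kwargs are handed to the method. The following is a minimal sketch; check_autoscaler_kwargs is a hypothetical helper introduced here for illustration and is not part of core_workflows.

```python
from typing import Any, Dict

# Modes listed in the autoscaler_pool_settings() documentation.
ALLOWED_MODES = {"on", "warn", "off"}


def check_autoscaler_kwargs(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical pre-flight check; not part of core_workflows."""
    mode = kwargs.get("pg_autoscale_mode")
    if mode is not None and mode not in ALLOWED_MODES:
        raise ValueError(f"pg_autoscale_mode must be one of {sorted(ALLOWED_MODES)}")
    ratio = kwargs.get("target_size_ratio")
    if ratio is not None and not 0 <= float(ratio) <= 1:
        raise ValueError("target_size_ratio must be between 0 and 1")
    pg_num_min = kwargs.get("pg_num_min")
    if pg_num_min is not None and not isinstance(pg_num_min, int):
        raise TypeError("pg_num_min must be an int")
    return kwargs


settings = check_autoscaler_kwargs(
    pg_autoscale_mode="warn",
    target_size_ratio=0.2,
    pg_num_min=16,
)
```

The checked dict could then be unpacked into autoscaler_pool_settings(**settings). The source does not say how the target pool is identified, so that key is left out here.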
- bench_read(pool_name: str, **kwargs) bool
Method to trigger read operations via the rados bench tool.
:param pool_name: pool on which the operation will be performed
:param kwargs: any other params that need to be passed
rados_read_duration -> duration of read operation (int)
Returns: True -> pass, False -> fail
- bench_write(pool_name: str, **kwargs) bool
Method to trigger write operations via the rados bench tool.
:param pool_name: pool on which the operation will be performed
:param kwargs: any other params that need to be passed
1. rados_write_duration -> duration of write operation (int)
2. byte_size -> size of objects to be written, eg: 10KB, 4096 (str)
Returns: True -> pass, False -> fail
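As a rough illustration of what the bench methods drive, the sketch below assembles a rados bench command line from the documented parameters. The helper name rados_bench_cmd is hypothetical, and the exact command that bench_write() and bench_read() build internally is an assumption; only the general `rados bench -p <pool> <seconds> <mode>` layout and the `-b` and `--no-cleanup` options come from the rados CLI itself.

```python
import shlex
from typing import Optional


def rados_bench_cmd(
    pool_name: str,
    mode: str,
    duration: int,
    byte_size: Optional[str] = None,
) -> str:
    """Hypothetical helper: build a rados bench command line."""
    if mode not in ("write", "seq", "rand"):
        raise ValueError("mode must be write, seq or rand")
    argv = ["rados", "bench", "-p", pool_name, str(duration), mode]
    if mode == "write":
        if byte_size is not None:
            argv += ["-b", str(byte_size)]
        # Keep the written objects so that a later read benchmark has data.
        argv.append("--no-cleanup")
    return shlex.join(argv)


write_cmd = rados_bench_cmd("test_pool", "write", 200, byte_size="4096")
read_cmd = rados_bench_cmd("test_pool", "seq", 80)
```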
- change_osd_state(action: str, target: int) bool
Changes the state of the OSD daemons according to the action provided.
:param action: operation to be performed on the service, i.e. start, stop, restart
:param target: ID of the target OSD
Returns: Pass -> True, Fail -> False
- change_recover_threads(config: dict, action: str)
Increases or decreases the recovery threads based on the action sent.
:param config: config from the suite file for the run
:param action: set or remove the increased backfill / recovery threads
Values:
“set” -> set the threads to the specified value
“rm” -> remove the config changes made
- check_compression_size(pool_name: str, **kwargs) bool
Checks the given pool size against “compression_required_ratio” and verifies that data is compressed in accordance with the ratio provided.
:param pool_name: name of the pool
:param **kwargs: additional params needed.
Allowed values:
compression_required_ratio: ratio set on the pool for compression
Returns: True -> pass, False -> fail
- collect_osd_daemon_ids(osd_node) dict
Collects the various OSD daemons present on a particular node.
:param osd_node: name of the OSD node on which OSD daemon details are collected; a ceph node (ceph.ceph.CephNode)
:return: list of OSD IDs
- configure_pg_autoscaler(**kwargs) bool
Configures the PG autoscaler as a global parameter and on pools.
:param **kwargs: any other params that need to be set
- mon_target_pg_per_osd -> sets the target number of PGs per OSD
- pool_config -> config to be changed on the given pool (dict)
for supported args, see the autoscaler_pool_settings() doc
- pg_autoscale_value -> mode of PG autoscaling to be set, if a pool name is provided (str)
the allowed values are:
1. off -> turns off the PG autoscaler on the given pool
2. warn -> displays warnings in ceph status, but does not trigger autoscale
3. on -> automatically autoscales based on the PG count in the pool
- default_mode -> default mode to be set for all newly created pools on the cluster (str)
the allowed values are:
1. off -> turns off the PG autoscaler on the given pool
2. warn -> displays warnings in ceph status, but does not trigger autoscale
3. on -> automatically autoscales based on the PG count in the pool
Returns: True -> pass, False -> fail
- create_erasure_pool(name: str, **kwargs) bool
Creates an erasure code profile and then creates a pool with the same.
References: https://docs.ceph.com/en/latest/rados/operations/erasure-code/
:param name: name of the profile to create
:param **kwargs: any other params that need to be set in the EC profile
k -> the number of data chunks (int)
m -> the number of coding chunks (int)
l -> group the coding and data chunks into sets of size locality
crush-failure-domain -> CRUSH object to be used to store replica sets (str)
plugin -> plugin to be set (str)
supported plugins: 1. jerasure (default) 2. isa 3. lrc 4. shec 5. clay
pool_name -> pool name to create and associate with the EC profile being created
Returns: True -> pass, False -> fail
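The two steps described above (set a profile, then create a pool that uses it) map onto two Ceph CLI commands. The sketch below only builds the command strings; ec_pool_commands is a hypothetical helper, and whether create_erasure_pool() issues exactly these commands is an assumption.

```python
import shlex
from typing import Dict, List


def ec_pool_commands(name: str, pool_name: str, profile: Dict[str, object]) -> List[str]:
    """Hypothetical helper: commands for an EC profile and a pool using it."""
    # Profile settings are passed to the CLI as key=value pairs.
    settings = [f"{key}={value}" for key, value in profile.items()]
    set_profile = ["ceph", "osd", "erasure-code-profile", "set", name, *settings]
    create_pool = ["ceph", "osd", "pool", "create", pool_name, "erasure", name]
    return [shlex.join(set_profile), shlex.join(create_pool)]


cmds = ec_pool_commands(
    "ec_profile",
    "ec_pool",
    {"k": 4, "m": 2, "plugin": "jerasure", "crush-failure-domain": "host"},
)
for cmd in cmds:
    print(cmd)
```

With k=4 and m=2, each object is split into four data chunks plus two coding chunks, so the pool can tolerate the loss of any two chunks.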
- create_pool(pool_name: str, **kwargs) bool
Creates a pool named from the pool_name parameter.
Args:
pool_name: name of the pool being created.
kwargs: any other args that need to be passed
pg_num -> number of PGs and PGPs
ec_profile_name -> name of the EC profile if the pool being created is an EC pool
min_size -> min replication size for the pool to serve data
size -> min replication size for the pool to write data
erasure_code_use_overwrites -> allows overwrites in an erasure coded pool
allow_ec_overwrites -> this lets RBD and CephFS store their data in an erasure coded pool
disable_pg_autoscale -> sets autoscale mode off on the pool
crush_rule -> custom CRUSH rule for the pool
pool_quota -> limit the maximum number of objects or the maximum number of bytes stored
Returns: True -> pass, False -> fail
- detete_pool(pool: str) bool
Deletes the given pool from the cluster.
:param pool: name of the pool to be deleted
Returns: True -> pass, False -> fail
- disable_configuration_checks(configs: list) bool
Disables checks for the configs provided.
Note: once the module is enabled, all the config checks are enabled by default.
:param configs: list of config checks that need to be disabled. (list)
Returns: True -> Pass, False -> fail
- enable_balancer(**kwargs) bool
Enables the balancer module with the given mode.
:param kwargs: any other args that need to be passed
Supported kwargs:
- balancer_mode: there are currently two supported balancer modes (str) -> crush-compat, upmap (default)
- target_max_misplaced_ratio: the percentage of PGs that are allowed to be misplaced by the balancer (float)
target_max_misplaced_ratio = .07
- sleep_interval: number of seconds to sleep in between runs (int)
sleep_interval = 60
Returns: True -> pass, False -> fail
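For orientation, the documented kwargs correspond to settings described in the Ceph balancer documentation. The sketch below builds the matching CLI commands; balancer_commands is a hypothetical helper, and the order and exact set of commands that enable_balancer() runs are assumptions.

```python
from typing import List, Optional

# Modes named in the enable_balancer() documentation.
SUPPORTED_MODES = ("crush-compat", "upmap")


def balancer_commands(
    balancer_mode: str = "upmap",
    target_max_misplaced_ratio: Optional[float] = None,
    sleep_interval: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: Ceph CLI commands for the given balancer settings."""
    if balancer_mode not in SUPPORTED_MODES:
        raise ValueError(f"balancer_mode must be one of {SUPPORTED_MODES}")
    cmds = [f"ceph balancer mode {balancer_mode}"]
    if target_max_misplaced_ratio is not None:
        cmds.append(
            f"ceph config set mgr target_max_misplaced_ratio {target_max_misplaced_ratio}"
        )
    if sleep_interval is not None:
        cmds.append(f"ceph config set mgr mgr/balancer/sleep_interval {sleep_interval}")
    cmds.append("ceph balancer on")
    return cmds


cmds = balancer_commands(target_max_misplaced_ratio=0.07, sleep_interval=60)
```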
- enable_configuration_checks(configs: list) bool
Enables checks for the configs provided.
Note: once the module is enabled, all the config checks are enabled by default.
:param configs: list of config checks that need to be enabled. (list)
Returns: True -> Pass, False -> fail
- enable_email_alerts(**kwargs) bool
Enables the email alerts module and configures alerts to be sent.
References: https://docs.ceph.com/en/latest/mgr/alerts/
:param **kwargs: any other params that need to be set
Various args that can be passed are:
1. smtp_host
2. smtp_sender
3. smtp_ssl
4. smtp_port
5. interval
6. smtp_from_name
7. smtp_destination
Returns: True -> pass, False -> fail
- enable_file_logging() bool
Enables cluster logging into files at /var/log/ceph and checks file permissions.
Returns: True -> pass, False -> fail
- fetch_host_node(daemon_type: str, daemon_id: Optional[str] = None)
Provides the Ceph cluster object for the given daemon.
:param daemon_type: type of daemon
Allowed values: alertmanager, crash, mds, mgr, mon, osd, rgw, prometheus, grafana, node-exporter
:param daemon_id: name of the daemon, ID in case of OSDs
Returns: ceph object for the node
- get_cluster_date()
Used to get the osd parameter value.
:param cmd: command that needs to be run on the container
Returns: string value
- get_pg_acting_set(**kwargs) list
Fetches the PG details about the given pool and then returns the acting set of OSDs from a sample PG of the pool.
:param kwargs: args that can be passed to fetch the acting set
pool_name: name of the pool for which one acting set of OSDs is needed
pg_num: PG whose acting set needs to be fetched
None: collects the acting set of the pool with ID 1
Returns: list of OSDs that are part of the acting set, eg: [3,15,20]
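The acting set is the ordered list of OSDs currently serving a PG. As a minimal sketch of the extraction step, the code below reads that list out of a JSON document. The sample payload is hypothetical, loosely modeled on the shape of `ceph pg map <pgid> -f json` output, and acting_set_from_pg_map is an illustrative helper rather than part of the module.

```python
import json
from typing import List


def acting_set_from_pg_map(raw: str) -> List[int]:
    """Hypothetical helper: pull the acting OSD IDs out of PG map JSON."""
    data = json.loads(raw)
    return [int(osd) for osd in data["acting"]]


# Hypothetical sample payload; real output contains more fields.
sample = '{"epoch": 120, "pgid": "1.0", "up": [3, 15, 20], "acting": [3, 15, 20]}'
acting = acting_set_from_pg_map(sample)
print(acting)  # [3, 15, 20]
```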
- get_pool_property(pool, props)
Used to fetch a given property set on the pool.
:param pool: name of the pool
:param props: property to be fetched.
Allowed values:
size|min_size|pg_num|pgp_num|crush_rule|hashpspool|nodelete|nopgchange|nosizechange|
write_fadvise_dontneed|noscrub|nodeep-scrub|hit_set_type|hit_set_period|hit_set_count|
hit_set_fpp|use_gmt_hitset|target_max_objects|target_max_bytes|cache_target_dirty_ratio|
cache_target_dirty_high_ratio|cache_target_full_ratio|cache_min_flush_age|cache_min_evict_age|
erasure_code_profile|min_read_recency_for_promote|all|min_write_recency_for_promote|fast_read|
hit_set_grade_decay_rate|hit_set_search_last_n|scrub_min_interval|scrub_max_interval|
deep_scrub_interval|recovery_priority|recovery_op_priority|scrub_priority|compression_mode|
compression_algorithm|compression_required_ratio|compression_max_blob_size|
compression_min_blob_size|csum_type|csum_min_block|csum_max_block|allow_ec_overwrites|
fingerprint_algorithm|pg_autoscale_mode|pg_autoscale_bias|pg_num_min|target_size_bytes|
target_size_ratio|dedup_tier|dedup_chunk_algorithm|dedup_cdc_chunk_size
Returns: key-value pair for the requested property
Note: trying to fetch the value of a property that has not been set will error out
- list_pools() list
Collects the list of pools present on the cluster.
Returns: list of pool names
- pool_inline_compression(pool_name: str, **kwargs) bool
BlueStore supports inline compression using snappy, zlib, or lz4. This method sets various compression modes and other related configs.
:param pool_name: pool name on which compression needs to be enabled and configured
:param **kwargs: various args that can be passed:
- compression_mode: whether data in BlueStore is compressed is determined by the compression mode.
The modes are:
none: Never compress data.
passive: Do not compress data unless the write operation has a compressible hint set.
aggressive: Compress data unless the write operation has an incompressible hint set.
force: Try to compress data no matter what.
- compression_algorithm: compression algorithm to be used.
Supported: <empty string>, snappy, zlib, zstd, lz4
- compression_required_ratio: the ratio of the size of the data chunk after compression relative to the original size. The compressed version is stored only if this ratio is met.
eg: 0.7
- compression_min_blob_size: chunks smaller than this are never compressed.
eg: 10B
- compression_max_blob_size: chunks larger than this value are broken into smaller blobs.
eg: 10G
Returns: Pass -> True, Fail -> False
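Each of the kwargs above is a per-pool property, so one plausible way to apply them is a series of `ceph osd pool set` commands. The sketch below validates the documented modes and algorithms and builds those commands; compression_commands is a hypothetical helper, and the assumption that pool_inline_compression() works this way is not confirmed by the source.

```python
import shlex
from typing import List

# Values listed in the pool_inline_compression() documentation.
MODES = ("none", "passive", "aggressive", "force")
ALGORITHMS = ("", "snappy", "zlib", "zstd", "lz4")


def compression_commands(pool_name: str, **kwargs: object) -> List[str]:
    """Hypothetical helper: one `ceph osd pool set` command per kwarg."""
    mode = kwargs.get("compression_mode")
    if mode is not None and mode not in MODES:
        raise ValueError(f"compression_mode must be one of {MODES}")
    algorithm = kwargs.get("compression_algorithm")
    if algorithm is not None and algorithm not in ALGORITHMS:
        raise ValueError(f"compression_algorithm must be one of {ALGORITHMS}")
    return [
        f"ceph osd pool set {shlex.quote(pool_name)} {key} {shlex.quote(str(value))}"
        for key, value in kwargs.items()
    ]


cmds = compression_commands(
    "test_pool",
    compression_mode="aggressive",
    compression_algorithm="snappy",
    compression_required_ratio=0.7,
)
```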
- reweight_crush_items(**kwargs) bool
Performs a re-weight of various CRUSH items, based on the key-value pairs sent.
:param **kwargs: arguments for the commands
Returns: True -> pass, False -> fail
- run_ceph_command(cmd: str, timeout: int = 300)
Runs ceph commands with the JSON tag for the action specified; otherwise treats the action as a command and returns formatted output.
:param cmd: command that needs to be run
:param timeout: maximum time allowed for execution.
Returns: dictionary of the output
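The general pattern here is to request JSON output from the CLI and parse it into a dictionary. The sketch below shows that pattern with an injectable runner so it can be tried without a cluster; run_json_command and fake_runner are hypothetical names, the sample payload is invented, and the real method's handling of timeouts and errors is not shown.

```python
import json
from typing import Callable, Dict


def run_json_command(cmd: str, runner: Callable[[str], str]) -> Dict:
    """Hypothetical helper: run a ceph command with JSON output and parse it."""
    # `-f json` asks the ceph CLI for machine-readable output.
    return json.loads(runner(f"{cmd} -f json"))


def fake_runner(cmd: str) -> str:
    # Stand-in for real command execution; returns an invented payload.
    return '{"health": {"status": "HEALTH_OK"}}'


status = run_json_command("ceph -s", fake_runner)
print(status["health"]["status"])  # HEALTH_OK
```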
- run_deep_scrub(**kwargs)
Runs a deep scrub on the given OSD or on all OSDs.
Args:
kwargs:
1. osd: if an OSD ID is passed, the deep scrub is triggered on that OSD
eg: obj.run_deep_scrub(osd=3)
Returns: True -> pass, False -> fail
- run_scrub(**kwargs)
Runs a scrub on the given OSD or on all OSDs.
Args:
kwargs:
1. osd: if an OSD ID is passed, the scrub is triggered on that OSD
eg: obj.run_scrub(osd=3)
Returns: True -> pass, False -> fail
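Both scrub methods follow the same rule: target one OSD when an ID is given, otherwise target all OSDs. A minimal sketch of that selection logic follows; scrub_command is a hypothetical helper, and the use of `ceph osd scrub` and `ceph osd deep-scrub` with the `all` target is an assumption about what the methods run.

```python
from typing import Optional


def scrub_command(deep: bool = False, osd: Optional[int] = None) -> str:
    """Hypothetical helper: choose the scrub command and its target."""
    action = "deep-scrub" if deep else "scrub"
    # Without an OSD ID, the scrub is requested on every OSD.
    target = "all" if osd is None else str(osd)
    return f"ceph osd {action} {target}"


print(scrub_command(osd=3))       # ceph osd scrub 3
print(scrub_command(deep=True))   # ceph osd deep-scrub all
```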
- set_cluster_configuration_checks(**kwargs) bool
Sets up Cephadm to periodically scan each of the hosts in the cluster, and to understand the state of the OS, disks, NICs etc.
ref doc: https://docs.ceph.com/en/latest/cephadm/operations/#cluster-configuration-checks
:param kwargs: any other params that need to be passed
The allowed list of configuration values that can be sent are:
1. disable_check_list: list of config checks that need to be disabled. (list)
2. enable_check_list: list of config checks that need to be enabled. (list)
The checks that can be enabled or disabled are:
1. kernel_security: checks SELINUX/Apparmor profiles are consistent across cluster hosts
2. os_subscription: checks subscription states are consistent for all cluster hosts
3. public_network: check that all hosts have a NIC on the Ceph public_network
4. osd_mtu_size: check that OSD hosts share a common MTU setting
5. osd_linkspeed: check that OSD hosts share a common linkspeed
6. network_missing: checks that the cluster/public networks defined exist on the Ceph hosts
7. ceph_release: check for Ceph version consistency - ceph daemons should be on the same release
8. kernel_version: checks that the MAJ.MIN of the kernel on Ceph hosts is consistent
Returns: True -> pass, False -> fail
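The referenced cephadm documentation describes enabling the feature with a config option and toggling individual checks with `ceph cephadm config-check enable|disable <name>`. The sketch below turns the two documented lists into those commands; config_check_commands is a hypothetical helper, and the exact sequence run by set_cluster_configuration_checks() is an assumption.

```python
from typing import Iterable, List

# Check names listed in the set_cluster_configuration_checks() documentation.
KNOWN_CHECKS = {
    "kernel_security", "os_subscription", "public_network", "osd_mtu_size",
    "osd_linkspeed", "network_missing", "ceph_release", "kernel_version",
}


def config_check_commands(
    enable_check_list: Iterable[str] = (),
    disable_check_list: Iterable[str] = (),
) -> List[str]:
    """Hypothetical helper: commands to switch config checks on or off."""
    cmds = ["ceph config set mgr mgr/cephadm/config_checks_enabled true"]
    for action, names in (("enable", enable_check_list), ("disable", disable_check_list)):
        for name in names:
            if name not in KNOWN_CHECKS:
                raise ValueError(f"unknown config check: {name}")
            cmds.append(f"ceph cephadm config-check {action} {name}")
    return cmds


cmds = config_check_commands(
    enable_check_list=["kernel_security"],
    disable_check_list=["osd_mtu_size", "osd_linkspeed"],
)
```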
- set_pool_property(pool, props, value)
Used to set a given property on the pool.
:param pool: name of the pool
:param props: property to be set on the pool.
Allowed values:
size|min_size|pg_num|pgp_num|crush_rule|hashpspool|nodelete|nopgchange|nosizechange|
write_fadvise_dontneed|noscrub|nodeep-scrub|hit_set_type|hit_set_period|hit_set_count|
hit_set_fpp|use_gmt_hitset|target_max_objects|target_max_bytes|cache_target_dirty_ratio|
cache_target_dirty_high_ratio|cache_target_full_ratio|cache_min_flush_age|cache_min_evict_age|
erasure_code_profile|min_read_recency_for_promote|all|min_write_recency_for_promote|fast_read|
hit_set_grade_decay_rate|hit_set_search_last_n|scrub_min_interval|scrub_max_interval|
deep_scrub_interval|recovery_priority|recovery_op_priority|scrub_priority|compression_mode|
compression_algorithm|compression_required_ratio|compression_max_blob_size|
compression_min_blob_size|csum_type|csum_min_block|csum_max_block|allow_ec_overwrites|
fingerprint_algorithm|pg_autoscale_mode|pg_autoscale_bias|pg_num_min|target_size_bytes|
target_size_ratio|dedup_tier|dedup_chunk_algorithm|dedup_cdc_chunk_size
:param value: value to be set for the property
Returns: Pass -> True, Fail -> False
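The pool property methods mirror the `ceph osd pool set` and `ceph osd pool get` CLI commands. A minimal sketch of the command strings involved is shown below; pool_set_command and pool_get_command are hypothetical helpers, and that the module issues exactly these commands is an assumption.

```python
import shlex


def pool_set_command(pool: str, props: str, value: object) -> str:
    """Hypothetical helper: command to set one property on a pool."""
    return shlex.join(["ceph", "osd", "pool", "set", pool, props, str(value)])


def pool_get_command(pool: str, props: str) -> str:
    """Hypothetical helper: command to read one property back from a pool."""
    return shlex.join(["ceph", "osd", "pool", "get", pool, props])


set_cmd = pool_set_command("test_pool", "pg_num_min", 16)
get_cmd = pool_get_command("test_pool", "pg_num_min")
```

Following the documented signatures, the equivalent calls on an orchestrator object would take the form set_pool_property(pool="test_pool", props="pg_num_min", value=16) followed by get_pool_property(pool="test_pool", props="pg_num_min").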
- verify_ec_overwrites(**kwargs) bool
Creates an RBD image on an EC pool with overwrites enabled and a replicated metadata pool.
:param **kwargs: various kwargs to be sent
Supported kwargs:
image_name: name of the RBD image
image_size: size of the RBD image
Returns: True -> pass, False -> fail
- verify_reweight(affected_osds: list, osd_info: list) bool
Verifies whether the re-weight of various CRUSH items reduced the data on the re-weighted OSDs.
:param affected_osds: OSDs whose weights were changed
:param osd_info: OSD details before the re-weight was performed
Returns: Pass -> True, Fail -> False
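The check described above amounts to comparing per-OSD usage before and after the re-weight. The sketch below shows one way to express that comparison; data_reduced is a hypothetical helper, and the "id" and "kb_used" field names are assumptions loosely modeled on `ceph osd df` JSON output, not the structure the module necessarily uses.

```python
from typing import Dict, List


def data_reduced(
    affected_osds: List[int],
    before: List[Dict[str, int]],
    after: List[Dict[str, int]],
) -> bool:
    """Hypothetical helper: True if every affected OSD holds less data than before."""
    used_before = {node["id"]: node["kb_used"] for node in before}
    used_after = {node["id"]: node["kb_used"] for node in after}
    return all(used_after[osd] < used_before[osd] for osd in affected_osds)


# Invented sample data for two OSDs.
before = [{"id": 3, "kb_used": 9000}, {"id": 7, "kb_used": 8000}]
after = [{"id": 3, "kb_used": 6500}, {"id": 7, "kb_used": 8200}]
print(data_reduced([3], before, after))     # True
print(data_reduced([3, 7], before, after))  # False
```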