ceph.rados.core_workflows module

The core_workflows module is a RADOS-layer configuration module for a Ceph cluster. It lets us perform various day-1 and day-2 operations, such as:

  1. Creating, modifying, setting, getting, writing, scrubbing, and reading various pools, such as EC and replicated pools

  2. Increasing and decreasing PG counts, and enabling, disabling, and configuring the modules that do this

  3. Enabling logging to file, setting and resetting config params, and running cluster checks

  4. Setting up email alerts and other cluster operations

More operations will be added as needed.

class ceph.rados.core_workflows.RadosOrchestrator(node: CephAdmin)

Bases: object

The RadosOrchestrator class contains methods that perform various day-1 and day-2 operations on the cluster.

Usage: The class is initialized with the CephAdmin object for various operations.

autoscaler_pool_settings(**kwargs)

Sets various options on pools with respect to the PG autoscaler.

:param **kwargs: various kwargs to be sent

Supported kw args:

  1. pg_autoscale_mode -> PG autoscaler mode for the individual pool. Values: on, warn, off. (str)

  2. target_size_ratio -> ratio of the cluster that the pool will utilize. Values: 0 - 1. (float)

  3. target_size_bytes -> size the pool is assumed to utilize, eg: 10T (str)

  4. pg_num_min -> minimum PGs for a pool (int)

Returns:

bench_read(pool_name: str, **kwargs) bool

Method to trigger read operations via the Rados Bench tool.

:param pool_name: pool on which the operation will be performed
:param kwargs: Any other param that needs to be passed

  1. rados_read_duration -> duration of read operation (int)

Returns: True -> pass, False -> fail

bench_write(pool_name: str, **kwargs) bool

Method to trigger write operations via the Rados Bench tool.

:param pool_name: pool on which the operation will be performed
:param kwargs: Any other param that needs to be passed

  1. rados_write_duration -> duration of write operation (int)

  2. byte_size -> size of objects to be written, eg: 10KB, 4096 (str)

Returns: True -> pass, False -> fail
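A minimal sketch of chaining the two bench methods, using the kwargs documented above. The RecordingRados class is a hypothetical stand-in that only records calls; on a real cluster an initialized RadosOrchestrator would take its place. The helper name write_then_read is also an assumption, not part of the module.

```python
class RecordingRados:
    """Hypothetical stand-in that records calls instead of touching a cluster."""

    def __init__(self):
        self.calls = []

    def bench_write(self, pool_name, **kwargs):
        self.calls.append(("bench_write", pool_name, kwargs))
        return True

    def bench_read(self, pool_name, **kwargs):
        self.calls.append(("bench_read", pool_name, kwargs))
        return True


def write_then_read(rados_obj, pool_name, seconds=30, byte_size="4096"):
    """Write objects to the pool, then read them back; stop at the first failure."""
    wrote = rados_obj.bench_write(
        pool_name, rados_write_duration=seconds, byte_size=byte_size
    )
    if not wrote:
        return False
    return rados_obj.bench_read(pool_name, rados_read_duration=seconds)


fake = RecordingRados()
ok = write_then_read(fake, "test_pool", seconds=10, byte_size="10KB")
```

Writing first matters here because a read benchmark needs objects in the pool to read back.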

change_osd_state(action: str, target: int) bool

Changes the state of the OSD daemons according to the action provided.

:param action: operation to be performed on the service, i.e. start, stop, restart
:param target: ID of the target OSD

Returns: Pass -> True, Fail -> False

change_recover_threads(config: dict, action: str)

Increases or decreases the recovery threads based on the action sent.

:param config: Config from the suite file for the run
:param action: Set or remove the increase of the backfill / recovery threads

Values:

  “set” -> set the threads to the specified value

  “rm” -> remove the config changes made
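Because “set” and “rm” come as a pair, one way to make sure the change is always reverted is a context manager. This is a sketch under assumptions: RecordingRados is a hypothetical stand-in, recovery_threads_raised is an illustrative name, and the empty suite_config dict is a placeholder since the expected config keys are not listed here.

```python
from contextlib import contextmanager


class RecordingRados:
    """Hypothetical stand-in that records the actions it receives."""

    def __init__(self):
        self.actions = []

    def change_recover_threads(self, config, action):
        self.actions.append(action)


@contextmanager
def recovery_threads_raised(rados_obj, config):
    """Apply the thread settings on entry and remove them on exit, even on error."""
    rados_obj.change_recover_threads(config=config, action="set")
    try:
        yield
    finally:
        rados_obj.change_recover_threads(config=config, action="rm")


fake = RecordingRados()
suite_config = {}  # placeholder: the real keys come from the suite file
with recovery_threads_raised(fake, suite_config):
    actions_inside = list(fake.actions)
```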

check_compression_size(pool_name: str, **kwargs) bool

Checks the given pool size against “compression_required_ratio” and verifies that data is compressed in accordance with the ratio provided.

:param pool_name: Name of the pool
:param **kwargs: additional params needed.

Allowed values:

compression_required_ratio: ratio set on the pool for compression

Returns: True -> pass, False -> fail

collect_osd_daemon_ids(osd_node) dict

Collects the various OSD daemons present on a particular node.

:param osd_node: ceph node (ceph.ceph.CephNode) on which OSD daemon details are collected
:return: list of OSD IDs

configure_pg_autoscaler(**kwargs) bool

Configures the PG autoscaler as a global parameter and on pools.

:param **kwargs: Any other param that needs to be set

  1. mon_target_pg_per_osd -> Sets the target number of PGs per OSD

  2. pool_config -> Config to be changed on the given pool (dict)

    for supported args, see the autoscaler_pool_settings() doc

  3. pg_autoscale_value -> Mode of PG autoscaling to be set, if pool name is provided (str)

    the allowed values are:

      off -> turns off PG autoscaler on the given pool
      warn -> displays warnings in ceph status, but does not trigger autoscale
      on -> automatically autoscale based on PG count in pool

  4. default_mode -> Default mode to be set for all the newly created pools on the cluster (str)

    the allowed values are:

      off -> turns off PG autoscaler on the given pool
      warn -> displays warnings in ceph status, but does not trigger autoscale
      on -> automatically autoscale based on PG count in pool

Returns: True -> pass, False -> fail
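The allowed values above can be checked before anything is sent to the cluster. The helper below is a sketch, not part of the module: autoscaler_kwargs and ALLOWED_MODES are illustrative names, and whether pool_config also needs a pool name key is not stated here, so none is added.

```python
ALLOWED_MODES = ("on", "warn", "off")


def autoscaler_kwargs(default_mode=None, mon_target_pg_per_osd=None, pool_config=None):
    """Validate the documented values and return kwargs for configure_pg_autoscaler()."""
    kwargs = {}
    if default_mode is not None:
        if default_mode not in ALLOWED_MODES:
            raise ValueError(f"default_mode must be one of {ALLOWED_MODES}")
        kwargs["default_mode"] = default_mode
    if mon_target_pg_per_osd is not None:
        kwargs["mon_target_pg_per_osd"] = int(mon_target_pg_per_osd)
    if pool_config is not None:
        mode = pool_config.get("pg_autoscale_mode")
        if mode is not None and mode not in ALLOWED_MODES:
            raise ValueError(f"pg_autoscale_mode must be one of {ALLOWED_MODES}")
        ratio = pool_config.get("target_size_ratio")
        if ratio is not None and not 0 <= ratio <= 1:
            raise ValueError("target_size_ratio must be between 0 and 1")
        kwargs["pool_config"] = dict(pool_config)
    return kwargs


settings = autoscaler_kwargs(
    default_mode="warn",
    mon_target_pg_per_osd=100,
    pool_config={"pg_autoscale_mode": "on", "target_size_ratio": 0.3, "pg_num_min": 16},
)
# On a live cluster the result would then be passed on, for example:
#   rados_obj.configure_pg_autoscaler(**settings)
```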

create_erasure_pool(name: str, **kwargs) bool

Creates an erasure code profile and then creates a pool with the same.

References: https://docs.ceph.com/en/latest/rados/operations/erasure-code/

:param name: Name of the profile to create
:param **kwargs: Any other param that needs to be set in the EC profile

  1. k -> the number of data chunks (int)

  2. m -> the number of coding chunks (int)

  3. l -> Group the coding and data chunks into sets of size locality.

  4. crush-failure-domain -> crush object to be used to store replica sets (str)

  5. plugin -> plugin to be set (str)

    supported plugins: 1. jerasure (default) 2. isa 3. lrc 4. shec 5. clay

  6. pool_name -> pool name to create and associate with the EC profile being created

Returns: True -> pass, False -> fail
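One practical detail: crush-failure-domain contains hyphens, so it cannot be written as a Python keyword argument directly and has to be passed by unpacking a dict. The sketch below shows this with a hypothetical helper (ec_pool_kwargs) and a stand-in function (record_call) that takes the place of create_erasure_pool; the positive-integer check on k and m is a sanity check assumed for the example.

```python
SUPPORTED_PLUGINS = ("jerasure", "isa", "lrc", "shec", "clay")


def ec_pool_kwargs(pool_name, k, m, plugin="jerasure", failure_domain=None):
    """Build the kwargs documented for create_erasure_pool()."""
    if plugin not in SUPPORTED_PLUGINS:
        raise ValueError(f"plugin must be one of {SUPPORTED_PLUGINS}")
    if k < 1 or m < 1:
        raise ValueError("k and m must be positive integers")
    kwargs = {"pool_name": pool_name, "k": k, "m": m, "plugin": plugin}
    if failure_domain is not None:
        # Hyphenated key: only reachable through ** unpacking
        kwargs["crush-failure-domain"] = failure_domain
    return kwargs


def record_call(name, **kwargs):
    """Stand-in for create_erasure_pool(); returns what it received."""
    return name, kwargs


profile, received = record_call(
    "ec_profile_4_2", **ec_pool_kwargs("ec_pool", k=4, m=2, failure_domain="host")
)
```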

create_pool(pool_name: str, **kwargs) bool
Create a pool named from the pool_name parameter.

Args:

  pool_name: name of the pool being created.

  kwargs: Any other args that need to be passed

  1. pg_num -> number of PG’s and PGP’s

  2. ec_profile_name -> name of EC profile if pool being created is a EC pool

  3. min_size -> minimum number of replicas the pool needs in order to serve data

  4. size -> number of replicas the pool keeps for each object

  5. erasure_code_use_overwrites -> allows overwrites in an erasure coded pool

  6. allow_ec_overwrites -> This lets RBD and CephFS store their data in an erasure coded pool

  7. disable_pg_autoscale -> sets auto-scale mode off on the pool

  8. crush_rule -> custom crush rule for the pool

  9. pool_quota -> limit the maximum number of objects or the maximum number of bytes stored

Returns: True -> pass, False -> fail

detete_pool(pool: str) bool

Deletes the given pool from the cluster.

:param pool: name of the pool to be deleted

Returns: True -> pass, False -> fail
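Pool creation and deletion pair naturally in tests. The sketch below wraps create_pool() and detete_pool() (spelled as in the signature above) in a context manager so the pool is removed even if the test body fails. InMemoryRados is a hypothetical stand-in and temporary_pool is an illustrative name.

```python
from contextlib import contextmanager


class InMemoryRados:
    """Hypothetical stand-in that tracks pools in a set."""

    def __init__(self):
        self.pools = set()

    def create_pool(self, pool_name, **kwargs):
        self.pools.add(pool_name)
        return True

    def detete_pool(self, pool):  # spelling matches the documented method name
        self.pools.discard(pool)
        return True


@contextmanager
def temporary_pool(rados_obj, pool_name, **kwargs):
    """Create a pool, hand control to the caller, then always delete the pool."""
    if not rados_obj.create_pool(pool_name, **kwargs):
        raise RuntimeError(f"could not create pool {pool_name}")
    try:
        yield pool_name
    finally:
        rados_obj.detete_pool(pool=pool_name)


fake = InMemoryRados()
with temporary_pool(fake, "scratch_pool", pg_num=32) as name:
    existed_inside = name in fake.pools
existed_after = "scratch_pool" in fake.pools
```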

disable_configuration_checks(configs: list) bool

Disables checks for the configs provided.

Note: Once the module is enabled, all the config checks are enabled by default.

:param configs: list of config checks that need to be disabled. (list)

Returns: True -> Pass, False -> fail

enable_balancer(**kwargs) bool

Enables the balancer module with the given mode.

:param kwargs: Any other args that need to be passed

Supported kw args:

  1. balancer_mode -> There are currently two supported balancer modes (str): crush-compat, upmap (default)

  2. target_max_misplaced_ratio -> the percentage of PGs that are allowed to be misplaced by the balancer (float)

    target_max_misplaced_ratio = .07

  3. sleep_interval -> number of seconds to sleep in between runs (int)

    sleep_interval = 60

Returns: True -> pass, False -> fail

enable_configuration_checks(configs: list) bool

Enables checks for the configs provided.

Note: Once the module is enabled, all the config checks are enabled by default.

:param configs: list of config checks that need to be enabled. (list)

Returns: True -> Pass, False -> fail

enable_email_alerts(**kwargs) bool

Enables the email alerts module and configures alerts to be sent.

References: https://docs.ceph.com/en/latest/mgr/alerts/

:param **kwargs: Any other param that needs to be set

Various args that can be passed are:

  1. smtp_host

  2. smtp_sender

  3. smtp_ssl

  4. smtp_port

  5. interval

  6. smtp_from_name

  7. smtp_destination

Returns: True -> pass, False -> fail

enable_file_logging() bool

Enables the cluster logging into files at var/log/ceph and checks file permissions.

Returns: True -> pass, False -> fail

fetch_host_node(daemon_type: str, daemon_id: Optional[str] = None)

Provides the Ceph cluster object for the given daemon.

Parameters
  • daemon_type – type of daemon. Allowed values: alertmanager, crash, mds, mgr, mon, osd, rgw, prometheus, grafana, node-exporter

  • daemon_id – name of the daemon, ID in case of OSDs

Returns: ceph object for the node

get_cluster_date()

Used to get the osd parameter value

:param cmd: Command that needs to be run on container

Returns : string value

get_pg_acting_set(**kwargs) list

Fetches the PG details about the given pool and then returns the acting set of OSDs from a sample PG of the pool.

:param kwargs: Args that can be passed to fetch the acting set

  pool_name: name of the pool for which the acting OSD set of one PG is needed

  pg_num: PG whose acting set needs to be fetched

  None: Collects the acting set of the pool with ID 1

Returns: list of OSDs that are part of the acting set, eg: [3,15,20]

get_pool_property(pool, props)

Used to fetch a given property set on the pool.

:param pool: name of the pool
:param props: property to be fetched.

Allowed values : size|min_size|pg_num|pgp_num|crush_rule|hashpspool|nodelete|nopgchange|nosizechange| write_fadvise_dontneed|noscrub|nodeep-scrub|hit_set_type|hit_set_period|hit_set_count| hit_set_fpp|use_gmt_hitset|target_max_objects|target_max_bytes|cache_target_dirty_ratio| cache_target_dirty_high_ratio|cache_target_full_ratio|cache_min_flush_age|cache_min_evict_age| erasure_code_profile|min_read_recency_for_promote|all|min_write_recency_for_promote|fast_read| hit_set_grade_decay_rate|hit_set_search_last_n|scrub_min_interval|scrub_max_interval| deep_scrub_interval|recovery_priority|recovery_op_priority|scrub_priority|compression_mode| compression_algorithm|compression_required_ratio|compression_max_blob_size| compression_min_blob_size|csum_type|csum_min_block|csum_max_block|allow_ec_overwrites| fingerprint_algorithm|pg_autoscale_mode|pg_autoscale_bias|pg_num_min|target_size_bytes| target_size_ratio|dedup_tier|dedup_chunk_algorithm|dedup_cdc_chunk_size

Returns: key value pair for the requested property

Note: Trying to fetch the value for a property that has not been set will error out.

list_pools() list

Collects the list of pools present on the cluster.

Returns: list of pool names

pool_inline_compression(pool_name: str, **kwargs) bool

BlueStore supports inline compression using snappy, zlib, or lz4. This method sets various compression modes and other related configs.

:param pool_name: pool name on which compression needs to be enabled and configured
:param **kwargs: Various args that can be passed:

  1. compression_mode -> Whether data in BlueStore is compressed is determined by the compression mode.
    The modes are:

      none: Never compress data.
      passive: Do not compress data unless the write operation has a compressible hint set.
      aggressive: Compress data unless the write operation has an incompressible hint set.
      force: Try to compress data no matter what.

  2. compression_algorithm -> compression algorithm to be used.
    Supported:

      <empty string> snappy zlib zstd lz4

  3. compression_required_ratio -> The ratio of the size of the data chunk after compression, relative to the original size.

    eg : 0.7

  4. compression_min_blob_size -> Chunks smaller than this are never compressed.

    eg : 10B

  5. compression_max_blob_size -> Chunks larger than this value are broken into smaller blobs.

    eg : 10G

Returns: Pass -> True, Fail -> False
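The modes and algorithms listed above lend themselves to a small pre-flight check. The following is a sketch with illustrative names (compression_kwargs, MODES, ALGORITHMS); the assumption that the required ratio lies between 0 and 1 follows from the 0.7 example and is not stated as a rule here.

```python
MODES = ("none", "passive", "aggressive", "force")
ALGORITHMS = ("", "snappy", "zlib", "zstd", "lz4")


def compression_kwargs(mode, algorithm="snappy", required_ratio=None,
                       min_blob_size=None, max_blob_size=None):
    """Validate and assemble kwargs for pool_inline_compression()."""
    if mode not in MODES:
        raise ValueError(f"compression_mode must be one of {MODES}")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"compression_algorithm must be one of {ALGORITHMS}")
    kwargs = {"compression_mode": mode, "compression_algorithm": algorithm}
    if required_ratio is not None:
        if not 0 < required_ratio <= 1:  # assumed range, based on the 0.7 example
            raise ValueError("compression_required_ratio must be in (0, 1]")
        kwargs["compression_required_ratio"] = required_ratio
    if min_blob_size is not None:
        kwargs["compression_min_blob_size"] = min_blob_size
    if max_blob_size is not None:
        kwargs["compression_max_blob_size"] = max_blob_size
    return kwargs


settings = compression_kwargs("aggressive", "zlib", required_ratio=0.7,
                              min_blob_size="10B", max_blob_size="10G")
# On a live cluster the result would then be passed on, for example:
#   rados_obj.pool_inline_compression(pool_name="test_pool", **settings)
```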

reweight_crush_items(**kwargs) bool

Performs re-weight of various CRUSH items, based on key-value pairs sent.

:param **kwargs: Arguments for the commands

Returns: True -> pass, False -> fail

run_ceph_command(cmd: str, timeout: int = 300)

Runs ceph commands with the json tag for the action specified; otherwise treats the action as a command and returns formatted output.

:param cmd: Command that needs to be run
:param timeout: Maximum time allowed for execution.

Returns: dictionary of the output
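How the json tag is applied is not shown here. One plausible reading, sketched below, is that the ceph CLI's -f json option is appended and the output is parsed into a dictionary. The helper names and the sample output string are illustrative assumptions, not the module's implementation.

```python
import json


def add_json_format(cmd):
    """Append '-f json' unless the command already requests an output format."""
    parts = cmd.split()
    has_format = (
        "-f" in parts
        or "--format" in parts
        or any(p.startswith("--format=") for p in parts)
    )
    return cmd if has_format else f"{cmd} -f json"


def parse_json_output(raw):
    """Turn the raw JSON text printed by the command into Python objects."""
    return json.loads(raw)


full_cmd = add_json_format("ceph osd pool ls detail")
sample_output = '{"pool_name": "test_pool", "size": 3}'  # made-up output for illustration
parsed = parse_json_output(sample_output)
```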

run_deep_scrub(**kwargs)
Run deep scrub on the given OSD or on all OSDs

Args:

  kwargs: 1. osd : if an OSD id is passed, deep scrub is triggered on that OSD

  eg: obj.run_deep_scrub(osd=3)

Returns: True -> pass, False -> fail

run_scrub(**kwargs)
Run scrub on the given OSD or on all OSDs

Args:

  kwargs: 1. osd : if an OSD id is passed, scrub is triggered on that OSD

  eg: obj.run_scrub(osd=3)

Returns: True -> pass, False -> fail
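The scrub methods combine naturally with get_pg_acting_set(): fetch the acting set of a pool, then scrub each OSD in it. This is a sketch; RecordingRados is a hypothetical stand-in that returns the example acting set [3, 15, 20], and scrub_acting_set is an illustrative helper name.

```python
class RecordingRados:
    """Hypothetical stand-in that records which OSDs were scrubbed."""

    def __init__(self):
        self.scrubbed = []

    def get_pg_acting_set(self, **kwargs):
        return [3, 15, 20]

    def run_scrub(self, **kwargs):
        self.scrubbed.append(("scrub", kwargs.get("osd")))
        return True

    def run_deep_scrub(self, **kwargs):
        self.scrubbed.append(("deep-scrub", kwargs.get("osd")))
        return True


def scrub_acting_set(rados_obj, pool_name, deep=False):
    """Scrub every OSD in the acting set of a sample PG of the pool."""
    acting = rados_obj.get_pg_acting_set(pool_name=pool_name)
    run = rados_obj.run_deep_scrub if deep else rados_obj.run_scrub
    # Build the full list first so every OSD is scrubbed even if one call fails
    results = [run(osd=osd_id) for osd_id in acting]
    return all(results)


fake = RecordingRados()
ok = scrub_acting_set(fake, "test_pool", deep=True)
```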

set_cluster_configuration_checks(**kwargs) bool
Sets up Cephadm to periodically scan each of the hosts in the cluster, and to understand the state of the OS, disks, NICs etc.

ref doc : https://docs.ceph.com/en/latest/cephadm/operations/#cluster-configuration-checks

Parameters
  • kwargs – Any other param that needs to be passed

The allowed list of configuration values that can be sent are:

  1. disable_check_list – list of config checks that need to be disabled. (list)

  2. enable_check_list – list of config checks that need to be enabled. (list)

The checks that can be listed are:

  1. kernel_security – checks SELINUX/Apparmor profiles are consistent across cluster hosts

  2. os_subscription – checks subscription states are consistent for all cluster hosts

  3. public_network – check that all hosts have a NIC on the Ceph public_network

  4. osd_mtu_size – check that OSD hosts share a common MTU setting

  5. osd_linkspeed – check that OSD hosts share a common linkspeed

  6. network_missing – checks that the cluster/public networks defined exist on the Ceph hosts

  7. ceph_release – check for Ceph version consistency - ceph daemons should be on the same release

  8. kernel_version – checks that the MAJ.MIN of the kernel on Ceph hosts is consistent

Returns: True -> pass, False -> fail
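Since the check names are a fixed set, a typo in either list can be caught before calling the method. The helper below is a sketch with illustrative names (config_check_kwargs, KNOWN_CHECKS); rejecting a check that appears in both lists is an extra safeguard assumed for the example.

```python
KNOWN_CHECKS = frozenset({
    "kernel_security", "os_subscription", "public_network", "osd_mtu_size",
    "osd_linkspeed", "network_missing", "ceph_release", "kernel_version",
})


def config_check_kwargs(enable=(), disable=()):
    """Build kwargs for set_cluster_configuration_checks() from two check lists."""
    unknown = (set(enable) | set(disable)) - KNOWN_CHECKS
    if unknown:
        raise ValueError(f"unknown checks: {sorted(unknown)}")
    overlap = set(enable) & set(disable)
    if overlap:
        raise ValueError(f"checks both enabled and disabled: {sorted(overlap)}")
    kwargs = {}
    if enable:
        kwargs["enable_check_list"] = list(enable)
    if disable:
        kwargs["disable_check_list"] = list(disable)
    return kwargs


settings = config_check_kwargs(
    enable=["kernel_security", "ceph_release"],
    disable=["osd_linkspeed"],
)
# On a live cluster the result would then be passed on, for example:
#   rados_obj.set_cluster_configuration_checks(**settings)
```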

set_pool_property(pool, props, value)

Used to set a given property on the pool.

:param pool: name of the pool
:param props: property to be set on pool.
:param value: value to be set for the property

Allowed values for props : size|min_size|pg_num|pgp_num|crush_rule|hashpspool|nodelete|nopgchange|nosizechange| write_fadvise_dontneed|noscrub|nodeep-scrub|hit_set_type|hit_set_period|hit_set_count| hit_set_fpp|use_gmt_hitset|target_max_objects|target_max_bytes|cache_target_dirty_ratio| cache_target_dirty_high_ratio|cache_target_full_ratio|cache_min_flush_age|cache_min_evict_age| erasure_code_profile|min_read_recency_for_promote|all|min_write_recency_for_promote|fast_read| hit_set_grade_decay_rate|hit_set_search_last_n|scrub_min_interval|scrub_max_interval| deep_scrub_interval|recovery_priority|recovery_op_priority|scrub_priority|compression_mode| compression_algorithm|compression_required_ratio|compression_max_blob_size| compression_min_blob_size|csum_type|csum_min_block|csum_max_block|allow_ec_overwrites| fingerprint_algorithm|pg_autoscale_mode|pg_autoscale_bias|pg_num_min|target_size_bytes| target_size_ratio|dedup_tier|dedup_chunk_algorithm|dedup_cdc_chunk_size

Returns: Pass -> True, Fail -> False
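A common pattern is to set a property and read it back with get_pool_property() to confirm it took effect. The sketch below assumes the returned key value pair is a mapping keyed by the property name, which is not confirmed here. InMemoryRados is a hypothetical stand-in and set_and_verify is an illustrative helper name.

```python
class InMemoryRados:
    """Hypothetical stand-in that stores pool properties in a dict."""

    def __init__(self):
        self.props = {}

    def set_pool_property(self, pool, props, value):
        self.props[(pool, props)] = value
        return True

    def get_pool_property(self, pool, props):
        # Assumed return shape: {property_name: value}
        return {props: self.props[(pool, props)]}


def set_and_verify(rados_obj, pool, prop, value):
    """Set a pool property, then read it back and compare."""
    if not rados_obj.set_pool_property(pool=pool, props=prop, value=value):
        return False
    current = rados_obj.get_pool_property(pool=pool, props=prop)
    # Compare as strings, since the stored value may come back as int or str
    return str(current.get(prop)) == str(value)


fake = InMemoryRados()
ok = set_and_verify(fake, "test_pool", "min_size", 2)
```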

verify_ec_overwrites(**kwargs) bool

Creates an RBD image on an EC pool with overwrites enabled and a replicated metadata pool.

:param **kwargs: various kwargs to be sent

Supported kw args:
  1. image_name : name of the RBD image

  2. image_size : size of the RBD image

Returns: True -> pass, False -> fail

verify_reweight(affected_osds: list, osd_info: list) bool

Verifies whether the re-weight of various CRUSH items reduced the data on the re-weighted OSDs.

:param affected_osds: OSDs whose weights were changed
:param osd_info: OSD details before the re-weight was performed

Returns: Pass -> True, Fail -> False