Ceph Community News (2022-03-15 to 2022-04-15)

rosinL · 2022-04-26 · Tags: Ceph, News, Quincy, openEuler, ODD

Ceph Progress in the openEuler Community

openEuler Developer Day (ODD) 2022 was an online developer conference held from April 13 to April 15, 2022, hosted by the openEuler community of the OpenAtom Foundation. The conference explored technical innovation for openEuler in server, cloud computing, edge computing, and embedded scenarios, and officially released the all-scenario, long-term support version openEuler 22.03 LTS. It also presented joint innovation achievements with partners, shared business cases of large-scale openEuler deployments across multiple industries, and held online meetings of community governance organizations. openEuler Ceph, built on native Ceph, is a high-quality, high-reliability, high-performance version tuned for the ARM architecture. The Ceph storage topics at ODD 2022 were as follows:

  • Impact of Cache Policies on Ceph Performance (Chi Xinze, XSKY)
    • This presentation described the impact of cache policies of different levels on performance in the Ceph RADOS architecture.
  • Improving Ceph and Big Data Application Performance Based on the UADK Accelerator (Dai Zhiwei, Huawei)
    • Data storage encryption and compression require substantial CPU computing power. UADK is an SVA-based user-space accelerator development kit for Kunpeng CPUs that can offload and accelerate encryption and compression workloads. This presentation also covered Ceph's data encryption and compression features, as well as the adaptation progress and performance of UADK in Ceph and big data applications.
  • Using Storage-Compute Decoupling and Operator Pushdown of Big Data + Ceph to Improve Compute and Storage Efficiency (Wu Qiqing, Huawei)
    • As a data lake solution, big data + Ceph uses data passthrough and operator pushdown to decouple storage from compute, improving compute and storage efficiency and delivering better performance and storage utilization than HDFS.
  • Accelerating High-Performance Computing Library Using ARM SVE Vector Instructions (Xu Guodong and Zhuang Haojian, Linaro)
    • For the high-performance computing (HPC) and storage fields, Arm released the Scalable Vector Extension (SVE), which has since evolved into SVE2 and SME (Armv9). In addition to SVE evolution, toolchain support, and software adaptation, this presentation introduced the advantages of SVE and key development techniques, illustrated with SVE-based coding in ISA-L and xxHash.

openEuler 22.09 Ceph Version Planning

Ceph v17.2.0 Quincy Released

Major Changes from Pacific:

  • General
    • Filestore has been deprecated in Quincy. BlueStore has been Ceph's default object store since Luminous.
    • The device_health_metrics pool has been renamed .mgr. It is now used as a common store for all ceph-mgr modules.
    • The ceph pg dump command now prints three additional columns: LAST_SCRUB_DURATION, SCRUB_SCHEDULING, and OBJECTS_SCRUBBED.
    • LevelDB support has been removed; the WITH_LEVELDB build option is no longer available. Monitors and OSDs still using LevelDB need to be migrated to RocksDB before upgrading.
    • osd_memory_target_autotune is enabled by default, which sets the autotuned OSD memory target to 0.7 of total RAM (see the configuration sketch after this list).
  • RADOS
    • OSD: Ceph now uses mclock_scheduler as the default osd_op_queue for BlueStore OSDs to provide QoS.
    • MGR: pg_autoscaler can now be turned on and off globally with the noautoscale flag. By default, it is set to on.
    • OSD: On-wire compression for OSD-to-OSD communication is now supported; it is off by default.
    • OSD: Concise reporting of slow ops is saved in the cluster log. The old, more verbose logging behavior can be restored by setting osd_aggregated_slow_ops_logging to false.
  • RBD
    • rbd-nbd: The rbd device attach and rbd device detach commands are added, allowing the device to be safely re-attached after the rbd-nbd daemon restarts on Linux kernel 5.14 and later (see the command sketch after this list).
    • rbd-nbd: The notrim map option is added to support thick-provisioned images.
    • A large stabilization effort has been made for client-side persistent caching on SSD devices.
  • RGW
    • RGW now supports rate limiting by user and by bucket. Rate limits can also be applied to all users and all buckets through a global configuration.
    • The message format of the S3 bucket notification event is modified.
    • The behavior for Multipart Upload was modified so that only a CompleteMultipartUpload notification is sent at the end of the multipart upload. The POST notification at the beginning of the upload and the notifications that were sent on each part are no longer sent.
  • CephFS
    • A file system can be created with a specific ID ("fscid"). This is useful in certain recovery scenarios (for example, when a monitor database has been lost and rebuilt, and the restored file system is expected to have the same ID as before).
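
As a quick reference, the sketch below shows one way to inspect or adjust several of the defaults mentioned above from the command line. It is a minimal sketch rather than part of the release notes; the commands follow standard ceph config and pool-flag usage, but option names should be verified against your cluster's version before use.

    # Check the default op queue scheduler for BlueStore OSDs (mclock_scheduler in Quincy)
    ceph config get osd osd_op_queue

    # Turn the PG autoscaler off or back on globally with the new noautoscale flag
    ceph osd pool set noautoscale
    ceph osd pool unset noautoscale
    ceph osd pool get noautoscale

    # Restore the old, more verbose slow-op logging if the concise cluster-log
    # reporting is not desired
    ceph config set osd osd_aggregated_slow_ops_logging false

    # Confirm that OSD memory target autotuning is enabled (0.7 of total RAM by default)
    ceph config get osd osd_memory_target_autotune

    # The new LAST_SCRUB_DURATION, SCRUB_SCHEDULING, and OBJECTS_SCRUBBED columns
    # appear in the PG dump output
    ceph pg dump pgs | less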
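
Similarly, the sketch below illustrates the new RBD, RGW, and CephFS commands mentioned above. Pool, image, device, user, and fscid values are placeholders, and exact option spellings should be confirmed against the rbd, radosgw-admin, and ceph help output for v17.2.0.

    # rbd-nbd: detach and safely re-attach an image to the same nbd device after
    # the rbd-nbd daemon restarts (requires Linux kernel 5.14 or later)
    rbd device map --device-type nbd rbdpool/image1
    rbd device detach rbdpool/image1 --device-type nbd
    rbd device attach --device /dev/nbd0 rbdpool/image1 --device-type nbd

    # rbd-nbd: map a thick-provisioned image with the new notrim option
    rbd device map --device-type nbd --options notrim rbdpool/image1

    # RGW: set and enable a per-user rate limit (limit values are arbitrary examples)
    radosgw-admin ratelimit set --ratelimit-scope=user --uid=demo-user \
        --max-read-ops=1024 --max-write-ops=256
    radosgw-admin ratelimit enable --ratelimit-scope=user --uid=demo-user

    # CephFS: recreate a file system with a specific ID during recovery
    # (verify the option name with `ceph fs new --help` on your version)
    ceph fs new recovered_fs cephfs_metadata cephfs_data --fscid 27 --force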

Recently Merged PRs

Recent PRs have mainly focused on bug fixes. Notable changes are described below:

  • BlueStore:
    • Disabled the NCB feature on HDDs. The NCB code restores the allocation map after an OSD crash; this recovery is about 20 times slower on HDDs than on SSDs, so the feature is not suitable for HDD environments. pr#45340
    • Fixed the NCB SimpleBitMap boundary check. pr#45733
    • Refactored the get_block_extents interface. pr#45250
  • Crimson:
    • Fixed the OSD crash caused by bufferlist parameter reference. pr#45415
    • Disabled the pg-autoscaling function for crimson-osd in vstart.sh. pr#45317
    • Added the following functions: cmp xattr cancel operation (pr#45774), omap cmp (pr#45630), osd_op_assert_ver (pr#45348)
  • OSD:
    • Fixed a snapshot deletion bug: snapshot trimming now resumes only after scrub completes, so PGs can return to a healthy state while large numbers of CephFS snapshots are being deleted. pr#45785
    • Supported truncation sequences in sparse reads. pr#45103
  • Mon:
    • Extended the allowed range of target_size_ratio from [0.0, 1.0] to [0.0, unlimited). This fixes the failure when a ratio greater than 1.0 is passed to ceph osd pool create <poolname> --target_size_ratio <ratio> (see the example after this list).
  • Others:
    • Fixed some bugs in the dashboard and cephadm.
    • Fixed some bugs in RBD and RGW.
    • Optimized documentation.
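
To illustrate the target_size_ratio change above, the minimal sketch below creates a pool with a ratio greater than 1.0 and adjusts it afterwards; the pool name and ratio are placeholders.

    # Create a pool whose expected capacity share guides the PG autoscaler;
    # ratios above 1.0 are now accepted
    ceph osd pool create bigpool --target_size_ratio 2.0

    # The ratio can also be changed on an existing pool
    ceph osd pool set bigpool target_size_ratio 2.0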

Recent Ceph Developer News

Each module of the Ceph community holds regular meetings to discuss and align the development progress. Meeting videos are uploaded to YouTube. The major meetings are as follows:

Meeting                             | Remarks                                    | Frequency
Crimson SeaStore OSD Weekly Meeting | Crimson & SeaStore development             | Weekly
Ceph Orchestration Meeting          | Ceph management module (mgr)               | Weekly
Ceph DocUBetter Meeting             | Document optimization                      | Biweekly
Ceph Performance Meeting            | Ceph performance optimization              | Biweekly
Ceph Developer Monthly              | Ceph developer meeting                     | Monthly
Ceph Testing Meeting                | Version verification and release           | Monthly
Ceph Science User Group Meeting     | Ceph scientific computing                  | Irregular
Ceph Leadership Team Meeting        | Ceph leadership team                       | Weekly
Ceph Tech Talks                     | Discussion of Ceph community technologies  | Monthly

Performance Meeting

  • 2022-03-17: performance weekly
    • Crimson CyanStore performance test
    • RocksDB iterator performance discussion
    • PGLog optimization discussion
  • 2022-03-24: performance weekly
    • Quincy large write performance regression testing
    • RocksDB tombstones: Deleting a key from the LSM tree is fast because it only writes a tombstone marker, but tombstones slow down subsequent queries; in particular, a large number of tombstones hinders range queries.
  • 2022-03-31 and 2022-04-07: performance weekly
    • Quincy write performance regression testing
    • Performance is lower than that of Pacific 16.2.7; the root cause is still being analyzed and discussed.
