Wednesday, July 31, 2019

Podcast 1:3 - Decreasing Ingestion Congestion with Optane DCPMM

Big Data analytics needs data to be valuable. Collecting data from different machines, IoT devices, or sensors is the first step toward deriving value from it. Ingesting data with Kafka is a common way to collect this data. Find out how to use Intel's Optane DC Persistent Memory to decrease ingestion congestion and increase the total throughput of your ingestion solution.


Kafka in real examples

  • Ingestion is the first step in getting data into your data lake or data warehouse.
  • Kafka is basically a highly available, distributed pub/sub hub.
  • Data from a producer is published to Kafka topics, which consumers subscribe to. Topics give the ability to segment the data for ingestion.
  • Topics can be spread over a set of servers on different physical devices to increase reliability and throughput.
  • Performance Best Practices
    • Buffer Size of the Producers should be a multiple of the message size
    • Batch Size can be changed based on the message size for optimal performance
    • Spread logs for partitions across multiple drives, or place them on fast drives.
  • Example Configuration (LinkedIn)
    • 13 million messages per second (2.75 GB/s)
    • 1100 Kafka brokers organized into more than 60 clusters.
  • Automotive Space
    • One Customer has 100 Million Cars - 240KB/min/car
      • 1.6 Million Messages/sec
      • 800 GB/s
    • Approximate size of the installation
      • 4400 Brokers, over 240 Clusters.
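
As a rough illustration of the buffer and batch sizing advice above, the following Python sketch rounds a producer batch up to a whole number of messages and sizes the buffer as a multiple of that batch. `batch.size` and `buffer.memory` are standard Kafka producer settings (with defaults of 16384 and 33554432 bytes); the helper names, the reading of GB as 10^9 bytes, and the choice of 2048 batches per buffer are assumptions made for this example, not values from the podcast.

```python
import math


def average_message_bytes(bytes_per_sec: float, messages_per_sec: float) -> int:
    """Average message size, rounded up to whole bytes."""
    return math.ceil(bytes_per_sec / messages_per_sec)


def round_up_to_multiple(value: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is greater than or equal to `value`."""
    return math.ceil(value / multiple) * multiple


def producer_sizing(message_bytes: int,
                    target_batch_bytes: int = 16384,
                    batches_per_buffer: int = 2048) -> dict:
    """Suggest batch.size and buffer.memory as whole multiples of the message size.

    The defaults mirror Kafka's stock producer settings; batches_per_buffer
    is an illustrative assumption, not a tuned value.
    """
    batch_size = round_up_to_multiple(target_batch_bytes, message_bytes)
    return {
        "batch.size": batch_size,
        "buffer.memory": batch_size * batches_per_buffer,
    }


# LinkedIn figures quoted above: 2.75 GB/s at 13 million messages per second.
msg = average_message_bytes(2.75e9, 13e6)
config = producer_sizing(msg)
print(f"average message: {msg} bytes")
print(config)
```

The resulting dictionary uses the same keys a Kafka producer configuration expects, so the values can be copied into a producer properties file and then adjusted through measurement.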

Optane DC Persistent Memory

  • Ability to use Optane technology in a DDR4 DIMM form factor.
  • 128 GB, 256 GB, and 512 GB PMMs are available, for up to 3 TB per socket.
  • Two modes of operation: App Direct Mode, and Memory Mode.
  • Memory Mode provides a large memory capacity at a lower cost per gigabyte than DDR4.
  • App Direct Mode means you can write a program that writes directly to memory, and the data is persistent: it survives reboots and power loss.
  • App Direct Mode can also be used to create ultra-fast filesystems with memory drives.
  • DCPMM is deployed in a mixed configuration of DDR4 DIMMs and PMMs. For example, a 16 GB DIMM paired with a 128 GB PMM, or a 64 GB DIMM paired with a 512 GB PMM.
  • Memory modes can be controlled in the BIOS or from Linux.
    • ipmctl - utility for configuring and managing Intel Optane DC persistent memory modules (PMM).
    • ndctl - utility for managing the libnvdimm (non-volatile memory device) sub-system in the Linux kernel.
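
To make the two utilities above more concrete, here is a minimal Python sketch that builds (but does not run) the command lines typically used to provision the modules. The `ipmctl create -goal` and `ndctl create-namespace --mode=fsdax` forms follow the tools' documented usage, but the function name and the decision to only print the commands are assumptions for this example; check the syntax against the versions installed on your system, and note that a reboot is needed after setting a goal.

```python
import shlex
from typing import List


def provisioning_commands(memory_mode_percent: int) -> List[List[str]]:
    """Build ipmctl/ndctl command lines for a given Memory Mode share.

    memory_mode_percent is the share of PMM capacity used as volatile
    Memory Mode; the remainder is provisioned as App Direct.
    Nothing is executed here: the caller decides whether to run the commands.
    """
    if not 0 <= memory_mode_percent <= 100:
        raise ValueError("memory_mode_percent must be between 0 and 100")

    if memory_mode_percent == 100:
        # All capacity is volatile memory, so no namespace is needed.
        return [["ipmctl", "create", "-goal", "MemoryMode=100"]]

    goal = [
        "ipmctl", "create", "-goal",
        f"MemoryMode={memory_mode_percent}",
        "PersistentMemoryType=AppDirect",
    ]
    # After the reboot that applies the goal, expose the App Direct
    # region as a block device that supports a DAX filesystem.
    namespace = ["ndctl", "create-namespace", "--mode=fsdax"]
    return [goal, namespace]


for cmd in provisioning_commands(50):
    print(shlex.join(cmd))
```

Passing 0 provisions all capacity as App Direct, and passing 50 splits it evenly between the two modes.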

Improving Ingestion using Persistent Memory

  • Use the larger memory footprint to run more Kafka servers on the same machine, each with a larger heap space.
  • Change Kafka to write directly to persistent memory.
  • Create a persistent memory filesystem and point the Kafka logs to the new filesystem.
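
The third option needs only a broker configuration change once the filesystem exists. The sketch below, in Python, rewrites the `log.dirs` entry of a broker's `server.properties` text. `log.dirs` is the standard Kafka broker setting for log locations; the mount point `/mnt/pmem0` and the helper name `set_log_dirs` are hypothetical, and the sketch assumes a DAX-capable filesystem (for example ext4 or XFS mounted with `-o dax`) is already mounted there.

```python
def set_log_dirs(properties_text: str, log_dirs: str) -> str:
    """Return properties text with log.dirs pointing at `log_dirs`.

    An existing log.dirs entry is replaced; if none exists, one is appended.
    Commented-out entries (lines starting with '#') are left untouched.
    """
    out = []
    replaced = False
    for line in properties_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key == "log.dirs":
            if not replaced:
                out.append(f"log.dirs={log_dirs}")
                replaced = True
            # Any duplicate log.dirs lines are dropped.
        else:
            out.append(line)
    if not replaced:
        out.append(f"log.dirs={log_dirs}")
    return "\n".join(out) + "\n"


original = "broker.id=0\nlog.dirs=/tmp/kafka-logs\nnum.partitions=1\n"
# /mnt/pmem0 is a hypothetical mount point for the persistent memory filesystem.
updated = set_log_dirs(original, "/mnt/pmem0/kafka-logs")
print(updated)
```

The broker has to be restarted for the new log directory to take effect.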

Testing Results

  • Isolate performance variability by limiting the testing to one broker on the same machine as the producer.
    • Remove network variability and bottleneck of the network.
    • Decrease inter-broker communication and replica bottlenecks
    • Only the producer is run to find the maximum that can be ingested.
    • Only consumers are run to find the maximum that can be egressed.
    • Producers and consumers are run together to find pass-through rates.
  • First approach: 50% persistent memory in App Direct Mode
    • 3x performance over log files mounted on a SATA drive
    • 2x performance over Optane NVMe drives
  • Second approach: 100% persistent memory in App Direct Mode
    • 10x performance over log files mounted on a SATA drive
    • Approximately 2 GB/s, compared with 150 MB/s for the SATA drive
  • Additional testing was performed with a cluster to increase total throughput. We found we were limited not by drive speed, which is normally the case, but by network speed: we were limited to a 10 Gbit network.
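
The network limit follows from simple arithmetic on the figures above, sketched here in Python. The rates are the approximate numbers quoted in these notes; the stage names and the `bottleneck` helper are illustrative, and the calculation ignores protocol overhead and replication traffic.

```python
def gbit_to_gbyte_per_sec(gbit_per_sec: float) -> float:
    """Convert a link speed in Gbit/s to GB/s (8 bits per byte)."""
    return gbit_per_sec / 8


def bottleneck(stage_rates: dict) -> str:
    """Name of the slowest stage; it caps end-to-end throughput."""
    return min(stage_rates, key=stage_rates.get)


SATA_GBPS = 0.15   # about 150 MB/s for SATA-mounted logs
PMEM_GBPS = 2.0    # about 2 GB/s with 100% App Direct
NETWORK_GBPS = gbit_to_gbyte_per_sec(10)  # 10 Gbit network

with_sata = {"sata_log_drive": SATA_GBPS, "network": NETWORK_GBPS}
with_pmem = {"pmem_log_filesystem": PMEM_GBPS, "network": NETWORK_GBPS}

print(f"network ceiling: {NETWORK_GBPS:.2f} GB/s")
print(f"pmem vs SATA: {PMEM_GBPS / SATA_GBPS:.1f}x")
print("bottleneck with SATA logs:", bottleneck(with_sata))
print("bottleneck with pmem logs:", bottleneck(with_pmem))
```

With SATA-mounted logs the drive is the slowest stage, but once the logs sit on persistent memory the 10 Gbit link (1.25 GB/s) falls below the roughly 2 GB/s the broker can absorb, which matches the observation in the cluster test.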


