Skip to content

Packetize Data Optimally

Level: L3-L5 Topics: Greedy, Math, Division, Edge Cases

Problem Statement

You need to split data of a given length into packets. Each packet has a maximum allowed size. The optimization goals are (in priority order):

  1. Minimize the number of packets k.
  2. Among solutions with k packets, distribute the data as evenly as possible: every packet must have size either floor(data_length / k) or ceil(data_length / k). Exactly data_length mod k packets take the larger size; the rest take the smaller. (This makes both the maximum packet and the maximum-minus-minimum spread minimal.)

Implement the function:

vector<size_t> splitData(size_t data_length, size_t max_packet_size)

The function returns a list of packet sizes that sum to data_length, where each packet is at most max_packet_size.

Background & Constraints

  • data_length and max_packet_size are non-negative integers.
  • Handle edge cases: data_length = 0 (return empty list), max_packet_size = 0 with non-zero data (error or impossible).
  • Packets should be as equal in size as possible to minimize the largest one.
  • The order of packets in the returned list does not matter.

Examples

Input: data_length = 10, max_packet_size = 4

  • Minimum packets: ceil(10 / 4) = 3.
  • Even distribution with k = 3: floor(10 / 3) = 3, 10 mod 3 = 1. So 1 packet of size 4 and 2 packets of size 3.

Output: [4, 3, 3] (or any permutation)

Why not [4, 4, 2]? Same packet count (3) and same max (4), but rule 2 requires every packet to be either floor(10/3)=3 or ceil(10/3)=4. Size 2 is neither, so [4, 4, 2] is rejected.

Input: data_length = 12, max_packet_size = 5

  • Minimum packets: ceil(12/5) = 3.
  • Even distribution: [4, 4, 4].

Output: [4, 4, 4]

Hints & Common Pitfalls

  • The minimum number of packets is ceil(data_length / max_packet_size).
  • To distribute evenly: with k packets, each packet gets at least data_length / k (integer division). The remainder data_length % k packets get one extra unit.
  • Be careful with integer division and ceiling: ceil(a/b) = (a + b - 1) / b for positive integers.
  • Handle the zero and negative input cases explicitly — do not let them cause division by zero.
  • A common mistake is making all packets max_packet_size and leaving a tiny last packet, which does not minimize the largest packet.

Follow-Up Questions

  1. Header space: The first packet must reserve H bytes for a header, so its effective data capacity is max_packet_size - H. How does this change the calculation? Recompute the minimum packet count accounting for the reduced first packet, then distribute the remaining data evenly across the other packets.
  2. Discrete split positions: Data can only be split at specific positions (e.g., you cannot split in the middle of a record). Given a list of allowed split positions, find the optimal packetization. This becomes a more complex optimization — consider dynamic programming or a greedy approach that picks split points closest to the ideal even distribution.