Day 26 - Python - Strings 1

To celebrate 25 successful days, let's switch to a new format: 10 days on each DSA concept. Let's start with Strings!

Problem Statement
You are given a compressed string where segments can include:

  • Lowercase and uppercase English letters
  • Digits
  • Special characters/punctuation (!@#$%^&*()_+-=[]{}|;:',.<>?)

Each segment in the compressed string consists of a character followed by its frequency (e.g., a3 means "aaa", @2 means "@@"). Consecutive segments are concatenated without spaces or delimiters. You must decompress the string by expanding each segment.

Special Rules

  1. Uppercase and lowercase letters are treated distinctly (e.g., A2 is "AA" and a2 is "aa").
  2. Digits that appear as characters in segments are expanded too (e.g., 43 means "444": the 4 is the character and the 3 is its frequency).
  3. Special characters are expanded just like letters or digits (e.g., @2 is "@@").

Your task is to reconstruct the original string by expanding all segments.
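
The segment rule can also be applied with a plain left-to-right scan. The sketch below is a minimal illustration rather than the solution given later in this post; decompress_scan is a hypothetical name, and the code assumes valid ASCII input in which every run of digits after a character is that character's frequency.

```python
def decompress_scan(s: str) -> str:
    """Expand character+frequency segments with a manual scan."""
    out = []
    i = 0
    while i < len(s):
        char = s[i]  # first character of the segment, whatever its type
        i += 1
        start = i
        # Greedily read every ASCII digit that follows as the frequency.
        while i < len(s) and "0" <= s[i] <= "9":
            i += 1
        out.append(char * int(s[start:i]))
    return "".join(out)


print(decompress_scan("x5Y1!3z2"))  # xxxxxY!!!zz
```

Because the digit run is read greedily, a digit is treated as a segment character only at the start of the string (as in 43). The statement does not say how to split an input such as a143, so that case is left open here.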

Input Format

  • A single line containing the compressed string S.

Output Format

  • A single line containing the decompressed string.

Constraints

  • 1 ≤ |S| ≤ 200 (length of the compressed string).
  • Each frequency is a positive integer between 1 and 99.
  • S contains only valid segments.

Example Input 1

a3B2@4#1

Example Output 1

aaaBB@@@@#

Example Input 2

x5Y1!3z2

Example Output 2

xxxxxY!!!zz

Example Input 3

A2b1C4$5

Example Output 3

AAbCCCC$$$$$

Explanation

  • In the first example:

    • a3 expands to "aaa"
    • B2 expands to "BB"
    • @4 expands to "@@@@"
    • #1 expands to "#"
  • In the second example:

    • x5 expands to "xxxxx"
    • Y1 expands to "Y"
    • !3 expands to "!!!"
    • z2 expands to "zz"
  • In the third example:

    • A2 expands to "AA"
    • b1 expands to "b"
    • C4 expands to "CCCC"
    • $5 expands to "$$$$$"
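
The breakdowns above can be reproduced programmatically. This is a small sketch, assuming a hypothetical helper explain_segments that uses re.finditer to report each segment next to its expansion; it is meant for inspection, not as the reference solution.

```python
import re


def explain_segments(s: str) -> list[tuple[str, str]]:
    """Return (segment, expansion) pairs in order of appearance."""
    pairs = []
    # Group 1 is the character, group 2 its frequency.
    for m in re.finditer(r"(.)(\d+)", s):
        char, freq = m.group(1), m.group(2)
        pairs.append((m.group(0), char * int(freq)))
    return pairs


for segment, expansion in explain_segments("a3B2@4#1"):
    print(f'{segment} expands to "{expansion}"')
```

For a3B2@4#1 this prints one line per segment: a3 expands to "aaa", B2 expands to "BB", @4 expands to "@@@@", and #1 expands to "#".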

Python Code

import re
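def decompress_string(s: str) -> str:
    # Both patterns match the same "character + digits" segments, so the
    # two lists line up: one captures the character, the other the count.
    matches = re.findall(r'(.)\d+', s)
    # The leading "." keeps a digit used as a character (the 4 in "43")
    # out of the frequency.
    frequencies = re.findall(r'.(\d+)', s)
    decompressed = ''.join(char * int(freq)
                           for char, freq in zip(matches, frequencies))
    return decompressed

compressed_string = input().strip()
result = decompress_string(compressed_string)
print(result)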

Notes

  • The problem adds complexity through its diverse character set and its constraints (strings up to 200 characters, frequencies up to 99).
  • Using regular expressions, the program efficiently extracts characters and their corresponding frequencies for reconstruction.
  • Edge cases include segments like A1 (output: "A"), #2 (output: "##"), and long strings to ensure the solution scales well.
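
The edge cases in these notes can be checked directly. The snippet below is a rough sketch: decompress_string here is a compact one-regex variant written only for this check (an assumption, not necessarily identical to the code above), and the long input is just one example that stays within the 200-character limit.

```python
import re


def decompress_string(s: str) -> str:
    # One regex pass: each match yields (character, frequency).
    return "".join(ch * int(n) for ch, n in re.findall(r"(.)(\d+)", s))


# Minimal segments from the notes.
assert decompress_string("A1") == "A"
assert decompress_string("#2") == "##"

# A long input within the limit: 66 segments of "a99" is 198 characters.
long_input = "a99" * 66
print(len(long_input), len(decompress_string(long_input)))  # 198 6534
```

Even near the size limit the output stays at a few thousand characters, so the cost of building it is negligible.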

Insights

  1. Diverse Character Sets: The inclusion of uppercase letters, lowercase letters, digits, and special characters challenges the ability to generalize solutions for all character types.

  2. Distinct Handling of Case Sensitivity: Uppercase and lowercase letters are treated distinctly (e.g., a2 is "aa" and A2 is "AA"), ensuring accurate case-specific decompression.

  3. Handling Special Characters: Special characters like @, #, or ! are expanded similarly to alphanumeric characters, requiring the algorithm to treat all characters uniformly.

  4. Multi-Digit Frequencies: The program must correctly handle frequencies with more than one digit (e.g., a12 for "aaaaaaaaaaaa") by ensuring the digits are parsed as integers without truncation.

  5. Regular Expression Utilization: Using regular expressions like (.)\d+ simplifies pattern matching by identifying any single character followed by its frequency, regardless of character type.

  6. Validation of Input: Although the problem guarantees valid input, ensuring robust parsing for future scalability (e.g., unexpected formats) improves the program's reliability.

  7. Efficient Decompression Logic: Pairing each character with its frequency through zip(matches, frequencies) builds the output in a single loop, avoiding nested loops and keeping the work linear in the size of the output.

  8. Scalability: With |S| up to 200, the solution needs to handle up to 100 segments efficiently, especially when frequencies are large (up to 99).

  9. Edge Case Awareness: Examples like A1 (output: "A"), #1 (output: "#"), or !3 (output: "!!!") test whether the program handles minimal and punctuation-based segments correctly.

  10. General Applicability: This problem introduces a real-world scenario for decompressing encoded data streams, making it applicable for text processing, data storage, and other computational tasks.
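
Insights 4 and 6 (multi-digit frequencies and input validation) can be combined in one defensive variant. This is a sketch under stated assumptions: decompress_checked and SEGMENTS are hypothetical names, and raising ValueError on malformed input or on a frequency outside 1 to 99 is a design choice made here, not something the problem requires.

```python
import re

# The whole string must be one or more "character + digits" segments.
SEGMENTS = re.compile(r"(?:.\d+)+")


def decompress_checked(s: str) -> str:
    """Decompress s, raising ValueError on malformed input."""
    if not SEGMENTS.fullmatch(s):
        raise ValueError(f"not a sequence of char+count segments: {s!r}")
    parts = []
    for char, freq in re.findall(r"(.)(\d+)", s):
        count = int(freq)  # the full digit run is parsed, so "12" means 12
        if not 1 <= count <= 99:
            raise ValueError(f"frequency out of range 1..99: {char}{freq}")
        parts.append(char * count)
    return "".join(parts)


print(decompress_checked("a12"))  # aaaaaaaaaaaa
```

With this variant, inputs such as a3b (a trailing character with no frequency) or a0 are rejected instead of being silently truncated or expanded to nothing.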

String compression is like life: cut out the fluff, hold onto what counts, and always make space for special characters! 🦋
