Extract and Validate Dates from Text
You are given a block of text that may contain dates in various formats. Your task is to extract and validate these dates using the following criteria:
The date must be in one of the following formats:
DD-MM-YYYY
MM/DD/YYYY
YYYY.MM.DD
Month Day, Year (e.g., January 10, 2025)
The month should be a valid month name or number (e.g., January, February, 03, 04, etc.).
The day should be a valid day number for the given month.
The year should be a valid 4-digit number.
If the date is valid, extract and print the date in YYYY-MM-DD format. If invalid, print Invalid Date.
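The criteria above can be checked by letting the standard library do the calendar arithmetic. The sketch below is only an illustration: the helper name `to_iso` and the `MONTHS` lookup table are hypothetical, and it assumes the year, month, and day have already been pulled out of the text as strings.

```python
from datetime import date

# Hypothetical lookup table for this sketch: month name -> month number.
MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June",
     "July", "August", "September", "October", "November", "December"],
    start=1)}

def to_iso(year, month, day):
    """Return 'YYYY-MM-DD' if the parts form a real calendar date, else None."""
    # Accept either a month name ("January") or a month number ("03").
    if isinstance(month, str) and not month.isdigit():
        month = MONTHS.get(month.capitalize())
        if month is None:
            return None
    try:
        # date() raises ValueError for an out-of-range month or day,
        # including 29 February in a non-leap year.
        return date(int(year), int(month), int(day)).isoformat()
    except ValueError:
        return None

print(to_iso("2025", "January", "10") or "Invalid Date")
print(to_iso("2025", "02", "30") or "Invalid Date")
```

Relying on `datetime.date` keeps the days-per-month and leap-year rules out of hand-written tables.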
Input Format:
- The first line contains an integer, the number of lines of text.
- Each of the following lines contains a string of text that may contain one or more dates.
Output Format:
For each line of text:
- If a valid date is found, print the date in the format YYYY-MM-DD.
- If no valid date is found or the date is invalid, print: Invalid Date
Sample Input 1:
Sample Output 1:
Python Code:
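The following is a minimal sketch of one way to implement the task, not necessarily the code this post describes. The names `DATE_RE`, `MONTH_NUMBER`, `extract_dates`, and `solve` are hypothetical. Printing every valid date found on a line, and Invalid Date only when a line has none, is an assumption, since the statement does not say how to report several dates on one line.

```python
import re
from datetime import date

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]
MONTH_NUMBER = {name.lower(): i for i, name in enumerate(MONTH_NAMES, start=1)}

# One alternative per supported format; named groups record which parts matched.
DATE_RE = re.compile(
    r"\b(?P<d1>\d{2})-(?P<m1>\d{2})-(?P<y1>\d{4})\b"          # DD-MM-YYYY
    r"|\b(?P<m2>\d{2})/(?P<d2>\d{2})/(?P<y2>\d{4})\b"         # MM/DD/YYYY
    r"|\b(?P<y3>\d{4})\.(?P<m3>\d{2})\.(?P<d3>\d{2})\b"       # YYYY.MM.DD
    r"|\b(?P<m4>[A-Za-z]+) (?P<d4>\d{1,2}), (?P<y4>\d{4})\b"  # Month Day, Year
)

def extract_dates(line):
    """Return ISO strings for every valid date in line, in order of appearance."""
    found = []
    for match in DATE_RE.finditer(line):
        groups = match.groupdict()
        # Exactly one alternative matched; find its year/month/day groups.
        for k in "1234":
            if groups["y" + k] is not None:
                year, month, day = groups["y" + k], groups["m" + k], groups["d" + k]
                break
        if not month.isdigit():
            # Unknown month names map to 0, which date() rejects below.
            month = MONTH_NUMBER.get(month.lower(), 0)
        try:
            found.append(date(int(year), int(month), int(day)).isoformat())
        except ValueError:
            pass  # out-of-range month or day: not a valid date
    return found

def solve(text):
    """Process the whole input (count line followed by text lines)."""
    lines = text.splitlines()
    n = int(lines[0])
    out = []
    for line in lines[1:n + 1]:
        dates = extract_dates(line)
        # Assumption: one output line per valid date, else "Invalid Date".
        out.extend(dates if dates else ["Invalid Date"])
    return "\n".join(out)

print(solve("2\nMeet on 29-02-2024\nMeet on 29-02-2025"))
```

To run it against standard input, call `print(solve(sys.stdin.read()))` after importing `sys`.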
Insights:
- Date Format Flexibility: The code supports multiple date formats, such as DD-MM-YYYY, MM/DD/YYYY, YYYY.MM.DD, and Month Day, Year, providing flexibility for various input scenarios.
- Regex Matching: Regular expressions extract date patterns from strings, so different date formats can be handled concisely.
- Pattern-Based Validation: Each date format has its own validation rule, so both numeric and textual representations of dates are processed correctly.
- Month Validation: The code checks that the month is within the valid range (1-12), so dates with non-existent months are rejected.
- Day Validation: The day is validated against the number of days in the given month, accounting for variations such as leap years in February.
- Leap Year Consideration: February is validated with leap years in mind (for example, 28 days in 2025, which is not a leap year), so the code handles date validation across different years.
- Multiple Dates Per Line: The code can handle multiple dates in a single line by using re.findall(), which extracts all matching date patterns from the line.
- Error Handling: Invalid dates (such as those with out-of-range months or days) are identified and marked as "Invalid Date," giving clear feedback to users.
- Scalability: The approach can be extended to additional date formats by adding new patterns.
- Readability: The code uses clear, structured patterns with well-defined validation rules, making it easy to understand and maintain.
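Two of the points above, the leap-year check and re.findall() returning every match on a line, can be illustrated in isolation. This is a small sketch using only the standard library; `days_in_month` and `DD_MM_YYYY` are hypothetical names chosen for the example, and only the DD-MM-YYYY shape is matched here.

```python
import calendar
import re

def days_in_month(year, month):
    # calendar.monthrange returns (weekday of the 1st, number of days in the month).
    return calendar.monthrange(year, month)[1]

# Matches the DD-MM-YYYY shape only; it does not check that the date is valid.
DD_MM_YYYY = re.compile(r"\b\d{2}-\d{2}-\d{4}\b")

print(days_in_month(2025, 2))  # 28 (2025 is not a leap year)
print(days_in_month(2024, 2))  # 29 (2024 is a leap year)
print(DD_MM_YYYY.findall("From 01-02-2025 to 15-03-2025"))  # ['01-02-2025', '15-03-2025']
```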