Modules
`find_dates` returns a list of tuples, each comprising a located date and the word index at which it was found.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The corpus of text in which to find dates/times. | required |
Returns:
| Type | Description |
|---|---|
| list[tuple[str, int]] | A list of tuples, each containing a string representing the date and time and the integer word index at which it was found. |
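The word index can be tied back to the input text. The sketch below does not call the library: it copies the sample sentence and result from this function's documented example and assumes the index is zero-based over whitespace-separated words. That assumption matches the documented output but is not confirmed by the source. `word_at` is a hypothetical helper introduced only for illustration.

```python
# Sample sentence and result copied from the documented example (not computed here).
text = (
    "A thing happened on Jan 1st 2012 and the next morning at 09:15 "
    "and also jan 15th at 12am in 2018."
)
documented_result = [
    ("2012-01-01", 4),
    ("2012-01-02 09:15", 9),
    ("2018-01-15 12:00", 15),
]


def word_at(sample: str, index: int) -> str:
    """Hypothetical helper: return the whitespace-separated word at a zero-based index.

    Assumes the library counts words the same way, which is an inference
    from the documented example rather than a documented guarantee.
    """
    return sample.split()[index]


# Pair each reported date with the word found at its reported index.
anchors = [(date, word_at(text, idx)) for date, idx in documented_result]

for date, word in anchors:
    print(f"{date!r} starts at word {word!r}")
```

Under this assumption, each index points at the first word of the matched date expression ("Jan", "next", "jan").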
Examples:
Get dates from a text sample.
>>> find_dates("A thing happened on Jan 1st 2012 and the next morning at 09:15 and also jan 15th at 12am in 2018.")
[
('2012-01-01', 4),
('2012-01-02 09:15', 9),
('2018-01-15 12:00', 15)
]
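Because the dates come back as strings, callers may want `datetime` objects instead. The following is a minimal sketch, assuming the returned strings use only the two shapes visible in the example above (`YYYY-MM-DD` and `YYYY-MM-DD HH:MM`); the library may emit other formats. `parse_found_date` and `FORMATS` are hypothetical names, not part of the library, and the `found` list is copied from the example output rather than produced by calling `find_dates`.

```python
from datetime import datetime

# Formats inferred from the documented example output (an assumption).
FORMATS = ("%Y-%m-%d %H:%M", "%Y-%m-%d")


def parse_found_date(value: str) -> datetime:
    """Hypothetical helper: try each assumed format until one matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date string: {value!r}")


# Result copied from the documented example.
found = [
    ("2012-01-01", 4),
    ("2012-01-02 09:15", 9),
    ("2018-01-15 12:00", 15),
]

# Convert each date string while keeping its word index.
parsed = [(parse_found_date(date), idx) for date, idx in found]
```

Date-only strings parse to midnight of that day, so callers that need to distinguish "no time given" from "00:00" should keep the original string alongside the parsed value.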
Source code in date_fuzz/extraction.py
`strip_dates` removes all date/time indicators from a text sample.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text block to strip. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Text block with dates/times removed. |
Examples:
This can be used to get raw text once dates have been extracted.
>>> strip_dates("Jan 1st 2012: A thing happened.")
A thing happened.
Source code in date_fuzz/extraction.py