Transcript:
Hello. In this video, I would like to show you how to extract dates or timestamps from strings in pandas DataFrames using regular expressions. For example, we have our data in an Excel sheet like this. It is a simple table, and we have very different sentences with a timestamp inside. In this one, the timestamp is at the end of the sentence. In this one, the timestamp is in the middle of the sentence. Here it is also in the middle of the sentence. And in this one, the timestamp is at the beginning of the sentence. Okay, so we have different variations of where the timestamps are located, and this is the data that we have.

The first thing is to load the data, but before loading the data we have to import the pandas module: import pandas as pd. Then we have to read the data. Our data will be pandas read_excel, and we have to specify the file name and set the index. It will be ID, because we want ID to be our index. Yes, we have our data, so let's go to step number two.

In this step, we are going to create one extra column where the date will be located. So let's do it: date equals None. So far it will be unknown. Okay, we have one extra column. It will hold the date, and so far it has None.

In this step, we create indexes in order to manipulate the data frame more easily. For example, we will work with the description and date columns, so we need to create the index of description: equals data.columns.get_loc, and it will be description. We do the same with the date column, index of date, and change the name here for better understanding. We can print out index description and index date, and we have 0 and 2. That means we have index 0 and 2, because we have numerical indexing of the columns in the data frame: it will be index 0, 1, 2.

I think this is the most important part of this video: we need to define the pattern of the date. As you can see in the data frame, the timestamps all have the same structure. We have the day, the month, and the year.
As you can see, we have two characters for the day, two characters for the month, and four characters for the year. So let's specify this structure in Python. Let's say it will be date pattern equals, and let's define a string. The first character is a digit from 0 to 9, and then again from 0 to 9, because we have two characters for the day. Then we also have from 0 to 9, two characters, for the month, and from 0 to 9, four characters, for the year. What I want to explain here: we have two characters for the day, two characters for the month, and four characters for the year, and in the structure we have separator signs between the different parts of the timestamp. So run it and go to the next step.

Okay, now we have almost the final step, and we are ready to extract dates from string values in our pandas DataFrame. For this we have to import one extra module, re, which is responsible for regular expressions. In order to extract dates from the data frame, we need to iterate through each row in the data frame. For this, we need to specify a for loop: for row in range from 0 to the length of our data frame, do the following actions. The first thing in the loop is to get the date from a string. For this, we use the regular expression module and its search method. The first argument here is the pattern that we specified in the previous step, date pattern. The second parameter is where we want to extract the date from: we want to extract the date from the same row, but from the description column, and we can specify it like this: the same row and the index of description. Okay, and when we get a date, we can write it to the date column. Let's do it like this: data, the same row and the date column, equals date. Okay, no error. Print our data frame. Yes, and maybe you see one strange thing here: we don't have a string value.
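
The steps described up to this point can be sketched roughly as follows. This is only an illustration under several assumptions: the Excel file is replaced by a small in-memory DataFrame (the file name is not given in the video), the column names `id`, `description`, and `value` are inferred from the narration, the sample sentences are invented, and the separator between day, month, and year is assumed to be a dot. The matches are collected in a plain list here so that it is easy to inspect what `re.search` returns.

```python
import re

import pandas as pd

# Hypothetical stand-in for the Excel sheet. In the video the data comes
# from pd.read_excel(...) followed by setting the ID column as the index.
data = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "description": [
            "Invoice was paid on 14.03.2019",
            "On 02.11.2018 the order was shipped",
            "05.07.2020 is the delivery date",
        ],
        "value": [100, 250, 75],
    }
).set_index("id")

# Step 2: one extra column, empty for now.
data["date"] = None

# Step 3: positional indexes of the columns we work with.
index_description = data.columns.get_loc("description")
index_date = data.columns.get_loc("date")
print(index_description, index_date)  # prints: 0 2

# Step 4: two digits, separator, two digits, separator, four digits.
# The dot separator is an assumption; adjust it to the real data.
date_pattern = r"[0-9]{2}\.[0-9]{2}\.[0-9]{4}"

# Step 5: go through each row and search the description for the pattern.
matches = []
for row in range(0, len(data)):
    found = re.search(date_pattern, data.iloc[row, index_description])
    matches.append(found)

# re.search returns a match object, not a plain string.
print(type(matches[0]).__name__)  # prints: Match
```

Note that the result of `re.search` is a match object that wraps the matched text together with its position in the sentence, which is why printing it does not show a bare date string.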
We have an object value in the date column, and to fix this we need to add one extra command to the regular expression: it will be the group method. The group method returns a string value from this long line. That's it, and let's go to the final step.

Finally, I would like to have a data frame where the date is in the first place, the description is in the second, and the value is in the last position among the columns. Let's do it like this: data equals data, and specify the order of columns that I want to get. In the first position will be date, the second will be description, and the last will be value. Yes, we have ID, date, description, and value. So thank you for watching this video, and see you in the next one.
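
Putting the whole walkthrough together, a minimal end-to-end sketch might look like the one below. The same assumptions apply as in the narration's setup: the DataFrame is built in memory instead of being read from the Excel file, the column names and sample sentences are hypothetical, the dot separator in the pattern is a guess, and `iloc` is used for positional access because the video does not name the accessor explicitly. One addition that is not in the video: `re.search` returns `None` when a sentence contains no date, so the sketch checks for that before calling `.group()`.

```python
import re

import pandas as pd

# Hypothetical sample data; the last row deliberately has no date.
data = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "description": [
            "Invoice was paid on 14.03.2019",
            "On 02.11.2018 the order was shipped",
            "05.07.2020 is the delivery date",
            "No date in this sentence",
        ],
        "value": [100, 250, 75, 30],
    }
).set_index("id")

data["date"] = None
index_description = data.columns.get_loc("description")
index_date = data.columns.get_loc("date")

# Assumed format: dd.mm.yyyy with a dot as the separator.
date_pattern = r"[0-9]{2}\.[0-9]{2}\.[0-9]{4}"

for row in range(len(data)):
    found = re.search(date_pattern, str(data.iloc[row, index_description]))
    # .group() returns the matched text as a string. Guard against None,
    # otherwise rows without a date would raise an AttributeError.
    data.iloc[row, index_date] = found.group() if found else None

# Final step: reorder the columns as date, description, value.
data = data[["date", "description", "value"]]
print(data)
```

The extracted values are still plain strings at this point. If real date arithmetic is needed afterwards, they could be converted with `pd.to_datetime`, passing a format such as `"%d.%m.%Y"` that matches the assumed pattern.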